CS计算机代考程序代写 database algorithm Breaking Image CAPTCHA for fun

Breaking Image CAPTCHA for fun
Frank Tse, Nexusguard

Agenda
1
CAPTCHA and web services
2
General CAPTCHA breaking method
3
Alternative form
4
Analytic and optimized method
Page§2

About us
§ We handle DDoS attack everyday
§ We face and fight with bots everyday § Research in cryptography, imaging and coding § Research both attack and defence methods
Page§3

CAPTCHA and web services
§ Puzzle for machine
Page§4

CAPTCHA and web services
§ Puzzle for human
Our target “super star” todayà

Security king ?
Security
Washing Machine
Smart Phone
Spaceship
Functionality
Ease of use
T-shirt

CAPTCHA in our eyes
Security
Security Professionals
Programmer
Functionality
Ease of use
End users

Slide-to-fit Captcha
§ The good
– Similar to ‘slide-to-unlock’ type authentication
– It’s user-friendly and with higher successful rate – Works fine with HTML5 without Flash
– I pick it because it responses to attackers
– Opportunity for advertisers and sponsors
§ The bad
– Heavy traffic loading ( ~30 Images)
– Easy to break by nature
– Single tier, single image transformation type

General CAPTCHA breaking method
§ Lock breaking – Bypass
– Skill
– Brute force
http://paxtonlocksmithing.com/blog/2012/02/20/credit-cards-used-to-open-doors/

4 Top LockPicking Tools


http://toool.nl/blackbag/images/itl2000.jpg

General CAPTCHA breaking method § CAPTCHA breaking
– Bypass
– Alternative form
– Skill
– OCR
– Statistic
– Curve-fitting (FFT) – Analytic
– Brute force
– Database matching – Effective brute force

Some academic stuffs
§ Fast Fourier Transform (FFT)
– Calculate how ‘blur’ the image is
§ Histogram
– Distribution of data by frequency (photo lighting) – Used to detect artificial background
§ Longest path-finding
– Opposite to ‘shortest path’ by Dijkstra’s Algorithm – Used to detect how serious the image was twisted

Image CAPTCHA evolution
• No padding • No Blur
Ver 1
Ver 2
• Padding \00 • No Blur
• Padding \00 • Blur
Ver 3
Attack Method
Effectiveness
Alternative Form
Good
Good
Good
Simple Statistic
Great
Poor
Poor
Modified statistic
Great
Great
Poor
FFT
Great
Poor
Poor
Analytic (Path, BG)
Great
Great
Great

Alternative form
§ According to W3C Web Content Accessibility Guide (WCAG 2.0) aka ISO/IEC 40500:2012
§ Guideline 1.1 Text Alternatives
§ 1.1.1 Non-text Content: All non-text content that is presented to the user has a text alternative that serves the equivalent purpose, except for the situations listed below. (Level A)
§ CAPTCHA: If the purpose of non-text content is to confirm that content is being accessed by a person rather than a computer, then text alternatives that identify and describe the purpose of the non-text content are provided, and alternative forms of CAPTCHA using output modes for different types of sensory perception are provided to accommodate different disabilities.
§ Attack on the weakest alternative form

Alternative form
• GoogleVoiceAPI
• Pre-recordedfemalevoice
• Indicatesthedirectionofcorrect image
• Slideright/left
• Slideslightlyright/left
• Youareontherightimage
• Voiceisveryuser-friendly
• Voicecanberecognizedby Google Speech-to-text and convert to textJ

Image File Size
33000
31000
29000
27000
25000
23000
21000
19000
17000
15000
Set1 Set2 Set3
1 3 5 7 9 11131517192123252729

Optimizing the algorithm § The Key-space
– Traditional CAPTCHA: 1 out of ~36n – (0.00006 % for brute force when n=4)
– Slide-to-fit : 5 out 31
– 16% by blind brute-force
– Correct image at border (1-3 or 28-31) is about 7%
§ Use HTTP HEAD instead of GET
– Image size was included in header – Bandwidth saved for 99%
§ Get only partial of the whole image set
– Getting min of 5 sample images, 95% of answers are correct – All linear transformation can be solved by shortcut

Image File Size with \00 Padding
35000
30000
25000
20000
15000
10000
5000
0
1 3 5 7 9 11131517192123252729
Total Size Image Data \00 Padding

Contrast Detection

Contrast Detection § Rule #1
– Contrast of an image will reduce when it’s processed with lossy- compression
§ Rule #2
– Contrast is calculated by differences of adjacent image points
§ Rule #3
– Contrast didn’t care about color
§ Rule #4
– Image with higher sum of contrast is usually sharp

Contrast Detection
Inspected images
Contrast

Well, we make the correct image “not that contrast”
by lossy JPEG compression

Image File Size with \00 Padding & not that contrast
35000
30000
25000
20000
15000
10000
5000
0
1 3 5 7 9 11131517192123252729
Total Size Image Data \00 Padding

Image File Size with \00 Padding & not that contrast
500 300 100
-100 -300 -500 -700
1 3 5 7 9 11 131517192123252729
Image Data delta(Image Data)

Well, we make the ALL images
in similar size
by lossy JPEG compression with target size

// Generate JPG file with targeted file size
// jpg_size.py
target_size = sys.argv[1]
jpg_ql = 0
jpg_qh = 100
ε = 200 // bytes
steps = 10
while (steps >0):
current_quality = (jpg_ql+jpg_qh)/2
current_size = sizeof(jpg_compress(img, current_quality))
if ( abs(current_size – target_size) < ε ): break if ( current_size > target_size ): jpg_qh = current_quality
if ( current_size < target_size ): jpg_lh = current_quality steps-=1 output = jpg_compress(img,current_quality) // Generate JPG file with ranged random target file size μ = 80000 // mean of target size σ = 400 // standard deviation of target size target_size[i] = μ + σ*(random.random()) 18.868K/20K 68.856K/70K 79.479K/80K Org: 679.54K Target size 80K w/ sd 400 80400 80300 80200 80100 80000 79900 79800 Image Data Image Data 1 3 5 7 9 11131517192123252729 Analytic § Solution #1 - The background - Background need to be filled when twisted - Complementary color or patterns can be detected § Solution #2 - The boundary - Twisted image got longer boundary § Solution #3 - The differences - Side images are tended to converge to original image, - Σ(|Δ(img[i]–img[i+1])|)convergestominimum near correct image - Compare data uses all colour data Do You Have Any Questions? Contact us at: contact@nexusguard.com Page § 30