[CB19] Shattering the dark: uncovering vulnerabilities of the dark web by Takahiro Yoshimura, Ken-ya Yoshimura

SHATTERING THE DARK
CODE BLUE 2019, BLUEBOX 4

WHO WE ARE
▸ Ken-ya YOSHIMURA (@ad3liae)
▸ Takahiro YOSHIMURA (@alterakey)
▸ Security researchers
▸ Monolith Works Inc. CEO/CTO
https://monolithworks.co.jp/

WHAT WE DO
▸ Security research and development
▸ iOS/Android Apps
→Financial, Games, IoT related, etc. (>200)
→trueseeing: Non-decompiling Android Application Vulnerability Scanner
[2017]
▸ Windows/Mac/Web/HTML5 Apps
→POS, RAD tools etc.
▸ Network/Web penetration testing
→PCI-DSS etc.
▸ Search engine reconnaissance
(aka. Google Hacking)
▸ Whitebox testing
▸ Forensic analysis
▸ Research
→Clairvoyance: concurrent lip reader [2019]

WHAT WE DO
▸ CTF
▸ Enemy10, Sutegoma2
▸ METI CTFCJ 2012 Qual.: 1st
▸ METI CTFCJ 2012: 3rd
▸ DEF CON 21 CTF: 6th
▸ DEF CON 22 OpenCTF: 4th
▸ Talks:
DEF CON 25 Demo Labs
CODE BLUE 2017
DEF CON 27 AI Village etc.
DEFCON 2016 by Wiyre Media on flickr, CC-BY 2.0

RELATED WORKS
▸ Web application vulnerability scanners
▸ Manual: Burp Suite, ZAP etc.
▸ Automatic: WebInspect etc.

WHAT IS THE DARK WEB?
▸ Anonymized Web on (mostly) Tor
▸ Pure freedom and anarchism
▸ Hard-ish to identify users
→ CAPTCHAs are often deployed
▸ Traffic routes are randomized
→ Rather high RTTs
Onions by Mike Mozart on flickr, CC-BY 2.0

JOKER’S STASH
▸ Fake credit card market?

PREPARATION - TRADITIONAL
▸ Manual
▸ Crawl and build data flows:
Tedious, error-prone, and not repeatable
▸ Automatic
▸ Spider:
Not so comprehensive: insufficient coverage

SHATTER: THE IN-BETWEEN BEAUTY
▸ Our answer: Shatter
▸ Semi-automatic
▸ Repeatable
▸ Comprehensive
Shattering by chiaralily on flickr, CC-BY-NC 2.0

PREPARATION - SHATTER
▸ Manually crawl, mark, and map
→ “Target maps”
▸ Edit target maps and go
▸ Target maps describe scans
▸ Marked requests will be recognized as
“targets”
▸ Data flows are mostly deduced automatically,
hence semi-automatic
▸ Same map gives the same scan, hence repeatable
Planning by Jeremy Keith on flickr, CC-BY 2.0

SHATTER TARGET MAP
▸ Terse, readable YAML files
▸ Composed of:
▸ Analyses: What we should do
▸ Sessions: How we should do it
▸ Identities: Who we should be
▸ Targets: Whom we approach
▸ Flows: How we deduce parameters (opt.)
▸ Exploits: What we should do on findings
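
A rough sketch of what such a map might look like once loaded, written as the Python structure a YAML loader would produce. Only the six section names come from the slide. Every key and value inside them, the choice of which sections are mandatory, and the `validate` helper are assumptions made for illustration, not Shatter's actual schema.

```python
# Hypothetical target map. Section names follow the slide; all contents
# (field names, values, the proxy address) are invented for illustration.
REQUIRED_SECTIONS = ("analyses", "sessions", "identities", "targets", "exploits")
OPTIONAL_SECTIONS = ("flows",)  # the slide marks flows as optional

target_map = {
    "analyses": ["sqli", "xss"],
    "sessions": [{"name": "default", "proxy": "socks5h://127.0.0.1:9050"}],
    "identities": [{"name": "buyer", "username": "alice"}],
    "targets": [{"method": "POST", "path": "/login", "params": ["user", "pass"]}],
    "exploits": [],
}


def validate(tm):
    """Return a list of problems; an empty list means every section is accounted for."""
    problems = [f"missing section: {s}" for s in REQUIRED_SECTIONS if s not in tm]
    known = set(REQUIRED_SECTIONS) | set(OPTIONAL_SECTIONS)
    problems += [f"unknown section: {s}" for s in sorted(tm) if s not in known]
    return problems
```

Because the map is plain data, re-running the same file reproduces the same scan, which is the repeatability property the deck emphasizes.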

ATTACK PLAN / EXECUTE
▸ Data flow map
▸ Flows are wholly deduced
▸ Massive parallel scan
→ combats high RTTs
▸ Scanner is ZAP-compatible
(for now)

DEMO 1
AUTOMATIC
EXPLOITATION ATTEMPTS

AFTERMATH
▸ Insanely old middleware
→ Automatic exploitation attempt returned HTTP 500
▸ Operator identity:
“Evgenij Sokolov”,
“Bertrand Rasse”, possibly etc.
omerta.sup@gmail.com
▸ Operator works:
http://omerta.wf/ etc.
▸ cf. omerta (n)
1: a code of silence practiced by the Mafia; a refusal
to give evidence to the police about criminal activities

NIGHTMARE
▸ Black market
▸ Successor to Dream Market?

PREPARATION - TRADITIONAL
▸ CAPTCHA
▸ Potential showstopper

PREPARATION - SHATTER
▸ CAPTCHA
▸ Parameters can be deduced with code blocks
→ NN-based solvers can be attached!

CAPTCHA 102
▸ Recognizing glyphs in an image
▸ Hard to solve algorithmically
▸ 3-dimensional distortion
▸ Noise

LEARN TO RECOGNIZE
▸ Image classification problem
▸ CNN
Convolutional Neural Networks
▸ Supervised learning model
▸ Similar to visual cortex
▸ Good at spatial pattern recog.
▸ Robust against distortions and shifts
Typical CNN architecture by Aphex34 on Wikipedia, CC-BY-SA 4.0

LEARN TO RECOGNIZE
▸ For 5-chars:
(10+26)^5 → more than 10^7 patterns
▸ Cannot be solved at once
▸ Just classifiers
Typical CNN architecture by Aphex34 on Wikipedia, CC-BY-SA 4.0
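
The count above can be checked directly. The short calculation below contrasts treating the whole 5-character answer as one class with classifying one glyph at a time; the variable names are ours.

```python
CHARSET_SIZE = 10 + 26   # digits plus letters, as on the slide
LENGTH = 5               # characters per CAPTCHA

whole_string_classes = CHARSET_SIZE ** LENGTH   # one class per possible answer
per_glyph_classes = CHARSET_SIZE                # one class per character

print(whole_string_classes)  # 60466176
print(per_glyph_classes)     # 36
```

A classifier with roughly 6 × 10^7 output classes is impractical, while a 36-way classifier applied five times is not, which motivates splitting the image into glyphs first.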

DIVIDE AND CONQUER
▸ OpenCV2
▸ De-speckling
▸ Extracting glyphs
▸ Errors due to lack of spacing
→ignoring for now

BREACH PLAN
▸ OpenCV2
▸ Glyph extraction
▸ CNN
▸ Glyph classification
Chess Teacher by JB Kilpatrick on flickr, CC-BY 2.0

BREACH PLAN?
▸ What should we learn?
▸ Synthesized with generators
(tag=parameters)
▸ Gathered truths
(tag=pre-coordinated truths)
Question by Florence Ivy on flickr, CC-BY-ND 2.0

HUMANS TO SAVE US
▸ Anti-Captcha
▸ CAPTCHA recognition service run by
humans
▸ Gathered images and tags
→Now we can learn
▸ Human powered…? but:
▸ Tedious to recon generators
▸ Of course Shatter can use AC directly

GRAB THEM OUT
▸ Let’s gather CAPTCHAs
▸ We need ~2000
▸ High RTT!
(2 s or more)
Grab by Rutger Tuller on flickr, CC-BY 2.0

GRAB THEM OUT!
▸ asyncio super-parallel grabber
→No mercy
▸ 2000 imgs / ~48s
(24ms/img)
▸ Throughput is not so bad
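
To see why concurrency helps here: if each fetch took about 2 s, 2000 sequential fetches would need about 4000 s, so finishing in roughly 48 s implies on the order of 80 requests in flight at once. The sketch below shows the general asyncio pattern with a bounded number of concurrent fetches. It is not the authors' grabber: the network call is replaced by `asyncio.sleep`, and `fetch_captcha`, `grab_all`, `run_demo`, and all parameter values are hypothetical.

```python
import asyncio
import time


async def fetch_captcha(i, rtt):
    # Stand-in for a real HTTP request over Tor; only the latency is simulated.
    await asyncio.sleep(rtt)
    return f"captcha-{i}".encode()


async def grab_all(count, rtt, max_in_flight):
    # The semaphore caps how many fetches run at the same time.
    sem = asyncio.Semaphore(max_in_flight)

    async def bounded(i):
        async with sem:
            return await fetch_captcha(i, rtt)

    # gather() returns results in submission order.
    return await asyncio.gather(*(bounded(i) for i in range(count)))


def run_demo(count=200, rtt=0.02, max_in_flight=50):
    start = time.perf_counter()
    images = asyncio.run(grab_all(count, rtt, max_in_flight))
    return images, time.perf_counter() - start
```

With these demo values the sequential cost would be 200 × 0.02 s = 4 s, while the bounded-concurrency version needs only a handful of round trips.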

READ THEM OUT
▸ Read 2000 CAPTCHAs
▸ Out-of-charset reads
▸ Inaccurate glyph extracts
▸ Take only good reads!
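
One way to "take only good reads" is a simple filter over the labelled samples. This is a minimal sketch under two assumptions: labels must be exactly five characters from [A-Z0-9] (following the 5-character and [A-Z0-9] notes elsewhere in the deck), and each sample records how many glyphs were extracted. The function names and the `(label, glyph_count)` layout are hypothetical.

```python
import re

# Assumption: a valid label is exactly five characters from [A-Z0-9].
VALID_LABEL = re.compile(r"[A-Z0-9]{5}")


def is_good_read(label, glyph_count):
    """Keep a sample only if the label is in-charset and matches the glyph count."""
    return VALID_LABEL.fullmatch(label) is not None and glyph_count == len(label)


def keep_good_reads(samples):
    # samples: iterable of (label, glyph_count) pairs (hypothetical layout).
    return [(label, n) for label, n in samples if is_good_read(label, n)]
```

Lowercase or punctuated reads are rejected rather than normalized here; a real pipeline might choose to upper-case them instead.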

DIVIDE AND CONQUER
▸ OpenCV2
▸ Shrink, despeckle, expand
▸ Glyph extraction
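
The idea of despeckling followed by glyph extraction can be illustrated without OpenCV. The sketch below is a plain-Python stand-in, not the authors' pipeline: it finds 8-connected blobs of ink, drops blobs below a minimum area as speckle, and returns bounding boxes from left to right. The `#`/`.` image encoding, the function names, and the `min_area` value are assumptions.

```python
def components(img):
    """Yield each 8-connected blob of ink ('#') as a list of (row, col) pixels."""
    h, w = len(img), len(img[0])
    seen = set()
    for r in range(h):
        for c in range(w):
            if img[r][c] != "#" or (r, c) in seen:
                continue
            stack, blob = [(r, c)], []
            seen.add((r, c))
            while stack:
                y, x = stack.pop()
                blob.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and img[ny][nx] == "#" and (ny, nx) not in seen):
                            seen.add((ny, nx))
                            stack.append((ny, nx))
            yield blob


def extract_glyph_boxes(img, min_area=3):
    """Drop blobs under min_area; return (left, top, right, bottom) boxes, left to right."""
    boxes = []
    for blob in components(img):
        if len(blob) < min_area:
            continue  # treat tiny blobs as speckle noise
        ys = [y for y, _ in blob]
        xs = [x for _, x in blob]
        boxes.append((min(xs), min(ys), max(xs), max(ys)))
    return sorted(boxes)


image = [
    "..........",
    ".##...#.#.",
    ".##...###.",
    ".##.#.#.#.",
    "..........",
]
print(extract_glyph_boxes(image))  # [(1, 1, 2, 3), (6, 1, 8, 3)]
```

The lone pixel in column 4 is discarded as noise. Glyphs that touch each other would merge into a single blob, which is the kind of extraction error that shows up when characters lack spacing.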

DIVIDE AND CONQUER
▸ Samples: 6305
▸ Should be around 10000… but
▸ Dropping glyph mis-extractions
▸ Dropping CAPTCHA mis-reads

RELENTLESS LEARNER
▸ CNN on Keras
▸ N×32×32×1 → 36 ([A-Z0-9])
▸ Preprocessing
▸ resize and thresholding
▸ Normalization: [0.0f .. 1.0f]
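
The preprocessing and label encoding can be sketched in plain Python. This is an illustration, not the authors' code: it assumes an 8-bit grayscale glyph given as nested lists, uses nearest-neighbour resizing, a fixed threshold of 128, and a class order of A-Z followed by 0-9. All of those choices and the function names are assumptions.

```python
import string

CLASSES = string.ascii_uppercase + string.digits  # 36 classes, [A-Z0-9]
SIZE = 32


def preprocess(gray, threshold=128):
    """Nearest-neighbour resize to 32x32, binarise, and scale to [0.0, 1.0]."""
    h, w = len(gray), len(gray[0])
    out = []
    for y in range(SIZE):
        src_row = gray[y * h // SIZE]
        out.append([1.0 if src_row[x * w // SIZE] >= threshold else 0.0
                    for x in range(SIZE)])
    return out


def one_hot(ch):
    """Encode one character as a 36-element target vector."""
    vec = [0.0] * len(CLASSES)
    vec[CLASSES.index(ch)] = 1.0
    return vec
```

Stacking N such 32×32 arrays with a trailing channel axis gives the N×32×32×1 input shape, and the one-hot vectors give the 36-way targets.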

RELENTLESS LEARNER
▸ Keeping effective learning
▸ Small input: 32×32×1
▸ amsgrad (i.e. modified Adam)
▸ Test dataset
▸ 10% of original dataset
▸ Store the model in HDF5 format
→ for continued training
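
A 10% hold-out can be sketched as follows. The deterministic shuffle, the seed, and the function name are assumptions; the sample count of 6305 is the figure reported earlier in the deck.

```python
import random


def split_holdout(samples, test_fraction=0.1, seed=0):
    """Shuffle deterministically and hold out a fraction for testing."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


train, test = split_holdout(range(6305))
print(len(train), len(test))  # 5675 630
```

Fixing the seed keeps the split stable across runs, so accuracy figures from successive training sessions stay comparable.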

LEARN TO BREAK
▸ 50 epochs → 30 min.
TensorFlow 2.0 @ MBP 2017
▸ GPU?
▸ Keras uses one automatically
▸ Only CUDA: MBP falls short :(
Early Learner by Aaron Freimark on flickr, CC-BY-ND 2.0

LEARN TO BREAK!
▸ 99% acc. (even on other datasets)
→Excellent
▸ Recognizes even CAPTCHAs that Anti-Captcha fails on
▸ CNN: should need 500 to 1000 samples/class
▸ 175.1/class in reality
▸ Small dataset :(
Early Learner by Aaron Freimark on flickr, CC-BY-ND 2.0
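
The dataset figures quoted in the deck are mutually consistent, as a quick check shows. The variable names are ours; the numbers are the ones reported on the slides.

```python
captchas, chars_per_captcha, classes = 2000, 5, 36
kept_samples = 6305

expected = captchas * chars_per_captcha   # glyphs if nothing were dropped
yield_ratio = kept_samples / expected     # share of glyphs that survived filtering
per_class = kept_samples / classes        # average samples per character class

print(expected, round(yield_ratio, 4), round(per_class, 1))  # 10000 0.6305 175.1
```

So about 63% of the possible glyphs survived filtering, leaving well under the 500 to 1000 samples per class the slide cites as desirable.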

CAPTCHA COMPROMISED
▸ Rarely misses on another dataset

PREPARATION - SHATTER (2)
▸ Attach to target map as a code block
▸ Feed the solver, return the result into the
parameter
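
Conceptually, attaching a solver means that one request parameter is computed by a callable instead of being fixed. The sketch below shows that shape only. It is not Shatter's API: `solve_captcha`, `deduce_params`, the parameter names, and the fixed return value are all hypothetical.

```python
def solve_captcha(image_bytes):
    # Placeholder for the glyph-extraction + CNN pipeline; returns a fixed
    # answer so the sketch runs without a trained model.
    return "AB3D9"


def deduce_params(static_params, code_blocks, response_body):
    """Fill each parameter from a fixed value or by running its code block."""
    params = dict(static_params)
    for name, block in code_blocks.items():
        params[name] = block(response_body)
    return params


params = deduce_params(
    {"user": "alice"},
    {"captcha": solve_captcha},
    b"<png bytes>",
)
print(params)  # {'user': 'alice', 'captcha': 'AB3D9'}
```

The same hook shape would accommodate other computed values, such as a one-time code, by swapping in a different callable.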

ATTACK PLAN / EXECUTE
▸ Data flow map
▸ CAPTCHAs are solved in realtime

DEMO 2
AUTOMATED SCAN,
SOLVING MULTIPLE CAPTCHAS

AFTERMATH (2)
▸ We have breached CAPTCHA protection for
Nightmare
(again)
▸ Their CAPTCHAs are rather weak
(again)
No lock 2 by Jens Eilers Bischoff on flickr, CC-BY 2.0

FREE AS FREEDOM
▸ http://sha.tter.io/
(GitHub repos will be announced there)
▸ AGPL-3: It remains free for good
▸ Currently under heavy work on fixes and ..
▸ We are striving to make it not only useful but
also essential
Freedom by Mochamad Arief on flickr, CC-BY-NC-ND 2.0

CONCLUSION
▸ The dark web
▸ Anonymized Web
▸ Hard to name attackers
▸ CAPTCHAs are often deployed but _not_
effective!
▸ Related works are not sufficient
▸ Automatic: non-comprehensive
▸ Manual: non-repeatable
IMG_2988s by 不憂照相館 on flickr, CC-BY-NC-ND 2.0

CONCLUSION
▸ Our answer: Shatter
▸ Semi-automatic
Crawl, mark, map, edit — you do
Scan — we do
▸ Repeatable
Same map gives the same scan
▸ Comprehensive
Because you crawl
▸ Beauty lies in “semi-autonomy”
Shattering by chiaralily on flickr, CC-BY-NC 2.0

CONCLUSION
▸ Shatter can…
▸ Deduce params automatically, or with some
code
(solving CAPTCHAs, 2FAs, …)
▸ Fingerprint and stage attacks
▸ Actively exploit vulnerabilities
▸ Cooperate with other toolchains for deeper
analysis/exploitation
Mise en scène nocturne by Jean-François Renaud on flickr, CC-BY-ND 2.0

CONCLUSION
▸ Shatter is
▸ At: http://sha.tter.io/
(GitHub repos will be announced there)
▸ Under AGPL-3: Free as freedom, for good
▸ Stay tuned!
▸ Under heavy work on fixes and ..
▸ Should be available on 12/24/2019
Freedom by Mochamad Arief on flickr, CC-BY-NC-ND 2.0

CONCLUSION
▸ For hidden service operators:
▸ CAPTCHAs are not effective
▸ Better update your stack
▸ If you do bad things, you must be prepared
to be exposed
Menace by Kilworth Simmonds on flickr, CC-BY-ND 2.0

FIN.
28.10.2019 MONOLITH WORKS INC.
