Presentation by Dan Kaminsky at the 2014 DigiCert Security Summit.
Video recording: https://vimeo.com/112424787
About Dan Kaminsky
Chief Scientist, White Ops
Dan Kaminsky has been a noted security researcher for over fifteen years and has spent his career advising Fortune 500 companies such as Cisco, Avaya, and Microsoft. He spent three years working with Microsoft on their Vista, Server 2008, and Windows 7 releases.
Mr. Kaminsky is best known for his work finding a critical flaw in the Internet's Domain Name System (DNS), and for leading what became the largest synchronized fix to the Internet's infrastructure of all time. Of the seven Recovery Key Shareholders who possess the ability to restore the DNS root keys, he is the American representative. He is currently chief scientist and cofounder of White Ops, a firm specializing in detecting fake users on websites.
https://www.digicert.com/events/summit-2014/
2. 15 Years In Defensive Security
• Boy is my moral outrage tired
• Reality of being in one field for this long: Your opinions change
• 1) You decide nothing works or will ever work
• Maybe switch (go back) to offense
• 2) You decide our pablums are just victim shaming
• “You didn’t run AV so that’s why you were hacked”
• Lots more where that came from
• 3) You see everything is weirder than expected. OK
• I’m more in the 2 and 3 camp
3. Some Musings About DNS
• I don’t talk about this much
• It wasn’t actually that pleasant an experience
• Welcome to Infosec
• Public claim: “DNS vulnerability affects ISPs and Enterprises, allowing all
their web and email traffic to be hijacked. Here’s a nice easy to install
patchset for whatever you’ve got, to get this fixed”
• Total bandaid, but you know, bandaids in fact stop infections
• Private reality: Was way more concerned about social networks (Forgot
My Password) and certificate authorities
• Domain Validation for certificates puts the trust on the DNS infrastructure…not good
if I control that
• May have been my first interaction with Digicert
• Pretty sure I spoke to every CA on the planet
4. The Woes Of The White Hat
• What was fun: Being sneaky!
• Yes, most of what we do isn’t public anyway
• This was leveraging the tooling we could build to protect some very public victims, to
protect some very private endpoints as well
• What was surprising: Duct Tape and Baling Wire
• “…uh, Dan, how do we find these name servers we need to patch?”
• Once an IP becomes a DNS server, it must stay one forever or who knows what
breaks
• God help us if 4.2.2.1 ever stops being a trustworthy name server
• Hard to secure what you don’t know exists
• Networks aren’t really designed, they just kinda grow
• It’s not that fungus looks like roads. It’s that roads (and all systems, really) grow like
fungus.
5. What Does It Mean To Mitigate
• It does not mean you tried really hard, and that’s good enough
right?
• Once knew a random IT guy who was about to be promoted to director, so he
could be the guy in the hotseat when the $250M project was accepted as a
failure
• Would be fired in six months in “disgrace”
• A nonzero percentage of CISOs exist to take the fall when an attack is
successful
• You can solve the problem or you can solve the metric
• Mitigation means two things
• 1) An attacker doesn’t know his attack will work
• 2) A defender doesn’t know everything his defense works against
6. The Fog Of Cyberwar Is Real And Knowledge
Is Fleeting
• People didn’t know what name servers they were running
• There’s a lot more ignorance where that came from
• BYOD == Infrastructure Ignorance (and isn’t necessarily bad)
• We never know most of the vulnerabilities we’re exposed to
• Even if every CVE is obsessively cataloged and mapped to internal
infrastructure, most bugs never get a CVE
• Anyone who’s lived in a bug tracker for a major product has seen 100x the security vulns
that have ever gone public
• We hear about “four 0days being burned” and shrug mightily
• Real world pen testing is the live discovery of 0day
• Against both web apps (which expose enough internals for realtime discovery) and
systems
7. Mitigation Requires Grouping Into Classes
• Classic mitigations against memory corruption: Stack cookies and
ASLR
• “All of these bugs write blindly into memory, let’s make it so they don’t know
what to write”
• Even if there’s some exploit scenarios where there’s also an information leak,
there’s exploit scenarios where there aren’t
• Even if there’s some exploit authors who can find the gaps in ASLR (and
there’s always something at a fixed address somewhere) maybe there are
exploit authors who can’t
• There’s generally fuzziness in everything
8. The Ability To Mitigate Is Part Of The Primary
Defender’s Advantage
• The Primary Defender’s Advantage: We get there first.
• (Like all advantages, this is under attack too.)
• We select the parts
• We deploy the teams
• We have actual C&C, by design
• Defenders making the first move set the rules of the game and are
under no obligation to make those rules fair
• In fact it’s your job not to – you have legitimate users (employees, customers)
and you have illegitimate users (hackers)
• Hackers don’t get to file trouble tickets
9. A Major Mitigation Theme: It’s Not A Puzzle,
It’s A Game
• Chess puzzles are hard
• Put me in a room for a month and offer me a million dollars once I
beat the puzzle, and I’ll beat the puzzle
• Put me in a room for a month with Garry Kasparov, and no amount of
money will make me a better chess player than him
• Both scenarios are hard, but I can predict defeating one of them
• Attacks are mitigated when there are consequences to failing them
• That consequence includes (and generally is) detection
10. The World’s Simplest IE Exploit
• <iframe src="foo.exe"></iframe>
• No, of course this doesn’t execute foo.exe zero click
• A sequence of popups occurs gating access to RCE
• The popups are not actually 100% effective
• Sometimes, a user is tricked into granting RCE
• Sometimes, a user isn’t tricked, they actually want to install Chrome or Creative Cloud or,
you know, software for their computer
• The popups are not actually 0% effective
• Entire classes of attacker wouldn’t even consider a prompting exploit no matter how
reliable it occasionally is
11. Zane Lackey’s Discovery @ Etsy:
Exploitation Is Not Instantaneous
• The traditional model: Knowledge of a vulnerability leads immediately
to generation of a perfect exploit that has no side effects
• Reality: Exploits are software too. All software takes time to build,
takes testing to make effective, and has bugs forever.
• Reality For The Web: Software has gone in-house, with continuous
deployment (i.e. no big testing phases) and obsessive monitoring to
survive that
• Obsessive monitoring = C&C
• Exploit development creates unique log entries, in the time after a
vulnerability is identified before sufficient refinement has allowed a
compromise to unfold
12. Attacker Refinement Detection
• Let’s write a SQL Injection attack
• 1) Find an endpoint that’s querying the database using a string you partially populate
• 2) Populate with crap until the SQL doesn’t parse properly and the page returns a 500 error
• 3) Refine the crap until the SQL parses and executes something nasty
• It ain’t only the attacker who can see that 500
• It ain’t only the web server that’s experiencing an error
• The SQL parser is expressing a unique failure mode
• Alert on this! It’s not happening all the time (ideally) and it’s predictive of an exploit en
route!
• This is what it means to get there first and make a game of it
• The attacker is penalized for not being able to skip step 2
• There are much more interesting signals down this path, and continuous
deployment means we actually have these signals
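The detection idea on this slide can be sketched in a few lines: watch for 500s that coincide with SQL parse failures in the application log, and flag the clients generating them. This is a minimal illustration, not any product’s API — the log formats, the error patterns, and the `refinement_alerts` helper are all assumptions for the sake of the example.

```python
import re
from collections import Counter

# Hypothetical app-log patterns emitted by a SQL parser choking on
# attacker-supplied garbage (step 2 of the attack above).
SQL_PARSE_ERROR = re.compile(r"syntax error at or near|Unclosed quotation mark")

def refinement_alerts(access_lines, app_lines, threshold=3):
    """Flag client IPs that trigger repeated 500s while SQL parse
    failures appear in the app log -- the refinement signature.
    access_lines: hypothetical "<client_ip> <status> <path>" records."""
    parse_failures = sum(1 for line in app_lines if SQL_PARSE_ERROR.search(line))
    if parse_failures == 0:
        return []
    per_ip = Counter(
        line.split()[0]
        for line in access_lines
        if line.split()[1] == "500"
    )
    return [ip for ip, n in per_ip.items() if n >= threshold]
```

The point of the threshold is exactly the slide’s point: this signal is rare in normal operation, so even a crude counter is predictive of an exploit en route.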
13. The Hope
• We’re running complex systems across large numbers of nodes
servicing huge numbers of users from a small number of centralized
systems anyway
• We’re doing deep log aggregation and automated system deployment
anyway
• It’s not just that software is eating everything. We’re generating so
much data that we’re all getting into Big Data Analytics anyway
• This is the framework in which mitigation expands from point fixes
on desktops to global fixes across systems
14. A Scheme I’m Exploring:
DDoS Mitigation via Stochastic Tracing
• The Defcon Dilemma
• Brute Force
• Elegant Solution
• Distributed Denial of Service attacks keep getting bigger and bigger
• One >200Gbps flood in 2013
• Fourteen >200Gbps floods in 2014
• Traditional DDoS Defense
• Bigger pipes, sometimes borrowed (via Prolexic)
• Assumes core will always be massively overprovisioned compared to the
edges “where attackers live”
• The Internet mocks all topological assumptions
15. Two Classes of DDoS: Direct and Spoofed
• Direct
• Large numbers of compromised hosts making repeated requests of some
resource
• Full TCP/HTTP connections, “Layer 7 DDoS” against application layers
• Annoying but not unmanageably large
• Spoofed (and Amplified)
• DNS and NTP blindly reply to small requests with large responses
• DNSSEC only makes this slightly worse; the problem is in DNS itself (and all the floods are
DNS floods)
• If requests appear to come from a particular victim, responses will hit
innocent parties
• Millions to tens of millions of IPs can participate in this amplification
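The amplification arithmetic behind these attacks is simple: the multiplier is response size over request size. The byte counts below are illustrative order-of-magnitude figures, not measurements — real values vary by server, record, and protocol version.

```python
def amplification_factor(request_bytes, response_bytes):
    """Bandwidth amplification: bytes reflected toward the victim
    per byte the attacker actually sends (spoofed as the victim)."""
    return response_bytes / request_bytes

# Illustrative sizes: a small DNS ANY query eliciting a large
# DNSSEC-signed response, and an NTP monlist request eliciting
# many kilobytes of replies.
dns = amplification_factor(64, 3000)     # roughly tens of times larger
ntp = amplification_factor(234, 48000)   # roughly hundreds of times larger
```

Multiply a botnet’s modest outbound bandwidth by factors like these and the “millions of IPs” line above stops sounding abstract.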
16. Why Spoofed Floods Are So Nasty
• Suppose a hundred thousand cars all take the fastest route from San Francisco to
Seattle
• That route will get lots of traffic, will slow to a crawl, but won’t affect (say) Vegas to Denver
• Suppose the same hundred thousand cars make an extra stop first – some go to
Phoenix, some go to Vegas, some go to Denver, some go to Sacramento, and then
everyone goes to Seattle
• Also, like five cars join them in each new city
• Now, San Francisco has disgorged itself of vehicles via all possible routes
• Maximum outbound bandwidth
• Now, Seattle is absorbing traffic from all possible routes
• Maximum inbound bandwidth
• Seattle has no way of knowing all of these cars are arriving because of
shenanigans in SF
• That’s what we’re doing manually now
• That’s what we need to be doing automatically in the future
17. Stochastic Tracing In A Nutshell
• One out of a million packets, send a message to the destination saying:
• Hi, I’m this router
• I’m on this network
• I saw this packet going towards you
• For DNS/NTP, “I see a reply is going to be coming back to you”
• Here’s how to let me know if it’s a problem
• Here’s a signature saying who I am
• All networks involved in a DDoS now automatically announce themselves
and provide actionable intelligence to more quickly eliminate the
damaging traffic
• Tracing data arrives in proportion to participation in the attack – if AT&T is sending
100Mpps, they’re sending 100 tracers/sec.
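The sampling scheme on this slide can be sketched directly. Everything below is an assumption for illustration — the field names, the `report.<network>` contact convention, and the signature placeholder are not a defined protocol.

```python
import random

SAMPLE_RATE = 1_000_000  # one tracer per million forwarded packets

def maybe_trace(packet_dst, router_id, network, rng=random.random):
    """With probability 1/SAMPLE_RATE, build the tracer message the
    slide describes: who I am, what network I'm on, what I saw, and
    how to reach me.  `rng` is injectable for testing."""
    if rng() >= 1 / SAMPLE_RATE:
        return None
    return {
        "router": router_id,
        "network": network,
        "saw_packet_to": packet_dst,
        "report_to": f"report.{network}",   # hypothetical contact channel
        "signature": "<sig over fields>",   # placeholder; see slide 19
    }
```

The key property falls out of the sampling rate: tracer volume is proportional to a network’s participation in the flood, so the victim’s inbox of tracers is itself a ranked map of where the attack is coming from.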
18. Caveats
• Not a privacy issue: the tracer goes to the same network the original packet is already going to
• Not a load issue: it adds one packet for every million
• Not a deployment issue (most likely)
• Assumed sneaky tricks were required to sample one out of a million packets
• sFlow/NetFlow can be pressed into service to capture the data and forward it to a
reporting box (one will service an entire network)
• Formats
• Really the question is how does the receiving network want to get this data
• ICMP is proper by IP but maybe hard to acquire. UDP easier, HTTP may be easiest
• Probably going to allow configuration in Reverse DNS space, including directing
reporting flows to IPs other than the ones being flooded
• One meeting to configure reverse DNS vs lots of meetings to get listeners across all company
IP space
19. Playing Chess Against Myself
• What if the attacker spoofs a bunch of fake reports?
• Yes of course they would do that, they’re an attacker
• Could cause a bunch of people to run around thinking they’re under attack by
Verizon w/ spoofed traffic
• They can already do that. But now there’s Stochastic Tracing evidence
• This is why reports should be signed (presumably with something fast / by
DJB)
• Publish key/reporter identifiers in reverse DNS
• Eventually can start thinking about automated responses (rate limiting?) to
DDoS, which are impossible when the process is:
• 1) Human-driven across networks
• 2) Ad-hoc unstandardized rate limit driven within networks
• The solution to DDoS right now is everyone decides some weird subset of traffic to block or
rate limit and we hope that block doesn’t last forever but have no idea
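The signing requirement above can be sketched as follows. The slide suggests fast public-key signatures (“by DJB”, i.e. Ed25519-style); since that needs a third-party library, this sketch substitutes stdlib HMAC with a shared key as a stand-in, with the verification key published — per the slide — via reverse DNS.

```python
import hmac, hashlib, json

def sign_report(report: dict, key: bytes) -> dict:
    """Attach an authenticator so spoofed tracer reports can be
    discarded.  HMAC here is a stand-in for a real signature scheme."""
    body = json.dumps(report, sort_keys=True).encode()
    sig = hmac.new(key, body, hashlib.sha256).hexdigest()
    return {**report, "sig": sig}

def verify_report(signed: dict, key: bytes) -> bool:
    """Recompute the authenticator over everything but the sig field
    and compare in constant time."""
    body = {k: v for k, v in signed.items() if k != "sig"}
    expected = sign_report(body, key)["sig"]
    return hmac.compare_digest(signed.get("sig", ""), expected)
```

With this in place, an attacker spoofing “Verizon is flooding you” reports fails verification against the key published for Verizon’s address space.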
20. Thinking Systemically
• A quick interruption: I’m not (just) trying to pitch my own tech to you
• I’m trying to express a way of thinking about problems
• Our attackers operate on the micro scale. We have the freedom (if
we’re willing to cooperate) to play at the macro scale. What if the
Internet, as a global system, collaborated automatically to trace
traffic flows that threaten the stability of the platform?
• There’s an infinite number of ways to cause traffic to explode across the wires
– an infinite number of attack modes to mitigate
• But all traffic has to come from somewhere
• Dramatically improving the traceability of DDoS could significantly reduce its
prevalence
21. Secure By Default: Making Sexy Boring
• I worry about the code we build the Internet with
• Network hardware has an economic model
• Carriers have an economic model
• When’s the last time you paid for a compiler?
• Without installing the ask.com toolbar
• Major successes in security have come from making things automatic
and almost boring
• Browser updating across IE, Firefox, and Chrome is now so silent that when
we see old browsers, they’re much more likely to be bots
• Security work requires complicated, fairly boring stuff to be championed and
designed and written and tested, and it’s often some of the most boring stuff
to the uninitiated
22. There Are Gaps
• It’s 2014 and insecure random number generation is still the standard in
every major programming language
• Security people bicker
• Everyone else just optimized for speed and included whatever wonktacular package
the security people finally coughed up
• This is embarrassing, and repeatedly causes exploitable vulns in production systems
• Random Numbers are not some philosophical trap
• Given any number of previous numbers, can’t predict the next one
• Given any number of future numbers, can’t calculate this one
• We have a thousand ways of doing this, they all work fine, yet no major programming
language does it by default
• “liburandy” being released probably in December to start fixing this, but we may just
submit bugs to shell back to /dev/urandom or CryptGenRandom
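This talk is from 2014; Python has since grown a `secrets` module (3.6, 2016) that provides exactly the default the slide asks for. The contrast with the language’s general-purpose PRNG makes the two properties above concrete:

```python
import random, secrets

# random.Random is a Mersenne Twister: after observing enough outputs
# (624 32-bit words), its entire future stream can be reconstructed.
# Fine for simulations; exploitable when used for tokens.
weak_token = "%032x" % random.getrandbits(128)

# secrets draws from the OS CSPRNG (/dev/urandom, CryptGenRandom),
# satisfying both bullets above: past outputs don't predict the next
# one, and future outputs don't reveal this one.
strong_token = secrets.token_hex(16)
```

Both calls produce a 32-hex-digit token; only one is safe to hand to an attacker who can request as many as they like.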
23. Eliminating Bug Classes: Use After Free
• HTML5 has mostly become the language for front end development
• Depends on browsers, easily the most exposed attack surface on the planet
• Pinging every IP address is still mildly controversial. I run one way or another on
most browsers every day via ads.
• “Download and execute my unaudited software immediately and run it under a
security model that’s…actually pretty good.”
• Most common bugs in browsers are Use After Free
• Memory management is hard
• Memory management when an attacker has interactive access to your object tree is
really hard
• Much easier to attack a browser than to attack a file parser like Office – much more fine
grained control
• Not only most common, but most common undiscovered
24. Google and Microsoft are on it
• Google: Typed Heap
• Most damage from Use After Free comes when a pointer of one type is accessed as if
it was another
• Write a value as if a pointer was a Table, and then jump to a method as if it was an Image.
Where will that jump go?
• So let’s just make sure Tables are allocated together, Images are allocated together,
etc.
• Microsoft: Non-deterministic heap
• What if the attacker didn’t know when they’d successfully forced memory reuse?
• Good: Taking advantage of an attacker’s requirement for reliability. Many (not all)
attackers are uninterested in attacks that might fail.
• Bad: Attackers aren’t the only parties that require reliability, and pathological
scenarios where you really need to be reusing memory at a fast clip actually happen
in the field.
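The type-partitioning idea can be shown with a toy model — this is an illustration of the concept, not Chrome’s actual allocator. Each type draws from its own pool, so a freed Table slot can only ever be reused by another Table, never reinterpreted as an Image.

```python
from collections import defaultdict
from itertools import count

class TypedHeap:
    """Toy model of per-type heap partitioning: freed slots are
    recycled only within their own type's pool."""
    def __init__(self):
        self._free = defaultdict(list)             # type name -> freed slots
        self._next = defaultdict(lambda: count())  # type name -> fresh slot ids

    def alloc(self, type_name):
        free = self._free[type_name]
        # Reuse a slot of the SAME type if one is free, else mint a new one.
        return free.pop() if free else (type_name, next(self._next[type_name]))

    def free(self, slot):
        # Return the slot to its own type's pool only.
        self._free[slot[0]].append(slot)
```

A dangling Table pointer can still be a bug here, but what it points at is always Table-shaped — which removes the type-confusion step most UAF exploits depend on.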
25. My Research (For Firefox, which has nothing,
not even ASLR)
• Iron Heap
• You can’t Use After Free if you don’t free
• That’s crazy if you’re talking about physical memory
• That’s not crazy at all if you’re talking about virtual memory
• Why reuse pointers if you’ve got terabytes and terabytes of address space?
• Unlike ASLR, there’s no bypass here, if there’s no pointer reuse the attacker simply
cannot exploit the vuln, ever
• Caveats
• May only be good for browsers, the one place where we have the most risk
• Requires 64 bit (and Firefox 64 bit support ain’t beautiful)
• May have performance implications
• This is surprisingly controversial
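The “never free” idea can also be shown with a toy model — a sketch of the concept, not the actual Firefox work. Addresses are handed out monotonically; freeing releases the backing object (so physical memory can be reclaimed) but retires the address forever, so a dangling reference can never alias a new allocation.

```python
from itertools import count

class IronHeap:
    """Toy model of an allocator that never reuses addresses."""
    def __init__(self):
        self._addr = count(0x10000)   # illustrative base address
        self._live = {}               # address -> live object

    def alloc(self, obj):
        addr = next(self._addr)       # always fresh; addresses are never recycled
        self._live[addr] = obj
        return addr

    def free(self, addr):
        # Drop the object, but the address stays retired for good.
        self._live.pop(addr, None)

    def deref(self, addr):
        if addr not in self._live:
            raise RuntimeError("use-after-free caught: address retired")
        return self._live[addr]
```

Unlike ASLR, there is nothing to bypass: with no pointer reuse, a use-after-free dereference can only hit a retired address, never a new victim object.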
26. Testing Mitigations, A Proper Way
• 1) Mine Firefox bug database for UAFs discovered in 2012 and 2013
• 2) Deploy secure allocator on Firefox from 2012
• 3) See what percentage of then unknown UAFs now can’t function
• This doesn’t work if Firefox from 2014 is completely different from
Firefox from 2012, but it’s not
• The challenge is showing that a mitigation has made an attack
impossible / vastly more difficult, and not merely that it made a
particular exploit ineffective. UAF is uniquely detectable but many
scenarios aren’t.
• Working on this
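The metric in steps 1–3 reduces to a simple coverage calculation. This is a sketch of the methodology, with the repro harness abstracted behind a callable — the bug mining and the actual retrofitted build are the hard parts, not this arithmetic.

```python
def mitigation_coverage(known_uafs, reproduces_under_mitigation):
    """Fraction of then-unknown UAFs (mined from the 2012-2013 bug
    database) that no longer function when the hardened allocator is
    retrofitted onto the 2012 build.  `reproduces_under_mitigation`
    stands in for an actual repro harness."""
    blocked = [bug for bug in known_uafs if not reproduces_under_mitigation(bug)]
    return len(blocked) / len(known_uafs)
```

The subtlety from the last bullet lives in the harness: it must test whether the *bug class* still functions, not merely whether one known exploit still lands.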
27. Conclusions
• Mitigations are about embracing unpredictability – making things fail
poorly for an attacker and pleasantly for the defender
• Leverage continuous deployment to bring penalties to the attacker who
fails at his first attempt, because very often he will
• Stochastic Tracing may make it substantially easier to mitigate DDoS attacks
• Random number bugs can be killed pretty easily, and Use After Free in
browsers is next
• It is possible to test mitigations by using code of an era against vulns
discovered after, but one must take care to differentiate attack resistance
from one-off exploit defense