Bugs Aren't Random

  1. BUGS AREN’T RANDOM: Unifying Building and Breaking in the Modern Age. Dan Kaminsky, Chief Scientist, White Ops
  2. Hello China!
     ● Thank you DEFCON, for supporting me for almost two decades
     ● Thank you Baidu. Only DEFCON could go to China, and you helped make that happen.
     ● Thank you for coming :) This is my first time in your lovely country!
  3. SO!
     ● This is a keynote, so I’m supposed to inspire you
     ● This is a technical talk, so there’s going to be actual lines of code on this actual screen (I promise)
     ● The goal: connect a series of concepts you may never have thought were linked
     ● Consider this a “Skydive”
       – Start with a bird’s-eye view
       – Dive headfirst into the weeds
       – Get ourselves a bug’s-eye view
  4. 60 Frames Per Second
     ● I have a “hobby” around human perception
       – Ask me about color blindness sometime
     ● There’s a myth that people see at 60 frames per second
       – Works OK for video games, makes people quite sick in VR
     ● Not real
       – Obviously mythological: we don’t see in frames at all, our eyes jiggle around a lot and our brain dreams something up
       – It’s why we can dream
       – Lots of experiments show the average person seeing well past 60
     ● But why 60?
  5. My Traditional Answer to “Why 60fps?”: “1940s television technology, that’s just how fast TVs used to run.” Correct, but incomplete.
  6. My Traditional Answer
  7. Oh. I’m used to technology having its own clocks (quartz crystals). Turns out you can just use the power lines as a clock. (Can != Should)
  8. So... We didn’t make TVs 60fps for human vision. We made TVs 60fps because there was a 60 Hz clock handy. Why was it 60 Hz?
  9. Power Was 60 Hz Because 1890s and Physics Were
     ● “The induction motor was found to work well on frequencies around 50 to 60 Hz, but with the materials available in the 1890s would not work well at a frequency of, say, 133 Hz.”
     ● (Wikipedia, Utility Frequency)
  10. So... So does 60fps have nothing to do with human vision, and everything to do with 1890s technology?
  11. So... “There is a fixed relationship between the number of magnetic poles in the induction motor field, the frequency of the alternating current, and the rotation speed.” 60 Hz wasn’t just 1890s tech. It’s also physics.
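
The "fixed relationship" quoted above is usually written as the synchronous-speed formula, N = 120 · f / P, in revolutions per minute. The formula is standard electrical engineering rather than something stated on the slides, and the function name below is made up for illustration. A real induction motor turns slightly slower than this figure because of slip.

```python
def synchronous_speed_rpm(frequency_hz: float, poles: int) -> float:
    """Synchronous speed of an AC machine in rev/min: N = 120 * f / P."""
    if poles <= 0 or poles % 2:
        raise ValueError("poles must be a positive even number")
    return 120.0 * frequency_hz / poles


# Frequencies mentioned in the quote on slide 9; pole counts are illustrative.
for frequency in (50, 60, 133):
    for poles in (2, 4):
        rpm = synchronous_speed_rpm(frequency, poles)
        print(f"{frequency:>3} Hz, {poles} poles -> {rpm:.0f} rpm")
```

At 133 Hz even a two-pole machine wants to spin at roughly 8000 rpm, which gives a feel for why the slide ties the choice of frequency to the materials of the day.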
  12. We’re Made Of Physics Too
     ● Human vision comes from the brain
     ● The brain circulates electromagnetic signals
       – Gamma waves work at 25-100 Hz
     ● PURE SPECULATION, but it’s sort of cool to go from:
       – 60 fps = rate human brain implements human vision
       – 60 fps = tricking human vision w/ television
       – 60 fps = timing television with spinning magnets
       – 60 fps = spinning magnets at the same rate as the human brain
  13. You might be thinking
     ● What could this possibly have to do with bugs?
       – We don’t necessarily know why things are the way they are
       – Usually we do things because we’ve been doing them
       – Sometimes what we’ve been doing is good enough, sometimes what we’ve been doing is bad but nobody realizes where the bugs are
     ● I like figuring out why
       – Stay intellectually honest, and you’ll find cool stuff
       – Know you’re speculating!
  14. No Really, Speculative Execution Bugs
     ● Spectre and Meltdown
     ● Best explanation: “Did you go to the coffee shop?” “No!” “Did you go to the bar?” “No!” “Did you go to the club?” “… … … No!”
     ● Saying the same thing, at a different time, is not always saying the same thing.
  15. What Are These Bugs?
     ● There are many variants (which is kind of the idea)
     ● Meltdown
       – Try to read data you’re not allowed to
       – You’re told no, but at the wrong time
     ● Spectre
       – Try to run code you’re not allowed to
       – You’re not allowed to run it, but other things go faster or slower based on what you weren’t allowed to run
     ● What went wrong?
  16. We Assumed You Could Only Detect Cached/Uncached, not Content
     ● Wrong because when you read memory, you can say “Give me this information at (address plus a value between 0 and 255)”
       – You’ll be told no, but now you can check:
         ● “Do you have the value at address+0?”
         ● “Do you have the value at address+1?”
         ● “Do you have the value at address+2?”
       – Fuzzy, but you can check many, many times on a gigahertz processor, and you can flush the cache and start over
         ● CLFLUSH
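
The probing loop described on slide 16 can be sketched as a toy simulation. Nothing here touches real hardware: `ToyCache`, `victim_transient_access`, and `recover_byte` are hypothetical names, the cache is a Python set, and "fast versus slow" is a boolean with injected noise rather than a measured latency. The point is only the shape of the attack: flush, let the forbidden access happen, probe all 256 candidates, repeat, and vote.

```python
import random


class ToyCache:
    """Simulated cache: remembers which probe-array slots are 'hot'."""

    def __init__(self):
        self.hot = set()

    def flush(self):
        # Stands in for CLFLUSH over the whole probe array.
        self.hot.clear()

    def touch(self, index):
        self.hot.add(index)

    def is_fast(self, index, noise, rng):
        # A real attacker would time a memory read; here the answer is
        # set membership, flipped with probability `noise` ("fuzzy").
        hit = index in self.hot
        return (not hit) if rng.random() < noise else hit


def victim_transient_access(cache, secret_byte):
    """The read is refused, but the slot selected by the secret gets cached."""
    cache.touch(secret_byte)


def recover_byte(secret_byte, rounds=50, noise=0.05, seed=0):
    """Flush, trigger, probe all 256 slots, repeat, then take the majority."""
    rng = random.Random(seed)
    cache = ToyCache()
    votes = [0] * 256
    for _ in range(rounds):
        cache.flush()
        victim_transient_access(cache, secret_byte)
        for guess in range(256):
            if cache.is_fast(guess, noise, rng):
                votes[guess] += 1
    return max(range(256), key=votes.__getitem__)


print(hex(recover_byte(0x41)))
```

Each individual probe is unreliable, which is why the sketch repeats the round many times and lets the votes accumulate. That is the "check many, many times" part of the slide.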
  17. We Assumed We Could Make Computers Faster
     ● “Why do we have these bugs? Isn’t this just math?”
       – Lots of nerd shaming
     ● No community in all of technology proves the mathematical correctness of their work more than microprocessor designers
       – They are the industrial market for theorem provers and SAT solvers
       – But you have to prove the right things in the right context
  18. The Same Thing Might Be Predictable or Random Based On Context
     ● A corporation can be relatively predictable
       – Plodding, even
     ● An executive at that corporation can be erratic
       – Might quit tomorrow
     ● His heartbeat however is relatively predictable
     ● An individual heart cell in that heart might not be
     ● All four scales occupy the same point in time and space
  19. NOBUG, MUSTFIX
     ● The theorem provers did not fail when they showed no leakage of information between contexts
       – The right bits went to the right places
     ● The theorem provers weren’t being asked to show there were no timing variations dependent on secrets
       – Most, if not all, timing variation is defined to not exist at the scale being proven
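
A small software analogue of slide 19, offered as an illustration rather than anything from the talk: two comparison routines that return identical answers, so a proof about outputs is satisfied, while the amount of work one of them does depends on the secret. `early_exit_equal` is a hypothetical helper that counts loop iterations instead of measuring wall-clock time, so the effect is deterministic. `hmac.compare_digest` is the standard-library comparison designed to avoid this kind of content-dependent timing.

```python
import hmac


def early_exit_equal(secret: bytes, guess: bytes):
    """Return (is_equal, bytes_compared); stops at the first mismatch."""
    steps = 0
    if len(secret) != len(guess):
        return False, steps
    for s, g in zip(secret, guess):
        steps += 1
        if s != g:
            return False, steps
    return True, steps


secret = b"hunter2!"
wrong_at_start = b"Xunter2!"
wrong_at_end = b"hunter2?"

# Functionally, both wrong guesses are rejected: "the right bits went to
# the right places".
print(early_exit_equal(secret, wrong_at_start))
print(early_exit_equal(secret, wrong_at_end))

# Same answers from the constant-time primitive, without the step count
# depending on where the mismatch is.
print(hmac.compare_digest(secret, wrong_at_start))
print(hmac.compare_digest(secret, wrong_at_end))
```

The step count is the stand-in for time: a guess that is wrong in its last byte takes eight steps to reject, one that is wrong in its first byte takes one, and that difference tells an observer how much of the guess was right.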
  20. The Great Repurposing
     ● We turned a stability boundary into a security boundary, and hoped it would work
       – Historically, most code would crash all the time
         ● “Historically”
       – The game was making sure it only corrupted its own resources
       – The theory was that hackers were just a new source of misbehavior, let’s just isolate them like we isolate everything else
       – Even independent of time, that hasn’t worked amazingly well, but in the context of time...
  21. Hacker misbehavior
     ● Hackers are better behaved
     ● They change smaller things (from a computer’s perspective) that are bigger things (from a human’s perspective)
     ● Spectre and Meltdown change time, which is defined as nonexistent by the microprocessor designer, and made information-carrying for everyone else
  22. Worth Noting
     ● Some of the Spectre/Meltdown exploits rely on the system scheduler timer
       – One tick every 15.6ms on many platforms
       – 1000ms/15.6ms == 64fps :)
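
The arithmetic on slide 22 is easy to check. The second half of the snippet rests on an assumption that is not on the slide: that 15.6 ms is a rounded 15.625 ms, which would make the tick rate exactly 64 per second.

```python
TICK_MS = 15.6  # scheduler tick quoted on the slide

ticks_per_second = 1000 / TICK_MS
print(f"{ticks_per_second:.2f} ticks per second")

# Assumption (not from the slide): the underlying period is 1/64 s.
exact_tick_ms = 1000 / 64
print(f"1/64 s = {exact_tick_ms} ms")
```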
  23. Spectre and Meltdown Leak Bits. You can’t leak bits you do not have.
     ● There is a hidden architectural choice behind these bugs
       – Context Switching
       – “We have one computer that must pretend to be many computers, with many different security levels.”
     ● There is another decision that can be made
       – If you want two security domains, get two computers
       – You know, computers are small now.
  24. Nano-Pi Neo: $20, Quad Core, Gigabit
  25. Up Core: 4-core Intel Apollo Lake, PCIe, $89
  26. Yes, we had to write patches for everybody
     ● Yes, we’re putting these patches everywhere, whether there’s a security boundary crossing or not
     ● But yes, not every individual node has two security domains
       – Sometimes, the only user really is the administrator
       – Sometimes, the administrator is only not the administrator when running a web browser
       – We are sort of getting this information down to where it needs to be in the chip
       – There’s a fair amount of “impedance mismatch”, and a lot of microcode patching right now is just trying to get even the process ID into the branch predictor
  27. Explicit Security Domains Will Come
     ● Security domains are not users
     ● Security domains are not processes
     ● Security domains are not even constrained to a single kernel or a single machine
     ● They’re their own space. All the Spectre/Meltdown goop going on is trying to give the microcode an idea of whose context it is working in. We’ll fake “what security domain” with that... for a while.
  28. Surprising Amount Of Activity Around OS Design Out There
     ● User/kernel is not only not always a real security boundary
     ● User/kernel is actually pretty slow
       – Everything fast gets rid of it
       – DPDK networking running entirely in userspace
       – Kernel Mode Linux from back in the day
       – “Rump Kernels” aren’t kernel-less: they just run full BSD (or even Linux, w/ LKL) kernels in the same memory space as the application
       – HPC is actively working here: mOS, Hermitcore, Kitten, etc.
  29. Why Am I Telling You This?
     ● Security that doesn’t care about the rest of IT is Security that grows increasingly irrelevant
       – Computing in 2023 is not going to look like Computing in 2018
       – Computing in 2018 doesn’t look like what most people think computing in 2018 is
     ● “There’s no such thing as the cloud, there’s just other people’s servers”
       – ...with other people’s pagers.
     ● The scale of computing has completely changed; how we fix our security problems is going to require different viewpoints
  30. The Flipside
     ● If you’re just looking for bugs, look for the things people think don’t matter
       – Attack there.
     ● Bugs aren’t random because their source isn’t random
       – Developers write certain bugs based on what they aren’t thinking about
       – Bug finders find certain bugs based on what they know developers aren’t thinking about
         ● This is not always conscious
         ● It’s usually true, at least in anyone I’ve found that’s good at this
  31. Right about now is a good time to introduce The Catchy Only Vaguely Correct Catchphrase Designed To Spark Interest
  32. There’s no such thing as reverse engineering.
  33. What Do I Mean?
     ● “We only make the car turn left. Those other guys handle the car turning right.”
     ● “It’s just my job to get the plane in the air. If, when, and how it lands: not my problem.”
     ● It’s not that there aren’t different teams. It’s that if you don’t care whether your work affects the other guys... thing’s gonna crash
  34. Thesis: There’s no reverse engineering, there’s no forward engineering. There’s just engineering. There are cultural elements in engineering that block the integration of forward and reverse. The primary one seems to be...
  35. The Problem: Dev vs. Test
  36. Penetration Testing
     ● “Hackers” like to talk about the former
     ● We are a specific branch of the latter
     ● The latter shouldn’t be split off, but it often has to be
       – Everybody always sees their own code for what it should be doing, not for what it actually is
  37. What’s Happening
     ● Large amounts of tooling are isolated to the testers
     ● Creates an enormous bias in developer knowledge: they end up not using tools and patterns that are too “test-y”
     ● Ends up biasing the code they think they can write
       – More technically: ends up biasing their transformations in a particular direction
         ● Compile time influences runtime
         ● Runtime doesn’t update the source to be compiled
       – Less technically: like a car that pulls right
  38. Concrete Example
     ● Fortran is fast (still). Python is slow (still). Except if you use Numba.
       – Finally, a practical environment for transforming standard-ish Python into high perf CPU/GPU code
       – Requires, like all optimizers, knowledge of what types of data it’s supposed to optimize for
       – Games of constraints: if I can constrain what’s coming, I can throw state away and optimize only for those expectations
     ● What happens if I constrain incorrectly? CVE numbers.
     ● That’s why security is involved. Perf and sec are not separate universes.
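
To make "what happens if I constrain incorrectly?" concrete without depending on Numba itself, here is a standard-library sketch. It assumes a hypothetical bounds check that has been specialized for 32-bit signed integers, modelled with `ctypes.c_int32`, which wraps silently on overflow. `BUFFER_SIZE`, `check_specialized`, and `check_general` are invented names, and the example is not taken from the talk.

```python
import ctypes

BUFFER_SIZE = 1024  # illustrative buffer length


def add_i32(a: int, b: int) -> int:
    """Add the way a 32-bit signed machine integer would: wrap on overflow."""
    return ctypes.c_int32(a + b).value


def check_specialized(offset: int, length: int) -> bool:
    """Bounds check optimized under the constraint 'everything fits in int32'."""
    return add_i32(offset, length) <= BUFFER_SIZE


def check_general(offset: int, length: int) -> bool:
    """Same check using Python's arbitrary-precision integers."""
    return offset + length <= BUFFER_SIZE


# Within the assumed constraint, the two checks agree.
print(check_specialized(100, 200), check_general(100, 200))

# Outside it, the sum wraps negative and the fast check waves it through.
big = 2_000_000_000
print(add_i32(big, big))
print(check_specialized(big, big), check_general(big, big))
```

The specialized check is faster precisely because it discards the state needed to notice that its assumption no longer holds, which is the trade the slide describes.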
  39. The Problem
     ● Python is dynamically typed: integers or floating point numbers or strings or whatever are distinguished at runtime
     ● Developer pain required to declare up front what types might pass through
  40. Another Competitor Enters The Arena
     ● PyAnnotate by Dropbox
     ● Monitors types used during test or production, updates the code in-place with annotations
     ● Thus far, this hasn’t been extended to numeric optimization for Numba
     ● Solves the problem that developers don’t actually know the right answers for expected types either
     ● Considered appropriate only for “legacy” because
  41. Because... ...because what? Why shouldn’t runtime influence code?
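
One way runtime could feed back into source, sketched with the standard library only. This is not PyAnnotate's API or implementation: `record_types`, `suggest_signature`, and the `observed` table are hypothetical names. The sketch only shows the idea of watching real calls and proposing an annotated signature from what was actually seen.

```python
import functools
import inspect
from collections import defaultdict

# function qualname -> parameter name (or "return") -> set of type names seen
observed = defaultdict(lambda: defaultdict(set))


def record_types(func):
    """Decorator that logs the runtime types of arguments and return values."""
    signature = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = signature.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            observed[func.__qualname__][name].add(type(value).__name__)
        result = func(*args, **kwargs)
        observed[func.__qualname__]["return"].add(type(result).__name__)
        return result

    return wrapper


def suggest_signature(func):
    """Build an annotated 'def' line from the types seen so far."""
    seen = observed[func.__qualname__]
    params = ", ".join(
        f"{name}: {' | '.join(sorted(seen[name]))}"
        for name in inspect.signature(func).parameters
    )
    returns = " | ".join(sorted(seen["return"]))
    return f"def {func.__name__}({params}) -> {returns}"


@record_types
def scale(values, factor):
    return [v * factor for v in values]


scale([1, 2, 3], 2)
scale((1.5, 2.5), 0.5)
print(suggest_signature(scale))
```

A real tool would then write the suggestion back into the source file. The sketch stops at printing it, which keeps the "runtime influences code" step visible without editing anything on disk.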
  42. The Approach Seems Weird
     ● “Isn’t this Profile Guided Optimization?”
       – No, this actually changes the source code
       – This is also not constrained to performance, i.e. could apply security constraints (probably been some work here)
     ● Pair programming with a machine?
       – I mean, we lean on libraries a lot, to the point a lot of dev is figuring out what legos to stick together
     ● Developers are supposed to know what the system is supposed to do. They’re not supposed to learn what the system is supposed to do by watching it fail!
       – A) That’s totally what they do
       – B) That’s totally what we do
  43. Developer tools usually assume the developer is right
     ● Optimization throws out information unneeded by the present system
     ● If the present system is wrong and a new system is needed, the information about how we deviated from correct to suboptimal may have been thrown out
     ● So that’s how test tools (hacker tools) differ up front. We’re looking for that developer error.
  44. But everyone’s tools are kind of bad
     ● “The difference between reverse and ‘normal’ engineering is whether you have the source code”
       – Assumes the source code is more comprehensible or predictable than the compiled form
       – I knew a guy (who will laugh when I send him this slide) who audited C++ from the compiled binary, because “who can dig through that mess, just give me the binary and I’ll walk through the table myself”.
     ● Ultimately, the more we can monitor the operation of running code in the context of the source code generating it, the faster the loop between misconception and correction can be, and the better the software we’ll build (or the more bugs we’ll find).
  45. Where I’m Going With This
     ● 1) Full system source debugging
     ● If things are so open source, where’s the source? Why am I specifically recompiling things I happen to be interested in, one library at a time, including the kernel?
       – Questions actually do have answers: debugging tools don’t want to take a hard dependency on source being available
       – Went to an SSD developer some years ago; I don’t think there is a single company on the planet with all the source to one of those
       – But I can compile Gentoo from source…
       – Yes, it’s very nice.
       – apt-build on Ubuntu also “kind of works”: good for individual targets
  46. ADB is old and busted, ABD is the new hotness
     ● Always Be Debugging :)
     ● “But what about security boundaries? Am I going to have to type sudo all the time?”
  47. The Problem: Ever get the feeling it’s easier to be root on… someone else’s machine? :(
  48. Alas
     ● Attackers get root for years
     ● You get root, one line at a time
       – It’s still me!
       – No really, still me.
     ● And you get such a variety of software behaviors, too.
  49. Wireshark as user
  50. Wireshark w/ sudo
  51. Linux knows. And hides an incomplete fix...
  52. Coming soon, sudo 2.0: a.sandwich
  53. This is just as silly as sudosudosudosudosudosudo
     ● One quirk
       – Can’t just switch effective user all the time, that’s part of why sudo is a mess
       – Present plan: make permission checks pass, but otherwise keep users what they are
       – Use the switches to control precise parameters (they glow!)
     ● This is actually common behavior, we just don’t notice it
       – VMs are fake roots
       – Containers are fake roots
         ● HUGE reason Docker succeeded
       – Kali Linux is a real root
  54. Jupyter is almost great
     ● Web based reboot of interactive programming
     ● No internal interface for adding packages
       – That would depend on root access!
       – (But VirtualEnv) SMACK
     ● It’s a polyglot environment, supports lots of languages
       – What’s going on?
  55. As usual, there’s actually a reason
     ● There is an answer to why we’re moderately careful doling out root access, and it’s actually not really “we’re afraid of hackers”
     ● Users can actually break their machines pretty easily, and then come to us complaining
     ● One class of fixes involves applying a talent bar, to be able to break the machine, or making as much software as possible not put the machine at risk by rewarding it with being immune to The Prompt
     ● Another class involves... just making it easy to fix the thing
  56. VIMCEPTION
     ● AKA “Fork The Universe”
     ● Boot the running system, into a VM, with the full existing configuration, knowing we can’t break anything
       – Possible?
  57. Why It’s Interesting
     ● It’s Root
     ● Nothing Bad Can Happen
  58. Raspberry Pi: A Root For Every Kid
  59. Clarity
     ● “Get two computers” is also “manage your persistence”
  60. A Hard Question
     ● Why are we vulnerable to ransomware?
     ● “Because the attackers can delete our data”
     ● Why can attackers delete our data? Why can we? Isn’t storage cheap now?
     ● Equivalent for ephemeral installs: why do I have these difficult to protect, expensive to replace persistent installations?
     ● As debugging didn’t want to take a dependency on source, security may not have wanted to take a dependency on zero persistent storage.
       – But that might be the right design to work with, to allow arbitrary “damage” to be done and always be able to return to a known safe state.
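
The "always be able to return to a known safe state" idea on slide 60 can be sketched as a toy versioned store in which writes and deletes append new snapshots instead of destroying old ones. `SnapshotStore` and its methods are hypothetical and not from the talk. In a real system the history would also have to live somewhere the attacker cannot reach, which this in-memory sketch does not attempt.

```python
class SnapshotStore:
    """Toy key-value store where every change creates a new snapshot."""

    def __init__(self):
        self._snapshots = [{}]  # full history; the last entry is the live state

    @property
    def live(self):
        return dict(self._snapshots[-1])

    def commit(self, changes=None, deletions=()):
        """Apply changes and deletions as a new snapshot; return its id."""
        state = dict(self._snapshots[-1])
        state.update(changes or {})
        for key in deletions:
            state.pop(key, None)
        self._snapshots.append(state)
        return len(self._snapshots) - 1

    def rollback(self, snapshot_id):
        """Make an earlier snapshot live again, without erasing history."""
        self._snapshots.append(dict(self._snapshots[snapshot_id]))


store = SnapshotStore()
safe = store.commit({"thesis.doc": "draft 7", "photos.zip": "holiday"})

# "Arbitrary damage": one file overwritten, another deleted.
store.commit({"thesis.doc": "ENCRYPTED"}, deletions=["photos.zip"])
print(store.live)

store.rollback(safe)
print(store.live)
```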
  61. Closing Thoughts
     ● We should not separate development and testing
     ● Our hardest problems in security require alignment between how we build and how we verify
     ● Our best solutions in technology will understand the past to see the future
       – All that matters is how well we protect users, and provide the services that they need.
       – Our personal development cultures are not as important as actually getting the job done :)
