Re-inventing the Database: What to Keep and What to Throw Away

NoSQL has turned many database concepts upside down. Consistency models, transactions, data models, and query interfaces are being reinvented. Tradeoffs between performance, availability, manageability, and usability are being rethought. In this talk, 10gen President Max Schireson reviews some of the different approaches being taken and offers opinions on the right choices for different uses.

  1. Reinventing the Database
     Max Schireson, President, 10gen
  2. My background
     - At Oracle from 1994 to 2003
     - At MarkLogic from 2003 to Feb 2011
     - Joined 10gen in Feb 2011
  3. The world has changed
     Item | 1970 | 2011
     Main memory | Intel 1103, 1k bits | 4GB of RAM costs $25.99
     Mass storage | IBM 3330 Model 1, 100 MB | 3TB Superspeed USB for $129
     Microprocessor | Nearly (4004 being developed); 4 bits and 92,000 instructions per second | Westmere EX has 10 cores, 30MB L3 cache, runs at 2.4GHz
     Motor Trend Car of the Year | Ford Torino | Chevy Volt
     President | Richard Nixon | Barack Obama
     Ted Codd | In his 40s | Dead
     Me | In diapers | In my 40s
  4. More recent changes
     Goal | A decade ago | Now
     Faster | Buy a bigger server | Buy more servers
     Faster storage | A SAN with more spindles | SSD
     More reliable storage | More expensive SAN | More copies of local storage
     Deployed in | Your data center | The cloud, private or public
     Large user base | Thousands (employees) | Millions (consumers)
     Tracking | Business transactions | Every click and more
  5. Assumptions behind today's DBMS
     - Relational data model
     - Third normal form
     - ACID
     - SQL
     - Multi-statement transactions
     - Database is hardware agnostic
     - RAM is small and disks are slow
     - If it's too slow, you can buy a faster computer
  6. Yesterday's assumptions in today's world
     - Scale-out is hard
     - Distributed joins are hard
     - Making two-phase commits fast is hard
     - Custom solutions proliferate
     - Too slow? Just add a cache
     - ORM tools everywhere
     - More computers and disk are nearly free, but SAN and faster computers are expensive
  7. Challenging some assumptions
     - Do you need a database at all?
     - How does it scale out?
     - What type of queries does it need to be able to do?
     - How should it model data?
     - How do you query it?
     - How does it handle transactions and consistency?
     - Is it enterprise software, open source, an appliance, or a cloud service?
     - Does the data fit in memory?
     - What if your disks are SSD?
  8. My opinions
     - Different use cases will produce different answers
     - Existing RDBMS solutions will continue to solve a broad set of problems well, but many applications will work better on top of alternative technologies
     - Many new technologies will find niches, but only one or two will become mainstream
  9. Do you need a database at all?
     - Can you better solve your problem with a batch processing framework?
     - Can you better solve your problem with an in-memory object store/cache?
  10. How does it scale out?
     - Scale-out for working set size
     - Scale-out for total data size
     - Scale-out for write volume
     - Scale-out for read volume
     - Scale-out for redundancy
     - How do you incrementally add nodes or change configuration?
     - How do you trade off query performance (which wants fewer index segments) for elasticity (which wants more index segments)?
  11. What type of queries does it need to be able to do?
     - Is a key/value store enough?
     - Will you be retrieving your data by one key or by many?
     - Is there a primary way you'll be viewing your data?
     - Do you need specialized queries (e.g., time series, geospatial)?
  12. Imagine a garage…
     - You hand your valet the keys to your car
     - Before they park your car, they completely disassemble it
     - The pistons are stored in piston storage, brake pads with brake pads, steering wheels with steering wheels
     - Over time, they have storage areas for catalytic converters, DVD-based nav systems, headlight washers, and traction control systems
     - When you ask for your car back, the valet is incredibly fast at reassembly
     - One minor issue: you have to provide the disassembly and reassembly instructions, and they will be followed literally, even if you say the spare tire should be used as a steering wheel and forget to specify re-insertion of spark plugs
     - A technological marvel
     - Might be a good way to store your car if you don't know whether you'll be asking for a car back or lots of brake pads or pistons. For a salvage yard?
  13. How should it model data?
     - Relational
     - Row oriented or column oriented
     - Key value
     - Document oriented
     - Graph oriented
  14. How do you query it?
     - Do you want an API, a language, or a map-reduce style interface?
     - Will most of your queries be hand-typed, embedded in code, or dynamically generated?
  15. How do you handle transactions and consistency?
     - Do you need transactions at all?
     - Be careful; web services, for example, need to be able to assign userIDs
     - Do you need multi-master updates?
     - If so, how do you resolve conflicts?
     - Do you need immediate consistency? For some queries or all?
     - How do you handle failures?
     - Are you optimizing for read availability or write availability?
  16. What is it?
     - Enterprise software
     - Open source
       - With commercial support?
     - Appliance
       - Packaged with commodity hardware
       - Specialized hardware
     - Cloud service
       - Available for on-premise deployment?
       - Integrated in another PaaS offering?
       - Where on the net?
  17. Does the data fit in memory?
     - Transactions can be very, very fast
     - Do you trust enough copies in memory (perhaps across multiple data centers), or do you require some sort of sync to persistent storage?
     - How big will the data be, and how much do you care about costs?
  18. What if your disks are SSD?
     - Alleviate hotspots
     - Random accesses are measured in microseconds, not milliseconds
     - Degradation from in-memory to on-disk can be more graceful
     - But data representations on disk vs. in memory may be very different, which may create significant overhead
  19. In choosing a solution
     - Examine your requirements
     - They will dictate certain choices
     - Once you have narrowed the field
     - Prefer solutions that may become mainstream
     - Consider TCO:
       - Purchase cost
       - Learning curve
       - Productivity
       - Viability
  20. Which solution sets will become mainstream
     - High confidence
       - Horizontally scalable: to take advantage of hardware trends
       - Non-relational: to enable scalability
       - Highly functional: for usage beyond mega-scale
       - Developer-friendly: because decision making has shifted
       - Freely available: for rapid adoption
     - My predictions
       - Document oriented: enables scalability, functionality, developer friendliness, and agility
       - Open source: with multiple PaaS providers
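
The garage analogy on slide 12 and the document-oriented prediction on slide 20 can be made concrete with a small sketch. It is not from the talk: the car record, the table names, and the helper functions are all hypothetical, and SQLite plus a plain dictionary merely stand in for a relational database and a document store. The relational path disassembles the car into rows and needs explicit reassembly; the document path stores it whole.

```python
import json
import sqlite3

# Hypothetical car record, used only for illustration.
car = {
    "plate": "ABC-123",
    "model": "Torino",
    "parts": [
        {"kind": "piston", "qty": 8},
        {"kind": "brake_pad", "qty": 4},
        {"kind": "steering_wheel", "qty": 1},
    ],
}


def store_relational(conn, car):
    """Disassemble the car: one row for the car, one row per kind of part."""
    conn.execute("CREATE TABLE IF NOT EXISTS cars (plate TEXT PRIMARY KEY, model TEXT)")
    conn.execute("CREATE TABLE IF NOT EXISTS parts (plate TEXT, kind TEXT, qty INTEGER)")
    conn.execute("INSERT INTO cars VALUES (?, ?)", (car["plate"], car["model"]))
    conn.executemany(
        "INSERT INTO parts VALUES (?, ?, ?)",
        [(car["plate"], p["kind"], p["qty"]) for p in car["parts"]],
    )


def load_relational(conn, plate):
    """Reassemble the car by following the instructions (queries) literally."""
    model = conn.execute(
        "SELECT model FROM cars WHERE plate = ?", (plate,)
    ).fetchone()[0]
    rows = conn.execute(
        "SELECT kind, qty FROM parts WHERE plate = ? ORDER BY rowid", (plate,)
    ).fetchall()
    return {
        "plate": plate,
        "model": model,
        "parts": [{"kind": kind, "qty": qty} for kind, qty in rows],
    }


def store_document(store, car):
    """Park the car in one piece: the whole record is a single value."""
    store[car["plate"]] = json.dumps(car)


def load_document(store, plate):
    """Hand the car back without any reassembly step."""
    return json.loads(store[plate])


conn = sqlite3.connect(":memory:")
store_relational(conn, car)

documents = {}  # stand-in for a document store keyed by plate
store_document(documents, car)

# The "salvage yard" query is where the disassembled form is convenient.
total_pistons = conn.execute(
    "SELECT SUM(qty) FROM parts WHERE kind = 'piston'"
).fetchone()[0]
```

Both paths return the same car. The difference is who writes the disassembly and reassembly instructions, and whether cross-car questions such as the piston count are the common case.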

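Slide 10 asks how a system scales out for data size and write volume, and how nodes can be added incrementally. A minimal sketch of one common approach, hash-based sharding, is shown here. It is an assumption for illustration, not a technique the talk prescribes, and `shard_for`, `ShardedStore`, and `moved_keys` are hypothetical names. The last helper counts how many keys land on a different shard when the shard count changes, which is the cost behind the "incrementally add nodes" question.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index using a stable hash (not Python's hash())."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardedStore:
    """Key/value store split across in-process dicts standing in for servers."""

    def __init__(self, num_shards: int):
        self.shards = [dict() for _ in range(num_shards)]

    def put(self, key: str, value):
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key: str):
        return self.shards[shard_for(key, len(self.shards))].get(key)


def moved_keys(keys, old_shards: int, new_shards: int) -> int:
    """Count keys whose shard changes when the shard count changes."""
    return sum(
        1 for k in keys if shard_for(k, old_shards) != shard_for(k, new_shards)
    )


keys = [f"user:{i}" for i in range(1000)]
store = ShardedStore(num_shards=4)
for k in keys:
    store.put(k, {"id": k})

# Going from 4 to 5 shards with plain modulo hashing relocates many keys.
moved = moved_keys(keys, old_shards=4, new_shards=5)
```

Plain modulo placement spreads writes evenly but relocates a large share of keys when a node is added. Schemes such as consistent hashing or range-based chunks are the usual way to reduce that movement, at the price of more bookkeeping.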