Big Data Analytics, Dave Shuttleworth - 22-9-15
- 1. © 2015 EXASOL AG
Big Data Analytics
Dave Shuttleworth – Principal Consultant, Exasol UK
email: dave.shuttleworth@exasol.com
Twitter: @EXA_DaveS
- 2. © 2015 EXASOL AG
My background
2014 - 2015 – EXASOL UK – Principal Consultant
• Introducing EXASOL DBMS technology into the UK
2003 - 2014 – Intelligent Edge Group – Principal Consultant
• Data Warehouse design and migration from older technologies to new MPP DBMS
• Business Intelligence infrastructure architect
• New DBMS technology assessment
1992 - 2003 – WhiteCross Systems (now Kognitio) – Principal Consultant
• Pre-sales and post-sales technical support
1989 - 1992 – Teradata – Consultant
• Pre-sales and post-sales technical support
1980 - 1989 – Data General (now part of EMC) – Systems engineer
• Pre-sales and post-sales technical support
1975 - 1980 – UK retailer – Analyst programmer
• Applications design and implementation, system management and tuning
- 3. © 2015 EXASOL AG
What is Exasol?
The World’s Fastest Analytic Database
a column store, in-memory, massively parallel processing (MPP) database
modern software designed for analytics
runs on standard x86 hardware
uses standard SQL (with optional extensions)
suitable for any scale of data & any number of users
mature, proven & very cost effective
quick to implement & easy to operate
- 4. © 2015 EXASOL AG
We are the benchmark leader
On 1 terabyte of data: an order of magnitude faster than its closest rival
[Bar chart: TPC-H queries per hour (QphH@1000 GB), axis marked at 1,000,000, 2,000,000, 3,000,000 and 4,000,000]
5,246,338
Microsoft 134,117
Oracle 201,487
Oracle 209,533
Microsoft 219,887
Sybase IQ 258,474
Oracle 326,454
Vectorwise 445,529
Microsoft 519,976
Result dates shown on the chart: Sept ´14, April ´14, June ´12, Feb ´14, Dec ´13, Aug ´11, Sept ´11, Oct ´11, Dec ´11
Source: www.tpc.org / Sept 22, 2015
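The "order of magnitude" claim can be checked directly from the QphH figures listed on this slide. The sketch below is only an illustration: it assumes that the unlabelled 5,246,338 result belongs to EXASOL (as the slide title implies), and the function name `speedup_over_runner_up` is made up for this example.

```python
# QphH@1000 GB figures as listed on the slide (source: www.tpc.org, Sept 22, 2015).
# Assumption: the unlabelled 5,246,338 result is EXASOL's, per the slide title.
results = [
    ("EXASOL", 5_246_338),
    ("Microsoft", 134_117),
    ("Oracle", 201_487),
    ("Oracle", 209_533),
    ("Microsoft", 219_887),
    ("Sybase IQ", 258_474),
    ("Oracle", 326_454),
    ("Vectorwise", 445_529),
    ("Microsoft", 519_976),
]


def speedup_over_runner_up(rows):
    """Return the ratio of the best QphH result to the second best."""
    ranked = sorted((qphh for _, qphh in rows), reverse=True)
    return ranked[0] / ranked[1]


ratio = speedup_over_runner_up(results)
print(f"Top result is {ratio:.1f}x the runner-up")  # prints: Top result is 10.1x the runner-up
```

On these numbers the top result is roughly ten times the next best (519,976 QphH), which matches the slide's wording.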
- 5. © 2015 EXASOL AG
Unrivalled price/performance at any scale
[Chart: Performance (QphH, axis from 0 to 11,000,000) against TPC-H Scale Factor (100GB, 300GB, 1TB, 3TB, 10TB, 100TB); series: EXASolution 5.0, 2nd Position, 3rd Position, 4th Position]
The larger the data, the greater the EXASolution advantage: 66% less cost on average than the nearest competitor
Source: www.tpc.org / Sept 22, 2014
- 6. © 2015 EXASOL AG
Requirements for Big Data Analytics (?)
• There are many examples of architecture diagrams available which try to encompass every potentially relevant technology – probably including (but not limited to) the following:
• Streaming data
• Social media data
• Internet of Things
• SQL/NoSQL/NewSQL databases
• Hadoop and associated stuff
• Data in memory
• Cloud computing
• Unstructured data
• ETL/ELT/ETLT
• Data Mining
• Predictive Analytics
• Data Quality
• MDM
• Graph databases
• MPP
• etc, etc
- 8. © 2015 EXASOL AG
In the real world...
• Some observations based on work with Exasol, Netezza, Teradata etc.
• It’s extremely rare to find a new ‘greenfield’ environment – there’s generally some ‘legacy’ stuff which has to be accommodated
• Many clients are only just starting to experiment with newer technologies such as cloud and Hadoop
• More advanced users are realising that new technologies don’t necessarily have all the answers
• Some of the newer technologies need new skills – which might not be easy/cheap to find
• Some metrics – setting up Cloud and Hadoop
  • > 150 total steps
  • > 30 decision points
  • 2-6 months
- 9. © 2015 EXASOL AG
The good news…
Newer technologies are driving implementation prices down
New technologies support agile development approaches
As newer technologies mature they become easier to integrate
  with each other
  with legacy systems
New vendors are addressing the ‘integration complexity’ issues (e.g. Cazena)
It is possible and practical to choose the ‘right tool for the job’ – a single vendor is no longer expected to provide everything
There are existing examples in production already
- 10. © 2015 EXASOL AG
King – getting to know 500 million players..
- 11. © 2015 EXASOL AG
Evolution not Revolution
[Architecture diagram with the following labels]
End Users; Mobile Devices; BI & Reporting; Applications
Data Integration, ETL and Replication
• Ported • Bundled • Custom • Prototyping • Ad Hoc • Dashboard • Statistics • Data Mining • Analytics
ERP; CRM; SCM; Legacy; OLTP
Enterprise Data Warehouse; External Data; EXASOL; NoSQL, Graph, etc
- 12. © 2015 EXASOL AG
Summary
• Data and database technology isn’t going away!
• New database approaches are being developed to address the requirements of flexibility, scalability etc
• These technologies drive an increasing need for more analysts, database designers, data scientists
• Hybrid systems are becoming the norm, with companies mixing ‘best of breed’ technologies (possibly open source) to get the best and most cost-effective results – use ‘the right tool for the job’
• New vendors are addressing the complexity problems
• SQL databases will continue to be widely utilised – but alongside other technologies, and integration will become tighter
- 13. © 2015 EXASOL AG
Any questions?
Dave Shuttleworth
Twitter: @EXA_DaveS
Email: dave.shuttleworth@exasol.com