This presentation discusses how MySQL can be used to unlock insights from big data. It describes how MySQL provides both SQL and NoSQL access to the same data and integrates with Hadoop, allowing organizations to analyze large, diverse datasets. Tools such as Apache Sqoop and the MySQL Applier for Hadoop import data from MySQL into Hadoop for advanced analytics, while MySQL Fabric lets databases scale out through data sharding.
Unlocking Big Data Insights with MySQL
1. Copyright © 2015, Oracle and/or its affiliates. All rights reserved.
Unlocking Big Data
Insights with MySQL
Matt Lord
MySQL Product Manager
Safe Harbor Statement
The following is intended to outline our general product direction. It is intended for
information purposes only, and may not be incorporated into any contract. It is not a
commitment to deliver any material, code, or functionality, and should not be relied upon
in making purchasing decisions. The development, release, and timing of any features or
functionality described for Oracle's products remains at the sole discretion of Oracle.
Cloud
Web & Enterprise OEM & ISVs
Industry Leaders Rely on MySQL
MySQL Powers The Web
Over 500 million Tweets/day. 143,200 Tweets/sec in Aug 2013
"Many petabytes" of data. 11.2 million row changes and 2.5 billion rows read/sec handled in MySQL
6 billion hours of video watched each month
Globally-distributed database with 100 terabytes of user-related data based on MySQL Cluster
An Avalanche of Data
Why Is Big Data Important?
Value Creation
• HEALTH CARE: Reduce Prescription Fraud
• MANUFACTURING: Accelerate Test Cycles to Reduce Backlog
• COMMUNICATIONS: Offering New Services based on Location Data
• RETAIL: Better Predict Product Success
• PUBLIC SECTOR: Improve Student Outcomes
"In a big data world, a competitor that fails to sufficiently develop its capabilities will be left behind."
McKinsey Global Institute
Create Value
Big Data: What It Is, What It Means
Volume
Variety
Velocity
Big Data: Strategic Transformation
• From REPORTING To ANALYTICS
• From REAR-VIEW MIRROR To PREDICT/EXPLORE
• From SOME DATA To BIG DATA
What's Changed?
• Enablers
  – Digitization: nearly everything has a digital heartbeat
  – Ability to store much larger data volumes (distributed file systems)
  – Ability to process much larger data volumes (parallel processing)
• Why is this different from BI/DW?
  – Business formulated questions to ask upfront
  – Drove what data was collected, data model, query design
Big Data enables what-if analysis and real-time discovery
Big Data Adoption
• Web Recommendations
• Sentiment Analysis
• Marketing Campaign Analysis
• Customer Churn Modeling
• Fraud Detection
• Research and Development
• Risk Modeling
• Machine Learning
Big Data Can Help You ...
• Chief Marketing Officer: Sell More
• Chief Financial Officer: Manage Risk
• Chief Information Officer: Reduce Cost
Leading Use-Case: On-Line Retail
Users
Browsing
Recommendations
Profile,
Purchase
History
Web Logs:
Pages Viewed
Comments Posted
Social media updates
Preferences
Brands "Liked"
Recommendations
Why Hadoop?
• Scales to thousands of nodes, PB of structured and unstructured data
  – Combines data from multiple sources, schema-less
  – Run queries against all of the data
• Runs on commodity servers, handles storage and processing
• Data is replicated, self-healing
• Initially just batch (Map/Reduce) processing
  – Moving towards more interactive processing
    • Oracle Big Data SQL, Spark SQL, Apache Hive, Apache Drill, Apache Phoenix
    • IBM BigSQL, Cloudera Impala, Presto, Stinger, HAWQ, more every day ...
Big Data Lifecycle
Better Decisions Using Big Data
[Diagram: lifecycle stages ACQUIRE, ORGANIZE, ANALYZE, DECIDE; create value from data]
Big Data Lifecycle
Better Decisions Using Big Data
ACQUIRE
CREATE VALUE
FROM DATA
NoSQL Interfaces
MySQL Database
MySQL Cluster
MySQL Fabric
MySQL NoSQL Interfaces: Fast, Flexible, Safe
• Blazing Fast Key/Value Queries
• Fully Transactional/ACID
• NoSQL and SQL across the same data set
Combined with Schema Flexibility: Online DDL
MySQL Strategy: Best of Both Worlds
• Mix Key/Value & Relational Queries
• Transactional Integrity
• Complex Queries
• Standards & Skillsets
MySQL 5.6: InnoDB, NoSQL With Memcached
Up to 9X higher "SET/INSERT" Throughput
MySQL 5.7: InnoDB, NoSQL With Memcached
6x Faster than MySQL 5.6
Thank you, Facebook
[Chart: MySQL 5.7 vs 5.6 - InnoDB & Memcached. Queries per Second (0 to 1,200,000) against Connections (8 to 1,024), with series MySQL 5.7 and MySQL 5.6. Callout: 1 Million QPS]
Intel(R) Xeon(R) CPU X7560 x86_64
4 sockets x 10 cores-HT (80 CPU threads)
2.3 GHz, 512 GB RAM
Oracle Linux 6.5
MySQL Cluster: Multiple NoSQL Interfaces
Mix & Match
MySQL Cluster 7.4 Benchmarks
• NoSQL C++ API, flexAsynch Benchmark
• 32 x Intel E5-2697 servers, 2 socket, 64GB
  – 14 cores and 28 CPU threads per CPU socket
  – Total of 28 cores and 56 CPU threads per server, operating at 2.6GHz
• ACID Transactions, with Synchronous Replication
200 Million QPS
MySQL Cluster Schema Flexibility
Configure with or without Schema
Application view: <town:maidenhead,SL6> (key, value)
SQL view (Generic Table):
Key | Value
town:maidenhead | SL6
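The two views on this slide can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module purely as a stand-in for a generic key/value table; it does not talk to MySQL Cluster or use any of its NoSQL APIs. The class name GenericTable, the table name generic, and the helper keys_with_prefix are all hypothetical.

```python
import sqlite3


class GenericTable:
    """Hypothetical key/value facade over one generic two-column table."""

    def __init__(self, conn):
        self.conn = conn
        # sqlite3 is only a stand-in; the slide itself describes MySQL Cluster.
        conn.execute(
            "CREATE TABLE IF NOT EXISTS generic "
            "(k TEXT PRIMARY KEY, v TEXT NOT NULL)"
        )

    def set(self, key, value):
        # The connection context manager commits on success
        # and rolls back if the statement raises.
        with self.conn:
            self.conn.execute(
                "INSERT OR REPLACE INTO generic (k, v) VALUES (?, ?)",
                (key, value),
            )

    def get(self, key):
        row = self.conn.execute(
            "SELECT v FROM generic WHERE k = ?", (key,)
        ).fetchone()
        return row[0] if row else None


def keys_with_prefix(conn, prefix):
    """SQL view: an ordinary relational query over the same rows."""
    return conn.execute(
        "SELECT k, v FROM generic "
        "WHERE substr(k, 1, length(?)) = ? ORDER BY k",
        (prefix, prefix),
    ).fetchall()


conn = sqlite3.connect(":memory:")
table = GenericTable(conn)
table.set("town:maidenhead", "SL6")      # application (key/value) view
print(table.get("town:maidenhead"))
print(keys_with_prefix(conn, "town:"))   # SQL view of the same data
```

The point of the sketch is only that both access paths read and write the same stored rows, so nothing has to be copied between a key/value store and a relational one.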
MySQL Fabric
Scale out with Data Sharding + High Availability
• Scale-out through sharding
  – Read AND Write
  – Standard framework, no more custom solutions
• HA out of the box
  – On top of Replication
  – Automatic failover
  – Automatic routing
Big Data Lifecycle
Better Decisions Using Big Data
ACQUIRE
ORGANIZE
CREATE VALUE
FROM DATA
Import Data
Apache Sqoop
MySQL Hadoop Applier
Apache Sqoop
• Apache TLP, part of Hadoop project
  – Developed by Cloudera
• Bulk data import and export
  – Between Hadoop (HDFS) and external data stores
• JDBC Connector architecture
  – Supports plug-ins for specific functionality
• "Fast Path" Connector developed for MySQL
Apache Sqoop
[Diagram: Sqoop Import gathers metadata from the transactional data store, then submits a map-only Sqoop job to the Hadoop cluster; four Map tasks write to HDFS storage]
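A map-only import like the one in the diagram relies on dividing the source table among the Map tasks. The sketch below shows one simplified way to do that for an integer key column. It is not Sqoop's actual splitter code: the function names split_ranges and where_clauses and the column name id are hypothetical, and the minimum and maximum key values are assumed to have been read from the table beforehand.

```python
def split_ranges(min_id, max_id, num_mappers):
    """Divide the inclusive key range [min_id, max_id] into contiguous slices."""
    if num_mappers < 1:
        raise ValueError("num_mappers must be at least 1")
    if max_id < min_id:
        raise ValueError("max_id must not be smaller than min_id")

    total = max_id - min_id + 1
    # Never create more slices than there are key values.
    slices = min(num_mappers, total)
    base, extra = divmod(total, slices)

    ranges = []
    start = min_id
    for i in range(slices):
        # The first `extra` slices take one additional key each.
        size = base + (1 if i < extra else 0)
        end = start + size - 1
        ranges.append((start, end))
        start = end + 1
    return ranges


def where_clauses(column, ranges):
    """Build one WHERE condition per Map task from the slices."""
    return [f"{column} >= {lo} AND {column} <= {hi}" for lo, hi in ranges]


for clause in where_clauses("id", split_ranges(1, 10, 4)):
    print(clause)
```

Each Map task would then run its own bounded query and write its slice to HDFS independently, which is why no reduce phase is needed.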
MySQL Applier for Hadoop
• Real-time streaming of events from MySQL to Hadoop
  – Supports move towards "Speed of Thought" analytics
• Connects to the binary log, writes events to HDFS via libhdfs library
• Each database table mapped to a Hive data warehouse directory
  – With the ability to create custom content handlers
• Enables eco-system of Hadoop tools to integrate with MySQL data
• Available for download now: labs.mysql.com
MySQL Applier for Hadoop: Integration with Hive
• Hive runs on top of Hadoop. Install Hive on the Hadoop master node
• Set the default data warehouse directory to the base directory into which the Hadoop Applier writes
• Create similar schemas on both MySQL and Hive
• Timestamps are inserted as the first field in HDFS files
• Data is stored in 'datafile1.txt' by default
• The working directory is base_dir/db_name.db/tb_name
SQL Query:
CREATE TABLE t (i INT);
Hive QL:
CREATE TABLE t (time_stamp INT, i INT)
[ROW FORMAT DELIMITED]
STORED AS TEXTFILE;
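The layout rules on this slide (working directory base_dir/db_name.db/tb_name, default file 'datafile1.txt', timestamp as the first field) can be sketched as follows. This is an illustration only, not code from the Applier: the function names are hypothetical, the warehouse path in the usage lines is made up, and the default field delimiter of Ctrl-A ("\x01") is an assumption based on Hive's default for delimited text files.

```python
import posixpath

# Default data file name stated on the slide.
DATAFILE_NAME = "datafile1.txt"


def table_directory(base_dir, db_name, tb_name):
    """Return base_dir/db_name.db/tb_name, the per-table working directory."""
    return posixpath.join(base_dir, f"{db_name}.db", tb_name)


def datafile_path(base_dir, db_name, tb_name):
    """Return the full path of the default data file for a table."""
    return posixpath.join(table_directory(base_dir, db_name, tb_name), DATAFILE_NAME)


def format_row(time_stamp, values, delimiter="\x01"):
    """Render one inserted row with the timestamp as the first field.

    The Ctrl-A default delimiter is an assumption, not taken from the slide.
    """
    fields = [str(int(time_stamp))] + [str(value) for value in values]
    return delimiter.join(fields) + "\n"


# Hypothetical warehouse location and a row for the table t (i INT).
print(datafile_path("/user/hive/warehouse", "test", "t"))
print(repr(format_row(1400000000, [7], delimiter=",")))
```

This is why the Hive table needs the extra leading time_stamp column: every line in the file carries one more field than the MySQL row it came from.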
Big Data Lifecycle
Better Decisions Using Big Data
ANALYZE
DECIDE
CREATE VALUE
FROM DATA
Analyze
Export Data
Decide
Analyze Big Data in Hadoop
MySQL: Reporting Database for BI
Data Analysis with MySQL Enterprise Edition
Advanced Features
• Scalability
• High Availability
• Security
• Audit
• Encryption
Management Tools
• Monitoring
• Backup
• Development
• Administration
• Migration
Support
• Technical Support
• Consultative Support
• Oracle Certifications
MySQL Enterprise Monitor with Query Analyzer
Enhance DevOps Agility Tune Analytical Queries
Scaling, Security, and Data Protection
MySQL Enterprise Scalability
MySQL Enterprise Monitor
MySQL Enterprise Backup
MySQL Enterprise Security
MySQL Enterprise Encryption
MySQL Enterprise Audit
MySQL Enterprise Authentication
MySQL Enterprise High Availability
Oracle Enterprise Manager for MySQL
MySQL Enterprise Support
• Largest MySQL engineering and support organization
• Backed by the MySQL developers
• World-class support, in 29 languages
• Hot fixes & maintenance releases
• 24x7x365
• Unlimited incidents
• Consultative support
• Global scale and reach
Get immediate help for any MySQL issue, plus expert advice
MySQL Consultative Support
Make the Most of your Deployments
• Remote troubleshooting
• Replication review
• Partitioning review
• Schema review
• Query review
• Performance tuning
• ...and more
Why MySQL Enterprise Edition?
In Addition to all the MySQL Features you Love
Insure Your Deployments
Get the Best Results
Delight Customers
Improve Performance & Scalability
Enhance Agility & Productivity
Reduce TCO
Mitigate Risks
Get Immediate Help if/when Needed
Increase Customer Satisfaction
MySQL & Hadoop Integration: a Complete Example
Driving a Personalized Web Experience
Company Overview
boo-box is one of the largest advertising networks in South America, with a focus on the Brazilian social media market.
Application
boo-box relies on MySQL and Hadoop to display 1 billion advertisements to 60 million people across 430,000 web sites and social network profiles every month.
Why MySQL?
"MySQL is a core part of our big data strategy. Simple integration with Hadoop enables us to improve our digital advertising service and grow our business with maximum speed and agility." Josafá Santos, IT Manager, boo-box
boo-box
Leveraging Other Oracle Solutions
For Data Acquired in MySQL
• Acquire: Web Data Acquired in MySQL
• Organize: Organized with Oracle Big Data Appliance
• Analyze: Analyzed with Oracle Exadata
• Decide: Decide Using the power of Oracle Exalytics
MySQL Enterprise Oracle Certifications
• Oracle Linux
• Oracle VM
• Oracle GoldenGate
• Oracle Solaris Clustering
• Oracle Clusterware
• Oracle Enterprise Manager
• Oracle Fusion Middleware
• Oracle Audit Vault & Database Firewall
• Oracle Secure Backup
• MyOracle Online Support
MySQL Integrates into the Oracle Environment
Summary
• Create value from Big Data with MySQL
• MySQL + Hadoop: widely deployed solution
• "Best of both worlds" SQL + NoSQL Access
• Scale Out & data sharding with MySQL Fabric
• Tools and expertise to support you
• End to end Oracle solutions for Big Data