This document summarizes a breakfast seminar on data warehousing solutions with MySQL that was held in London on February 4th, 2010. The agenda included introductions, presentations on using MySQL for data warehousing, Infobright, Talend, and time for questions. The seminar provided an overview of MySQL's data warehousing strategy and ecosystem, common use cases, storage engines, and best practices for modeling, queries, partitioning, replication and other technologies.
4. MySQL Market Segments
• Web / Web 2.0
• OEM / ISVs
• On Demand, SaaS, Hosting
• Telecommunications
• Enterprise 2.0
Open-Source Powers the Web & The Network
Sunday, 7 February 2010
5. Timeline
• MAR 2008: Sun completes its acquisition of MySQL
> Good acquisition, MySQL continues to grow
• APR 2009: Oracle (ORCL) agrees to acquire Sun
• JAN 2010: The EC gives full clearance to the acquisition
• FEB 2010: We continue to develop, maintain, market, sell and support MySQL!
6. Oracle’s MySQL Strategy
• Becomes part of the Open Source GBU
> Independent sales organisation - retained from Sun
> Independent development organisation – retained from Sun
• Make MySQL better
> Apply Oracle’s expertise and engineering processes
> A natural extension of what Oracle has done with InnoDB
• Make MySQL support better
> Leverage Oracle’s award winning global support infrastructure
• Make MySQL part of the Oracle stack
> Many customers use both MySQL and Oracle database
> Integrate with Enterprise Manager, Secure Backup, Audit Vault
http://www.oracle.com/ocom/groups/public/@ocom/documents/webcontent/044521.pdf
9. MySQL Data Warehousing Strategy
• Strongly support common data warehouse use cases
• Offer modern technology that adheres to MySQL’s
software priorities (reliability, performance, ease-of-use)
• Partner with major BI/ETL vendors
• Offer highly attractive total cost of ownership
10. The MySQL DW Ecosystem
Layers, from top to bottom:
• BI / Reporting
• ETL / Integration Tools
• RDBMS
• Storage Engine
• Platform
11. Common Use Cases
1. Small, semi real-time data marts
2. Continuous, real-time/query data warehousing
3. Traditional, standard reporting warehouse
4. Massive historical warehouse with ad-hoc queries
5. BI and analytics in OLTP applications (emerging…)
[Figure labels: Data Mart, Real-Time, Traditional, Historical, Analytical SQL]
12. MySQL Technical Strategy
• Provide open source architecture to maximize innovation
• Offer core data warehousing feature set
• Provide specialised data warehouse engines for key use
cases
• Supply strategies for combating the mixed-workload
challenge
14. MySQL Enterprise
Server
• MySQL Enterprise Server
• Monthly Rapid Updates
• Quarterly Service Packs
• Hot Fix Program
• Indemnification
Monitor
• Global Monitoring of All Servers
• Web-Based Central Console
• Built-in Advisors and Expert Advice
• MySQL Query Analyzer
• Replication Monitor
Support
• 24 x 7 x 365 Production Support
• Web-Based Knowledge Base
• Consultative Help
• High Availability and Scale Out
http://www.mysql.com/products/enterprise/
15. MySQL Enterprise Monitor
“Your Virtual MySQL DBA Assistant”
• Single, consolidated view into the entire MySQL environment
• Auto discovery of MySQL Servers, Replication Topologies
• New Query Analyzer
• Customisable rules-based monitoring and alerts
• Identifies problems before they occur
• Reduces risk of downtime
• Makes it easier to scale out without requiring more DBAs
http://www.mysql.com/products/enterprise/advisors.html
16. MySQL Query Analyzer
• Centralised monitoring of Queries
across all servers
• No reliance on Slow Query Logs,
SHOW PROCESSLIST, VMSTAT,
etc.
• Aggregated view of query
execution counts, time, and rows
• Saves time parsing atomic
executions for total query expense
“Finds code problems before your customers do.”
17. The MySQL Technology behind a DW Strategy
• Sharding
• Replication
• MySQL Proxy
• Memcached
• Query Cache
• Partitioning
• Storage Engines
19. MySQL Data Warehouse Cookbook
20. Partitioning
• Partition Pruning
• Partitioning key must result in an INT
• Check table lock with MyISAM
• Check the number of open files
• Foreign Keys, Fulltext and spatial indexes are not supported
• No MyISAM, LOAD INDEX or INSERT DELAYED
• For DW, it is mainly limited to InnoDB and MyISAM
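As an illustration of the "partitioning key must result in an INT" point, here is a minimal sketch that builds a range-partitioned table definition in which `YEAR()` turns a DATE column into an integer. The table and column names (`fact_sales`, `sale_date`, `sale_id`, `amount`) and the helper `range_partition_ddl` are hypothetical and not taken from the seminar.

```python
from typing import List


def range_partition_ddl(table: str, date_col: str, years: List[int]) -> str:
    """Build a CREATE TABLE statement partitioned by RANGE (YEAR(date_col)).

    YEAR() returns an integer, which satisfies the rule that the
    partitioning expression must evaluate to an INT.
    """
    # One partition per year, plus a catch-all partition for later rows.
    parts = [
        f"    PARTITION p{year} VALUES LESS THAN ({year + 1})"
        for year in sorted(years)
    ]
    parts.append("    PARTITION pmax VALUES LESS THAN MAXVALUE")
    return (
        f"CREATE TABLE {table} (\n"
        f"    sale_id INT NOT NULL,\n"        # hypothetical column
        f"    {date_col} DATE NOT NULL,\n"
        f"    amount DECIMAL(10,2) NOT NULL\n"  # hypothetical column
        f") ENGINE=InnoDB\n"
        f"PARTITION BY RANGE (YEAR({date_col})) (\n"
        + ",\n".join(parts)
        + "\n)"
    )


ddl = range_partition_ddl("fact_sales", "sale_date", [2008, 2009])
print(ddl)
```

A query that filters on a range of `sale_date` values is then a candidate for partition pruning; on MySQL 5.1 this can be checked with `EXPLAIN PARTITIONS SELECT ...`.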
[Figure: Vertical Partitioning (a table split by columns) vs Horizontal Partitioning (a table split by rows)]
21. SQL Generation
• Multipass SQL or Subqueries
• Avoid complex queries
> More efficient use of query cache, key buffer and buffer pool
> More shard friendly
> More scalable for the current version of MySQL
–No parallel query
• Use temp tables and stored procedures
• Check with EXPLAIN and watch for:
> ALL (sequential scan)
> Using filesort
> Using temporary (for GROUP BY and ORDER BY)
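The multipass approach can be sketched as follows. This is only an illustration: Python's built-in `sqlite3` module stands in for a MySQL connection so that the example is self-contained, and the tables `dim_product`, `fact_sales` and `tmp_products` are hypothetical. The first pass resolves the dimension filter into a small temporary table, and the second pass joins the fact table to it, in place of one complex query.

```python
import sqlite3

# sqlite3 is a stand-in for MySQL here; the same two-pass pattern applies
# with CREATE TEMPORARY TABLE on a MySQL connection.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE fact_sales (product_id INTEGER, amount REAL);
INSERT INTO dim_product VALUES (1, 'books'), (2, 'books'), (3, 'music');
INSERT INTO fact_sales VALUES (1, 10.0), (2, 5.0), (3, 7.0), (1, 2.5);
""")

# Pass 1: query the dimension and keep the matching keys in a temp table.
cur.execute("CREATE TEMPORARY TABLE tmp_products (product_id INTEGER PRIMARY KEY)")
cur.execute(
    "INSERT INTO tmp_products "
    "SELECT product_id FROM dim_product WHERE category = ?",
    ("books",),
)

# Pass 2: a simple join between the fact table and the small temp table.
cur.execute(
    "SELECT SUM(f.amount) "
    "FROM fact_sales AS f "
    "JOIN tmp_products AS t ON t.product_id = f.product_id"
)
total = cur.fetchone()[0]
print(total)
conn.close()
```

Each pass is a simple statement that can be inspected with EXPLAIN on its own.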
22. Server Tuning
Query Cache
• SELECT SQL_NO_CACHE ...
• query_cache_type
• query_cache_limit
• query_cache_size
• No time functions (queries that use NOW() and similar functions are not cached)
Temporary Tables
• tmp_table_size
• max_heap_table_size
• Implicit tmp tables can be tricky to control
• Store intermediate results
• Connect > Query > Disconnect
Thread Buffers
• join_buffer_size
• read_buffer_size
• read_rnd_buffer_size
• sort_buffer_size
• For large resultsets and for high number of concurrent users,
they should be set individually or by role
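One way to set the thread buffers "by role" is to issue `SET SESSION` statements right after connecting. The sketch below only generates those statements; the role names and byte sizes are hypothetical placeholders rather than recommendations from the seminar, and suitable values depend on the workload and available memory.

```python
from typing import Dict, List

# The four per-thread buffers named on the slide.
THREAD_BUFFERS = {
    "join_buffer_size",
    "read_buffer_size",
    "read_rnd_buffer_size",
    "sort_buffer_size",
}

# Hypothetical roles and sizes (in bytes), for illustration only.
ROLE_BUFFERS: Dict[str, Dict[str, int]] = {
    "dashboard": {
        "join_buffer_size": 256 * 1024,
        "sort_buffer_size": 256 * 1024,
    },
    "power_user": {
        "join_buffer_size": 4 * 1024 * 1024,
        "read_buffer_size": 2 * 1024 * 1024,
        "read_rnd_buffer_size": 4 * 1024 * 1024,
        "sort_buffer_size": 8 * 1024 * 1024,
    },
}


def session_settings(role: str) -> List[str]:
    """Return the SET SESSION statements to run after connecting as `role`."""
    settings = ROLE_BUFFERS[role]
    unknown = set(settings) - THREAD_BUFFERS
    if unknown:
        raise ValueError(f"not a thread buffer: {sorted(unknown)}")
    return [f"SET SESSION {name} = {value}" for name, value in sorted(settings.items())]


for statement in session_settings("power_user"):
    print(statement)
```

Keeping the global defaults small and raising the buffers only for the sessions that need them limits memory use when many users are connected at once.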
23. Modelling
• Multidimensional, but with care
• Snowflake vs Star Schema
> Do not denormalise descriptions
> Multiple fact tables with 1:1 relationships
• Queries
> Query on Dimension N → Temp Table
> Query on Fact 1 → Temp Table
> Query on Fact 2 Join Temp Table
[Figure: snowflake and star schema layouts, with dimension tables (Key, Desc) around fact tables (PK, Keys, Metrics)]
24. Storage Engines
MyISAM
• Compressed Tables
• Use different spindles for data and indexes
• Fast inserts - Insert already sorted data (when possible)
• Key Buffers
• Multiple Key Buffers
• SET GLOBAL <key_cache_name>.key_buffer_size...
• CACHE INDEX ... IN ...
• key_cache_block_size
• bulk_insert_buffer_size
• Spatial and Fulltext indexes
• All active shared disk cluster
InnoDB
• innodb_file_per_table
• innodb_flush_log_at_trx_commit
• innodb_buffer_pool_size
• The new InnoDB plugin
• Fast index creation
• Data compression
• Do not use FK or constraints
CSV
• Good ETL trick
• No Partitioning, no indexing, no nulls
Archive
• Data compression and fast retrieval
• INSERT & SELECT
• No index (autoincrement only)
Federated
• Limited indexing
• Tips:
> Queries can be executed on multiple servers + result collection
> Use of stored procedures to consolidate results and control the access to the FEDERATED tables
25. Replication
• [For some] The easiest way to provide real time data marts
• Tips:
> Delayed replication
> Rotating servers
> Support for more power users
[Figure: a source master (write, updating) replicates to rotating slaves (read, querying) used by BI/report servers; the slaves hold data as of Real Time, -10 Min, -30 Min, -1 Hour, -12 Hours and Yesterday]
26. Sharding
• Sharding
> Great to distribute the workload
> Fantastic if the queries can be executed in parallel thanks to a middle-tier or client
layer
> Tips:
– Replicate the dimensions
– specialise shards on facts
– partition facts on shards
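The idea of partitioning facts across shards and running the query in parallel from a client layer can be sketched as below. In-memory lists stand in for the shard servers, and the modulo routing on a hypothetical `customer_id` key is an assumption made for illustration; a real client layer would send one SQL query per shard instead of calling `partial_sum`.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple

FactRow = Tuple[int, int]  # (customer_id, amount); a hypothetical fact layout


def shard_for(customer_id: int, n_shards: int) -> int:
    """Route a fact row to a shard by its key (simple modulo routing)."""
    return customer_id % n_shards


def load(rows: List[FactRow], n_shards: int) -> List[List[FactRow]]:
    """Partition the fact rows across n_shards in-memory 'servers'."""
    shards: List[List[FactRow]] = [[] for _ in range(n_shards)]
    for customer_id, amount in rows:
        shards[shard_for(customer_id, n_shards)].append((customer_id, amount))
    return shards


def partial_sum(shard_rows: List[FactRow]) -> int:
    """Per-shard aggregate; stands in for a SUM() query sent to one shard."""
    return sum(amount for _, amount in shard_rows)


def total_sales(shards: List[List[FactRow]]) -> int:
    """Client layer: query every shard in parallel, then merge the partials."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        return sum(pool.map(partial_sum, shards))


rows = [(1, 10), (2, 20), (3, 30), (4, 40), (5, 50)]
shards = load(rows, 3)
print(total_sales(shards))
```

Aggregates such as sums and counts merge cleanly from per-shard partial results; replicating the dimension tables to every shard keeps each per-shard query local.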
[Figure: BI/report servers read from shards A1, A2, B, C1, C2 and D; writes go to the master, which holds the dimensions]
27. More Resources Available
• Webinars
• http://www-it.mysql.com/news-and-events/web-seminars/
• Consulting
• MySQL Architecture & Design
• MySQL Performance tuning
http://www.mysql.com/consulting/
• Training
• MySQL 5.1 for developers
• MySQL 5.1 for DBAs
http://www.mysql.com/training/
• White Papers
• http://www.mysql.com/why-mysql/white-papers/
28. Thank You!
Data Warehouse Solutions with MySQL
ivan@mysql.com
http://izoratti.blogspot.com