2. Problems
• Overview
– Most tables are denormalized or only in 2nd normal form
– Way too many indexes
– Missing constraints
– No subtype relationships
– Repetitive data
– Inconsistent naming convention
– Wrong choice of datatypes
– No partitioning
3. Data Issues
• Severe requirement for data standardization?
• Oversized datatypes holding incorrect data
(text/bigint)
• Nullable columns
• Unique constraints missing where required,
creating duplicates in master tables
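A minimal sketch of how duplicates in a master table could be found and then blocked. SQLite (via Python's sqlite3) stands in for MySQL here, and the table `country_master` and its columns are hypothetical names chosen for illustration.

```python
import sqlite3

# SQLite in-memory database stands in for MySQL (assumption for illustration).
conn = sqlite3.connect(":memory:")

# Hypothetical master table with no unique constraint on `code`.
conn.execute(
    "CREATE TABLE country_master (id INTEGER PRIMARY KEY, code TEXT, name TEXT)"
)
conn.executemany(
    "INSERT INTO country_master (code, name) VALUES (?, ?)",
    [("IN", "India"), ("US", "United States"), ("IN", "India")],
)

# Find values that occur more than once in the column that should be unique.
duplicates = conn.execute(
    "SELECT code, COUNT(*) FROM country_master GROUP BY code HAVING COUNT(*) > 1"
).fetchall()

# Keep the lowest id per code. This form works in SQLite; MySQL rejects a
# subquery on the table being deleted from, so it would need a different form.
conn.execute(
    "DELETE FROM country_master WHERE id NOT IN "
    "(SELECT MIN(id) FROM country_master GROUP BY code)"
)

# Add the missing unique constraint so new duplicates are refused.
conn.execute("CREATE UNIQUE INDEX uq_country_code ON country_master (code)")


def insert_is_rejected(code, name):
    """Return True if the unique index blocks the insert."""
    try:
        conn.execute(
            "INSERT INTO country_master (code, name) VALUES (?, ?)", (code, name)
        )
        return False
    except sqlite3.IntegrityError:
        return True
```

In a real schema, rows in child tables that point at the deleted duplicates would have to be re-pointed first.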
4. Performance Issues
• Due to oversized columns and row sizes, fewer rows
fit in a page, causing more I/O
• Joins require more memory
• All sorts go to temporary files on disk,
increasing I/O
• Lock (shared / exclusive) hold times increase due to
the above reasons, reducing concurrency
• Memory is used inefficiently due to oversized rows
• Every update has to maintain unnecessary
indexes, increasing lock time
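The rows-per-page effect can be shown with rough arithmetic. This sketch assumes the InnoDB default page size of 16 KB and ignores page and row header overhead. The row sizes (2000 and 200 bytes) and the row count are hypothetical.

```python
import math

PAGE_SIZE = 16 * 1024  # InnoDB default page size in bytes


def rows_per_page(row_bytes, page_bytes=PAGE_SIZE):
    """Rough upper bound on rows per page; ignores page and row overhead."""
    return page_bytes // row_bytes


def pages_to_scan(total_rows, row_bytes, page_bytes=PAGE_SIZE):
    """Pages a full scan has to read for a table of total_rows rows."""
    return math.ceil(total_rows / rows_per_page(row_bytes, page_bytes))


# Hypothetical row sizes: oversized columns vs. right-sized columns.
oversized = pages_to_scan(1_000_000, 2000)    # 8 rows per page -> 125000 pages
right_sized = pages_to_scan(1_000_000, 200)   # 81 rows per page -> 12346 pages
```

Under these assumptions the same million rows need roughly ten times as many page reads when rows are ten times larger.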
5. Fixes
1. Reduce the number of rows by creating
an archive db.
2. Remove unwanted indexes to reduce
insert/update overhead
3. Upgrade to MySQL 5.5
4. Partitioning ?
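A minimal sketch of the archive idea: copy rows older than a cutoff into a separate database and delete them from the live table in one transaction. SQLite with an attached in-memory database stands in for a MySQL archive db. The `orders` table, its columns and the cutoff date are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# A second in-memory database plays the role of the archive db (assumption).
conn.execute("ATTACH DATABASE ':memory:' AS archive")

ddl = "(id INTEGER PRIMARY KEY, created_on TEXT, amount INTEGER)"
conn.execute("CREATE TABLE main.orders " + ddl)
conn.execute("CREATE TABLE archive.orders " + ddl)
conn.executemany(
    "INSERT INTO main.orders VALUES (?, ?, ?)",
    [(1, "2009-03-01", 10), (2, "2010-07-15", 20), (3, "2012-01-05", 30)],
)
conn.commit()


def archive_before(conn, cutoff):
    """Move rows created before `cutoff` to the archive; return rows moved."""
    with conn:  # commits on success, rolls back if either statement fails
        conn.execute(
            "INSERT INTO archive.orders "
            "SELECT * FROM main.orders WHERE created_on < ?",
            (cutoff,),
        )
        moved = conn.execute(
            "DELETE FROM main.orders WHERE created_on < ?", (cutoff,)
        ).rowcount
    return moved


moved = archive_before(conn, "2011-01-01")
```

On a large live table the move would normally be done in small batches to keep lock times short.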
6. Short Term fixes at Application tier
• Set isolation level to READ_COMMITTED
• Set Hibernate dynamic-update and dynamic-insert = true. 95% of the
updates write all columns even if only a few columns are
modified.
• Review transaction boundaries. A few flows don't close
(commit/rollback) their transactions, leaving them open.
• Query tuning – Not all Hibernate queries are optimal. Most
of them do a left outer join even when not required.
• Query tuning – Review queries not using indexes and badly
formed queries.
• Most queries fetch all columns (SELECT *), most of which are
usually discarded. This is an overhead on the cache, the network
and the app-tier session as well.
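To illustrate what a dynamic update buys, the sketch below builds an UPDATE statement that touches only the columns whose values changed. It is not Hibernate's implementation, only a Python illustration of the idea. The function `build_update`, the `customer` table and its columns are hypothetical.

```python
def build_update(table, key_column, original, modified):
    """Build an UPDATE that sets only the columns whose values changed.

    Returns (sql, params), or (None, ()) when nothing changed.
    Table and column names are interpolated directly, so they must come
    from trusted mapping metadata, never from user input.
    """
    changed = {
        col: val
        for col, val in modified.items()
        if col != key_column and original.get(col) != val
    }
    if not changed:
        return None, ()
    set_clause = ", ".join(f"{col} = %s" for col in changed)
    sql = f"UPDATE {table} SET {set_clause} WHERE {key_column} = %s"
    return sql, tuple(changed.values()) + (original[key_column],)


# Hypothetical entity state before and after a change to one column.
original = {"id": 7, "name": "Asha", "email": "asha@example.test", "city": "Pune"}
modified = dict(original, city="Mumbai")

sql, params = build_update("customer", "id", original, modified)
# sql    -> "UPDATE customer SET city = %s WHERE id = %s"
# params -> ("Mumbai", 7)
```

A statement that sets one column instead of all of them also leaves indexes on the unchanged columns alone.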
7. Long Term Fixes
1. Schema changes (Normalized, smaller tables)
2. Partitioning
3. Sharding (Application aware / Abstract)
4. Fan-in topology or cluster
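Application-aware sharding means the application itself decides which database holds a given row. A minimal sketch, assuming a fixed list of shards and routing by a stable hash of the key. The shard names and the `shard_for` function are hypothetical.

```python
import zlib

# Hypothetical shard names; in practice these would map to connection settings.
SHARDS = ["shard0", "shard1", "shard2", "shard3"]


def shard_for(customer_id, shards=SHARDS):
    """Pick a shard from a stable hash of the key.

    zlib.crc32 is used instead of the built-in hash(), which is randomized
    per process for strings and so would route the same key differently
    after a restart.
    """
    key = zlib.crc32(customer_id.encode("utf-8"))
    return shards[key % len(shards)]


target = shard_for("cust-1001")
```

Plain modulo routing remaps most keys when the shard count changes, so a lookup table or consistent hashing is a common alternative.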
8. Master – Master Setup
• This technique is commonly used to scale
writes. However, there are
pitfalls:
– Management of auto increment columns
– A constraint violation on one of the masters
can break replication, leaving the failed master
as a slave
– Resyncing masters is time-consuming and
error-prone (replication limitation)
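For the auto-increment pitfall, MySQL provides the `auto_increment_increment` and `auto_increment_offset` system variables so that each master generates a disjoint set of IDs. The sketch below only simulates the resulting sequences for two masters. The function `id_sequence` is a hypothetical helper.

```python
from itertools import count, islice


def id_sequence(offset, increment):
    """IDs a master would hand out for the given offset and increment."""
    return count(start=offset, step=increment)


# Two masters: same increment (number of masters), different offsets.
master_a = list(islice(id_sequence(offset=1, increment=2), 5))  # [1, 3, 5, 7, 9]
master_b = list(islice(id_sequence(offset=2, increment=2), 5))  # [2, 4, 6, 8, 10]

# The two sets never overlap, so concurrent inserts cannot collide on the key.
overlap = set(master_a) & set(master_b)
```

Adding a third master later means changing the increment on every node, which is part of the management burden.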