In the Beginning, There Was the Log - Webinar with Martin Hubel

In the beginning,
there was the log
Martin Hubel
+ 1 905-764-7498
martin@mhubel.com
1

Agenda
1.
The role of the database recovery log
2.
How backups fit in
3.
Recovery scenarios
4.
Planning for recovery
2

Disclaimer
•
Every attempt has been made to be accurate with product information. Trademarks are the property of their respective owners.
•
I work mainly on DB2, but have used and supported other products.
•
All DBMS mentioned in today’s webinar are commercial grade, used by hundreds of companies to manage their enterprises. No attempt is made here to compare DBMS products for functionality. Copyright©2014 Martin Hubel Consulting Inc. 3

Assumptions
•
This presentation is about OLTP applications
•
The need for backups is known
•
The need for recovery is also understood
•
Databases are enabled for recovery
•
BI / DW have different recovery requirements
•
Data is recovered / rebuilt / reload from another source
Copyright©2014 Martin Hubel Consulting Inc. 4

The database recovery log
•
Log contains information necessary to:
•
Redo completed units of work
•
Undo incomplete units of work
•
Rebuild system status
•
Today’s presentation will deal mainly with units of work (transactions) and their effect on the recovery log

A successful transaction
•
Single transaction can contain many update actions
•
All updates must complete
•
Then a commit makes changes permanent
•
Successful termination is recorded in the recovery log
•
Any failure rolls changes back to beginning
•
The DB is considered consistent before the transaction and after the commit
•Begin UR •End UR
•INS A
•UPD B
•UPD C
•INS A
•DEL D
•INS E
•
One Business Transaction

Unit of work
•
Updates performed in the subsystem between two points of consistency
•
Point of consistency (commit point) is a point where all recoverable data for an application is consistent with other data
•
Equivalent terms: Synchpoint, Logical Unit of Work or Unit of Recovery (UR)
•
Some environments require a 2 phase commit

Batch transactions
•
Batch jobs handle transaction can contain many update actions
•
Many business transactions are included in a single unit of work
•
The commit makes changes permanent for all transactions
•
Any failure rolls changes back to beginning
•
Many tens, hundreds or even thousands of transactions are processed in a single UR
•Begin UR •End UR

Rollback
• During rollback, data changes are not removed
•The log is read and the changes reversed
•The data is returned to its last consistent state (last commit)
• Some badly designed applications issue rollback statements following
if validation fails
•Can be a lot of log activity for nothing
•Always complete all validation prior to performing first update
•Begin UR
•
End UR
•INS A
•UPD B
•UPD C
•INS E
•One Business Transaction
•DEL E
•UPD C`
•UPD B`
•DEL A
•
Failure

Performance of new application
•
A new financial application had a light transaction processing load
•
Transactions were complex and had a lot of validation and updates to perform
•
Users regularly had to correct input before the transaction was ready for processing
•
Performance was poor. Design showed:
•
After each validation check, the SQL updates to that point were performed
•
If a further check failed, the updates were reversed via rollback
•
This problem was found via a high volume of log activity
•
Much of it for no benefit

Transaction Recovery/Backout
•
Some DBMS products default to this type of recovery (rollback of incomplete URs only)
•
Logging (log retain) in these products must be turned on for full recovery
TIMEOUTWorkRollbackWaitWorkRollbackDEADLOCKWorkWaitContinueabba

Log files
•
“In the beginning, there was the log”
•
DBMS writes to the active log file
•
May be dual logs or mirrored by DBMS

May be on RAID devices at hardware level
•
Logs use buffered I/O

For changes to data
•
Logs are written synchronously at commit time
•
In this way, the log is close to being correct if failure occurs
•
Design I/O system to avoid waits for log writes
•
DBMS must be able to write changes to the log

Archiving logs
•
Current log is called the active log
•
Log files fill up and switch automatically to new files
•
Called archive logs
•
Archive files may remain in same directory for some period
•
May be automatically offloaded to some facility like TSM
•
Problems can occur in the archival process
•
If log file system fills, the DBMS stops!
Active log
Other
storage
Archive logs

Batch, long running, and large transactions
•
Thousands (or more) updates can be performed, e.g. by:
•
A batch job
•
A single SQL statement
•
Long-running transactions cause problems:
•
Many locks taken during the updates
•
Many log records and log files written before a commit
•
Commit frequency must be managed
•
Rollback is also problematic
•
Requires log to be read, and to write the opposite action to reverse the change

“The audit table is too big”
•
The new DBA/Linux sysadminnotices the size of the audit table is 30 million rows
•
Issues a DELETE FROM AUDIT_TABLE WHERE DATE < ‘60 days ago’
•
10M rows will be deleted
•
The effect:
•
All users locked out of the application
•
No estimate of completion time

“The audit table is too big” -2
•
After many complaints and about 2 hours, the DBA cancels the Delete
•
Rollback starts:
•
Rollback has to read the log to finds the updates and write the opposite
•
As such, rollback typically takes twice as long
•
The users continue to be unhappy and are now more vocal

“The audit table is too big” -3
•
The DBA, using his sysadminkills, starts killing DB2 processes
•
Kill -9 has immediate effects to stop DB2 and the rollback
•
But now, the rollback is incomplete
•
At restart, DB2 must remove the effect of the incomplete rollback and start the full rollback afterwards
•
Estimated full recovery time is 24 hours
•
Some behind-the-scenes involvement reduces the outage to 8 hours on a 24x7 system
•
CIO asks: “What should we do with the DBA?”

... And Now
•
A word from Embarcadero...

Can the log be kept forever?
•
Hypothetically yes, the answer is no
•
Not practical
•
Recovery would take too long for one thing
•
Log retention would also be impractical
•
Backups taken within the DBMS are recorded and provide starting points for recovery
•
Once all databases or tables (tablespaces) have been copied, the oldest logs can be deleted

Backup copies
•
Backups provide a starting point for the recovery process
•
Need at least one full backup
•
Two generations are safer
•
DBMS options for backups:
•
Database
•
Table or tablespace
•
Incremental or delta
•
Online or offline
•
Once all databases have a valid backup, the older log files can be deleted
•
Set generous retention periods

Gaps in the log
•
A big recovery consideration is to ensure there are no gaps within the log range
•
A gap indicates that some log records are missing, and DBMS is unable to determine what those records might have contained
•
As such, forward log processing stops
•
It is advantageous to turn logging off for certain situations or operations
•
E.g, large loads

Implications of turning off logging
•
Logging is not free
•
Significant performance impact for certain operations such as batch updates or load operations
•
Not logged operations affect the ability to forward recover objects using the log
•
DBMS does not know what or how data was changed
•
Therefore, DBMS requires a backup following not logged operations to re-establish a recovery point

Production mortgage system
•
DBA group was responsible for backups
•
Forgot to take regular backups
•
Last backup was January 26
•
System group managed the logs
•
Set a “reasonable” retention period of 60 days
•
On April 1, they deleted the logs older than Feb 1
•
On April 8, a hardware problem occurred
•
Production was recovered to January 26

Application recovery scenarios
•
Transaction recovery/backout
•
Application backout
•
Point-in-time recovery
•
Checkpoint restart
•
Data synchronization
•
Data inconsistencies
•
Out of space
•
Media failure
•
Accidentally deleted data

Recovery guidelines
•
Test your recovery procedures
•
Make a backup copy prior to recovery -things can get worse
•
Make a copy after recovery if you think it might happen again
•
Make two copies using the DBMS backup utility
•
Periodically run integrity checking utilities to ensure recoverability

Common user errors -1
•
Delete or write over archive log or image copy
•
Without necessary backups, recovery may be impossible
•
Possible storage management issue
•
Misuse of disk level dump/restore
•
Valid for disaster planning, but dump everything –

Logs

DBMS system objects

User data

Common user errors -2
•
Common Programming Errors
•
Logical unit of work identification
•
Commit/CHKP placed incorrectly
•
Too many commits
•
Coding methods may vary throughout application, e.g.

Use of separate read and write routines

Checking of SQL return code codes

Application Backout
•
Required when multiple jobs running concurrently against same database
•
Work was committed against database
•
Backoutmay not be possible or practical in all cases
Startof Job TimeNormalJob End Job 1Job 2Job 3Job 4Backout, repair, and rerun

SQL change in production
•
“Oops, we thought that was the test system”
•
An SQL error changed all addresses to the same value
•
780,000 rows in table
•
“Martin, please change them back”
•
No, you can’t restore from a backup

New customers have signed up

Old customers may have updated their information

SQL change in production
•
While this was pressing, it was not a “show stopper”
•
Fortunately, I had unloaded this table for some reason
•
This became a source for the correct addresses
•
Alternatives might have been a redirected restore / transportable tablespace
•
We were able to load the backup table
•
Found SQL on internet and adapted it
•
Ultimately updated 362,000 addresses

Summary
•
All update activity is recorded in the DBMS log
•
Log performance is part of overall DBMS performance
•
Backups and logs are both used for recovery
•
Plan and test recovery scenarios prior to needing them

Martin Hubel
Martin Hubel Consulting Inc.
martin@mhubel.com
905-764-7498
In the beginning, there was the log
32

In the Beginning, There Was the Log - Webinar with Martin Hubel

Empfohlen

Empfohlen

Weitere ähnliche Inhalte

Was ist angesagt?

Was ist angesagt? (20)

Andere mochten auch

Andere mochten auch (20)

Ähnlich wie In the Beginning, There Was the Log - Webinar with Martin Hubel

Ähnlich wie In the Beginning, There Was the Log - Webinar with Martin Hubel (20)

Mehr von Embarcadero Technologies

Mehr von Embarcadero Technologies (20)

Kürzlich hochgeladen

Kürzlich hochgeladen (20)

In the Beginning, There Was the Log - Webinar with Martin Hubel