Refactoring Database
Perficient China, Lancelot Zhu
Agenda
Evolutionary Database Development
Evolutionary Data Modeling
Database Regression Testing
Configuration Management of Database Artifacts
Developer Sandboxes
The Process of Database Refactoring
The two categories of database architecture
Database Smells
How Database Refactoring Fits In
Why DB Refactoring is Hard
The database refactoring process
Database Refactoring Strategies
Version your database
Database Refactoring Categories
  Structural: A change to the definition of one or more tables or views.
  Data Quality: A change that improves the quality of the information contained within a database.
  Referential Integrity: A change that ensures that a referenced row exists within another table and/or that ensures that a row that is no longer needed is removed appropriately.
  Architectural: A change that improves the overall manner in which external programs interact with a database.
  Method: A change to a method (a stored procedure, stored function, or trigger) that improves its quality. Many code refactorings are applicable to database methods.
  Non-Refactoring Transformation: A change to your database schema that changes its semantics.
Drop Column
Drop Table
Rename Column
Rename Table
Add Lookup Table
Introduce Column Constraint
Introduce Default Value
Make Column Not-Nullable
Add Foreign Key Constraint
Introduce Soft Delete
Introduce Index
Introduce Read-Only Table
References
Questions? Gossip? Rumor?
Thanks

Editor's notes

  3. *Multipurpose column. If a column is being used for several purposes, it is likely that extra code exists to ensure that the source data is being used the "right way," often by checking the values of one or more other columns. An example is a column used to store either someone's birth date if he or she is a customer or the start date if that person is an employee. Worse yet, you are likely constrained in the functionality that you can now support. For example, how would you store the birth date of an employee?
     *Multipurpose table. Similarly, when a table is being used to store several types of entities, there is likely a design flaw. An example is a generic Customer table that is used to store information about both people and corporations. The problem with this approach is that the data structures for people and corporations differ: people have a first, middle, and last name, for example, whereas a corporation simply has a legal name. A generic Customer table would have columns that are NULL for some kinds of customers but not others.
     *Redundant data. Redundant data is a serious problem in operational databases because when data is stored in several places, inconsistencies can arise. For example, it is quite common to discover that customer information is stored in many different places within your organization. In fact, many companies are unable to put together an accurate list of who their customers actually are. The problem is that in one table John Smith lives at 123 Main Street, and in another table at 456 Elm Street. In this case, this is actually one person who used to live at 123 Main Street but who moved last year; unfortunately, John did not submit two change of address forms to your company, one for each application that knows about him.
     *Tables with too many columns. When a table has many columns, it is indicative that the table lacks cohesion, meaning that it is trying to store data from several entities. Perhaps your Customer table contains columns to store three different addresses (shipping, billing, seasonal) or several phone numbers (home, work, cell, and so on). You likely need to normalize this structure by adding Address and PhoneNumber tables.
     *Tables with too many rows. Large tables are indicative of performance problems. For example, it is time-consuming to search a table with millions of rows. You may want to split the table vertically by moving some columns into another table, or split it horizontally by moving some rows into another table. Both strategies reduce the size of the table, potentially improving performance.
     *"Smart" columns. A smart column is one in which different positions within the data represent different concepts. For example, if the first four digits of the client ID indicate the client's home branch, then client ID is a smart column because you can parse it to discover more granular information (for example, home branch ID). Another example is a text column used to store XML data structures; clearly, you can parse the XML data structure for smaller data fields. Smart columns often need to be reorganized into their constituent data fields at some point so that the database can easily deal with them as separate elements.
     *Fear of change. If you are afraid to change your database schema because you are afraid to break something (for example, the 50 applications that access it), that is the surest sign that you need to refactor your schema. Fear of change is a good indication that you have a serious technical risk on your hands, one that will only get worse over time.
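The "smart" column smell can be made concrete with a small sketch. It uses Python's built-in sqlite3 module and an in-memory database; the `client` table, its columns, and the sample IDs are hypothetical, chosen only to mirror the client ID example in the note. It shows one possible way of pulling the branch prefix out into its own column, not a prescribed procedure.

```python
import sqlite3

def split_smart_column(conn):
    """Copy the first four characters of client_id into a separate column."""
    # home_branch_id is a hypothetical column name used for illustration.
    conn.execute("ALTER TABLE client ADD COLUMN home_branch_id TEXT")
    conn.execute("UPDATE client SET home_branch_id = substr(client_id, 1, 4)")

conn = sqlite3.connect(":memory:")
# client_id is stored as text so that leading zeros in the prefix survive.
conn.execute("CREATE TABLE client (client_id TEXT PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO client (client_id, name) VALUES (?, ?)",
    [("00420001", "Ann"), ("01370002", "Bob")],
)

split_smart_column(conn)

rows = conn.execute(
    "SELECT client_id, home_branch_id FROM client ORDER BY client_id"
).fetchall()
print(rows)
```

Once the branch lives in its own column, queries can filter or join on it directly instead of parsing the ID each time.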
  4. *Introduce referential integrity. You may want to introduce a referential integrity constraint on an existing Address.State to ensure the quality of the data.
     *Provide code lookup. Many times you want to provide a defined list of codes in your database instead of having an enumeration in every application. The lookup table is often cached in memory.
     *Replace a column constraint. When you introduced the column, you added a column constraint to ensure that a small number of correct code values persisted. But, as your application(s) evolved, you needed to introduce more code values, until you got to the point where it was easier to maintain the values in a lookup table instead of updating the column constraint.
     *Provide detailed descriptions. In addition to defining the allowable codes, you may also want to store descriptive information about the codes. For example, in the State table, you may want to relate the code CA to California.
     1. Determine the table structure. You must identify the column(s) of the lookup table (State).
     2. Introduce the table. Create State in the database via the CREATE TABLE command.
     3. Determine lookup data. You have to determine what rows are going to be inserted into State.
     4. Introduce referential constraint. To enforce referential integrity constraints from the code column in the source table(s) to State, you must apply the Add Foreign Key Constraint refactoring.
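The four steps above can be sketched against an in-memory SQLite database using Python's sqlite3 module. The column names (`AddressID`, `Street`, `Name`), the sample rows, and the choice to seed State from the codes already present in Address are assumptions made for illustration. SQLite cannot add a foreign key to an existing table with ALTER TABLE, so step 4 is done here by rebuilding Address; other database products typically offer an ALTER TABLE form instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled

# Starting point: Address.State is a free-form code column.
conn.execute(
    "CREATE TABLE Address (AddressID INTEGER PRIMARY KEY, Street TEXT, State TEXT)"
)
conn.executemany(
    "INSERT INTO Address (Street, State) VALUES (?, ?)",
    [("123 Main Street", "CA"), ("456 Elm Street", "NY"), ("9 Oak Avenue", "CA")],
)

# Steps 1 and 2: decide on the structure and create the lookup table.
conn.execute("CREATE TABLE State (State TEXT PRIMARY KEY, Name TEXT)")

# Step 3: determine lookup data (assumption: seed from codes already in use).
conn.execute("INSERT INTO State (State) SELECT DISTINCT State FROM Address")
conn.execute("UPDATE State SET Name = 'California' WHERE State = 'CA'")

# Step 4: introduce the referential constraint by rebuilding Address.
conn.execute(
    "CREATE TABLE Address_new ("
    " AddressID INTEGER PRIMARY KEY,"
    " Street TEXT,"
    " State TEXT REFERENCES State (State))"
)
conn.execute(
    "INSERT INTO Address_new SELECT AddressID, Street, State FROM Address"
)
conn.execute("DROP TABLE Address")
conn.execute("ALTER TABLE Address_new RENAME TO Address")

codes = [row[0] for row in conn.execute("SELECT State FROM State ORDER BY State")]

# An unknown code is now rejected by the database itself.
try:
    conn.execute("INSERT INTO Address (Street, State) VALUES ('1 Nowhere', 'ZZ')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(codes, rejected)
```

Seeding from existing data keeps every current row valid when the constraint is introduced; a curated list of codes would work as well, provided the existing rows are cleaned up first.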
  5. *Identifying a true default can be difficult. When many applications share the same database, they may have different default values for the same column, often for good reasons. Or it may simply be that your business stakeholders cannot agree on a single value, in which case you need to work closely with them to negotiate the correct value.
     *Unintended side effects. Some applications may assume that a null value within a column actually means something and will therefore exhibit different behavior now that columns in new rows that formerly would have been null are not.
     *Confused context. When a column is not used by an application, the default value may introduce confusion over the column's usage with the application team.
     1. Invariants are broken by the new value. For example, a class may assume that the value of a color column is red, green, or blue, but the default value has now been defined as yellow.
     2. Code exists to apply default values. There may now be extraneous source code that checks for a null value and introduces the default value programmatically. This code should be removed.
     3. Existing source code assumes a different default value. For example, existing code may look for the default value of none, which was set programmatically in the past, and if found it gives users the option to change the color. Now the default value is yellow, so this code will never be invoked.
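The second and third points can be illustrated with a short sketch using Python's sqlite3 module. The tables `ItemBefore` and `ItemAfter` and the function `effective_color` are hypothetical; two tables stand in for the schema before and after the refactoring because SQLite has no statement for changing a column default in place.

```python
import sqlite3

def effective_color(stored):
    """Legacy application rule: substitute 'none' when the database holds NULL."""
    return "none" if stored is None else stored

conn = sqlite3.connect(":memory:")
# Schema before the refactoring: no default, so an omitted color is NULL.
conn.execute("CREATE TABLE ItemBefore (ItemID INTEGER PRIMARY KEY, Color TEXT)")
# Schema after the refactoring: the database supplies 'yellow'.
conn.execute(
    "CREATE TABLE ItemAfter (ItemID INTEGER PRIMARY KEY, Color TEXT DEFAULT 'yellow')"
)

# The same insert, which omits Color, run against both schemas.
conn.execute("INSERT INTO ItemBefore (ItemID) VALUES (1)")
conn.execute("INSERT INTO ItemAfter (ItemID) VALUES (1)")

before = conn.execute("SELECT Color FROM ItemBefore").fetchone()[0]
after = conn.execute("SELECT Color FROM ItemAfter").fetchone()[0]

# Before: the application-side default fires. After: it never does.
print(effective_color(before), effective_color(after))
```

Any code path that depended on seeing `none` for new rows silently stops running once the database default is in place, which is why such code needs to be found and reworked as part of the refactoring.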
  6. 1. Similar RI code. Some external programs will implement the RI business rule that will now be handled via the foreign key constraint within the database. This code should be removed.
     2. Different RI code. Some external programs will include code that enforces different RI business rules than what you are about to implement. The implication is that you either need to reconsider adding this foreign key constraint, because there is no consensus within your organization regarding the business rule that it implements, or you need to rework the code to work based on this new version (from its point of view) of the business rule.
     3. Nonexistent RI code. Some external programs will not even be aware of the RI business rule pertaining to these data tables.
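As a rough illustration of the "similar RI code" case, the sketch below contrasts an application-side existence check with reliance on the foreign key constraint. It uses Python's sqlite3 module; the `Customer` and `Account` tables and both function names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.execute("CREATE TABLE Customer (CustomerID INTEGER PRIMARY KEY)")
conn.execute(
    "CREATE TABLE Account ("
    " AccountID INTEGER PRIMARY KEY,"
    " CustomerID INTEGER NOT NULL REFERENCES Customer (CustomerID))"
)
conn.execute("INSERT INTO Customer (CustomerID) VALUES (1)")

def add_account_app_check(conn, account_id, customer_id):
    """Old style: the program looks up the parent row before inserting."""
    parent = conn.execute(
        "SELECT 1 FROM Customer WHERE CustomerID = ?", (customer_id,)
    ).fetchone()
    if parent is None:
        return False
    conn.execute("INSERT INTO Account VALUES (?, ?)", (account_id, customer_id))
    return True

def add_account_db_check(conn, account_id, customer_id):
    """New style: the foreign key constraint rejects orphan rows."""
    try:
        conn.execute("INSERT INTO Account VALUES (?, ?)", (account_id, customer_id))
    except sqlite3.IntegrityError:
        return False
    return True

results = [
    add_account_app_check(conn, 10, 1),   # parent exists
    add_account_app_check(conn, 11, 99),  # no such customer
    add_account_db_check(conn, 12, 1),    # parent exists
    add_account_db_check(conn, 13, 99),   # no such customer
]
print(results)
```

Both functions give the same answers here, so the lookup in the first one becomes redundant once the constraint exists. A program with nonexistent RI code would need the error handling of the second function added.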
  7. *Improve query performance. Querying a given set of tables may be very slow because of the requisite joins; therefore, a prepopulated table may improve overall performance.
     *Summarize data for reporting. Many reports require summary data, which can be prepopulated into a read-only table and then used many times over.
     *Create redundant data. Many applications query data in real time from other databases. A read-only table containing this data in your local database reduces your dependency on these other database(s), providing a buffer for when they go down or are taken down for maintenance.
     *Replace redundant reads. Several external programs, or stored procedures for that matter, often implement the same retrieval query. These queries can be replaced by a common read-only table or a new view.
     *Data security. A read-only table enables end users to query the data but not update it.
     *Improve database readability. If you have a highly normalized database, it is usually difficult for users to navigate through all the tables to get to the required information. By introducing read-only tables that capture common, denormalized data structures, you make your database schema easier to understand because people can start by focusing just on the denormalized tables.
  8. *Periodic refresh. Use a scheduled job that refreshes your read-only table. The job may refresh all the data in the read-only table or it may just update the changes since the last refresh. Note that the amount of time taken to refresh the data should be less than the scheduled interval between refreshes. This technique is particularly suited to data warehouse environments, where data is generally summarized and used the next day. Hence, stale data can be tolerated; also, this approach provides you with an easier way to synchronize the data.
     *Materialized views. Some database products provide a feature where a view is no longer just a query; instead, it is actually a table based on a query. The database keeps this materialized view current based on the options you choose when you create it. This technique enables you to use the database's built-in features to refresh the data in the materialized view, with the major downside being the complexity of the view SQL. When the view SQL gets more complicated, the database products tend not to support automated synchronization of the view.
     *Use trigger-based synchronization. Create triggers on the source tables so that source data changes are propagated to the read-only table. This technique enables you to custom code the data synchronization, which is desirable when you have complex data objects that need to be synchronized; however, you must write all of the triggers, which could be time consuming.
     *Use real-time application updates. You can change your application so that it updates the read-only table, making the data current. This can only work when you know all the applications that are writing data to your source database tables. This technique allows the application to update the read-only table, so it is always kept current, and you can make sure that the data is not used by the application. The downside of the technique is that you must write your information twice, first to the original table and second to the denormalized read-only table; this could lead to duplication and hence bugs.
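Trigger-based synchronization can be sketched with SQLite triggers driven from Python's sqlite3 module. The source table `Account`, the read-only summary table `CustomerPortfolio`, and the trigger names are hypothetical, and the update trigger assumes that an account never moves to a different customer. The final aggregate query is the kind of statement a periodic refresh job might run; here it only serves to confirm that the triggers kept the summary in step.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Account (
        AccountID  INTEGER PRIMARY KEY,
        CustomerID INTEGER NOT NULL,
        Balance    INTEGER NOT NULL
    );
    -- Denormalized summary that applications read but never write.
    CREATE TABLE CustomerPortfolio (
        CustomerID   INTEGER PRIMARY KEY,
        TotalBalance INTEGER NOT NULL
    );

    CREATE TRIGGER Account_after_insert AFTER INSERT ON Account
    BEGIN
        INSERT OR IGNORE INTO CustomerPortfolio (CustomerID, TotalBalance)
        VALUES (NEW.CustomerID, 0);
        UPDATE CustomerPortfolio
           SET TotalBalance = TotalBalance + NEW.Balance
         WHERE CustomerID = NEW.CustomerID;
    END;

    -- Assumes CustomerID is never changed on an existing account.
    CREATE TRIGGER Account_after_update AFTER UPDATE OF Balance ON Account
    BEGIN
        UPDATE CustomerPortfolio
           SET TotalBalance = TotalBalance - OLD.Balance + NEW.Balance
         WHERE CustomerID = NEW.CustomerID;
    END;

    CREATE TRIGGER Account_after_delete AFTER DELETE ON Account
    BEGIN
        UPDATE CustomerPortfolio
           SET TotalBalance = TotalBalance - OLD.Balance
         WHERE CustomerID = OLD.CustomerID;
    END;
""")

# Ordinary writes to the source table; the triggers maintain the summary.
conn.execute("INSERT INTO Account VALUES (1, 7, 100)")
conn.execute("INSERT INTO Account VALUES (2, 7, 50)")
conn.execute("INSERT INTO Account VALUES (3, 8, 30)")
conn.execute("UPDATE Account SET Balance = 80 WHERE AccountID = 1")
conn.execute("DELETE FROM Account WHERE AccountID = 2")

trigger_rows = conn.execute(
    "SELECT CustomerID, TotalBalance FROM CustomerPortfolio ORDER BY CustomerID"
).fetchall()

# What a full recomputation from the source table would produce.
recomputed = conn.execute(
    "SELECT CustomerID, SUM(Balance) FROM Account"
    " GROUP BY CustomerID ORDER BY CustomerID"
).fetchall()

print(trigger_rows, recomputed)
```

Three triggers are needed for even this small summary, which reflects the note's point that the custom code gives full control over synchronization at the cost of writing and maintaining every trigger.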