Drupal Migration
Migrating 100,000 pages of content from a legacy CMS to Drupal
Rachel Jaro, Solutions Architect at PrometSource
www.prometsource.com
Overview
We’ll talk about:
- A successful migration recipe
- Common questions you should ask before you start
- Top 3 tools for migration in Drupal
- Issues
  - Tools to use for URL rewriting
  - File management comparison in D6
- Testing
- Deploying the solution
Data Migration
“Data migration solutions extract data from a source system, correct errors, reformat, restructure and load the data into a replacement target system.”
It sounds simple, but poorly managed data migration is the most common cause of failure in implementing a replacement system.
-- Gershon Pick, March 2001
Successful Migration Recipe
Planning Source: http://www.flickr.com/photos/bjornmeansbear/4380595283/
Plan: What to Ask
Node types (content separation, fields)
- Do you want to separate content into pages, articles, biographies, news, etc.?
- What fields are needed for each node?
- Who can access it?
- Do you really need that content type, or can we use taxonomies instead for similar content?
Plan: What to Ask
Taxonomy (categorization, tags)
- Do you need to categorize nodes?
- Would you need different access?
- What kinds of taxonomy groups or vocabularies would you need?
Permissions (per node) and user roles
- Who is going to use the site?
- What are their particular access rights?
Plan: What to Ask
New URL mapping
- Do you need SEO-friendly URLs?
Files, file permissions and file directories
- Do you need an advanced file management or document management tool?
- Do you need a simpler solution? How simple?
- Do you need access rights for each folder?
- Do you need a browser-type interface to access them?
- What kinds of files do you need to store? Images, PDFs?
Build
Requirements
- Use CSV files to import data
- Divide the migration into groups or sections
- Map and replace old URLs with SEO-friendly URLs
Before: 05-200.htm
Data in CSV Example
December 13, 2005 3:39:54 PM||||||||||December 13, 2005||||||||||Report Spotlights Need for Reform in Jackpot Jurisdictions||||||||||/press/releases/2005/december/||||||||||05-200||||||||||{UUID}||||||||||Economics^^^^^^^^^^Economy||||||||||<p>Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. </p> <p>Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. </p>$$$$$$$$$$
Separator: ||||||||||
End of Row: $$$$$$$$$$
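A minimal Python sketch of how an export in this shape could be parsed before handing it to a migration script. The field names and their order are assumptions read off the single sample row, and treating the run of carets as a multi-value separator for taxonomy terms is also an assumption; the slide only names the field separator and the end-of-row marker.

```python
# Delimiters as shown on the slide: ten pipes between fields,
# ten dollar signs at the end of a row.
FIELD_SEP = "|" * 10
ROW_END = "$" * 10
# Assumption: ten carets separate multiple values inside one field.
MULTI_SEP = "^" * 10

# Hypothetical field names, inferred from the sample row only.
FIELD_NAMES = [
    "modified", "published", "title", "path",
    "legacy_id", "uuid", "terms", "body",
]


def parse_export(text):
    """Split an export into rows and return a list of dicts."""
    rows = []
    for raw in text.split(ROW_END):
        raw = raw.strip()
        if not raw:
            continue  # skip whitespace after the last end-of-row marker
        parts = [part.strip() for part in raw.split(FIELD_SEP)]
        if len(parts) != len(FIELD_NAMES):
            raise ValueError(
                "expected %d fields, got %d" % (len(FIELD_NAMES), len(parts))
            )
        row = dict(zip(FIELD_NAMES, parts))
        # Turn the multi-value field into a list of terms.
        row["terms"] = [t.strip() for t in row["terms"].split(MULTI_SEP) if t.strip()]
        rows.append(row)
    return rows
```

Long, unusual delimiters like these are a practical choice because the body field holds HTML, which may itself contain commas, quotes, and line breaks.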
Content Type Division
Example: CNN.com
Divide migration sequences into US, World, Politics, Justice, etc.
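One way to divide a migration into sections is to group source rows by the first segment of their legacy path, so each group can be imported and tested on its own. This is only a sketch: it assumes each row is a dict with a hypothetical "path" key, and that the top-level path segment corresponds to a site section, which may not hold for every legacy CMS.

```python
from collections import defaultdict


def group_by_section(rows):
    """Group rows by the first segment of their legacy path."""
    groups = defaultdict(list)
    for row in rows:
        segments = [s for s in row["path"].split("/") if s]
        # Rows that live at the site root get their own bucket.
        section = segments[0] if segments else "(root)"
        groups[section].append(row)
    return dict(groups)
```

Each resulting group can then be run as its own migration sequence.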
Solutions/Tools
- TW and Migrate modules combo
- node_import()
- Drush + custom script
TW & Migrate Module Combo
- TW (http://drupal.org/project/tw): supports the Migrate module by making source data available as views
- Migrate (http://drupal.org/project/migrate): a flexible framework for migrating content
Migrate Module
Features:
- Users browse their legacy data using views
- Support for creating Drupal nodes, users, and comments is included
- Hooks permit migration of other types of content
- Provides a dashboard for running mini migrations
- Drush support
Why I did not choose Migrate
- Importing to MySQL was not an option; CSV files were used instead
- Cannot map old URLs to new URLs
node_import()
http://drupal.org/project/node_import
Features:
- Easy to learn; point and click
- Uses CSV to upload content
- Can easily delete previously imported data
- Can download the errors when an import fails, for easy reference when fixing issues
node_import() Problems
- I can’t map old URLs to new URLs
- No Drush support
- It doesn’t save my previous settings for a CSV
Drush + Custom Script
- Flexibility: I can do whatever I want with the data
Create your own migration script [demo]
Issues File Management URL Rewriting
File Management
Client requirements:
- Intuitive
- Has WYSIWYG support
- Access control: upload, edit, delete, revise files by different roles
- Revision control: optional but good to have
- Limited time!
File Management Modules *DbFm was not included due to problems encountered during tests in D6
URL Rewriting Source: http://www.flickr.com/photos/randomfactor/483264915/
URL Rewriting Solution
Not recommended: .htaccess
- Too many URLs to handle
- Too much server load
Recommended: pathauto + path_redirect modules
- Automated alias settings
- 301 redirects
- Set global redirect
Additional reference: http://acquia.com/blog/migrating-drupal-way-part-ii-saving-those-old-urls
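The mapping from old URLs to new aliases can be prepared as plain data before any module is involved. The Python sketch below builds one (old path, new alias) pair per record from the path, legacy ID, and title fields in the CSV sample. The alias pattern (legacy directory plus a slug of the title) is an assumption, since the deck shows only the "Before" form; the function names are hypothetical, and paths are written without a leading slash on the assumption that this is what the redirect import expects.

```python
import re


def slugify(title):
    """Lowercase the title and collapse non-alphanumeric runs to hyphens."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower())
    return slug.strip("-")


def redirect_pair(path, legacy_id, title):
    """Return (old_path, new_alias) for one migrated page."""
    base = path.strip("/")
    old_path = "%s/%s.htm" % (base, legacy_id)
    # Assumed alias pattern: keep the legacy directory, replace the file
    # name with a slug of the title.
    new_alias = "%s/%s" % (base, slugify(title))
    return old_path, new_alias
```

For the sample record this pairs `press/releases/2005/december/05-200.htm` with `press/releases/2005/december/report-spotlights-need-for-reform-in-jackpot-jurisdictions`. Each pair could then be registered as a 301 redirect during the import.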
URL Checker http://drupal.org/project/linkchecker
Access Control Alternative
/default/files/PressReleases
/default/files/Documents
/default/files/International
/default/files/International/America
/default/files/International/England
/default/files/International/Asia
Test, Test and did I say Test? Source: http://www.flickr.com/photos/paperpariah/2424107350/
Common Problems
- Broken links
- Misconfigured pages
- Empty pages
- Invalid dates
- File not found or orphan pages
- Page format
Test with caching turned on.
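Some of these problems can be caught on the source data before anything is imported. The following Python sketch flags missing titles, empty pages, and invalid dates for a single record. The dict keys and the date format ("December 13, 2005", as in the CSV sample) are assumptions, and month names are parsed in the default English locale; broken links and page format still need to be checked on the rendered site.

```python
from datetime import datetime

# Assumed date format, taken from the sample row: "December 13, 2005".
DATE_FORMAT = "%B %d, %Y"


def find_problems(row):
    """Return a list of problem labels for one source record."""
    problems = []
    if not row.get("title", "").strip():
        problems.append("missing title")
    if not row.get("body", "").strip():
        problems.append("empty page")
    try:
        datetime.strptime(row.get("published", ""), DATE_FORMAT)
    except ValueError:
        problems.append("invalid date")
    return problems
```

Running this over every record and writing out the ones with a non-empty result gives testers a short list to review instead of the full set of pages.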
Deployment
Deployment
2 ways to deploy your data to the live environment:
- All at once
- Divide and conquer
Deployment: Divide and Conquer Example: CNN Division
Deployment Mockup
* The shadow box is the production box for your migrated data
* The old CMS is still active at this time
Deployment
URL Testing
Deployment
Pros
- Less risk, less stress
- Editors can continue doing data entry daily
Cons
- URL rewriting can be tricky
- Updating the production box with new content can be an arduous task
Deployment: Updating Production
Automation
- SVN
- Drush scripts to migrate content from the tester’s box to the shadow box
- Deploy: http://drupal.org/project/deploy
Manual
- Document configuration changes
- Document database changes
Recap
- SDLC + Agile
- Common questions you should ask before you start
- Top 3 tools for migration in Drupal: TW & Migrate, node_import(), Drush
- Issues
  - File management comparison in D6
  - Tools to use for URL rewriting
- Testing
- Deployment solution
Questions?
Resources
http://groups.drupal.org/content-migration-import-and-export
http://drupal.org/handbook/migrating


Editor's Notes

  1. To do: compare the normal SDLC with the migration SDLC
  2. http://www.flickr.com/photos/14804582@N08/2111269218/