1. Migrating 100,000 pages of content
From Legacy CMS to Drupal
Rachel Jaro
Solutions Architect at PrometSource
www.prometsource.com
2. Overview
We’ll talk about:
Successful migration recipe
Common questions you should be asking before you
start
Top 3 tools to do migration in Drupal
Issues
Tools to use in URL Rewriting
File management Comparison in D6
Testing
Deploying Solution
3. Data Migration
“Data migration solutions extract data from a source
system, correct errors, reformat, restructure and load
the data into a replacement target system”.
It sounds simple, but poorly managed data migration
is the most common cause of failure in implementing
a replacement system.
-- Gershon Pick, March 2001
6. Plan: What to Ask
Node types (Content separation, fields)
Do you want to separate content into pages, articles,
biographies, news, etc.?
What fields are needed for each node?
Who can access it?
Do you really need that content type? Or can similar
content share one type and be separated by taxonomy instead?
7. Plan: What to Ask
Taxonomy (Categorization, tags)
Do you need to categorize nodes?
Do different categories need different access?
What kinds of taxonomy groups or vocabularies would
you need?
Permission (per nodes) and User Roles
Who is going to use the site?
What specific access rights do they need?
8. Plan: What to Ask
New URL mapping
Do you need SEO-friendly URLs?
Files, file permissions and file directories
Do you need an advanced file management or document
management tool?
Do you need a simpler solution? If so, how simple?
Do you need access rights for each folder?
Do you need a file-browser-style interface to access them?
What kinds of files do you need to store? Images, PDFs?
10. Requirements
Use CSV files to import data
Divide the migration into groups or sections
Map and replace old URLs with SEO-friendly URLs
Before: 05-200.htm
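A minimal sketch of the URL mapping requirement, assuming the new alias is built from the legacy directory plus a slug of the page title. The slide does not state the actual alias scheme, so the function names, the alias pattern and the .htm suffix handling here are illustrative assumptions only.

```python
import re
import unicodedata


def slugify(title: str) -> str:
    """Lower-case the title and collapse runs of non-alphanumerics into hyphens."""
    ascii_title = (
        unicodedata.normalize("NFKD", title).encode("ascii", "ignore").decode("ascii")
    )
    return re.sub(r"[^a-z0-9]+", "-", ascii_title.lower()).strip("-")


def old_path(old_dir: str, legacy_name: str) -> str:
    """Rebuild the legacy path, assuming pages were served as <name>.htm."""
    return old_dir.strip("/") + "/" + legacy_name + ".htm"


def new_alias(old_dir: str, title: str) -> str:
    """Build a hypothetical SEO-friendly alias: legacy directory + title slug."""
    return old_dir.strip("/") + "/" + slugify(title)
```

Each migrated row then yields an (old path, new alias) pair that can feed both the alias setting and the redirect from the old URL.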
11. Data in CSV Example
December 13, 2005 3:39:54 PM||||||||||December 13, 2005||||||||||Report
Spotlights Need for Reform in Jackpot
Jurisdictions||||||||||/press/releases/2005/december/||||||||||05-
200||||||||||{UUID}|||||||||| Economics^^^^^^^^^^Economy ||||||||||
<p>LoremIpsum is simply dummy text of the printing and typesetting
industry. LoremIpsum has been the industry's standard dummy text
ever since the 1500s, when an unknown printer took a galley of type and
scrambled it to make a type specimen book. </p>
<p>LoremIpsum is simply dummy text of the printing and typesetting
industry. LoremIpsum has been the industry's standard dummy text
ever since the 1500s, when an unknown printer took a galley of type and
scrambled it to make a type specimen book. </p>
$$$$$$$$$$
Separator: ||||||||||
End of Row: $$$$$$$$$$
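The export format above can be read with a few lines of Python. This is a sketch under assumptions: the field names are hypothetical labels for the eight columns in the sample row, and the ^^^^^^^^^^ run is assumed to separate multiple values inside one field, as it does between the two category terms in the sample.

```python
FIELD_SEP = "|" * 10   # separator between fields
ROW_END = "$" * 10     # end-of-row marker
MULTI_SEP = "^" * 10   # assumed separator for multi-value fields

# Hypothetical names for the eight columns seen in the sample row.
FIELD_NAMES = [
    "created", "published", "title", "path",
    "legacy_name", "uuid", "terms", "body",
]


def parse_export(text):
    """Split the export into rows and fields; return one dict per row."""
    records = []
    for raw_row in text.split(ROW_END):
        if not raw_row.strip():
            continue  # skip the empty chunk after the final row marker
        values = [value.strip() for value in raw_row.split(FIELD_SEP)]
        if len(values) != len(FIELD_NAMES):
            raise ValueError(
                "expected %d fields, got %d" % (len(FIELD_NAMES), len(values))
            )
        record = dict(zip(FIELD_NAMES, values))
        # Multi-value field: split into a clean list of terms.
        record["terms"] = [
            term.strip() for term in record["terms"].split(MULTI_SEP) if term.strip()
        ]
        records.append(record)
    return records
```

Long, unusual separators like these make it safe for the HTML body to contain commas, quotes and newlines, which an ordinary comma-separated file would need escaping for.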
14. TW & Migrate Module Combo
http://drupal.org/project/tw
Table Wizard (TW) supports the Migrate module by exposing source data tables to Views
http://drupal.org/project/migrate
a flexible framework for migrating content
15. Migrate Module
Features:
users browse their legacy data using views
support for creating Drupal nodes, users, and
comments is included
hooks permit migration of other types of content.
provides a dashboard for running mini migrations
Drush support
16. Why I did not choose Migrate
Importing into MySQL was not an option; CSV files were
used instead
Could not map old URLs to new URLs
22. File Management
Client requirements
Intuitive
Has WYSIWYG support
Access control – upload, edit, delete, revise files by
different roles
Revision control – optional but good to have
Limited time!
25. URLs Rewriting Solution
Not recommended
.htaccess
Too many URLs to handle
Too much server load
Recommended
pathauto + path_redirect modules
automated alias settings
301 redirect set
global redirect
Additional reference:
http://acquia.com/blog/migrating-drupal-way-part-ii-saving-those-old-urls
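To picture why a stored redirect table scales better than one rewrite rule per page, here is a small sketch of a keyed lookup from old path to new alias. It is only an illustration of the idea: the function names are hypothetical, and it makes no claim about how the path_redirect module is implemented internally.

```python
from typing import Dict, Iterable, Optional, Tuple


def normalise(path: str) -> str:
    """Compare paths without leading/trailing slashes and ignoring case."""
    return path.strip("/").lower()


def build_redirects(pages: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Map each legacy path to its new alias, rejecting conflicting entries."""
    redirects: Dict[str, str] = {}
    for old, new in pages:
        key = normalise(old)
        if key in redirects and redirects[key] != new:
            raise ValueError("conflicting redirect for " + old)
        redirects[key] = new
    return redirects


def resolve(redirects: Dict[str, str], requested: str) -> Optional[Tuple[int, str]]:
    """Return (301, target) for a known legacy path, or None if unknown."""
    target = redirects.get(normalise(requested))
    if target is None:
        return None
    return (301, "/" + target)
```

A single dictionary lookup costs about the same whether the table holds a hundred entries or a hundred thousand, whereas a long list of rewrite rules is checked in order on every request.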
27. Access control Alternative
/default/files/PressReleases
/default/files/Documents
/default/files/International
/default/files/International/America
/default/files/International/England
/default/files/International/Asia
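One way to read this directory layout is as an access map: each folder is assigned to roles, and a file inherits the rule of its nearest listed parent folder. The sketch below assumes that reading; the role names and the inheritance rule are hypothetical and are not taken from the slides.

```python
from typing import Dict, Set

# Hypothetical role assignments for some of the directories listed above.
DIRECTORY_ROLES: Dict[str, Set[str]] = {
    "/default/files/PressReleases": {"press editor"},
    "/default/files/Documents": {"editor"},
    "/default/files/International": {"international editor"},
    "/default/files/International/Asia": {"asia editor"},
}


def allowed_roles(file_path: str, rules: Dict[str, Set[str]] = DIRECTORY_ROLES) -> Set[str]:
    """Return the roles of the deepest listed directory containing file_path."""
    parts = file_path.rstrip("/").split("/")
    while parts:
        candidate = "/".join(parts)
        if candidate in rules:
            return rules[candidate]
        parts.pop()  # move one directory level up
    return set()


def can_upload(user_roles: Set[str], file_path: str) -> bool:
    """True when the user holds at least one role allowed for that path."""
    return bool(user_roles & allowed_roles(file_path))
```

With this map, a folder without its own entry (England, for example) falls back to the rule of its parent, International.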
28. Test, Test and did I say Test?
Source: http://www.flickr.com/photos/paperpariah/2424107350/
29. Common problems
Broken links
Misconfigured page
Empty pages
Invalid date
File not found or orphan pages
Page format
Test when CACHE is on
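Several of these problems can be caught by an automated pass over the migrated records before anyone opens a browser. The sketch below is an assumption-laden example: the record keys (body, published), the date format (taken from the sample CSV row) and the rule that only site-relative links are checked are all illustrative choices.

```python
import re
from datetime import datetime

LINK_RE = re.compile(r'href="([^"#?]+)')   # link target without query/fragment
TAG_RE = re.compile(r"<[^>]+>")            # crude HTML tag stripper


def check_page(page, known_paths):
    """Return a list of problems found in one migrated page record."""
    problems = []
    body = page.get("body", "")

    # Empty page: nothing left once the markup is removed.
    if not TAG_RE.sub("", body).strip():
        problems.append("empty page")

    # Invalid date: must match the "December 13, 2005" style of the export.
    try:
        datetime.strptime(page.get("published", ""), "%B %d, %Y")
    except ValueError:
        problems.append("invalid date")

    # Broken link: site-relative target that no migrated page provides.
    for href in LINK_RE.findall(body):
        if href.startswith("/") and href not in known_paths:
            problems.append("broken link: " + href)

    return problems
```

Running this over every record gives a problem list per page, which is easier to triage than clicking through 100,000 pages by hand.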
35. Deployment Mockup
* shadow box is your migrated data’s production box
* replacing old CMS with Drupal
36. Deployment
Pros
Less risk, less stress
Editors can continue their daily data entry
Cons
URL rewriting can be tricky
Updating the production box with new content can be
an arduous task
37. Deployment: Updating Production
Automation
SVN
Drush scripts to migrate content from the tester’s box to
the shadow box
Deploy – http://drupal.org/project/deploy
Manual
Document configuration changes
Document database changes
38. Recap
SDLC + Agile
Common questions you should be asking before you
start
Top 3 tools to do migration in Drupal
TW & Migrate, node_import(), drush
Issues
File management Comparison in D6
Tools to use in URL Rewriting
Testing
Deployment Solution