Batch Scripting with Drupal (Featuring the EntityFieldQuery API)
Engr. Ranel O. Padon
DrupalCamp Manila 2014
ranel.padon@gmail.com | https://github.com/ranelpadon
2. About Me
Full-time Drupal developer for CNN Travel.
Part-time Python lecturer.
Previously involved in computational Java and Python projects.
Plays competitive football and futsal.
3. TOPICS
Why do batch scripting?
How to leverage Entities and the EntityFieldQuery API.
How to implement a batch module.
Actual sample use cases.
4. Why do batching?
There are things that are hard to teach to robots:
spatial awareness and image interpretation.
But there are things at which machines easily beat humans:
doing repetitive tasks.
In Drupal, you don't want your site editors to do repetitive tasks:
e.g. updating a field to the same value.
6. Why do batching?
Batch processing is the execution of a series of programs on a
computer without manual intervention.
Drupal's Batch API is designed to integrate nicely with the Form API,
but it can also be used by non-Form API scripts.
7. Why do batching?
Avoids PHP timeouts (max_execution_time errors).
Suited to long and complex data processing.
Gives the admins real-time feedback and a summary of results.
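The timeout-avoidance idea can be sketched as an operation callback that handles only a small slice of work per request. This is a minimal sketch, assuming the Drupal 7 Batch API conventions for `$context` (`sandbox`, `results`, `message`, `finished`); the function name `mybatch_process_items()` and its parameters are hypothetical.

```php
<?php
/**
 * Batch operation sketch: processes a list of items a few at a time.
 *
 * The Batch API calls an operation repeatedly, spreading the calls over
 * several requests, until $context['finished'] reaches 1. The function
 * name and the $limit parameter are illustrative assumptions.
 */
function mybatch_process_items(array $items, $limit, &$context) {
  // First call: initialize the counters that persist between requests.
  if (!isset($context['sandbox']['progress'])) {
    $context['sandbox']['progress'] = 0;
    $context['sandbox']['max'] = count($items);
  }

  // Take only the next slice so that a single request stays short.
  $slice = array_slice($items, $context['sandbox']['progress'], $limit);
  foreach ($slice as $item) {
    // Placeholder for the real work (e.g. updating a node).
    $context['results'][] = $item;
    $context['sandbox']['progress']++;
  }

  // Feedback text shown on the progress page.
  $context['message'] = 'Processed ' . $context['sandbox']['progress']
    . ' of ' . $context['sandbox']['max'];

  // Fraction done; 1 tells the Batch API that the operation is complete.
  $context['finished'] = $context['sandbox']['max'] > 0
    ? $context['sandbox']['progress'] / $context['sandbox']['max']
    : 1;
}
```

Because the progress counter lives in `$context['sandbox']`, each request resumes where the previous one stopped, so no single request has to run for long.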
8. When to do batching?
Implementing installation profiles
Used by Drupal's install.php and update.php
Updating the value of a field for all Event nodes.
Deleting all nodes older than 3 years.
Migrating Column nodes to Column taxonomy terms.
Creating custom nodes upon saving content with an uploaded Excel file:
the Batch API is triggered by hook_node_presave() and
goes through each row of the attached Excel file.
9. The Rise of Entities
“Oh no, I left my stuff in our house.”
“Stuff,” just like entities, is a useful abstraction.
Its meaning can change depending on the context.
10. The Rise of Entities
In physics, you could treat each object under study as a particle.
In Drupal, the objects are called entities.
Entities facilitate a unified way to work with different data units.
Concept simplification contributes to better modularity, flexibility,
and maintainability.
http://evolvingweb.ca/story/understanding-entity-api-module
11. The Rise of Entities
Before D7, users and comments didn't have the same power
that nodes (content) had:
no translations, fields, versioning, and so on.
Views (which relies on fields) didn’t work with comments and users.
12. The Rise of Entities
A field is a reusable piece of content. Each field is a primitive data type,
with custom validators and widgets for editing, and formatters for display.
An entity type groups fields together (use the Entity API for custom ones):
Nodes (content)
Comments
Taxonomy Terms
User Profiles
13. The Rise of Entities
Bundles are an implementation of an entity type.
They are subtypes of an entity type.
Bundles (subtypes) like articles, events, blog posts, or products could be
defined on the node entity type. You could add a file download field on
Basic Pages and a subtitle field on Articles.
You could also assign a geographic coordinates field to all bundles/entities.
14. The Rise of Entities
An entity is one instance of a particular entity type
(a specific article or a specific user, loaded via entity_load()).
Drupal 7 Core provides entity_load(), while the Entity API contrib
module provides entity_save() and entity_delete().
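Loading and saving entities could look roughly like this. It is a sketch only: `entity_load()` comes from Drupal 7 core and `entity_save()` from the Entity API contrib module, as noted above, while the function name `mybatch_prefix_titles()` and the title change are hypothetical.

```php
<?php
/**
 * Loads nodes by ID, prefixes their titles, and saves them.
 *
 * Assumes the Entity API contrib module is enabled (for entity_save()).
 * The function name and the prefixing logic are illustrative only.
 */
function mybatch_prefix_titles(array $nids, $prefix) {
  // entity_load() returns an array of entity objects keyed by entity ID.
  $nodes = entity_load('node', $nids);
  foreach ($nodes as $node) {
    $node->title = $prefix . $node->title;
    entity_save('node', $node);
  }
  // Number of entities that were actually found and saved.
  return count($nodes);
}
```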
15. The Rise of Entities
In terms of Object-Oriented Programming:
An entity type is a base class
A bundle is an extended/derived class
A field is a class member, attribute, or property
An entity is an object or instance of a base or extended class
16. EntityFieldQuery API
Tool for querying Entities (compared to db_select())
Can query entity properties and fields
Can query field data across entity types:
Fetch all pages and users tagged with taxonomy term “premium”
Returns entity IDs that you could load using entity_load()
Database-agnostic (no issues when migrating from MySQL to PostgreSQL)
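A query of this kind might be written as follows. This is a minimal sketch using the Drupal 7 `EntityFieldQuery` class; the function name `mybatch_find_old_nids()`, its parameters, and the field name mentioned in the comment are assumptions for illustration.

```php
<?php
/**
 * Returns the IDs of published nodes of one bundle created before a cutoff.
 *
 * The function name and parameters are hypothetical.
 */
function mybatch_find_old_nids($bundle, $cutoff_timestamp, $limit = 50) {
  $query = new EntityFieldQuery();
  $query->entityCondition('entity_type', 'node')
    ->entityCondition('bundle', $bundle)
    // Properties are columns of the entity's base table.
    ->propertyCondition('status', 1)
    ->propertyCondition('created', $cutoff_timestamp, '<')
    // A field could be filtered in the same chain, for example
    // ->fieldCondition('field_tags', 'tid', $tid), assuming a term
    // reference field named field_tags exists on the bundle.
    ->propertyOrderBy('nid', 'ASC')
    ->range(0, $limit);

  // The result is keyed by entity type, then by entity ID.
  $result = $query->execute();
  return isset($result['node']) ? array_keys($result['node']) : array();
}
```

The returned IDs can then be passed to `entity_load('node', $nids)` to get the full node objects.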
28. Custom Batch Module
The usual suspects:
mybatch.info
mybatch.module
Then, enable the module in “admin/modules” or using Drush:
$ drush en -y mybatch
29. Custom Batch Module
Information required in mybatch.module, without a form:
I. URL that will be utilized.
II. Function callback definition for that URL.
Usually contains the setup for the batch process.
III. The batch operation's function definition.
IV. The function definition that will be called after the batch operation.
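The four pieces listed above could be sketched as follows for Drupal 7. The path `mybatch/run`, every `mybatch_*` function name, the chunk size of 10, and the "re-save each node" work are hypothetical; `batch_set()`, `batch_process()`, and the `$context` keys are the Batch API's own.

```php
<?php
/**
 * Implements hook_menu(). (I) The URL that starts the batch.
 */
function mybatch_menu() {
  $items['mybatch/run'] = array(
    'title' => 'Run my batch',
    'page callback' => 'mybatch_run',
    'access arguments' => array('administer nodes'),
  );
  return $items;
}

/**
 * (II) Page callback: sets up and starts the batch.
 */
function mybatch_run() {
  $batch = array(
    'title' => t('Updating Event nodes'),
    'operations' => array(
      array('mybatch_operation', array('event')),
    ),
    'finished' => 'mybatch_finished',
  );
  batch_set($batch);
  // Outside a form submit handler the batch is started explicitly; the
  // argument is the path to redirect to once the batch has finished.
  batch_process('admin/content');
}

/**
 * Hypothetical helper: returns all node IDs of the given bundle.
 */
function mybatch_get_nids($bundle) {
  $query = new EntityFieldQuery();
  $query->entityCondition('entity_type', 'node')
    ->entityCondition('bundle', $bundle);
  $result = $query->execute();
  return isset($result['node']) ? array_keys($result['node']) : array();
}

/**
 * (III) Batch operation: called repeatedly until 'finished' reaches 1.
 */
function mybatch_operation($bundle, &$context) {
  if (!isset($context['sandbox']['nids'])) {
    $context['sandbox']['nids'] = mybatch_get_nids($bundle);
    $context['sandbox']['max'] = count($context['sandbox']['nids']);
  }
  // Remove the next 10 IDs from the queue and process only those.
  $nids = array_splice($context['sandbox']['nids'], 0, 10);
  foreach (node_load_multiple($nids) as $node) {
    // Placeholder for the real work; here the node is only re-saved.
    node_save($node);
    $context['results'][] = $node->nid;
  }
  $max = $context['sandbox']['max'];
  $done = $max - count($context['sandbox']['nids']);
  $context['message'] = t('Processed @done of @max nodes.', array('@done' => $done, '@max' => $max));
  $context['finished'] = $max > 0 ? $done / $max : 1;
}

/**
 * (IV) Finished callback: runs once after all operations.
 */
function mybatch_finished($success, $results, $operations) {
  if ($success) {
    drupal_set_message(format_plural(count($results), 'Processed 1 node.', 'Processed @count nodes.'));
  }
  else {
    drupal_set_message(t('The batch finished with an error.'), 'error');
  }
}
```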
38. Custom Batch Module
Information required in mybatch.module, when using a form:
I. URL that will be utilized.
II. Form builder callback definition for that URL
(a form builder function passed to drupal_get_form()).
III. Form submit handler definition.
Usually contains the setup for the batch process.
IV. The batch operation's function definition.
V. The function definition that will be called after the batch operation.
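A form-driven version of the five pieces above might look like this. It is a standalone sketch of one possible mybatch.module for Drupal 7: the path, the form element, the chunk size of 20, and all function names other than the hook are hypothetical. Here the work is split into one operation per chunk of node IDs, which is one of several possible designs.

```php
<?php
/**
 * Implements hook_menu(). (I) The URL; drupal_get_form() renders the
 * form builder named in 'page arguments'.
 */
function mybatch_menu() {
  $items['admin/content/mybatch'] = array(
    'title' => 'Batch update',
    'page callback' => 'drupal_get_form',
    'page arguments' => array('mybatch_form'),
    'access arguments' => array('administer nodes'),
  );
  return $items;
}

/**
 * (II) Form builder.
 */
function mybatch_form($form, &$form_state) {
  $form['bundle'] = array(
    '#type' => 'select',
    '#title' => t('Content type'),
    '#options' => node_type_get_names(),
    '#required' => TRUE,
  );
  $form['submit'] = array(
    '#type' => 'submit',
    '#value' => t('Start batch'),
  );
  return $form;
}

/**
 * (III) Submit handler: sets up the batch.
 */
function mybatch_form_submit($form, &$form_state) {
  $query = new EntityFieldQuery();
  $query->entityCondition('entity_type', 'node')
    ->entityCondition('bundle', $form_state['values']['bundle']);
  $result = $query->execute();
  $nids = isset($result['node']) ? array_keys($result['node']) : array();

  // One operation per chunk of 20 node IDs.
  $operations = array();
  foreach (array_chunk($nids, 20) as $chunk) {
    $operations[] = array('mybatch_form_operation', array($chunk));
  }
  $batch = array(
    'title' => t('Processing nodes'),
    'operations' => $operations,
    'finished' => 'mybatch_form_finished',
  );
  // Inside a submit handler the Form API starts the batch itself, so
  // batch_process() is not called here.
  batch_set($batch);
}

/**
 * (IV) Batch operation: handles one chunk of node IDs.
 */
function mybatch_form_operation(array $nids, &$context) {
  foreach (node_load_multiple($nids) as $node) {
    // Placeholder for the real work; here the node is only re-saved.
    node_save($node);
    $context['results'][] = $node->nid;
  }
  $context['message'] = t('Processed @count nodes so far.', array('@count' => count($context['results'])));
}

/**
 * (V) Finished callback.
 */
function mybatch_form_finished($success, $results, $operations) {
  $message = $success
    ? format_plural(count($results), 'Processed 1 node.', 'Processed @count nodes.')
    : t('The batch finished with an error.');
  drupal_set_message($message, $success ? 'status' : 'error');
}
```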
60. Summary
Information required in mybatch.module, without a form:
I. URL that will be utilized.
II. Function callback definition for that URL.
Usually contains the setup for the batch process.
III. The batch operation's function definition.
IV. The function definition that will be called after the batch operation.
61. Summary
Information required in mybatch.module, when using a form:
I. URL that will be utilized.
II. Form builder callback definition for that URL
(a form builder function passed to drupal_get_form()).
III. Form submit handler definition.
Usually contains the setup for the batch process.
IV. The batch operation's function definition.
V. The function definition that will be called after the batch operation.
62. Summary
A Batch API implementation without a form can also be
integrated with Drupal hooks or even Drush.
Integrating EntityFieldQuery into the batch operation callbacks
makes data fetching in the batch process more readable,
flexible, and maintainable.
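Both points can be illustrated with one sketch: a Drush command that runs a batch whose operation pages through nodes with EntityFieldQuery. It assumes Drush's `drush_backend_batch_process()` helper and a hypothetical mybatch.drush.inc file; the command name, the callback names, and the page size of 25 are illustrative assumptions.

```php
<?php
/**
 * Implements hook_drush_command().
 *
 * Assumed to live in a hypothetical mybatch.drush.inc file.
 */
function mybatch_drush_command() {
  $items['mybatch-run'] = array(
    'description' => 'Runs the mybatch batch from the command line.',
    'callback' => 'mybatch_drush_run',
  );
  return $items;
}

/**
 * Command callback: sets up the batch and hands it to Drush.
 */
function mybatch_drush_run() {
  $batch = array(
    'title' => 'Processing nodes',
    'operations' => array(
      array('mybatch_drush_operation', array()),
    ),
  );
  batch_set($batch);
  // Mark the batch as non-progressive, since there is no browser to
  // redirect, then let Drush process it.
  $batch =& batch_get();
  $batch['progressive'] = FALSE;
  drush_backend_batch_process();
}

/**
 * Batch operation: fetches the next page of node IDs on every call.
 */
function mybatch_drush_operation(&$context) {
  if (!isset($context['sandbox']['last_nid'])) {
    $context['sandbox']['last_nid'] = 0;
  }
  $query = new EntityFieldQuery();
  $query->entityCondition('entity_type', 'node')
    ->propertyCondition('nid', $context['sandbox']['last_nid'], '>')
    ->propertyOrderBy('nid', 'ASC')
    ->range(0, 25);
  $result = $query->execute();

  // No more nodes: tell the Batch API that the operation is complete.
  if (empty($result['node'])) {
    $context['finished'] = 1;
    return;
  }
  foreach (array_keys($result['node']) as $nid) {
    // Placeholder for the real work on each node.
    $context['results'][] = $nid;
    $context['sandbox']['last_nid'] = $nid;
  }
  // Any value below 1 asks the Batch API to call this operation again.
  $context['finished'] = 0;
}
```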
63. Recommended Links
Batch API Docs:
https://drupal.org/node/180528
Examples Module (Batch API):
http://d7.drupalexamples.info/examples/batch_example
Batch API integration with Drush:
http://www.metaltoad.com/blog/using-drupal-batch-api
EntityFieldQuery Docs:
https://drupal.org/node/1343708
EntityFieldQuery as alternative to Views:
http://treehouseagency.com/blog/tim-cosgrove/2012/02/16/entityfieldquery-let-drupal-do-heavy-lifting-pt-1