An approach to assign BPEL workflow steps to available resources is presented. The approach takes data dependencies between workflow steps and the utilization of resources at runtime into account.
The scheduling algorithm simulates whether the makespan of workflows could be reduced by adding resources from a Cloud infrastructure. If so, Cloud resources are automatically provisioned and used to increase throughput.
The proposed approach does not require any changes to the BPEL standard. An implementation based on the ActiveBPEL engine and Amazon's Elastic Compute Cloud is presented.
Experimental results for a real-life workflow from a medical application indicate that workflow execution times can be reduced significantly.
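The provisioning decision described above can be sketched as follows. This is a minimal illustration, not the authors' scheduler: it replaces the genetic algorithm with a simple greedy list scheduler, ignores data dependencies, and treats all resources as identical. The names `simulate_makespan`, `should_provision`, and `min_gain` are hypothetical.

```python
import heapq


def simulate_makespan(step_runtimes, n_resources):
    """Greedy longest-first list scheduling; returns the simulated makespan."""
    # Each heap entry is the time at which one resource becomes free.
    free_at = [0.0] * n_resources
    heapq.heapify(free_at)
    for runtime in sorted(step_runtimes, reverse=True):
        earliest = heapq.heappop(free_at)
        heapq.heappush(free_at, earliest + runtime)
    return max(free_at)


def should_provision(step_runtimes, n_local, n_cloud, min_gain=0.0):
    """True if adding n_cloud resources cuts the simulated makespan by more than min_gain."""
    local_only = simulate_makespan(step_runtimes, n_local)
    with_cloud = simulate_makespan(step_runtimes, n_local + n_cloud)
    return local_only - with_cloud > min_gain
```

A caller would invoke `should_provision` with the estimated runtimes of the pending workflow steps and start Cloud instances only when it returns `True`.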
Data Flow Driven Scheduling of BPEL Workflows Using Cloud Resources, IEEE CLOUD 2010, Miami
1. Data Flow Driven Scheduling of BPEL
Workflows Using Cloud Resources
Tim Dörnemann, Ernst Juhnke, Thomas Noll,
Dominik Seiler, Bernd Freisleben
{doernemt, ejuhnke, noll, seiler, freisleb}@informatik.uni-marburg.de
3. Business Process Execution Language
• BPEL is the de facto standard for workflow / business process modeling in the web service area
• Programming in the large: complex applications are built by composing existing components (web services)
• The composed process is exposed as a web service itself and integrates perfectly into SOAs
5. BPEL – dynamic resource selection
• Destinations of invoke operations are typically set at design time
• Setting them at runtime is possible, but complicated
• Mix-up of business logic and infrastructural settings
• Very high modeling overhead to make resource selection dynamic
<assign>
  <copy>
    <from>
      <literal>
        <wsa:EndpointReference xmlns:ns="NSPACE">
          <wsa:Address>http://FQDN:PORT/SERVICE-ADDRESS</wsa:Address>
          <wsa:ServiceName PortName="Port">ns:SERVICE-NAME</wsa:ServiceName>
          <wsa:ReferenceParameters>
            <wsa:To>...</wsa:To>
            <wsa:Action>...</wsa:Action>
          </wsa:ReferenceParameters>
        </wsa:EndpointReference>
      </literal>
    </from>
    <to variable="targetEPR"/>
  </copy>
  <copy>
    <from variable="targetEPR"/>
    <to partnerLink="targetPL"/>
  </copy>
</assign>
6. Peak Load Scenario
• Scenario: static/pre-defined target hosts, workflow is invoked many times in parallel
• Leads to high load on workflow's target machines
– increase of workflow runtime / response time
– negative user experience
– loss of stability
• Worst case: abandonment of workflow
– waste of CPU hours (lost intermediate results)
12. Solution Requirements
• BPEL is a non-DAG workflow language
– (While) loops
– Rescheduling
– Low computation time
• Example:
– Workflow with 10 activities and 6 available resources
– 10^6 matches have to be computed
• Heuristic algorithm is necessary
13. Design: Genetic Algorithm
• Widely used approach in literature
• Natural choice
– Chromosome → invoke activity
– Genome → list of activities
– Population → set of candidate resource allocations
• Low risk of local minimum problem
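The mapping above can be illustrated with a small genetic algorithm. This is a sketch under simplifying assumptions, not the authors' implementation: each candidate is a list whose position i holds the resource index assigned to activity i, fitness is the plain makespan with data dependencies and transfer costs ignored, and the operators (elitist selection, one-point crossover, random mutation) as well as all names and parameter values are hypothetical choices.

```python
import random


def makespan(allocation, runtimes, n_resources):
    """Finish time of the busiest resource (data dependencies ignored)."""
    load = [0.0] * n_resources
    for activity, resource in enumerate(allocation):
        load[resource] += runtimes[activity]
    return max(load)


def genetic_schedule(runtimes, n_resources, pop_size=30, generations=100,
                     mutation_rate=0.1, seed=0):
    """Return (best_allocation, best_makespan) found by a simple GA."""
    rng = random.Random(seed)
    n = len(runtimes)

    def fitness(allocation):
        return makespan(allocation, runtimes, n_resources)

    # Population: a set of candidate resource allocations.
    population = [[rng.randrange(n_resources) for _ in range(n)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness)
        survivors = population[:pop_size // 2]  # elitist selection
        children = []
        while len(survivors) + len(children) < pop_size:
            parent_a, parent_b = rng.sample(survivors, 2)
            cut = rng.randrange(1, n) if n > 1 else 0  # one-point crossover
            child = parent_a[:cut] + parent_b[cut:]
            for i in range(n):
                if rng.random() < mutation_rate:  # random mutation
                    child[i] = rng.randrange(n_resources)
            children.append(child)
        population = survivors + children
    best = min(population, key=fitness)
    return best, fitness(best)
```

For the slide's example size, `genetic_schedule(runtimes, 6)` with a list of 10 estimated runtimes returns one resource index per activity together with the resulting makespan.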
15. Design: Critical Paths
• Critical path (CP) is a linear part of the data flow graph
• Reduces assignment complexity
• CPs are sorted according to their estimated runtime (descending)
– GA computes schedule for CPs in this order
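One way to read the decomposition above is as a split of the data flow graph into maximal linear chains, which are then ordered by estimated runtime. The sketch below follows that reading; it is an assumption about the procedure, not the authors' code, and `linear_paths`, `order_by_runtime`, and the activity names in the usage note are hypothetical.

```python
def linear_paths(successors):
    """Split a data flow graph into maximal linear chains.

    successors maps each activity to the activities that consume its output.
    """
    nodes = set(successors)
    for targets in successors.values():
        nodes.update(targets)
    predecessors = {node: [] for node in nodes}
    for source, targets in successors.items():
        for target in targets:
            predecessors[target].append(source)

    def continues_chain(node):
        # A node extends its predecessor's chain only if the link is one-to-one.
        preds = predecessors[node]
        return len(preds) == 1 and len(successors.get(preds[0], [])) == 1

    paths = []
    for start in sorted(nodes):
        if continues_chain(start):
            continue  # not the head of a chain
        path = [start]
        while True:
            nxt = successors.get(path[-1], [])
            if len(nxt) == 1 and continues_chain(nxt[0]):
                path.append(nxt[0])
            else:
                break
        paths.append(path)
    return paths


def order_by_runtime(paths, est_runtime):
    """Sort paths by their summed estimated runtime, longest first."""
    return sorted(paths, key=lambda p: sum(est_runtime[a] for a in p),
                  reverse=True)
```

With `{"a": ["b"], "b": ["c"], "c": ["e"], "d": ["e"], "e": []}` the chains are `a, b, c`, then `d`, then `e`; the join at `e` ends both incoming chains. A scheduler would then process the chains in the order returned by `order_by_runtime`.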
16. Design: Reservation
• Reserves resources for a certain time (exclusive allocation)
– Prevents overloading of resources
• Coordinates re-scheduling of subgraphs
– Reservations are removed when execution of the operation is finished
– If an operation has no reservation, either
• the reservation was violated and therefore removed, or
• the operation is in a cycle (while)
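The reservation idea can be sketched as a table of exclusive time slots per resource. This is an illustrative data structure only; the class `ReservationTable` and its methods are hypothetical, and detection of violated reservations is left out.

```python
class ReservationTable:
    """Exclusive time-slot reservations per resource (hypothetical helper)."""

    def __init__(self):
        self._slots = {}  # resource -> {operation: (start, end)}

    def reserve(self, resource, operation, start, end):
        """Reserve [start, end) exclusively; return False if the slot overlaps."""
        if end <= start:
            raise ValueError("end must be after start")
        slots = self._slots.setdefault(resource, {})
        for other_start, other_end in slots.values():
            if start < other_end and other_start < end:
                return False  # would overload the resource
        slots[operation] = (start, end)
        return True

    def release(self, resource, operation):
        """Remove the reservation once the operation has finished."""
        self._slots.get(resource, {}).pop(operation, None)

    def has_reservation(self, resource, operation):
        """True if the operation currently holds a slot on the resource."""
        return operation in self._slots.get(resource, {})
```

A scheduler could call `reserve` while building a schedule, `release` when an operation finishes, and treat a missing reservation (`has_reservation` returning `False`) as the trigger for re-scheduling the affected subgraph.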
20. Implementation (cont'd)
• Pass a reference (Flex-SwA) instead of the actual data
Details:
Steffen Heinzl, Markus Mathes, Thomas Friese, Matthew Smith, Bernd Freisleben
Flex-SwA: Flexible Exchange of Binary Data Based on SOAP Messages with Attachments
In: Proceedings of the IEEE International Conference on Web Services (ICWS), pp. 3-10, IEEE Computer Society Press, 2006
21. Evaluation
• Sample application stems from medical research (apnoea detection)
– heavily uses native code (PhysioToolkit)
• Total amount of transferred data per workflow
– 258 MB
– + 118 MB from client to engine
• Test bed
– Dedicated resources: Core2Duo E6850, 2 GB RAM
– Cloud resources: “High-CPU Medium Instance”, 5 EC2 Compute Units, 1.7 GB RAM
22. Evaluation (cont'd)
1. A new workflow is started every 30 seconds
2. Two workflows are started at an interval of 90 seconds
3. Four workflows are started concurrently
[Figure: timeline showing the start times of Workflow 1 to Workflow 5; horizontal axis: time]
23. Conclusion
• Data flow aware scheduler for BPEL
– uses genetic algorithm as heuristic
– reduces makespan of workflow
– utilizes existing and virtual resources more efficiently
• Future work
– extend approach to support multi-objective scheduling
• Example: cost and performance optimization
– Implementation details such as ahead-of-time provisioning of VMs to avoid delays (see last slide)
24. Thank you for your Attention!
Any Questions or Remarks?
{doernemt, ejuhnke, noll, seiler, freisleb}@informatik.uni-marburg.de