2011-10-31 | 01:30 PM - 02:15 PM
Spring Batch has a large user base and a good track record in production systems, but what is it all really about, and why does it work? This presentation provides a short bootstrap to get a new user started with the Batch domain, showing the key concepts and explaining the benefits of the framework. Then it goes into a deeper dive and looks at what holds it all together, with a close look at some of the most important but least understood features, including restart, retry and transactions.
Spring Day | Behind the Scenes at Spring Batch | Dave Syer
1. Inside Spring Batch
Dave Syer, VMware, JAX London 2011
Copyright 2007 SpringSource. Copying, publishing or distributing without express written permission is prohibited.
2. Overview
• Very quick start with Spring Batch
• Spring Batch Admin
• State management – thread isolation
• Retry and skip
• Restart and transactions
3. Processing the Same File Twice…
4. Spring Batch
• Application: business logic
• Batch Core: quality of service, auditability, management information
• Batch Infrastructure: re-usable low level stuff (flat files, XML files, database keys)
5. Spring Batch Admin
• Layers: Application, Batch Core, Batch Infrastructure
• Spring Batch Admin: runtime services (JSON and Java) plus optional UI
7. Item-Oriented Processing
• Input-output can be grouped together = Item-Oriented Processing
Sequence (Step, ItemReader, ItemWriter):
  Step.execute()
    repeat, retry, etc.:
      ItemReader.read() returns item
      ItemWriter.write(items)
  returns ExitStatus
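The read/write sequence above can be sketched in plain Java. This is a minimal illustration only: the Reader and Writer interfaces and the execute method are hypothetical stand-ins, not the Spring Batch API, and retry and transactions are left out.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Plain-Java sketch of item-oriented processing; not the Spring Batch API. */
final class ChunkLoopSketch {

    /** Hypothetical stand-in for a reader: returns null when input is exhausted. */
    interface Reader<T> { T read(); }

    /** Hypothetical stand-in for a writer: receives a whole chunk at once. */
    interface Writer<T> { void write(List<T> items); }

    /** Reads items one at a time, writes them in chunks, returns the number of chunks written. */
    static <T> int execute(Reader<T> reader, Writer<T> writer, int chunkSize) {
        if (chunkSize < 1) throw new IllegalArgumentException("chunkSize must be >= 1");
        int chunks = 0;
        boolean exhausted = false;
        while (!exhausted) {
            List<T> chunk = new ArrayList<>();
            while (chunk.size() < chunkSize) {
                T item = reader.read();
                if (item == null) { exhausted = true; break; }
                chunk.add(item);
            }
            if (!chunk.isEmpty()) {
                writer.write(chunk);   // one write call per chunk
                chunks++;
            }
        }
        return chunks;
    }

    public static void main(String[] args) {
        Iterator<String> source = List.of("a", "b", "c", "d", "e").iterator();
        List<List<String>> written = new ArrayList<>();
        Reader<String> reader = () -> source.hasNext() ? source.next() : null;
        Writer<String> writer = written::add;
        int chunks = execute(reader, writer, 2);
        System.out.println(chunks + " chunks: " + written); // 3 chunks: [[a, b], [c, d], [e]]
    }
}
```

Items are read singly but written as a group, which is what lets the framework wrap one transaction around each chunk.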
8. Job Configuration and Execution
• Job: The EndOfDay Job
• JobParameters: schedule.date = 2007/05/05
• JobInstance: The EndOfDay Job for 2007/05/05
• JobExecution: The first attempt at EndOfDay Job for 2007/05/05
A Job has many (*) JobInstances; a JobInstance has many (*) JobExecutions
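The relationship between a job, its parameters and its attempts can be sketched with a small in-memory registry. This is a hedged illustration: JobIdentitySketch, launch and instanceCount are hypothetical names, and the real framework keeps this bookkeeping in its JobRepository.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Plain-Java sketch of Job / JobInstance / JobExecution bookkeeping; not the Spring Batch API. */
final class JobIdentitySketch {

    /** Attempt counter per JobInstance, keyed by job name plus sorted parameters. */
    private final Map<String, Integer> attemptsByInstance = new HashMap<>();

    /** Records a new attempt (a JobExecution) and returns its 1-based attempt number. */
    int launch(String jobName, Map<String, String> parameters) {
        // Same name and same parameters identify the same JobInstance
        String instanceKey = jobName + new TreeMap<>(parameters);
        return attemptsByInstance.merge(instanceKey, 1, Integer::sum);
    }

    /** Number of distinct JobInstances seen so far. */
    int instanceCount() {
        return attemptsByInstance.size();
    }

    public static void main(String[] args) {
        JobIdentitySketch repository = new JobIdentitySketch();
        Map<String, String> may5 = Map.of("schedule.date", "2007/05/05");
        System.out.println(repository.launch("EndOfDay", may5)); // 1: first attempt
        System.out.println(repository.launch("EndOfDay", may5)); // 2: same instance, second attempt
        System.out.println(repository.instanceCount());          // 1
    }
}
```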
9. State Management
• Isolation – thread safety
• Retry and skip
• Restart
10. Thread Isolation: StepScope
File writer needs to be step scoped so it can flush and close the output stream
<bean class="org.sfw...FlatFileItemWriter" scope="step">
  <property name="resource">
    <value>
      /data/#{jobName}-#{stepName}.csv
    </value>
  </property>
</bean>
Because it is step scoped, the writer has access to the StepContext and can replace these patterns with runtime values
11. Step Scope Responsibilities
• Create beans for the duration of a step
• Respect Spring bean lifecycle metadata (e.g. InitializingBean at start of step, DisposableBean at end of step)
• Recognise StepScopeAware components and inject the StepContext
• Allow stateful components in a multithreaded environment
• Well-known internal services recognised automatically
12. Quality of Service
• Stuff happens:
– Item fails
– Job fails
• Failures can be
– Transient – try again and see if you succeed
– Skippable – ignore it and maybe come back to it later
– Fatal – need manual intervention
• Mark a job execution as FAILED
• When it restarts, pick up where you left off
• All framework concerns: not business logic
13. Quality of Service Sample
<step id="step1">
  <tasklet>
    <chunk reader="itemGenerator" writer="itemWriter"
           commit-interval="1"
           retry-limit="3" skip-limit="10">
      ...
    </chunk>
  </tasklet>
</step>
14. Retry and the Transaction
REPEAT(while more input) {
  chunk = ACCUMULATE(size=500) {   // Chunk Provider
    input;
  }
  RETRY {                          // Chunk Processor
    TX {
      for (item : chunk) { process; }
      write;
    }
  }
}
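A minimal plain-Java sketch of the RETRY { TX { ... } } nesting above. The names RetryChunkSketch, Writer and writeWithRetry are hypothetical, and the "transaction" is simulated with a scratch copy of an in-memory list, which is an assumption made for illustration rather than how a real transaction manager works.

```java
import java.util.ArrayList;
import java.util.List;

/** Plain-Java sketch of RETRY { TX { ... } }; not the Spring Batch API. */
final class RetryChunkSketch {

    /** Hypothetical writer: writes the chunk into the given transactional view. */
    interface Writer<T> { void write(List<T> chunk, List<T> target); }

    /** Writes the chunk with up to maxAttempts attempts; returns the attempts used. */
    static <T> int writeWithRetry(List<T> chunk, List<T> store, Writer<T> writer, int maxAttempts) {
        if (maxAttempts < 1) throw new IllegalArgumentException("maxAttempts must be >= 1");
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            List<T> tx = new ArrayList<>(store);   // begin TX on a scratch copy
            try {
                writer.write(chunk, tx);
                store.clear();
                store.addAll(tx);                  // commit
                return attempt;
            } catch (RuntimeException e) {
                last = e;                          // rollback: the scratch copy is dropped
            }
        }
        throw last;                                // retry exhausted
    }

    public static void main(String[] args) {
        List<String> store = new ArrayList<>();
        int[] calls = {0};
        Writer<String> flaky = (chunk, target) -> {
            target.add(chunk.get(0));              // partial write before the failure
            if (++calls[0] < 3) throw new IllegalStateException("transient failure");
            target.addAll(chunk.subList(1, chunk.size()));
        };
        int attempts = writeWithRetry(List.of("a", "b", "c"), store, flaky, 3);
        System.out.println(attempts + " attempts, store = " + store); // 3 attempts, store = [a, b, c]
    }
}
```

Because each attempt starts from the last committed state, the partial write from a failed attempt never reaches the store.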
15. Retry and Skip: Failed Processor
// Skip is just an exhausted retry
RETRY(up to n times) {
  TX {
    for (item : chunk) { process; }
    write;
  }
} RECOVER {
  TX {
    for (item : successful) { process; }
    write;
    skip(item);
  }
}
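The idea that a skip is an exhausted retry can be sketched as follows. This is an illustration under stated assumptions: ProcessorSkipSketch, Result and processChunk are hypothetical names, the retry budget is counted per chunk rather than per item, and writing is reduced to collecting the processor's outputs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Plain-Java sketch of retry-then-skip for a failing processor; not the Spring Batch API. */
final class ProcessorSkipSketch {

    /** Outcome of one chunk: processed outputs and skipped inputs. */
    static final class Result<I, O> {
        final List<O> written = new ArrayList<>();
        final List<I> skipped = new ArrayList<>();
    }

    static <I, O> Result<I, O> processChunk(List<I> chunk, Function<I, O> processor, int retryLimit) {
        Result<I, O> result = new Result<>();
        List<I> remaining = new ArrayList<>(chunk);
        int attempts = 0;
        while (true) {
            List<O> outputs = new ArrayList<>();   // begin TX
            int failedIndex = -1;
            for (int i = 0; i < remaining.size(); i++) {
                try {
                    outputs.add(processor.apply(remaining.get(i)));
                } catch (RuntimeException e) {
                    failedIndex = i;               // the whole attempt rolls back
                    break;
                }
            }
            if (failedIndex < 0) {
                result.written.addAll(outputs);    // commit
                return result;
            }
            attempts++;
            if (attempts >= retryLimit) {
                // Retry exhausted: recover by skipping the item that keeps failing
                result.skipped.add(remaining.remove(failedIndex));
                attempts = 0;
            }
        }
    }

    public static void main(String[] args) {
        Function<String, Integer> parse = Integer::parseInt;
        Result<String, Integer> result = processChunk(List.of("1", "x", "3"), parse, 3);
        System.out.println(result.written + " skipped " + result.skipped); // [1, 3] skipped [x]
    }
}
```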
16. Flushing: ItemWriter
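import java.util.List;
import org.springframework.batch.item.ItemWriter;

// Reward is the application's own domain class
public class RewardWriter implements ItemWriter<Reward> {
  // Spring Batch 2.x signature: the whole chunk arrives in one call
  public void write(List<? extends Reward> rewards) {
    // do stuff to output Reward records
    // and flush changes here…
  }
}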
17. Retry and Skip: Failed Write
RETRY(up to n times) {
  TX {
    for (item : chunk) { process; }
    write;
  }
} RECOVER {              // Scanning for failed item
  for (item : chunk) {
    TX {
      process;
      write;
    } CATCH {
      skip(item);
    }
  }
}
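The scan in the RECOVER block can be sketched in plain Java. WriteScanSketch and writeOrScan are hypothetical names; the sketch assumes the writer is all-or-nothing for each call (standing in for a transaction) and leaves out the retry loop that precedes the scan.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Plain-Java sketch of scanning a chunk for the item that breaks the write; not the Spring Batch API. */
final class WriteScanSketch {

    /**
     * Writes the chunk in one call; if that fails, rewrites it one item at a time
     * and returns the items whose individual write failed (the skipped items).
     */
    static <T> List<T> writeOrScan(List<T> chunk, Consumer<List<T>> writer) {
        List<T> skipped = new ArrayList<>();
        try {
            writer.accept(chunk);              // normal path: one TX for the whole chunk
            return skipped;
        } catch (RuntimeException chunkFailure) {
            // a chunk-level failure does not say which item caused it, so scan
        }
        for (T item : chunk) {
            try {
                writer.accept(List.of(item));  // one TX per item
            } catch (RuntimeException itemFailure) {
                skipped.add(item);
            }
        }
        return skipped;
    }

    public static void main(String[] args) {
        List<Integer> store = new ArrayList<>();
        // Hypothetical all-or-nothing writer: validates first, then stores
        Consumer<List<Integer>> writer = items -> {
            for (Integer i : items) {
                if (i < 0) throw new IllegalArgumentException("negative value: " + i);
            }
            store.addAll(items);
        };
        List<Integer> skipped = writeOrScan(List.of(1, -2, 3), writer);
        System.out.println("stored " + store + ", skipped " + skipped); // stored [1, 3], skipped [-2]
    }
}
```

The scan is needed because the writer sees the whole chunk at once, so a failure cannot be attributed to a single item without retrying the items individually.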
18. Restart and Partial Failure
• Store state to enable restart
• What happens to the business data on error?
• What happens to the restart data?
• Goal: they all need to roll back together
19. Partial Failure: Piggyback the Business Transaction
JOB {
  STEP {
    REPEAT(while more input) {
      TX {
        REPEAT(size=500) {     // Inside business transaction
          input;
          output;
        }
        FLUSH and UPDATE;      // Database: persist context data for next execution
      }
    }
  }
}
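A minimal sketch of why the restart data rides along with the business transaction. PiggybackSketch, Committed and run are hypothetical names, failAt is a fault-injection index invented for the demonstration, and the transaction is simulated by swapping an immutable snapshot.

```java
import java.util.ArrayList;
import java.util.List;

/** Plain-Java sketch of committing business data and restart data together; not the Spring Batch API. */
final class PiggybackSketch {

    /** Committed state: business output and restart position always change together. */
    static final class Committed {
        final List<String> output;
        final int position;
        Committed(List<String> output, int position) {
            this.output = output;
            this.position = position;
        }
    }

    /** Processes input from the committed position; failAt of -1 means no injected failure. */
    static Committed run(List<String> input, Committed start, int chunkSize, int failAt) {
        if (chunkSize < 1) throw new IllegalArgumentException("chunkSize must be >= 1");
        Committed committed = start;
        while (committed.position < input.size()) {
            List<String> txOutput = new ArrayList<>(committed.output);  // begin TX
            int txPosition = committed.position;
            try {
                int end = Math.min(txPosition + chunkSize, input.size());
                while (txPosition < end) {
                    if (txPosition == failAt) throw new IllegalStateException("failure at " + failAt);
                    txOutput.add(input.get(txPosition).toUpperCase());
                    txPosition++;
                }
                committed = new Committed(txOutput, txPosition);        // commit both together
            } catch (IllegalStateException e) {
                return committed;   // rollback: neither output nor position moved
            }
        }
        return committed;
    }

    public static void main(String[] args) {
        List<String> input = List.of("a", "b", "c", "d", "e");
        Committed failed = run(input, new Committed(new ArrayList<>(), 0), 2, 3);
        System.out.println(failed.position + " " + failed.output);       // 2 [A, B]
        Committed restarted = run(input, failed, 2, -1);
        System.out.println(restarted.position + " " + restarted.output); // 5 [A, B, C, D, E]
    }
}
```

The failed run leaves output and position at the last commit, so the restart neither repeats nor loses an item.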
20. ItemStream
Sequence (Step, ItemStream, JobRepository):
  Step.execute()
    ItemStream.open(executionContext)
    repeat, retry, etc.:
      ItemStream.update(executionContext)   (called before commit)
      JobRepository.save(executionContext)
    ItemStream.close()
  returns ExitStatus
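The open/update/close lifecycle can be sketched with a reader that stores its position in a context. The method names mirror the slide, but ListReader, the Map-based context and the key "reader.position" are hypothetical stand-ins, not the framework's ItemStream or ExecutionContext types.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java sketch of a restartable reader; not the Spring Batch API. */
final class RestartableReaderSketch {

    /** Hypothetical reader that follows an open/update/close lifecycle. */
    static final class ListReader {
        private static final String KEY = "reader.position"; // hypothetical context key
        private final List<String> items;
        private int position;

        ListReader(List<String> items) { this.items = items; }

        /** Restores the position saved by a previous execution, or starts at 0. */
        void open(Map<String, Integer> context) { position = context.getOrDefault(KEY, 0); }

        /** Returns the next item, or null when the input is exhausted. */
        String read() { return position < items.size() ? items.get(position++) : null; }

        /** Records the current position; meant to be called just before commit. */
        void update(Map<String, Integer> context) { context.put(KEY, position); }

        void close() { position = 0; }
    }

    public static void main(String[] args) {
        Map<String, Integer> saved = new HashMap<>(); // stand-in for what the repository persists
        List<String> data = List.of("a", "b", "c", "d");

        ListReader first = new ListReader(data);
        first.open(saved);
        first.read();
        first.read();
        first.update(saved);   // position 2 saved with the commit
        first.read();          // "c" is read, but the execution fails before the next update
        first.close();

        ListReader second = new ListReader(data);
        second.open(saved);    // restart from the last saved position
        System.out.println(second.read()); // c
    }
}
```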
21. Overview
• Very quick start with Spring Batch
• Spring Batch Admin
• State management – thread isolation
• Retry and skip
• Restart and transactions