- Java EE 7 First Look
- NDJOBO Armel Fabrice
- 1391 words
- 2021-07-23 15:23:38
Batch Applications for Java Platform 1.0
The Batch Applications API for the Java Platform 1.0 was developed under JSR 352. This section gives only an overview of the API. For more information, the complete specification can be downloaded from http://jcp.org/aboutJava/communityprocess/final/jsr352/index.html.
What is batch processing?
According to the Cambridge Advanced Learner's Dictionary, a batch is a group of things or people dealt with at the same time or considered similar in type, and a process is a series of actions that you take in order to achieve a result. Based on these two definitions, we can say that batch processing is a series of repetitive actions on a large amount of data in order to achieve a result. Given the large amounts of data that it has to deal with, batch processing is often used for end-of-day, end-of-month, end-of-period, and end-of-year processing.
The following is a short list of domains where you can use batch processing:
- Data import/export from/to XML or CSV files
- Accounting processing such as consolidations
- ETL (extract-transform-load) in a data warehouse
- Digital files processing (downloading, processing, or saving)
- Notification of a service's subscribers (such as forum, group, and so on)
Why a dedicated API for batch processing?
Once they have an idea of what batch processing is, some people might ask themselves: why not just write a foreach loop that launches many threads? First of all, batch processing is not only concerned with execution speed. The processing of large amounts of data often runs into many exceptions, which raises a number of concerns: What action should be taken when an exception occurs? Should the whole process be canceled for any exception? If not, which actions should be canceled, and for which types of exception? If only a certain number of transactions need to be canceled, how do you recognize them? And at the end of a batch run, it is always important to know how many items were canceled, how many were processed successfully, and how many were ignored.
As you can see, we have not even finished listing the questions that batch processing can raise, and there is already a great deal to handle. Trying to build such a tool on your own may not only complicate your application but also introduce new bugs.
Understanding the Batch API
The Batch Applications API for the Java Platform 1.0 was developed to provide a solution to the different needs described above. It targets both Java SE and Java EE applications and requires at least Java 6.
The features offered by this API can be summarized as follows:
- It offers the Reader-Processor-Writer pattern natively and gives you the ability to implement your own batch pattern. This allows you to choose the best pattern depending on the case.
- It gives the possibility of defining the behavior (skip, retry, rollback, and so on) of the batch processing for each type of error.
- It supports many step-level metrics, such as rollbackCount, readSkipCount, writeSkipCount, and so on, for monitoring.
- It can be configured to run some processes in parallel, and offers the possibility to use JTA or RESOURCE_LOCAL transaction mode.
To do this, the Batch Applications API for the Java Platform 1.0 is based on a solid architecture, which can be outlined as follows. A Job is managed by a JobOperator and has one or many steps, each of which is either a chunk or a batchlet. During its lifecycle, information (metadata) about a job is stored in JobRepository.

JobRepository
As we said earlier, JobRepository stores metadata about current and past running jobs. It can be accessed through JobOperator.
Job
A Job can be seen as an entity to encapsulate a unit of batch processing. It is made up of one or many steps, which must be configured within an XML file called a Job configuration file or Job XML. This file will contain job identification information and different steps that compose the job. The code that follows shows the skeleton of a Job XML file.
<job id="inscription-validator-Job" version="1.0" xmlns="http://xmlns.jcp.org/xml/ns/javaee">
    <step id="step1">
        ...
    </step>
    <step id="step2">
        ...
    </step>
</job>
The Job XML file is named with the convention <name>.xml (for example, inscriptionJob.xml) and should be stored under the META-INF/batch-jobs directory for a portable application.
Step
A Step is an autonomous phase of a batch. It contains all the necessary information to define and control a piece of batch processing. A batch step is either a chunk or a batchlet (the two are mutually exclusive). The step of the following code is a chunk type step:
<job id="inscription-validator-Job" version="1.0" xmlns="http://xmlns.jcp.org/xml/ns/javaee">
    <step id="validate-notify">
        <chunk>
            <reader ref="InscriptionReader" />
            <processor ref="InscriptionProcessor" />
            <writer ref="StudentNotifier" />
        </chunk>
    </step>
</job>
Chunk
A chunk is a type of step that implements the Reader-Processor-Writer pattern. It runs in the scope of a configurable transaction and can receive many configuration values. The following code is an enhanced version of the inscription-validator-Job shown in the preceding code. In this listing, we have added a configuration to define the unit element that will be used to manage the commit behavior of the chunk (checkpoint-policy="item"), and a configuration to define the number of items (unit elements) to process before each commit (item-count="15"). We have also specified the maximum number of exceptions the step will skip when exceptions configured as skippable are thrown by the chunk (skip-limit="30").
The following code is an example of a chunk type step with some configuration:
<job id="inscription-validator-Job" version="1.0" xmlns="http://xmlns.jcp.org/xml/ns/javaee">
    <step id="validate-notify">
        <chunk item-count="15" checkpoint-policy="item" skip-limit="30">
            <reader ref="InscriptionReader" />
            <processor ref="InscriptionProcessor" />
            <writer ref="StudentNotifier" />
        </chunk>
    </step>
</job>
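The interplay of item-count and skip-limit can be sketched in plain Java. The following is a simplified simulation, not the JSR 352 runtime: the ChunkLoopSketch class, the InvalidInscriptionException type, and the Result counters are hypothetical names made up for illustration, and a commit is modelled as appending the buffered items to a list. A real runtime also handles transactions, retries, and restarts, which this sketch ignores.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Illustrative only: mimics chunk semantics in plain Java, not the batch runtime. */
final class ChunkLoopSketch {

    /** Hypothetical exception type that this sketch treats as skippable. */
    static final class InvalidInscriptionException extends RuntimeException {
        InvalidInscriptionException(String message) {
            super(message);
        }
    }

    /** Hypothetical counters, loosely modelled on step-level metrics. */
    static final class Result {
        final List<List<String>> commits = new ArrayList<>();
        int processSkipCount;
    }

    static Result run(Iterator<String> reader, int itemCount, int skipLimit) {
        Result result = new Result();
        List<String> buffer = new ArrayList<>();
        while (reader.hasNext()) {
            String item = reader.next();                 // reader phase
            try {
                buffer.add(process(item));               // processor phase
            } catch (InvalidInscriptionException e) {
                if (result.processSkipCount >= skipLimit) {
                    throw e;                             // skip limit exhausted: the step fails
                }
                result.processSkipCount++;               // skip the item and carry on
                continue;
            }
            if (buffer.size() == itemCount) {            // checkpoint reached
                result.commits.add(new ArrayList<>(buffer)); // writer phase, then commit
                buffer.clear();
            }
        }
        if (!buffer.isEmpty()) {                         // write the final partial chunk
            result.commits.add(new ArrayList<>(buffer));
        }
        return result;
    }

    /** Stand-in for real validation: empty records are rejected. */
    static String process(String item) {
        if (item.isEmpty()) {
            throw new InvalidInscriptionException("empty record");
        }
        return "validated:" + item;
    }

    public static void main(String[] args) {
        Result r = run(Arrays.asList("a", "", "b", "c", "d").iterator(), 2, 1);
        System.out.println(r.commits + " skipped=" + r.processSkipCount);
    }
}
```

With item-count set to 2 and one skip allowed, the main method prints [[validated:a, validated:b], [validated:c, validated:d]] skipped=1. Lowering the skip limit to 0 makes the same input fail on the empty record.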
The following code shows what chunk batch artifact implementations look like. The InscriptionCheckpoint class keeps track of the line that is being processed. The source code of this section is a validation program that sends a message to the candidates to let them know whether they have been accepted. At the end, it displays monitoring information in a web page. The processing is launched by the ChunkStepBatchProcessing.java Servlet.
The following code is a skeleton of chunk batch artifact implementations:
import java.io.Serializable;
import java.util.List;
import javax.batch.api.chunk.AbstractItemReader;
import javax.batch.api.chunk.AbstractItemWriter;
import javax.batch.api.chunk.ItemProcessor;

// Each public class below belongs in its own .java file.

public class InscriptionReader extends AbstractItemReader {
    @Override
    public Object readItem() throws Exception {
        // Read data and return the next item; returning null signals the end of the input
        return null;
    }
}

public class InscriptionProcessor implements ItemProcessor {
    @Override
    public Object processItem(Object o) throws Exception {
        // Receive an item from the reader, process it, and return the result
        return o;
    }
}

public class StudentNotifier extends AbstractItemWriter {
    @Override
    public void writeItems(List<Object> items) throws Exception {
        // Receive the items from the processor, then write them out
    }
}

public class InscriptionCheckpoint implements Serializable {
    private int lineNumber;

    public void incrementLineNumber() {
        lineNumber++;
    }

    public int getLineNumber() {
        return lineNumber;
    }
}
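The checkpoint class is Serializable because the ItemReader contract hands the runtime a Serializable from checkpointInfo() and receives it back in open(Serializable) on restart. The following sketch shows only the idea behind that round trip, in plain Java: CheckpointSketch, LineCheckpoint, save, and restore are hypothetical names, and an in-memory byte array stands in for wherever a real runtime would persist the checkpoint.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.List;

/** Illustrative only: saving a line-number checkpoint and resuming from it. */
final class CheckpointSketch {

    /** Same shape as the InscriptionCheckpoint skeleton. */
    static final class LineCheckpoint implements Serializable {
        private static final long serialVersionUID = 1L;
        private int lineNumber;

        void incrementLineNumber() {
            lineNumber++;
        }

        int getLineNumber() {
            return lineNumber;
        }
    }

    /** Serializes the checkpoint; the byte array is a stand-in for persistent storage. */
    static byte[] save(Serializable checkpoint) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(checkpoint);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Rebuilds the checkpoint from its serialized form. */
    static LineCheckpoint restore(byte[] saved) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(saved))) {
            return (LineCheckpoint) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("alice", "bob", "carol", "dave");
        LineCheckpoint checkpoint = new LineCheckpoint();
        checkpoint.incrementLineNumber();        // "alice" has been read
        checkpoint.incrementLineNumber();        // "bob" has been read
        byte[] saved = save(checkpoint);         // checkpoint taken here

        LineCheckpoint resumed = restore(saved); // after a restart
        // Resume reading at the first line that was not yet covered by the checkpoint
        System.out.println(lines.subList(resumed.getLineNumber(), lines.size()));
    }
}
```

Running main prints [carol, dave]: the two lines counted before the checkpoint are not read again after the restore.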
Batchlet
A batchlet is a type of step that lets you implement your own batch pattern. Unlike a chunk, which performs its tasks in three phases (reading, processing, and writing), a batchlet step is invoked once and returns an exit status at the end of processing. The following code shows what a batchlet batch artifact implementation looks like. The source code of this section sends an information message to all students and displays some important information about the batch. The processing is launched by the BatchletStepBatchProcessing.java Servlet.
The following code is a skeleton of batchlet batch artifact implementation:
import javax.batch.api.AbstractBatchlet;

public class StudentInformation extends AbstractBatchlet {
    @Override
    public String process() throws Exception {
        // Do the work, then return the exit status
        return "COMPLETED";
    }
}
The batch.xml configuration file
The batch.xml file is an XML file that contains the batch artifacts of the batch application. It establishes a correspondence between the batch artifact implementation and the reference name that is used in the Job XML file. The batch.xml file must be stored in the META-INF directory for a portable application. The following code gives us the contents of the batch.xml file for the inscription-validator-Job Job shown in the preceding code.
The following code is an example of batch.xml:
<batch-artifacts xmlns="http://xmlns.jcp.org/xml/ns/javaee">
    <ref id="InscriptionReader" class="com.packt.ch02.batchprocessing.chunk.InscriptionReader" />
    <ref id="StudentNotifier" class="com.packt.ch02.batchprocessing.chunk.StudentNotifier" />
    <ref id="InscriptionProcessor" class="com.packt.ch02.batchprocessing.chunk.InscriptionProcessor" />
</batch-artifacts>
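The correspondence that batch.xml establishes can be pictured as a lookup table from reference name to class name, followed by reflective instantiation. The following is a minimal sketch of that idea only, not how any particular batch runtime is implemented: ArtifactRegistrySketch, DemoReader, register, and create are hypothetical names, and a real runtime may also resolve artifacts through other mechanisms such as dependency injection.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative only: resolves a reference name to an instance, batch.xml style. */
final class ArtifactRegistrySketch {

    /** Hypothetical stand-in artifact; a real one would implement a batch interface. */
    public static final class DemoReader {
        public DemoReader() {
        }

        public String describe() {
            return "reader ready";
        }
    }

    // Plays the role of the <ref id="..." class="..." /> entries
    private final Map<String, String> refToClassName = new HashMap<>();

    void register(String refId, String className) {
        refToClassName.put(refId, className);
    }

    /** Looks up the class registered under refId and instantiates it reflectively. */
    Object create(String refId) {
        String className = refToClassName.get(refId);
        if (className == null) {
            throw new IllegalArgumentException("Unknown ref: " + refId);
        }
        try {
            return Class.forName(className).getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate " + className, e);
        }
    }

    public static void main(String[] args) {
        ArtifactRegistrySketch registry = new ArtifactRegistrySketch();
        registry.register("InscriptionReader", DemoReader.class.getName());
        DemoReader reader = (DemoReader) registry.create("InscriptionReader");
        System.out.println(reader.describe());
    }
}
```

The point of the indirection is that the Job XML only ever mentions the short reference name, so an implementation class can be renamed or replaced by editing one entry in the table.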
JobOperator
The JobOperator instance is accessible through the getJobOperator() method of the BatchRuntime class. It provides a set of operations to manage a job (start, stop, restart, and so on) and to access JobRepository (getJobNames, getJobInstances, getStepExecutions, and so on). The following code shows how to start the inscription-validator-Job Job shown earlier without any specific property. It is important to note that the inscriptionJob value passed to JobOperator.start is the name of the Job XML file (not the ID of the job). In the ChunkStepBatchProcessing Servlet, you will see how to retrieve the status and the monitoring information about batch processing from the JobOperator instance.
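The following code is an example of how to start a Job:
import javax.batch.operations.JobOperator;
import javax.batch.runtime.BatchRuntime;

JobOperator jobOperator = BatchRuntime.getJobOperator();
if (jobOperator != null) {
    // "inscriptionJob" is the Job XML file name, not the job ID; null means no job parameters
    jobOperator.start("inscriptionJob", null);
}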