One Reader with Multiple Processors and Writers in Spring Batch

1. Introduction

In this tutorial, we’ll explore how to implement a Spring Batch job with one reader, multiple processors, and multiple writers. This approach is useful when we need to read data once, process it in different ways, and then write the results to multiple destinations.

2. Setting up the Spring Batch Project

Before we start, we need to add the Spring Boot Starter Batch, Spring Boot Starter Data JPA, and H2 dependencies to our pom.xml file:

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-batch</artifactId>
    <version>3.5.0</version>
</dependency>
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-jpa</artifactId>
    <version>3.5.0</version>
</dependency>
<dependency>
    <groupId>com.h2database</groupId>
    <artifactId>h2</artifactId>
    <scope>runtime</scope>
</dependency>

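These dependencies bring in Spring Batch for our job processing, Spring Data JPA for database operations, and H2 as an in-memory database for development purposes. Both starters should use the same Spring Boot version. The H2 dependency omits its version, so it relies on Spring Boot’s dependency management (for example, the spring-boot-starter-parent POM) to supply one.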

2.1. Preparing the Input CSV File

Before implementing the batch components, we need sample data to process. Let’s create a simple CSV file named customers.csv with the following content:

id,name,email,type
1,John,john@example.com,A
2,Alice,alice@example.com,B
3,Bob,bob@example.com,A
4,Eve,eve@example.com,B

This file contains customer records with four fields: a unique identifier, name, email address, and a type designation that will determine our processing path. We’ll store this file in the src/main/resources directory of our project.

2.2. Creating the Data Model

Our batch job needs a Java class to represent the customer data from our CSV file. Let’s create a Customer entity class that maps to our database table:

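import jakarta.persistence.Entity;
import jakarta.persistence.Id;

@Entity
public class Customer {
    @Id
    private Long id; // taken from the CSV file, not generated
    private String name;
    private String email;
    private String type;

    // JPA and the reader's bean mapping both need a no-argument constructor;
    // the test in section 8 also uses an all-arguments constructor
    // Constructors, getters, and setters
}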

3. Implementing the CSV Reader

We can now create the component that reads records from our CSV file. Spring Batch provides excellent support for flat file reading through the FlatFileItemReader class:

@Bean
public FlatFileItemReader<Customer> customerReader() {
    return new FlatFileItemReaderBuilder<Customer>()
      .name("customerItemReader")
      .resource(new ClassPathResource("customers.csv"))
      // skip the header row; "id" can't be converted to a Long
      .linesToSkip(1)
      .delimited()
      .names("id", "name", "email", "type")
      // binds each named column to the Customer property of the same name
      .targetType(Customer.class)
      .build();
}

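This configuration creates a reader that parses our CSV file line by line, mapping each record to a Customer object. Because the reader is an ItemStream, the step opens and closes the file for us. The reader itself returns one item per call; the chunk size that groups items is configured on the step.

The names passed to names() are assigned to the columns in order, and they must match the property names of our Customer class, because the mapper binds values by property name. The header row of the file plays no part in this mapping.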
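To make the mapping concrete, here’s a minimal plain-Java sketch of what a delimited reader conceptually does with each line: skip the header, split on the delimiter, and assign the columns in order. It doesn’t use Spring Batch at all; CustomerRow and CsvLineMappingSketch are hypothetical names introduced only for this illustration:

```java
import java.util.List;

// Hypothetical stand-in for the Customer entity, used only in this sketch.
record CustomerRow(Long id, String name, String email, String type) {}

class CsvLineMappingSketch {

    // Splits one comma-delimited line and assigns the columns in order: id, name, email, type.
    static CustomerRow mapLine(String line) {
        String[] fields = line.split(",");
        if (fields.length != 4) {
            throw new IllegalArgumentException("Expected 4 fields but got " + fields.length + ": " + line);
        }
        return new CustomerRow(Long.valueOf(fields[0].trim()), fields[1].trim(),
          fields[2].trim(), fields[3].trim());
    }

    // Skips the first linesToSkip lines (the header) and maps the remaining ones.
    static List<CustomerRow> readAll(List<String> lines, int linesToSkip) {
        return lines.stream()
          .skip(linesToSkip)
          .map(CsvLineMappingSketch::mapLine)
          .toList();
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
          "id,name,email,type",
          "1,John,john@example.com,A",
          "2,Alice,alice@example.com,B");
        readAll(lines, 1).forEach(System.out::println);
    }
}
```

In this sketch, calling readAll(lines, 0) fails with a NumberFormatException, because the header value "id" can’t be converted to a Long. A real reader that doesn’t skip the header row runs into the same kind of conversion problem.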

4. Creating Conditional Processors

We’ll create two separate processors and a routing mechanism to choose between them. Each of these processors implements Spring Batch’s ItemProcessor interface, which defines a single method, process(), used to transform input data before it is written:

public class TypeAProcessor implements ItemProcessor<Customer, Customer> {
    @Override
    public Customer process(Customer customer) {
        customer.setName(customer.getName().toUpperCase());
        customer.setEmail("A_" + customer.getEmail());
        return customer;
    }
}

public class TypeBProcessor implements ItemProcessor<Customer, Customer> {
    @Override
    public Customer process(Customer customer) {
        customer.setName(customer.getName().toLowerCase());
        customer.setEmail("B_" + customer.getEmail());
        return customer;
    }
}

The TypeAProcessor handles customers of type A by converting their names to uppercase and prefixing their email addresses. The process() method takes a Customer object, transforms it, and returns the modified version.

For customers of type B, the TypeBProcessor applies different transformations by converting names to lowercase and using a different email prefix. Both processors implement the same ItemProcessor interface, making them interchangeable in our processing pipeline.
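Both processors above modify the Customer they receive. As a design alternative, the same rules can be written so that they return a new object and leave the input untouched, which makes them easy to check in isolation. The following is only a sketch in plain Java; CustomerView and NonMutatingProcessorsSketch are hypothetical names, and Locale.ROOT is an added assumption to keep the case conversion independent of the machine’s default locale:

```java
import java.util.Locale;

// Hypothetical immutable stand-in for the Customer entity.
record CustomerView(Long id, String name, String email, String type) {}

class NonMutatingProcessorsSketch {

    // Same rule as TypeAProcessor: uppercase name, "A_" email prefix.
    static CustomerView processTypeA(CustomerView c) {
        return new CustomerView(c.id(), c.name().toUpperCase(Locale.ROOT), "A_" + c.email(), c.type());
    }

    // Same rule as TypeBProcessor: lowercase name, "B_" email prefix.
    static CustomerView processTypeB(CustomerView c) {
        return new CustomerView(c.id(), c.name().toLowerCase(Locale.ROOT), "B_" + c.email(), c.type());
    }

    public static void main(String[] args) {
        CustomerView john = new CustomerView(1L, "John", "john@example.com", "A");
        System.out.println(processTypeA(john)); // transformed copy
        System.out.println(john);               // the input is unchanged
    }
}
```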

5. Implementing the Processor Router

To connect our processors to the appropriate records, we need a routing mechanism that examines each customer’s type field:

public class CustomerProcessorRouter implements ItemProcessor<Customer, Customer> {
    private final TypeAProcessor typeAProcessor;
    private final TypeBProcessor typeBProcessor;
    public CustomerProcessorRouter(TypeAProcessor typeAProcessor, 
      TypeBProcessor typeBProcessor) {
        this.typeAProcessor = typeAProcessor;
        this.typeBProcessor = typeBProcessor;
    }
    @Override
    public Customer process(Customer customer) throws Exception {
        if ("A".equals(customer.getType())) {
            return typeAProcessor.process(customer);
        } else if ("B".equals(customer.getType())) {
            return typeBProcessor.process(customer);
        }
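        // Any other type passes through unchanged; returning null instead
        // would filter the item out, so it would never reach the writers
        return customer;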
    }
}

Our router class examines each incoming Customer object and delegates it to the appropriate processor based on the type field. This design keeps our processing logic cleanly separated while maintaining a single processing step in our job definition.
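If the number of customer types grows, the if/else chain can be replaced by a lookup table. The sketch below shows that idea in plain Java, without Spring Batch types. MapRouterSketch and RoutedCustomer are hypothetical names, and the pass-through behavior for unknown types simply mirrors the router above:

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;
import java.util.function.UnaryOperator;

// Hypothetical immutable stand-in for the Customer entity.
record RoutedCustomer(String name, String type) {}

class MapRouterSketch<T> {
    private final Function<T, String> keyOf;
    private final Map<String, UnaryOperator<T>> routes;

    MapRouterSketch(Function<T, String> keyOf, Map<String, UnaryOperator<T>> routes) {
        this.keyOf = keyOf;
        // Copy into a HashMap, which tolerates a null key on lookup.
        this.routes = new HashMap<>(routes);
    }

    // Applies the handler registered for the item's key; unknown keys pass through unchanged.
    T process(T item) {
        UnaryOperator<T> handler = routes.get(keyOf.apply(item));
        return handler == null ? item : handler.apply(item);
    }

    public static void main(String[] args) {
        Map<String, UnaryOperator<RoutedCustomer>> routes = Map.of(
          "A", c -> new RoutedCustomer(c.name().toUpperCase(Locale.ROOT), c.type()),
          "B", c -> new RoutedCustomer(c.name().toLowerCase(Locale.ROOT), c.type()));
        MapRouterSketch<RoutedCustomer> router = new MapRouterSketch<>(RoutedCustomer::type, routes);
        System.out.println(router.process(new RoutedCustomer("John", "A")));
        System.out.println(router.process(new RoutedCustomer("Eve", "C")));
    }
}
```

With this shape, supporting a new type means adding one map entry rather than another branch.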

6. Configuring Multiple Writers

After processing our data differently based on type, we want to write the results to multiple destinations. We’ll implement both a database writer and a flat file writer.

6.1. Database Writer Configuration

We begin by creating the database writer component that will handle all JPA operations:

@Bean
public JpaItemWriter<Customer> dbWriter(EntityManagerFactory entityManagerFactory) {
    JpaItemWriter<Customer> writer = new JpaItemWriter<>();
    writer.setEntityManagerFactory(entityManagerFactory);
    return writer;
}

This writer persists our processed Customer objects through JPA. By default, JpaItemWriter calls EntityManager.merge() for each item, so it inserts new rows and updates existing rows that have the same id.

6.2. Flat File Writer Configuration

For our secondary output destination, we implement a flat file writer that generates a CSV file:

@Bean
public FlatFileItemWriter<Customer> fileWriter() {
    return new FlatFileItemWriterBuilder<Customer>()
      .name("customerItemWriter")
      .resource(new FileSystemResource("output/processed_customers.txt"))
      .delimited()
      .delimiter(",")
      .names("id", "name", "email", "type")
      .build();
}

The FlatFileItemWriter joins the listed Customer properties with commas, in the order given to names(). During job execution, it writes one comma-separated line per processed customer to output/processed_customers.txt. No header line is written, and despite the .txt extension, the content is CSV.
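Conceptually, each output line is just the selected field values joined by the delimiter. Here’s a small plain-Java sketch of that idea; DelimitedLineSketch is a hypothetical name, and the sketch doesn’t claim to reproduce the writer’s actual implementation:

```java
import java.util.List;
import java.util.stream.Collectors;

class DelimitedLineSketch {

    // Joins the field values, in order, with the given delimiter (no quoting or escaping).
    static String toLine(List<?> fields, String delimiter) {
        return fields.stream()
          .map(String::valueOf)
          .collect(Collectors.joining(delimiter));
    }

    public static void main(String[] args) {
        System.out.println(toLine(List.of(1L, "JOHN", "A_john@example.com", "A"), ","));
        // A value that contains the delimiter makes the line ambiguous for a naive split(",")
        System.out.println(toLine(List.of(5L, "Doe, Jane", "B_jane@example.com", "B"), ","));
    }
}
```

The second line in main() shows the limit of a plain join: a comma inside a value produces five parts instead of four when the line is split on commas again. Our sample data has no such values, so this doesn’t affect the tutorial’s output.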

6.3. Combining Components in a Composite Writer

To write each chunk to both destinations within the same step, we’ll use Spring Batch’s CompositeItemWriter:

@Bean
public CompositeItemWriter<Customer> compositeWriter(
  JpaItemWriter<Customer> dbWriter,
  FlatFileItemWriter<Customer> fileWriter) {
    CompositeItemWriter<Customer> writer = new CompositeItemWriter<>();
    writer.setDelegates(List.of(dbWriter, fileWriter));
    return writer;
}

This composite writer delegates to the writers we pass in, so each chunk of processed items is written to every destination. The delegates are called sequentially, in the order they appear in the list.
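The delegation idea itself is simple enough to sketch in a few lines of plain Java. SketchWriter and CompositeWriterSketch below are hypothetical types made up for this illustration; they aren’t Spring Batch’s ItemWriter or CompositeItemWriter:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical minimal writer contract for this sketch (not Spring Batch's ItemWriter).
interface SketchWriter<T> {
    void write(List<T> items);
}

class CompositeWriterSketch<T> implements SketchWriter<T> {
    private final List<SketchWriter<T>> delegates;

    CompositeWriterSketch(List<SketchWriter<T>> delegates) {
        this.delegates = List.copyOf(delegates);
    }

    // Passes the same chunk to every delegate, one after another, in list order.
    @Override
    public void write(List<T> items) {
        for (SketchWriter<T> delegate : delegates) {
            delegate.write(items);
        }
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        SketchWriter<String> db = items -> log.add("db wrote " + items.size());
        SketchWriter<String> file = items -> log.add("file wrote " + items.size());
        new CompositeWriterSketch<>(List.of(db, file)).write(List.of("JOHN", "alice"));
        System.out.println(log);
    }
}
```

Because the delegates run one after another, an exception thrown by an earlier delegate in this sketch stops the loop before later delegates see the chunk. That’s one reason the order of the list matters.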

7. Configuring the Step and Job

Now, let’s put everything together by creating a step and a job configuration:

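@Bean
public Job processCustomersJob(JobRepository jobRepository,
  PlatformTransactionManager transactionManager,
  FlatFileItemReader<Customer> reader,
  CustomerProcessorRouter processor,
  CompositeItemWriter<Customer> writer) {
    // Spring Batch 5 (used by Spring Boot 3.x) removed JobBuilderFactory and
    // StepBuilderFactory; the builders now take the JobRepository directly
    Step step = new StepBuilder("processCustomersStep", jobRepository)
      .<Customer, Customer>chunk(10, transactionManager)
      .reader(reader)
      .processor(processor)
      .writer(writer)
      .build();
    return new JobBuilder("customerProcessingJob", jobRepository)
      .start(step)
      .build();
}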

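This job configuration defines a single step that reads customers in chunks of 10, processes each through our router, and writes the results to both the database and flat file. For the injection to work, the CustomerProcessorRouter and the two processors it depends on must also be registered as Spring beans, for example with @Bean methods or @Component.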
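To see what a chunk size means in practice, here’s a simplified plain-Java model of a chunk-oriented loop: read up to N items, process each one, then hand the whole list to the writer in a single call. ChunkLoopSketch is a hypothetical name, and the model deliberately leaves out what the real step adds on top, such as transactions, restart metadata, and fault tolerance:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

class ChunkLoopSketch {

    // Returns the number of chunks handed to the writer.
    static <I, O> int run(Iterator<I> reader, Function<I, O> processor,
                          Consumer<List<O>> writer, int chunkSize) {
        int chunksWritten = 0;
        while (reader.hasNext()) {
            // 1. Read up to chunkSize items.
            List<I> inputs = new ArrayList<>();
            while (inputs.size() < chunkSize && reader.hasNext()) {
                inputs.add(reader.next());
            }
            // 2. Process each item; in this model a null result drops the item.
            List<O> outputs = new ArrayList<>();
            for (I input : inputs) {
                O output = processor.apply(input);
                if (output != null) {
                    outputs.add(output);
                }
            }
            // 3. Hand the whole chunk to the writer in one call.
            writer.accept(outputs);
            chunksWritten++;
        }
        return chunksWritten;
    }

    public static void main(String[] args) {
        List<String> names = List.of("John", "Alice", "Bob", "Eve");
        int chunks = ChunkLoopSketch.<String, String>run(names.iterator(),
          name -> name.toUpperCase(),
          chunk -> System.out.println("writing " + chunk), 3);
        System.out.println(chunks + " chunks");
    }
}
```

With our four sample customers and a chunk size of 10, a loop like this makes a single writer call containing all four items. The chunk size of 3 in main() is only there to make the second chunk visible.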

8. Running and Testing the Job

To verify that our batch job works as expected, let’s write an integration test that launches the job and then checks both the database and the output file. The assertions that run after the job completes look like this:

// Type A rows in the database: uppercase name and "A_" email prefix
List<Customer> dbCustomers = jdbcTemplate.query(
    "SELECT id, name, email, type FROM customer WHERE type = 'A'",
    (rs, rowNum) -> new Customer(
        rs.getLong("id"),
        rs.getString("name"),
        rs.getString("email"),
        rs.getString("type"))
);
assertFalse(dbCustomers.isEmpty());
dbCustomers.forEach(c -> {
    // assertEquals takes the expected value first, then the actual one
    assertEquals(c.getName().toUpperCase(), c.getName());
    assertTrue(c.getEmail().startsWith("A_"));
});

// Type B lines in the output file: lowercase name and "B_" email prefix
Path outputFile = Paths.get("output/processed_customers.txt");
assertTrue(Files.exists(outputFile));
List<String> lines = Files.readAllLines(outputFile);
boolean hasTypeB = lines.stream().anyMatch(line -> line.endsWith(",B"));
assertTrue(hasTypeB);
lines.forEach(line -> {
    // columns are id, name, email, type
    String[] parts = line.split(",");
    if ("B".equals(parts[3])) {
        assertEquals(parts[1].toLowerCase(), parts[1]);
        assertTrue(parts[2].startsWith("B_"));
    }
});

In this test, we first query the database to check that customers of type A were saved with uppercase names and emails prefixed with “A_”. We then read the output file to confirm that customers of type B have lowercase names and emails prefixed with “B_”.

9. Conclusion

In this article, we learned how to configure a Spring Batch job using a single reader but multiple processors and writers. We read data from a CSV file, routed each record to a specific processor based on its content, and finally delegated the writing to multiple writers.

As always, the source code is available over on GitHub.

The post One Reader with Multiple Processors and Writers in Spring Batch first appeared on Baeldung.
       
