org.springframework.batch.core.JobExecutionException: Flow execution ended unexpectedly

Spring JIRA | Chad Wilson | 2 years ago
  1. 0

    We recently migrated a system to Spring Batch 3.x that had been running fine in production for a number of years on older Spring Batch 2.x releases. It has also been running fine on 3.x in production for a couple of months, but yesterday a batch failed with what appears to be some kind of race condition / multithreading bug. Re-running the batch completely (which operates on exactly the same data) worked fine, so this is not easily reproducible.

    The error was:

    {noformat}
    08:25:21,432 [main] AbstractJob ERROR <execute> - Encountered fatal error executing job
    org.springframework.batch.core.JobExecutionException: Flow execution ended unexpectedly
        at org.springframework.batch.core.job.flow.FlowJob.doExecute(FlowJob.java:140)
        at org.springframework.batch.core.job.AbstractJob.execute(AbstractJob.java:304)
        at org.springframework.batch.core.launch.support.SimpleJobLauncher$1.run(SimpleJobLauncher.java:135)
        at org.springframework.core.task.SyncTaskExecutor.execute(SyncTaskExecutor.java:50)
        at org.springframework.batch.core.launch.support.SimpleJobLauncher.run(SimpleJobLauncher.java:128)
        at org.springframework.batch.core.launch.support.CommandLineJobRunner.start(CommandLineJobRunner.java:362)
        at org.springframework.batch.core.launch.support.CommandLineJobRunner.main(CommandLineJobRunner.java:590)
        at com.ml.elt.automarking.util.AutomarkBatchCommandLineRunner.main(AutomarkBatchCommandLineRunner.java:14)
    Caused by: org.springframework.batch.core.job.flow.FlowExecutionException: Ended flow=ECIJob at state=ECIJob.loadXMLMaster with exception
        at org.springframework.batch.core.job.flow.support.SimpleFlow.resume(SimpleFlow.java:171)
        at org.springframework.batch.core.job.flow.support.SimpleFlow.start(SimpleFlow.java:141)
        at org.springframework.batch.core.job.flow.FlowJob.doExecute(FlowJob.java:134)
        ... 7 more
    Caused by: java.lang.NullPointerException
        at org.springframework.batch.core.scope.context.SynchronizationManagerSupport.decrement(SynchronizationManagerSupport.java:149)
        at org.springframework.batch.core.scope.context.SynchronizationManagerSupport.close(SynchronizationManagerSupport.java:143)
        at org.springframework.batch.core.scope.context.SynchronizationManagerSupport.release(SynchronizationManagerSupport.java:193)
        at org.springframework.batch.core.scope.context.StepSynchronizationManager.release(StepSynchronizationManager.java:112)
        at org.springframework.batch.core.step.AbstractStep.doExecutionRelease(AbstractStep.java:284)
        at org.springframework.batch.core.step.AbstractStep.execute(AbstractStep.java:274)
        at org.springframework.batch.core.job.SimpleStepHandler.handleStep(SimpleStepHandler.java:148)
        at org.springframework.batch.core.job.flow.JobFlowExecutor.executeStep(JobFlowExecutor.java:64)
        at org.springframework.batch.core.job.flow.support.state.StepState.handle(StepState.java:67)
        at org.springframework.batch.core.job.flow.support.SimpleFlow.resume(SimpleFlow.java:162)
        ... 9 more
    {noformat}

    In the step that failed, the batch has been split into partitions with gridSize=2 running across 5 threads. It looks like the failure happened at the end of the step.

    Looking inside the code, the line that throws the NPE is marked below. Could this be due to inconsistent synchronization, or possibly an incorrect assumption that the count has not already been removed? Without being highly familiar with the code, it looks a bit odd to me to synchronize on one field (counts) during increment but on another (contexts) during decrement.

    {code:java}
    private void decrement() {
        E current = getCurrent().pop();
        if (current != null) {
            int remaining = counts.get(current).decrementAndGet(); // <--- PROBLEMATIC LINE
            if (remaining <= 0) {
                synchronized (contexts) {
                    contexts.remove(current);
                    counts.remove(current);
                }
            }
        }
    }

    public void increment() {
        E current = getCurrent().peek();
        if (current != null) {
            AtomicInteger count;
            synchronized (counts) {
                count = counts.get(current);
                if (count == null) {
                    count = new AtomicInteger();
                    counts.put(current, count);
                }
            }
            count.incrementAndGet();
        }
    }
    {code}
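
    One interleaving that would be consistent with the quoted code (a guess, not confirmed anywhere in this report): thread A decrements the counter to 0 but has not yet entered the synchronized block; thread B calls increment(), still finds the entry in counts, and raises it to 1; thread A then removes the entry; when thread B later calls decrement(), counts.get(current) returns null and the call throws. The sketch below shows the general remedy of guarding every read and write of the map with the same lock and tolerating an entry that is already gone. RefCounts and its methods are hypothetical names for illustration only, not Spring Batch API and not the project's actual fix.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical reference counter, NOT Spring Batch code.
 * Every access to the map goes through the same monitor (this),
 * so a check-then-remove can never interleave with an increment.
 */
class RefCounts<E> {
    private final Map<E, Integer> counts = new HashMap<>();

    /** Registers one more user of the given key. */
    public synchronized void increment(E key) {
        counts.merge(key, 1, Integer::sum);
    }

    /**
     * Releases one user of the given key.
     * Returns true only when this call released the last reference.
     * A key that is not present is treated as already released
     * instead of causing a NullPointerException.
     */
    public synchronized boolean decrement(E key) {
        Integer current = counts.get(key);
        if (current == null) {
            return false; // nothing to release
        }
        if (current <= 1) {
            counts.remove(key);
            return true;
        }
        counts.put(key, current - 1);
        return false;
    }

    /** Current count for the key, or 0 if it is not tracked. */
    public synchronized int count(E key) {
        return counts.getOrDefault(key, 0);
    }
}
```

    A caller would perform its cleanup (the equivalent of contexts.remove(current) in the quoted code) only when decrement returns true. Using a single monitor trades some throughput for simplicity; whether that is acceptable for the real class is a separate question.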

    Spring JIRA | 2 years ago | Chad Wilson
    org.springframework.batch.core.JobExecutionException: Flow execution ended unexpectedly
  3. 0

    [ExitStatus] works only with COMPLETED WITH SKIPS - Spring Forum

    spring.io | 1 year ago
    org.springframework.batch.core.JobExecutionException: Partition handler returned an unsuccessful step
  5. 0

    Memory leak because of HibernateSessionImpl in Spring batch

    Stack Overflow | 1 year ago | Chakrapani Kulkarni
    org.springframework.batch.core.JobExecutionException: Partition handler returned an unsuccessful step
  6. 0

    [SSP-2878] Data Import Tool - Empty CSV file fails entire job - JASIG Issue Tracker

    jasig.org | 1 year ago
    org.springframework.batch.core.JobExecutionException: Partition handler returned an unsuccessful step

    Root Cause Analysis

    1. java.lang.NullPointerException

      No message provided

      at org.springframework.batch.core.scope.context.SynchronizationManagerSupport.decrement()
    2. Spring Batch Core
      SimpleJobLauncher$1.run
      1. org.springframework.batch.core.scope.context.SynchronizationManagerSupport.decrement(SynchronizationManagerSupport.java:149)
      2. org.springframework.batch.core.scope.context.SynchronizationManagerSupport.close(SynchronizationManagerSupport.java:143)
      3. org.springframework.batch.core.scope.context.SynchronizationManagerSupport.release(SynchronizationManagerSupport.java:193)
      4. org.springframework.batch.core.scope.context.StepSynchronizationManager.release(StepSynchronizationManager.java:112)
      5. org.springframework.batch.core.step.AbstractStep.doExecutionRelease(AbstractStep.java:284)
      6. org.springframework.batch.core.step.AbstractStep.execute(AbstractStep.java:274)
      7. org.springframework.batch.core.job.SimpleStepHandler.handleStep(SimpleStepHandler.java:148)
      8. org.springframework.batch.core.job.flow.JobFlowExecutor.executeStep(JobFlowExecutor.java:64)
      9. org.springframework.batch.core.job.flow.support.state.StepState.handle(StepState.java:67)
      10. org.springframework.batch.core.job.flow.support.SimpleFlow.resume(SimpleFlow.java:162)
      11. org.springframework.batch.core.job.flow.support.SimpleFlow.start(SimpleFlow.java:141)
      12. org.springframework.batch.core.job.flow.FlowJob.doExecute(FlowJob.java:134)
      13. org.springframework.batch.core.job.AbstractJob.execute(AbstractJob.java:304)
      14. org.springframework.batch.core.launch.support.SimpleJobLauncher$1.run(SimpleJobLauncher.java:135)
      14 frames
    3. Spring Core
      SyncTaskExecutor.execute
      1. org.springframework.core.task.SyncTaskExecutor.execute(SyncTaskExecutor.java:50)
      1 frame
    4. Spring Batch Core
      CommandLineJobRunner.main
      1. org.springframework.batch.core.launch.support.SimpleJobLauncher.run(SimpleJobLauncher.java:128)
      2. org.springframework.batch.core.launch.support.CommandLineJobRunner.start(CommandLineJobRunner.java:362)
      3. org.springframework.batch.core.launch.support.CommandLineJobRunner.main(CommandLineJobRunner.java:590)
      3 frames
    5. com.ml.elt
      AutomarkBatchCommandLineRunner.main
      1. com.ml.elt.automarking.util.AutomarkBatchCommandLineRunner.main(AutomarkBatchCommandLineRunner.java:14)
      1 frame