org.apache.reef.exception.EvaluatorException: Evaluator [container_1474871539042_0001_01_000002] is assumed to be in state [RUNNING]. But the resource manager reports it to be in state [FAILED]. This means that the Evaluator failed but wasn't able to send an error message back to the driver. Task [0] was running when the Evaluator crashed.

GitHub | HosiYuki | 3 months ago

    GitHub comment 1349#249500448

    Root Cause Analysis

    1. org.apache.reef.exception.EvaluatorException

      Evaluator [container_1474871539042_0001_01_000002] is assumed to be in state [RUNNING]. But the resource manager reports it to be in state [FAILED]. This means that the Evaluator failed but wasn't able to send an error message back to the driver. Task [0] was running when the Evaluator crashed.

      at org.apache.reef.runtime.common.driver.evaluator.EvaluatorManager.onResourceStatusMessage()
    2. org.apache.reef
      YarnContainerManager.onContainersCompleted
      1. org.apache.reef.runtime.common.driver.evaluator.EvaluatorManager.onResourceStatusMessage(EvaluatorManager.java:589)
      2. org.apache.reef.runtime.common.driver.resourcemanager.ResourceStatusHandler.onNext(ResourceStatusHandler.java:63)
      3. org.apache.reef.runtime.common.driver.resourcemanager.ResourceStatusHandler.onNext(ResourceStatusHandler.java:36)
      4. org.apache.reef.runtime.yarn.driver.REEFEventHandlers.onResourceStatus(REEFEventHandlers.java:91)
      5. org.apache.reef.runtime.yarn.driver.YarnContainerManager.onContainerStatus(YarnContainerManager.java:391)
      6. org.apache.reef.runtime.yarn.driver.YarnContainerManager.onContainersCompleted(YarnContainerManager.java:128)
    3. hadoop-yarn-client
      AMRMClientAsyncImpl$CallbackHandlerThread.run
      1. org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$CallbackHandlerThread.run(AMRMClientAsyncImpl.java:300)
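
The message in this trace describes a mismatch: the driver still assumes the Evaluator is RUNNING, while the resource manager reports the container as FAILED. The sketch below illustrates that kind of check in isolation. It is not REEF's code: the class `EvaluatorStateCheck`, the `State` enum, and the `reconcile` method are hypothetical names made up for this example, and the real logic in `EvaluatorManager.onResourceStatusMessage` may differ.

```java
import java.util.Optional;

public class EvaluatorStateCheck {

    /** Hypothetical states, named after the ones that appear in the message. */
    enum State { RUNNING, DONE, FAILED }

    /**
     * Compares the state the driver assumes with the state the resource
     * manager reports. Returns a description only for the case in the trace:
     * assumed RUNNING, reported FAILED. This is an assumption about the
     * shape of the check, not a copy of REEF's implementation.
     */
    static Optional<String> reconcile(String evaluatorId, State assumed, State reported) {
        if (assumed == State.RUNNING && reported == State.FAILED) {
            return Optional.of("Evaluator [" + evaluatorId + "] is assumed to be in state ["
                + assumed + "]. But the resource manager reports it to be in state ["
                + reported + "].");
        }
        // Any other combination is treated as consistent in this sketch.
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Container id taken from the trace above.
        String id = "container_1474871539042_0001_01_000002";
        reconcile(id, State.RUNNING, State.FAILED).ifPresent(System.out::println);
    }
}
```

Since the Evaluator crashed before it could report anything to the driver, the driver side only shows that the mismatch was detected. The actual cause is more likely to be found in the container's own logs on the node where it ran.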