java.lang.IllegalStateException: MemcpyAsync H2H failed: [140394272981392] -> [8883763712]

GitHub | wmeddie | 7 months ago

    Error running MultiGPULenetMnistExample on Azure NV24

    Root Cause Analysis

    java.lang.IllegalStateException: MemcpyAsync H2H failed: [140394272981392] -> [8883763712]
        at org.nd4j.jita.handler.impl.CudaZeroHandler.memcpyAsync(CudaZeroHandler.java:510)
        at org.nd4j.jita.handler.impl.CudaZeroHandler.memcpyBlocking(CudaZeroHandler.java:611)
        at org.nd4j.jita.allocator.impl.AtomicAllocator.memcpyBlocking(AtomicAllocator.java:801)
        at org.nd4j.linalg.jcublas.JCublasNDArrayFactory.average(JCublasNDArrayFactory.java:800)
        at org.nd4j.linalg.jcublas.JCublasNDArrayFactory.average(JCublasNDArrayFactory.java:851)
        at org.nd4j.linalg.factory.Nd4j.averageAndPropagate(Nd4j.java:4784)
        at org.deeplearning4j.parallelism.ParallelWrapper.fit(ParallelWrapper.java:143)
        at org.deeplearning4j.examples.multigpu.MultiGpuLenetMnistExample.main(MultiGpuLenetMnistExample.java:145)
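
    The trace shows the failure inside a memory copy issued while ParallelWrapper.fit was averaging arrays through Nd4j.averageAndPropagate. As a rough illustration of what an "average and propagate" step does conceptually, here is a minimal plain-Java sketch on host arrays. It is not ND4J's implementation and touches no CUDA memory; the class name and the double[][] representation are assumptions made for the example, and only the method name is borrowed from the trace.

```java
import java.util.Arrays;

// Hypothetical illustration only: plain host arrays stand in for per-device parameter buffers.
public class ParameterAveragingSketch {

    /**
     * Computes the element-wise mean of equally sized parameter vectors
     * and copies that mean back into every input vector.
     */
    static double[] averageAndPropagate(double[][] replicas) {
        if (replicas.length == 0) {
            throw new IllegalArgumentException("no replicas to average");
        }
        int n = replicas[0].length;
        double[] mean = new double[n];

        // Accumulate the element-wise sum across all replicas.
        for (double[] replica : replicas) {
            if (replica.length != n) {
                throw new IllegalArgumentException("replica length mismatch");
            }
            for (int i = 0; i < n; i++) {
                mean[i] += replica[i];
            }
        }

        // Turn the sums into means.
        for (int i = 0; i < n; i++) {
            mean[i] /= replicas.length;
        }

        // Propagate: overwrite each replica with the averaged values.
        // In the trace above, the failing call is a copy of this general kind.
        for (double[] replica : replicas) {
            System.arraycopy(mean, 0, replica, 0, n);
        }
        return mean;
    }

    public static void main(String[] args) {
        double[][] replicas = {{1.0, 2.0}, {3.0, 6.0}};
        double[] mean = averageAndPropagate(replicas);
        System.out.println(Arrays.toString(mean));
        System.out.println(Arrays.toString(replicas[0]));
    }
}
```

    In a multi-GPU setting each replica would live in a different device's memory, so the propagate step involves cross-buffer copies rather than System.arraycopy; the sketch only conveys the data flow.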