org.apache.spark.SparkException: Job aborted due to stage failure: Task 7 in stage 0.0 failed 4 times, most recent failure: Lost task 7.3 in stage 0.0 (TID 16, spark04): java.io.FileNotFoundException: File file:/home/file/new/ALL.adam/part-r-00227.gz.parquet does not exist

GitHub | car2008 | 6 months ago

    spark-submit throws an exception in Spark standalone mode when loading an .adam file transformed from a .vcf

    Root Cause Analysis

    1. org.apache.spark.SparkException

      Job aborted due to stage failure: Task 7 in stage 0.0 failed 4 times, most recent failure: Lost task 7.3 in stage 0.0 (TID 16, spark04): java.io.FileNotFoundException: File file:/home/file/new/ALL.adam/part-r-00227.gz.parquet does not exist

      at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus()
    2. Hadoop
      FilterFileSystem.getFileStatus
      1. org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:534)
      2. org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:747)
      3. org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:524)
      4. org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:409)
      4 frames
    3. Parquet
      ParquetRecordReader.initialize
      1. parquet.hadoop.ParquetFileReader.readFooter(ParquetFileReader.java:385)
      2. parquet.hadoop.ParquetRecordReader.initializeInternalReader(ParquetRecordReader.java:157)
      3. parquet.hadoop.ParquetRecordReader.initialize(ParquetRecordReader.java:140)
      3 frames
    4. Spark
      Executor$TaskRunner.run
      1. org.apache.spark.rdd.NewHadoopRDD$$anon$1.<init>(NewHadoopRDD.scala:158)
      2. org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:129)
      3. org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:64)
      4. org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
      5. org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
      6. org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
      7. org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
      8. org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
      9. org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
      10. org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
      11. org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
      12. org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
      13. org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
      14. org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
      15. org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
      16. org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
      17. org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
      18. org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
      19. org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
      20. org.apache.spark.scheduler.Task.run(Task.scala:89)
      21. org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
      21 frames
    5. Java RT
      Thread.run
      1. java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
      2. java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
      3. java.lang.Thread.run(Thread.java:745)
      3 frames
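
    The frames above show the executor on spark04 resolving a file: URI through RawLocalFileSystem, i.e. against its own local disk. A likely root cause is that the ALL.adam Parquet directory exists only on the machine that submitted the job: with a file: path in a multi-node standalone cluster, every worker must hold the data at the same local path, or the Parquet footer read fails exactly as traced above. Below is a minimal sketch of the safer approach, assuming a Spark 1.x SQLContext (consistent with the NewHadoopRDD/ShuffleMapTask frames) and an illustrative HDFS path; it is not the reporter's actual code.

        import org.apache.spark.{SparkConf, SparkContext}
        import org.apache.spark.sql.SQLContext

        object AdamParquetLoad {
          def main(args: Array[String]): Unit = {
            val sc = new SparkContext(new SparkConf().setAppName("adam-parquet-load"))
            val sqlContext = new SQLContext(sc)

            // A file: URI is resolved on each executor's local filesystem, so
            //   sqlContext.read.parquet("file:/home/file/new/ALL.adam")
            // only works if every worker (spark04 included) has that directory locally.

            // Placing the .adam directory on a filesystem reachable from all nodes
            // avoids the FileNotFoundException; this HDFS path is illustrative.
            val genotypes = sqlContext.read.parquet("hdfs:///user/data/ALL.adam")
            println(genotypes.count())

            sc.stop()
          }
        }

    Alternatively, copying the ALL.adam directory to the same path on every worker (or pointing Spark at shared storage such as NFS) also satisfies the file: lookup that fails at RawLocalFileSystem.getFileStatus above.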