htsjdk.samtools.SAMFormatException: Invalid GZIP header

GitHub | tpoterba | 8 months ago

    BGZIP bug


    Root Cause Analysis

    1. htsjdk.samtools.SAMFormatException

      Invalid GZIP header

      at htsjdk.samtools.util.BlockGunzipper.unzipBlock()
    2. HTS JDK
      BlockCompressedInputStream.available
      1. htsjdk.samtools.util.BlockGunzipper.unzipBlock(BlockGunzipper.java:72)
      2. htsjdk.samtools.util.BlockCompressedInputStream.inflateBlock(BlockCompressedInputStream.java:410)
      3. htsjdk.samtools.util.BlockCompressedInputStream.readBlock(BlockCompressedInputStream.java:392)
      4. htsjdk.samtools.util.BlockCompressedInputStream.available(BlockCompressedInputStream.java:127)
      4 frames
    3. org.seqdoop.hadoop_bam
      BGZFSplitCompressionInputStream.read
      1. org.seqdoop.hadoop_bam.util.BGZFSplitCompressionInputStream.readWithinBlock(BGZFSplitCompressionInputStream.java:81)
      2. org.seqdoop.hadoop_bam.util.BGZFSplitCompressionInputStream.read(BGZFSplitCompressionInputStream.java:48)
      2 frames
    4. Java RT
      InputStream.read
      1. java.io.InputStream.read(InputStream.java:101)
      1 frame
    5. Hadoop
      CompressedSplitLineReader.fillBuffer
      1. org.apache.hadoop.mapreduce.lib.input.CompressedSplitLineReader.fillBuffer(CompressedSplitLineReader.java:130)
      1 frame
    6. Hadoop
      LineReader.readLine
      1. org.apache.hadoop.util.LineReader.readDefaultLine(LineReader.java:216)
      2. org.apache.hadoop.util.LineReader.readLine(LineReader.java:174)
      2 frames
    7. Hadoop
      TextInputFormat.getRecordReader
      1. org.apache.hadoop.mapreduce.lib.input.CompressedSplitLineReader.readLine(CompressedSplitLineReader.java:159)
      2. org.apache.hadoop.mapred.LineRecordReader.<init>(LineRecordReader.java:134)
      3. org.apache.hadoop.mapred.TextInputFormat.getRecordReader(TextInputFormat.java:67)
      3 frames
    8. Spark
      Executor$TaskRunner.run
      1. org.apache.spark.rdd.HadoopRDD$$anon$1.<init>(HadoopRDD.scala:239)
      2. org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:216)
      3. org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:101)
      4. org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
      5. org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
      6. org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
      7. org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
      8. org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
      9. org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
      10. org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
      11. org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
      12. org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
      13. org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
      14. org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
      15. org.apache.spark.rdd.UnionRDD.compute(UnionRDD.scala:87)
      16. org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
      17. org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
      18. org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
      19. org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
      20. org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
      21. org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
      22. org.apache.spark.scheduler.Task.run(Task.scala:88)
      23. org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
      23 frames
    9. Java RT
      Thread.run
      1. java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
      2. java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
      3. java.lang.Thread.run(Thread.java:745)
      3 frames
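
The trace ends in `BlockGunzipper.unzipBlock`, which rejected a block header. As a rough diagnostic, the sketch below checks whether a byte string starts with a BGZF block header as laid out in the SAM/BAM specification: gzip magic `1f 8b`, deflate method `8`, the FEXTRA flag set, and a `BC` extra subfield of length 2. The function name `looks_like_bgzf_block` and the constant `BGZF_EOF` are illustrative, not part of htsjdk or hadoop_bam, and the checks htsjdk actually performs may differ in detail.

```python
import struct

# The 28-byte empty BGZF block that the SAM/BAM specification uses as an EOF marker.
BGZF_EOF = bytes.fromhex(
    "1f8b0804 00000000 00ff 0600 42430200 1b00 0300 00000000 00000000"
)


def looks_like_bgzf_block(header: bytes) -> bool:
    """Return True if `header` begins with a plausible BGZF block header."""
    if len(header) < 18:
        return False
    # Fixed 12-byte gzip prefix: ID1, ID2, CM, FLG, MTIME, XFL, OS, XLEN (little-endian).
    id1, id2, cm, flg, _mtime, _xfl, _os, xlen = struct.unpack_from("<BBBBIBBH", header, 0)
    if (id1, id2, cm) != (0x1F, 0x8B, 8):
        return False  # not gzip/deflate at all
    if not flg & 0x04:
        return False  # FEXTRA not set, so this is plain gzip rather than BGZF
    # Walk the extra subfields looking for the BGZF 'BC' subfield (SLEN == 2).
    extra = header[12:12 + xlen]
    pos = 0
    while pos + 4 <= len(extra):
        si1, si2, slen = struct.unpack_from("<BBH", extra, pos)
        if (si1, si2, slen) == (ord("B"), ord("C"), 2):
            return pos + 6 <= len(extra)  # room for the 2-byte BSIZE value
        pos += 4 + slen
    return False
```

To use it, read the first 18 or more bytes of the suspect file in binary mode and pass them in. This only inspects the first block; since the trace passes through `BGZFSplitCompressionInputStream`, the rejected header may instead sit at a split boundary inside the file, which this check does not cover.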