java.io.IOException: Could not read footer: java.lang.NoClassDefFoundError: parquet/org/codehaus/jackson/JsonGenerationException
    at parquet.hadoop.ParquetFileReader.readAllFootersInParallel(ParquetFileReader.java:189)
    at parquet.hadoop.ParquetFileReader.readAllFootersInParallelUsingSummaryFiles(ParquetFileReader.java:145)
    at parquet.hadoop.ParquetInputFormat.getFooters(ParquetInputFormat.java:354)
    at parquet.hadoop.ParquetInputFormat.getFooters(ParquetInputFormat.java:339)
    at parquet.hadoop.ParquetInputFormat.getSplits(ParquetInputFormat.java:246)
    at org.apache.spark.rdd.NewHadoopRDD.getPartitions(NewHadoopRDD.scala:85)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:207)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:205)

Google Groups | Uri Laserson | 3 years ago
  1.

    Re: Using Parquet from an interactive Spark shell

    Google Groups | 3 years ago | Uri Laserson
    java.io.IOException: Could not read footer: java.lang.NoClassDefFoundError: parquet/org/codehaus/jackson/JsonGenerationException
        at parquet.hadoop.ParquetFileReader.readAllFootersInParallel(ParquetFileReader.java:189)
        at parquet.hadoop.ParquetFileReader.readAllFootersInParallelUsingSummaryFiles(ParquetFileReader.java:145)
        at parquet.hadoop.ParquetInputFormat.getFooters(ParquetInputFormat.java:354)
        at parquet.hadoop.ParquetInputFormat.getFooters(ParquetInputFormat.java:339)
        at parquet.hadoop.ParquetInputFormat.getSplits(ParquetInputFormat.java:246)
        at org.apache.spark.rdd.NewHadoopRDD.getPartitions(NewHadoopRDD.scala:85)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:207)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:205)
  2.

    Re: Using Parquet from an interactive Spark shell

    Google Groups | 3 years ago | Uri Laserson
    java.io.IOException: Could not read footer: java.lang.NoClassDefFoundError: parquet/org/codehaus/jackson/JsonGenerationException
        at parquet.hadoop.ParquetFileReader.readAllFootersInParallel(ParquetFileReader.java:189)
        at parquet.hadoop.ParquetFileReader.readAllFootersInParallelUsingSummaryFiles(ParquetFileReader.java:145)
        at parquet.hadoop.ParquetInputFormat.getFooters(ParquetInputFormat.java:354)
        at parquet.hadoop.ParquetInputFormat.getFooters(ParquetInputFormat.java:339)
        at parquet.hadoop.ParquetInputFormat.getSplits(ParquetInputFormat.java:246)
        at org.apache.spark.rdd.NewHadoopRDD.getPartitions(NewHadoopRDD.scala:85)
  3.

    GitHub comment 19#222298002

    GitHub | 6 months ago | skoppar
    java.io.IOException: Could not read footer: java.lang.RuntimeException: file:/Users/skoppar/workspace/pyspark-beacon/stream/allproto.log is not a Parquet file. expected magic number at tail [80, 65, 82, 49] but found [55, 73, 67, 10]
        at org.apache.parquet.hadoop.ParquetFileReader.readAllFootersInParallel(ParquetFileReader.java:248)
        at org.apache.spark.sql.execution.datasources.parquet.ParquetRelation$$anonfun$24.apply(ParquetRelation.scala:812)
        at org.apache.spark.sql.execution.datasources.parquet.ParquetRelation$$anonfun$24.apply(ParquetRelation.scala:801)
        at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$22.apply(RDD.scala:756)
        at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$22.apply(RDD.scala:756)
  5.

    Can't read parquet with spark2.0

    Google Groups | 4 months ago | Unknown author
    java.io.IOException: Could not read footer for file FileStatus{path=alluxio://master1:9000/tpctest/catalog_sales/_common_metadata; isDirectory=false; length=3654; replication=0; blocksize=0; modification_time=0; access_time=0; owner=; group=; permission=rw-rw-rw-; isSymlink=false}
        at org.apache.parquet.hadoop.ParquetFileReader.readAllFootersInParallel(ParquetFileReader.java:247)
        at org.apache.spark.sql.execution.datasources.parquet.ParquetRelation$$anonfun$24.apply(ParquetRelation.scala:812)
        at org.apache.spark.sql.execution.datasources.parquet.ParquetRelation$$anonfun$24.apply(ParquetRelation.scala:801)
        at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$22.apply(RDD.scala:756)
        at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$22.apply(RDD.scala:756)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
  6.

    Can't read parquet with spark2.0

    Google Groups | 4 months ago | Unknown author
    java.io.IOException: Could not read footer for file FileStatus{path=alluxio://master1:9000/tpctest/catalog_sales/_common_metadata; isDirectory=false; length=3654; replication=0; blocksize=0; modification_time=0; access_time=0; owner=; group=; permission=rw-rw-rw-; isSymlink=false}
        at org.apache.parquet.hadoop.ParquetFileReader.readAllFootersInParallel(ParquetFileReader.java:247)
        at org.apache.spark.sql.execution.datasources.parquet.ParquetRelation$$anonfun$24.apply(ParquetRelation.scala:812)
        at org.apache.spark.sql.execution.datasources.parquet.ParquetRelation$$anonfun$24.apply(ParquetRelation.scala:801)
        at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$22.apply(RDD.scala:756)
        at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$22.apply(RDD.scala:756)
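
The GitHub entry above reports "expected magic number at tail [80, 65, 82, 49] but found [55, 73, 67, 10]". Those expected bytes are the ASCII string `PAR1`, which a Parquet file carries as its last four bytes, while the bytes found decode to `7IC` plus a newline, which looks like the end of a text log rather than a Parquet footer. A minimal Python sketch of the same tail check follows; the function names are hypothetical, and it only inspects local files, not `alluxio://` or HDFS paths.

```python
import os

# Trailing magic bytes of a Parquet file: [80, 65, 82, 49].
PARQUET_MAGIC = b"PAR1"


def read_tail_magic(path):
    """Return the last four bytes of a local file (fewer if it is shorter)."""
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        f.seek(max(size - len(PARQUET_MAGIC), 0))
        return f.read()


def looks_like_parquet(path):
    """True if the file ends with the Parquet magic bytes."""
    return read_tail_magic(path) == PARQUET_MAGIC


# The byte lists from the error message, decoded:
print(bytes([80, 65, 82, 49]))  # b'PAR1'
print(bytes([55, 73, 67, 10]))  # b'7IC\n'
```

Running `looks_like_parquet` over every file in an input directory before handing it to Spark is one way to spot stray logs or other non-Parquet files that would trigger this exception.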

    Root Cause Analysis

    1. java.io.IOException

      Could not read footer: java.lang.NoClassDefFoundError: parquet/org/codehaus/jackson/JsonGenerationException
        at parquet.hadoop.ParquetFileReader.readAllFootersInParallel(ParquetFileReader.java:189)
        at parquet.hadoop.ParquetFileReader.readAllFootersInParallelUsingSummaryFiles(ParquetFileReader.java:145)
        at parquet.hadoop.ParquetInputFormat.getFooters(ParquetInputFormat.java:354)
        at parquet.hadoop.ParquetInputFormat.getFooters(ParquetInputFormat.java:339)
        at parquet.hadoop.ParquetInputFormat.getSplits(ParquetInputFormat.java:246)
        at org.apache.spark.rdd.NewHadoopRDD.getPartitions(NewHadoopRDD.scala:85)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:207)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:205)
        at scala.Option.getOrElse()
    2. Scala
      Option.getOrElse
      1. scala.Option.getOrElse(Option.scala:120)
      1 frame
    3. Spark
      RDD.collect
      1. org.apache.spark.rdd.RDD.partitions(RDD.scala:205)
      2. org.apache.spark.SparkContext.runJob(SparkContext.scala:863)
      3. org.apache.spark.rdd.RDD.collect(RDD.scala:602)
      3 frames
    4. Unknown
      $iwC.<init>
      1. $iwC$$iwC$$iwC$$iwC.<init>(<console>:20)
      2. $iwC$$iwC$$iwC.<init>(<console>:25)
      3. $iwC$$iwC.<init>(<console>:27)
      4. $iwC.<init>(<console>:29)
      4 frames
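
A `NoClassDefFoundError` means the JVM could not load the named class at run time. Here the class is `parquet/org/codehaus/jackson/JsonGenerationException`, and its package prefix suggests a copy of Jackson relocated under `parquet/`, so one plausible cause is that the jar carrying that relocated copy was not on the shell's classpath. The sketch below is one way to check which jars, if any, contain the class. It relies only on jars being zip archives; the function name and the jar locations passed to it are hypothetical.

```python
import zipfile

# Class named in the NoClassDefFoundError, as a jar entry path.
MISSING_CLASS = "parquet/org/codehaus/jackson/JsonGenerationException.class"


def jars_containing(class_entry, jar_paths):
    """Return the jars from jar_paths whose entry list includes class_entry."""
    hits = []
    for jar in jar_paths:
        with zipfile.ZipFile(jar) as zf:
            if class_entry in zf.namelist():
                hits.append(jar)
    return hits
```

For example, `jars_containing(MISSING_CLASS, glob.glob("/path/to/jars/*.jar"))`, with `glob` imported and the placeholder directory replaced by the real one, lists the candidate jars. An empty result would mean none of the jars checked provides the class.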