org.apache.spark.sql.AnalysisException: Duplicate column(s) : "Int8", "String" found, cannot save to parquet format;

  1. Duplicate column exception when reading Parquet files from S3A using Spark

     Stack Overflow | 2 months ago | newbie_learner
     org.apache.spark.sql.AnalysisException: Duplicate column(s) : "Int8", "String" found, cannot save to parquet format;
     A pre-write guard sketch for this failure follows the list.
  2. UTF-8 BOM not properly handled

     GitHub | 1 year ago | sfelsheim
     org.apache.spark.sql.AnalysisException: Cannot resolve column name "addrId" among (\uFEFFaddrId, city, state, zip);
     See the column-name sketch after this list for the usual renaming fixes.
  3. Bug: Projection operations (via select) propagate problematic column names that lead to run-time failures

     GitHub | 2 months ago | imarios
     org.apache.spark.sql.AnalysisException: Cannot resolve column name "_1" among (i, i);
  4. Why does dropna() not work?

     Stack Overflow | 2 years ago | Jason
     org.apache.spark.sql.AnalysisException: Cannot resolve column name "dropna" among (Name, Age, Country, Score);
  5. Adding StringType column to existing Spark DataFrame and then applying default values

     Stack Overflow | 2 months ago | smeeb
     org.apache.spark.sql.AnalysisException: Cannot resolve column name "col" among (x, y);
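
    The headline failure (result 1, and the trace under Root Cause Analysis below) is raised while Spark validates the schema of a Parquet source, so it usually has to be fixed on the writing side: the files already contain, or merge into, a schema that lists the same column name more than once. Below is a minimal pre-write guard sketched against a Spark 1.5/1.6-era SQLContext; the helper names and the case-insensitive comparison are assumptions, not something taken from the original question.

      import org.apache.spark.sql.{DataFrame, SQLContext}

      // Hypothetical guard: refuse to write a DataFrame whose column names collide
      // (compared case-insensitively), so the resulting Parquet files never trip
      // ParquetRelation.checkConstraints when they are read back.
      def requireUniqueColumns(df: DataFrame): DataFrame = {
        val lowered = df.columns.map(_.toLowerCase)
        val dupes = lowered.diff(lowered.distinct).distinct
        require(dupes.isEmpty, s"Duplicate column(s): ${dupes.mkString(", ")}")
        df
      }

      // Usage sketch: guard the write, and disable schema merging on read so a single
      // stray part under the S3A prefix cannot introduce a conflicting column.
      def writeAndReadBack(sqlContext: SQLContext, df: DataFrame, path: String): DataFrame = {
        requireUniqueColumns(df).write.parquet(path)
        sqlContext.read.option("mergeSchema", "false").parquet(path)
      }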
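
    The remaining results are variations on a column name that cannot be resolved: a UTF-8 BOM glued to the first header cell (result 2), a projection that collapses two columns to the same name (result 3), a method name used where a column was expected (result 4), and a literal default passed as a column reference (result 5). A short sketch of the usual fixes, written against hypothetical DataFrames carrying the column names quoted in those results:

      import org.apache.spark.sql.DataFrame
      import org.apache.spark.sql.functions.{coalesce, col, lit}

      // Result 2: strip a leading UTF-8 BOM ("\uFEFF") from header-derived names,
      // so "addrId" can be resolved among (\uFEFFaddrId, city, state, zip).
      def stripBom(csvDf: DataFrame): DataFrame =
        csvDf.columns.foldLeft(csvDf) { (acc, name) =>
          acc.withColumnRenamed(name, name.stripPrefix("\uFEFF"))
        }

      // Result 3: rename positionally when a projection has produced duplicates like (i, i).
      def disambiguate(pairDf: DataFrame): DataFrame =
        pairDf.toDF("i_left", "i_right")

      // Result 4: dropna is a method, not a column; in the Scala API, drop null rows via na.drop().
      def dropNullRows(df: DataFrame): DataFrame =
        df.na.drop()

      // Result 5: lit() builds a literal default value; col("...") must name an existing column.
      def addDefault(xyDf: DataFrame): DataFrame =
        xyDf.withColumn("label", lit("n/a"))
            .withColumn("y", coalesce(col("y"), lit("n/a")))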

    Root Cause Analysis

    1. org.apache.spark.sql.AnalysisException

      Duplicate column(s) : "Int8", "String" found, cannot save to parquet format;

      at org.apache.spark.sql.execution.datasources.parquet.ParquetRelation.checkConstraints()
    2. org.apache.spark
      ParquetRelation.dataSchema
      1. org.apache.spark.sql.execution.datasources.parquet.ParquetRelation.checkConstraints(ParquetRelation.scala:190)
      2. org.apache.spark.sql.execution.datasources.parquet.ParquetRelation.dataSchema(ParquetRelation.scala:199)
      2 frames
    3. Spark Project SQL
      HadoopFsRelation.schema
      1. org.apache.spark.sql.sources.HadoopFsRelation.schema$lzycompute(interfaces.scala:561)
      2. org.apache.spark.sql.sources.HadoopFsRelation.schema(interfaces.scala:560)
      2 frames
    4. org.apache.spark
      LogicalRelation.<init>
      1. org.apache.spark.sql.execution.datasources.LogicalRelation.<init>(LogicalRelation.scala:37)
      1 frame
    5. Spark Project SQL
      SQLContext.parquetFile
      1. org.apache.spark.sql.SQLContext.baseRelationToDataFrame(SQLContext.scala:395)
      2. org.apache.spark.sql.DataFrameReader.parquet(DataFrameReader.scala:267)
      3. org.apache.spark.sql.SQLContext.parquetFile(SQLContext.scala:1052)
      3 frames
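
    The trace shows the exception surfacing while SQLContext.parquetFile builds the relation's schema, before any query runs. When the input is a directory of Parquet parts on S3A, one way to narrow down which part carries the conflicting schema is to try loading each file in isolation. A rough diagnostic sketch, assuming a Spark 1.5/1.6 SQLContext and that the parts sit directly under the given prefix (the listing logic is an assumption, not part of the original report):

      import java.net.URI

      import org.apache.hadoop.fs.{FileSystem, Path}
      import org.apache.spark.sql.SQLContext

      import scala.util.Try

      // Try each Parquet part on its own; parts whose schema cannot be resolved
      // (for example because they genuinely contain duplicate column names) are returned.
      def offendingParts(sqlContext: SQLContext, dir: String): Seq[String] = {
        val fs = FileSystem.get(new URI(dir), sqlContext.sparkContext.hadoopConfiguration)
        fs.listStatus(new Path(dir))
          .map(_.getPath.toString)
          .filter(_.endsWith(".parquet"))
          .filter(part => Try(sqlContext.read.parquet(part).schema).isFailure)
          .toSeq
      }

    If every part loads cleanly on its own, the duplicate is more likely introduced when the part schemas are merged, in which case reading with the mergeSchema option disabled (as in the first sketch above) is the next thing to try.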