org.apache.spark.sql.AnalysisException: Table or view not found: `default`.`aaa`; line 1 pos 14

Stack Overflow | Shabash Naidu | 2 months ago
  1.

    According to the [Hive Language Manual|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Union] for UNION ALL:

    {quote}
    The number and names of columns returned by each select_statement have to be the same. Otherwise, a schema error is thrown.
    {quote}

    Spark SQL silently swallows an error when the tables being joined with UNION ALL have the same number of columns but different names. Reproducible example:

{code}
// This test is meant to run in spark-shell
import java.io.File
import java.io.PrintWriter
import org.apache.spark.sql.hive.HiveContext
import org.apache.spark.sql.SaveMode

val ctx = sqlContext.asInstanceOf[HiveContext]
import ctx.implicits._

def dataPath(name: String) = sys.env("HOME") + "/" + name + ".jsonlines"

def tempTable(name: String, json: String) = {
  val path = dataPath(name)
  new PrintWriter(path) { write(json); close }
  ctx.read.json("file://" + path).registerTempTable(name)
}

// Note category vs. cat names of first column
tempTable("test_one", """{"category" : "A", "num" : 5}""")
tempTable("test_another", """{"cat" : "A", "num" : 5}""")

// +--------+---+
// |category|num|
// +--------+---+
// |       A|  5|
// |       A|  5|
// +--------+---+
//
// Instead, an error should have been generated due to incompatible schema
ctx.sql("select * from test_one union all select * from test_another").show

// Cleanup
new File(dataPath("test_one")).delete()
new File(dataPath("test_another")).delete()
{code}

    When the number of columns is different, Spark can even mix in datatypes. Reproducible example (requires a new spark-shell session):

{code}
// This test is meant to run in spark-shell
import java.io.File
import java.io.PrintWriter
import org.apache.spark.sql.hive.HiveContext
import org.apache.spark.sql.SaveMode

val ctx = sqlContext.asInstanceOf[HiveContext]
import ctx.implicits._

def dataPath(name: String) = sys.env("HOME") + "/" + name + ".jsonlines"

def tempTable(name: String, json: String) = {
  val path = dataPath(name)
  new PrintWriter(path) { write(json); close }
  ctx.read.json("file://" + path).registerTempTable(name)
}

// Note test_another is missing category column
tempTable("test_one", """{"category" : "A", "num" : 5}""")
tempTable("test_another", """{"num" : 5}""")

// +--------+
// |category|
// +--------+
// |       A|
// |       5|
// +--------+
//
// Instead, an error should have been generated due to incompatible schema
ctx.sql("select * from test_one union all select * from test_another").show

// Cleanup
new File(dataPath("test_one")).delete()
new File(dataPath("test_another")).delete()
{code}

    At other times, when the schemas are complex, Spark SQL produces a misleading error about an unresolved Union operator:

{code}
scala> ctx.sql("""select * from view_clicks
     | union all
     | select * from view_clicks_aug
     | """)
15/08/11 02:40:25 INFO ParseDriver: Parsing command: select * from view_clicks union all select * from view_clicks_aug
15/08/11 02:40:25 INFO ParseDriver: Parse Completed
15/08/11 02:40:25 INFO HiveMetaStore: 0: get_table : db=default tbl=view_clicks
15/08/11 02:40:25 INFO audit: ugi=ubuntu ip=unknown-ip-addr cmd=get_table : db=default tbl=view_clicks
15/08/11 02:40:25 INFO HiveMetaStore: 0: get_table : db=default tbl=view_clicks
15/08/11 02:40:25 INFO audit: ugi=ubuntu ip=unknown-ip-addr cmd=get_table : db=default tbl=view_clicks
15/08/11 02:40:25 INFO HiveMetaStore: 0: get_table : db=default tbl=view_clicks_aug
15/08/11 02:40:25 INFO audit: ugi=ubuntu ip=unknown-ip-addr cmd=get_table : db=default tbl=view_clicks_aug
15/08/11 02:40:25 INFO HiveMetaStore: 0: get_table : db=default tbl=view_clicks_aug
15/08/11 02:40:25 INFO audit: ugi=ubuntu ip=unknown-ip-addr cmd=get_table : db=default tbl=view_clicks_aug
org.apache.spark.sql.AnalysisException: unresolved operator 'Union;
  at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$class.failAnalysis(CheckAnalysis.scala:38)
  at org.apache.spark.sql.catalyst.analysis.Analyzer.failAnalysis(Analyzer.scala:42)
  at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.apply(CheckAnalysis.scala:126)
  at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.apply(CheckAnalysis.scala:50)
  at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:98)
  at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:97)
  at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:97)
  at scala.collection.immutable.List.foreach(List.scala:318)
  at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:97)
  at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:97)
  at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:97)
  at scala.collection.immutable.List.foreach(List.scala:318)
  at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:97)
  at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$class.checkAnalysis(CheckAnalysis.scala:50)
  at org.apache.spark.sql.catalyst.analysis.Analyzer.checkAnalysis(Analyzer.scala:42)
  at org.apache.spark.sql.SQLContext$QueryExecution.assertAnalyzed(SQLContext.scala:931)
  at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:131)
  at org.apache.spark.sql.DataFrame$.apply(DataFrame.scala:51)
  at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:755)
{code}

    Apache's JIRA Issue Tracker | 1 year ago | Simeon Simeonov
    org.apache.spark.sql.AnalysisException: unresolved operator 'Union;
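    For reference, current Spark releases address the complaint in the report above by distinguishing positional and name-based unions: Dataset.union still resolves columns by position, while Dataset.unionByName (available since Spark 2.3) resolves them by name and fails fast on mismatched schemas. A minimal sketch, assuming Spark 2.3+ and a SparkSession named spark; the DataFrames are hypothetical stand-ins for test_one/test_another:

{code}
// Minimal sketch, assuming Spark 2.3+ and a SparkSession named `spark`.
import spark.implicits._

val one     = Seq(("A", 5)).toDF("category", "num")
val another = Seq(("B", 6)).toDF("cat", "num")   // note: cat vs. category

// Positional union: succeeds silently; `cat` values land under `category`,
// which is exactly the behavior the JIRA report objects to.
one.union(another).show()

// Name-based union: throws AnalysisException because `category` cannot be
// resolved among (cat, num) -- the fail-fast behavior the report asks for.
one.unionByName(another).show()
{code}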
  2.

    allow 'tbl_spark's to be constructed from streaming Spark DataFrames

    GitHub | 4 months ago | kevinushey
    org.apache.spark.sql.AnalysisException: Queries with streaming sources must be executed with writeStream.start();
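    This exception is Structured Streaming's guard rail: any action that would materialize a streaming DataFrame (show, collect, count) fails during analysis unless the query runs through a streaming sink. A minimal sketch of the failing and working shapes, assuming Spark 2.2+ and a SparkSession named spark; the rate source is just a convenient built-in test source:

{code}
// Minimal sketch, assuming Spark 2.2+ and a SparkSession named `spark`.
val stream = spark.readStream
  .format("rate")   // built-in source that emits (timestamp, value) rows
  .load()

// stream.show()  // would throw: "Queries with streaming sources must be
//                // executed with writeStream.start()"

// Streaming DataFrames must be materialized through writeStream.start():
val query = stream.writeStream
  .format("console")
  .start()

query.awaitTermination()
{code}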

  3.

    Having count(distinct) not working with hivecontext query in spark 1.6

    Stack Overflow | 1 month ago | Yash_spark
    org.apache.spark.sql.AnalysisException: resolved attribute(s) gid#687,z#688 missing from x#685,y#252,z#255 in operator !Aggregate [x#685,y#252], [cast(((count(if ((gid#687 = 1)) z#688 else null),mode=Complete,isDistinct=false) > cast(1 as bigint)) as boolean) AS havingCondition#686,x#685,y#252];
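    The "resolved attribute(s) ... missing" message comes from the Spark 1.6 analyzer's rewrite of count(DISTINCT ...): the generated gid/z attributes get lost when the distinct aggregate appears only in a HAVING clause. A common workaround is to compute the distinct count in a subquery and filter on it from the outside; a sketch, assuming a HiveContext named ctx and a hypothetical table t with columns x, y, z mirroring the error message:

{code}
// Sketch of the workaround, assuming Spark 1.6 with a HiveContext `ctx`
// and a hypothetical table t(x, y, z).
//
// Failing shape:
//   SELECT x, y FROM t GROUP BY x, y HAVING count(DISTINCT z) > 1
//
// Workaround: move the distinct count into the select list of a
// subquery and filter in the outer query instead of in HAVING.
val result = ctx.sql("""
  SELECT x, y FROM (
    SELECT x, y, count(DISTINCT z) AS dz
    FROM t
    GROUP BY x, y
  ) tmp
  WHERE dz > 1
""")
result.show()
{code}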
  4.

    (scala) Can not found registered table spark sql

    Stack Overflow | 9 months ago | Hello lad
    org.apache.spark.sql.AnalysisException: Table not found: test; line 1 pos 14
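    For the "Table not found: test" case, the usual causes are querying before the temp table is registered, or registering it on a different SQLContext/SparkSession than the one running the query, since temp tables are scoped to the context that created them. A minimal sketch, assuming Spark 2.x and a SparkSession named spark with hypothetical data:

{code}
// Minimal sketch, assuming Spark 2.x and a SparkSession named `spark`.
import spark.implicits._

val df = Seq((1, "a"), (2, "b")).toDF("id", "value")
df.createOrReplaceTempView("test")        // register before querying

spark.sql("select * from test").show()    // resolves now

// Note: spark.newSession() has its own temp-view registry, so running
// the same query there would fail with "Table not found: test".
{code}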


    Root Cause Analysis

    org.apache.spark.sql.AnalysisException: Table or view not found: `default`.`aaa`; line 1 pos 14
      at org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:42)
      at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.apply(CheckAnalysis.scala:71)
      at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.apply(CheckAnalysis.scala:67)
      at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:126)
      at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:125)
      at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:125)
      at scala.collection.immutable.List.foreach(List.scala:381)
      at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:125)
      at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$class.checkAnalysis(CheckAnalysis.scala:67)
      at org.apache.spark.sql.catalyst.analysis.Analyzer.checkAnalysis(Analyzer.scala:58)
      at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:49)
      at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:64)
      at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:582)
      at in.inndata.sparkjoinsexamples.SparkJoinExample.main(SparkJoinExample.java:10)
      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      at java.lang.reflect.Method.invoke(Method.java:498)
      at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:729)
      at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:185)
      at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:210)
      at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:124)
      at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
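
    As the trace shows, the failure happens during analysis, inside SparkSession.sql, before any job runs: the catalog simply has no table or view named aaa in the default database. A minimal sketch of the usual fix, assuming Spark 2.x and a SparkSession named spark; the DataFrame contents are hypothetical:

{code}
// Minimal sketch, assuming Spark 2.x and a SparkSession named `spark`.
// The data is hypothetical; the point is that `aaa` must exist before
// spark.sql can resolve it.
import spark.implicits._

val df = Seq((1, "x"), (2, "y")).toDF("id", "value")

// Option 1: a session-scoped temp view, queried unqualified.
df.createOrReplaceTempView("aaa")
spark.sql("select * from aaa").show()

// Option 2: a persisted table, which is what the qualified name
// `default`.`aaa` in the error message refers to.
// df.write.saveAsTable("default.aaa")
// spark.sql("select * from default.aaa").show()
{code}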