org.apache.spark.scheduler.TaskSetManager: Lost task 2.0 in stage 1.0 (TID 3, 10.240.0.2): scala.MatchError: 707070 (of class java.lang.Long)

DataStax JIRA | Russell Spitzer | 11 months ago
  1. 0

    Currently you cannot write Longs to VarInts from DataFrames:
    {code}
    org.apache.spark.scheduler.TaskSetManager: Lost task 2.0 in stage 1.0 (TID 3, 10.240.0.2): scala.MatchError: 707070 (of class java.lang.Long)
        at com.datastax.spark.connector.writer.SqlRowWriter$$anonfun$readColumnValues$1.apply$mcVI$sp(SqlRowWriter.scala:28)
        at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141)
        at com.datastax.spark.connector.writer.SqlRowWriter.readColumnValues(SqlRowWriter.scala:26)
        at com.datastax.spark.connector.writer.SqlRowWriter.readColumnValues(SqlRowWriter.scala:13)
        at com.datastax.spark.connector.writer.BoundStatementBuilder.bind(BoundStatementBuilder.scala:35)
        at com.datastax.spark.connector.writer.GroupingBatchBuilder.next(GroupingBatchBuilder.scala:106)
        at com.datastax.spark.connector.writer.GroupingBatchBuilder.next(GroupingBatchBuilder.scala:31)
        at scala.collection.Iterator$class.foreach(Iterator.scala:727)
        at com.datastax.spark.connector.writer.GroupingBatchBuilder.foreach(GroupingBatchBuilder.scala:31)
        at com.datastax.spark.connector.writer.TableWriter$$anonfun$write$1.apply(TableWriter.scala:154)
        at com.datastax.spark.connector.writer.TableWriter$$anonfun$write$1.apply(TableWriter.scala:138)
        at com.datastax.spark.connector.cql.CassandraConnector$$anonfun$withSessionDo$1.apply(CassandraConnector.scala:110)
        at com.datastax.spark.connector.cql.CassandraConnector$$anonfun$withSessionDo$1.apply(CassandraConnector.scala:109)
        at com.datastax.spark.connector.cql.CassandraConnector.closeResourceAfterUse(CassandraConnector.scala:139)
        at com.datastax.spark.connector.cql.CassandraConnector.withSessionDo(CassandraConnector.scala:109)
        at com.datastax.spark.connector.writer.TableWriter.write(TableWriter.scala:138)
        at com.datastax.spark.connector.RDDFunctions$$anonfun$saveToCassandra$1.apply(RDDFunctions.scala:37)
        at com.datastax.spark.connector.RDDFunctions$$anonfun$saveToCassandra$1.apply(RDDFunctions.scala:37)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
        at org.apache.spark.scheduler.Task.run(Task.scala:88)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
    {code}
    https://github.com/datastax/spark-cassandra-connector/blob/master/spark-cassandra-connector/src/main/scala/com/datastax/spark/connector/writer/SqlRowWriter.scala#L28-L31
    We can extend this to allow for other numeric types and fail more gracefully when we can't find a proper conversion. In addition,
    {code}
    case UUIDType => UUID.fromString(row(i).asInstanceOf[String])
    case InetType => InetAddress.getByName(row(i).asInstanceOf[String])
    {code}
    will both handle nulls improperly. (A sketch of the suggested extension follows this entry.)

    DataStax JIRA | 11 months ago | Russell Spitzer
    org.apache.spark.scheduler.TaskSetManager: Lost task 2.0 in stage 1.0 (TID 3, 10.240.0.2): scala.MatchError: 707070 (of class java.lang.Long)
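    A minimal sketch of the extension suggested above, assuming a small conversion layer in front of the bound statement; the object and method names (VarIntConversions, toVarInt, toUuid, toInet) are illustrative, not the connector's actual API:
    {code}
    import java.math.BigInteger
    import java.net.InetAddress
    import java.util.UUID

    // Illustrative only: widen common numeric representations to BigInteger for
    // varint columns, and guard the UUID/Inet conversions against nulls.
    object VarIntConversions {

      // Accept any common numeric representation for a varint column and fail
      // with a descriptive error instead of a scala.MatchError.
      def toVarInt(value: Any): BigInteger = value match {
        case null                 => null
        case b: BigInteger        => b
        case b: BigInt            => b.bigInteger
        case i: java.lang.Integer => BigInteger.valueOf(i.longValue)
        case l: java.lang.Long    => BigInteger.valueOf(l)
        case s: java.lang.Short   => BigInteger.valueOf(s.longValue)
        case s: String            => new BigInteger(s)
        case other => throw new IllegalArgumentException(
          s"Cannot convert ${other.getClass.getName} to a varint value: $other")
      }

      // Null-guarded versions of the two cases quoted in the ticket.
      def toUuid(value: Any): UUID =
        if (value == null) null else UUID.fromString(value.asInstanceOf[String])

      def toInet(value: Any): InetAddress =
        if (value == null) null else InetAddress.getByName(value.asInstanceOf[String])
    }
    {code}
    With this shape, a java.lang.Long bound to a varint column is widened to BigInteger instead of falling through to a scala.MatchError, and an unconvertible type produces a descriptive IllegalArgumentException.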
  2. 0

    My C* has tables
    {code}
    CREATE TABLE csod.role (
        object_id uuid,
        code text,
        description text,
        level int,
        name text,
        solr_query text,
        PRIMARY KEY (object_id)
    )
    {code}
    and
    {code}
    CREATE TABLE csod.user_role (
        role uuid,
        user uuid,
        role_name text,
        solr_query text,
        PRIMARY KEY (role, user)
    )
    {code}
    When I try to use CassandraSQLContext in the Spark shell to join these tables, I get an exception:
    {code}
    scala> csc.sql("select * from role r join user_role ur on r.object_id = ur.role").collect
    WARN 2016-02-10 16:44:46 org.apache.spark.scheduler.TaskSetManager: Lost task 0.0 in stage 2.0 (TID 2, 172.26.28.101): scala.MatchError: UUIDType (of class org.apache.spark.sql.cassandra.types.UUIDType$)
        at org.apache.spark.sql.execution.SparkSqlSerializer2$$anonfun$createSerializationFunction$1.apply(SparkSqlSerializer2.scala:232)
        at org.apache.spark.sql.execution.SparkSqlSerializer2$$anonfun$createSerializationFunction$1.apply(SparkSqlSerializer2.scala:227)
        at org.apache.spark.sql.execution.Serializer2SerializationStream.writeKey(SparkSqlSerializer2.scala:65)
        at org.apache.spark.storage.DiskBlockObjectWriter.write(BlockObjectWriter.scala:206)
        at org.apache.spark.util.collection.WritablePartitionedIterator$$anon$3.writeNext(WritablePartitionedPairCollection.scala:104)
        at org.apache.spark.util.collection.ExternalSorter.spillToPartitionFiles(ExternalSorter.scala:375)
        at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:208)
        at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:62)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:70)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
        at org.apache.spark.scheduler.Task.run(Task.scala:70)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
    {code}
    If I understand it right, the join should work just like a string comparison, but it doesn't. (A hedged workaround sketch follows this entry.)

    DataStax JIRA | 10 months ago | Alexander Sedov
    org.apache.spark.scheduler.TaskSetManager: Lost task 0.0 in stage 2.0 (TID 2, 172.26.28.101): scala.MatchError: UUIDType (of class org.apache.spark.sql.cassandra.types.UUIDType$)
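    A workaround sketch (an assumption on my part, not from the ticket): since that Spark version's SparkSqlSerializer2 has no case for the connector's custom UUIDType, casting the uuid columns to strings before the join keeps the shuffle on a type Spark knows how to serialize. This assumes Spark 1.x with the connector's DataFrame source on the classpath; csc is the CassandraSQLContext from the ticket:
    {code}
    import org.apache.spark.sql.SQLContext

    val sqlContext: SQLContext = csc
    import sqlContext.implicits._

    // Load both tables through the connector's DataFrame source and cast the
    // uuid join keys to strings (a no-op if the source already maps them).
    val roles = sqlContext.read
      .format("org.apache.spark.sql.cassandra")
      .options(Map("keyspace" -> "csod", "table" -> "role"))
      .load()
      .withColumn("object_id", $"object_id".cast("string"))

    val userRoles = sqlContext.read
      .format("org.apache.spark.sql.cassandra")
      .options(Map("keyspace" -> "csod", "table" -> "user_role"))
      .load()
      .withColumn("role", $"role".cast("string"))

    // Join on the string representations of the uuids.
    roles.join(userRoles, roles("object_id") === userRoles("role")).collect()
    {code}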
  3. 0

    Spark SQL Join error on cassandra UUID types

    Stack Overflow | 1 year ago | jguerra
    org.apache.spark.scheduler.TaskSetManager: Lost task 3.0 in stage 0.0 (TID 6, 161.72.45.76): scala.MatchError: UUIDType (of class org.apache.spark.sql.cassandra.types.UUIDType$)


    Root Cause Analysis

    1. org.apache.spark.scheduler.TaskSetManager

      Lost task 2.0 in stage 1.0 (TID 3, 10.240.0.2): scala.MatchError: 707070 (of class java.lang.Long)

      at com.datastax.spark.connector.writer.SqlRowWriter$$anonfun$readColumnValues$1.apply$mcVI$sp()
    2. spark-cassandra-connector
      SqlRowWriter$$anonfun$readColumnValues$1.apply$mcVI$sp
      1. com.datastax.spark.connector.writer.SqlRowWriter$$anonfun$readColumnValues$1.apply$mcVI$sp(SqlRowWriter.scala:28)
      1 frame
    3. Scala
      Range.foreach$mVc$sp
      1. scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141)
      1 frame
    4. spark-cassandra-connector
      GroupingBatchBuilder.next
      1. com.datastax.spark.connector.writer.SqlRowWriter.readColumnValues(SqlRowWriter.scala:26)
      2. com.datastax.spark.connector.writer.SqlRowWriter.readColumnValues(SqlRowWriter.scala:13)
      3. com.datastax.spark.connector.writer.BoundStatementBuilder.bind(BoundStatementBuilder.scala:35)
      4. com.datastax.spark.connector.writer.GroupingBatchBuilder.next(GroupingBatchBuilder.scala:106)
      5. com.datastax.spark.connector.writer.GroupingBatchBuilder.next(GroupingBatchBuilder.scala:31)
      5 frames
    5. Scala
      Iterator$class.foreach
      1. scala.collection.Iterator$class.foreach(Iterator.scala:727)
      1 frame
    6. spark-cassandra-connector
      RDDFunctions$$anonfun$saveToCassandra$1.apply
      1. com.datastax.spark.connector.writer.GroupingBatchBuilder.foreach(GroupingBatchBuilder.scala:31)
      2. com.datastax.spark.connector.writer.TableWriter$$anonfun$write$1.apply(TableWriter.scala:154)
      3. com.datastax.spark.connector.writer.TableWriter$$anonfun$write$1.apply(TableWriter.scala:138)
      4. com.datastax.spark.connector.cql.CassandraConnector$$anonfun$withSessionDo$1.apply(CassandraConnector.scala:110)
      5. com.datastax.spark.connector.cql.CassandraConnector$$anonfun$withSessionDo$1.apply(CassandraConnector.scala:109)
      6. com.datastax.spark.connector.cql.CassandraConnector.closeResourceAfterUse(CassandraConnector.scala:139)
      7. com.datastax.spark.connector.cql.CassandraConnector.withSessionDo(CassandraConnector.scala:109)
      8. com.datastax.spark.connector.writer.TableWriter.write(TableWriter.scala:138)
      9. com.datastax.spark.connector.RDDFunctions$$anonfun$saveToCassandra$1.apply(RDDFunctions.scala:37)
      10. com.datastax.spark.connector.RDDFunctions$$anonfun$saveToCassandra$1.apply(RDDFunctions.scala:37)
      10 frames
    7. Spark
      Executor$TaskRunner.run
      1. org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
      2. org.apache.spark.scheduler.Task.run(Task.scala:88)
      3. org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
      3 frames
    8. Java RT
      Thread.run
      1. java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
      2. java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
      3. java.lang.Thread.run(Thread.java:745)
      3 frames