org.apache.spark.SparkException

  • In 1.2.0 the failing tests are in {{RDDSpec}}:

{noformat}
java.lang.IllegalArgumentException: ReplicaPartitioner can only determine the partition of a tuple whose key is a non-empty Set[InetAddress]. Invalid key: Set()
    at com.datastax.spark.connector.rdd.partitioner.ReplicaPartitioner.getPartition(ReplicaPartitioner.scala:44)
    at org.apache.spark.util.collection.ExternalSorter.org$apache$spark$util$collection$ExternalSorter$$getPartition(ExternalSorter.scala:113)
    at org.apache.spark.util.collection.ExternalSorter$$anonfun$insertAll$1.apply(ExternalSorter.scala:212)
    at org.apache.spark.util.collection.ExternalSorter$$anonfun$insertAll$1.apply(ExternalSorter.scala:211)
    at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
    at org.apache.spark.util.collection.ExternalSorter.spillToPartitionFiles(ExternalSorter.scala:366)
    at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:211)
    at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:63)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
    at org.apache.spark.scheduler.Task.run(Task.scala:56)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:200)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
WARN 16:12:59,591 org.apache.spark.Logging$class (Logging.scala:71) - Lost task 0.0 in stage 37.0 (TID 223, localhost): java.lang.IllegalArgumentException: ReplicaPartitioner can only determine the partition of a tuple whose key is a non-empty Set[InetAddress]. Invalid key: Set()
    at com.datastax.spark.connector.rdd.partitioner.ReplicaPartitioner.getPartition(ReplicaPartitioner.scala:44)
    at org.apache.spark.util.collection.ExternalSorter.org$apache$spark$util$collection$ExternalSorter$$getPartition(ExternalSorter.scala:113)
    at org.apache.spark.util.collection.ExternalSorter$$anonfun$insertAll$1.apply(ExternalSorter.scala:212)
    at org.apache.spark.util.collection.ExternalSorter$$anonfun$insertAll$1.apply(ExternalSorter.scala:211)
    at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
    at org.apache.spark.util.collection.ExternalSorter.spillToPartitionFiles(ExternalSorter.scala:366)
    at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:211)
    at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:63)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
    at org.apache.spark.scheduler.Task.run(Task.scala:56)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:200)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
ERROR 16:12:59,591 org.apache.spark.Logging$class (Logging.scala:75) - Task 0 in stage 37.0 failed 1 times; aborting job
[info] - should be repartitionable *** FAILED *** (42 milliseconds)
[info] org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 37.0 failed 1 times, most recent failure: Lost task 0.0 in stage 37.0 (TID 223, localhost): java.lang.IllegalArgumentException: ReplicaPartitioner can only determine the partition of a tuple whose key is a non-empty Set[InetAddress]. Invalid key: Set()
[info]   at com.datastax.spark.connector.rdd.partitioner.ReplicaPartitioner.getPartition(ReplicaPartitioner.scala:44)
[info]   at org.apache.spark.util.collection.ExternalSorter.org$apache$spark$util$collection$ExternalSorter$$getPartition(ExternalSorter.scala:113)
[info]   at org.apache.spark.util.collection.ExternalSorter$$anonfun$insertAll$1.apply(ExternalSorter.scala:212)
[info]   at org.apache.spark.util.collection.ExternalSorter$$anonfun$insertAll$1.apply(ExternalSorter.scala:211)
[info]   at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
[info]   at org.apache.spark.util.collection.ExternalSorter.spillToPartitionFiles(ExternalSorter.scala:366)
[info]   at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:211)
[info]   at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:63)
[info]   at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
[info]   at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
[info]   at org.apache.spark.scheduler.Task.run(Task.scala:56)
[info]   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:200)
[info]   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
[info]   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
[info]   at java.lang.Thread.run(Thread.java:745)
[info]
[info] Driver stacktrace:
[info]   at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1214)
[info]   at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1203)
[info]   at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1202)
[info]   at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
[info]   at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
[info]   at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1202)
[info]   at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:696)
[info]   at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:696)
[info]   at scala.Option.foreach(Option.scala:236)
[info]   at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:696)
[info]   at org.apache.spark.scheduler.DAGSchedulerEventProcessActor$$anonfun$receive$2.applyOrElse(DAGScheduler.scala:1420)
[info]   at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
[info]   at org.apache.spark.scheduler.DAGSchedulerEventProcessActor.aroundReceive(DAGScheduler.scala:1375)
[info]   at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
[info]   at akka.actor.ActorCell.invoke(ActorCell.scala:487)
[info]   at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:238)
[info]   at akka.dispatch.Mailbox.run(Mailbox.scala:220)
[info]   at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:393)
[info]   at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
[info]   at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
[info]   at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
[info]   at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
{noformat}
    via Jacek Lewandowski
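
The exception message points at the invariant {{ReplicaPartitioner.getPartition}} enforces: every tuple routed through it must be keyed by a non-empty Set[InetAddress], and the failing test produced at least one row whose replica set was Set(). Below is a minimal workaround sketch, assuming you build the keyed RDD yourself; the {{EmptyReplicaGuard}} object and the sample data are illustrative and not part of the connector.

{code}
import java.net.InetAddress

import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical guard: ReplicaPartitioner rejects tuples keyed by an empty
// Set[InetAddress], so drop such tuples before any shuffle that routes
// rows by replica address.
object EmptyReplicaGuard {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setMaster("local[2]").setAppName("empty-replica-guard"))

    // Sample keyed RDD; in the failing test the keys come from a replica
    // lookup and can be Set() when no replica is known for a row.
    val keyed = sc.parallelize(Seq(
      (Set(InetAddress.getByName("127.0.0.1")), "row-1"),
      (Set.empty[InetAddress], "row-2") // this key would trigger the exception
    ))

    // Keep only tuples the partitioner can place: an empty replica set has
    // no valid target partition.
    val partitionable = keyed.filter { case (replicas, _) => replicas.nonEmpty }

    partitionable.collect().foreach(println)
    sc.stop()
  }
}
{code}

Whether the right fix is to filter these rows out, to route them to a designated fallback partition, or to repair the replica lookup itself depends on the connector internals; the sketch only demonstrates the invariant the exception enforces.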