org.apache.spark.SparkException: Job aborted due to stage failure: Task 2 in stage 2.0 failed 1 times, most recent failure: Lost task 2.0 in stage 2.0 (TID 10, localhost): org.apache.spark.util.TaskCompletionListenerException: state should be: open

JIRA | Sam Weaver | 10 months ago
Your exception is missing from the Samebug knowledge base. Here are the best solutions we found on the Internet.
  1.

    Working in Python, here is the code being called:

{code:python}
# set up environment
from pyspark import SparkConf, SparkContext
from pyspark.sql import SQLContext

conf = SparkConf() \
    .setAppName("MovieRatings") \
    .set("spark.executor.memory", "4g")
sc = SparkContext(conf=conf)
sqlContext = SQLContext(sc)

df = sqlContext.read.format("com.mongodb.spark.sql").load()

print("Schema:")
df.printSchema()
print("Count:")
df.count()
{code}

    count() never returns; instead, a stack trace is generated:

{code}
16/04/29 13:14:12 INFO SparkContext: Created broadcast 3 from broadcast at MongoRDD.scala:145
Schema:
root
 |-- _id: string (nullable = true)
 |-- genre: string (nullable = true)
 |-- rating: string (nullable = true)
 |-- title: string (nullable = true)
 |-- user_id: string (nullable = true)

Count:
16/04/29 13:14:14 INFO SparkContext: Starting job: count at NativeMethodAccessorImpl.java:-2
16/04/29 13:14:14 INFO DAGScheduler: Registering RDD 12 (count at NativeMethodAccessorImpl.java:-2)
16/04/29 13:14:14 INFO DAGScheduler: Got job 1 (count at NativeMethodAccessorImpl.java:-2) with 1 output partitions
16/04/29 13:14:14 INFO DAGScheduler: Final stage: ResultStage 3 (count at NativeMethodAccessorImpl.java:-2)
16/04/29 13:14:14 INFO DAGScheduler: Parents of final stage: List(ShuffleMapStage 2)
16/04/29 13:14:14 INFO DAGScheduler: Missing parents: List(ShuffleMapStage 2)
16/04/29 13:14:14 INFO DAGScheduler: Submitting ShuffleMapStage 2 (MapPartitionsRDD[12] at count at NativeMethodAccessorImpl.java:-2), which has no missing parents
16/04/29 13:14:14 INFO MemoryStore: Block broadcast_4 stored as values in memory (estimated size 11.4 KB, free 26.0 KB)
16/04/29 13:14:14 INFO MemoryStore: Block broadcast_4_piece0 stored as bytes in memory (estimated size 5.8 KB, free 31.7 KB)
16/04/29 13:14:14 INFO BlockManagerInfo: Added broadcast_4_piece0 in memory on localhost:52084 (size: 5.8 KB, free: 511.1 MB)
16/04/29 13:14:14 INFO SparkContext: Created broadcast 4 from broadcast at DAGScheduler.scala:1006
16/04/29 13:14:14 INFO DAGScheduler: Submitting 6 missing tasks from ShuffleMapStage 2 (MapPartitionsRDD[12] at count at NativeMethodAccessorImpl.java:-2)
16/04/29 13:14:14 INFO TaskSchedulerImpl: Adding task set 2.0 with 6 tasks
16/04/29 13:14:14 INFO TaskSetManager: Starting task 0.0 in stage 2.0 (TID 8, localhost, partition 0,ANY, 2640 bytes)
16/04/29 13:14:14 INFO TaskSetManager: Starting task 1.0 in stage 2.0 (TID 9, localhost, partition 1,ANY, 2652 bytes)
16/04/29 13:14:14 INFO TaskSetManager: Starting task 2.0 in stage 2.0 (TID 10, localhost, partition 2,ANY, 2652 bytes)
16/04/29 13:14:14 INFO TaskSetManager: Starting task 3.0 in stage 2.0 (TID 11, localhost, partition 3,ANY, 2652 bytes)
16/04/29 13:14:14 INFO Executor: Running task 0.0 in stage 2.0 (TID 8)
16/04/29 13:14:14 INFO Executor: Running task 1.0 in stage 2.0 (TID 9)
16/04/29 13:14:14 INFO Executor: Running task 2.0 in stage 2.0 (TID 10)
16/04/29 13:14:14 INFO Executor: Running task 3.0 in stage 2.0 (TID 11)
16/04/29 13:14:15 INFO GenerateMutableProjection: Code generated in 403.549317 ms
16/04/29 13:14:15 INFO GenerateUnsafeProjection: Code generated in 39.364259 ms
16/04/29 13:14:15 INFO GenerateMutableProjection: Code generated in 17.201067 ms
16/04/29 13:14:15 INFO GenerateUnsafeRowJoiner: Code generated in 10.042296 ms
16/04/29 13:14:15 INFO GenerateUnsafeProjection: Code generated in 14.153414 ms
16/04/29 13:14:16 INFO BlockManagerInfo: Removed broadcast_2_piece0 on localhost:52084 in memory (size: 1788.0 B, free: 511.1 MB)
16/04/29 13:14:19 INFO MongoClientCache: Closing MongoClient: [127.0.0.1:27017]
16/04/29 13:14:19 INFO connection: Closed connection [connectionId{localValue:3, serverValue:7591}] to 127.0.0.1:27017 because the pool has been closed.
16/04/29 13:14:19 ERROR TaskContextImpl: Error in TaskCompletionListener
java.lang.IllegalStateException: state should be: open
    at com.mongodb.assertions.Assertions.isTrue(Assertions.java:70)
    at com.mongodb.connection.DefaultServer.getConnection(DefaultServer.java:70)
    at com.mongodb.binding.ClusterBinding$ClusterBindingConnectionSource.getConnection(ClusterBinding.java:86)
    at com.mongodb.operation.QueryBatchCursor.killCursor(QueryBatchCursor.java:263)
    at com.mongodb.operation.QueryBatchCursor.close(QueryBatchCursor.java:148)
    at com.mongodb.MongoBatchCursorAdapter.close(MongoBatchCursorAdapter.java:41)
    at com.mongodb.spark.rdd.MongoRDD$$anonfun$compute$1.apply(MongoRDD.scala:258)
    at com.mongodb.spark.rdd.MongoRDD$$anonfun$compute$1.apply(MongoRDD.scala:256)
    at org.apache.spark.TaskContextImpl$$anon$1.onTaskCompletion(TaskContextImpl.scala:60)
    at org.apache.spark.TaskContextImpl$$anonfun$markTaskCompleted$1.apply(TaskContextImpl.scala:79)
    at org.apache.spark.TaskContextImpl$$anonfun$markTaskCompleted$1.apply(TaskContextImpl.scala:77)
    at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
    at org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:77)
    at org.apache.spark.scheduler.Task.run(Task.scala:91)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
16/04/29 13:14:19 WARN TaskMemoryManager: leak 4.3 MB memory from org.apache.spark.unsafe.map.BytesToBytesMap@589f3649
16/04/29 13:14:19 ERROR Executor: Managed memory leak detected; size = 4456448 bytes, TID = 10
16/04/29 13:14:19 INFO connection: Closed connection [connectionId{localValue:4, serverValue:7592}] to 127.0.0.1:27017 because the pool has been closed.
16/04/29 13:14:19 ERROR Executor: Exception in task 2.0 in stage 2.0 (TID 10)
org.apache.spark.util.TaskCompletionListenerException: state should be: open
    at org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:91)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
16/04/29 13:14:19 ERROR TaskContextImpl: Error in TaskCompletionListener
java.lang.IllegalStateException: state should be: open
    at com.mongodb.assertions.Assertions.isTrue(Assertions.java:70)
    at com.mongodb.connection.DefaultServer.getConnection(DefaultServer.java:70)
    at com.mongodb.binding.ClusterBinding$ClusterBindingConnectionSource.getConnection(ClusterBinding.java:86)
    at com.mongodb.operation.QueryBatchCursor.killCursor(QueryBatchCursor.java:263)
    at com.mongodb.operation.QueryBatchCursor.close(QueryBatchCursor.java:148)
    at com.mongodb.MongoBatchCursorAdapter.close(MongoBatchCursorAdapter.java:41)
    at com.mongodb.spark.rdd.MongoRDD$$anonfun$compute$1.apply(MongoRDD.scala:258)
    at com.mongodb.spark.rdd.MongoRDD$$anonfun$compute$1.apply(MongoRDD.scala:256)
    at org.apache.spark.TaskContextImpl$$anon$1.onTaskCompletion(TaskContextImpl.scala:60)
    at org.apache.spark.TaskContextImpl$$anonfun$markTaskCompleted$1.apply(TaskContextImpl.scala:79)
    at org.apache.spark.TaskContextImpl$$anonfun$markTaskCompleted$1.apply(TaskContextImpl.scala:77)
    at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
    at org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:77)
    at org.apache.spark.scheduler.Task.run(Task.scala:91)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
16/04/29 13:14:19 WARN TaskMemoryManager: leak 4.3 MB memory from org.apache.spark.unsafe.map.BytesToBytesMap@da9b296
16/04/29 13:14:19 ERROR Executor: Managed memory leak detected; size = 4456448 bytes, TID = 8
16/04/29 13:14:19 ERROR Executor: Exception in task 0.0 in stage 2.0 (TID 8)
org.apache.spark.util.TaskCompletionListenerException: state should be: open
    at org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:91)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
16/04/29 13:14:19 INFO TaskSetManager: Starting task 4.0 in stage 2.0 (TID 12, localhost, partition 4,ANY, 2652 bytes)
16/04/29 13:14:19 INFO TaskSetManager: Starting task 5.0 in stage 2.0 (TID 13, localhost, partition 5,ANY, 2640 bytes)
16/04/29 13:14:19 INFO Executor: Running task 4.0 in stage 2.0 (TID 12)
16/04/29 13:14:19 INFO Executor: Running task 5.0 in stage 2.0 (TID 13)
16/04/29 13:14:19 WARN TaskSetManager: Lost task 2.0 in stage 2.0 (TID 10, localhost): org.apache.spark.util.TaskCompletionListenerException: state should be: open
    at org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:91)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
16/04/29 13:14:20 ERROR TaskSetManager: Task 2 in stage 2.0 failed 1 times; aborting job
16/04/29 13:14:20 INFO TaskSetManager: Lost task 0.0 in stage 2.0 (TID 8) on executor localhost: org.apache.spark.util.TaskCompletionListenerException (state should be: open) [duplicate 1]
16/04/29 13:14:20 INFO cluster: Cluster created with settings {hosts=[127.0.0.1:27017], mode=MULTIPLE, requiredClusterType=UNKNOWN, serverSelectionTimeout='30000 ms', maxWaitQueueSize=500}
16/04/29 13:14:20 INFO cluster: Adding discovered server 127.0.0.1:27017 to client view of cluster
16/04/29 13:14:20 INFO cluster: Cluster created with settings {hosts=[127.0.0.1:27017], mode=MULTIPLE, requiredClusterType=UNKNOWN, serverSelectionTimeout='30000 ms', maxWaitQueueSize=500}
16/04/29 13:14:20 INFO cluster: Cluster description not yet available. Waiting for 30000 ms before timing out
16/04/29 13:14:20 INFO cluster: Adding discovered server 127.0.0.1:27017 to client view of cluster
16/04/29 13:14:20 INFO connection: Opened connection [connectionId{localValue:6, serverValue:7596}] to 127.0.0.1:27017
16/04/29 13:14:20 INFO cluster: Monitor thread successfully connected to server with description ServerDescription{address=127.0.0.1:27017, type=STANDALONE, state=CONNECTED, ok=true, version=ServerVersion{versionList=[3, 2, 1]}, minWireVersion=0, maxWireVersion=4, maxDocumentSize=16777216, roundTripTimeNanos=345918}
16/04/29 13:14:20 INFO cluster: Discovered cluster type of STANDALONE
16/04/29 13:14:20 INFO connection: Opened connection [connectionId{localValue:7, serverValue:7597}] to 127.0.0.1:27017
16/04/29 13:14:20 INFO cluster: Monitor thread successfully connected to server with description ServerDescription{address=127.0.0.1:27017, type=STANDALONE, state=CONNECTED, ok=true, version=ServerVersion{versionList=[3, 2, 1]}, minWireVersion=0, maxWireVersion=4, maxDocumentSize=16777216, roundTripTimeNanos=348599}
16/04/29 13:14:20 INFO cluster: Discovered cluster type of STANDALONE
16/04/29 13:14:20 INFO MongoClientCache: Creating MongoClient: [127.0.0.1:27017]
16/04/29 13:14:20 INFO MongoClientCache: Creating MongoClient: [127.0.0.1:27017]
16/04/29 13:14:20 INFO TaskSchedulerImpl: Cancelling stage 2
16/04/29 13:14:20 INFO MongoClientCache: Closing MongoClient: [127.0.0.1:27017]
16/04/29 13:14:20 INFO connection: Opened connection [connectionId{localValue:8, serverValue:7598}] to 127.0.0.1:27017
16/04/29 13:14:20 INFO connection: Opened connection [connectionId{localValue:9, serverValue:7599}] to 127.0.0.1:27017
16/04/29 13:14:20 INFO Executor: Executor is trying to kill task 4.0 in stage 2.0 (TID 12)
16/04/29 13:14:20 INFO TaskSchedulerImpl: Stage 2 was cancelled
16/04/29 13:14:20 INFO Executor: Executor is trying to kill task 1.0 in stage 2.0 (TID 9)
16/04/29 13:14:20 INFO Executor: Executor is trying to kill task 5.0 in stage 2.0 (TID 13)
16/04/29 13:14:20 INFO Executor: Executor is trying to kill task 3.0 in stage 2.0 (TID 11)
16/04/29 13:14:20 INFO DAGScheduler: ShuffleMapStage 2 (count at NativeMethodAccessorImpl.java:-2) failed in 5.548 s
16/04/29 13:14:20 INFO connection: Closed connection [connectionId{localValue:2, serverValue:7590}] to 127.0.0.1:27017 because the pool has been closed.
16/04/29 13:14:20 ERROR TaskContextImpl: Error in TaskCompletionListener
java.lang.IllegalStateException: state should be: open
    at com.mongodb.assertions.Assertions.isTrue(Assertions.java:70)
    at com.mongodb.connection.DefaultServer.getConnection(DefaultServer.java:70)
    at com.mongodb.binding.ClusterBinding$ClusterBindingConnectionSource.getConnection(ClusterBinding.java:86)
    at com.mongodb.operation.QueryBatchCursor.killCursor(QueryBatchCursor.java:263)
    at com.mongodb.operation.QueryBatchCursor.close(QueryBatchCursor.java:148)
    at com.mongodb.MongoBatchCursorAdapter.close(MongoBatchCursorAdapter.java:41)
    at com.mongodb.spark.rdd.MongoRDD$$anonfun$compute$1.apply(MongoRDD.scala:258)
    at com.mongodb.spark.rdd.MongoRDD$$anonfun$compute$1.apply(MongoRDD.scala:256)
    at org.apache.spark.TaskContextImpl$$anon$1.onTaskCompletion(TaskContextImpl.scala:60)
    at org.apache.spark.TaskContextImpl$$anonfun$markTaskCompleted$1.apply(TaskContextImpl.scala:79)
    at org.apache.spark.TaskContextImpl$$anonfun$markTaskCompleted$1.apply(TaskContextImpl.scala:77)
    at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
    at org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:77)
    at org.apache.spark.scheduler.Task.run(Task.scala:91)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
16/04/29 13:14:20 INFO DAGScheduler: Job 1 failed: count at NativeMethodAccessorImpl.java:-2, took 5.577351 s
16/04/29 13:14:20 WARN TaskMemoryManager: leak 4.3 MB memory from org.apache.spark.unsafe.map.BytesToBytesMap@4a9b97ee
16/04/29 13:14:20 ERROR Executor: Managed memory leak detected; size = 4456448 bytes, TID = 9
16/04/29 13:14:20 ERROR Executor: Exception in task 1.0 in stage 2.0 (TID 9)
org.apache.spark.util.TaskCompletionListenerException: state should be: open
    at org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:91)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
16/04/29 13:14:20 INFO TaskSetManager: Lost task 1.0 in stage 2.0 (TID 9) on executor localhost: org.apache.spark.util.TaskCompletionListenerException (state should be: open) [duplicate 2]
Traceback (most recent call last):
  File "/Users/samweaver/Desktop/movies-spark.py", line 58, in <module>
    df.count()
  File "/Users/samweaver/Downloads/spark-1.6.1-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/sql/dataframe.py", line 269, in count
  File "/Users/samweaver/Downloads/spark-1.6.1-bin-hadoop2.6/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py", line 813, in __call__
  File "/Users/samweaver/Downloads/spark-1.6.1-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/sql/utils.py", line 45, in deco
  File "/Users/samweaver/Downloads/spark-1.6.1-bin-hadoop2.6/python/lib/py4j-0.9-src.zip/py4j/protocol.py", line 308, in get_return_value
py4j.protocol.Py4JJavaError
16/04/29 13:14:20 INFO connection: Closed connection [connectionId{localValue:5, serverValue:7593}] to 127.0.0.1:27017 because the pool has been closed.
16/04/29 13:14:20 ERROR TaskContextImpl: Error in TaskCompletionListener
java.lang.IllegalStateException: state should be: open
    at com.mongodb.assertions.Assertions.isTrue(Assertions.java:70)
    at com.mongodb.connection.DefaultServer.getConnection(DefaultServer.java:70)
    at com.mongodb.binding.ClusterBinding$ClusterBindingConnectionSource.getConnection(ClusterBinding.java:86)
    at com.mongodb.operation.QueryBatchCursor.killCursor(QueryBatchCursor.java:263)
    at com.mongodb.operation.QueryBatchCursor.close(QueryBatchCursor.java:148)
    at com.mongodb.MongoBatchCursorAdapter.close(MongoBatchCursorAdapter.java:41)
    at com.mongodb.spark.rdd.MongoRDD$$anonfun$compute$1.apply(MongoRDD.scala:258)
    at com.mongodb.spark.rdd.MongoRDD$$anonfun$compute$1.apply(MongoRDD.scala:256)
    at org.apache.spark.TaskContextImpl$$anon$1.onTaskCompletion(TaskContextImpl.scala:60)
    at org.apache.spark.TaskContextImpl$$anonfun$markTaskCompleted$1.apply(TaskContextImpl.scala:79)
    at org.apache.spark.TaskContextImpl$$anonfun$markTaskCompleted$1.apply(TaskContextImpl.scala:77)
    at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
    at org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:77)
    at org.apache.spark.scheduler.Task.run(Task.scala:91)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
16/04/29 13:14:20 WARN TaskMemoryManager: leak 4.3 MB memory from org.apache.spark.unsafe.map.BytesToBytesMap@13186b4c
16/04/29 13:14:20 ERROR Executor: Managed memory leak detected; size = 4456448 bytes, TID = 11
16/04/29 13:14:20 ERROR Executor: Exception in task 3.0 in stage 2.0 (TID 11)
org.apache.spark.util.TaskCompletionListenerException: state should be: open
    at org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:91)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
16/04/29 13:14:20 INFO TaskSetManager: Lost task 3.0 in stage 2.0 (TID 11) on executor localhost: org.apache.spark.util.TaskCompletionListenerException (state should be: open) [duplicate 3]
: An error occurred while calling o28.count.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 2 in stage 2.0 failed 1 times, most recent failure: Lost task 2.0 in stage 2.0 (TID 10, localhost): org.apache.spark.util.TaskCompletionListenerException: state should be: open
    at org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:91)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Driver stacktrace:
    at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1431)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1419)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1418)
    at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
    at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1418)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:799)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:799)
    at scala.Option.foreach(Option.scala:236)
    at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:799)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1640)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1599)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1588)
    at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
    at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:620)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1832)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1845)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1858)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1929)
    at org.apache.spark.rdd.RDD$$anonfun$collect$1.apply(RDD.scala:927)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:111)
    at org.apache.spark.rdd.RDD.withScope(RDD.scala:316)
    at org.apache.spark.rdd.RDD.collect(RDD.scala:926)
    at org.apache.spark.sql.execution.SparkPlan.executeCollect(SparkPlan.scala:166)
    at org.apache.spark.sql.execution.SparkPlan.executeCollectPublic(SparkPlan.scala:174)
    at org.apache.spark.sql.DataFrame$$anonfun$org$apache$spark$sql$DataFrame$$execute$1$1.apply(DataFrame.scala:1499)
    at org.apache.spark.sql.DataFrame$$anonfun$org$apache$spark$sql$DataFrame$$execute$1$1.apply(DataFrame.scala:1499)
    at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:56)
    at org.apache.spark.sql.DataFrame.withNewExecutionId(DataFrame.scala:2086)
    at org.apache.spark.sql.DataFrame.org$apache$spark$sql$DataFrame$$execute$1(DataFrame.scala:1498)
    at org.apache.spark.sql.DataFrame.org$apache$spark$sql$DataFrame$$collect(DataFrame.scala:1505)
    at org.apache.spark.sql.DataFrame$$anonfun$count$1.apply(DataFrame.scala:1515)
    at org.apache.spark.sql.DataFrame$$anonfun$count$1.apply(DataFrame.scala:1514)
    at org.apache.spark.sql.DataFrame.withCallback(DataFrame.scala:2099)
    at org.apache.spark.sql.DataFrame.count(DataFrame.scala:1514)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:231)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:381)
    at py4j.Gateway.invoke(Gateway.java:259)
    at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:133)
    at py4j.commands.CallCommand.execute(CallCommand.java:79)
    at py4j.GatewayConnection.run(GatewayConnection.java:209)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.spark.util.TaskCompletionListenerException: state should be: open
    at org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:91)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    ... 1 more
16/04/29 13:14:20 INFO MongoClientCache: Closing MongoClient: [127.0.0.1:27017]
16/04/29 13:14:20 INFO SparkContext: Invoking stop() from shutdown hook
16/04/29 13:14:20 INFO SparkUI: Stopped Spark web UI at http://10.4.126.113:4041
16/04/29 13:14:20 INFO connection: Closed connection [connectionId{localValue:9, serverValue:7599}] to 127.0.0.1:27017 because the pool has been closed.
16/04/29 13:14:20 ERROR TaskContextImpl: Error in TaskCompletionListener
java.lang.IllegalStateException: state should be: open
    at com.mongodb.assertions.Assertions.isTrue(Assertions.java:70)
    at com.mongodb.connection.DefaultServer.getConnection(DefaultServer.java:70)
    at com.mongodb.binding.ClusterBinding$ClusterBindingConnectionSource.getConnection(ClusterBinding.java:86)
    at com.mongodb.operation.QueryBatchCursor.killCursor(QueryBatchCursor.java:263)
    at com.mongodb.operation.QueryBatchCursor.close(QueryBatchCursor.java:148)
    at com.mongodb.MongoBatchCursorAdapter.close(MongoBatchCursorAdapter.java:41)
    at com.mongodb.spark.rdd.MongoRDD$$anonfun$compute$1.apply(MongoRDD.scala:258)
    at com.mongodb.spark.rdd.MongoRDD$$anonfun$compute$1.apply(MongoRDD.scala:256)
    at org.apache.spark.TaskContextImpl$$anon$1.onTaskCompletion(TaskContextImpl.scala:60)
    at org.apache.spark.TaskContextImpl$$anonfun$markTaskCompleted$1.apply(TaskContextImpl.scala:79)
    at org.apache.spark.TaskContextImpl$$anonfun$markTaskCompleted$1.apply(TaskContextImpl.scala:77)
    at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
    at org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:77)
    at org.apache.spark.scheduler.Task.run(Task.scala:91)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
16/04/29 13:14:20 WARN TaskMemoryManager: leak 4.3 MB memory from org.apache.spark.unsafe.map.BytesToBytesMap@7a4b56ab
16/04/29 13:14:20 ERROR Executor: Managed memory leak detected; size = 4456448 bytes, TID = 13
16/04/29 13:14:20 ERROR Executor: Exception in task 5.0 in stage 2.0 (TID 13)
org.apache.spark.util.TaskCompletionListenerException: state should be: open
    at org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:91)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
16/04/29 13:14:20 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
16/04/29 13:14:20 INFO MemoryStore: MemoryStore cleared
16/04/29 13:14:20 INFO BlockManager: BlockManager stopped
16/04/29 13:14:20 INFO BlockManagerMaster: BlockManagerMaster stopped
16/04/29 13:14:20 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
16/04/29 13:14:20 INFO SparkContext: Successfully stopped SparkContext
16/04/29 13:14:20 INFO ShutdownHookManager: Shutdown hook called
16/04/29 13:14:20 INFO connection: Closed connection [connectionId{localValue:8, serverValue:7598}] to 127.0.0.1:27017 because the pool has been closed.
16/04/29 13:14:20 ERROR TaskContextImpl: Error in TaskCompletionListener
java.lang.IllegalStateException: state should be: open
    at com.mongodb.assertions.Assertions.isTrue(Assertions.java:70)
    at com.mongodb.connection.DefaultServer.getConnection(DefaultServer.java:70)
    at com.mongodb.binding.ClusterBinding$ClusterBindingConnectionSource.getConnection(ClusterBinding.java:86)
    at com.mongodb.operation.QueryBatchCursor.killCursor(QueryBatchCursor.java:263)
    at com.mongodb.operation.QueryBatchCursor.close(QueryBatchCursor.java:148)
    at com.mongodb.MongoBatchCursorAdapter.close(MongoBatchCursorAdapter.java:41)
    at com.mongodb.spark.rdd.MongoRDD$$anonfun$compute$1.apply(MongoRDD.scala:258)
    at com.mongodb.spark.rdd.MongoRDD$$anonfun$compute$1.apply(MongoRDD.scala:256)
    at org.apache.spark.TaskContextImpl$$anon$1.onTaskCompletion(TaskContextImpl.scala:60)
    at org.apache.spark.TaskContextImpl$$anonfun$markTaskCompleted$1.apply(TaskContextImpl.scala:79)
    at org.apache.spark.TaskContextImpl$$anonfun$markTaskCompleted$1.apply(TaskContextImpl.scala:77)
    at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
    at org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:77)
    at org.apache.spark.scheduler.Task.run(Task.scala:91)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
16/04/29 13:14:20 WARN TaskMemoryManager: leak 4.3 MB memory from org.apache.spark.unsafe.map.BytesToBytesMap@5768f06b
16/04/29 13:14:20 ERROR Executor: Managed memory leak detected; size = 4456448 bytes, TID = 12
16/04/29 13:14:20 ERROR Executor: Exception in task 4.0 in stage 2.0 (TID 12)
org.apache.spark.util.TaskCompletionListenerException: state should be: open
    at org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:91)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Exception in thread "Executor task launch worker-3" java.lang.IllegalStateException: RpcEnv already stopped.
    at org.apache.spark.rpc.netty.Dispatcher.postMessage(Dispatcher.scala:159)
    at org.apache.spark.rpc.netty.Dispatcher.postOneWayMessage(Dispatcher.scala:131)
    at org.apache.spark.rpc.netty.NettyRpcEnv.send(NettyRpcEnv.scala:192)
    at org.apache.spark.rpc.netty.NettyRpcEndpointRef.send(NettyRpcEnv.scala:516)
    at org.apache.spark.scheduler.local.LocalBackend.statusUpdate(LocalBackend.scala:151)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:317)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
16/04/29 13:14:20 INFO ShutdownHookManager: Deleting directory /private/var/folders/d5/b5c42jg54t98npm2f1dndrp80000gp/T/spark-08656be4-1592-44f5-a509-c02bdef10271
16/04/29 13:14:20 INFO ShutdownHookManager: Deleting directory /private/var/folders/d5/b5c42jg54t98npm2f1dndrp80000gp/T/spark-08656be4-1592-44f5-a509-c02bdef10271/httpd-3bd90f28-2ffd-43e0-bffe-6bb2107b3bc1
16/04/29 13:14:20 INFO ShutdownHookManager: Deleting directory /private/var/folders/d5/b5c42jg54t98npm2f1dndrp80000gp/T/spark-08656be4-1592-44f5-a509-c02bdef10271/pyspark-693e6ff0-eb3c-4b64-87a3-529d2002122f
16/04/29 13:14:20 INFO RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
{code}

    JIRA | 10 months ago | Sam Weaver
    org.apache.spark.SparkException: Job aborted due to stage failure: Task 2 in stage 2.0 failed 1 times, most recent failure: Lost task 2.0 in stage 2.0 (TID 10, localhost): org.apache.spark.util.TaskCompletionListenerException: state should be: open
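    A note on this report: the log shows MongoClientCache closing the cached MongoClient just before each task's completion listener tries to close its MongoDB cursor, so the cursor cleanup finds a closed connection pool ("state should be: open"). Besides upgrading the connector, a workaround sometimes suggested is to lengthen the client keep-alive. The sketch below is untested and assumes the early connector behaviour where spark.mongodb.keep_alive_ms is read as a JVM system property (default 5000 ms), not as a regular Spark configuration key:

{code:python}
# Hedged workaround sketch: keep the cached MongoClient alive long enough
# for task-completion listeners to close their cursors cleanly.
#
# spark.mongodb.keep_alive_ms is assumed here to be a JVM *system* property,
# so it must reach the JVM as a -D option rather than via SparkConf.set(),
# for example:
#
#   spark-submit --driver-java-options "-Dspark.mongodb.keep_alive_ms=3600000" \
#       movies-spark.py
#
from pyspark import SparkConf, SparkContext
from pyspark.sql import SQLContext

conf = SparkConf().setAppName("MovieRatings").set("spark.executor.memory", "4g")
sc = SparkContext(conf=conf)
sqlContext = SQLContext(sc)

df = sqlContext.read.format("com.mongodb.spark.sql").load()
print(df.count())
{code}

    In local mode, as in this report, the driver and executor share one JVM, so the driver-side -D option should be enough; on a real cluster the property would also need to reach the executors, for example through spark.executor.extraJavaOptions.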
  2.

    NetworkClient: Invalid target URI POST@null/pharmadata/test/_search

    GitHub | 10 months ago | anand-singh
    org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, localhost): org.apache.spark.util.TaskCompletionListenerException: Connection error (check network and/or proxy settings)- all nodes failed; tried [[host:port]]
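    For this elasticsearch-hadoop failure, "POST@null/pharmadata/test/_search" and "all nodes failed; tried [[host:port]]" suggest the connector never received a usable node address. A minimal sketch, assuming a local Elasticsearch on the default port (the host, port, and RDD setup are illustrative, only the index/type name comes from the report):

{code:python}
# Hedged sketch: give elasticsearch-hadoop an explicit node address so it
# does not fall back to a placeholder ("null") target URI.
from pyspark import SparkConf, SparkContext

conf = SparkConf().setAppName("EsRead")
sc = SparkContext(conf=conf)

rdd = sc.newAPIHadoopRDD(
    inputFormatClass="org.elasticsearch.hadoop.mr.EsInputFormat",
    keyClass="org.apache.hadoop.io.NullWritable",
    valueClass="org.elasticsearch.hadoop.mr.LinkedMapWritable",
    conf={
        "es.nodes": "localhost",           # real ES host(s), comma separated
        "es.port": "9200",
        "es.resource": "pharmadata/test",  # index/type from the report
    },
)
print(rdd.first())
{code}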
  3.

    Re: pyspark not working

    incubator-zeppelin-users | 2 years ago | prateek arora
    org.apache.spark.SparkException: Error from python worker: /usr/bin/python: No module named pyspark PYTHONPATH was: /yarn/nm/usercache/ubuntu/filecache/80/zeppelin-spark-0.5.0-incubating-SNAPSHOT.jar:/usr/local/spark-1.3.1-bin-hadoop2.6/python:/usr/local/spark-1.3.1-bin-hadoop2.6/python/lib/py4j-0.8.2.1-src.zip java.io.EOFException at java.io.DataInputStream.readInt(DataInputStream.java:392) at org.apache.spark.api.python.PythonWorkerFactory.startDaemon(PythonWorkerFactory.scala:163)
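    Here the executor-side Python workers start without pyspark on their PYTHONPATH. One hedged approach, assuming the same Spark install path on every node (the paths below are taken from the report and may differ in your setup), is to propagate PYTHONPATH to the executors explicitly:

{code:python}
# Hedged sketch: point executor Python workers at the pyspark and py4j sources.
from pyspark import SparkConf, SparkContext

spark_home = "/usr/local/spark-1.3.1-bin-hadoop2.6"
pythonpath = ":".join([
    spark_home + "/python",
    spark_home + "/python/lib/py4j-0.8.2.1-src.zip",
])

conf = (
    SparkConf()
    .setAppName("PysparkOnYarn")
    # spark.executorEnv.PYTHONPATH is forwarded to the Python worker processes.
    .setExecutorEnv("PYTHONPATH", pythonpath)
)
sc = SparkContext(conf=conf)
print(sc.parallelize(range(10)).sum())
{code}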
  4.

    Py4Java: ImportError: No module named numpy when running Python shell for Apache Spark

    Stack Overflow | 2 years ago
    org.apache.spark.SparkException: Job aborted due to stage failure: Task 3 in stage 0.0 failed 1 times, most recent failure: Lost task 3.0 in stage 0.0 (TID 3, localhost): org.apache.spark.api.python.PythonException: Traceback (most recent call last): File "/Users/m/workspace/spark-1.2.0-bin-hadoop2.4/python/pyspark/worker.py", line 90, in main command = pickleSer._read_with_length(infile) File "/Users/m/workspace/spark-1.2.0-bin-hadoop2.4/python/pyspark/serializers.py", line 151, in _read_with_length return self.loads(obj) File "/Users/m/workspace/spark-1.2.0-bin-hadoop2.4/python/pyspark/serializers.py", line 396, in loads return cPickle.loads(obj) File "/Users/m/workspace/spark-1.2.0-bin-hadoop2.4/python/pyspark/mllib/__init__.py", line 24, in <module> import numpy ImportError: No module named numpy
  5.

    Why does Apache PySpark top() fail when the RDD contains a user defined class?

    Stack Overflow | 2 years ago | user3279453
    org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in stage 49.0 failed 1 times, most recent failure: Lost task 1.0 in stage 49.0 (TID 99, localhost): org.apache.spark.api.python.PythonException: Traceback (most recent call last): File "C:\Programs\Apache\Spark\spark-1.2.0-bin-hadoop2.4\python\pyspark\worker.py", line 107, in main process() File "C:\Programs\Apache\Spark\spark-1.2.0-bin-hadoop2.4\python\pyspark\worker.py", line 98, in process serializer.dump_stream(func(split_index, iterator), outfile) File "C:\Programs\Apache\Spark\spark-1.2.0-bin-hadoop2.4\python\pyspark\serializers.py", line 231, in dump_stream bytes = self.serializer.dumps(vs) File "C:\Programs\Apache\Spark\spark-1.2.0-bin-hadoop2.4\python\pyspark\serializers.py", line 393, in dumps return cPickle.dumps(obj, 2) PicklingError: Can't pickle <class '__main__.TestClass'>: attribute lookup __main__.TestClass failed
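    The PicklingError in this last one is the classic __main__ problem: cPickle refuses to pickle a class defined in the driver script itself, because the workers cannot re-import __main__.TestClass. The usual remedy is to move the class into its own module and ship that module with sc.addPyFile. A sketch (testclass.py is a hypothetical module name):

{code:python}
# testclass.py -- the class must live in an importable module, not __main__:
# class TestClass(object):
#     def __init__(self, value):
#         self.value = value

# driver script
from pyspark import SparkContext
from testclass import TestClass

sc = SparkContext(appName="TopWithUserClass")
sc.addPyFile("testclass.py")  # make the module importable on the executors

rdd = sc.parallelize([TestClass(i) for i in range(10)])
# top() pickles TestClass instances; pickling works once the class lives in
# a shipped module instead of __main__.
print(rdd.top(3, key=lambda t: t.value))
{code}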


Root Cause Analysis

  1. org.apache.spark.SparkException

    Job aborted due to stage failure: Task 2 in stage 2.0 failed 1 times, most recent failure: Lost task 2.0 in stage 2.0 (TID 10, localhost): org.apache.spark.util.TaskCompletionListenerException: state should be: open

    at org.apache.spark.TaskContextImpl.markTaskCompleted()
  2. Spark
    Executor$TaskRunner.run
    1. org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:87)
    2. org.apache.spark.scheduler.Task.run(Task.scala:91)
    3. org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
    3 frames
  3. Java RT
    Thread.run
    1. java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    2. java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    3. java.lang.Thread.run(Thread.java:745)
    3 frames