org.apache.spark.api.python.PythonException: Traceback (most recent call last):
  File "/tmp/hadoop/yarn/local/usercache/root/filecache/23/spark-assembly-1.0.1.2.1.3.0-563-hadoop2.4.0.2.1.3.0-563.jar/pyspark/worker.py", line 77, in main
    serializer.dump_stream(func(split_index, iterator), outfile)
  File "/tmp/hadoop/yarn/local/usercache/root/filecache/23/spark-assembly-1.0.1.2.1.3.0-563-hadoop2.4.0.2.1.3.0-563.jar/pyspark/serializers.py", line 191, in dump_stream
    self.serializer.dump_stream(self._batched(iterator), stream)
  File "/tmp/hadoop/yarn/local/usercache/root/filecache/23/spark-assembly-1.0.1.2.1.3.0-563-hadoop2.4.0.2.1.3.0-563.jar/pyspark/serializers.py", line 123, in dump_stream
    for obj in iterator:
  File "/tmp/hadoop/yarn/local/usercache/root/filecache/23/spark-assembly-1.0.1.2.1.3.0-563-hadoop2.4.0.2.1.3.0-563.jar/pyspark/serializers.py", line 180, in _batched
    for item in iterator:
  File "/root/spark-1.0.1.2.1.3.0-563-bin-2.4.0.2.1.3.0-563/python/pyspark/rdd.py", line 612, in func
  File "/root/spark-1.0.1.2.1.3.0-563-bin-2.4.0.2.1.3.0-563/examples/src/main/python/pi.py", line 36, in f
SystemError: unknown opcode

spark-user | Andrew Or | 3 years ago
  1. Re: pyspark yarn got exception

    spark-user | 3 years ago | Andrew Or
  2. Apache-Spark load files from HDFS

    Stack Overflow | 2 years ago | Ruofan Kong
    org.apache.spark.api.python.PythonException: Traceback (most recent call last):
      File "/home/ying/AWS_Tutorial/spark-1.4.0/python/lib/pyspark.zip/pyspark/worker.py", line 111, in main
        process()
      File "/home/ying/AWS_Tutorial/spark-1.4.0/python/lib/pyspark.zip/pyspark/worker.py", line 106, in process
        serializer.dump_stream(func(split_index, iterator), outfile)
      File "/home/ying/AWS_Tutorial/spark-1.4.0/python/pyspark/rdd.py", line 2318, in pipeline_func
        return func(split, prev_func(split, iterator))
      File "/home/ying/AWS_Tutorial/spark-1.4.0/python/pyspark/rdd.py", line 2318, in pipeline_func
        return func(split, prev_func(split, iterator))
      File "/home/ying/AWS_Tutorial/spark-1.4.0/python/pyspark/rdd.py", line 2318, in pipeline_func
        return func(split, prev_func(split, iterator))
      File "/home/ying/AWS_Tutorial/spark-1.4.0/python/pyspark/rdd.py", line 304, in func
        return f(iterator)
      File "/home/ying/AWS_Tutorial/spark-1.4.0/python/pyspark/rdd.py", line 719, in processPartition
        f(x)
      File "/home/ying/AWS_Tutorial/spark_codes/sum.py", line 41, in <lambda>
        temp = datafile.foreach(lambda (path, content): myfunc(str(path).strip('file:')))
      File "/home/ying/AWS_Tutorial/spark_codes/sum.py", line 26, in myfunc
        cr = csv.reader(open(s,"rb"))
    IOError: [Errno 2] No such file or directory: 'hdfs://localhost:9000/data/test1.csv'
  3. Add date field to RDD in Spark

    Stack Overflow | 2 years ago
    org.apache.spark.api.python.PythonException: Traceback (most recent call last):
      File "/home/terrapin/Spark_Hadoop/spark-1.1.1-bin-cdh4/python/pyspark/worker.py", line 79, in main
        serializer.dump_stream(func(split_index, iterator), outfile)
      File "/home/terrapin/Spark_Hadoop/spark-1.1.1-bin-cdh4/python/pyspark/serializers.py", line 196, in dump_stream
        self.serializer.dump_stream(self._batched(iterator), stream)
      File "/home/terrapin/Spark_Hadoop/spark-1.1.1-bin-cdh4/python/pyspark/serializers.py", line 127, in dump_stream
        for obj in iterator:
      File "/home/terrapin/Spark_Hadoop/spark-1.1.1-bin-cdh4/python/pyspark/serializers.py", line 185, in _batched
        for item in iterator:
      File "/home/terrapin/Spark_Hadoop/spark-1.1.1-bin-cdh4/python/pyspark/rdd.py", line 1147, in takeUpToNumLeft
        yield next(iterator)
      File "/home/terrapin/Spark_Hadoop/spark-1.1.1-bin-cdh4/test3.py", line 72, in parsedate
        dt=dateutil.parser.parse("01 Jan 1900 00:00:00").date()
    AttributeError: 'module' object has no attribute 'parser'
  5. spark Task Execution error. No Such File or Directory

    Stack Overflow | 2 years ago
    org.apache.spark.api.python.PythonException: Traceback (most recent call last):
      File "/home/hadoop/spark/python/pyspark/worker.py", line 90, in main
        command = pickleSer.loads(command.value)
      File "/home/hadoop/spark/python/pyspark/broadcast.py", line 106, in value
        self._value = self.load(self._path)
      File "/home/hadoop/spark/python/pyspark/broadcast.py", line 87, in load
        with open(path, 'rb', 1 << 20) as f:
    IOError: [Errno 2] No such file or directory: '/mnt/spark/spark-aab8afa8-42e8-451f-88ac-8981e8dea00a/pyspark-1258404b-a23d-49e1-90f2-76a59c96a8ac/tmpN3K1iO'
  6. PySpark Job throwing IOError

    Stack Overflow | 2 years ago
    org.apache.spark.api.python.PythonException: Traceback (most recent call last):
      File "/home/hadoop/spark/python/pyspark/worker.py", line 90, in main
        command = pickleSer.loads(command.value)
      File "/home/hadoop/spark/python/pyspark/broadcast.py", line 106, in value
        self._value = self.load(self._path)
      File "/home/hadoop/spark/python/pyspark/broadcast.py", line 87, in load
        with open(path, 'rb', 1 << 20) as f:
    IOError: [Errno 2] No such file or directory: '/mnt/spark/spark-ea646b94-3f68-47a5-8e1c-b23ac0799718/pyspark-7d842875-fae2-4563-b367-92d89d292b60/tmpjlEm1f'
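
In the "Apache-Spark load files from HDFS" result above, the traceback ends with `csv.reader(open(s,"rb"))` being given an `hdfs://` URI. Python's built-in `open` only reads the local filesystem, which is consistent with the `IOError`. The lambda in that traceback already receives `(path, content)` pairs, so one possible workaround is to parse `content` directly rather than reopening the path. The following is a minimal sketch in Python 3 syntax, not the original poster's code; `parse_csv_content` is a hypothetical helper name, and the assumption that the pairs come from something like `wholeTextFiles` is not confirmed by the trace.

```python
import csv
import io


def parse_csv_content(content):
    """Parse CSV text that is already in memory and return a list of rows."""
    # io.StringIO wraps the string in a file-like object, so csv.reader
    # never touches the filesystem (local or HDFS).
    return list(csv.reader(io.StringIO(content)))


# Hypothetical usage inside a Spark job (assumes `datafile` is an RDD of
# (path, content) pairs; this part is an illustration, not tested here):
#   rows = datafile.flatMap(lambda pair: parse_csv_content(pair[1]))
```

Parsing the in-memory string keeps the work on the executors and avoids any dependency on an HDFS client library inside the Python function.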

    Root Cause Analysis

    1. org.apache.spark.api.python.PythonException

      Traceback (most recent call last):
        File "/tmp/hadoop/yarn/local/usercache/root/filecache/23/spark-assembly-1.0.1.2.1.3.0-563-hadoop2.4.0.2.1.3.0-563.jar/pyspark/worker.py", line 77, in main
          serializer.dump_stream(func(split_index, iterator), outfile)
        File "/tmp/hadoop/yarn/local/usercache/root/filecache/23/spark-assembly-1.0.1.2.1.3.0-563-hadoop2.4.0.2.1.3.0-563.jar/pyspark/serializers.py", line 191, in dump_stream
          self.serializer.dump_stream(self._batched(iterator), stream)
        File "/tmp/hadoop/yarn/local/usercache/root/filecache/23/spark-assembly-1.0.1.2.1.3.0-563-hadoop2.4.0.2.1.3.0-563.jar/pyspark/serializers.py", line 123, in dump_stream
          for obj in iterator:
        File "/tmp/hadoop/yarn/local/usercache/root/filecache/23/spark-assembly-1.0.1.2.1.3.0-563-hadoop2.4.0.2.1.3.0-563.jar/pyspark/serializers.py", line 180, in _batched
          for item in iterator:
        File "/root/spark-1.0.1.2.1.3.0-563-bin-2.4.0.2.1.3.0-563/python/pyspark/rdd.py", line 612, in func
        File "/root/spark-1.0.1.2.1.3.0-563-bin-2.4.0.2.1.3.0-563/examples/src/main/python/pi.py", line 36, in f
      SystemError: unknown opcode

      at org.apache.spark.api.python.PythonRDD$$anon$1.read()
    2. Spark
      Executor$TaskRunner.run
      1. org.apache.spark.api.python.PythonRDD$$anon$1.read(PythonRDD.scala:115)
      2. org.apache.spark.api.python.PythonRDD$$anon$1.<init>(PythonRDD.scala:145)
      3. org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:78)
      4. org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
      5. org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
      6. org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:111)
      7. org.apache.spark.scheduler.Task.run(Task.scala:51)
      8. org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:183)
      8 frames
    3. Java RT
      Thread.run
      1. java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
      2. java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
      3. java.lang.Thread.run(Thread.java:744)
      3 frames
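
`SystemError: unknown opcode` is raised when the CPython interpreter meets bytecode it does not recognise. One plausible cause here, which the trace alone does not confirm, is that the driver serialized the function under a different Python version than the one running on the YARN workers. Below is a minimal sketch for comparing the two. The helper names `python_version` and `versions_match` and the commented Spark usage are illustrative assumptions, not part of the original report.

```python
import sys


def python_version():
    """Return the (major, minor) version of the interpreter running this code."""
    return tuple(sys.version_info[:2])


def versions_match(driver_version, worker_versions):
    """Return True if every worker reports the same (major, minor) as the driver."""
    return all(tuple(v) == tuple(driver_version) for v in worker_versions)


# Hypothetical usage with an existing SparkContext `sc` (an assumption; if the
# interpreters really differ, this probe may itself fail with the same error,
# which would also point at a version mismatch):
#   worker_versions = (sc.parallelize(range(100), 10)
#                        .map(lambda _: python_version())
#                        .distinct()
#                        .collect())
#   print(versions_match(python_version(), worker_versions))
```

If the versions differ, pointing the driver and the workers at the same interpreter (for example through the `PYSPARK_PYTHON` environment variable) is the usual next step to try.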