org.apache.spark.api.python.PythonException

Traceback (most recent call last):
  File "/data/3/tmp/hadoop-hadoop/nm-local-dir/usercache/user/appcache/application_1468851295159_0020/container_1468851295159_0020_01_000016/pyspark.zip/pyspark/worker.py", line 111, in main
    process()
  File "/data/3/tmp/hadoop-hadoop/nm-local-dir/usercache/user/appcache/application_1468851295159_0020/container_1468851295159_0020_01_000016/pyspark.zip/pyspark/worker.py", line 106, in process
    serializer.dump_stream(func(split_index, iterator), outfile)
  File "/data/3/tmp/hadoop-hadoop/nm-local-dir/usercache/user/appcache/application_1468851295159_0020/container_1468851295159_0020_01_000016/pyspark.zip/pyspark/serializers.py", line 263, in dump_stream
    vs = list(itertools.islice(iterator, batch))
  File "/usr/local/spark/python/pyspark/rdd.py", line 1898, in <lambda>
IndexError: list index out of range

Solutions on the web

  • Traceback (most recent call last): File "/hadoop/yarn/nm-local-dir/usercache/root/appcache/application_1453072376140_0005/container_1453072376140_0005_01_000002/pyspark.zip/pyspark/worker.py", line 111, in main process() File "/hadoop/yarn
  • via GitHub by dennishuo, 8 months ago
    Traceback (most recent call last): File "/hadoop/yarn/nm-local-dir/usercache/root/appcache/application_1452810606380_0004/container_1452810606380_0004_01_000002/pyspark.zip/pyspark/worker.py", line 111, in main process() File
Stack trace

org.apache.spark.api.python.PythonException: Traceback (most recent call last):
  File "/data/3/tmp/hadoop-hadoop/nm-local-dir/usercache/user/appcache/application_1468851295159_0020/container_1468851295159_0020_01_000016/pyspark.zip/pyspark/worker.py", line 111, in main
    process()
  File "/data/3/tmp/hadoop-hadoop/nm-local-dir/usercache/user/appcache/application_1468851295159_0020/container_1468851295159_0020_01_000016/pyspark.zip/pyspark/worker.py", line 106, in process
    serializer.dump_stream(func(split_index, iterator), outfile)
  File "/data/3/tmp/hadoop-hadoop/nm-local-dir/usercache/user/appcache/application_1468851295159_0020/container_1468851295159_0020_01_000016/pyspark.zip/pyspark/serializers.py", line 263, in dump_stream
    vs = list(itertools.islice(iterator, batch))
  File "/usr/local/spark/python/pyspark/rdd.py", line 1898, in <lambda>
IndexError: list index out of range
    at org.apache.spark.api.python.PythonRunner$$anon$1.read(PythonRDD.scala:166)
    at org.apache.spark.api.python.PythonRunner$$anon$1.next(PythonRDD.scala:129)
    at org.apache.spark.api.python.PythonRunner$$anon$1.next(PythonRDD.scala:125)
    at org.apache.spark.InterruptibleIterator.next(InterruptibleIterator.scala:43)
    at scala.collection.Iterator$class.foreach(Iterator.scala:727)
    at org.apache.spark.InterruptibleIterator.foreach(InterruptibleIterator.scala:28)
    at org.apache.spark.api.python.PythonRDD$.writeIteratorToStream(PythonRDD.scala:452)
    at org.apache.spark.api.python.PythonRunner$WriterThread$$anonfun$run$3.apply(PythonRDD.scala:280)
    at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1765)
    at org.apache.spark.api.python.PythonRunner$WriterThread.run(PythonRDD.scala:239)
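
The last Python frame in the trace is a lambda inside pyspark's own rdd.py that raises IndexError: list index out of range. The trace does not show which operation or which record was involved, so the following is only a sketch under an assumption: that some record is a list with fewer elements than a pair-oriented lambda expects. It uses plain Python lists rather than Spark, and the names records, second_field and is_pair are hypothetical.

```python
# Hypothetical records (assumption, not taken from the trace):
# the second one has a single element, so it is not a (key, value) pair.
records = [["a", 1], ["b"], ["c", 3]]


def second_field(record):
    # Stands in for a pair-oriented lambda such as `lambda kv: kv[1]`.
    return record[1]


def is_pair(record):
    # Guard: only records with at least two elements can be indexed at [1].
    return len(record) >= 2


# Without the guard, the short record triggers the same error text as in the trace.
try:
    [second_field(r) for r in records]
    error_message = None
except IndexError as exc:
    error_message = str(exc)

# With the guard, malformed records are skipped before indexing.
safe_values = [second_field(r) for r in records if is_pair(r)]

print(error_message)
print(safe_values)
```

If the assumption holds for a given job, the same predicate could be passed to rdd.filter(...) ahead of the pair operation, or the short records could be logged first to find where they originate.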
