org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 48.0 failed 1 times, most recent failure: Lost task 0.0 in stage 48.0 (TID 167, localhost): org.apache.spark.api.python.PythonException: Traceback (most recent call last):
  File "/home/newuser/spark/python/lib/pyspark.zip/pyspark/worker.py", line 111, in main
    process()
  File "/home/newuser/spark/python/lib/pyspark.zip/pyspark/worker.py", line 106, in process
    serializer.dump_stream(func(split_index, iterator), outfile)
  File "/home/newuser/spark/python/lib/pyspark.zip/pyspark/serializers.py", line 263, in dump_stream
    vs = list(itertools.islice(iterator, batch))
  File "/home/newuser/spark/python/pyspark/rdd.py", line 797, in func
    yield reduce(f, iterator, initial)
  File "<ipython-input-46-9557bb70b499>", line 1, in <lambda>
TypeError: 'int' object has no attribute '__getitem__'

Stack Overflow | Wanderer | 7 months ago
  1. In Pyspark how to add all values in a list?

     Stack Overflow | 7 months ago | Wanderer
     Same stack trace as the one at the top of this page (TypeError: 'int' object has no attribute '__getitem__').
  2. pyspark json not working

     Stack Overflow | 1 year ago | user2065276
     org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, localhost): org.apache.spark.api.python.PythonException: Traceback (most recent call last):
       File "/usr/local/spark/python/lib/pyspark.zip/pyspark/worker.py", line 111, in main
         process()
       File "/usr/local/spark/python/lib/pyspark.zip/pyspark/worker.py", line 106, in process
         serializer.dump_stream(func(split_index, iterator), outfile)
       File "/usr/local/spark/python/lib/pyspark.zip/pyspark/serializers.py", line 263, in dump_stream
         vs = list(itertools.islice(iterator, batch))
       File "/usr/local/spark/python/lib/pyspark.zip/pyspark/rdd.py", line 1295, in takeUpToNumLeft
       File "/usr/lib/python2.7/json/__init__.py", line 338, in loads
         return _default_decoder.decode(s)
       File "/usr/lib/python2.7/json/decoder.py", line 366, in decode
         obj, end = self.raw_decode(s, idx=_w(s, 0).end())
       File "/usr/lib/python2.7/json/decoder.py", line 384, in raw_decode
         raise ValueError("No JSON object could be decoded")
     ValueError: No JSON object could be decoded
  3. How to modify numpy arrays in Spark dataframe?

     Stack Overflow | 8 months ago | alfredox
     org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 25.0 failed 1 times, most recent failure: Lost task 0.0 in stage 25.0 (TID 30, localhost): org.apache.spark.api.python.PythonException: Traceback (most recent call last):
       File "/databricks/spark/python/pyspark/worker.py", line 111, in main
         process()
       File "/databricks/spark/python/pyspark/worker.py", line 106, in process
         serializer.dump_stream(func(split_index, iterator), outfile)
       File "/databricks/spark/python/pyspark/serializers.py", line 263, in dump_stream
         vs = list(itertools.islice(iterator, batch))
       File "/databricks/spark/python/pyspark/rdd.py", line 1295, in takeUpToNumLeft
         yield next(iterator)
       File "<ipython-input-46-4a4c467a0b3d>", line 13, in <lambda>
     IndexError: invalid index to scalar variable.
  4. Re: pyspark not working

     incubator-zeppelin-users | 2 years ago | prateek arora
     org.apache.spark.SparkException: Error from python worker:
       /usr/bin/python: No module named pyspark
     PYTHONPATH was:
       /yarn/nm/usercache/ubuntu/filecache/80/zeppelin-spark-0.5.0-incubating-SNAPSHOT.jar:/usr/local/spark-1.3.1-bin-hadoop2.6/python:/usr/local/spark-1.3.1-bin-hadoop2.6/python/lib/py4j-0.8.2.1-src.zip
     java.io.EOFException
       at java.io.DataInputStream.readInt(DataInputStream.java:392)
       at org.apache.spark.api.python.PythonWorkerFactory.startDaemon(PythonWorkerFactory.scala:163)

  5. Py4Java: ImportError: No module named numpy when running Python shell for Apache Spark

     Stack Overflow | 2 years ago
     org.apache.spark.SparkException: Job aborted due to stage failure: Task 3 in stage 0.0 failed 1 times, most recent failure: Lost task 3.0 in stage 0.0 (TID 3, localhost): org.apache.spark.api.python.PythonException: Traceback (most recent call last):
       File "/Users/m/workspace/spark-1.2.0-bin-hadoop2.4/python/pyspark/worker.py", line 90, in main
         command = pickleSer._read_with_length(infile)
       File "/Users/m/workspace/spark-1.2.0-bin-hadoop2.4/python/pyspark/serializers.py", line 151, in _read_with_length
         return self.loads(obj)
       File "/Users/m/workspace/spark-1.2.0-bin-hadoop2.4/python/pyspark/serializers.py", line 396, in loads
         return cPickle.loads(obj)
       File "/Users/m/workspace/spark-1.2.0-bin-hadoop2.4/python/pyspark/mllib/__init__.py", line 24, in <module>
         import numpy
     ImportError: No module named numpy
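For the "pyspark json not working" result above, the ValueError usually means non-JSON input (blank lines, a header line, or a multi-line document) was fed to json.loads inside a map over text lines. A minimal sketch of the tolerant-parse pattern, in plain Python so it runs without Spark; the helper name parse_json_line is my own, and the PySpark usage in the trailing comment is an assumption about how it would typically be wired in:

```python
import json

def parse_json_line(line):
    """Return the decoded object, or None for lines that are not valid JSON."""
    line = line.strip()
    if not line:
        return None
    try:
        return json.loads(line)
    except ValueError:  # JSONDecodeError subclasses ValueError
        return None

lines = ['{"a": 1}', '', 'not json', '{"b": 2}']
records = [r for r in (parse_json_line(l) for l in lines) if r is not None]
print(records)  # [{'a': 1}, {'b': 2}]

# In PySpark this would typically look like:
#   sc.textFile(path).map(parse_json_line).filter(lambda r: r is not None)
```

Catching ValueError rather than a bare except keeps genuine bugs (e.g. passing a non-string) visible while skipping malformed lines.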
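The two ImportError results above ("No module named pyspark", "No module named numpy") are environment problems rather than code bugs: the Python interpreter the workers launch cannot see the modules. A hedged config sketch; the paths below are examples taken from the traces above and must be adjusted to the actual install:

```shell
# Make the pyspark package and the bundled py4j visible to worker Pythons.
# (The py4j zip version here matches the PYTHONPATH shown in the trace above.)
export SPARK_HOME=/usr/local/spark-1.3.1-bin-hadoop2.6
export PYTHONPATH="$SPARK_HOME/python:$SPARK_HOME/python/lib/py4j-0.8.2.1-src.zip:$PYTHONPATH"

# 'No module named numpy' on executors: install numpy for the same
# interpreter the workers use, and point Spark at that interpreter.
#   pip install numpy
export PYSPARK_PYTHON=/usr/bin/python
```

On a cluster, the install and the environment variables have to be applied on every worker node, not just the driver.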


Root Cause Analysis

  1. org.apache.spark.SparkException

    Job aborted due to stage failure: Task 0 in stage 48.0 failed 1 times, most recent failure: Lost task 0.0 in stage 48.0 (TID 167, localhost): org.apache.spark.api.python.PythonException: Traceback (most recent call last):
      File "/home/newuser/spark/python/lib/pyspark.zip/pyspark/worker.py", line 111, in main
        process()
      File "/home/newuser/spark/python/lib/pyspark.zip/pyspark/worker.py", line 106, in process
        serializer.dump_stream(func(split_index, iterator), outfile)
      File "/home/newuser/spark/python/lib/pyspark.zip/pyspark/serializers.py", line 263, in dump_stream
        vs = list(itertools.islice(iterator, batch))
      File "/home/newuser/spark/python/pyspark/rdd.py", line 797, in func
        yield reduce(f, iterator, initial)
      File "<ipython-input-46-9557bb70b499>", line 1, in <lambda>
    TypeError: 'int' object has no attribute '__getitem__'

    at org.apache.spark.api.python.PythonRunner$$anon$1.read()
  2. Spark
    Executor$TaskRunner.run
    1. org.apache.spark.api.python.PythonRunner$$anon$1.read(PythonRDD.scala:166)
    2. org.apache.spark.api.python.PythonRunner$$anon$1.<init>(PythonRDD.scala:207)
    3. org.apache.spark.api.python.PythonRunner.compute(PythonRDD.scala:125)
    4. org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:70)
    5. org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:300)
    6. org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
    7. org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
    8. org.apache.spark.scheduler.Task.run(Task.scala:88)
    9. org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
    9 frames
  3. Java RT
    Thread.run
    1. java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    2. java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    3. java.lang.Thread.run(Thread.java:745)
    3 frames
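The Python frames in the root cause (reduce(f, iterator, initial) feeding a one-line lambda) are the classic shape of reducing over (key, value) pairs with a lambda that indexes its arguments: after the first combine step the accumulator is already a plain int, so x[1] blows up. A minimal sketch reproducing it in plain Python (no Spark needed); under Python 2 the message is exactly the one in the trace, while Python 3 phrases it as "'int' object is not subscriptable". The PySpark fix in the comment is the usual approach, not necessarily what the original asker ran:

```python
from functools import reduce  # built-in in Python 2, functools in Python 3

pairs = [("a", 1), ("b", 2), ("c", 3)]

def buggy_sum(kv_pairs):
    # Mirrors rdd.reduce(lambda x, y: x[1] + y[1]).
    # First combine: ("a", 1) and ("b", 2) -> 3.
    # Second combine: x is now the int 3, so x[1] raises the
    # TypeError from the stack trace above.
    return reduce(lambda x, y: x[1] + y[1], kv_pairs)

def fixed_sum(kv_pairs):
    # Fix: extract the values first, then reduce over plain ints.
    # In PySpark: rdd.map(lambda kv: kv[1]).sum()  (or rdd.values().sum()).
    return reduce(lambda x, y: x + y, (v for _, v in kv_pairs))

try:
    buggy_sum(pairs)
except TypeError as exc:
    print("reproduced:", exc)

print(fixed_sum(pairs))  # 6
```

The key point is that the reduce function must take and return the *same* type; once the pairs are mapped down to their values, the lambda no longer needs to index anything.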