org.apache.spark.api.python.PythonException

Traceback (most recent call last):
  File "/home/cgu.local/thiagovm/spark-2.0.0-bin-hadoop2.7/python/lib/pyspark.zip/pyspark/worker.py", line 172, in main
    process()
  File "/home/cgu.local/thiagovm/spark-2.0.0-bin-hadoop2.7/python/lib/pyspark.zip/pyspark/worker.py", line 167, in process
    serializer.dump_stream(func(split_index, iterator), outfile)
  File "/home/cgu.local/thiagovm/spark-2.0.0-bin-hadoop2.7/python/lib/pyspark.zip/pyspark/serializers.py", line 263, in dump_stream
    vs = list(itertools.islice(iterator, batch))
  File "/usr/local/lib/python3.5/site-packages/splearn/feature_extraction/text.py", line 289, in <lambda>
    A = Z.transform(lambda X: list(map(analyze, X)), column='X').persist()
  File "/usr/local/lib/python3.5/site-packages/sklearn/feature_extraction/text.py", line 238, in <lambda>
    tokenize(preprocess(self.decode(doc))), stop_words)
  File "/usr/local/lib/python3.5/site-packages/sklearn/feature_extraction/text.py", line 204, in <lambda>
    return lambda x: strip_accents(x.lower())
AttributeError: 'numpy.ndarray' object has no attribute 'lower'
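The last three frames locate the failure: scikit-learn's default analyzer preprocesses each document with lambda x: strip_accents(x.lower()), which assumes a plain string, but the object reaching it here is a numpy.ndarray, which has no .lower() method. A minimal sketch of the same failure, with illustrative data (not from the original job):

    import numpy as np

    # Stand-in for sklearn's default preprocessor (text.py, line 204 above).
    preprocess = lambda x: x.lower()

    preprocess("Some Document")              # fine: str has .lower()
    preprocess(np.array(["Some Document"]))  # AttributeError: 'numpy.ndarray' object has no attribute 'lower'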


Solutions on the web (115)

via Stack Overflow
/pyspark/worker.py", line 167, in process # serializer.dump_stream(func(split_index, iterator), outfile) # File "/home/cgu.local/thiagovm/spark-2.0.0-bin-hadoop2.7/python/lib/pyspark.zip/pyspark/serializers.py", line 263, in dump_stream # vs = list

via GitHub by mrelich, 1 week ago
(func(split_index, iterator), outfile) File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/serializers.py", line 263, in dump_stream vs = list(itertools.islice(iterator, batch)) File "/home/mattr/spark_test2/spark_env/lib/python2.7/site-packages

via Stack Overflow by Jack Daniel, 1 month ago
serializer.dump_stream(func(split_index, iterator), outfile) File "/usr/local/spark/python/lib/pyspark.zip/pyspark/serializers.py", line 263, in dump_stream vs = list(itertools.islice(iterator, batch)) File "<stdin>", line 1, in <lambda> File "/usr/local/spark/python/lib/pyspark.zip/pyspark/sql/types.py", line 1272, in __getattr__ raise AttributeError(item) AttributeError: lower

via Stack Overflow
169, in process serializer.dump_stream(func(split_index, iterator), outfile) File "/home/main/spark-2.1.0-bin-hadoop2.7/python/lib/pyspark.zip/pyspark/worker.py", line 106, in <lambda> func = lambda _, it: map(mapper, it) File "<string>", line

via Stack Overflow
serializer.dump_stream(func(split_index, iterator), outfile) File "/usr/local/spark/2.1.0/python/lib/pyspark.zip/pyspark/serializers.py", line 268, in dump_stream vs = list(itertools.islice(iterator, batch)) File "<stdin>", line 1, in <lambda> File

via GitHub by ssallys, 10 months ago
Traceback (most recent call last): File "/usr/local/spark-2.0.0-bin-hadoop2.7/python/lib/pyspark.zip/pyspark/worker.py", line 172, in main process() File "/usr/local/spark-2.0.0-bin-hadoop2.7/python/lib/pyspark.zip/pyspark/worker.py", line

via GitHub
Traceback (most recent call last): File "/usr/local/spark-2.0.0-bin-hadoop2.7/python/lib/pyspark.zip/pyspark/worker.py", line 172, in main process() File "/usr/local/spark-2.0.0-bin-hadoop2.7/python/lib/pyspark.zip/pyspark/worker.py", line

via Stack Overflow by majdouline, 4 months ago
serializer.dump_stream(func(split_index, iterator), outfile) File "/usr/local/spark/python/lib/pyspark.zip/pyspark/serializers.py", line 268, in dump_stream vs = list(itertools.islice(iterator, batch)) File "/usr/local/spark/python/lib/pyspark.zip/pyspark/rdd.py

via Stack Overflow
serializer.dump_stream(func(split_index, iterator), outfile) File "/usr/local/src/spark/python/lib/pyspark.zip/pyspark/serializers.py", line 263, in dump_stream vs = list(itertools.islice(iterator, batch)) File "utils.py", line 6, in returnIfTrue if row[1] in settings.ageList: AttributeError: 'module' object has no attribute 'ageList'

via Stack Overflow by Algina, 8 months ago
Traceback (most recent call last): File "/home/alg/programs/spark-2.0.2-bin-hadoop2.7/python/lib/pyspark.zip/pyspark/worker.py", line 175, in main process() File "/home/alg/programs/spark-2.0.2-bin-hadoop2.7/python/lib/pyspark.zip/pyspark

Stack trace

org.apache.spark.api.python.PythonException: Traceback (most recent call last):
  File "/home/cgu.local/thiagovm/spark-2.0.0-bin-hadoop2.7/python/lib/pyspark.zip/pyspark/worker.py", line 172, in main
    process()
  File "/home/cgu.local/thiagovm/spark-2.0.0-bin-hadoop2.7/python/lib/pyspark.zip/pyspark/worker.py", line 167, in process
    serializer.dump_stream(func(split_index, iterator), outfile)
  File "/home/cgu.local/thiagovm/spark-2.0.0-bin-hadoop2.7/python/lib/pyspark.zip/pyspark/serializers.py", line 263, in dump_stream
    vs = list(itertools.islice(iterator, batch))
  File "/usr/local/lib/python3.5/site-packages/splearn/feature_extraction/text.py", line 289, in <lambda>
    A = Z.transform(lambda X: list(map(analyze, X)), column='X').persist()
  File "/usr/local/lib/python3.5/site-packages/sklearn/feature_extraction/text.py", line 238, in <lambda>
    tokenize(preprocess(self.decode(doc))), stop_words)
  File "/usr/local/lib/python3.5/site-packages/sklearn/feature_extraction/text.py", line 204, in <lambda>
    return lambda x: strip_accents(x.lower())
AttributeError: 'numpy.ndarray' object has no attribute 'lower'
  at org.apache.spark.api.python.PythonRunner$$anon$1.read(PythonRDD.scala:193)
  at org.apache.spark.api.python.PythonRunner$$anon$1.<init>(PythonRDD.scala:234)
  at org.apache.spark.api.python.PythonRunner.compute(PythonRDD.scala:152)
  at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:63)
  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:319)
  at org.apache.spark.rdd.RDD$$anonfun$8.apply(RDD.scala:332)
  at org.apache.spark.rdd.RDD$$anonfun$8.apply(RDD.scala:330)
  at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:935)
  at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:910)
  at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:866)
  at org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:910)
  at org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:668)
  at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:330)
  at org.apache.spark.rdd.RDD.iterator(RDD.scala:281)
  at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:63)
  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:319)
  at org.apache.spark.rdd.RDD.iterator(RDD.scala:283)
  at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:70)
  at org.apache.spark.scheduler.Task.run(Task.scala:85)
  at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
  at java.lang.Thread.run(Thread.java:745)

Write tip

Do you have a different solution? A short tip here would help you and the many other users who saw this issue last week.
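A tip based on the trace above, hedged: the analyzer receives a numpy.ndarray where it expects one string per document, which typically happens when the corpus was loaded as 1-element rows of a 2-D array or a single-column matrix. Unwrapping each row into a plain Python str before blocking the data into an ArrayRDD avoids the error. A sketch, assuming splearn's documented ArrayRDD/SparkCountVectorizer API and an existing SparkContext named sc; the data and variable names are illustrative, not from the original job:

    import numpy as np
    from splearn.rdd import ArrayRDD
    from splearn.feature_extraction.text import SparkCountVectorizer

    # Illustrative rows in the shape that triggers the error: each element
    # is a 1-element ndarray instead of a plain string.
    raw = sc.parallelize([np.array(["First doc."]), np.array(["Second doc."])])

    # Unwrap to plain strings on the ordinary RDD, before blocking it into
    # an ArrayRDD, so the analyzer gets a str (with .lower()) per document.
    flat = raw.map(lambda row: str(row[0]) if isinstance(row, np.ndarray) else str(row))

    X = SparkCountVectorizer().fit_transform(ArrayRDD(flat))

str() is used defensively here; if the rows hold bytes rather than text, decode them explicitly instead.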
