org.apache.spark.api.python.PythonException: Traceback (most recent call last):
  File "c:\spark\python\lib\pyspark.zip\pyspark\worker.py", line 172, in main
  File "c:\spark\python\lib\pyspark.zip\pyspark\worker.py", line 167, in process
  File "c:\spark\python\lib\pyspark.zip\pyspark\serializers.py", line 263, in dump_stream
    vs = list(itertools.islice(iterator, batch))
  File "c:\spark\python\lib\pyspark.zip\pyspark\rdd.py", line 1306, in takeUpToNumLeft
  File "c:/sparkcourse/test-recommendation.py", line 8, in get_counts_and_averages
    return ID_and_ratings_tuple[0], (nratings, float(sum(x for x in ID_and_ratings_tuple[1]))/nratings)
TypeError: unsupported operand type(s) for +: 'int' and 'str'

Stack Overflow | JohnB | 4 months ago

    Pyspark - recommendation engine - unsupported operand type(s) for +: 'int' and 'str'

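    The line quoted at the bottom of the traceback is the body of get_counts_and_averages in test-recommendation.py, applied to each (movie ID, ratings) pair. A minimal sketch of how that call fails when the grouped ratings are still strings; only the return statement appears in the traceback, so the nratings assignment is an assumption:

    ```python
    # Reconstructed from the traceback; only the return statement is shown there,
    # so the nratings assignment is an assumption.
    def get_counts_and_averages(ID_and_ratings_tuple):
        nratings = len(ID_and_ratings_tuple[1])
        return ID_and_ratings_tuple[0], (nratings,
            float(sum(x for x in ID_and_ratings_tuple[1])) / nratings)

    # sum() starts its accumulator at the int 0, so the first addition is
    # 0 + '3.0' when the ratings were never cast from str -- hence the TypeError.
    get_counts_and_averages((242, ['3.0', '4.0', '5.0']))
    # TypeError: unsupported operand type(s) for +: 'int' and 'str'
    ```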

    Root Cause Analysis

    1. org.apache.spark.api.python.PythonException

      TypeError: unsupported operand type(s) for +: 'int' and 'str', raised from get_counts_and_averages at line 8 of c:/sparkcourse/test-recommendation.py (full traceback quoted above; a possible fix is sketched after the stack frames below)

      at org.apache.spark.api.python.PythonRunner$$anon$1.read()
    2. Spark
      Executor$TaskRunner.run
      1. org.apache.spark.api.python.PythonRunner$$anon$1.read(PythonRDD.scala:193)
      2. org.apache.spark.api.python.PythonRunner$$anon$1.<init>(PythonRDD.scala:234)
      3. org.apache.spark.api.python.PythonRunner.compute(PythonRDD.scala:152)
      4. org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:63)
      5. org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:319)
      6. org.apache.spark.rdd.RDD.iterator(RDD.scala:283)
      7. org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:70)
      8. org.apache.spark.scheduler.Task.run(Task.scala:85)
      9. org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
      9 frames
    3. Java RT
      Thread.run
      1. java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
      2. java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
      3. java.lang.Thread.run(Unknown Source)
      3 frames
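
Since the TypeError comes from sum() adding an int to a str, the rating values reaching get_counts_and_averages are almost certainly strings left over from splitting the input lines. A hedged sketch of one possible fix, casting the rating to float while parsing; the file path, field layout, and parsing code are assumptions, as they are not shown in the question:

```python
from pyspark import SparkContext

sc = SparkContext(appName="test-recommendation")  # hypothetical setup

def parse_rating(line):
    # Assumed layout: userId,movieId,rating[,timestamp]; a header line, if any,
    # would also need to be filtered out before this map.
    fields = line.split(',')
    return int(fields[1]), float(fields[2])        # cast here so sum() adds floats

def get_counts_and_averages(ID_and_ratings_tuple):
    # Same function as in the question; with float ratings the sum succeeds.
    nratings = len(ID_and_ratings_tuple[1])
    return ID_and_ratings_tuple[0], (nratings,
        float(sum(x for x in ID_and_ratings_tuple[1])) / nratings)

ratings = sc.textFile("file:///sparkcourse/ratings.csv")   # hypothetical path
movie_counts_and_avgs = (ratings
                         .map(parse_rating)
                         .groupByKey()
                         .map(get_counts_and_averages))
print(movie_counts_and_avgs.take(3))
```

Alternatively, casting inside the function itself (sum(float(x) for x in ...)) would silence the error, but fixing the parse step keeps the whole RDD numeric for any later processing.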