org.apache.spark.SparkException

Job aborted due to stage failure: Task serialization failed: org.apache.spark.SparkException: Failed to register classes with Kryo org.apache.spark.serializer.KryoSerializer.newKryo(KryoSerializer.scala:128) org.apache.spark.serializer.KryoSerializerInstance.borrowKryo(KryoSerializer.scala:273) org.apache.spark.serializer.KryoSerializerInstance.<init>(KryoSerializer.scala:258) org.apache.spark.serializer.KryoSerializer.newInstance(KryoSerializer.scala:174) org.apache.spark.broadcast.TorrentBroadcast$.blockifyObject(TorrentBroadcast.scala:201) org.apache.spark.broadcast.TorrentBroadcast.writeBlocks(TorrentBroadcast.scala:102) org.apache.spark.broadcast.TorrentBroadcast.<init>(TorrentBroadcast.scala:85) org.apache.spark.broadcast.TorrentBroadcastFactory.newBroadcast(TorrentBroadcastFactory.scala:34) org.apache.spark.broadcast.BroadcastManager.newBroadcast(BroadcastManager.scala:63) org.apache.spark.SparkContext.broadcast(SparkContext.scala:1326) org.apache.spark.scheduler.DAGScheduler.submitMissingTasks(DAGScheduler.scala:1006) org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitStage(DAGScheduler.scala:921) org.apache.spark.scheduler.DAGScheduler.handleJobSubmitted(DAGScheduler.scala:861) org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1607) org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1599) org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1588) org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)

Samebug tips5

This might be an issue with the file location in the Spark submit command. Try it with "spark-submit --master spark://master:7077 hello_world_from_pyspark.py {file location}"


rafaelrafael

Check if you've set a name in Application -> Run. If you didn't, the generated XML is gonna have missing information and then this exception will be thrown.


rafaelrafael

Don't give up yet. Our experts can help. Paste your full stack trace to get a solution.

Solutions on the web8290

  • via GitHub by tkdagdelen
    ,
  • via GitHub by tkdagdelen
    ,
  • via GitHub by tkdagdelen
    ,
  • Stack trace

    • org.apache.spark.SparkException: Job aborted due to stage failure: Task serialization failed: org.apache.spark.SparkException: Failed to register classes with Kryo org.apache.spark.serializer.KryoSerializer.newKryo(KryoSerializer.scala:128) org.apache.spark.serializer.KryoSerializerInstance.borrowKryo(KryoSerializer.scala:273) org.apache.spark.serializer.KryoSerializerInstance.<init>(KryoSerializer.scala:258) org.apache.spark.serializer.KryoSerializer.newInstance(KryoSerializer.scala:174) org.apache.spark.broadcast.TorrentBroadcast$.blockifyObject(TorrentBroadcast.scala:201) org.apache.spark.broadcast.TorrentBroadcast.writeBlocks(TorrentBroadcast.scala:102) org.apache.spark.broadcast.TorrentBroadcast.<init>(TorrentBroadcast.scala:85) org.apache.spark.broadcast.TorrentBroadcastFactory.newBroadcast(TorrentBroadcastFactory.scala:34) org.apache.spark.broadcast.BroadcastManager.newBroadcast(BroadcastManager.scala:63) org.apache.spark.SparkContext.broadcast(SparkContext.scala:1326) org.apache.spark.scheduler.DAGScheduler.submitMissingTasks(DAGScheduler.scala:1006) org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitStage(DAGScheduler.scala:921) org.apache.spark.scheduler.DAGScheduler.handleJobSubmitted(DAGScheduler.scala:861) org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1607) org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1599) org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1588) org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48) at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1431) at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1419) at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1418) at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47) at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1418) at org.apache.spark.scheduler.DAGScheduler.submitMissingTasks(DAGScheduler.scala:1016) at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitStage(DAGScheduler.scala:921) at org.apache.spark.scheduler.DAGScheduler.handleJobSubmitted(DAGScheduler.scala:861) at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1607) at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1599) at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1588) at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48) at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:620) at org.apache.spark.SparkContext.runJob(SparkContext.scala:1832) at org.apache.spark.SparkContext.runJob(SparkContext.scala:1845) at org.apache.spark.SparkContext.runJob(SparkContext.scala:1858) at org.apache.spark.SparkContext.runJob(SparkContext.scala:1929) at org.apache.spark.rdd.RDD$$anonfun$collect$1.apply(RDD.scala:927) at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150) at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:111) at org.apache.spark.rdd.RDD.withScope(RDD.scala:316) at org.apache.spark.rdd.RDD.collect(RDD.scala:926) at org.apache.spark.sql.execution.datasources.parquet.ParquetRelation$.mergeSchemasInParallel(ParquetRelation.scala:799) at org.apache.spark.sql.execution.datasources.parquet.ParquetRelation$MetadataCache.org$apache$spark$sql$execution$datasources$parquet$ParquetRelation$MetadataCache$$readSchema(ParquetRelation.scala:517) at org.apache.spark.sql.execution.datasources.parquet.ParquetRelation$MetadataCache$$anonfun$12.apply(ParquetRelation.scala:421) at org.apache.spark.sql.execution.datasources.parquet.ParquetRelation$MetadataCache$$anonfun$12.apply(ParquetRelation.scala:421) at scala.Option.orElse(Option.scala:257) at org.apache.spark.sql.execution.datasources.parquet.ParquetRelation$MetadataCache.refresh(ParquetRelation.scala:421) at org.apache.spark.sql.execution.datasources.parquet.ParquetRelation.org$apache$spark$sql$execution$datasources$parquet$ParquetRelation$$metadataCache$lzycompute(ParquetRelation.scala:145) at org.apache.spark.sql.execution.datasources.parquet.ParquetRelation.org$apache$spark$sql$execution$datasources$parquet$ParquetRelation$$metadataCache(ParquetRelation.scala:143) at org.apache.spark.sql.execution.datasources.parquet.ParquetRelation$$anonfun$6.apply(ParquetRelation.scala:202) at org.apache.spark.sql.execution.datasources.parquet.ParquetRelation$$anonfun$6.apply(ParquetRelation.scala:202) at scala.Option.getOrElse(Option.scala:120) at org.apache.spark.sql.execution.datasources.parquet.ParquetRelation.dataSchema(ParquetRelation.scala:202) at org.apache.spark.sql.sources.HadoopFsRelation.schema$lzycompute(interfaces.scala:636) at org.apache.spark.sql.sources.HadoopFsRelation.schema(interfaces.scala:635) at org.apache.spark.sql.execution.datasources.LogicalRelation.<init>(LogicalRelation.scala:37) at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:125) at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:109) at net.fnothaft.gnocchi.cli.RegressPhenotypes.run(RegressPhenotypes.scala:79) at org.bdgenomics.utils.cli.BDGSparkCommand$class.run(BDGCommand.scala:54) at net.fnothaft.gnocchi.cli.RegressPhenotypes.run(RegressPhenotypes.scala:70) at net.fnothaft.gnocchi.cli.GnocchiMain$.main(GnocchiMain.scala:48) at net.fnothaft.gnocchi.cli.GnocchiMain.main(GnocchiMain.scala) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731) at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181) at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206) at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121) at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) Caused by: org.apache.spark.SparkException: Failed to register classes with Kryo at org.apache.spark.serializer.KryoSerializer.newKryo(KryoSerializer.scala:128) at org.apache.spark.serializer.KryoSerializerInstance.borrowKryo(KryoSerializer.scala:273) at org.apache.spark.serializer.KryoSerializerInstance.<init>(KryoSerializer.scala:258) at org.apache.spark.serializer.KryoSerializer.newInstance(KryoSerializer.scala:174) at org.apache.spark.broadcast.TorrentBroadcast$.blockifyObject(TorrentBroadcast.scala:201) at org.apache.spark.broadcast.TorrentBroadcast.writeBlocks(TorrentBroadcast.scala:102) at org.apache.spark.broadcast.TorrentBroadcast.<init>(TorrentBroadcast.scala:85) at org.apache.spark.broadcast.TorrentBroadcastFactory.newBroadcast(TorrentBroadcastFactory.scala:34) at org.apache.spark.broadcast.BroadcastManager.newBroadcast(BroadcastManager.scala:63) at org.apache.spark.SparkContext.broadcast(SparkContext.scala:1326) at org.apache.spark.scheduler.DAGScheduler.submitMissingTasks(DAGScheduler.scala:1006) at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitStage(DAGScheduler.scala:921) at org.apache.spark.scheduler.DAGScheduler.handleJobSubmitted(DAGScheduler.scala:861) at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1607) at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1599) at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1588) at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48) Caused by: java.lang.ClassNotFoundException: org.bdgenomics.adam.ADAMKryoRegistrator at java.net.URLClassLoader.findClass(URLClassLoader.java:381) at java.lang.ClassLoader.loadClass(ClassLoader.java:424) at java.lang.ClassLoader.loadClass(ClassLoader.java:357) at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Class.java:348) at org.apache.spark.serializer.KryoSerializer$$anonfun$newKryo$5.apply(KryoSerializer.scala:123) at org.apache.spark.serializer.KryoSerializer$$anonfun$newKryo$5.apply(KryoSerializer.scala:123) at scala.Option.map(Option.scala:145) at org.apache.spark.serializer.KryoSerializer.newKryo(KryoSerializer.scala:123) ... 16 more

    Write tip

    You have a different solution? A short tip here would help you and many other users who saw this issue last week.

    Users with the same issue

    jokesterjokester
    Once,
    olle.hallinolle.hallin
    Once,
    poroszdporoszd
    Once,
    Unknown visitor
    Unknown visitorOnce,
    Unknown visitor
    Unknown visitorOnce,
    1669 more bugmates