org.apache.spark.SparkException

tip

This might be an issue with the file location in the Spark submit command. Try it with "spark-submit --master spark://master:7077 hello_world_from_pyspark.py {file location}"

tip

Check if you've set a name in Application -> Run. If you didn't, the generated XML is gonna have missing information and then this exception will be thrown.

tip

A few things cause this exception: 1) Check if you have all jars and if they're in the correct path when running. 2) Your classpath might be broken, you can define it in the command line with "java -cp yourClassPath" or at your IDE if you're using one.

tip

Needs to add netty-transport-native-epoll with a OS classifier to the classpath, it's not included in netty-all

You have a different solution? A short tip here would help you and many other users who saw this issue last week.

  • GitHub comment 23#233436008
    via GitHub by tkdagdelen
    ,
  • GitHub comment 23#233437026
    via GitHub by tkdagdelen
    ,
  • GitHub comment 23#233078363
    via GitHub by tkdagdelen
    ,
  • Error in spark-submit but not in pyspark
    via Stack Overflow by nanounanue
    ,
  • GitHub comment 2886#275626070
    via GitHub by jheijkoop
    ,
  • Scala 2.11 support?
    via GitHub by Hydrotoast
    ,
  • command line interface is failing when i tried to load 1gb file. I tried to increase the memory using command line I am getting fallowing error. can you please advice how i can work around it? [root@bda1node03 forecast]# flags="-Xmx2048m" kite-dataset csv-import AccuWeatherForecast11222015.csv ctest1 Exception in thread "main" java.lang.ClassNotFoundException: -Xmx2048m at java.net.URLClassLoader.findClass(URLClassLoader.java:381) at java.lang.ClassLoader.loadClass(ClassLoader.java:424) at java.lang.ClassLoader.loadClass(ClassLoader.java:357) at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Class.java:348) at org.apache.hadoop.util.RunJar.run(RunJar.java:214) at org.apache.hadoop.util.RunJar.main(RunJar.java:136) ---------------------------------------------------------------------------- Exception in thread "main" java.lang.OutOfMemoryError: Java heap space: failed reallocation of scalar replaced objects at java.util.Arrays.copyOfRange(Arrays.java:3664) at java.lang.String.<init>(String.java:201) at java.lang.StringBuilder.toString(StringBuilder.java:407) at au.com.bytecode.opencsv.CSVParser.parseLine(CSVParser.java:250) at au.com.bytecode.opencsv.CSVParser.parseLineMulti(CSVParser.java:174) at au.com.bytecode.opencsv.CSVReader.readNext(CSVReader.java:237) at org.kitesdk.data.spi.filesystem.CSVFileReader.advance(CSVFileReader.java:174) at org.kitesdk.data.spi.filesystem.CSVFileReader.next(CSVFileReader.java:168) at org.kitesdk.shaded.com.google.common.collect.Iterators$7.computeNext(Iterators.java:648) at org.kitesdk.shaded.com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143) at org.kitesdk.shaded.com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138) at org.kitesdk.data.spi.filesystem.MultiFileDatasetReader.hasNext(MultiFileDatasetReader.java:125) at com.google.common.collect.Lists.newArrayList(Lists.java:138) at com.google.common.collect.ImmutableList.copyOf(ImmutableList.java:256) at com.google.common.collect.ImmutableList.copyOf(ImmutableList.java:217) at org.apache.crunch.impl.mem.collect.MemCollection.<init>(MemCollection.java:79) at org.apache.crunch.impl.mem.MemPipeline.read(MemPipeline.java:166) at org.apache.crunch.impl.mem.MemPipeline.read(MemPipeline.java:157) at org.kitesdk.tools.TransformTask.run(TransformTask.java:135) at org.kitesdk.cli.commands.CSVImportCommand.run(CSVImportCommand.java:186) at org.kitesdk.cli.Main.run(Main.java:178) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) at org.kitesdk.cli.Main.main(Main.java:256) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:497) at org.apache.hadoop.util.RunJar.run(RunJar.java:221) at org.apache.hadoop.util.RunJar.main(RunJar.java:136) [root@bda1node03 forecast]# vi test6.csv
    via by chandra s koripella,
    • org.apache.spark.SparkException: Job aborted due to stage failure: Task serialization failed: org.apache.spark.SparkException: Failed to register classes with Kryo org.apache.spark.serializer.KryoSerializer.newKryo(KryoSerializer.scala:128) org.apache.spark.serializer.KryoSerializerInstance.borrowKryo(KryoSerializer.scala:273) org.apache.spark.serializer.KryoSerializerInstance.<init>(KryoSerializer.scala:258) org.apache.spark.serializer.KryoSerializer.newInstance(KryoSerializer.scala:174) org.apache.spark.broadcast.TorrentBroadcast$.blockifyObject(TorrentBroadcast.scala:201) org.apache.spark.broadcast.TorrentBroadcast.writeBlocks(TorrentBroadcast.scala:102) org.apache.spark.broadcast.TorrentBroadcast.<init>(TorrentBroadcast.scala:85) org.apache.spark.broadcast.TorrentBroadcastFactory.newBroadcast(TorrentBroadcastFactory.scala:34) org.apache.spark.broadcast.BroadcastManager.newBroadcast(BroadcastManager.scala:63) org.apache.spark.SparkContext.broadcast(SparkContext.scala:1326) org.apache.spark.scheduler.DAGScheduler.submitMissingTasks(DAGScheduler.scala:1006) org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitStage(DAGScheduler.scala:921) org.apache.spark.scheduler.DAGScheduler.handleJobSubmitted(DAGScheduler.scala:861) org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1607) org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1599) org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1588) org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48) at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1431) at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1419) at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1418) at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47) at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1418) at org.apache.spark.scheduler.DAGScheduler.submitMissingTasks(DAGScheduler.scala:1016) at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitStage(DAGScheduler.scala:921) at org.apache.spark.scheduler.DAGScheduler.handleJobSubmitted(DAGScheduler.scala:861) at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1607) at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1599) at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1588) at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48) at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:620) at org.apache.spark.SparkContext.runJob(SparkContext.scala:1832) at org.apache.spark.SparkContext.runJob(SparkContext.scala:1845) at org.apache.spark.SparkContext.runJob(SparkContext.scala:1858) at org.apache.spark.SparkContext.runJob(SparkContext.scala:1929) at org.apache.spark.rdd.RDD$$anonfun$collect$1.apply(RDD.scala:927) at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150) at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:111) at org.apache.spark.rdd.RDD.withScope(RDD.scala:316) at org.apache.spark.rdd.RDD.collect(RDD.scala:926) at org.apache.spark.sql.execution.datasources.parquet.ParquetRelation$.mergeSchemasInParallel(ParquetRelation.scala:799) at org.apache.spark.sql.execution.datasources.parquet.ParquetRelation$MetadataCache.org$apache$spark$sql$execution$datasources$parquet$ParquetRelation$MetadataCache$$readSchema(ParquetRelation.scala:517) at org.apache.spark.sql.execution.datasources.parquet.ParquetRelation$MetadataCache$$anonfun$12.apply(ParquetRelation.scala:421) at org.apache.spark.sql.execution.datasources.parquet.ParquetRelation$MetadataCache$$anonfun$12.apply(ParquetRelation.scala:421) at scala.Option.orElse(Option.scala:257) at org.apache.spark.sql.execution.datasources.parquet.ParquetRelation$MetadataCache.refresh(ParquetRelation.scala:421) at org.apache.spark.sql.execution.datasources.parquet.ParquetRelation.org$apache$spark$sql$execution$datasources$parquet$ParquetRelation$$metadataCache$lzycompute(ParquetRelation.scala:145) at org.apache.spark.sql.execution.datasources.parquet.ParquetRelation.org$apache$spark$sql$execution$datasources$parquet$ParquetRelation$$metadataCache(ParquetRelation.scala:143) at org.apache.spark.sql.execution.datasources.parquet.ParquetRelation$$anonfun$6.apply(ParquetRelation.scala:202) at org.apache.spark.sql.execution.datasources.parquet.ParquetRelation$$anonfun$6.apply(ParquetRelation.scala:202) at scala.Option.getOrElse(Option.scala:120) at org.apache.spark.sql.execution.datasources.parquet.ParquetRelation.dataSchema(ParquetRelation.scala:202) at org.apache.spark.sql.sources.HadoopFsRelation.schema$lzycompute(interfaces.scala:636) at org.apache.spark.sql.sources.HadoopFsRelation.schema(interfaces.scala:635) at org.apache.spark.sql.execution.datasources.LogicalRelation.<init>(LogicalRelation.scala:37) at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:125) at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:109) at net.fnothaft.gnocchi.cli.RegressPhenotypes.run(RegressPhenotypes.scala:79) at org.bdgenomics.utils.cli.BDGSparkCommand$class.run(BDGCommand.scala:54) at net.fnothaft.gnocchi.cli.RegressPhenotypes.run(RegressPhenotypes.scala:70) at net.fnothaft.gnocchi.cli.GnocchiMain$.main(GnocchiMain.scala:48) at net.fnothaft.gnocchi.cli.GnocchiMain.main(GnocchiMain.scala) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731) at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181) at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206) at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121) at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) Caused by: org.apache.spark.SparkException: Failed to register classes with Kryo at org.apache.spark.serializer.KryoSerializer.newKryo(KryoSerializer.scala:128) at org.apache.spark.serializer.KryoSerializerInstance.borrowKryo(KryoSerializer.scala:273) at org.apache.spark.serializer.KryoSerializerInstance.<init>(KryoSerializer.scala:258) at org.apache.spark.serializer.KryoSerializer.newInstance(KryoSerializer.scala:174) at org.apache.spark.broadcast.TorrentBroadcast$.blockifyObject(TorrentBroadcast.scala:201) at org.apache.spark.broadcast.TorrentBroadcast.writeBlocks(TorrentBroadcast.scala:102) at org.apache.spark.broadcast.TorrentBroadcast.<init>(TorrentBroadcast.scala:85) at org.apache.spark.broadcast.TorrentBroadcastFactory.newBroadcast(TorrentBroadcastFactory.scala:34) at org.apache.spark.broadcast.BroadcastManager.newBroadcast(BroadcastManager.scala:63) at org.apache.spark.SparkContext.broadcast(SparkContext.scala:1326) at org.apache.spark.scheduler.DAGScheduler.submitMissingTasks(DAGScheduler.scala:1006) at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitStage(DAGScheduler.scala:921) at org.apache.spark.scheduler.DAGScheduler.handleJobSubmitted(DAGScheduler.scala:861) at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1607) at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1599) at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1588) at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48) Caused by: java.lang.ClassNotFoundException: org.bdgenomics.adam.ADAMKryoRegistrator at java.net.URLClassLoader.findClass(URLClassLoader.java:381) at java.lang.ClassLoader.loadClass(ClassLoader.java:424) at java.lang.ClassLoader.loadClass(ClassLoader.java:357) at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Class.java:348) at org.apache.spark.serializer.KryoSerializer$$anonfun$newKryo$5.apply(KryoSerializer.scala:123) at org.apache.spark.serializer.KryoSerializer$$anonfun$newKryo$5.apply(KryoSerializer.scala:123) at scala.Option.map(Option.scala:145) at org.apache.spark.serializer.KryoSerializer.newKryo(KryoSerializer.scala:123) ... 16 more

    Users with the same issue

    jokesterjokester
    1 times, last one,
    olle.hallinolle.hallin
    1 times, last one,
    poroszdporoszd
    1 times, last one,
    Unknown visitor
    Unknown visitor1 times, last one,
    Unknown visitor
    Unknown visitor1 times, last one,
    1580 more bugmates