java.lang.NullPointerException

JIRA | Sam Stoelinga | 1 year ago
  1.

    I am trying to get Spark 1.5.1 + Tachyon 0.7.1 + Swift working, without success, and believe I have encountered a bug. Spark, Tachyon and Swift are all running correctly, but when I submit a Spark job that reads input from Tachyon, I get a NullPointerException from code in the Tachyon client, tachyon.client.UfsUtils.loadUnderFs(UfsUtils.java:107), which looks like a code issue (I'm not a Java guy): https://github.com/amplab/tachyon/blob/v0.7.1/clients/unshaded/src/main/java/tachyon/client/UfsUtils.java#L107

    This is the Spark job being run:
    {code:java}
    scala> val textFile = sc.textFile("tachyon://mesos-master-1:19998/pacman.log")
    textFile: org.apache.spark.rdd.RDD[String] = MapPartitionsRDD[1] at textFile at <console>:21

    scala> textFile.count()
    java.lang.NullPointerException
        at tachyon.client.UfsUtils.loadUnderFs(UfsUtils.java:107)
        at tachyon.hadoop.AbstractTFS.fromHdfsToTachyon(AbstractTFS.java:269)
        at tachyon.hadoop.AbstractTFS.getFileStatus(AbstractTFS.java:331)
        at tachyon.hadoop.TFS.getFileStatus(TFS.java:26)
        at org.apache.hadoop.fs.Globber.getFileStatus(Globber.java:57)
        at org.apache.hadoop.fs.Globber.glob(Globber.java:252)
        at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:1644)
        at org.apache.hadoop.mapred.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:257)
        at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:228)
        at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:313)
        at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:207)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
        at scala.Option.getOrElse(Option.scala:120)
        at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
        at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
        at scala.Option.getOrElse(Option.scala:120)
        at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:1919)
        at org.apache.spark.rdd.RDD.count(RDD.scala:1121)
    {code}

    tachyon-default.sh:
    {code:bash}
    export JAVA="$JAVA_HOME/bin/java"
    export TACHYON_MASTER_ADDRESS=192.168.111.54
    export TACHYON_UNDERFS_ADDRESS=swift://spark.swift1
    export TACHYON_WORKER_MEMORY_SIZE=1GB
    export TACHYON_UNDERFS_HDFS_IMPL=org.apache.hadoop.hdfs.DistributedFileSystem
    export TACHYON_WORKER_MAX_WORKER_THREADS=2048
    export TACHYON_MASTER_MAX_WORKER_THREADS=2048
    {code}

    I've built tachyon-0.7.1 with Hadoop 2.4, as that's required for Swift:
    {code:bash}
    mvn -Dhadoop.version=2.4.0 -DskipTests=true install
    {code}

    JIRA | 1 year ago | Sam Stoelinga
    java.lang.NullPointerException

    Root Cause Analysis

    1. java.lang.NullPointerException

      No message provided

      at tachyon.client.UfsUtils.loadUnderFs()
    2. Tachyon Project Core
      TFS.getFileStatus
      1. tachyon.client.UfsUtils.loadUnderFs(UfsUtils.java:107)
      2. tachyon.hadoop.AbstractTFS.fromHdfsToTachyon(AbstractTFS.java:269)
      3. tachyon.hadoop.AbstractTFS.getFileStatus(AbstractTFS.java:331)
      4. tachyon.hadoop.TFS.getFileStatus(TFS.java:26)
      4 frames
    3. Hadoop
      FileSystem.globStatus
      1. org.apache.hadoop.fs.Globber.getFileStatus(Globber.java:57)
      2. org.apache.hadoop.fs.Globber.glob(Globber.java:252)
      3. org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:1644)
      3 frames
    4. Hadoop
      FileInputFormat.getSplits
      1. org.apache.hadoop.mapred.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:257)
      2. org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:228)
      3. org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:313)
      3 frames
    5. Spark
      RDD$$anonfun$partitions$2.apply
      1. org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:207)
      2. org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
      3. org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
      3 frames
    6. Scala
      Option.getOrElse
      1. scala.Option.getOrElse(Option.scala:120)
      1 frame
    7. Spark
      RDD$$anonfun$partitions$2.apply
      1. org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
      2. org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
      3. org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
      4. org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
      4 frames
    8. Scala
      Option.getOrElse
      1. scala.Option.getOrElse(Option.scala:120)
      1 frame
    9. Spark
      RDD.count
      1. org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
      2. org.apache.spark.SparkContext.runJob(SparkContext.scala:1919)
      3. org.apache.spark.rdd.RDD.count(RDD.scala:1121)
      3 frames
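    For context, the NPE at the top of the trace is consistent with a common pattern: a lookup keyed by URI scheme returns null when no under-filesystem implementation is registered for that scheme (here, swift), and the caller dereferences the result without a check. The following is a hypothetical, self-contained sketch of that failure mode and a defensive variant; the class, map, and method names are illustrative only and are not Tachyon's actual code.

    ```java
    import java.util.HashMap;
    import java.util.Map;

    public class UfsLookupSketch {
        // Illustrative registry: only HDFS is present, mirroring the suspicion
        // that the "swift" scheme is never registered in this setup.
        private static final Map<String, String> UFS_IMPLS = new HashMap<>();
        static {
            UFS_IMPLS.put("hdfs", "org.apache.hadoop.hdfs.DistributedFileSystem");
        }

        // Defensive variant: an unregistered scheme fails with a descriptive
        // error instead of returning null and causing an NPE downstream.
        static String resolveUfs(String scheme) {
            String impl = UFS_IMPLS.get(scheme);
            if (impl == null) {
                throw new IllegalArgumentException(
                    "No under-filesystem implementation registered for scheme: " + scheme);
            }
            return impl;
        }

        public static void main(String[] args) {
            System.out.println(resolveUfs("hdfs"));
            try {
                resolveUfs("swift");  // unregistered scheme: descriptive failure, not an NPE
            } catch (IllegalArgumentException e) {
                System.out.println(e.getMessage());
            }
        }
    }
    ```

    A null check like this would turn the opaque NullPointerException above into an error naming the unsupported scheme, which is what a fix at UfsUtils.java:107 would presumably need to do.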