Apache Spark User List - distinct on huge dataset

nabble.com | 8 months ago

cluster.ClusterTaskSetManager: Loss was due to java.io.FileNotFoundException
java.io.FileNotFoundException: /tmp/spark-local-20140417145643-a055/3c/shuffle_1_218_1157 (Too many open files)

ulimit -n tells me I can open 32000 files. Here's a plot of lsof on a worker node during a failed .distinct(): [lsof plot not preserved in the archive]. You can see tasks fail when Spark tries to open 32000 files.

I never ran into this in 0.7.3. Is there a parameter I can set to tell Spark to use fewer than 32000 files?

On Mon, Mar 24, 2014 at 10:23 AM, Aaron Davidson <...> wrote:
> Look up setting ulimit, though note the distinction between soft and hard limits, and that updating your hard limit may require changing /etc/security/limits.conf and restarting each worker.
>
> On Mon, Mar 24, 2014 at 1:39 AM, Kane <...> wrote:
>> Got a bit further; I think the out-of-memory error was caused by setting spark.spill to false. Now I have this error. Is there an easy way to increase the file limit for Spark, cluster-wide?
>>
>> java.io.FileNotFoundException:
>> /tmp/spark-local-20140324074221-b8f1/01/temp_1ab674f9-4556-4239-9f21-688dfc9f17d2
>> (Too many open files)
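Aaron's advice maps to concrete commands along these lines. This is a minimal sketch, assuming Linux workers where the executors run as a user named "spark" (the username is a placeholder; match whatever user actually runs the workers):

    # Check the current limits for this shell; processes hit the soft limit first:
    ulimit -Sn        # soft limit on open file descriptors
    ulimit -Hn        # hard limit on open file descriptors

    # The soft limit can be raised up to the hard limit without privileges:
    ulimit -n 32000

    # Raising the hard limit cluster-wide means editing
    # /etc/security/limits.conf on every worker, with entries like
    # ("spark" is a placeholder for the executor user):
    #
    #     spark   soft    nofile    1000000
    #     spark   hard    nofile    1000000

Note that limits.conf only applies to new login sessions, which is why each worker has to be restarted after the edit for the new limit to take effect.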


    Root Cause Analysis

    1. cluster.ClusterTaskSetManager

      Loss was due to java.io.FileNotFoundException
      java.io.FileNotFoundException: /tmp/spark-local-20140417145643-a055/3c/shuffle_1_218_1157 (Too many open files)

      at java.io.FileOutputStream.openAppend()
    2. Java RT
      FileOutputStream.<init>
      1. java.io.FileOutputStream.openAppend(Native Method)
      2. java.io.FileOutputStream.<init>(FileOutputStream.java:192)
      2 frames
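On the Spark side, the root cause is the hash-based shuffle of that era: each map task opens one output file per reduce partition, so M map tasks times R reduce partitions exhausts descriptors quickly (for example, 250 map tasks x 128 reduce partitions = 32,000 files). A minimal sketch of the two era-appropriate mitigations, assuming a Spark 0.9/1.x cluster; the input path and partition count are placeholders:

    // Sketch: reduce the number of shuffle files opened by .distinct()
    // on a large dataset. Assumes Spark 0.9/1.x with the hash shuffle.
    import org.apache.spark.{SparkConf, SparkContext}

    object DistinctWithFewerFiles {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf()
          .setAppName("distinct-on-huge-dataset")
          // Merge per-map-task shuffle outputs so each core, not each task,
          // owns a set of files (hash-shuffle option in Spark 0.8.1-1.5).
          .set("spark.shuffle.consolidateFiles", "true")
          // Kane's "spark.spill" is presumably spark.shuffle.spill; keep it
          // enabled, since disabling it trades open files for OOM errors.
          .set("spark.shuffle.spill", "true")

        val sc = new SparkContext(conf)
        val lines = sc.textFile("hdfs:///data/huge-dataset") // placeholder path

        // An explicit partition count caps R, the number of shuffle files
        // each map task writes; fewer partitions means fewer open files.
        val uniques = lines.distinct(numPartitions = 512)
        println(uniques.count())

        sc.stop()
      }
    }

With consolidation on, the file count drops from roughly M x R to (cores per worker) x R, which is usually enough to stay under a 32000-descriptor limit without touching ulimit at all.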