java.io.FileNotFoundException


  • With the current Heritrix-1.12 I repeatedly get the following (so far it only seems to happen with .pdf documents, but I mid-fetch abort on anything except text/html and application/pdf anyway):

    03/15/2007 21:44:28 +0000 SCHWERWIEGEND org.archive.crawler.writer.ExperimentalV10WARCWriterProcessor innerProcess Failed write of Record: http://www.pandacom.de/produkte/hersteller-highlights/Witcom_spec_witview.pdf
    java.io.FileNotFoundException: /heritrix/jobs/h22-20070315210752398/scratch/tt104http.ris (Too many open files)
        at java.io.RandomAccessFile.open(Native Method)
        at java.io.RandomAccessFile.<init>(RandomAccessFile.java:212)
        at org.archive.io.RandomAccessInputStream.<init>(RandomAccessInputStream.java:79)
        at org.archive.io.ReplayInputStream.<init>(ReplayInputStream.java:97)
        at org.archive.io.ReplayInputStream.<init>(ReplayInputStream.java:76)
        at org.archive.io.RecordingOutputStream.getReplayInputStream(RecordingOutputStream.java:356)
        at org.archive.io.RecordingOutputStream.getReplayInputStream(RecordingOutputStream.java:348)
        at org.archive.io.RecordingInputStream.getReplayInputStream(RecordingInputStream.java:150)
        at org.archive.crawler.writer.ExperimentalV10WARCWriterProcessor.writeResponse(ExperimentalV10WARCWriterProcessor.java:219)
        at org.archive.crawler.writer.ExperimentalV10WARCWriterProcessor.write(ExperimentalV10WARCWriterProcessor.java:164)
        at org.archive.crawler.writer.ExperimentalV10WARCWriterProcessor.innerProcess(ExperimentalV10WARCWriterProcessor.java:116)
        at org.archive.crawler.framework.Processor.process(Processor.java:109)
        at org.archive.crawler.framework.ToeThread.processCrawlUri(ToeThread.java:302)
        at org.archive.crawler.framework.ToeThread.run(ToeThread.java:151)

    This issue renders my spider unusable: after this failure, the Writers soon run into "too many open files" as well. I use the same setup I used with Heritrix 1.10.2; I adapted my config to do exactly what it did before the version switch.
    This has happened to me twice so far, so the issue seems reproducible. I presume there is a file descriptor leak somewhere. Regards, Olaf Freyer. P.S.: If it happens a third time, I'll hand over order.xml and seeds.txt to one of you guys so you can try to reproduce it, too.
    via Olaf Freyer
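If you suspect a descriptor leak like the one described above, one way to confirm it is to watch the open-descriptor count grow during a crawl. A minimal sketch, assuming Linux (it reads the `/proc/self/fd` directory, which does not exist on other platforms); for an already-running crawler you would list `/proc/<pid>/fd` of the Heritrix JVM from a shell instead:

```java
import java.io.File;

// Counts the file descriptors currently held open by this JVM by
// listing /proc/self/fd (Linux only). Run it periodically inside a
// process you suspect of leaking: a steadily climbing count that never
// falls back after GC or record completion points at unclosed streams.
public class FdCount {
    public static void main(String[] args) {
        File fdDir = new File("/proc/self/fd");
        String[] fds = fdDir.list();
        if (fds == null) {
            System.out.println("/proc/self/fd not available on this platform");
        } else {
            System.out.println("open descriptors: " + fds.length);
        }
    }
}
```

Comparing this count against the limit reported by `ulimit -n` shows how close the process is to the "Too many open files" wall.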
  • Elasticsearch Users - Too Many Open Files
    via Unknown author
  • help, ImageIO
    via Unknown author
  • comp.dsp | help, ImageIO
    via Unknown author
  • Index File missing
    via Kyle
    • java.io.FileNotFoundException: /heritrix/jobs/h22-20070315210752398/scratch/tt104http.ris (Too many open files)
          at java.io.RandomAccessFile.open(Native Method)
          at java.io.RandomAccessFile.<init>(RandomAccessFile.java:212)
          at org.archive.io.RandomAccessInputStream.<init>(RandomAccessInputStream.java:79)
          at org.archive.io.ReplayInputStream.<init>(ReplayInputStream.java:97)
          at org.archive.io.ReplayInputStream.<init>(ReplayInputStream.java:76)
          at org.archive.io.RecordingOutputStream.getReplayInputStream(RecordingOutputStream.java:356)
          at org.archive.io.RecordingOutputStream.getReplayInputStream(RecordingOutputStream.java:348)
          at org.archive.io.RecordingInputStream.getReplayInputStream(RecordingInputStream.java:150)
          at org.archive.crawler.writer.ExperimentalV10WARCWriterProcessor.writeResponse(ExperimentalV10WARCWriterProcessor.java:219)
          at org.archive.crawler.writer.ExperimentalV10WARCWriterProcessor.write(ExperimentalV10WARCWriterProcessor.java:164)
          at org.archive.crawler.writer.ExperimentalV10WARCWriterProcessor.innerProcess(ExperimentalV10WARCWriterProcessor.java:116)
          at org.archive.crawler.framework.Processor.process(Processor.java:109)
          at org.archive.crawler.framework.ToeThread.processCrawlUri(ToeThread.java:302)
          at org.archive.crawler.framework.ToeThread.run(ToeThread.java:151)
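    The trace shows a ReplayInputStream being opened over a scratch .ris file for every record written; if any code path fails to close that stream, its descriptor leaks until the process hits the OS limit. The sketch below contrasts the leak-prone shape with the descriptor-safe one. The names are illustrative stand-ins, not Heritrix's actual code; the stand-in uses an in-memory stream so it runs anywhere.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Illustrates the descriptor-leak pattern the stack trace suggests:
// a stream opened per record must be closed on every path, including
// the path where the write itself fails.
public class WriterSketch {

    // Stand-in for RecordingInputStream.getReplayInputStream(): in the
    // real crawler each call opens a descriptor on a scratch .ris file.
    static InputStream getReplayInputStream() {
        return new ByteArrayInputStream("recorded response bytes".getBytes());
    }

    // Leak-prone shape: if writeRecord() throws (a "Failed write of
    // Record"), close() is never reached and the descriptor is stranded.
    static void writeResponseLeaky() throws IOException {
        InputStream replay = getReplayInputStream();
        writeRecord(replay);
        replay.close();
    }

    // Descriptor-safe shape (Java 1.4/5-era, matching Heritrix 1.x):
    // close in finally, so a failed write cannot strand the descriptor.
    static void writeResponseSafe() throws IOException {
        InputStream replay = getReplayInputStream();
        try {
            writeRecord(replay);
        } finally {
            replay.close();
        }
    }

    static void writeRecord(InputStream in) throws IOException {
        // Consume the stream; a real writer would copy it into the WARC.
        while (in.read() != -1) { /* write bytes */ }
    }

    public static void main(String[] args) throws IOException {
        writeResponseSafe();
        System.out.println("record written, stream closed");
    }
}
```

    In the leaky shape a single failed record leaks one descriptor, which matches the reported behavior: one failure per record, accumulating until the Writers themselves start failing with "too many open files".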

    Users with the same issue

    rp (1 time)
    johnxfly (4 times)
    Mark (3 times)
    andyglick (3 times)
    Andreas Häber (1 time)
    32 more bugmates