org.archive.net.FTPException: FTP error code: 550

JIRA | Noah Levitt | 8 years ago
tip
Your exception is missing from the Samebug knowledge base.
Here are the best solutions we found on the Internet.
Click on the to mark the helpful solution and get rewards for you help.
  1. 0

    Ftp entries in an arc file look like this currently: ftp://ftp.ksl.stanford.edu/welcome.msg 171.64.71.195 20081121190026 no-type 56 ***** ***** Stanford Knowledge Systems Laboratory ***** There is no header, only body content. When heritrix encounters an error trying to download a file, for example: 550 foo: Permission denied. it throws an exception which propagates to the logs: 11/21/2008 19:00:45 +0000 SEVERE org.archive.crawler.fetcher.FetchFTP innerProcess FTP server reported problem. org.archive.net.FTPException: FTP error code: 550 at org.archive.net.ClientFTP.openDataConnection(ClientFTP.java:130) at org.archive.crawler.fetcher.FetchFTP.fetch(FetchFTP.java:312) at org.archive.crawler.fetcher.FetchFTP.innerProcess(FetchFTP.java:252) at org.archive.crawler.framework.Processor.process(Processor.java:112) at org.archive.crawler.framework.ToeThread.processCrawlUri(ToeThread.java:302) at org.archive.crawler.framework.ToeThread.run(ToeThread.java:151) Heritrix still tries to write to the ARC, but fails because there is no content: 11/21/2008 19:00:45 +0000 SEVERE org.archive.crawler.framework.ToeThread recoverableProblem Problem java.lang.NullPointerException occured when trying to process 'ftp://ftp.ksl.stanford.edu/dev/ticotsord' at step ABOUT_TO_BEGIN_PROCESSOR in Archiver java.lang.NullPointerException at org.archive.crawler.writer.ARCWriterProcessor.innerProcess(ARCWriterProcessor.java:122) at org.archive.crawler.framework.Processor.process(Processor.java:112) at org.archive.crawler.framework.ToeThread.processCrawlUri(ToeThread.java:302) at org.archive.crawler.framework.ToeThread.run(ToeThread.java:151) So there is no record in the arc file at all. But this "550 foo: Permission denied." is essentially equivalent to a HTTP 403. It should be archived somehow and should not spew stack traces in the logs. So I propose we include a "header" section in the arc for ftp transactions. "550 foo: Permission denied." would go there. On a successful get, the message would be something like "150 Binary data connection for /welcome.msg (76.103.251.45,57342) (56 bytes)." Would this break anything?

    JIRA | 8 years ago | Noah Levitt
    org.archive.net.FTPException: FTP error code: 550
  2. 0

    Ftp entries in an arc file look like this currently: ftp://ftp.ksl.stanford.edu/welcome.msg 171.64.71.195 20081121190026 no-type 56 ***** ***** Stanford Knowledge Systems Laboratory ***** There is no header, only body content. When heritrix encounters an error trying to download a file, for example: 550 foo: Permission denied. it throws an exception which propagates to the logs: 11/21/2008 19:00:45 +0000 SEVERE org.archive.crawler.fetcher.FetchFTP innerProcess FTP server reported problem. org.archive.net.FTPException: FTP error code: 550 at org.archive.net.ClientFTP.openDataConnection(ClientFTP.java:130) at org.archive.crawler.fetcher.FetchFTP.fetch(FetchFTP.java:312) at org.archive.crawler.fetcher.FetchFTP.innerProcess(FetchFTP.java:252) at org.archive.crawler.framework.Processor.process(Processor.java:112) at org.archive.crawler.framework.ToeThread.processCrawlUri(ToeThread.java:302) at org.archive.crawler.framework.ToeThread.run(ToeThread.java:151) Heritrix still tries to write to the ARC, but fails because there is no content: 11/21/2008 19:00:45 +0000 SEVERE org.archive.crawler.framework.ToeThread recoverableProblem Problem java.lang.NullPointerException occured when trying to process 'ftp://ftp.ksl.stanford.edu/dev/ticotsord' at step ABOUT_TO_BEGIN_PROCESSOR in Archiver java.lang.NullPointerException at org.archive.crawler.writer.ARCWriterProcessor.innerProcess(ARCWriterProcessor.java:122) at org.archive.crawler.framework.Processor.process(Processor.java:112) at org.archive.crawler.framework.ToeThread.processCrawlUri(ToeThread.java:302) at org.archive.crawler.framework.ToeThread.run(ToeThread.java:151) So there is no record in the arc file at all. But this "550 foo: Permission denied." is essentially equivalent to a HTTP 403. It should be archived somehow and should not spew stack traces in the logs. So I propose we include a "header" section in the arc for ftp transactions. "550 foo: Permission denied." would go there. On a successful get, the message would be something like "150 Binary data connection for /welcome.msg (76.103.251.45,57342) (56 bytes)." Would this break anything?

    JIRA | 8 years ago | Noah Levitt
    org.archive.net.FTPException: FTP error code: 550

    Root Cause Analysis

    1. org.archive.net.FTPException

      FTP error code: 550

      at org.archive.net.ClientFTP.openDataConnection()
    2. webarchive-commons
      ClientFTP.openDataConnection
      1. org.archive.net.ClientFTP.openDataConnection(ClientFTP.java:130)
      1 frame
    3. org.archive.crawler
      ToeThread.run
      1. org.archive.crawler.fetcher.FetchFTP.fetch(FetchFTP.java:312)
      2. org.archive.crawler.fetcher.FetchFTP.innerProcess(FetchFTP.java:252)
      3. org.archive.crawler.framework.Processor.process(Processor.java:112)
      4. org.archive.crawler.framework.ToeThread.processCrawlUri(ToeThread.java:302)
      5. org.archive.crawler.framework.ToeThread.run(ToeThread.java:151)
      5 frames