java.lang.NullPointerException


  • NullPointerException at org.archive.crawler.processor.recrawl.PersistLogProcessor.finalTasks(PersistLogProcessor.java:87)

    03/09/2009 17:07:46 +0000 INFO org.archive.crawler.admin.CrawlJob postDeregister org.archive.crawler:host=crawling10.us.archive.org,jmxport=9093,mother=h1236289378518,name=1104-20090309170725217,type=CrawlService.Job unregistered from MBeanServerId=crawling10.us.archive.org_1236143748023, SpecificationVersion=1.4, ImplementationVersion=1.6.0_03-b05, SpecificationVendor=Sun Microsystems
    Exception in thread "ToeThread #75: " java.lang.NullPointerException
        at org.archive.crawler.processor.recrawl.PersistLogProcessor.finalTasks(PersistLogProcessor.java:87)
        at org.archive.crawler.framework.CrawlController.runProcessorFinalTasks(CrawlController.java:1676)
        at org.archive.crawler.framework.CrawlController.completeStop(CrawlController.java:1031)
        at org.archive.crawler.admin.CrawlJob$MBeanCrawlController.completeStop(CrawlJob.java:801)
        at org.archive.crawler.framework.CrawlController.toeEnded(CrawlController.java:1817)
        at org.archive.crawler.framework.ToeThread.run(ToeThread.java:186)
    Exception in thread "ToeThread #63: " java.lang.RuntimeException: com.sleepycat.je.DatabaseException: (JE 3.3.75) Can't call Database.sync: Database state can't be DbState.CLOSED must be DbState.OPEN
        at org.archive.crawler.processor.recrawl.PersistOnlineProcessor.finalTasks(PersistOnlineProcessor.java:86)
        at org.archive.crawler.framework.CrawlController.runProcessorFinalTasks(CrawlController.java:1676)
        at org.archive.crawler.framework.CrawlController.completeStop(CrawlController.java:1031)
        at org.archive.crawler.admin.CrawlJob$MBeanCrawlController.completeStop(CrawlJob.java:801)
        at org.archive.crawler.framework.CrawlController.toeEnded(CrawlController.java:1817)
        at org.archive.crawler.framework.ToeThread.run(ToeThread.java:186)
    Caused by: com.sleepycat.je.DatabaseException: (JE 3.3.75) Can't call Database.sync: Database state can't be DbState.CLOSED must be DbState.OPEN
        at com.sleepycat.je.Database.checkRequiredDbState(Database.java:1458)
        at com.sleepycat.je.Database.sync(Database.java:424)
        at org.archive.crawler.processor.recrawl.PersistOnlineProcessor.finalTasks(PersistOnlineProcessor.java:83)
        ... 5 more
    Exception in thread "ToeThread #61: " java.lang.RuntimeException: com.sleepycat.je.DatabaseException: (JE 3.3.75) Can't call Database.sync: Database state can't be DbState.CLOSED must be DbState.OPEN
        at org.archive.crawler.processor.recrawl.PersistOnlineProcessor.finalTasks(PersistOnlineProcessor.java:86)
        at org.archive.crawler.framework.CrawlController.runProcessorFinalTasks(CrawlController.java:1676)
        at org.archive.crawler.framework.CrawlController.completeStop(CrawlController.java:1031)
        at org.archive.crawler.admin.CrawlJob$MBeanCrawlController.completeStop(CrawlJob.java:801)
        at org.archive.crawler.framework.CrawlController.toeEnded(CrawlController.java:1817)
        at org.archive.crawler.framework.ToeThread.run(ToeThread.java:186)
    Caused by: com.sleepycat.je.DatabaseException: (JE 3.3.75) Can't call Database.sync: Database state can't be DbState.CLOSED must be DbState.OPEN
        at com.sleepycat.je.Database.checkRequiredDbState(Database.java:1458)
        at com.sleepycat.je.Database.sync(Database.java:424)
        at org.archive.crawler.processor.recrawl.PersistOnlineProcessor.finalTasks(PersistOnlineProcessor.java:83)
        ... 5 more
    Exception in thread "ToeThread #64: " java.lang.RuntimeException: com.sleepycat.je.DatabaseException: (JE 3.3.75) Can't call Database.sync: Database state can't be DbState.CLOSED must be DbState.OPEN
        at org.archive.crawler.processor.recrawl.PersistOnlineProcessor.finalTasks(PersistOnlineProcessor.java:86)
        at org.archive.crawler.framework.CrawlController.runProcessorFinalTasks(CrawlController.java:1676)
        at org.archive.crawler.framework.CrawlController.completeStop(CrawlController.java:1031)
        at org.archive.crawler.admin.CrawlJob$MBeanCrawlController.completeStop(CrawlJob.java:801)
        at org.archive.crawler.framework.CrawlController.toeEnded(CrawlController.java:1817)
        at org.archive.crawler.framework.ToeThread.run(ToeThread.java:186)
    Caused by: com.sleepycat.je.DatabaseException: (JE 3.3.75) Can't call Database.sync: Database state can't be DbState.CLOSED must be DbState.OPEN
        at com.sleepycat.je.Database.checkRequiredDbState(Database.java:1458)
        at com.sleepycat.je.Database.sync(Database.java:424)
        at org.archive.crawler.processor.recrawl.PersistOnlineProcessor.finalTasks(PersistOnlineProcessor.java:83)
        ... 5 more
    Exception in thread "ToeThread #60: " java.lang.RuntimeException: com.sleepycat.je.DatabaseException: (JE 3.3.75) Can't call Database.sync: Database state can't be DbState.CLOSED must be DbState.OPEN
        at org.archive.crawler.processor.recrawl.PersistOnlineProcessor.finalTasks(PersistOnlineProcessor.java:86)
        at org.archive.crawler.framework.CrawlController.runProcessorFinalTasks(CrawlController.java:1676)
        at org.archive.crawler.framework.CrawlController.completeStop(CrawlController.java:1031)
    via by Noah Levitt,
  • FTP entries in an ARC file currently look like this:

    ftp://ftp.ksl.stanford.edu/welcome.msg 171.64.71.195 20081121190026 no-type 56
    ***** ***** Stanford Knowledge Systems Laboratory *****

    There is no header, only body content. When Heritrix encounters an error trying to download a file, for example:

    550 foo: Permission denied.

    it throws an exception which propagates to the logs:

    11/21/2008 19:00:45 +0000 SEVERE org.archive.crawler.fetcher.FetchFTP innerProcess FTP server reported problem.
    org.archive.net.FTPException: FTP error code: 550
        at org.archive.net.ClientFTP.openDataConnection(ClientFTP.java:130)
        at org.archive.crawler.fetcher.FetchFTP.fetch(FetchFTP.java:312)
        at org.archive.crawler.fetcher.FetchFTP.innerProcess(FetchFTP.java:252)
        at org.archive.crawler.framework.Processor.process(Processor.java:112)
        at org.archive.crawler.framework.ToeThread.processCrawlUri(ToeThread.java:302)
        at org.archive.crawler.framework.ToeThread.run(ToeThread.java:151)

    Heritrix still tries to write to the ARC, but fails because there is no content:

    11/21/2008 19:00:45 +0000 SEVERE org.archive.crawler.framework.ToeThread recoverableProblem Problem java.lang.NullPointerException occured when trying to process 'ftp://ftp.ksl.stanford.edu/dev/ticotsord' at step ABOUT_TO_BEGIN_PROCESSOR in Archiver
    java.lang.NullPointerException
        at org.archive.crawler.writer.ARCWriterProcessor.innerProcess(ARCWriterProcessor.java:122)
        at org.archive.crawler.framework.Processor.process(Processor.java:112)
        at org.archive.crawler.framework.ToeThread.processCrawlUri(ToeThread.java:302)
        at org.archive.crawler.framework.ToeThread.run(ToeThread.java:151)

    So there is no record in the ARC file at all. But this "550 foo: Permission denied." is essentially equivalent to an HTTP 403. It should be archived somehow and should not spew stack traces in the logs. So I propose we include a "header" section in the ARC for FTP transactions. "550 foo: Permission denied." would go there. On a successful get, the message would be something like "150 Binary data connection for /welcome.msg (76.103.251.45,57342) (56 bytes)." Would this break anything?
    via by Noah Levitt,
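
    To make the proposal above concrete, here is a minimal sketch of a record whose content starts with the FTP reply line as a "header", followed by a blank line and then the body, if any. It is an illustration only: `FtpRecordSketch`, `FtpResult` and `toRecordContent` are hypothetical names, and neither the layout nor the code is taken from Heritrix or from the ARC format specification. The point is that a refused transfer (no body) still yields non-null content to write.

```java
import java.nio.charset.StandardCharsets;

public class FtpRecordSketch {
    /** Hypothetical holder for one FTP transaction: reply line plus optional body. */
    static final class FtpResult {
        final String replyLine; // e.g. "550 foo: Permission denied."
        final byte[] body;      // null when the server refused the transfer

        FtpResult(String replyLine, byte[] body) {
            this.replyLine = replyLine;
            this.body = body;
        }
    }

    /** Builds record content: reply line as header, a blank line, then the body if present. */
    static byte[] toRecordContent(FtpResult result) {
        byte[] header = (result.replyLine + "\r\n\r\n").getBytes(StandardCharsets.US_ASCII);
        // A missing body becomes an empty array, so the writer never sees null content.
        byte[] body = (result.body == null) ? new byte[0] : result.body;
        byte[] out = new byte[header.length + body.length];
        System.arraycopy(header, 0, out, 0, header.length);
        System.arraycopy(body, 0, out, header.length, body.length);
        return out;
    }

    public static void main(String[] args) {
        FtpResult denied = new FtpResult("550 foo: Permission denied.", null);
        byte[] content = toRecordContent(denied);
        // The failed fetch still produces a header-only record.
        System.out.print(new String(content, StandardCharsets.US_ASCII));
    }
}
```

    Whether existing ARC readers would tolerate such a header for FTP records is exactly the open question raised in the report; the sketch does not answer it.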
  • The following exception stack occurred when terminating a small test crawl via the web UI. A subsequent crawl with the same settings terminated normally.

    com.sleepycat.util.RuntimeExceptionWrapper: (JE 3.2.23) Can't open a cursor Database state can't be DbState.CLOSED must be DbState.OPEN
        at com.sleepycat.collections.StoredContainer.convertException(StoredContainer.java:447)
        at com.sleepycat.collections.BlockIterator.hasNext(BlockIterator.java:380)
        at org.apache.commons.httpclient.cookie.CookieSpecBase.match(CookieSpecBase.java:607)
        at org.apache.commons.httpclient.HttpMethodBase.addCookieRequestHeader(HttpMethodBase.java:1193)
        at org.apache.commons.httpclient.HttpMethodBase.addRequestHeaders(HttpMethodBase.java:1327)
        at org.apache.commons.httpclient.HttpMethodBase.writeRequestHeaders(HttpMethodBase.java:2056)
        at org.apache.commons.httpclient.HttpMethodBase.writeRequest(HttpMethodBase.java:1939)
        at org.apache.commons.httpclient.HttpMethodBase.execute(HttpMethodBase.java:1000)
        at org.archive.httpclient.HttpRecorderGetMethod.execute(HttpRecorderGetMethod.java:116)
        at org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:397)
        at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:170)
        at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:396)
        at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:346)
        at org.archive.crawler.fetcher.FetchHTTP.innerProcess(FetchHTTP.java:500)
        at org.archive.crawler.framework.Processor.process(Processor.java:112)
        at org.archive.crawler.framework.ToeThread.processCrawlUri(ToeThread.java:302)
        at org.archive.crawler.framework.ToeThread.run(ToeThread.java:151)
    Caused by: com.sleepycat.je.DatabaseException: (JE 3.2.23) Can't open a cursor Database state can't be DbState.CLOSED must be DbState.OPEN
        at com.sleepycat.je.Database.checkRequiredDbState(Database.java:1069)
        at com.sleepycat.je.Database.openCursor(Database.java:359)
        at com.sleepycat.collections.CurrentTransaction.openCursor(CurrentTransaction.java:364)
        at com.sleepycat.collections.MyRangeCursor.openCursor(MyRangeCursor.java:53)
        at com.sleepycat.collections.MyRangeCursor.<init>(MyRangeCursor.java:30)
        at com.sleepycat.collections.DataCursor.init(DataCursor.java:171)
        at com.sleepycat.collections.DataCursor.<init>(DataCursor.java:59)
        at com.sleepycat.collections.BlockIterator.hasNext(BlockIterator.java:299)
        ... 15 more
    07/05/2007 21:02:25 +0000 SEVERE org.archive.crawler.framework.ToeThread recoverableProblem Problem com.sleepycat.util.RuntimeExceptionWrapper: (JE 3.2.23) Can't open a cursor Database state can't be DbState.CLOSED must be DbState.OPEN occured when trying to process 'http://www.landsbokasafn.is/Apps/WebObjects/HI.woa/wa/header_logo_neg.gif' at step ABOUT_TO_BEGIN_PROCESSOR in HTTP
    com.sleepycat.util.RuntimeExceptionWrapper: (JE 3.2.23) Can't open a cursor Database state can't be DbState.CLOSED must be DbState.OPEN
        at com.sleepycat.collections.StoredContainer.convertException(StoredContainer.java:447)
        at com.sleepycat.collections.BlockIterator.hasNext(BlockIterator.java:380)
        at org.apache.commons.httpclient.cookie.CookieSpecBase.match(CookieSpecBase.java:607)
        at org.apache.commons.httpclient.HttpMethodBase.addCookieRequestHeader(HttpMethodBase.java:1193)
        at org.apache.commons.httpclient.HttpMethodBase.addRequestHeaders(HttpMethodBase.java:1327)
        at org.apache.commons.httpclient.HttpMethodBase.writeRequestHeaders(HttpMethodBase.java:2056)
        at org.apache.commons.httpclient.HttpMethodBase.writeRequest(HttpMethodBase.java:1939)
        at org.apache.commons.httpclient.HttpMethodBase.execute(HttpMethodBase.java:1000)
        at org.archive.httpclient.HttpRecorderGetMethod.execute(HttpRecorderGetMethod.java:116)
        at org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:397)
        at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:170)
        at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:396)
        at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:346)
        at org.archive.crawler.fetcher.FetchHTTP.innerProcess(FetchHTTP.java:500)
        at org.archive.crawler.framework.Processor.process(Processor.java:112)
        at org.archive.crawler.framework.ToeThread.processCrawlUri(ToeThread.java:302)
        at org.archive.crawler.framework.ToeThread.run(ToeThread.java:151)
    Caused by: com.sleepycat.je.DatabaseException: (JE 3.2.23) Can't open a cursor Database state can't be DbState.CLOSED must be DbState.OPEN
        at com.sleepycat.je.Database.checkRequiredDbState(Database.java:1069)
        at com.sleepycat.je.Database.openCursor(Database.java:359)
        at com.sleepycat.collections.CurrentTransaction.openCursor(CurrentTransaction.java:364)
        at com.sleepycat.collections.MyRangeCursor.openCursor(MyRangeCursor.java:53)
        at com.sleepycat.collections.MyRangeCursor.<init>(MyRangeCursor.java:30)
        at com.sleepycat.collections.DataCursor.init(DataCursor.java:171)
        at com.sleepycat.collections.DataCursor.<init>(DataCursor.java:59)
        at com.sleepycat.collections.BlockIterator.hasNext(BlockIterator.java:299)
        ... 15 more
    07/05/2007 21:02:25 +0000 SEVERE org.archive.crawler.framework.ToeThread run Fatal exception in ToeThread #29: http://www.landsbokasafn.is/Apps/WebObjects/HI.woa/wa/header_logo_neg.gif
    java.lang.NullPointerException
        at org.archive.crawler.framework.ToeThread.run(ToeThread.java:157)
    via by Kristinn Sigurðsson,
  • From Kris: It's me again :-) Discovered a potential NPE when terminating a job. The Frontier hangs around for the threads to finish (at least it is supposed to) but, as the following stack trace shows, the CrawlController or more likely the CrawlScope (unsure which) does not:

    java.lang.NullPointerException
        at org.archive.crawler.postprocessor.Postselector.schedule(Postselector.java:269)
        at org.archive.crawler.postprocessor.Postselector.handleLinkCollection(Postselector.java:358)
        at org.archive.crawler.postprocessor.Postselector.innerProcess(Postselector.java:166)
        at org.archive.crawler.framework.Processor.process(Processor.java:102)
        at org.archive.crawler.framework.ToeThread.processCrawlUri(ToeThread.java:255)
        at org.archive.crawler.framework.ToeThread.run(ToeThread.java:131)
    Exception in thread "ToeThread #5" java.lang.NullPointerException
        at org.archive.crawler.framework.ToeThread.run(ToeThread.java:137)

    The following patch 'handles' it (if in a very simplistic way): it basically just catches the NPE and decides that the URI is not within scope if this occurs. Maybe the CrawlController (and it should not be null) should throw an EndedException on getScope() when the crawl has been terminated? Not really a big bug, but we really should have the crawler finish at least semi-gracefully, although I should note that it did not prevent the crawl reports from being written.

    - Kris

    Index: Postselector.java
    ===================================================================
    RCS file: /cvsroot/archive-crawler/ArchiveOpenCrawler/src/java/org/archive/crawler/postprocessor/Postselector.java,v
    retrieving revision 1.13
    diff -u -r1.13 Postselector.java
    --- Postselector.java 27 Oct 2004 00:47:23 -0000 1.13
    +++ Postselector.java 17 Nov 2004 15:45:25 -0000
    @@ -266,21 +266,26 @@
          * @return true if CandidateURI was accepted by crawl scope, false otherwise
          */
         private boolean schedule(CandidateURI caUri) {
    -        if(getController().getScope().accepts(caUri)) {
    -            logger.finer("Accepted: "+caUri);
    -            getController().getFrontier().schedule(caUri);
    -            return true;
    -        } else {
    -            // Run the curi through another set of filters to see
    -            // if we should log it to the scope rejection log.
    -            if (logger.isLoggable(Level.INFO)) {
    -                CrawlURI curi = (caUri instanceof CrawlURI)?
    -                    (CrawlURI)caUri: new CrawlURI(caUri.getUURI());
    -                if (filtersAccept(this.rejectLogFilters, curi)) {
    -                    logger.info("Rejected " + curi.getUURI().toString());
    +        try{
    +            if(getController().getScope().accepts(caUri)) {
    +                logger.finer("Accepted: "+caUri);
    +                getController().getFrontier().schedule(caUri);
    +                return true;
    +            } else {
    +                // Run the curi through another set of filters to see
    +                // if we should log it to the scope rejection log.
    +                if (logger.isLoggable(Level.INFO)) {
    +                    CrawlURI curi = (caUri instanceof CrawlURI)?
    +                        (CrawlURI)caUri: new CrawlURI(caUri.getUURI());
    +                    if (filtersAccept(this.rejectLogFilters, curi)) {
    +                        logger.info("Rejected " + curi.getUURI().toString());
    +                    }
                     }
                 }
    +        } catch(NullPointerException e){
    +            // Return false if this happens. Most likely the crawl is ending.
             }
    +        return false;
         }
    via by Michael Stack,
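
    As an alternative to catching the NPE, the same "treat the URI as out of scope when the crawl is ending" decision can be made with an explicit null check. The sketch below only illustrates that pattern: `ScheduleGuardSketch` and its `Controller` class are hypothetical stand-ins (a `Predicate<String>` plays the role of the scope and a plain list plays the role of the frontier), not the Heritrix classes, and it assumes the scope reference is what becomes null at shutdown.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

public class ScheduleGuardSketch {
    /** Hypothetical stand-in for the controller; scope is assumed to become null once the crawl ends. */
    static final class Controller {
        volatile Predicate<String> scope;
        final List<String> frontier = new ArrayList<>(); // single-threaded demo only
    }

    /** Returns true if the URI was accepted and scheduled, false otherwise. */
    static boolean schedule(Controller controller, String uri) {
        // Read the scope once, so a concurrent teardown cannot null it between check and use.
        Predicate<String> scope = (controller == null) ? null : controller.scope;
        if (scope == null) {
            return false; // crawl is most likely ending; treat the URI as out of scope
        }
        if (scope.test(uri)) {
            controller.frontier.add(uri);
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        Controller controller = new Controller();
        controller.scope = uri -> uri.startsWith("http://example.org/");
        System.out.println(schedule(controller, "http://example.org/a")); // accepted
        controller.scope = null; // simulate the crawl being terminated
        System.out.println(schedule(controller, "http://example.org/b")); // rejected, no NPE
    }
}
```

    Unlike a broad `catch (NullPointerException e)`, this does not hide unrelated NPEs thrown from inside the scope or frontier code.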
    • java.lang.NullPointerException
          at org.archive.crawler.processor.recrawl.PersistLogProcessor.finalTasks(PersistLogProcessor.java:87)
          at org.archive.crawler.framework.CrawlController.runProcessorFinalTasks(CrawlController.java:1676)
          at org.archive.crawler.framework.CrawlController.completeStop(CrawlController.java:1031)
          at org.archive.crawler.admin.CrawlJob$MBeanCrawlController.completeStop(CrawlJob.java:801)
          at org.archive.crawler.framework.CrawlController.toeEnded(CrawlController.java:1817)
          at org.archive.crawler.framework.ToeThread.run(ToeThread.java:186)
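
    In the first report above, several ToeThreads (#75, #63, #61, #64, #60) each reach finalTasks() through toeEnded() and completeStop(), and the later ones find the database already closed. One common way to make such shutdown work safe to call repeatedly is a run-once guard. The sketch below shows that pattern only; `RunOnceFinalTasks` is a hypothetical class, the counter stands in for syncing and closing a real resource, and nothing here is claimed to be the fix that Heritrix adopted.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

public class RunOnceFinalTasks {
    private final AtomicBoolean finished = new AtomicBoolean(false);
    private final AtomicInteger syncCalls = new AtomicInteger();

    /** Safe to call from many threads; only the first caller does the work. */
    public void finalTasks() {
        if (!finished.compareAndSet(false, true)) {
            return; // another thread already ran (or is running) the shutdown work
        }
        syncCalls.incrementAndGet(); // stands in for syncing and closing a resource
    }

    public int syncCalls() {
        return syncCalls.get();
    }

    public static void main(String[] args) throws InterruptedException {
        RunOnceFinalTasks processor = new RunOnceFinalTasks();
        Thread[] threads = new Thread[5];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(processor::finalTasks, "ToeThread #" + i);
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        System.out.println(processor.syncCalls()); // prints 1
    }
}
```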