java.lang.RuntimeException: After retry (Offset 218)

JIRA | Olaf Freyer | 1 decade ago
  1. 0

    Dear IA-Team,

    it seems there is yet another issue with WARC files in Heritrix-1.12.0. I'm unable to read non-compressed WARC files with the current release. This happens both when I write non-compressed WARC files directly and when I uncompress compressed WARC files (which I was able to read before uncompressing them).

        sh warcreader -f dump /heritrix/jobs/working3-20070319123718924/warcs/IAH-20070319123730-00002-t5.warc

        {content-type=text/plain, reader-identifier=/heritrix/jobs/working3-20070319123718924/warcs/IAH-20070319123730-00002-t5.warc, absolute-offset=0, subject-uri=urn:uuid:4806edc7-9244-4d70-af1d-1d6ff3ddca75, record-identifier=urn:uuid:4806edc7-9244-4d70-af1d-1d6ff3ddca75, length=216, creation-date=20070319123730, type=warcinfo, Filename=IAH-20070319123730-00002-t5.warc, version=0.10}
        TODO: Unimplemented
        19.03.2007 13:47:27 org.archive.io.ArchiveReader$ArchiveRecordIterator hasNext
        WARNUNG: Trying skip of failed record cleanup of {content-type=text/plain, reader-identifier=/heritrix/jobs/working3-20070319123718924/warcs/IAH-20070319123730-00002-t5.warc, absolute-offset=0, subject-uri=urn:uuid:4806edc7-9244-4d70-af1d-1d6ff3ddca75, record-identifier=urn:uuid:4806edc7-9244-4d70-af1d-1d6ff3ddca75, length=216, creation-date=20070319123730, type=warcinfo, Filename=IAH-20070319123730-00002-t5.warc, version=0.10}: Unexpected character a(Expecting d)
        19.03.2007 13:47:27 org.archive.io.ArchiveReader$ArchiveRecordIterator hasNext
        WARNUNG: Trying skip of failed record cleanup of {content-type=text/plain, reader-identifier=/heritrix/jobs/working3-20070319123718924/warcs/IAH-20070319123730-00002-t5.warc, absolute-offset=0, subject-uri=urn:uuid:4806edc7-9244-4d70-af1d-1d6ff3ddca75, record-identifier=urn:uuid:4806edc7-9244-4d70-af1d-1d6ff3ddca75, length=216, creation-date=20070319123730, type=warcinfo, Filename=IAH-20070319123730-00002-t5.warc, version=0.10}: Unexpected character 41(Expecting d)
        19.03.2007 13:47:27 org.archive.io.ArchiveReader$ArchiveRecordIterator next
        WARNUNG: Bad Record. Trying skip (Current offset 218): Unexpected character 57(Expecting d)
        Exception processing /heritrix/jobs/working3-20070319123718924/warcs/IAH-20070319123730-00002-t5.warc: After retry (Offset 218)
        java.lang.RuntimeException: After retry (Offset 218)
            at org.archive.io.ArchiveReader$ArchiveRecordIterator.next(ArchiveReader.java:529)
            at org.archive.io.ArchiveReader$ArchiveRecordIterator.next(ArchiveReader.java:455)
            at org.archive.io.warc.v10.WARCReader.dump(WARCReader.java:106)
            at org.archive.io.ArchiveReader.output(ArchiveReader.java:649)
            at org.archive.io.warc.v10.WARCReader.output(WARCReader.java:157)
            at org.archive.io.warc.v10.WARCReader.main(WARCReader.java:301)
        Caused by: java.io.IOException: Unexpected character 52(Expecting d)
            at org.archive.io.warc.v10.WARCReader.readExpectedChar(WARCReader.java:82)
            at org.archive.io.warc.v10.WARCReader.gotoEOR(WARCReader.java:72)
            at org.archive.io.ArchiveReader.cleanupCurrentRecord(ArchiveReader.java:192)
            at org.archive.io.ArchiveReader.get(ArchiveReader.java:142)
            at org.archive.io.ArchiveReader$ArchiveRecordIterator.innerNext(ArchiveReader.java:579)
            at org.archive.io.ArchiveReader$ArchiveRecordIterator.exceptionNext(ArchiveReader.java:554)
            at org.archive.io.ArchiveReader$ArchiveRecordIterator.next(ArchiveReader.java:522)
            ... 5 more

    Basically the same issue exists for the v12 WARCReader, too...

    Regards
    Olaf Freyer
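
    The "Unexpected character X(Expecting d)" messages in the log are easier to read once the values are decoded. The sketch below assumes that both values are printed as hexadecimal byte values; the report does not say so. Under that assumption, d would be 0x0d, a carriage return. HexCharDecoder and describe are made-up names for illustration and are not part of Heritrix.

```java
// Hypothetical helper, not part of Heritrix. It decodes the values printed in
// "Unexpected character X(Expecting Y)" under the ASSUMPTION that they are hex.
class HexCharDecoder {

    /** Returns a readable name for a byte value given as a hex string. */
    static String describe(String hex) {
        int value = Integer.parseInt(hex, 16);
        switch (value) {
            case 0x0d:
                return "CR";
            case 0x0a:
                return "LF";
            default:
                // Printable ASCII is shown as the quoted character itself.
                if (value >= 0x20 && value < 0x7f) {
                    return "'" + (char) value + "'";
                }
                return "0x" + hex;
        }
    }

    public static void main(String[] args) {
        // The expected value first, then the unexpected ones in log order.
        for (String hex : new String[] {"d", "a", "41", "57", "52"}) {
            System.out.println(hex + " -> " + describe(hex));
        }
    }
}
```

    Read this way, the reader wanted a CR and instead saw LF, 'A', 'W' and 'R'. That would be consistent with it landing on or near the text of a record header rather than on a record terminator, but this is an interpretation and not something the report confirms.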


    Root Cause Analysis

    1. java.io.IOException

      Unexpected character 52(Expecting d)

      at org.archive.io.warc.v10.WARCReader.readExpectedChar()
    2. org.archive.io
      WARCReader.gotoEOR
      1. org.archive.io.warc.v10.WARCReader.readExpectedChar(WARCReader.java:82)
      2. org.archive.io.warc.v10.WARCReader.gotoEOR(WARCReader.java:72)
      2 frames
    3. webarchive-commons
      ArchiveReader$ArchiveRecordIterator.next
      1. org.archive.io.ArchiveReader.cleanupCurrentRecord(ArchiveReader.java:192)
      2. org.archive.io.ArchiveReader.get(ArchiveReader.java:142)
      3. org.archive.io.ArchiveReader$ArchiveRecordIterator.innerNext(ArchiveReader.java:579)
      4. org.archive.io.ArchiveReader$ArchiveRecordIterator.exceptionNext(ArchiveReader.java:554)
      5. org.archive.io.ArchiveReader$ArchiveRecordIterator.next(ArchiveReader.java:522)
      6. org.archive.io.ArchiveReader$ArchiveRecordIterator.next(ArchiveReader.java:455)
      6 frames
    4. org.archive.io
      WARCReader.dump
      1. org.archive.io.warc.v10.WARCReader.dump(WARCReader.java:106)
      1 frame
    5. webarchive-commons
      ArchiveReader.output
      1. org.archive.io.ArchiveReader.output(ArchiveReader.java:649)
      1 frame
    6. org.archive.io
      WARCReader.main
      1. org.archive.io.warc.v10.WARCReader.output(WARCReader.java:157)
      2. org.archive.io.warc.v10.WARCReader.main(WARCReader.java:301)
      2 frames