org.apache.commons.httpclient.URIException: Invalid URL encoding

JIRA | Gordon Mohr | 8 years ago
  1. 0

    In a local-machine broad-but-shallow test crawl with H1 TRUNK, got this flurry of exceptions in _out and matching in-UI alerts. If the URIs are this illegal, they shouldn't have reached frontier scheduling; and even if they are illegal, the getServerKey fallback behavior may be insufficient, and the redundant errors for a single URI are excessive. May affect H2/H3 too. However, besides the distracting errors, this does not appear to cause other problems.

    2009-09-21 21:26:21.156 SEVERE thread-19 org.archive.crawler.datamodel.ServerCache.getServerFor() Invalid URL encoding: 4:http://gt%1$d.google.com/mt?v/x3dgwh.fresh/x26
    org.apache.commons.httpclient.URIException: Invalid URL encoding
        at org.apache.commons.httpclient.URI.decode(URI.java:1775)
        at org.apache.commons.httpclient.URI.decode(URI.java:1731)
        at org.apache.commons.httpclient.URI.getCurrentHierPath(URI.java:3027)
        at org.archive.crawler.datamodel.CrawlServer.getServerKey(CrawlServer.java:360)
        at org.archive.crawler.datamodel.ServerCache.getServerFor(ServerCache.java:120)
        at org.archive.crawler.frontier.AbstractFrontier.tally(AbstractFrontier.java:429)
        at org.archive.crawler.frontier.AbstractFrontier.doJournalAdded(AbstractFrontier.java:451)
        at org.archive.crawler.frontier.WorkQueueFrontier.receive(WorkQueueFrontier.java:446)
        at org.archive.crawler.util.SetBasedUriUniqFilter.add(SetBasedUriUniqFilter.java:90)
        at org.archive.crawler.frontier.WorkQueueFrontier.schedule(WorkQueueFrontier.java:428)
        at org.archive.crawler.postprocessor.FrontierScheduler.schedule(FrontierScheduler.java:92)
        at org.archive.crawler.postprocessor.FrontierScheduler.innerProcess(FrontierScheduler.java:78)
        at org.archive.crawler.framework.Processor.process(Processor.java:109)
        at org.archive.crawler.framework.ToeThread.processCrawlUri(ToeThread.java:302)
        at org.archive.crawler.framework.ToeThread.run(ToeThread.java:151)

    2009-09-21 21:26:21.171 SEVERE thread-19 org.archive.crawler.datamodel.ServerCache.getServerFor() Invalid URL encoding: 4:http://gt%1$d.google.com/mt?v/x3dgwm.fresh/x26
    org.apache.commons.httpclient.URIException: Invalid URL encoding
        at org.apache.commons.httpclient.URI.decode(URI.java:1775)
        at org.apache.commons.httpclient.URI.decode(URI.java:1731)
        at org.apache.commons.httpclient.URI.getCurrentHierPath(URI.java:3027)
        at org.archive.crawler.datamodel.CrawlServer.getServerKey(CrawlServer.java:360)
        at org.archive.crawler.datamodel.ServerCache.getServerFor(ServerCache.java:120)
        at org.archive.crawler.frontier.AbstractFrontier.tally(AbstractFrontier.java:429)
        at org.archive.crawler.frontier.AbstractFrontier.doJournalAdded(AbstractFrontier.java:451)
        at org.archive.crawler.frontier.WorkQueueFrontier.receive(WorkQueueFrontier.java:446)
        at org.archive.crawler.util.SetBasedUriUniqFilter.add(SetBasedUriUniqFilter.java:90)
        at org.archive.crawler.frontier.WorkQueueFrontier.schedule(WorkQueueFrontier.java:428)
        at org.archive.crawler.postprocessor.FrontierScheduler.schedule(FrontierScheduler.java:92)
        at org.archive.crawler.postprocessor.FrontierScheduler.innerProcess(FrontierScheduler.java:78)
        at org.archive.crawler.framework.Processor.process(Processor.java:109)
        at org.archive.crawler.framework.ToeThread.processCrawlUri(ToeThread.java:302)
        at org.archive.crawler.framework.ToeThread.run(ToeThread.java:151)

    2009-09-21 21:26:45.828 SEVERE thread-25 org.archive.crawler.datamodel.ServerCache.getServerFor() Invalid URL encoding: invalid:4:http://gt%1$d.google.com/mt?v/x3dgwh.fresh/x26
    org.apache.commons.httpclient.URIException: Invalid URL encoding
        at org.apache.commons.httpclient.URI.decode(URI.java:1775)
        at org.apache.commons.httpclient.URI.decode(URI.java:1731)
        at org.apache.commons.httpclient.URI.getCurrentHierPath(URI.java:3027)
        at org.archive.crawler.datamodel.CrawlServer.getServerKey(CrawlServer.java:360)
        at org.archive.crawler.datamodel.ServerCache.getServerFor(ServerCache.java:120)
        at org.archive.crawler.prefetch.PreconditionEnforcer.considerDnsPreconditions(PreconditionEnforcer.java:227)
        at org.archive.crawler.prefetch.PreconditionEnforcer.innerProcess(PreconditionEnforcer.java:111)
        at org.archive.crawler.framework.Processor.process(Processor.java:109)
        at org.archive.crawler.framework.ToeThread.processCrawlUri(ToeThread.java:302)
        at org.archive.crawler.framework.ToeThread.run(ToeThread.java:151)

    2009-09-21 21:26:45.828 SEVERE thread-25 org.archive.crawler.datamodel.ServerCache.getServerFor() Invalid URL encoding: invalid:4:http://gt%1$d.google.com/mt?v/x3dgwh.fresh/x26
    org.apache.commons.httpclient.URIException: Invalid URL encoding
        at org.apache.commons.httpclient.URI.decode(URI.java:1775)
        at org.apache.commons.httpclient.URI.decode(URI.java:1731)
        at org.apache.commons.httpclient.URI.getCurrentHierPath(URI.java:3027)
        at org.archive.crawler.datamodel.CrawlServer.getServerKey(CrawlServer.java:360)
        at org.archive.crawler.datamodel.ServerCache.getServerFor(ServerCache.java:120)
        at org.archive.crawler.postprocessor.CrawlStateUpdater.innerProcess(CrawlStateUpdater.java:61)
        at org.archive.crawler.framework.Processor.process(Processor.java:109)
        at org.archive.crawler.framework.ToeThread.processCrawlUri(ToeThread.java:302)
        at org.archive.crawler.framework.ToeThread.run(ToeThread.java:151)

    2009-09-21 21:26:45.828 SEVERE thread-25 org.archive.crawler.datamodel.ServerCache.getServerFor() Invalid URL encoding: invalid:4:http://gt%1$d.google.com/mt?v/x3dgwh.fresh/x26
    org.apache.commons.httpclient.URIException: Invalid URL encoding
        at org.apache.commons.httpclient.URI.decode(URI.java:1775)
        at org.apache.commons.httpclient.URI.decode(URI.java:1731)
        at org.apache.commons.httpclient.URI.getCurrentHierPath(URI.java:3027)
        at org.archive.crawler.datamodel.CrawlServer.getServerKey(CrawlServer.java:360)
        at org.archive.crawler.datamodel.ServerCache.getServerFor(ServerCache.java:120)
        at org.archive.crawler.frontier.AbstractFrontier.tally(AbstractFrontier.java:429)
        at org.archive.crawler.frontier.AbstractFrontier.doJournalFinishedFailure(AbstractFrontier.java:465)
        at org.archive.crawler.frontier.WorkQueueFrontier.finished(WorkQueueFrontier.java:918)
        at org.archive.crawler.framework.ToeThread.run(ToeThread.java:157)

    2009-09-21 21:26:46.250 SEVERE thread-33 org.archive.crawler.datamodel.ServerCache.getServerFor() Invalid URL encoding: invalid:4:http://gt%1$d.google.com/mt?v/x3dgwm.fresh/x26
    org.apache.commons.httpclient.URIException: Invalid URL encoding
        at org.apache.commons.httpclient.URI.decode(URI.java:1775)
        at org.apache.commons.httpclient.URI.decode(URI.java:1731)
        at org.apache.commons.httpclient.URI.getCurrentHierPath(URI.java:3027)
        at org.archive.crawler.datamodel.CrawlServer.getServerKey(CrawlServer.java:360)
        at org.archive.crawler.datamodel.ServerCache.getServerFor(ServerCache.java:120)
        at org.archive.crawler.prefetch.PreconditionEnforcer.considerDnsPreconditions(PreconditionEnforcer.java:227)
        at org.archive.crawler.prefetch.PreconditionEnforcer.innerProcess(PreconditionEnforcer.java:111)
        at org.archive.crawler.framework.Processor.process(Processor.java:109)
        at org.archive.crawler.framework.ToeThread.processCrawlUri(ToeThread.java:302)
        at org.archive.crawler.framework.ToeThread.run(ToeThread.java:151)

    2009-09-21 21:26:46.250 SEVERE thread-33 org.archive.crawler.datamodel.ServerCache.getServerFor() Invalid URL encoding: invalid:4:http://gt%1$d.google.com/mt?v/x3dgwm.fresh/x26
    org.apache.commons.httpclient.URIException: Invalid URL encoding
        at org.apache.commons.httpclient.URI.decode(URI.java:1775)
        at org.apache.commons.httpclient.URI.decode(URI.java:1731)
        at org.apache.commons.httpclient.URI.getCurrentHierPath(URI.java:3027)
        at org.archive.crawler.datamodel.CrawlServer.getServerKey(CrawlServer.java:360)
        at org.archive.crawler.datamodel.ServerCache.getServerFor(ServerCache.java:120)
        at org.archive.crawler.postprocessor.CrawlStateUpdater.innerProcess(CrawlStateUpdater.java:61)
        at org.archive.crawler.framework.Processor.process(Processor.java:109)
        at org.archive.crawler.framework.ToeThread.processCrawlUri(ToeThread.java:302)
        at org.archive.crawler.framework.ToeThread.run(ToeThread.java:151)

    2009-09-21 21:26:46.250 SEVERE thread-33 org.archive.crawler.datamodel.ServerCache.getServerFor() Invalid URL encoding: invalid:4:http://gt%1$d.google.com/mt?v/x3dgwm.fresh/x26
    org.apache.commons.httpclient.URIException: Invalid URL encoding
        at org.apache.commons.httpclient.URI.decode(URI.java:1775)
        at org.apache.commons.httpclient.URI.decode(URI.java:1731)
        at org.apache.commons.httpclient.URI.getCurrentHierPath(URI.java:3027)
        at org.archive.crawler.datamodel.CrawlServer.getServerKey(CrawlServer.java:360)
        at org.archive.crawler.datamodel.ServerCache.getServerFor(ServerCache.java:120)
        at org.archive.crawler.frontier.AbstractFrontier.tally(AbstractFrontier.java:429)
        at org.archive.crawler.frontier.AbstractFrontier.doJournalFinishedFailure(AbstractFrontier.java:465)
        at org.archive.crawler.frontier.WorkQueueFrontier.finished(WorkQueueFrontier.java:918)
        at org.archive.crawler.framework.ToeThread.run(ToeThread.java:157)

    JIRA | 8 years ago | Gordon Mohr
    org.apache.commons.httpclient.URIException: Invalid URL encoding
  3. 0

    The following appeared in alerts for a 24-hour broad crawl with 10,000 seeds. The crawl continued afterwards with no apparent harm, but please investigate.

    Nov 19, 2007 4:51:23 AM org.archive.modules.net.ServerCacheUtil getServerFor
    SEVERE: 2:(i%60g%U0100'?1:0))}s.sa(un%60Svl_l='%60lID,vmk,ppu,^v,%60l%60fspace,c%60N,%60m (in thread 'org.archive.crawler.frontier.BdbFrontier@1d27069.managerThread')
    org.apache.commons.httpclient.URIException: Invalid URL encoding
        at org.apache.commons.httpclient.URI.decode(URI.java:1767)
        at org.apache.commons.httpclient.URI.decode(URI.java:1723)
        at org.apache.commons.httpclient.URI.getCurrentHierPath(URI.java:3003)
        at org.archive.modules.net.CrawlServer.getServerKey(CrawlServer.java:282)
        at org.archive.modules.net.ServerCacheUtil.getServerFor(ServerCacheUtil.java:54)
        at org.archive.crawler.frontier.AbstractFrontier.tally(AbstractFrontier.java:641)
        at org.archive.crawler.frontier.AbstractFrontier.doJournalAdded(AbstractFrontier.java:668)
        at org.archive.crawler.frontier.WorkQueueFrontier.processScheduleAlways(WorkQueueFrontier.java:267)
        at org.archive.crawler.frontier.AbstractFrontier$ScheduleAlways.process(AbstractFrontier.java:1426)
        at org.archive.crawler.frontier.InEventQueue.doOrEnqueue(InEventQueue.java:108)
        at org.archive.crawler.frontier.AbstractFrontier.doOrEnqueue(AbstractFrontier.java:1395)
        at org.archive.crawler.frontier.AbstractFrontier.receive(AbstractFrontier.java:564)
        at org.archive.crawler.util.SetBasedUriUniqFilter.add(SetBasedUriUniqFilter.java:93)
        at org.archive.crawler.frontier.WorkQueueFrontier.processScheduleIfUnique(WorkQueueFrontier.java:286)
        at org.archive.crawler.frontier.AbstractFrontier$ScheduleIfUnique.process(AbstractFrontier.java:1440)
        at org.archive.crawler.frontier.InEventQueue.drainAndProcess(InEventQueue.java:166)
        at org.archive.crawler.frontier.InEventQueue.doNext(InEventQueue.java:155)
        at org.archive.crawler.frontier.InEventQueue.drainAndProcess(InEventQueue.java:130)
        at org.archive.crawler.frontier.AbstractFrontier.drainInbound(AbstractFrontier.java:457)
        at org.archive.crawler.frontier.AbstractFrontier.managementTasks(AbstractFrontier.java:386)
        at org.archive.crawler.frontier.AbstractFrontier$ManagerThread.run(AbstractFrontier.java:1404)

    JIRA | 9 years ago | Paul Jack
    org.apache.commons.httpclient.URIException: Invalid URL encoding
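
Both reports note that the same bad URI triggers several SEVERE alerts as it passes through different processing steps. One way to quiet that, shown here only as a minimal sketch and not as Heritrix code, is to remember which URI strings have already been reported. The class name `OncePerUriAlert` and the method `shouldReport` are hypothetical names introduced for illustration.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper (not part of Heritrix): reports each URI string at most once.
final class OncePerUriAlert {
    // Thread-safe set, since the traces show several crawler threads hitting the same code path.
    private final Set<String> alreadyReported = ConcurrentHashMap.newKeySet();

    /** Returns true only the first time the given URI string is seen. */
    boolean shouldReport(String uri) {
        // Set.add returns false when the element was already present.
        return alreadyReported.add(uri);
    }

    public static void main(String[] args) {
        OncePerUriAlert alerts = new OncePerUriAlert();
        String uri = "invalid:4:http://gt%1$d.google.com/mt?v/x3dgwh.fresh/x26";
        for (int i = 0; i < 3; i++) {
            if (alerts.shouldReport(uri)) {
                System.out.println("Invalid URL encoding: " + uri);
            }
        }
    }
}
```

The sketch keys on the exact string, so the `4:http://...` and `invalid:4:http://...` forms seen in the log would still be reported separately, and the set grows without bound; a real crawl would need some eviction policy.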

    Root Cause Analysis

    1. org.apache.commons.httpclient.URIException

      Invalid URL encoding

      at org.apache.commons.httpclient.URI.decode()
    2. HttpClient
      URI.getCurrentHierPath
      1. org.apache.commons.httpclient.URI.decode(URI.java:1775)
      2. org.apache.commons.httpclient.URI.decode(URI.java:1731)
      3. org.apache.commons.httpclient.URI.getCurrentHierPath(URI.java:3027)
      3 frames
    3. org.archive.crawler
      ToeThread.run
      1. org.archive.crawler.datamodel.CrawlServer.getServerKey(CrawlServer.java:360)
      2. org.archive.crawler.datamodel.ServerCache.getServerFor(ServerCache.java:120)
      3. org.archive.crawler.postprocessor.CrawlStateUpdater.innerProcess(CrawlStateUpdater.java:61)
      4. org.archive.crawler.framework.Processor.process(Processor.java:109)
      5. org.archive.crawler.framework.ToeThread.processCrawlUri(ToeThread.java:302)
      6. org.archive.crawler.framework.ToeThread.run(ToeThread.java:151)
      6 frames
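
The exception is raised inside `URI.decode()`, and both reported URIs contain a `%` that is not followed by two hexadecimal digits (`%1$` in one, `%U0` in the other), which is most likely what the decoder rejects. The following is a minimal sketch, assuming that is the only failure mode, of a check that could run before a URI is scheduled. The class `PercentEncodingCheck` and its method are hypothetical and use only the JDK; they are not part of commons-httpclient or Heritrix.

```java
// Hypothetical pre-scheduling check (not part of commons-httpclient or Heritrix).
final class PercentEncodingCheck {
    private PercentEncodingCheck() {}

    /** Returns true if every '%' in s is followed by exactly two ASCII hex digits. */
    static boolean hasValidPercentEncoding(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) != '%') {
                continue;
            }
            // A valid escape needs two more characters after the '%'.
            if (i + 2 >= s.length()) {
                return false;
            }
            if (!isHex(s.charAt(i + 1)) || !isHex(s.charAt(i + 2))) {
                return false;
            }
            i += 2; // skip the two digits just checked
        }
        return true;
    }

    private static boolean isHex(char c) {
        return (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f') || (c >= 'A' && c <= 'F');
    }

    public static void main(String[] args) {
        // URIs taken from the reports above, plus one well-formed example.
        System.out.println(hasValidPercentEncoding("http://gt%1$d.google.com/mt?v/x3dgwh.fresh/x26"));
        System.out.println(hasValidPercentEncoding("http://example.com/a%20b"));
    }
}
```

A check like this only flags malformed escapes; it does not decide what the crawler should do with such a URI, which is the open question in the reports.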