org.apache.tika.exception.TikaException

Unable to extract PDF content

Samebug tips0

We couldn't find tips for this exception.

Don't give up yet. Paste your full stack trace to get a solution.

Solutions on the web69

  • via Unknown by Allison, Timothy B.,
  • via Unknown by Unknown author,
  • Stack trace

    • org.apache.tika.exception.TikaException: Unable to extract PDF content at org.apache.tika.parser.pdf.PDF2XHTML.process(PDF2XHTML.java:150) at org.apache.tika.parser.pdf.PDFParser.parse(PDFParser.java:146) at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:256) at org.apache.tika.parser.ParserDecorator.parse(ParserDecorator.java:117) at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:256) at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120) at org.apache.tika.server.TikaResource$5.write(TikaResource.java:368) at org.apache.cxf.jaxrs.provider.BinaryDataProvider.writeTo(BinaryDataProvider.java:164) at org.apache.cxf.jaxrs.utils.JAXRSUtils.writeMessageBody(JAXRSUtils.java:1363) at org.apache.cxf.jaxrs.interceptor.JAXRSOutInterceptor.serializeMessage(JAXRSOutInterceptor.java:244) at org.apache.cxf.jaxrs.interceptor.JAXRSOutInterceptor.processResponse(JAXRSOutInterceptor.java:117) at org.apache.cxf.jaxrs.interceptor.JAXRSOutInterceptor.handleMessage(JAXRSOutInterceptor.java:80) at org.apache.cxf.phase.PhaseInterceptorChain.doIntercept(PhaseInterceptorChain.java:307) at org.apache.cxf.interceptor.OutgoingChainInterceptor.handleMessage(OutgoingChainInterceptor.java:83) at org.apache.cxf.phase.PhaseInterceptorChain.doIntercept(PhaseInterceptorChain.java:307) at org.apache.cxf.transport.ChainInitiationObserver.onMessage(ChainInitiationObserver.java:121) at org.apache.cxf.transport.http.AbstractHTTPDestination.invoke(AbstractHTTPDestination.java:251) at org.apache.cxf.transport.http_jetty.JettyHTTPDestination.doService(JettyHTTPDestination.java:261) at org.apache.cxf.transport.http_jetty.JettyHTTPHandler.handle(JettyHTTPHandler.java:70) at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1088) at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1024) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135) at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255) at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116) at org.eclipse.jetty.server.Server.handle(Server.java:370) at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:494) at org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpConnection.java:982) at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:1043) at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:865) at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:240) at org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82) at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:696) at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:53) at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608) at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543) at java.lang.Thread.run(Unknown Source) Caused by: java.io.IOException at org.apache.pdfbox.filter.FlateFilter.decode(FlateFilter.java:109) at org.apache.pdfbox.cos.COSStream.doDecode(COSStream.java:379) at org.apache.pdfbox.cos.COSStream.doDecode(COSStream.java:291) at org.apache.pdfbox.cos.COSStream.getUnfilteredStream(COSStream.java:225) at org.apache.pdfbox.pdfparser.PDFStreamParser.<init>(PDFStreamParser.java:117) at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:251) at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:235) at org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:215) at org.apache.pdfbox.util.PDFTextStripper.processPage(PDFTextStripper.java:460) at org.apache.pdfbox.util.PDFTextStripper.processPages(PDFTextStripper.java:385) at org.apache.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:344) at org.apache.tika.parser.pdf.PDF2XHTML.process(PDF2XHTML.java:134) ... 35 more Caused by: java.util.zip.DataFormatException: incorrect header check at java.util.zip.Inflater.inflateBytes(Native Method) at java.util.zip.Inflater.inflate(Unknown Source) at java.util.zip.Inflater.inflate(Unknown Source) at org.apache.pdfbox.filter.FlateFilter.decompress(FlateFilter.java:128) at org.apache.pdfbox.filter.FlateFilter.decode(FlateFilter.java:101) ... 46 more

    Write tip

    You have a different solution? A short tip here would help you and many other users who saw this issue last week.

    Users with the same issue

    Unknown visitor
    Unknown visitorOnce,
    Unknown visitor
    Unknown visitorOnce,
    Unknown visitor
    Unknown visitorOnce,
    Unknown visitor
    Unknown visitorOnce,
    Unknown visitor
    Unknown visitorOnce,
    9 more bugmates