com.atlassian.bonnie.search.extractor.ExtractorException: Error reading content of Excel document: Invalid header signature; read 8236850760414359372, expected -2226271756974174256

Atlassian JIRA | Andrew Moise | 7 years ago
  1. 0

    This problem occurs when the browser sends the wrong MIME type during a file upload. It appears that Windows machines on which MS Excel is the registered handler for CSV files upload CSV files with the "application/vnd.ms-excel" MIME type. Confluence then tries to index the attachment as an Excel document and fails. This can leave the search index only partially built, resulting in missing pages in search results.

    Sample logs:

    {noformat}
    2010-02-22 11:09:56,038 WARN [Indexer: 2] [bonnie.search.extractor.BaseAttachmentContentExtractor] addFields Error indexing attachment (Attachment: textures-streaming.csv v.1 (3014859) kteich)
     -- url: /confluence/admin/reindex.action | userName: moise | referer: https://qix.demiurgestudios.com/confluence/admin/search-indexes.action | action: reindex
    com.atlassian.bonnie.search.extractor.ExtractorException: Error reading content of Excel document: Invalid header signature; read 8236850760414359372, expected -2226271756974174256
        at com.atlassian.bonnie.search.extractor.MsExcelContentExtractor.extractText(MsExcelContentExtractor.java:101)
        at com.atlassian.bonnie.search.extractor.BaseAttachmentContentExtractor.addFields(BaseAttachmentContentExtractor.java:39)
        at com.atlassian.confluence.plugin.descriptor.ExtractorModuleDescriptor$BackwardsCompatibleExtractor.addFields(ExtractorModuleDescriptor.java:43)
        at com.atlassian.bonnie.search.BaseDocumentBuilder.getDocument(BaseDocumentBuilder.java:104)
        at com.atlassian.confluence.search.lucene.ConfluenceDocumentBuilder.getDocument(ConfluenceDocumentBuilder.java:102)
        at com.atlassian.confluence.search.lucene.tasks.AddDocumentIndexTask.perform(AddDocumentIndexTask.java:41)
        at com.atlassian.bonnie.index.TempIndexWriter.perform(TempIndexWriter.java:72)
        at com.atlassian.confluence.search.lucene.TempIndexWriterStrategy.perform(TempIndexWriterStrategy.java:43)
        at com.atlassian.confluence.search.lucene.tasks.TempIndexBackedIndexTaskPerformer.perform(TempIndexBackedIndexTaskPerformer.java:21)
        at com.atlassian.confluence.search.lucene.DefaultObjectQueueWorker.indexCollection(DefaultObjectQueueWorker.java:73)
        at com.atlassian.confluence.search.lucene.DefaultObjectQueueWorker$1.doInTransactionWithoutResult(DefaultObjectQueueWorker.java:61)
        at org.springframework.transaction.support.TransactionCallbackWithoutResult.doInTransaction(TransactionCallbackWithoutResult.java:33)
        at org.springframework.transaction.support.TransactionTemplate.execute(TransactionTemplate.java:127)
        at com.atlassian.confluence.search.lucene.DefaultObjectQueueWorker.run(DefaultObjectQueueWorker.java:50)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:650)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:675)
        at java.lang.Thread.run(Thread.java:595)
    Caused by: java.io.IOException: Invalid header signature; read 8236850760414359372, expected -2226271756974174256
        at org.apache.poi.poifs.storage.HeaderBlockReader.<init>(HeaderBlockReader.java:103)
        at org.apache.poi.poifs.filesystem.POIFSFileSystem.<init>(POIFSFileSystem.java:90)
        at com.atlassian.bonnie.search.extractor.MsExcelContentExtractor.extractText(MsExcelContentExtractor.java:87)
        ... 16 more
    {noformat}

    h3. Workaround

    Stop Confluence, edit the {{confluence/WEB-INF/classes/mime.types}} file and add the following entry:

    {code}
    text/csv csv
    {code}

    This ensures that all files with the CSV extension are mapped to the text/csv MIME type regardless of what the browser sends. Next, run the following query against the database and then start Confluence:

    {code:sql}
    update attachments set contenttype='text/csv' where lower(title) like '%.csv';
    {code}

    To make the content in the CSV files searchable you will also need to [run a reindex|http://confluence.atlassian.com/display/DOC/Content+Index+Administration#ContentIndexAdministration-RebuildingtheContentIndexes].

    Atlassian JIRA | 7 years ago | Andrew Moise
    com.atlassian.bonnie.search.extractor.ExtractorException: Error reading content of Excel document: Invalid header signature; read 8236850760414359372, expected -2226271756974174256
  3. 0

    Getting the Exception error while converting Docx file to XML using Apache POi

    Stack Overflow | 5 years ago | Abhishek
    java.io.IOException: Invalid header signature; read 0x4353414E2023233C, expected 0xE11AB1A1E011CFD0
  5. 0

    Reading Microsoft Word Document in JAVA - Techie Zone

    hiteshagrawal.com | 10 months ago
    java.io.IOException: Unable to read entire header; 6 bytes read; expected 512 bytes
  6. 0

    java.io.IOException: Cannot remove block[ 11024 ]; out of range[ 0 - 9406 ]

    Apache Bugzilla | 8 years ago | jariniskala
    java.io.IOException: Cannot remove block[ 1148 ]; out of range[ 0 - 694 ]
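
The workaround in the first solution maps the csv extension to text/csv no matter which MIME type the browser sent. A minimal Java sketch of that idea follows. It is not Confluence's implementation: the class name, the `effectiveContentType` helper, the override table, and the sample file names are all hypothetical and only illustrate an extension-first lookup.

```java
import java.util.Locale;
import java.util.Map;

public class ExtensionMimeOverride {

    // Hypothetical override table, mirroring the "text/csv csv" mime.types entry.
    private static final Map<String, String> BY_EXTENSION = Map.of("csv", "text/csv");

    /**
     * Returns the content type to store for an upload. A known extension wins
     * over whatever the browser claimed; otherwise the browser value is kept.
     */
    static String effectiveContentType(String fileName, String browserSuppliedType) {
        int dot = fileName.lastIndexOf('.');
        if (dot >= 0 && dot < fileName.length() - 1) {
            // Compare case-insensitively, like lower(title) in the SQL workaround.
            String ext = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
            String mapped = BY_EXTENSION.get(ext);
            if (mapped != null) {
                return mapped;
            }
        }
        return browserSuppliedType;
    }

    public static void main(String[] args) {
        // A CSV uploaded from a machine where Excel owns the .csv association.
        System.out.println(effectiveContentType("Report.CSV", "application/vnd.ms-excel"));
        // A genuine spreadsheet keeps the type the browser sent.
        System.out.println(effectiveContentType("budget.xls", "application/vnd.ms-excel"));
    }
}
```

With a mapping like this in place, a CSV attachment would no longer be routed to the Excel extractor, which is the effect the mime.types entry aims for. Existing attachments still carry the old stored content type, which is why the workaround also needs the SQL update and a reindex.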

    Root Cause Analysis

    1. java.io.IOException

      Invalid header signature; read 8236850760414359372, expected -2226271756974174256

      at org.apache.poi.poifs.storage.HeaderBlockReader.<init>()
    2. POI
      POIFSFileSystem.<init>
      1. org.apache.poi.poifs.storage.HeaderBlockReader.<init>(HeaderBlockReader.java:103)
      2. org.apache.poi.poifs.filesystem.POIFSFileSystem.<init>(POIFSFileSystem.java:90)
      2 frames
    3. com.atlassian.bonnie
      BaseAttachmentContentExtractor.addFields
      1. com.atlassian.bonnie.search.extractor.MsExcelContentExtractor.extractText(MsExcelContentExtractor.java:87)
      2. com.atlassian.bonnie.search.extractor.BaseAttachmentContentExtractor.addFields(BaseAttachmentContentExtractor.java:39)
      2 frames
    4. com.atlassian.confluence
      ExtractorModuleDescriptor$BackwardsCompatibleExtractor.addFields
      1. com.atlassian.confluence.plugin.descriptor.ExtractorModuleDescriptor$BackwardsCompatibleExtractor.addFields(ExtractorModuleDescriptor.java:43)
      1 frame
    5. com.atlassian.bonnie
      BaseDocumentBuilder.getDocument
      1. com.atlassian.bonnie.search.BaseDocumentBuilder.getDocument(BaseDocumentBuilder.java:104)
      1 frame
    6. com.atlassian.confluence
      AddDocumentIndexTask.perform
      1. com.atlassian.confluence.search.lucene.ConfluenceDocumentBuilder.getDocument(ConfluenceDocumentBuilder.java:102)
      2. com.atlassian.confluence.search.lucene.tasks.AddDocumentIndexTask.perform(AddDocumentIndexTask.java:41)
      2 frames
    7. com.atlassian.bonnie
      TempIndexWriter.perform
      1. com.atlassian.bonnie.index.TempIndexWriter.perform(TempIndexWriter.java:72)
      1 frame
    8. com.atlassian.confluence
      DefaultObjectQueueWorker$1.doInTransactionWithoutResult
      1. com.atlassian.confluence.search.lucene.TempIndexWriterStrategy.perform(TempIndexWriterStrategy.java:43)
      2. com.atlassian.confluence.search.lucene.tasks.TempIndexBackedIndexTaskPerformer.perform(TempIndexBackedIndexTaskPerformer.java:21)
      3. com.atlassian.confluence.search.lucene.DefaultObjectQueueWorker.indexCollection(DefaultObjectQueueWorker.java:73)
      4. com.atlassian.confluence.search.lucene.DefaultObjectQueueWorker$1.doInTransactionWithoutResult(DefaultObjectQueueWorker.java:61)
      4 frames
    9. Spring Tx
      TransactionTemplate.execute
      1. org.springframework.transaction.support.TransactionCallbackWithoutResult.doInTransaction(TransactionCallbackWithoutResult.java:33)
      2. org.springframework.transaction.support.TransactionTemplate.execute(TransactionTemplate.java:127)
      2 frames
    10. com.atlassian.confluence
      DefaultObjectQueueWorker.run
      1. com.atlassian.confluence.search.lucene.DefaultObjectQueueWorker.run(DefaultObjectQueueWorker.java:50)
      1 frame
    11. Java RT
      Thread.run
      1. java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:650)
      2. java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:675)
      3. java.lang.Thread.run(Thread.java:595)
      3 frames
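
The two numbers in the "Invalid header signature" message are the first eight bytes of the file, read as a little-endian long, and the value POI expects for an OLE2 document. Decoding them back into bytes shows what the rejected file actually starts with. The following is a small standalone sketch using only the JDK; the class and method names are made up for illustration and are not part of POI or Confluence.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

public class HeaderSignatureDecoder {

    // Turns the long from the error message back into the 8 leading file bytes.
    static byte[] littleEndianBytes(long signature) {
        return ByteBuffer.allocate(Long.BYTES)
                .order(ByteOrder.LITTLE_ENDIAN)
                .putLong(signature)
                .array();
    }

    // Renders the bytes as space-separated upper-case hex pairs.
    static String asHex(long signature) {
        StringBuilder sb = new StringBuilder();
        for (byte b : littleEndianBytes(signature)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // Renders the bytes as ASCII text; only meaningful for text-like files.
    static String asAscii(long signature) {
        return new String(littleEndianBytes(signature), StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        long expected = -2226271756974174256L; // "expected" value from the log
        long read = 8236850760414359372L;      // "read" value from the log

        System.out.println("expected bytes: " + asHex(expected));
        System.out.println("read bytes:     " + asHex(read));
        System.out.println("read as text:   \"" + asAscii(read) + "\"");
    }
}
```

The expected value corresponds to the bytes D0 CF 11 E0 A1 B1 1A E1, while the value read from the attachment decodes to the ASCII text `Log: ,Or`. That is comma-separated plain text rather than a binary spreadsheet header, which is consistent with a CSV file having been labelled as an Excel document.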