com.aspose.words.FileCorruptedException


  • Saving a modified page opened over WebDAV ("Edit in Word") is reported to be broken for NeoOffice 3.0.2 in Studio 2.5. [This|http://confluence.atlassian.com/display/DOC/Office+Connector+Prerequisites] page does not mention that this version is incompatible. The exception occurs when the customer saves a modified page back to the server, while the underlying Aspose Words library is reading the document from the request InputStream.
    {code}
    @400000004d59686a16abeabc -- url: /wiki/plugins/servlet/confluence/editinword/7372872/content/*****.doc | userName: *****
    @400000004d59686a16abeea4 org.apache.jackrabbit.webdav.DavException
    @400000004d59686a16abeea4 at com.benryan.servlet.webdav.PageAsDocResource.saveData(PageAsDocResource.java:186)
    @400000004d59686a16abfa5c at com.benryan.servlet.webdav.PageResource.addMember(PageResource.java:64)
    @400000004d59686a16abfe44 at org.apache.jackrabbit.webdav.server.AbstractWebdavServlet.doPut(AbstractWebdavServlet.java:503)
    @400000004d59686a16ac022c at org.apache.jackrabbit.webdav.server.AbstractWebdavServlet.execute(AbstractWebdavServlet.java:240)
    @400000004d59686a16ac0614 at com.atlassian.confluence.extra.webdav.servlet.ConfluenceWebdavServlet.service(ConfluenceWebdavServlet.java:104)
    ...
    @400000004d59686a16b0096c Caused by: com.aspose.words.FileCorruptedException: The document appears to be corrupted and cannot be loaded.
    @400000004d59686a16b00d54 at com.aspose.words.Document.a(Unknown Source)
    @400000004d59686a16b024c4 at com.aspose.words.Document.b(Unknown Source)
    @400000004d59686a16b028ac at com.aspose.words.Document.a(Unknown Source)
    @400000004d59686a16b028ac at com.aspose.words.Document.<init>(Unknown Source)
    @400000004d59686a16b02c94 at com.aspose.words.Document.<init>(Unknown Source)
    @400000004d59686a16b0307c at com.aspose.words.Document.<init>(Unknown Source)
    @400000004d59686a16b0307c at com.benryan.servlet.webdav.PageAsDocResource.saveData(PageAsDocResource.java:176)
    @400000004d59686a16b0401c ... 135 more
    @400000004d59686a16b0401c Caused by: java.lang.IllegalStateException: java.nio.charset.UnsupportedCharsetException: UTF-7
    @400000004d59686a16b04404 at asposewobfuscated.mf.vb(Unknown Source)
    @400000004d59686a16b047ec at asposewobfuscated.mf.uX(Unknown Source)
    @400000004d59686a16b047ec at asposewobfuscated.mf.U(Unknown Source)
    @400000004d59686a16b04bd4 at com.aspose.words.ha.iE(Unknown Source)
    @400000004d59686a16b0578c at com.aspose.words.ha.g(Unknown Source)
    @400000004d59686a16b0578c ... 141 more
    @400000004d59686a16b05b74 Caused by: java.nio.charset.UnsupportedCharsetException: UTF-7
    @400000004d59686a16b05f5c at java.nio.charset.Charset.forName(Charset.java:505)
    @400000004d59686a16b06344 ... 146 more
    {code}
    Here are the version changes between Studio 2.4 and 2.5 in the involved code:
    {code}
    Confluence 3.3.3
    WebDAV Plugin 2.4 (Jackrabbit 1.4)
    Office Connector Plugin 1.13 (Aspose Words 3.2.1)

    Confluence 3.4.7
    WebDAV Plugin 2.5 (Jackrabbit 1.4)
    Office Connector Plugin 1.15 (Aspose Words 3.2.1)
    {code}
    [This|http://www.aspose.com/community/forums/thread/165090.aspx] post makes me suspect that the UTF-7 encoding is just a fallback that hides the original cause. I'm on Linux, so I couldn't verify or reproduce the bug. I asked the customer to edit a page that I had verified to be modifiable with OpenOffice 3.2, but he switched to OpenOffice 3.3, reported it to be working, and closed the issue.
    via Fabian Krämer
  • One of our users uploaded a file with a .dot extension to Confluence. The file is not a Word template (in this case it was a http://en.wikipedia.org/wiki/DOT_language file). The extractor should make more effort to detect a file's type instead of assuming it from the file extension and then logging stack traces like this one:
    {noformat}
    2012-03-08 23:36:00,087 WARN [scheduler_Worker-5] [bonnie.search.extractor.BaseAttachmentContentExtractor] addFields Error indexing attachment (Attachment: orgtree.dot v.1 (1973452911) jp olley)
    com.atlassian.bonnie.search.extractor.ExtractorException: Error reading content of Word document: The document appears to be corrupted and cannot be loaded.
        at com.atlassian.confluence.extra.officeconnector.index.word.WordTextExtractor.extractText(WordTextExtractor.java:41)
        at com.atlassian.bonnie.search.extractor.BaseAttachmentContentExtractor.addFields(BaseAttachmentContentExtractor.java:40)
        at com.atlassian.confluence.plugin.descriptor.ExtractorModuleDescriptor$BackwardsCompatibleExtractor.addFields(ExtractorModuleDescriptor.java:36)
        at com.atlassian.bonnie.search.BaseDocumentBuilder.getDocument(BaseDocumentBuilder.java:104)
        at com.atlassian.confluence.search.lucene.ConfluenceDocumentBuilder.getDocument(ConfluenceDocumentBuilder.java:97)
        at com.atlassian.confluence.search.lucene.tasks.AddDocumentIndexTask.perform(AddDocumentIndexTask.java:43)
        at com.atlassian.confluence.search.lucene.tasks.UpdateDocumentIndexTask.perform(UpdateDocumentIndexTask.java:40)
        at com.atlassian.confluence.search.lucene.tasks.BulkWriteIndexTask.perform(BulkWriteIndexTask.java:44)
        at com.atlassian.bonnie.LuceneConnection.withWriter(LuceneConnection.java:331)
        at com.atlassian.confluence.search.lucene.tasks.LuceneConnectionBackedIndexTaskPerformer.perform(LuceneConnectionBackedIndexTaskPerformer.java:20)
        at com.atlassian.confluence.search.lucene.DefaultConfluenceIndexManager$BatchUpdateAction.perform(DefaultConfluenceIndexManager.java:424)
        at com.atlassian.bonnie.LuceneConnection.withBatchUpdate(LuceneConnection.java:405)
        at com.atlassian.confluence.search.lucene.DefaultConfluenceIndexManager.processTasks(DefaultConfluenceIndexManager.java:197)
        at com.atlassian.confluence.search.lucene.DefaultConfluenceIndexManager.flushQueue(DefaultConfluenceIndexManager.java:149)
        at sun.reflect.GeneratedMethodAccessor1860.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:307)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:182)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:149)
        at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:106)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:171)
        at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
        at $Proxy44.flushQueue(Unknown Source)
        at com.atlassian.confluence.search.lucene.IndexQueueFlusher.executeJob(IndexQueueFlusher.java:30)
        at com.atlassian.confluence.setup.quartz.AbstractClusterAwareQuartzJobBean.surroundJobExecutionWithLogging(AbstractClusterAwareQuartzJobBean.java:63)
        at com.atlassian.confluence.setup.quartz.AbstractClusterAwareQuartzJobBean.executeInternal(AbstractClusterAwareQuartzJobBean.java:46)
        at org.springframework.scheduling.quartz.QuartzJobBean.execute(QuartzJobBean.java:86)
        at org.quartz.core.JobRunShell.run(JobRunShell.java:199)
        at com.atlassian.confluence.schedule.quartz.ConfluenceQuartzThreadPool$1.run(ConfluenceQuartzThreadPool.java:20)
        at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:549)
    Caused by: com.aspose.words.FileCorruptedException: The document appears to be corrupted and cannot be loaded.
        at com.aspose.words.Document.a(Unknown Source)
        at com.aspose.words.Document.b(Unknown Source)
        at com.aspose.words.Document.a(Unknown Source)
        at com.aspose.words.Document.<init>(Unknown Source)
        at com.aspose.words.Document.<init>(Unknown Source)
        at com.aspose.words.Document.<init>(Unknown Source)
        at com.atlassian.confluence.extra.officeconnector.index.word.WordTextExtractor.extractText(WordTextExtractor.java:37)
        ... 30 more
    {noformat}
    via Don Willis [Atlassian]
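The .dot report asks for content-based type detection instead of trusting the file extension. One way this could look is a check of the leading bytes before the file is handed to a Word parser. The sketch below is not the Office Connector's code: the class `WordFileSniffer` and its methods are hypothetical names, and it only recognises the two container signatures commonly used by Word files (the OLE2 compound file header for binary .doc/.dot, and the ZIP header for .docx/.dotx). A match does not prove the file is a Word document, only that parsing it is worth attempting.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Confluence or Aspose.Words.
final class WordFileSniffer {
    // OLE2 compound file signature (binary .doc/.dot containers).
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };
    // ZIP local file header signature (OOXML .docx/.dotx packages).
    private static final byte[] ZIP_MAGIC = { 'P', 'K', 0x03, 0x04 };

    // True if the first `len` bytes of `data` begin with `magic`.
    static boolean startsWith(byte[] data, int len, byte[] magic) {
        if (len < magic.length) {
            return false;
        }
        for (int i = 0; i < magic.length; i++) {
            if (data[i] != magic[i]) {
                return false;
            }
        }
        return true;
    }

    // Peeks at up to 8 bytes and rewinds the stream, so the caller can
    // still pass it to a parser afterwards. Requires mark/reset support.
    static boolean looksLikeWordContainer(InputStream in) throws IOException {
        if (!in.markSupported()) {
            throw new IllegalArgumentException("stream must support mark/reset");
        }
        byte[] head = new byte[8];
        in.mark(head.length);
        int n = 0;
        while (n < head.length) {
            int r = in.read(head, n, head.length - n);
            if (r < 0) {
                break; // stream shorter than 8 bytes
            }
            n += r;
        }
        in.reset();
        return startsWith(head, n, OLE2_MAGIC) || startsWith(head, n, ZIP_MAGIC);
    }

    public static void main(String[] args) throws IOException {
        // A Graphviz DOT file that happens to carry a .dot extension.
        byte[] graphviz = "digraph orgtree { a -> b; }".getBytes(StandardCharsets.US_ASCII);
        System.out.println(looksLikeWordContainer(new ByteArrayInputStream(graphviz))); // false
    }
}
```

An extractor could skip files that fail this check and log a single warning line rather than a full stack trace. Wrapping a plain stream in `java.io.BufferedInputStream` gives it the mark/reset support the method needs.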
    • com.aspose.words.FileCorruptedException: The document appears to be corrupted and cannot be loaded.
      @400000004d59686a16b00d54 at com.aspose.words.Document.a(Unknown Source)
      @400000004d59686a16b024c4 at com.aspose.words.Document.b(Unknown Source)
      at com.aspose.words.Document.a(Unknown Source)
      at com.aspose.words.Document.<init>(Unknown Source)
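In the WebDAV report, the FileCorruptedException wraps an IllegalStateException, which in turn wraps `java.nio.charset.UnsupportedCharsetException: UTF-7` thrown from `Charset.forName`. A small diagnostic like the one below could help tell a truly corrupted document apart from a charset that the running JVM does not provide. It is only a sketch: `CharsetDiagnostics`, `rootCause`, and `isMissingCharset` are hypothetical names, and the exception chain in `main` is rebuilt by hand from the shape of the reported trace instead of being produced by Aspose Words.

```java
import java.nio.charset.Charset;
import java.nio.charset.UnsupportedCharsetException;

// Hypothetical diagnostic helper, not part of Confluence or Aspose.Words.
final class CharsetDiagnostics {
    // Follows getCause() until the deepest throwable in the chain is reached.
    static Throwable rootCause(Throwable t) {
        Throwable current = t;
        while (current.getCause() != null && current.getCause() != current) {
            current = current.getCause();
        }
        return current;
    }

    // True if the failure ultimately comes from a charset the JVM lacks.
    static boolean isMissingCharset(Throwable t) {
        return rootCause(t) instanceof UnsupportedCharsetException;
    }

    public static void main(String[] args) {
        // Whether this JVM can resolve the charset named in the report.
        System.out.println("UTF-7 supported: " + Charset.isSupported("UTF-7"));

        // Hand-built stand-in for the reported chain:
        // load failure -> IllegalStateException -> UnsupportedCharsetException.
        Throwable reported = new RuntimeException(
            "The document appears to be corrupted and cannot be loaded.",
            new IllegalStateException(new UnsupportedCharsetException("UTF-7")));

        if (isMissingCharset(reported)) {
            UnsupportedCharsetException e = (UnsupportedCharsetException) rootCause(reported);
            System.out.println("Load failed because of charset: " + e.getCharsetName());
        }
    }
}
```

Logging the root cause in this way would show that the "corrupted" message stems from charset resolution, which fits the reporter's suspicion that the top-level message hides the original cause.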