"unsupported/disabled operation: EI" when indexing with Apache Solr

3

Hi, I'm using Apache Solr 3.1 on Windows Server.
While indexing, I see exceptions in the cmd console along with the message "unsupported/disabled operation: EI" from PDFStreamEngine.
I have Googled around but couldn't find a solution.

Apr 4, 2012 3:33:21 AM org.apache.solr.common.SolrException log
SEVERE: Exception in entity : null:org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to read content Processing Document # 3029
        at org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:72)
        at org.apache.solr.handler.dataimport.TikaEntityProcessor.nextRow(TikaEntityProcessor.java:130)
        at org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:238)
        at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:591)
        at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:617)
        at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:267)
        at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:186)
        at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:353)
        at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:411)
        at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:392)
Caused by: org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.ParserDecorator$1@1a8e75a
        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:199)
        at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:135)
        at org.apache.solr.handler.dataimport.TikaEntityProcessor.nextRow(TikaEntityProcessor.java:128)
        ... 8 more
Caused by: java.lang.NullPointerException
        at org.apache.pdfbox.pdmodel.PDPageNode.getCount(PDPageNode.java:109)
        at org.apache.pdfbox.pdmodel.PDDocument.getNumberOfPages(PDDocument.java:943)
        at org.apache.tika.parser.pdf.PDFParser.extractMetadata(PDFParser.java:107)
        at org.apache.tika.parser.pdf.PDFParser.parse(PDFParser.java:88)
        at org.apache.tika.parser.ParserDecorator.parse(ParserDecorator.java:91)
        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:197)
        ... 10 more

Apr 4, 2012 3:33:22 AM org.apache.pdfbox.util.PDFStreamEngine processOperator
INFO: unsupported/disabled operation: EI  

Please help
Thanks

2012-04-04 07:34
by Kamran Akhter
Can you please paste the full stack trace? BTW, which app server are you using? - UVM 2012-04-04 07:45
Thanks for the reply. I'm using Apache Solr 3.1 on Windows Server 200 - Kamran Akhter 2012-04-04 07:54
Are you not using any app server, only Windows Server 2008? - UVM 2012-04-04 07:55
Apache Solr runs on Jetty and uses port 8983. I just integrated it; it pulls data from a MySQL database and indexed its recor - Kamran Akhter 2012-04-04 08:08
Please make sure you are using the latest versions of the libraries. Also, make sure the PDF documents have a non-zero size. - UVM 2012-04-04 08:25
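
The size check suggested in the last comment can be scripted before running the import. The following is only a rough sketch, not part of Solr or Tika: the function names `looks_like_pdf` and `suspect_pdfs` are made up for illustration, and it assumes the PDFs sit in a folder on disk. It flags files that are empty or do not begin with the `%PDF-` header. A file that passes may still be one that PDFBox cannot parse.

```python
import os
import sys


def looks_like_pdf(path):
    """Cheap sanity check: the file is non-empty and starts with '%PDF-'."""
    if os.path.getsize(path) == 0:
        return False
    with open(path, "rb") as fh:
        return fh.read(5) == b"%PDF-"


def suspect_pdfs(folder):
    """Return a sorted list of *.pdf files under folder that fail the check."""
    bad = []
    for root, _dirs, files in os.walk(folder):
        for name in files:
            if name.lower().endswith(".pdf"):
                full = os.path.join(root, name)
                if not looks_like_pdf(full):
                    bad.append(full)
    return sorted(bad)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (hypothetical script name): python check_pdfs.py C:\path\to\documents
    for path in suspect_pdfs(sys.argv[1]):
        print(path)
```

Files reported by this script are candidates to exclude or repair before the next full import.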


1

This is actually an INFO-level message from PDFBox. It means that the PDF contains an operator that PDFBox does not support (EI is the operator that ends an inline image in a PDF content stream). More details can be found here:

http://mail-archives.apache.org/mod_mbox/pdfbox-users/201304.mbox/%3C128CBE37-40F7-4948-BAE2-67151D7527A7@fileaffairs.de%3E
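
To tell the informational EI message apart from whatever actually aborted a document, it can help to pull the innermost "Caused by:" entry out of the log. Below is a minimal sketch, assuming the console output was saved as text with the wrapped lines joined; `root_cause` is a hypothetical helper name, and the sample lines are copied from the trace in the question.

```python
import re

# Matches lines such as "Caused by: some.pkg.SomeException: optional message"
CAUSE = re.compile(r"^Caused by: (?P<exc>[\w.$]+)(?::\s*(?P<msg>.*))?$")


def root_cause(log_text):
    """Return (exception_class, message) for the last 'Caused by:' line, or None."""
    last = None
    for line in log_text.splitlines():
        m = CAUSE.match(line.strip())
        if m:
            last = (m.group("exc"), m.group("msg") or "")
    return last


sample = """\
SEVERE: Exception in entity : null:org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to read content Processing Document # 3029
Caused by: org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.ParserDecorator$1@1a8e75a
Caused by: java.lang.NullPointerException
INFO: unsupported/disabled operation: EI
"""

print(root_cause(sample))
```

For the trace in the question, the innermost cause is the `java.lang.NullPointerException` raised in `PDPageNode.getCount`, which is logged separately from the EI message.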

2013-05-31 20:00
by José Ricardo