Apache Tika 0.3
Bartek Modzelewski
Member since: 09 January 2008
Posts: 9
Wednesday 22 April 2009 6:07:18 am
First of all, thanks for this great extension! You should promote it more on the ez.no site :)
I've found that eztika-1.0.zip is probably based on Apache Tika version 0.2. Version 0.3 was recently released with support for the MS Office 2007 file formats. I managed to compile the Tika sources with Maven2 and got a working jar file, but the results of extracting text from a .docx file are not perfect: 20% of the resulting txt file is XML rubbish. Have you tried the new Tika version? Is this a problem only with my example .docx file, or a general Tika problem?
When do you plan to update the eZTika extension to the new Tika version?
Thanks