Apache Tika 0.3


Bartek Modzelewski

Member since:
09 January 2008

Posts: 9

Wednesday 22 April 2009 6:07:18 am

First of all, thanks for this great extension! You should promote it more on the ez.no site :)

I've found that eztika-1.0.zip is probably based on Apache Tika version 0.2. Version 0.3 was released recently with support for the MS Office 2007 file formats. I've managed to compile the Tika sources with Maven2 and got a working jar file, but the results of extracting text from a .docx file are not perfect: about 20% of the resulting txt file is XML rubbish. Have you tried the new Tika version? Is this a problem only with my example .docx file, or a general Tika problem?
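One way to tell a file-specific problem from a general Tika one might be to extract the text without Tika and compare. The following is only a rough sketch in Python, not part of Tika or eZTika. The helper names `docx_paragraphs` and `markup_ratio` are made up for illustration, and it assumes the .docx is a standard zip package with its body in `word/document.xml`.

```python
import io
import re
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by word/document.xml inside a .docx package
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_paragraphs(source):
    """Return the text of each paragraph in a .docx (path or file-like object).

    Hypothetical helper: nested paragraphs (e.g. text boxes) are not
    treated specially, so their text may appear more than once.
    """
    with zipfile.ZipFile(source) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for para in root.iter(W_NS + "p"):
        # The visible text of a paragraph is stored in its w:t descendants.
        text = "".join(node.text or "" for node in para.iter(W_NS + "t"))
        paragraphs.append(text)
    return paragraphs


def markup_ratio(text):
    """Rough fraction of characters in `text` that sit inside <...> tags."""
    if not text:
        return 0.0
    tagged = sum(len(match) for match in re.findall(r"<[^<>]+>", text))
    return tagged / len(text)
```

If `docx_paragraphs` returns clean text for the example file while `markup_ratio` on the Tika output (read from wherever the txt file was saved) stays high, that would point at the extraction step rather than at the document itself.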

When do you plan to update the eZTika extension to the new Tika version?

Thanks