Apache Tika text extraction on Google App Engine

I need to extract text from a few document types (primarily .doc, .docx, .pdf, and .txt) that arrive as email attachments. The application runs on Google App Engine. Apache Tika does exactly what I need, but I'm running into a SecurityException when it tries to create temporary files on GAE, which I know GAE does not allow. Is there a way to force Tika to use memcache or some other storage instead of temporary files? Are there any other document parsers that can handle this without temporary files?
Some of the libraries that Tika uses will only work with a File, while others are happy with an InputStream. Could you be hitting that?
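
If the parser for your format does accept an InputStream, one way to stay off the file system is to buffer the attachment in memory and hand the parser a stream over those bytes. Here is a minimal sketch of that buffering step, using only the JDK. The class and method names (InMemoryAttachment, readFully, asStream) are made up for illustration and are not part of Tika. It assumes the attachment is small enough to fit in memory.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical helper, not part of Tika: keeps an attachment in memory
// so that no temporary file is needed on this side of the parser call.
final class InMemoryAttachment {

    // Copy the whole stream into a byte array.
    static byte[] readFully(InputStream in) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[8192];
        try {
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        } catch (IOException e) {
            // Rethrown unchecked to keep the sketch short.
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    // Wrap the bytes in a stream that supports mark/reset, so a parser
    // can peek at the header for type detection and then rewind.
    static ByteArrayInputStream asStream(byte[] data) {
        return new ByteArrayInputStream(data);
    }
}
```

You would then pass the stream from asStream(...) to whichever parser entry point takes an InputStream. This only removes the temporary file on your side. If the underlying library insists on a File internally, it will still fail on GAE.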
