Stackoverflow on Elasticsearch file indexation with ingest-attachment

tallison · November 4, 2020, 8:11pm

We strongly encourage keeping Tika processing out of the same JVM/VM/M/rack/data center as your indexer or even the ingest process.

This can be done with tika-batch, the ForkParser or tika-server. These three options remove the potential for catastrophic problems affecting the indexing process.
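As a rough illustration of the tika-server option, here is a minimal Python sketch of an indexer-side client. It assumes a tika-server instance is already running separately on its default port (9998) and that plain text is wanted back from the `/tika` endpoint. The helper names `build_request` and `extract_text` and the 30-second timeout are my own choices for the example, not anything FSCrawler or ingest-attachment ships with.

```python
import http.client
import urllib.request
from typing import Optional

# Assumption: tika-server runs in its own process/host on the default port.
TIKA_URL = "http://localhost:9998/tika"


def build_request(data: bytes, url: str = TIKA_URL) -> urllib.request.Request:
    """Build a PUT request that asks tika-server for plain-text output."""
    return urllib.request.Request(
        url,
        data=data,
        method="PUT",
        headers={"Accept": "text/plain"},
    )


def extract_text(data: bytes, url: str = TIKA_URL,
                 timeout: float = 30.0) -> Optional[str]:
    """Return the extracted text, or None if parsing failed or timed out.

    Any failure on the Tika side (crash, hang, refused connection, HTTP
    error) surfaces here as an exception on the HTTP call, so the caller
    can skip the document instead of going down with the parser.
    """
    request = build_request(data, url)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.read().decode("utf-8", errors="replace")
    except (OSError, http.client.HTTPException):
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        return None
```

The point of the design is that a bad document costs the indexer one `None` result and a timeout at worst. Restarting a tika-server that has died is a separate concern that this sketch does not cover.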

We do what we can when we find problems in Apache Tika, but we know and loudly proclaim that robust parsing of untrusted documents must be run in an isolated JVM.

We're happy to help you @dadoonet make FSCrawler and/or ingest-attachment more robust if you have an interest...
