Full list of document formats supported by ES


I found the following list of formats supported by Tika.

Which file formats are supported by the Ingest attachment plugin? Does it support all the formats supported by Tika?


No. Only a subset is supported.

Mainly OpenOffice documents, Office documents (except Visio), and PDF documents.

I'd add that the FSCrawler project supports all formats, since it runs outside an Elasticsearch node.

Is this the complete set of supported types?

MS Office docs: .doc, .docx, .xls, .xlsx, .ppt, .pptx
TXT docs: .rtf, .txt, .csv
oOo docs: .odt, .sxw, .ods, .sxc, .odp, .sxi
PDF docs: .pdf

I'm just gathering the list of supported types. Please add/remove the supported types accordingly.

The only way to get a precise answer is to test it.

But here is the list of everything we are testing: https://github.com/elastic/elasticsearch/tree/master/plugins/ingest-attachment/src/test/resources/org/elasticsearch/ingest/attachment/test/tika-files
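
For anyone who wants to test a specific file themselves, here is a minimal sketch in Python. It assumes a node with the attachment processor available, reachable over plain HTTP without authentication, and it uses the simulate pipeline API so that nothing gets indexed. The helper names (`build_simulate_request`, `simulate`, `extracted_ok`) and the `data` field name are made up for this example. Treating "non-empty extracted content" as "supported" is also just an assumption of this sketch, not an official definition.

```python
import base64
import json
import urllib.request


def build_simulate_request(file_bytes: bytes) -> dict:
    """Build a body for POST /_ingest/pipeline/_simulate.

    The pipeline has a single attachment processor reading the
    base64-encoded file from a field named "data" (name chosen here).
    """
    return {
        "pipeline": {
            "processors": [
                {"attachment": {"field": "data"}}
            ]
        },
        "docs": [
            {"_source": {"data": base64.b64encode(file_bytes).decode("ascii")}}
        ],
    }


def simulate(es_url: str, file_bytes: bytes) -> dict:
    """Send the simulate request to the node at es_url and return the parsed JSON.

    Assumes no authentication or TLS setup is needed; adapt as required.
    """
    body = json.dumps(build_simulate_request(file_bytes)).encode("utf-8")
    request = urllib.request.Request(
        es_url.rstrip("/") + "/_ingest/pipeline/_simulate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def extracted_ok(simulate_response: dict) -> bool:
    """Return True if the first simulated doc has non-empty extracted content.

    Assumption of this sketch: a per-document "error" entry or an empty
    "attachment.content" is treated as "format not usable".
    """
    doc = simulate_response["docs"][0]
    if "error" in doc:
        return False
    attachment = doc.get("doc", {}).get("_source", {}).get("attachment", {})
    return bool(attachment.get("content"))
```

With a local node you could then call `extracted_ok(simulate("http://localhost:9200", open("sample.docx", "rb").read()))` for each file type you care about; the URL and file name are placeholders.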

@dadoonet: Thank you, this information should be enough for me :)

