How do I index a JSON file larger than 100MB with FSCrawler and Elasticsearch?
I am using FSCrawler 2.6 and Elasticsearch 6.8.0.
Hey,
The default http max content length is 100MB, see https://www.elastic.co/guide/en/elasticsearch/reference/7.4/modules-http.html
You should, however, not increase that limit, but rather reduce your batch sizes if possible. Hope that makes sense.
--Alex
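For reference, that limit is the http.max_content_length setting in elasticsearch.yml. The line below only shows the default to make the 100MB figure concrete; as Alex says, raising it is not the recommended fix:

```yaml
# elasticsearch.yml -- default HTTP request body limit.
# Any single request body (including a bulk payload) larger than this is rejected.
http.max_content_length: 100mb
```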
Yes, I know. But my data consists of books; some files are larger than 100MB and cannot be split into smaller files.
I changed the heap settings (-Xms10g, -Xmx10g) and set "indexed_chars" : "100%".
I fixed it.
Done
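For anyone who runs into the same problem later, here is a minimal sketch of what those two changes can look like. The job name ("books") and paths are made up for illustration; in FSCrawler 2.6 the job settings live in ~/.fscrawler/<job_name>/_settings.json, and FSCrawler's own heap can be raised through the FS_JAVA_OPTS environment variable:

```
# Start FSCrawler with a larger heap (10g, matching the post above):
FS_JAVA_OPTS="-Xms10g -Xmx10g" bin/fscrawler books

# ~/.fscrawler/books/_settings.json (excerpt): index the full extracted
# text of each document instead of only a truncated prefix (the default):
{
  "name": "books",
  "fs": {
    "url": "/data/books",
    "indexed_chars": "100%"
  }
}
```

If the heap you changed was the Elasticsearch one rather than FSCrawler's, the same -Xms/-Xmx values go into Elasticsearch's jvm.options instead.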
Note that there is a difference between what FSCrawler collects and what it generates. If you don't store the source (the BASE64-encoded binary document), then hopefully the extracted content is much smaller than the source itself.
Also, if you have very big documents, you can adjust the bulk settings (https://fscrawler.readthedocs.io/en/latest/admin/fs/elasticsearch.html#bulk-settings) and make sure each bulk request stays under 100MB, as sketched below.
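As a rough illustration, a sketch of what those knobs look like in the job's _settings.json (the values shown are FSCrawler's documented defaults, not recommendations; check the linked page for which options exist in your FSCrawler version):

```
# ~/.fscrawler/books/_settings.json (excerpt):
# - store_source: false keeps the BASE64 source out of the indexed document.
# - bulk_size / byte_size / flush_interval control when a bulk request is sent,
#   so each request stays well under Elasticsearch's http.max_content_length.
{
  "fs": {
    "store_source": false
  },
  "elasticsearch": {
    "bulk_size": 100,
    "byte_size": "10mb",
    "flush_interval": "5s"
  }
}
```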
This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.