A very large document (3.5 MB) is not searchable in App Search

I have a single, very large document indexed in App Search; it contains 3.5 MB of content.
To reiterate: a single property of that one document holds 3.5 MB of text.

App Search cannot search that document: the search query returns 0 results.
The App Search UI also fails to load that particular document, but that is a secondary issue. Search is the main concern.

However, with a GetById call I can retrieve the document and see its full contents.

Is there a solution for this?

The maximum size of a document is 100kb. How did you get a document that large indexed? For reference: Limits | Elastic App Search Documentation [7.15] | Elastic.

The document size limit is configurable in App Search:
app_search.engine.document_size.limit: 100kb

We currently have it configured to 10 MB.

When we index a book and its contents into App Search, we sometimes hit this issue: the document is not searchable, yet Get By ID returns it.

Can you think of anything that could cause this issue?

I'm fairly certain you're having that issue because your document is just too large.

The limits are configurable, but if you push a limit to an extreme like that, you may well run into edge cases.

The best I can do is refer you to this recommendation from the Elasticsearch docs: General recommendations | Elasticsearch Guide [7.15] | Elastic

Thanks @JasonStoltz for pointing out the Elastic recommendations.

A follow-up clarification:

  1. If we split the file into multiple chunks, we will have multiple documents (say 15 chunks, so 15 documents) in App Search for a single file.
  2. Let's say every document has 4 properties: fileId, chunkId, fileName and fileContent. If the user searches by fileName, we cannot return 15 search results for the same file. A fileContent search, however, works fine.
  3. Is it possible to model this as parent/child documents, so that only the parent document carries the fileId and fileName?
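
For anyone following along, the chunking described in the list above could look roughly like the sketch below. It is only an illustration under assumptions: the helper names (`chunk_text`, `build_chunk_documents`), the 90,000-byte chunk size, and the snake_case field names (`file_id`, `chunk_id`, `file_name`, `file_content`) are made up for this example and are not taken from our actual code.

```python
from typing import Dict, Iterator, List


def chunk_text(text: str, max_bytes: int = 90_000) -> Iterator[str]:
    """Yield whitespace-delimited pieces of text of at most max_bytes (UTF-8).

    A single word longer than max_bytes is emitted as its own oversized chunk.
    Original whitespace is normalised to single spaces.
    """
    current: List[str] = []
    size = 0
    for word in text.split():
        word_size = len(word.encode("utf-8")) + 1  # +1 for the joining space
        if current and size + word_size > max_bytes:
            yield " ".join(current)
            current, size = [], 0
        current.append(word)
        size += word_size
    if current:
        yield " ".join(current)


def build_chunk_documents(
    file_id: str, file_name: str, content: str, max_bytes: int = 90_000
) -> List[Dict[str, object]]:
    """Build one document per chunk; every chunk repeats file_id and file_name."""
    return [
        {
            "id": f"{file_id}-{i}",  # hypothetical id scheme: unique per chunk
            "file_id": file_id,
            "chunk_id": i,
            "file_name": file_name,
            "file_content": chunk,
        }
        for i, chunk in enumerate(chunk_text(content, max_bytes))
    ]


if __name__ == "__main__":
    docs = build_chunk_documents("book-1", "book.txt", "lorem ipsum " * 20_000)
    print(len(docs), "chunk documents")
```

Each chunk document stays well under the default document size limit, at the cost of repeating the file-level properties on every chunk.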

Thanks for the recommendations.

A document of ~3.5 MB may not be especially large, based on this:
[Indexing very large document in ES](https://elastic search discussion)
Is there a quick workaround, such as increasing the memory, that we could use as a temporary fix until we come up with a scalable solution?

If you chunk your documents as you describe, you may be able to use grouping to ensure you get only one result per file.
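
As a rough sketch of what that might look like, the snippet below builds a search request body that uses the `group` parameter of the App Search search API. It assumes the chunk documents share a `file_id` field (a hypothetical name) and only constructs the JSON; it does not send the request. The helper name `build_grouped_search_body` is made up for this example.

```python
import json
from typing import Any, Dict


def build_grouped_search_body(query: str, group_field: str = "file_id") -> Dict[str, Any]:
    """Build a search body that groups results on one field.

    group_field is an assumption: it should be whatever field all chunks
    of the same file have in common.
    """
    return {
        "query": query,
        "group": {"field": group_field},
    }


if __name__ == "__main__":
    body = build_grouped_search_body("moby dick")
    # This body would be POSTed to the engine's search endpoint.
    print(json.dumps(body, indent=2))
```

With grouping, a file name search should come back with one top-level result per `file_id` rather than one per chunk, though it is worth verifying that behaviour on your own engine.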