Large Array Performance Optimization, Resource Costs

I'm interested in how to optimize Elasticsearch for the following scenario. Requirements:

[1] An index containing 100,000 documents.
[2] Each document contains 2 fields: "id", "my_array"
[3] MATCH a KEYWORD against the "my_array" field. "my_array" could contain the number of elements listed in the four scenarios below, each element being a string of no more than 30 characters.
A. 100
B. 1000
C. 10000
D. 100000

What kind of configuration would allow me to match a single string in the scenarios above in less than 250 milliseconds under a load of 50 concurrent requests?
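For reference, here is roughly how I'm setting up the index and running the match. This is only a minimal sketch assuming the 8.x `elasticsearch` Python client and a local cluster; the index name `my_index` and the endpoint are placeholders, not anything prescribed:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# "keyword" fields store each array element as a separate term in the
# inverted index, so a single term lookup costs roughly the same whether
# a document holds 100 elements or 100,000.
es.indices.create(
    index="my_index",
    mappings={
        "properties": {
            "id": {"type": "keyword"},
            "my_array": {"type": "keyword"},
        }
    },
)

# An "array" in Elasticsearch is just a field with multiple values.
es.index(
    index="my_index",
    id="1",
    document={"id": "1", "my_array": ["alpha", "beta", "gamma"]},
)

# Exact-match lookup against one element of the array.
resp = es.search(index="my_index", query={"term": {"my_array": "beta"}})
print(resp["hits"]["total"])
```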

I think someone asked a simpler version of this question previously, but the Elasticsearch employee who responded did not provide any meaningful information in their response; they just said to be "reasonable".

https://discuss.elastic.co/t/your-post-in-what-are-the-limitations-of-array-size-in-elastic-search/146275

Your link didn't work for me. The response I saw was this: What are the limitations of array size in elastic search? - #2 by jpountz

If so, it seemed a perfectly reasonable answer to me, i.e. don't create a single JSON document with an array of a million ID strings that has to be rewritten entirely each time there's any change to that set.

The workaround would be to break this doc up into smaller ones.
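As a sketch of that workaround (the index name `set_members` and the field names are illustrative, not a prescribed schema): model each array element as its own small document, so a change to the set indexes or deletes one tiny doc instead of rewriting the whole array.

```python
from elasticsearch import Elasticsearch
from elasticsearch.helpers import bulk

es = Elasticsearch("http://localhost:9200")

# Explicit keyword mappings so term filters match exact values.
es.indices.create(
    index="set_members",
    mappings={
        "properties": {
            "set_id": {"type": "keyword"},
            "member": {"type": "keyword"},
        }
    },
)

def member_docs(set_id, members):
    # One small document per set member; updates touch a single doc
    # rather than a single million-entry array field.
    for m in members:
        yield {
            "_index": "set_members",
            "_id": f"{set_id}:{m}",
            "_source": {"set_id": set_id, "member": m},
        }

bulk(es, member_docs("set-42", ["alpha", "beta", "gamma"]))

# A membership check then becomes two cheap term filters.
resp = es.search(
    index="set_members",
    query={
        "bool": {
            "filter": [
                {"term": {"set_id": "set-42"}},
                {"term": {"member": "beta"}},
            ]
        }
    },
)
print(resp["hits"]["total"])
```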
