We are using Elasticsearch version 8.11.3 self hosted, in a cluster with 11 nodes (16 gb ram 16 cpu each).
We have an index with ~140k documents that contain fields of various types (mostly keywords and a few text ones) and 3 vector fields. The index has 5 shards and 9 replicas - tuned for query throughput and response time.
All queries currently use only the keyword and text fields. The vectors are not yet used in queries.
The workload is mainly query, but there is a fair amount of indexing - about 1k RPS for searches and ~200 RPS for doc updates/adds.
Now, the issue is that we are indexing updates on documents, but only on the non vector fields. We are seeing way slower indexing (and querying) throughput if the index contains the vectors as opposed to updates on docs if the index is scraped of the vector fields.
Question is, does ES recompute KNN trees even if some random non-vector field gets updated in the index? If so, is there any way to stop this ?
would splitting the indices in two, one for vector search, one for the rest of the fields somewhat fix the issue ? This would keep the fields updates in the main index while having minimal updates on the vectors one.