Following is the use case:
I am storing the vectors and metadata for each document in a single index. Whenever the vector changes, I also need to fetch the metadata and then update the whole document in the index, and vice versa.
If I go for partial updates, this can result in data inconsistency, because vector changes and metadata update events are both ingested at high frequency.
Is there any way to store the two kinds of data in different indices and link them through a common ID, so that a single query first filters by metadata and then runs the vector search on the filtered documents?
I don't want to execute two queries in a two-step process, as that is not optimal for large datasets with millions of documents.
I am not sure I understand the update logic and constraints you are describing, so some clarification would be useful.
This is basically a join operation, which Elasticsearch does not support across indices. If you want to store the vector and the metadata in separate documents so that you can update them individually, you might want to test the join data type, which creates a parent/child relationship within a single index. This adds overhead and makes queries more complex and expensive, but it may be an option. I am not sure it will work, but it is the only option I can think of.
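To make the parent/child idea concrete, here is a minimal sketch of the request bodies involved, written as plain Python dictionaries so it runs without a cluster. The field names `doc_relation`, `category` and `embedding`, and the relation names `metadata` and `vector`, are made up for illustration. Whether a `has_parent` query can serve as the filter of a kNN search in your Elasticsearch version is something I have not verified, so treat that part as something to test rather than a confirmed solution.

```python
import json

# Hypothetical field name for the join field; not taken from the original post.
JOIN_FIELD = "doc_relation"


def build_mapping(dims: int) -> dict:
    """Index mapping with a join field: 'metadata' is the parent, 'vector' the child."""
    return {
        "mappings": {
            "properties": {
                JOIN_FIELD: {"type": "join", "relations": {"metadata": "vector"}},
                "category": {"type": "keyword"},  # example metadata field
                "embedding": {"type": "dense_vector", "dims": dims},
            }
        }
    }


def metadata_doc(category: str) -> dict:
    """Parent document: holds only the metadata."""
    return {"category": category, JOIN_FIELD: {"name": "metadata"}}


def vector_doc(parent_id: str, embedding: list) -> dict:
    """Child document: holds only the vector and points at its parent.

    It must be indexed with routing=parent_id so that parent and child
    end up on the same shard.
    """
    return {
        "embedding": embedding,
        JOIN_FIELD: {"name": "vector", "parent": parent_id},
    }


def parent_filter(category: str) -> dict:
    """Query clause matching child (vector) documents whose parent has this category."""
    return {
        "has_parent": {
            "parent_type": "metadata",
            "query": {"term": {"category": category}},
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_mapping(3), indent=2))
    print(json.dumps(vector_doc("doc-1", [0.1, 0.2, 0.3]), indent=2))
    print(json.dumps(parent_filter("news"), indent=2))
```

With this layout, a metadata change only reindexes the parent document and a vector change only reindexes the child, so neither update has to read the other part first. The cost is the extra query-time work of resolving the parent/child relationship.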