I have the following architecture dilemma:
My requirement is to index a large volume of data that is mostly written once.
Documents are "tagged" at index time as types X/Y/Z...
That is, the system adds some metadata to each document at index time.
Unfortunately, users can also edit the document tagging manually, which may result in reprocessing of the original data and re-indexing of many documents.
The system has a steady stream of input data, so it is indexing all the time, while users can also tamper with the data.
Up until now we have had the "tagging" metadata embedded in each document, which caused a lot of reprocessing on user-initiated changes.
We're thinking of separating the tag metadata from the documents, but that will result in a semi-SQL-like DB (each document will hold a reference to the metadata).
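
To make the two layouts concrete, here is a rough sketch of the trade-off. It uses plain Python dicts in place of real ES documents, and every name in it (`retag_embedded`, `retag_referenced`, `tag_id`, the sample documents) is made up for illustration and is not our actual schema.

```python
# Illustration only: plain dicts stand in for ES documents and indices.
# All field and function names here are hypothetical.

def retag_embedded(docs, tag_type, new_label):
    """Layout A: tag metadata is embedded, so every matching document is rewritten."""
    touched = 0
    for doc in docs.values():
        if doc["tag"]["type"] == tag_type:
            doc["tag"]["label"] = new_label  # in ES this would mean re-indexing the doc
            touched += 1
    return touched


def retag_referenced(tags, tag_id, new_label):
    """Layout B: documents only hold a tag_id, so one tag record is rewritten."""
    tags[tag_id]["label"] = new_label
    return 1


# Layout A: metadata copied into each document.
embedded_docs = {
    "doc1": {"text": "first", "tag": {"type": "X", "label": "old"}},
    "doc2": {"text": "second", "tag": {"type": "X", "label": "old"}},
    "doc3": {"text": "third", "tag": {"type": "Y", "label": "other"}},
}

# Layout B: metadata stored once, documents reference it.
tags = {
    "tag-x": {"type": "X", "label": "old"},
    "tag-y": {"type": "Y", "label": "other"},
}
referenced_docs = {
    "doc1": {"text": "first", "tag_id": "tag-x"},
    "doc2": {"text": "second", "tag_id": "tag-x"},
    "doc3": {"text": "third", "tag_id": "tag-y"},
}

embedded_writes = retag_embedded(embedded_docs, "X", "new")
referenced_writes = retag_referenced(tags, "tag-x", "new")
print(embedded_writes, referenced_writes)
```

With the embedded layout a single user edit costs one write per tagged document, while the referenced layout costs one write in total. The price of the second layout is that searching or aggregating on the tag label needs an extra lookup through `tag_id`, which is exactly the SQL-like join I am worried about.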
So, my question is:
What is the best design pattern here to get the best of ES (aggregations, search, geo...) without falling too deep into the SQL-like trap?