Let's say that in a microservices architecture there are hundreds of different entities, and search needs to be supported that can look for matches across all of them. I see a few ways of doing this:
1. Create one index per entity, then issue as many search queries as there are indexes in parallel and merge the results (see the first sketch after this list).
2. Create a few really wide indexes, each going all the way to the maximum number of columns an index allows, adding more such indexes as needed. Then query and merge as in option 1.
3. Add only a limited number of pre-defined columns that are common to all entities, plus one more column like "searchableData" that holds the data from the other entity fields but not the field names. Entity-specific fields are then not exposed as "filters", but their data is still indexed and searchable. This is not ideal, but it seems like a reasonable compromise (see the second sketch below).
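For option 1, here is a minimal sketch of the fan-out-and-merge pattern. `FAKE_INDEXES` and `search_index()` are made-up stand-ins so the example runs on its own; in practice each call would be one request to a real per-entity index:

```python
"""Option 1 sketch: fan one query out to N per-entity indexes in parallel,
then merge the hits by score. FAKE_INDEXES and search_index() are
hypothetical stand-ins for real search-cluster calls."""
from concurrent.futures import ThreadPoolExecutor

# Toy per-entity "indexes" so the sketch runs end to end.
FAKE_INDEXES = {
    "orders":    [{"id": "o1", "text": "overdue order for acme", "score": 2.1}],
    "customers": [{"id": "c1", "text": "acme corporation",       "score": 3.4}],
    "invoices":  [{"id": "i1", "text": "invoice 42 for acme",    "score": 1.7}],
}

def search_index(index_name: str, query: str) -> list[dict]:
    # Hypothetical: one search call against a single entity index.
    return [hit for hit in FAKE_INDEXES[index_name] if query in hit["text"]]

def federated_search(query: str, index_names: list[str], limit: int = 10) -> list[dict]:
    # One query per index, issued in parallel.
    with ThreadPoolExecutor(max_workers=len(index_names)) as pool:
        per_index = list(pool.map(lambda name: search_index(name, query), index_names))
    # Merge by flattening and re-sorting on score.
    merged = [hit for hits in per_index for hit in hits]
    return sorted(merged, key=lambda h: h["score"], reverse=True)[:limit]

print(federated_search("acme", list(FAKE_INDEXES)))
```

The catch with merging client-side is that relevance scores from independently executed queries are only comparable if all indexes use the same scoring model, which is part of why options 2 and 3 are tempting.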
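And for option 3, a sketch of the flattening step. The field names here (`COMMON_FIELDS`, `searchableData`) are assumptions for illustration, not an existing schema:

```python
"""Option 3 sketch: flatten each entity into a document with a small shared
set of filterable fields plus one catch-all full-text field. The field names
are illustrative assumptions."""

COMMON_FIELDS = ("id", "name", "created_at")

def to_search_document(entity_type: str, entity: dict) -> dict:
    doc = {field: entity.get(field) for field in COMMON_FIELDS}
    doc["entity_type"] = entity_type
    # Entity-specific fields are not filterable; only their *values* are
    # concatenated into the catch-all field (the field names are dropped).
    extras = {k: v for k, v in entity.items() if k not in COMMON_FIELDS}
    doc["searchableData"] = " ".join(str(v) for v in extras.values())
    return doc

print(to_search_document("invoice", {
    "id": "i1", "name": "Invoice 42", "created_at": "2024-01-01",
    "amount": 99.5, "customer": "acme",
}))
```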
Well, what if there really are that many different denormalized objects in a system? It can be a huge system, depending on the business domain. Consider something like Salesforce, which even lets customers create custom objects.
Even if 100 is an exaggeration, what is a good way to go for, say, 25 such already-denormalized objects?
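One way to handle ~25 such objects, sketched below, is to keep one index per object but push the fan-out and merging into the search engine itself with a single multi-index query. This assumes Elasticsearch with the official elasticsearch-py 8.x client; the `entity-*` naming convention and local cluster URL are things I made up for the example:

```python
"""Query ~25 per-object indexes in one request and let the cluster merge
and score the hits. The "entity-*" index pattern and the cluster URL are
assumptions for illustration."""
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # assumed local cluster

def search_all_entities(query: str, limit: int = 10) -> list[dict]:
    # One request against every index matching the pattern; Elasticsearch
    # scores and merges hits across the indexes server-side.
    resp = es.search(
        index="entity-*",
        query={"simple_query_string": {"query": query}},
        size=limit,
    )
    return resp["hits"]["hits"]
```

Because all the indexes are scored within one query, the merged ranking avoids the cross-query score-comparability problem of the client-side merge in option 1, and index aliases can serve the same grouping role as the wildcard pattern.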