Hi All,
I have a requirement to search across multiple indexes (they will have a number of common fields) for a Global Search feature.
I wish to do per-index filtering to exclude documents from individual indexes based on business rules specific to those indexes.
For example, if there is a worker index and a manager index, I may wish to restrict the results to workers within a certain rating range and managers on certain scales.
For this I am using filters that reference the index via a terms filter and apply a "sub-filter", something like:
{
  "query": {
    "bool": {
      "must": [
        {
          "match_all": {}
        }
      ],
      "filter": {
        "bool": {
          "should": [
            {
              "bool": {
                "must": [
                  { // Restrict to the following indexes ...
                    "terms": {
                      "myIndex.raw": [
                        "worker"
                      ]
                    }
                  },
                  {
                    // specific rules for worker
                    "bool": {
                      "must": [
                        {
                          "range": {
                            "rating": {
                              "gte": 0,
                              "lt": 5
                            }
                          }
                        }
                      ]
                    }
                  }
                ]
              }
            },
            // Second clause over manager
            {
              "bool": {
                "must": [
                  {
                    "terms": {
                      "myIndex.raw": [
                        "manager"
                      ]
                    }
                  },
                  {
                    "bool": {
                      "must": [
                        {
                          "terms": {
                            "scale.raw": [
                              "M4",
                              "M5"
                            ]
                          }
                        }
                      ]
                    }
                  }
                ]
              }
            }
          ]
        }
      }
    }
  }
}
I have tried this and it seems to work (returns the correct data).
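As the number of indexes grows, writing this query out by hand gets tedious, so here is a minimal sketch of how the same structure could be generated from a table of per-index rules. The helper names (per_index_filter, global_search) and the rules dict are made up for illustration; only the field names come from the query above. It just builds the request body as a plain dict and does not talk to a cluster.

```python
from typing import Any, Dict, List

Query = Dict[str, Any]


def per_index_filter(rules: Dict[str, List[Query]],
                     index_field: str = "myIndex.raw") -> Query:
    """Build a bool/should filter with one branch per logical index.

    Each branch requires the dedicated index field to match the logical
    index name AND all of that index's own rule clauses.
    """
    branches = []
    for index_name, clauses in rules.items():
        branches.append({
            "bool": {
                "must": [
                    {"terms": {index_field: [index_name]}},
                    {"bool": {"must": clauses}},
                ]
            }
        })
    return {"bool": {"should": branches}}


def global_search(rules: Dict[str, List[Query]]) -> Query:
    """Wrap the per-index filter in the same outer query as the example."""
    return {
        "query": {
            "bool": {
                "must": [{"match_all": {}}],
                "filter": per_index_filter(rules),
            }
        }
    }


# Hypothetical rule table mirroring the worker/manager example.
rules = {
    "worker": [{"range": {"rating": {"gte": 0, "lt": 5}}}],
    "manager": [{"terms": {"scale.raw": ["M4", "M5"]}}],
}
body = global_search(rules)
```

Adding another index then means adding one entry to rules rather than another hand-written branch.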
I have seen techniques where the _index meta-field is used to identify the index. However, where worker is made up of multiple indexes (and worker is actually an alias), this does not work.
I have also seen techniques using the _type meta-field to do the same (using a consistent naming convention for the type). However, types are going away, so this seems a non-starter.
This leaves me with creating a dedicated field and populating it with the same value for every document in an index (the "myIndex.raw" keyword field above). I have seen somewhere in the docs that this is the recommended approach to solving this problem.
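To make the dedicated-field idea concrete, here is a rough sketch of stamping each document with its logical index name while building a bulk request body. The function name bulk_lines, the physical index name "worker-2018" and the sample document are assumptions for illustration. The action line is shown without a type; on versions that still require one, it would need a _type as well.

```python
import json
from typing import Any, Dict, Iterable, Iterator


def bulk_lines(physical_index: str,
               logical_index: str,
               docs: Iterable[Dict[str, Any]],
               field: str = "myIndex") -> Iterator[str]:
    """Yield bulk-format lines: an action line, then the source document.

    Every document gets `field` set to the logical index name, so all
    physical indexes behind the same alias carry the same value.
    """
    for doc in docs:
        yield json.dumps({"index": {"_index": physical_index}})
        yield json.dumps({**doc, field: logical_index})


# Hypothetical physical index sitting behind the "worker" alias.
lines = list(bulk_lines("worker-2018", "worker",
                        [{"name": "Ann", "rating": 3}]))
payload = "\n".join(lines) + "\n"  # a bulk body must end with a newline
```

The query would still filter on "myIndex.raw", assuming the mapping defines a keyword sub-field called raw under myIndex.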
What does this mean for the inverted index on this field when it only ever holds one value? Do you get an inverted index with a single term whose bitmap covers every doc in the index (almost a list), and does that perform badly when queried?
How far will this solution scale: 5 indexes, 10, 100, or is it just hardware dependent?
Is there a better way?
I know I can use _msearch and combine the results in code, but this seems like I am writing part of the search engine in my UI, with lots of extra network overhead etc.
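For comparison, this is roughly what the _msearch route would involve on the client side: one header/body pair per index, then a merge of the separate hit lists. The helper names and the fake response are invented for illustration. The merge simply sorts by _score, which assumes scores from the different searches are comparable, and that may well not hold.

```python
import json
from typing import Any, Dict, List


def msearch_body(searches: Dict[str, Dict[str, Any]]) -> str:
    """Build a multi-search body: a header line then a query line per index."""
    lines = []
    for index_name, query in searches.items():
        lines.append(json.dumps({"index": index_name}))
        lines.append(json.dumps({"query": query}))
    return "\n".join(lines) + "\n"


def merge_hits(msearch_response: Dict[str, Any],
               size: int = 10) -> List[Dict[str, Any]]:
    """Flatten the per-search hit lists and keep the top `size` by _score."""
    hits: List[Dict[str, Any]] = []
    for response in msearch_response.get("responses", []):
        hits.extend(response.get("hits", {}).get("hits", []))
    # _score can be null (e.g. when sorting on a field), so fall back to 0.
    hits.sort(key=lambda h: h.get("_score") or 0.0, reverse=True)
    return hits[:size]


body = msearch_body({
    "worker": {"range": {"rating": {"gte": 0, "lt": 5}}},
    "manager": {"terms": {"scale.raw": ["M4", "M5"]}},
})

# Hand-written stand-in for a response, not real cluster output.
fake_response = {
    "responses": [
        {"hits": {"hits": [{"_index": "worker", "_id": "1", "_score": 1.2}]}},
        {"hits": {"hits": [{"_index": "manager", "_id": "7", "_score": 2.5}]}},
    ]
}
merged = merge_hits(fake_response)
```

Paging and consistent ranking across the merged list would also have to be handled in the client, which is the part that feels like reimplementing the search engine.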
All thoughts appreciated,
Dave