Considering these two indices:
## Mappings
PUT breeds
{
"mappings": {
"properties": {
"size": { "type": "keyword" },
"guarding_skills": { "type": "keyword" }
}
}
}
PUT dogs
{
"mappings": {
"properties": {
"breed": { "type": "keyword" },
"name": { "type": "keyword" },
}
}
}
## Datas
PUT breeds/_doc/german_sheperd
{
"size": "large",
"guarding_skills": "high"
}
PUT breeds/_doc/poodle
{
"size": "small",
"guarding_skills": "low"
}
PUT dogs/_doc/1
{
"breed": "german_sheperd",
"name": "cooper",
}
PUT dogs/_doc/2
{
"breed": "poodle",
"name": "duke",
}
PUT dogs/_doc/3
{
"breed": "beagle",
"name": "toby",
}
I make an aggregation on /dogs
index to get unique list of breeds like this:
GET /dogs/_search
{
"size": 0,
"aggs": {
"urls": {
"terms": {
"field": "breed"
}
}
}
}
But I also would like to filter out breeds witch aren't in the breeds
index, in this example, I would only get beagle
.
I can make a query first to get a list of breeds from the breeds
index and then use a must_not with terms like this:
GET /dogs/_search
{
"size": 0,
"query": {
"bool": {
"must_not": [
{
"terms": {
"breed": [
"german_sheperd",
"poodle"
]
}
}
]
},
"aggs": {
"urls": {
"terms": {
"field": "breed"
}
}
}
}
}
But I wish there is a more elegant way in a single query. Besides, my case involves a large amount of values (~10k) to use in the must_not clause.
I'm pretty sure this kind of question has been asked many times, but I didn't find any similar case.