Hey,
I have a working request, but it isn't great! I was wondering if a kind soul could help me get to something better...
The problem
Let's say I have 2 types of documents, for simplicity's sake.
- The types square and oval have a uuid key which acts as a primary key.
- The type shape_info also uses the uuid key, but the document is optional.
{ "type": "square", "uuid": "xxxx" },
{ "type": "oval", "uuid": "zzzz" },
{ "type": "square", "uuid": "xxxx" },
{ "type": "square", "uuid": "xxxx" },
{ "type": "oval", "uuid": "zzzz" },
{ "type": "shape_info", "uuid": "zzzz" }
Now, I would like to search for all documents whose uuid doesn't already have a document of type: shape_info.
For instance, with the following document:
{
"type": "shape_info",
"uuid": "zzzz"
}
I would like all documents with uuid: zzzz to be excluded, as this uuid has a shape_info document.
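To make the expected result concrete, here is a minimal in-memory sketch in Python of the filtering I'm after, run on the sample documents above. It is not Elasticsearch code; the function name docs_without_shape_info is made up for illustration.

```python
# Sample documents from the question.
docs = [
    {"type": "square", "uuid": "xxxx"},
    {"type": "oval", "uuid": "zzzz"},
    {"type": "square", "uuid": "xxxx"},
    {"type": "square", "uuid": "xxxx"},
    {"type": "oval", "uuid": "zzzz"},
    {"type": "shape_info", "uuid": "zzzz"},
]


def docs_without_shape_info(documents):
    """Return the documents whose uuid has no shape_info document.

    Hypothetical helper, for illustration only.
    """
    # Pass 1: collect every uuid that already has a shape_info document.
    described = {d["uuid"] for d in documents if d["type"] == "shape_info"}
    # Pass 2: keep only the documents whose uuid is not in that set.
    return [d for d in documents if d["uuid"] not in described]


remaining = docs_without_shape_info(docs)
print(remaining)
```

With this sample, only the three square documents with uuid xxxx remain, since zzzz has a shape_info document.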
What I did
To achieve this I had to do 2 requests:
{
  "size": 0,
  "query": {
    "bool": {
      "must": [
        {
          "terms": {
            "type.keyword": ["shape_info"]
          }
        }
      ]
    }
  },
  "aggs": {
    "uuids": {
      "composite": {
        "size": 50,
        "sources": [
          {"myfield": {"terms": {"field": "uuid.keyword"}}}
        ]
      }
    }
  }
}
=> This returns the list of uuids that have a document of type: shape_info. Now I must get all records that do NOT have one of these uuids...
GET /datasets-1.3.0/_search
{
  "query": {
    "bool": {
      "must": [
        #not relevant
      ],
      "must_not": [
        { "match" : { "uuid": "zzzz"} }
      ]
    }
  }
}
I added the list of UUIDs we got from the previous request in the must_not section and it works...
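The glue between the two requests looks roughly like the Python sketch below. The helper names are made up, the response is a hand-written fragment shaped like a composite aggregation result (not real output), and the "must" filters are left out since they are not relevant here.

```python
def uuids_from_composite(response):
    """Extract the uuid values from one page of the composite aggregation.

    Assumes the aggregation is named "uuids" and its source "myfield",
    as in the first request.
    """
    buckets = response["aggregations"]["uuids"]["buckets"]
    return [bucket["key"]["myfield"] for bucket in buckets]


def build_exclusion_query(uuids):
    """Build the second request body with one must_not clause per uuid."""
    return {
        "query": {
            "bool": {
                "must_not": [{"match": {"uuid": u}} for u in uuids]
            }
        }
    }


# Hand-written fragment for illustration, not output from a real cluster.
sample_response = {
    "aggregations": {
        "uuids": {
            "after_key": {"myfield": "zzzz"},
            "buckets": [{"key": {"myfield": "zzzz"}, "doc_count": 1}],
        }
    }
}

body = build_exclusion_query(uuids_from_composite(sample_response))
print(body)
```

The must_not list grows by one clause per uuid returned by the first request, which is exactly the part that worries me.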
But once the first query returns thousands of uuids, the must_not will have thousands of UUIDs too. My god...
I don't think that's any good performance-wise, and I would like to find a way to merge the two queries so that I don't have to use a fat array of strings inside the second query!
Any help is appreciated,
Thx!