Background
For some context before my question: I have an enum, let's call it my_enum. It has 5 possible values, all single characters: Z, Y, X, W, and V. My documents can have 0 or more of these values in their my_enum field.
My goal is to write a search query where a user can specify any subset of those enum values (e.g. [Z, X, W]) and I will return any documents whose my_enum field is a subset of the specified set. So for a query asking for [Z, X, W] I would return any documents that have the following my_enum values:
[Z, X, W]
[Z, X]
[X, W]
[Z, W]
[Z]
[X]
[W]
[]
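To make the matching rule concrete, here is a minimal Python sketch of the subset check, independent of Elasticsearch. The names ALL_VALUES, matches, and matching_combinations are hypothetical and only illustrate the semantics described above.

```python
from itertools import combinations

# The five enum values from the question (hypothetical constant name).
ALL_VALUES = ["Z", "Y", "X", "W", "V"]


def matches(doc_values, requested):
    """Return True when the document's my_enum values are a subset of the requested set."""
    return set(doc_values) <= set(requested)


def matching_combinations(requested):
    """Enumerate every combination of enum values that should be returned."""
    result = []
    for size in range(len(ALL_VALUES) + 1):
        for combo in combinations(ALL_VALUES, size):
            if matches(combo, requested):
                result.append(list(combo))
    return result


if __name__ == "__main__":
    for combo in matching_combinations(["Z", "X", "W"]):
        print(combo)
```

For a request of [Z, X, W] this enumerates the same eight combinations as the list above, including the empty one.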
Solutions
I believe I've written two queries that return equivalent results. Here is an example for a search for the set [Z, X, W]:
Using query_string
POST /oracle_cards/_search
{
  "size": 100,
  "query": {
    "query_string": {
      "default_field": "my_enum",
      "query": "(Z -Y X W -V) OR (-Z -Y -X -W -V)"
    }
  },
  "sort": [
    {
      "_id": {
        "order": "desc"
      }
    }
  ]
}
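Since the query string follows a fixed pattern, it could be generated from the requested subset. The following is only a sketch under that assumption; build_query_string and ALL_VALUES are hypothetical names, and the output format simply mirrors the string used in the request above.

```python
# The five enum values from the question (hypothetical constant name).
ALL_VALUES = ["Z", "Y", "X", "W", "V"]


def build_query_string(requested):
    """Build the query_string expression for a requested subset of enum values."""
    requested = set(requested)
    unknown = requested - set(ALL_VALUES)
    if unknown:
        raise ValueError(f"unknown enum values: {sorted(unknown)}")

    # First group: requested values appear bare, all other values are negated.
    allowed = " ".join(v if v in requested else f"-{v}" for v in ALL_VALUES)
    # Second group: every value negated, intended to cover documents with no values.
    none_set = " ".join(f"-{v}" for v in ALL_VALUES)
    return f"({allowed}) OR ({none_set})"


if __name__ == "__main__":
    print(build_query_string(["Z", "X", "W"]))
```

The returned string would go into the "query" property of the query_string clause.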
Using bool
POST /oracle_cards/_search
{
  "size": 100,
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "my_enum": "Z"
          }
        },
        {
          "match": {
            "my_enum": "W"
          }
        },
        {
          "match": {
            "my_enum": "X"
          }
        },
        {
          "bool": {
            "must_not": [
              {
                "exists": {
                  "field": "my_enum"
                }
              }
            ]
          }
        }
      ],
      "must_not": [
        {
          "match": {
            "my_enum": "Y"
          }
        },
        {
          "match": {
            "my_enum": "V"
          }
        }
      ]
    }
  },
  "sort": [
    {
      "_id": {
        "order": "desc"
      }
    }
  ]
}
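The bool version can likewise be assembled programmatically. Below is a minimal sketch that produces a request body with the same structure as the one above; build_subset_query and ALL_VALUES are hypothetical names, and the sketch only builds the JSON, it does not send it to a cluster.

```python
import json

# The five enum values from the question (hypothetical constant name).
ALL_VALUES = ["Z", "Y", "X", "W", "V"]


def build_subset_query(requested, field="my_enum", size=100):
    """Build a bool query body matching documents whose values are a subset of `requested`."""
    requested = set(requested)
    unknown = requested - set(ALL_VALUES)
    if unknown:
        raise ValueError(f"unknown enum values: {sorted(unknown)}")

    # One "should" clause per requested value, plus one for documents without the field.
    should = [{"match": {field: v}} for v in ALL_VALUES if v in requested]
    should.append({"bool": {"must_not": [{"exists": {"field": field}}]}})

    # One "must_not" clause per value that was not requested.
    must_not = [{"match": {field: v}} for v in ALL_VALUES if v not in requested]

    bool_query = {"should": should}
    if must_not:  # omit the key entirely when every value was requested
        bool_query["must_not"] = must_not

    return {
        "size": size,
        "query": {"bool": bool_query},
        "sort": [{"_id": {"order": "desc"}}],
    }


if __name__ == "__main__":
    print(json.dumps(build_subset_query(["Z", "X", "W"]), indent=2))
```

The printed body can be pasted after POST /oracle_cards/_search to compare against the hand-written version.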
My Question
This query should be as fast as possible, so my question is: will one of these perform better than the other? If so, why? I imagine the query_string version will be less performant because it has to parse the query string first, but I'm not sure, as I can't find any documentation about it.