Hello there,
I'm running tests in my Elasticsearch cluster, trying to optimize parallel queries. Surprisingly, the _msearch query is consistently, and by a wide margin, slower than the individual _search API.
My test, always from the same endpoint.
First, I run this 10 times:
[GET] {index}/document/_search
{
"size": 10,
"sort" : {
"_script" : {
"script" : "Math.random()",
"type" : "number",
"params" : {},
"order" : "asc"
}
},
"query": {
"query_string": {
"query": "*:*"
}
}
}
Then I send just one call with:
[POST] (binary) /_msearch
{"index":"jobs","type":"document"}
{"size": 10,"sort" : {"_script" : {"script" : "Math.random()","type" : "number","params" : {},"order" : "asc"}},"query": {"query_string": {"query": "*:*" }}}
{"index":"jobs","type":"document"}
{"size": 10,"sort" : {"_script" : {"script" : "Math.random()","type" : "number","params" : {},"order" : "asc"}},"query": {"query_string": {"query": "*:*" }}}
{"index":"jobs","type":"document"}
{"size": 10,"sort" : {"_script" : {"script" : "Math.random()","type" : "number","params" : {},"order" : "asc"}},"query": {"query_string": {"query": "*:*" }}}
{"index":"jobs","type":"document"}
{"size": 10,"sort" : {"_script" : {"script" : "Math.random()","type" : "number","params" : {},"order" : "asc"}},"query": {"query_string": {"query": "*:*" }}}
{"index":"jobs","type":"document"}
{"size": 10,"sort" : {"_script" : {"script" : "Math.random()","type" : "number","params" : {},"order" : "asc"}},"query": {"query_string": {"query": "*:*" }}}
{"index":"jobs","type":"document"}
{"size": 10,"sort" : {"_script" : {"script" : "Math.random()","type" : "number","params" : {},"order" : "asc"}},"query": {"query_string": {"query": "*:*" }}}
{"index":"jobs","type":"document"}
{"size": 10,"sort" : {"_script" : {"script" : "Math.random()","type" : "number","params" : {},"order" : "asc"}},"query": {"query_string": {"query": "*:*" }}}
.
.
.
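For anyone who wants to reproduce this, here is a minimal Python sketch of how a body like the one above could be generated. The helper name build_msearch_body is just something made up for illustration; the sketch only builds the newline-delimited payload (one header line plus one search line per query, ending with a newline) and does not send anything to the cluster.

```python
import json


def build_msearch_body(index, doc_type, query, count):
    """Build a newline-delimited _msearch payload with `count` identical searches."""
    header = json.dumps({"index": index, "type": doc_type})
    body = json.dumps(query)
    lines = []
    for _ in range(count):
        lines.append(header)  # header line: which index/type to search
        lines.append(body)    # body line: the search request itself
    # The payload has to end with a newline after the last line.
    return "\n".join(lines) + "\n"


# Same query as in the post: random script sort over a match-all query_string.
random_sort_query = {
    "size": 10,
    "sort": {
        "_script": {
            "script": "Math.random()",
            "type": "number",
            "params": {},
            "order": "asc",
        }
    },
    "query": {"query_string": {"query": "*:*"}},
}

payload = build_msearch_body("jobs", "document", random_sort_query, 10)
```

The resulting payload string would then be sent as the raw body of the POST to /_msearch.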
Both approaches return exactly the same thing. Notice I'm using a random sort to prevent Elasticsearch from caching any filters or results.
Doing 10 individual queries (not even using persistent connections, so we need to add the connection setup "HELO" each time: 10 x 50ms = 500ms extra) is consistently faster than using _msearch.
Results, individual: ~4400ms
Results, multi: ~10230ms
That is more than 2x the time... any ideas?
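The comparison can be sketched roughly like this in Python. It is only an outline under assumptions: single_search and multi_search are hypothetical callables standing in for the real HTTP requests, and the fake versions below just sleep so the sketch runs without a cluster.

```python
import time


def time_call(fn, repeats=1):
    """Return wall-clock seconds spent calling fn() `repeats` times in sequence."""
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    return time.perf_counter() - start


def compare(single_search, multi_search, n=10):
    """Time n sequential single searches against one multi search call."""
    individual = time_call(single_search, repeats=n)
    multi = time_call(multi_search, repeats=1)
    ratio = multi / individual if individual > 0 else float("inf")
    return {"individual_s": individual, "multi_s": multi, "ratio": ratio}


# Stand-ins for the real requests (assumption: replace with actual HTTP calls).
def fake_single():
    time.sleep(0.001)


def fake_multi():
    time.sleep(0.02)


result = compare(fake_single, fake_multi, n=10)
print(result)
```

Measuring both variants with the same wall-clock timer and from the same client keeps the two numbers comparable.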
I really hope I'm doing something stupid, but I'm doing exactly what the documentation says, and I'm not seeing any benefit from this multi crap stuff.
Thanks in advance for any help,
Daniel.