Multiple bool-should-match phrase query optimization

dangnh · December 19, 2016, 8:29pm

Hello all, we are querying Elasticsearch to fetch exactly those documents whose ID matches one of the IDs in an array. It's similar to the following SQL query:

SELECT * FROM myindex WHERE transaction_id IN (id1, id2, id3)

Translated to an Elasticsearch query:

{
	"from": 0,
	"size": 10000,
	"query": {
		"bool": {
			"must": {
				"bool": {
					"should": [
						{
							"match": {
								"transaction_id": {
									"query": "a",
									"type": "phrase"
								}
							}
						},
						{
							"match": {
								"transaction_id": {
									"query": "b",
									"type": "phrase"
								}
							}
						},
						{
							"match": {
								"transaction_id": {
									"query": "c",
									"type": "phrase"
								}
							}
						}
					]
				}
			}
		}
	}
}

However, with ~136 million documents (and growing) and an ID array of ~5000 entries, this query becomes extremely slow.
Any suggestions for optimizing it?

nik9000 · December 19, 2016, 9:10pm

The terms query (docs) is designed to match one of many terms. Have a look at that one. It isn't analyzed and doesn't support phrase queries, only single terms, but it might help here if you can use it.
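As a rough sketch of what that could look like for the example above: the three `match` clauses collapse into a single `terms` clause. This assumes `transaction_id` is indexed so that each ID is stored as one exact term (for example a not-analyzed or keyword field); if the field is analyzed, the values listed here would have to match the analyzed tokens instead. Placing the clause under `bool.filter` is also an assumption on my part: it skips scoring, which an IN-style lookup usually doesn't need.

```json
{
	"from": 0,
	"size": 10000,
	"query": {
		"bool": {
			"filter": {
				"terms": {
					"transaction_id": ["a", "b", "c"]
				}
			}
		}
	}
}
```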

dangnh · January 12, 2017, 9:51pm

I profiled both queries using the profile API and they seem to be identical. I'm confused now.
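For anyone wanting to repeat the comparison, a minimal sketch of a profiled search body is below. Setting `"profile": true` at the top level of the request asks Elasticsearch to return the rewritten query tree and timings for each shard. The `terms` clause and its three values are placeholders taken from the example earlier in the thread, not the actual query that was profiled.

```json
{
	"profile": true,
	"query": {
		"terms": {
			"transaction_id": ["a", "b", "c"]
		}
	}
}
```

Sending the same body with the bool/should/match version of the query and comparing the `profile` sections of the two responses shows whether both end up as the same rewritten query.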

nik9000 · January 13, 2017, 12:28am

The terms query rewrites to a bool query with a bunch of should clauses if
there are fewer than a certain number of terms, IIRC.

dangnh · January 13, 2017, 12:30am

Thank you. So what happens if there are many terms? Can you point me to some resources about this?

dangnh · January 13, 2017, 1:07am

@nik9000 I tested with 1056 different terms in one query; from the profile API I can see that it is still rewritten to many should clauses.

littlepoint · January 18, 2017, 8:07am

How about using multi_match?
{
	"query": {
		"multi_match": {
			"query": q,
			"fields": ["transaction_id"],
			"type": "cross_fields"
		}
	}
}

system · February 15, 2017, 8:08am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
