About Elasticsearch Queries generated by SQL Workbench

loco_d.iwamoto · July 24, 2023, 6:03am

Thank you for your assistance.
I would like to ask about the correctness and speed of the Elasticsearch queries generated by Query Workbench.

This is the index mapping.

"mappings": {
    "properties": {
        "a": {
            "type": "long"
        },
        "b": {
            "type": "long"
        },
        "c": {
            "type": "long"
        },
        "d": {
            "type": "long"
        },
        "e": {
            "type": "text",
            "fields": {
                "keyword": {
                    "type": "keyword",
                    "ignore_above": 256
                }
            }
        }
    }
}

Now consider a query that combines multiple criteria with AND,
like "WHERE condition AND condition AND condition ..." in SQL.

To get the equivalent Elasticsearch query, I wrote the following SQL in Query Workbench and ran Explain on it.

SELECT *
FROM test 
WHERE a = 1 AND b = 1 AND c = 3

Here is the query that Query Workbench's Explain returned.

{
    "from": 0,
    "size": 200,
    "timeout": "1m",
    "query": {
        "bool": {
            "filter": [
                {
                    "bool": {
                        "filter": [
                            {
                                "term": {
                                    "a": {
                                        "value": 1,
                                        "boost": 1.0
                                    }
                                }
                            },
                            {
                                "term": {
                                    "b": {
                                        "value": 1,
                                        "boost": 1.0
                                    }
                                }
                            }
                        ],
                        "adjust_pure_negative": true,
                        "boost": 1.0
                    }
                },
                {
                    "term": {
                        "c": {
                            "value": 3,
                            "boost": 1.0
                        }
                    }
                }
            ],
            "adjust_pure_negative": true,
            "boost": 1.0
        }
    },
    "_source": {
        "includes": [
            "a",
            "b",
            "c",
            "d",
            "e"
        ],
        "excludes": []
    },
    "sort": [
        {
            "_doc": {
                "order": "asc"
            }
        }
    ]
}

Notice that each filter array always contains exactly two elements.
This stayed true even when I increased the number of conditions in the WHERE clause.

Here's my question: why is this happening?
What is the advantage over putting all the conditions at the same level of a single filter array?

For example, the query could be written like this, which would also be easier to generate dynamically in an API program.

{
    "from": 0,
    "size": 200,
    "timeout": "1m",
    "query": {
        "bool": {
            "filter": [
                {
                    "term": {
                        "a": {
                            "value": 1,
                            "boost": 1.0
                        }
                    }
                },
                {
                    "term": {
                        "b": {
                            "value": 1,
                            "boost": 1.0
                        }
                    }
                },
                {
                    "term": {
                        "c": {
                            "value": 3,
                            "boost": 1.0
                        }
                    }
                }
            ],
            "adjust_pure_negative": true,
            "boost": 1.0
        }
    },
    "_source": {
        "includes": [
            "a",
            "b",
            "c",
            "d",
            "e"
        ],
        "excludes": []
    },
    "sort": [
        {
            "_doc": {
                "order": "asc"
            }
        }
    ]
}
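
As a rough sketch of what "generating the query dynamically" could look like for this flat form: the snippet below assumes the API program is written in Python, and build_and_query is a hypothetical helper name, not something from Query Workbench or an Elasticsearch client. It only builds the request body and does not send it anywhere.

```python
import json


def build_and_query(conditions, size=200):
    """Build a flat bool/filter body from {field: value} pairs.

    Hypothetical helper for illustration; the field names and the
    size default simply mirror the example query above.
    """
    # One term clause per condition, all at the same level of the filter array.
    filters = [
        {"term": {field: {"value": value}}}
        for field, value in conditions.items()
    ]
    return {
        "from": 0,
        "size": size,
        "query": {"bool": {"filter": filters}},
    }


query = build_and_query({"a": 1, "b": 1, "c": 3})
print(json.dumps(query, indent=4))
```

Adding a fourth condition is then just one more entry in the dictionary, with no extra nesting to manage.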

That is all.
Thank you in advance.

dadoonet · July 24, 2023, 6:30am

Welcome.

I'm not sure what you are asking, but a generation tool is almost never as smart as a human...

I'd actually simplify the query even further, like this:

{
    "query": {
        "bool": {
            "filter": [
                { "term": { "a": 1 } },
                { "term": { "b": 1 } },
                { "term": { "c": 3 } }
            ]
        }
    }
}

loco_d.iwamoto · July 24, 2023, 6:37am

Thank you for your answer.

I realize my English is a little strange because it is machine-translated from Japanese. My apologies.

My question is: which is faster, that simple query or a complex nested query like the one below?
What is the purpose of the nested form? If it is not needed, I would rather not use it.

{
    "query": {
        "bool": {
            "filter": [
                {
                    "bool": {
                        "filter": [
                            { "term": { "b": 1 } },
                            { "term": { "c": 3 } }
                        ]
                    }
                },
                { "term": { "a": 1 } }
            ]
        }
    }
}

If nesting is not necessary, we will avoid it for the sake of readability.

That is all.
Thank you in advance.

dadoonet · July 24, 2023, 8:38am

I think you can safely use the version I shared. It will do the same thing and is more readable.
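
To illustrate why the two forms match the same documents: filter clauses in a bool query are combined with AND, and AND is associative, so a bool that contains only filter clauses can be unwrapped into its parent. Here is a minimal sketch in Python, assuming each "filter" value is a list as in the queries above; flatten_filters is a hypothetical helper written for this example, not part of any Elasticsearch client. It says nothing about which form runs faster.

```python
def flatten_filters(query):
    """Collect the leaf clauses of nested filter-only bool queries."""
    bool_body = query.get("bool")
    # Only unwrap a bool whose sole clause type is "filter". A bool with
    # must, should or must_not, or any non-bool query, is kept as one leaf.
    ignorable = {"filter", "boost", "adjust_pure_negative"}
    if (
        bool_body is None
        or "filter" not in bool_body
        or not set(bool_body) <= ignorable
    ):
        return [query]
    leaves = []
    for clause in bool_body["filter"]:
        leaves.extend(flatten_filters(clause))
    return leaves


nested = {
    "bool": {
        "filter": [
            {"bool": {"filter": [{"term": {"b": 1}}, {"term": {"c": 3}}]}},
            {"term": {"a": 1}},
        ]
    }
}

# Rebuild the query with every condition at the same level.
flat = {"bool": {"filter": flatten_filters(nested)}}
print(flat)
```

The same function can be applied to the generated query from the first post, since it skips over the "boost" and "adjust_pure_negative" keys that the tool adds.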

system · August 21, 2023, 8:39am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
