About ElasticSearch Queries generated by SQL Workbench

Thank you for your assistance.
I would like to ask about the accuracy and speed of Elasticsearch Queries generated from QueryWorkBench.

This is IndexMapping.

"mappings": {
      "properties": {
        "a": {
          "type": "long"
        },
        "b": {
          "type": "long"
        },
        "c": {
          "type": "long"
        },
        "d": {
          "type": "long"
        },
        "e": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        }
      }
    }

Now consider a query that performs AND searches on multiple criteria.
As in SQL, "WHERE condition AND condition AND condition ..." in SQL.

To create a query for Elasticsearch, the following SQL was created in QueryWorkBench and Explain was performed.

SELECT *
FROM test 
WHERE a = 1 AND b = 1 AND c = 3

Here are the results as analyzed by Query WorkBench.

{
    "from": 0,
    "size": 200,
    "timeout": "1m",
    "query": {
        "bool": {
            "filter": [
                {
                    "bool": {
                        "filter": [
                            {
                                "term": {
                                    "a": {
                                        "value": 1,
                                        "boost": 1.0
                                    }
                                }
                            },
                            {
                                "term": {
                                    "b": {
                                        "value": 1,
                                        "boost": 1.0
                                    }
                                }
                            }
                        ],
                        "adjust_pure_negative": true,
                        "boost": 1.0
                    }
                },
                {
                    "term": {
                        "c": {
                            "value": 3,
                            "boost": 1.0
                        }
                    }
                }
            ],
            "adjust_pure_negative": true,
            "boost": 1.0
        }
    },
    "_source": {
        "includes": [
            "a",
            "b",
            "c",
            "d",
            "e"
        ],
        "excludes": []
    },
    "sort": [
        {
            "_doc": {
                "order": "asc"
            }
        }
    ]
}

At this time, we see that the filter element always contains two elements.
This was true even when we increased the number of conditions searched for in Where.

Here's my question: why is this happening?
What is the advantage over putting everything in the same hierarchy in the FILTER and searching?

For example, you could do this, and it would be easier to generate the query dynamically in the API program.

{
    "from": 0,
    "size": 200,
    "timeout": "1m",
    "query": {
        "bool": {
            "filter": [
                {
                    "term": {
                        "a": {
                            "value": 1,
                            "boost": 1.0
                        }
                    }
                },
                {
                    "term": {
                        "b": {
                            "value": 1,
                            "boost": 1.0
                        }
                    }
                },
                {
                    "term": {
                        "c": {
                            "value": 3,
                            "boost": 1.0
                        }
                    }
                }
            ],
            "adjust_pure_negative": true,
            "boost": 1.0
        }
    },
    "_source": {
        "includes": [
            "a",
            "b",
            "c",
            "d",
            "e"
        ],
        "excludes": []
    },
    "sort": [
        {
            "_doc": {
                "order": "asc"
            }
        }
    ]
}

That is all.
Thank you in advance.

Welcome.

I'm not sure about what you are asking. But indeed a generation tool is almost always not as smart as a human... :wink:

I'd even simplify the query actually like this:

{
    "query": {
        "bool": {
            "filter": [
                { "term": { "a": 1 } },
                { "term": { "b": 1 } },
                { "term": { "c": 3 } }
            ]
        }
    }
}

Thank you for your answer.

I understand that the English is a little strange because it is automatically converted from Japanese. My apologies.

My question is, which is faster, that simple query or the complex nested query?
What is the meaning of the complex nested query? If you don't need it, then you don't have to do it.

{
    "query": {
        "bool": {
            "filter": [
                {
                    "bool": {
                        "filter": [
                            { "term": { "b": 1 } },
                            { "term": { "c": 3 } }
                        ]
                    }
                },
                { "term": { "a": 1 } }
            ]
        }
    }
}

If it is not necessary, then we will not use nesting for visibility purposes.

That is all.
Thank you in advance.

I think you can safely use the version I shared. It will do the same thing and is more readable.

1 Like

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.