Search on Multiple Indices

I am running a query that searches multiple indices in Elasticsearch. The problem is that the results are not ranked the way I expect.

For example, I have 2 indices: index_1 and index_2

index_1 has ~80 docs, including:

  • Doc_1_A: contains 2 occurrences of the keyword "test search"
  • Doc_1_B: contains 1 occurrence of the keyword "test search"

index_2 has ~10 docs, including:

  • Doc_2_A: contains 2 occurrences of the keyword "test search"

My request:

***/index_1,index_2/_search

My query:

{
    "query": {
        "bool": {
            "must": [
                {
                    "query_string": {
                        "fields": [
                            "Name",
                            "Desc"
                        ],
                        "query": "test search",
                        "analyzer": "whitespace",
                        "fuzziness": "AUTO",
                        "default_operator": "AND"
                    }
                }
            ]
        }
    },
    "size": 20,
    "highlight": {
        "require_field_match": true,
        "fields": {
            "*": {
                "fragment_size": 100,
                "number_of_fragments": 1
            }
        }
    }
}

The results of this search are ranked like this:
Doc_1_A >> Doc_1_B >> Doc_2_A
What I expect (since Doc_2_A has more keyword matches than Doc_1_B) is:
Doc_1_A >> Doc_2_A >> Doc_1_B

I have read about how ES scores documents: the score is affected by TF and IDF, so the ranking gets skewed because index_1 holds far more docs than index_2. The score of Doc_2_A is much lower than that of Doc_1_B, which is why the ranking doesn't match what I expect.
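To make the IDF effect concrete, here is a minimal sketch of the IDF term used by Lucene's BM25 similarity. The document frequencies (2 matching docs out of 80, 1 out of 10) are assumptions for illustration only, and the sketch treats each index as a single shard, since term statistics are collected per shard by default.

```python
import math


def bm25_idf(doc_count: int, doc_freq: int) -> float:
    """IDF term of BM25 as implemented in Lucene:
    ln(1 + (N - n + 0.5) / (n + 0.5)),
    where N is the number of docs and n the number of docs containing the term.
    """
    return math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))


# Hypothetical statistics: the term appears in 2 of ~80 docs in the larger
# index and in 1 of ~10 docs in the smaller one.
idf_large = bm25_idf(doc_count=80, doc_freq=2)
idf_small = bm25_idf(doc_count=10, doc_freq=1)

print(f"IDF with index_1-like statistics: {idf_large:.3f}")
print(f"IDF with index_2-like statistics: {idf_small:.3f}")
```

Under these assumed numbers the same term gets a noticeably higher IDF in the larger index, which is consistent with Doc_1_B outranking Doc_2_A. It may also be worth testing the request with `search_type=dfs_query_then_fetch`, which asks Elasticsearch to gather term statistics from all involved shards before scoring.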

When I reindex my two indices into one (index_merged) and search that single index, the results are ranked the way I expect. But I can't keep this new index up to date when a doc is added or removed.

But if I search across multiple indices as in my query, I can't get the ranking I expect.

Is there any way to solve this?

I've been through a similar situation: my indices had different mappings with some fields in common, and I chose multi search instead of your approach.
I was reading the docs and saw that you can boost indices individually; I don't know whether that helps you.
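As a rough illustration of index boosting, the sketch below builds a search body with the `indices_boost` option. The helper name `boosted_search_body` and the boost values are hypothetical; suitable values would have to be found by experiment.

```python
import json


def boosted_search_body(query: dict, boosts: dict) -> dict:
    """Return a search body that applies a per-index boost.

    `indices_boost` takes a list of single-entry objects mapping an index
    name to the factor its scores are multiplied by.
    """
    return {
        "indices_boost": [{name: boost} for name, boost in boosts.items()],
        "query": query,
    }


# Query taken from the question, trimmed to the parts relevant here.
query = {
    "query_string": {
        "fields": ["Name", "Desc"],
        "query": "test search",
        "default_operator": "AND",
    }
}

# Hypothetical boost values: favour the smaller index slightly.
body = boosted_search_body(query, {"index_1": 1.0, "index_2": 1.5})
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of the `/index_1,index_2/_search` request. A fixed boost only shifts all scores from one index by a constant factor, so it is a tuning knob rather than a way to make scores comparable.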

Thanks for your reply.

I did use index boosting, and it does help me rank the index I want a little higher.

I also tried the multi search approach that you mentioned, but the results seem a little off and I don't quite get the example shown on the page. Could you show me an example, or point me to any sources that cover it? It would mean a lot to me.

Multi search accepts one or more queries and returns a separate result set for each query, rather than a single merged result.
Look at this example:

GET _msearch
{"index":"idx_movies"}
{"size":1, "_source": ["title"], "query":{"match_all":{}}}
{"index":"my-index"}
{"size":1, "query":{"match_all":{}}}

Response:

{
  "took" : 3,
  "responses" : [
    {
      "took" : 0,
      "timed_out" : false,
      "_shards" : {
        "total" : 1,
        "successful" : 1,
        "skipped" : 0,
        "failed" : 0
      },
      "hits" : {
        "total" : {
          "value" : 1001,
          "relation" : "eq"
        },
        "max_score" : 1.0,
        "hits" : [
          {
            "_index" : "idx_movies",
            "_type" : "_doc",
            "_id" : "vpwMOYEBBR941NShMW3o",
            "_score" : 1.0,
            "_source" : {
              "title" : "Guardians of the Galaxy"
            }
          }
        ]
      },
      "status" : 200
    },
    {
      "took" : 2,
      "timed_out" : false,
      "_shards" : {
        "total" : 1,
        "successful" : 1,
        "skipped" : 0,
        "failed" : 0
      },
      "hits" : {
        "total" : {
          "value" : 2,
          "relation" : "eq"
        },
        "max_score" : 1.0,
        "hits" : [
          {
            "_index" : "my-index",
            "_type" : "_doc",
            "_id" : "1",
            "_score" : 1.0,
            "_source" : {
              "my_text" : "text1",
              "my_vector" : [
                0.5,
                10,
                6
              ]
            }
          }
        ]
      },
      "status" : 200
    }
  ]
}
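For the two indices from the question, the request and response handling could be sketched as follows. The helper names `build_msearch_body` and `flatten_hits` are hypothetical, and the `match_all` query is only a placeholder for the real query.

```python
import json


def build_msearch_body(indices, search_body):
    """Build the newline-delimited JSON body expected by _msearch:
    one header line and one body line per index, with a trailing newline."""
    lines = []
    for name in indices:
        lines.append(json.dumps({"index": name}))
        lines.append(json.dumps(search_body))
    return "\n".join(lines) + "\n"


def flatten_hits(msearch_response):
    """Collect (index, id, score) tuples from every entry in "responses".

    Entries without a "hits" section (for example failed searches) are skipped.
    """
    rows = []
    for response in msearch_response.get("responses", []):
        for hit in response.get("hits", {}).get("hits", []):
            rows.append((hit["_index"], hit["_id"], hit["_score"]))
    return rows


ndjson = build_msearch_body(
    ["index_1", "index_2"],
    {"size": 20, "query": {"match_all": {}}},  # placeholder query
)
print(ndjson, end="")
```

Each entry in `responses` is scored on its own, so the `_score` values of different entries may not be directly comparable; how to merge or interleave the flattened hits is left to the application.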

Thanks, I will try to do this with your example.

