Missing results using query_string wildcard query with rewrite top_terms_n

I'm using Elasticsearch version 7.1.1.

I'm seeing some strange behaviour.
When I use the top_terms_n rewrite to get scores for wildcard matches in a query_string query, almost all results go missing. Without the rewrite I get about 300 results out of 10M documents, but with it I get only 3. Is this behaviour expected? Is there any way to get all the results and still have relevant scores calculated for wildcard matches?

The following query, without rewrite, returns about 300 results (out of 10M):

{
  "query": {
    "query_string": {
      "query": " +_data.cern_id:(ELG-*) +object_type:d +_data.author.full_name:eric"
    }
  }
}

However, if I add the rewrite (here top_terms_1000), I only get 3 results back:

{
  "query": {
    "query_string": {
      "query": " +_data.cern_id:(ELG-*) +object_type:d +_data.author.full_name:eric",
      "rewrite":"top_terms_1000"
    }
  }
}

The most confusing part is that if I make the wildcard more specific, e.g. ELG-GENNET-* instead of ELG-*, and keep the rest of the query unchanged, I get more results: 8 instead of 3.

{
  "query": {
    "query_string": {
      "query": " +_data.cern_id:(ELG-GENNET-*) +object_type:d +_data.author.full_name:eric",
      "rewrite":"top_terms_1000"
    }
  }
}
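
In case it helps anyone reproduce this, here is a minimal Python sketch that builds the three request bodies above, so it is clear that they differ only in the cern_id pattern and the rewrite setting. The helper name build_body and the variants dictionary are just names I made up for illustration, and the leading space in the query string is dropped since it should not matter.

```python
import json

# Clauses shared by all three queries in this post.
BASE_CLAUSES = "+object_type:d +_data.author.full_name:eric"


def build_body(cern_id_pattern, rewrite=None):
    """Build a query_string search body (hypothetical helper).

    The "rewrite" key is left out entirely when rewrite is None.
    """
    query_string = {
        "query": f"+_data.cern_id:({cern_id_pattern}) {BASE_CLAUSES}"
    }
    if rewrite is not None:
        query_string["rewrite"] = rewrite
    return {"query": {"query_string": query_string}}


# The three variants compared above.
variants = {
    "no rewrite": build_body("ELG-*"),
    "top_terms_1000, broad": build_body("ELG-*", rewrite="top_terms_1000"),
    "top_terms_1000, narrow": build_body("ELG-GENNET-*", rewrite="top_terms_1000"),
}

if __name__ == "__main__":
    for label, body in variants.items():
        print(label)
        print(json.dumps(body, indent=2))
```

Each printed body can be sent to the index's _search endpoint with search_type=dfs_query_then_fetch to compare the hit counts side by side.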

Additional details: I'm using dfs_query_then_fetch as the search type.
The cern_id field mentioned above is mapped as a keyword:

          "cern_id": {
            "type": "keyword",
            "normalizer": "case_accent_normalizer",
            "boost": 5
          },

And the case_accent_normalizer normalizer is defined as:

        "case_accent_normalizer": {
          "type": "custom",
          "filter": [
            "lowercase",
            "asciifolding"
          ]
        }

Thanks in advance!


bump
