Exclude certain fields from free text search - Elasticsearch

Hi all,
I have indexed documents with over 100 fields each, and every field is also analysed with an edge n-gram tokenizer to support auto-suggestion. I also need a free-text search that covers all fields. When I run it, the search also hits the sub-fields that use the autocomplete analyzer (e.g. Data.autocomplete_analyzed). I want to restrict it to the fields analysed as plain "text" (e.g. Data). Is there a way to do this at (1) index time or (2) query time?
Mapping file:

    "mappings": {
      "_doc": {
        "properties": {
          "Data": {
            "type": "text",
            "fields": {
              "autocomplete_analyzed": {
                "type": "text",
                "analyzer": "autocomplete"
              },
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          }
        }
      }
    }

Search query:

    {
      "query": {
        "bool": {
          "should": [
            {
              "multi_match": {
                "query": "aim",
                "type": "phrase",
                "slop": "2",
                "fields": []
              }
            },
            {
              "multi_match": {
                "query": "aim",
                "fuzziness": "1",
                "fields": []
              }
            }
          ],
          "minimum_should_match": 1
        }
      }
    }

@elastic can you please answer my query?

You could add a sub-field that uses the standard analyzer, then search only on that sub-field.

PUT index
{
  "mappings": {
    "_doc": {
      "properties": {
        "Data": {
          "type": "text",
          "fields": {
            "default": {
              "type": "text",
              "analyzer": "standard"
            },
            "autocomplete_analyzed": {
              "type": "text",
              "analyzer": "autocomplete"
            },
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        }
      }
    }
  }
}

And your query would be:

GET /index/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "multi_match": {
            "query": "aim",
            "type": "phrase",
            "slop": "2",
            "fields": [
              "*.default"
            ]
          }
        },
        {
          "multi_match": {
            "query": "aim",
            "fuzziness": "1",
            "fields": [
              "*.default"
            ]
          }
        }
      ],
      "minimum_should_match": 1
    }
  }
}
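
Since both `multi_match` clauses must carry the same field list, it can help to build the request body from a single list. A minimal Python sketch, assuming the body is sent with whatever client you already use; the helper name `build_free_text_query` is made up for illustration and is not part of any Elasticsearch client:

```python
import json


def build_free_text_query(text, fields):
    """Build the two-clause bool query from this thread for one field list.

    `fields` is a list of field names or wildcard patterns, e.g. ["*.default"].
    """
    return {
        "query": {
            "bool": {
                "should": [
                    # Phrase match, tolerating up to two positions of slop.
                    {
                        "multi_match": {
                            "query": text,
                            "type": "phrase",
                            "slop": 2,
                            "fields": list(fields),
                        }
                    },
                    # Fuzzy match, allowing one edit per term.
                    {
                        "multi_match": {
                            "query": text,
                            "fuzziness": 1,
                            "fields": list(fields),
                        }
                    },
                ],
                "minimum_should_match": 1,
            }
        }
    }


body = build_free_text_query("aim", ["*.default"])
print(json.dumps(body, indent=2))
```

Changing the pattern in one place then updates both clauses.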

@klof Thanks for your answer and time.
This is one way to solve my issue, but I have millions of documents with over 100 fields each. Indexing this way will increase storage consumption. Is there any other method that avoids this duplication? Or is there any way to exclude certain fields at query time?

I think it's fine to keep this extra sub-field. But if you really want to trim the mapping, you can make the main field a keyword and keep only the two text sub-fields.

PUT index
{
  "mappings": {
    "_doc": {
      "properties": {
        "Data": {
          "type": "keyword",
          "ignore_above": 256,
          "fields": {
            "default": {
              "type": "text",
              "analyzer": "standard"
            },
            "autocomplete_analyzed": {
              "type": "text",
              "analyzer": "autocomplete"
            }
          }
        }
      }
    }
  }
}

From what I know, I don't think it's possible to "deselect" fields at query time.
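
A possible workaround, rather than deselecting, is to list the wanted fields explicitly in `fields`. With 100+ fields that list can be derived from the mapping instead of typed by hand. A rough Python sketch, assuming you have already fetched the `properties` section of the mapping as a dict; `searchable_text_fields` is a hypothetical helper, and `"autocomplete"` is the analyzer name used in the mapping above:

```python
def searchable_text_fields(properties, excluded_analyzers=("autocomplete",), prefix=""):
    """Return full paths of text fields and sub-fields whose analyzer is not excluded."""
    paths = []
    for name, spec in properties.items():
        path = prefix + name
        # Keep text fields unless they use one of the excluded analyzers.
        if spec.get("type") == "text" and spec.get("analyzer") not in excluded_analyzers:
            paths.append(path)
        # Recurse into multi-fields ("fields") and object fields ("properties").
        paths += searchable_text_fields(spec.get("fields", {}), excluded_analyzers, path + ".")
        paths += searchable_text_fields(spec.get("properties", {}), excluded_analyzers, path + ".")
    return paths


# The "properties" section of the original mapping from this thread.
mapping_properties = {
    "Data": {
        "type": "text",
        "fields": {
            "autocomplete_analyzed": {"type": "text", "analyzer": "autocomplete"},
            "keyword": {"type": "keyword", "ignore_above": 256},
        },
    }
}

fields = searchable_text_fields(mapping_properties)
print(fields)  # ['Data']
```

The resulting list can go straight into the `fields` array of each `multi_match` clause. With the original mapping, the main `Data` field is already a plain text field, so this route needs no extra sub-field.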

Thanks for your response.
Is there a way to specify, at index time, a set of fields that should be ignored by search?
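
One option that may fit here is the index setting `index.query.default_field`. As far as I know, it lists the fields that `multi_match` (and `query_string`) search when the query names no fields, and it defaults to `*`. Whether it is available, and whether it accepts a list, depends on your Elasticsearch version, so please check the index modules documentation first. A small Python sketch of the settings body; the helper name is made up, and `["Data"]` stands in for your real field list:

```python
import json


def default_field_settings(fields):
    """Build a settings body that limits the fields searched by default.

    Assumption: the target cluster supports index.query.default_field as a list.
    """
    return {"index.query.default_field": list(fields)}


settings_body = default_field_settings(["Data"])
print(json.dumps(settings_body))
```

This body would be sent to the index's `_settings` endpoint. It works as an allow-list rather than an ignore-list: anything left out of the list is skipped by default but can still be searched when named explicitly in `fields`.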
