Inquiry on Elastic Search Terms for Prepositions

Hello Elastic,

My dev team tested search terms in Elasticsearch on our app's site and found that Elasticsearch does not highlight what I would loosely call preposition-type terms, such as 'the', 'with', 'are', etc.

The attached images show an example: we searched for "we are excited", but the term "are" was not highlighted.

Appreciate your guidance on this.

Thanks!

This comes down to the analyzer. What is the mapping for this field?

I'm not sure about the mapping part, but I ingest the data through MSSQL Connectors.

The way you collect data (connectors, direct push, ETL...) does not influence the mapping.
What is your index name?
Check its settings in Kibana.

We have multiple fields involved and, from what I can see, they all share the structure below:

   {
        "type": "text",
        "fields": {
          "delimiter": {
            "type": "text",
            "index_options": "freqs",
            "analyzer": "iq_text_delimiter"
          }
        }
   }

Not sure if this is what you asked for; let me know if you need more info.

Thanks!

Could you share the full index settings (including the mapping)?

You can get this with:

GET /INDEXNAME

Hi, yes, sure:

{
  "search-#########": {
    "aliases": {},
    "mappings": {
      "dynamic": "true",
      "dynamic_templates": [
        {
          "data": {
            "match_mapping_type": "string",
            "mapping": {
              "analyzer": "iq_text_base",
              "fields": {
                "prefix": {
                  "search_analyzer": "q_prefix",
                  "analyzer": "i_prefix",
                  "type": "text",
                  "index_options": "docs"
                },
                "delimiter": {
                  "analyzer": "iq_text_delimiter",
                  "type": "text",
                  "index_options": "freqs"
                },
                "joined": {
                  "search_analyzer": "q_text_bigram",
                  "analyzer": "i_text_bigram",
                  "type": "text",
                  "index_options": "freqs"
                },
                "enum": {
                  "ignore_above": 2048,
                  "type": "keyword"
                },
                "stem": {
                  "analyzer": "iq_text_stem",
                  "type": "text"
                }
              },
              "index_options": "freqs",
              "type": "text"
            }
          }
        }
      ],
      "properties": {
        "_subextracted_as_of": {
          "type": "date"
        },
        "_subextracted_version": {
          "type": "keyword"
        },
        "_timestamp": {
          "type": "date"
        },
        "database": {
          "type": "text",
          "fields": {
            "delimiter": {
              "type": "text",
              "index_options": "freqs",
              "analyzer": "iq_text_delimiter"
            },
            "enum": {
              "type": "keyword",
              "ignore_above": 2048
            },
            "joined": {
              "type": "text",
              "index_options": "freqs",
              "analyzer": "i_text_bigram",
              "search_analyzer": "q_text_bigram"
            },
            "prefix": {
              "type": "text",
              "index_options": "docs",
              "analyzer": "i_prefix",
              "search_analyzer": "q_prefix"
            },
            "stem": {
              "type": "text",
              "analyzer": "iq_text_stem"
            }
          },
          "index_options": "freqs",
          "analyzer": "iq_text_base"
        },
        "dbo_post_channelid": {
          "type": "text",
          "fields": {
            "delimiter": {
              "type": "text",
              "index_options": "freqs",
              "analyzer": "iq_text_delimiter"
            },
            "enum": {
              "type": "keyword",
              "ignore_above": 2048
            },
            "joined": {
              "type": "text",
              "index_options": "freqs",
              "analyzer": "i_text_bigram",
              "search_analyzer": "q_text_bigram"
            },
            "prefix": {
              "type": "text",
              "index_options": "docs",
              "analyzer": "i_prefix",
              "search_analyzer": "q_prefix"
            },
            "stem": {
              "type": "text",
              "analyzer": "iq_text_stem"
            }
          },
          "index_options": "freqs",
          "analyzer": "iq_text_base"
        },
        "dbo_post_commentcount": {
          "type": "long"
        },
        "dbo_post_description": {
          "type": "text",
          "fields": {
            "delimiter": {
              "type": "text",
              "index_options": "freqs",
              "analyzer": "iq_text_delimiter"
            },
            "enum": {
              "type": "keyword",
              "ignore_above": 2048
            },
            "joined": {
              "type": "text",
              "index_options": "freqs",
              "analyzer": "i_text_bigram",
              "search_analyzer": "q_text_bigram"
            },
            "prefix": {
              "type": "text",
              "index_options": "docs",
              "analyzer": "i_prefix",
              "search_analyzer": "q_prefix"
            },
            "stem": {
              "type": "text",
              "analyzer": "iq_text_stem"
            }
          },
          "index_options": "freqs",
          "analyzer": "iq_text_base"
        },
        "dbo_post_filecount": {
          "type": "long"
        },
        "dbo_post_files_in_post": {
          "type": "text",
          "fields": {
            "delimiter": {
              "type": "text",
              "index_options": "freqs",
              "analyzer": "iq_text_delimiter"
            },
            "enum": {
              "type": "keyword",
              "ignore_above": 2048
            },
            "joined": {
              "type": "text",
              "index_options": "freqs",
              "analyzer": "i_text_bigram",
              "search_analyzer": "q_text_bigram"
            },
            "prefix": {
              "type": "text",
              "index_options": "docs",
              "analyzer": "i_prefix",
              "search_analyzer": "q_prefix"
            },
            "stem": {
              "type": "text",
              "analyzer": "iq_text_stem"
            }
          },
          "index_options": "freqs",
          "analyzer": "iq_text_base"
        },
        "dbo_post_id": {
          "type": "text",
          "fields": {
            "delimiter": {
              "type": "text",
              "index_options": "freqs",
              "analyzer": "iq_text_delimiter"
            },
            "enum": {
              "type": "keyword",
              "ignore_above": 2048
            },
            "joined": {
              "type": "text",
              "index_options": "freqs",
              "analyzer": "i_text_bigram",
              "search_analyzer": "q_text_bigram"
            },
            "prefix": {
              "type": "text",
              "index_options": "docs",
              "analyzer": "i_prefix",
              "search_analyzer": "q_prefix"
            },
            "stem": {
              "type": "text",
              "analyzer": "iq_text_stem"
            }
          },
          "index_options": "freqs",
          "analyzer": "iq_text_base"
        },
        "dbo_post_isprivate": {
          "type": "boolean"
        },
        "dbo_post_likecount": {
          "type": "long"
        },
        "dbo_post_publishdate": {
          "type": "date"
        },
        "dbo_post_slug": {
          "type": "text",
          "fields": {
            "delimiter": {
              "type": "text",
              "index_options": "freqs",
              "analyzer": "iq_text_delimiter"
            },
            "enum": {
              "type": "keyword",
              "ignore_above": 2048
            },
            "joined": {
              "type": "text",
              "index_options": "freqs",
              "analyzer": "i_text_bigram",
              "search_analyzer": "q_text_bigram"
            },
            "prefix": {
              "type": "text",
              "index_options": "docs",
              "analyzer": "i_prefix",
              "search_analyzer": "q_prefix"
            },
            "stem": {
              "type": "text",
              "analyzer": "iq_text_stem"
            }
          },
          "index_options": "freqs",
          "analyzer": "iq_text_base"
        },
        "dbo_post_thumbnailimage": {
          "type": "text",
          "fields": {
            "delimiter": {
              "type": "text",
              "index_options": "freqs",
              "analyzer": "iq_text_delimiter"
            },
            "enum": {
              "type": "keyword",
              "ignore_above": 2048
            },
            "joined": {
              "type": "text",
              "index_options": "freqs",
              "analyzer": "i_text_bigram",
              "search_analyzer": "q_text_bigram"
            },
            "prefix": {
              "type": "text",
              "index_options": "docs",
              "analyzer": "i_prefix",
              "search_analyzer": "q_prefix"
            },
            "stem": {
              "type": "text",
              "analyzer": "iq_text_stem"
            }
          },
          "index_options": "freqs",
          "analyzer": "iq_text_base"
        },
        "dbo_post_thumbnailvideourl": {
          "type": "text",
          "fields": {
            "delimiter": {
              "type": "text",
              "index_options": "freqs",
              "analyzer": "iq_text_delimiter"
            },
            "enum": {
              "type": "keyword",
              "ignore_above": 2048
            },
            "joined": {
              "type": "text",
              "index_options": "freqs",
              "analyzer": "i_text_bigram",
              "search_analyzer": "q_text_bigram"
            },
            "prefix": {
              "type": "text",
              "index_options": "docs",
              "analyzer": "i_prefix",
              "search_analyzer": "q_prefix"
            },
            "stem": {
              "type": "text",
              "analyzer": "iq_text_stem"
            }
          },
          "index_options": "freqs",
          "analyzer": "iq_text_base"
        },
        "dbo_post_title": {
          "type": "text",
          "fields": {
            "delimiter": {
              "type": "text",
              "index_options": "freqs",
              "analyzer": "iq_text_delimiter"
            },
            "enum": {
              "type": "keyword",
              "ignore_above": 2048
            },
            "joined": {
              "type": "text",
              "index_options": "freqs",
              "analyzer": "i_text_bigram",
              "search_analyzer": "q_text_bigram"
            },
            "prefix": {
              "type": "text",
              "index_options": "docs",
              "analyzer": "i_prefix",
              "search_analyzer": "q_prefix"
            },
            "stem": {
              "type": "text",
              "analyzer": "iq_text_stem"
            }
          },
          "index_options": "freqs",
          "analyzer": "iq_text_base"
        },
        "dbo_post_users_in_channel": {
          "type": "text",
          "fields": {
            "delimiter": {
              "type": "text",
              "index_options": "freqs",
              "analyzer": "iq_text_delimiter"
            },
            "enum": {
              "type": "keyword",
              "ignore_above": 2048
            },
            "joined": {
              "type": "text",
              "index_options": "freqs",
              "analyzer": "i_text_bigram",
              "search_analyzer": "q_text_bigram"
            },
            "prefix": {
              "type": "text",
              "index_options": "docs",
              "analyzer": "i_prefix",
              "search_analyzer": "q_prefix"
            },
            "stem": {
              "type": "text",
              "analyzer": "iq_text_stem"
            }
          },
          "index_options": "freqs",
          "analyzer": "iq_text_base"
        },
        "dbo_post_viewcount": {
          "type": "long"
        },
        "id": {
          "type": "keyword"
        },
        "schema": {
          "type": "text",
          "fields": {
            "delimiter": {
              "type": "text",
              "index_options": "freqs",
              "analyzer": "iq_text_delimiter"
            },
            "enum": {
              "type": "keyword",
              "ignore_above": 2048
            },
            "joined": {
              "type": "text",
              "index_options": "freqs",
              "analyzer": "i_text_bigram",
              "search_analyzer": "q_text_bigram"
            },
            "prefix": {
              "type": "text",
              "index_options": "docs",
              "analyzer": "i_prefix",
              "search_analyzer": "q_prefix"
            },
            "stem": {
              "type": "text",
              "analyzer": "iq_text_stem"
            }
          },
          "index_options": "freqs",
          "analyzer": "iq_text_base"
        },
        "table": {
          "type": "text",
          "fields": {
            "delimiter": {
              "type": "text",
              "index_options": "freqs",
              "analyzer": "iq_text_delimiter"
            },
            "enum": {
              "type": "keyword",
              "ignore_above": 2048
            },
            "joined": {
              "type": "text",
              "index_options": "freqs",
              "analyzer": "i_text_bigram",
              "search_analyzer": "q_text_bigram"
            },
            "prefix": {
              "type": "text",
              "index_options": "docs",
              "analyzer": "i_prefix",
              "search_analyzer": "q_prefix"
            },
            "stem": {
              "type": "text",
              "analyzer": "iq_text_stem"
            }
          },
          "index_options": "freqs",
          "analyzer": "iq_text_base"
        }
      }
    },
    "settings": {
      "index": {
        "routing": {
          "allocation": {
            "include": {
              "_tier_preference": "data_content"
            }
          }
        },
        "number_of_shards": "2",
        "auto_expand_replicas": "0-3",
        "provided_name": "search-#########",
        "creation_date": "1710725489323",
        "analysis": {
          "filter": {
            "front_ngram": {
              "type": "edge_ngram",
              "min_gram": "1",
              "max_gram": "12"
            },
            "bigram_joiner": {
              "max_shingle_size": "2",
              "token_separator": "",
              "output_unigrams": "false",
              "type": "shingle"
            },
            "bigram_max_size": {
              "type": "length",
              "max": "16",
              "min": "0"
            },
            "en-stem-filter": {
              "name": "light_english",
              "type": "stemmer",
              "language": "light_english"
            },
            "bigram_joiner_unigrams": {
              "max_shingle_size": "2",
              "token_separator": "",
              "output_unigrams": "true",
              "type": "shingle"
            },
            "delimiter": {
              "split_on_numerics": "true",
              "generate_word_parts": "true",
              "preserve_original": "false",
              "catenate_words": "true",
              "generate_number_parts": "true",
              "catenate_all": "true",
              "split_on_case_change": "true",
              "type": "word_delimiter_graph",
              "catenate_numbers": "true",
              "stem_english_possessive": "true"
            },
            "en-stop-words-filter": {
              "type": "stop",
              "stopwords": "_english_"
            }
          },
          "analyzer": {
            "i_prefix": {
              "filter": [
                "cjk_width",
                "lowercase",
                "asciifolding",
                "front_ngram"
              ],
              "type": "custom",
              "tokenizer": "standard"
            },
            "iq_text_delimiter": {
              "filter": [
                "delimiter",
                "cjk_width",
                "lowercase",
                "asciifolding",
                "en-stop-words-filter",
                "en-stem-filter"
              ],
              "type": "custom",
              "tokenizer": "whitespace"
            },
            "q_prefix": {
              "filter": [
                "cjk_width",
                "lowercase",
                "asciifolding"
              ],
              "type": "custom",
              "tokenizer": "standard"
            },
            "iq_text_base": {
              "filter": [
                "cjk_width",
                "lowercase",
                "asciifolding",
                "en-stop-words-filter"
              ],
              "type": "custom",
              "tokenizer": "standard"
            },
            "iq_text_stem": {
              "filter": [
                "cjk_width",
                "lowercase",
                "asciifolding",
                "en-stop-words-filter",
                "en-stem-filter"
              ],
              "type": "custom",
              "tokenizer": "standard"
            },
            "i_text_bigram": {
              "filter": [
                "cjk_width",
                "lowercase",
                "asciifolding",
                "en-stem-filter",
                "bigram_joiner",
                "bigram_max_size"
              ],
              "type": "custom",
              "tokenizer": "standard"
            },
            "q_text_bigram": {
              "filter": [
                "cjk_width",
                "lowercase",
                "asciifolding",
                "en-stem-filter",
                "bigram_joiner_unigrams",
                "bigram_max_size"
              ],
              "type": "custom",
              "tokenizer": "standard"
            }
          }
        },
        "number_of_replicas": "2",
        "uuid": "02yV18gHT5mERWQvKnT9-Q",
        "version": {
          "created": "8500010"
        }
      }
    }
  }
}

Which field is used when doing the search?
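
One way to check whether the analyzer is responsible is the `_analyze` API, run against the index with one of the analyzers from the settings above. A minimal sketch, assuming `INDEXNAME` stands for your index and reusing the sample text from your first message:

```console
GET /INDEXNAME/_analyze
{
  "analyzer": "iq_text_base",
  "text": "we are excited"
}
```

Since `iq_text_base` includes `en-stop-words-filter` (a `stop` filter with `_english_` stopwords), I would expect the response to contain tokens for `we` and `excited` but none for `are`. A term that is removed at analysis time is never searched for, so it cannot be highlighted.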

Actually, could you provide a full recreation script as described in About the Elasticsearch category? It will help us better understand what you are doing. Please try to keep the example as simple as possible.

A full reproduction script is something anyone can copy and paste into the Kibana Dev Console and run to reproduce your use case. It will help readers understand, reproduce and, if needed, fix your problem. It will also most likely get you a faster answer.

Have a look at the Elastic Stack and Solutions Help · Forums and Slack | Elastic page. It also contains a lot of useful information on how to ask for help.

Fields involved in the search:

  1. dbo_post_description
  2. dbo_post_title
  3. dbo_post_files_in_post
  4. dbo_post_slug

Below is our request script:

GET /search-portal/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "dbo_post_isprivate": {
              "value": "false"
            }
          }
        },
        {
          "multi_match": {
            "query": "we are a team",
            "fields": [
              "dbo_post_title",
              "dbo_post_title.stem",
              "dbo_post_description",
              "dbo_post_description.stem",
              "dbo_post_files_in_post",
              "dbo_post_files_in_post.stem",
              "dbo_post_slug",
              "dbo_post_slug.stem"
            ]
          }
        },
        {
          "range": {
            "dbo_post_publishdate": {
              "gte": "2015-09-01T08:15:11.626015",
              "lte": "2024-09-01T08:15:11.626015"
            }
          }
        }
      ]
    }
  }
}
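
To see which terms a query like this actually searches for, the validate API with `explain` may help. A sketch only, reusing your index name and, as a simplification, just two of your fields:

```console
GET /search-portal/_validate/query?explain=true
{
  "query": {
    "multi_match": {
      "query": "we are a team",
      "fields": ["dbo_post_title", "dbo_post_title.stem"]
    }
  }
}
```

If the stop filter is the cause, I would expect the explanation in the response to mention `we` and `team` but not `are` or `a`.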

Could you complete the script so we can reproduce this locally and give advice?
It needs to create a sample index with sample data.
Note that it probably does not need many fields; a single field should be enough to reproduce the problem.

That also makes the example easier to understand and build upon.
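
As a starting point, something along these lines might work. It is only a sketch: the index name `test`, the field `title`, and the sample document are made up for illustration, and only the `iq_text_base` analyzer and its stop filter are copied from your settings.

```console
PUT /test
{
  "settings": {
    "analysis": {
      "filter": {
        "en-stop-words-filter": {
          "type": "stop",
          "stopwords": "_english_"
        }
      },
      "analyzer": {
        "iq_text_base": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": [
            "cjk_width",
            "lowercase",
            "asciifolding",
            "en-stop-words-filter"
          ]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "analyzer": "iq_text_base"
      }
    }
  }
}

PUT /test/_doc/1?refresh=true
{
  "title": "We are excited to be a team"
}

GET /test/_search
{
  "query": {
    "match": {
      "title": "we are excited"
    }
  },
  "highlight": {
    "fields": {
      "title": {}
    }
  }
}
```

If the highlight in the response marks `We` and `excited` but leaves `are` unmarked, that points to the stop filter. Recreating the test index without `en-stop-words-filter` in the analyzer should then make `are` highlightable.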
