Partial matching against dates

I'm sure this is easy, but I couldn't find a simple answer.

If I have 'XXXX2000' in an index and I search for '01022000', how can I accept that as a partial match, where the more of the value that matches, the higher the score?

For example, '01XX2000' could also be stored, and in that case the match should score higher.

If you use ngrams to analyze your text, you will produce a lot of sub tokens. The more sub tokens match, the higher the score will be.
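To make that concrete, here is a small JavaScript sketch of the idea. It is not how Elasticsearch scores internally (real scoring is BM25-based and depends on term statistics); it only counts the distinct 1 to 8 character substrings that two values share. The helper names `ngrams` and `sharedGrams` are made up for illustration.

```javascript
// Produce every distinct substring of `text` whose length is between min and max.
function ngrams(text, min = 1, max = 8) {
  const grams = new Set();
  for (let start = 0; start < text.length; start++) {
    for (let len = min; len <= max && start + len <= text.length; len++) {
      grams.add(text.slice(start, start + len));
    }
  }
  return grams;
}

// Count the distinct grams that the query and the stored value have in common.
function sharedGrams(query, stored) {
  const storedGrams = ngrams(stored);
  let count = 0;
  for (const gram of ngrams(query)) {
    if (storedGrams.has(gram)) count++;
  }
  return count;
}

console.log(sharedGrams('01022000', 'XXXX2000')); // 7
console.log(sharedGrams('01022000', '01XX2000')); // 9
```

The value with more known digits shares more grams with the query, which is roughly why it would tend to rank higher.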

My 2 cents.

ngrams ... googling!

Hi David,

Quick eyes over, does this look correct (ngram from 1-8)?

    {
      "settings": {
        "index": {
          "max_ngram_diff": 50
        },
        "analysis": {
          "analyzer": {
            "my_analyzer": {
              "filter": [
              ],
              "type": "custom",
              "tokenizer": "my_tokenizer"
            }
          },
          "tokenizer": {
            "my_tokenizer": {
              "token_chars": [
                "letter",
                "digit"
              ],
              "min_gram": 1,
              "type": "ngram",
              "max_gram": 8
            }
          }
        }
      },
      "mappings": {
        "properties": {
          "dob": {
            "type": "text",
            "analyzer": "my_analyzer",
            "fields": {
              "keyword": {
                "type": "keyword"
              }
            }
          }
        }
      }
    }

Tested and is working. Thank you!
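For anyone finding this later, a rough sketch of the search side. It assumes the mapping above and a plain `match` query; since no separate `search_analyzer` is set on the field, the query text goes through the same ngram analyzer as the indexed values. `buildDobQuery` is a hypothetical helper name, not part of any client library.

```javascript
// Build a search body for the ngram-analyzed `dob` field from the mapping above.
function buildDobQuery(dob) {
  return {
    query: {
      match: {
        // The query string is analyzed into ngrams, like the stored values.
        dob: { query: dob }
      }
    }
  };
}

console.log(JSON.stringify(buildDobQuery('01022000'), null, 2));
```

As a side note, `index.max_ngram_diff` only has to cover `max_gram - min_gram`, so 7 would be enough for grams of 1 to 8; 50 just leaves headroom.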

Hello again,

A change in requirements means I need to revisit this. I can either split the date in question into three separate fields or keep it combined. With this in mind, what is the best way to do:

DOB_full = "10102000" or DOB_YearMonth = "102000" or DOB_Year = "2000"

And, for a bonus point, is there a way to weight/boost them differently?

One more thing: this is along the lines of what I'm after, but the syntax is incorrect and it fails with:

reason":"[match] unknown token [START_OBJECT] after [match_all]

should.push({
  match: {
    query: {
      match_all: {}
    },
    functions: [
      {
        filter: {
          match: { fields:['dob1'], query }
        },
        weight: 60
      },
      {
        filter: {
          simple_query_string: { fields:['dob1'], query: '**'+query.slice(2) }
        },
        weight: 40
      },
      {
        filter: {
          simple_query_string: { fields:['dob1'], query: '****'+query.slice(4) }
        },
        weight: 20
      }
    ],
    score_mode: "max",
  }
});

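Sorted, I think. The outer clause needed to be function_score rather than match, and the inner match filter takes the field name as its key: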

function_score: {
  query: { match_all: {} },
  functions: [
    {
      filter: {
        match: {
          dob1: query
        }
      },
      weight: 1.1
    },
    {
      filter: {
        simple_query_string: {
          fields: ['dob1'],
          query: '**'+query.slice(2)
        }
      },
      weight: 1
    },
    {
      filter: {
        simple_query_string: {
          fields: ['dob1'],
          query: '****'+query.slice(4)
        }
      },
      weight: 0.85
    }
  ],
  score_mode: "max",
}
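
For the split-field variant asked about earlier (DOB_full, DOB_YearMonth, DOB_Year), one possible shape is a `dis_max` over `constant_score` clauses, so a document is scored by the most specific part it matches. This is only a sketch under assumptions: the three fields are taken to be exact-match (keyword) fields, the input is an 8-character DDMMYYYY string, and the boost values and the `buildSplitDobQuery` name are arbitrary placeholders.

```javascript
// Build a query that scores a document by the most specific date part it matches.
// Assumes `dob` is an 8-character DDMMYYYY string and that DOB_full,
// DOB_YearMonth and DOB_Year are exact-match (keyword) fields.
function buildSplitDobQuery(dob, boosts = { full: 3, yearMonth: 2, year: 1 }) {
  // Each tier gives a fixed score (its boost) when the term matches.
  const tier = (field, value, boost) => ({
    constant_score: { filter: { term: { [field]: value } }, boost }
  });
  return {
    // dis_max keeps the best-scoring tier instead of summing all of them.
    dis_max: {
      queries: [
        tier('DOB_full', dob, boosts.full),
        tier('DOB_YearMonth', dob.slice(2), boosts.yearMonth),
        tier('DOB_Year', dob.slice(4), boosts.year)
      ]
    }
  };
}

console.log(JSON.stringify(buildSplitDobQuery('10102000'), null, 2));
```

With `'10102000'` as input, the three terms are `10102000`, `102000` and `2000`, matching the example values above. Swapping `dis_max` for a `bool` with `should` clauses would sum the tiers instead of taking the best one.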
