Word_delimiter_graph with pattern_replace

Wonder_Garance · March 28, 2023, 4:49pm

In my index settings I use the word_delimiter_graph filter together with other filters to normalize some reference numbers into a uniform format, and I also want to keep the original form.

For these references: 01/01234 11-2-3.4.5/67/8
I would like conditional_number_word_delimiter_graph to output:
01/01234, 01_01234
11-2-3.4.5/67/8, 11_2_3_4_5_67_8

How can I do it?

Here is my filter configuration. With it, the output contains only:
01_01234
11_2_3_4_5_67_8

      "number_word_delimiter_graph": {
        "type": "word_delimiter_graph",
        "catenate_words": false,
        "catenate_numbers": false,
        "generate_number_parts": false,
        "generate_word_parts": false,
        "split_on_case_change": false,
        "split_on_numerics": false,
        "catenate_all": false,
        "preserve_original": true,
        "adjust_offsets": true
      },
      "pattern_number_uniform": {
        "type": "pattern_replace",
        "pattern": "[.\\-/]",
        "replacement": "_"
      },
      "conditional_number_word_delimiter_graph": {
        "type": "condition",
        "filter": ["number_word_delimiter_graph", "pattern_number_uniform"],
        "script": {
          "lang": "painless",
          "source": "!token.isKeyword() && token.getTerm().toString().indexOf('%') < 0"
        }
      }

RabBit_BR · March 29, 2023, 1:46pm

Hi @Wonder_Garance

I believe the problem is the pattern_number_uniform filter. It runs after number_word_delimiter_graph and replaces the characters in every token that filter emits, including the preserved original, so you end up with only the replaced form of each token.
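
To check this, a request body along these lines could be sent to the _analyze API. It is only a sketch: the whitespace tokenizer is an assumption (the question does not show the tokenizer), the condition wrapper is left out, and the two filters are defined inline with the same options as in the question.

```json
{
  "tokenizer": "whitespace",
  "filter": [
    {
      "type": "word_delimiter_graph",
      "generate_word_parts": false,
      "generate_number_parts": false,
      "split_on_case_change": false,
      "split_on_numerics": false,
      "preserve_original": true
    },
    {
      "type": "pattern_replace",
      "pattern": "[.\\-/]",
      "replacement": "_"
    }
  ],
  "text": "01/01234 11-2-3.4.5/67/8"
}
```

If the explanation above is right, the response should list only the underscore forms, because pattern_replace rewrites whatever tokens reach it.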

One way to solve this is to create a subfield that applies the replace filter. You then have a field with the original value and a subfield with the replace rule applied, and at search time you match against both fields.
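
A minimal sketch of such a setup, as an index creation body, might look like this. The field name reference, the subfield name uniform, the analyzer names reference_original and reference_uniform, and the whitespace tokenizer are all assumptions for illustration; only pattern_number_uniform is taken from the question.

```json
{
  "settings": {
    "analysis": {
      "filter": {
        "pattern_number_uniform": {
          "type": "pattern_replace",
          "pattern": "[.\\-/]",
          "replacement": "_"
        }
      },
      "analyzer": {
        "reference_original": {
          "type": "custom",
          "tokenizer": "whitespace"
        },
        "reference_uniform": {
          "type": "custom",
          "tokenizer": "whitespace",
          "filter": ["pattern_number_uniform"]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "reference": {
        "type": "text",
        "analyzer": "reference_original",
        "fields": {
          "uniform": {
            "type": "text",
            "analyzer": "reference_uniform"
          }
        }
      }
    }
  }
}
```

With that mapping, a search across both fields could use a multi_match query, again with the hypothetical field names from above:

```json
{
  "query": {
    "multi_match": {
      "query": "11-2-3.4.5/67/8",
      "fields": ["reference", "reference.uniform"]
    }
  }
}
```

Each field analyzes the query text with its own analyzer, so the same input is compared in its original form against reference and in its underscore form against reference.uniform.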

