ESE : Practice Exam - Task 5 Solution - Alternative

suselena22 · April 29, 2021, 4:58am

Hi There:
The solution was given in the Practice Exam, but I'm a bit confused about the custom analyzer versus a reindex script. Please advise; I appreciate your help. Thank you so much. See the Task 5 question and solutions below.

Task 5 Question
Create a new index on cluster1 named task5 that satisfies the following requirements:

contains all of the documents from the blogs index
whenever the string UK (both capital letters) appears in the content field, it gets replaced with United Kingdom

Practice Exam Given solution

PUT task5
{
  "settings": {
    "analysis": {
      "char_filter": {
        "uk_filter": {
          "type": "mapping",
          "mappings": [
            "UK => United Kingdom"
            ]
        }
      },
      "analyzer": {
        "content_analyzer": {
          "tokenizer": "standard",
          "char_filter": ["uk_filter"]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "content": {
        "type": "text",
        "analyzer": "content_analyzer"
      }
    }
  }
}

POST _reindex
{
  "source": {
    "index": "blogs"
  },
  "dest": {
    "index": "task5"
  }
}

Why not this solution? (my own)

POST _reindex
{
  "source": {
    "index": "blogs"
  },
  "dest": {
    "index": "task5"
  },
  "script": {
    "source": "if (ctx._source.content != null) { ctx._source.content = ctx._source.content.replace('UK', 'United Kingdom'); }",
    "lang": "painless"
  }
}

Peter_Steenbergen · April 29, 2021, 6:46am

Your solution would work, but it would be a one-time operation.
If you only apply the script during the _reindex, documents that arrive later are not changed. You would have to rerun your script, or run an update by query, to apply UK => United Kingdom to the new data.

With the analyzer set in the mapping, newly ingested data is analyzed automatically by the new index. So it 'fixes' the existing data from the old index and also keeps new incoming data correct.
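
A quick way to check this, as a sketch that assumes the task5 index from the given solution already exists: the _analyze API shows what content_analyzer produces for any new text. The sample sentence is made up for illustration.

```
# Run the custom analyzer from the given solution against sample text
GET task5/_analyze
{
  "analyzer": "content_analyzer",
  "text": "Our UK office is hiring"
}
```

The returned tokens should contain United and Kingdom in place of UK. The analyzer has no lowercase filter, so tokens keep their original case.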

dschneiter · April 29, 2021, 6:56am

Hi @suselena22

I’d like to add another aspect to Peter’s response: Often there is not just one way of getting a task solved. Having said so, if your solution is different, but solves the task set, it should be fine.

Just carefully read the task in the exam, whether you are asked to apply a “one-time” fix or a “general” fix.

Daniel

suselena22 · April 29, 2021, 10:18am

This is awesome, I now fully understand the advantage of the custom analyzer over the script. So the custom analyzer is definitely the better solution because it fixes both the current data and future data.

Thank you Peter

suselena22 · April 29, 2021, 10:20am

Thank you for the message and tips, Daniel. I now feel more confident about sitting the exam.

pmusa · April 30, 2021, 9:17am

You could easily add your script to an ingest pipeline and also fix new/future data.
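
A minimal sketch of that idea, under some assumptions: the pipeline name uk_to_united_kingdom is hypothetical, and content is assumed to be a single string field. In an ingest script processor the document fields are reached directly through ctx (not ctx._source, as in a reindex script).

```
# Hypothetical pipeline that rewrites UK in the content field
PUT _ingest/pipeline/uk_to_united_kingdom
{
  "description": "Replace UK with United Kingdom in content",
  "processors": [
    {
      "script": {
        "lang": "painless",
        "source": "if (ctx.content != null) { ctx.content = ctx.content.replace('UK', 'United Kingdom'); }"
      }
    }
  ]
}

# Apply the pipeline to every document indexed into task5 from now on
PUT task5/_settings
{
  "index.default_pipeline": "uk_to_united_kingdom"
}
```

The same pipeline can also be applied to the existing documents by adding "pipeline": "uk_to_united_kingdom" to the dest section of the _reindex request.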

That said, I don't think this is the best solution, because it changes the stored data itself, and often that is not what users want. The main advantage of the analysis option is that it only changes the internal data structures used for search, while the original document stays as it was.
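
To illustrate that last point, here is a sketch that assumes task5 was created and populated with the given solution. A phrase query for United Kingdom can match a document whose returned _source still shows the original UK text, because only the indexed terms were changed.

```
# The query text goes through content_analyzer as well, so this can match
# documents whose stored content only says "UK"
GET task5/_search
{
  "_source": ["content"],
  "query": {
    "match_phrase": {
      "content": "United Kingdom"
    }
  }
}
```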

suselena22 · April 30, 2021, 9:30am

Nice, thanks Pablo
