ESE : Practice Exam - Task 5 Solution - Alternative

Hi there,
A solution was given in the Practice Exam, but I'm a bit confused about the custom analyzer versus a script in the reindex request. Please advise; I appreciate your help. Thank you so much. See the Task 5 question and solutions below.

Task 5 Question
Create a new index on cluster1 named task5 that satisfies the following requirements:

  • contains all of the documents from the blogs index
  • whenever the string UK (both capital letters) appears in the content field, it gets replaced with United Kingdom

Practice Exam Given solution

PUT task5
{
  "settings": {
    "analysis": {
      "char_filter": {
        "uk_filter": {
          "type": "mapping",
          "mappings": [
            "UK => United Kingdom"
          ]
        }
      },
      "analyzer": {
        "content_analyzer": {
          "tokenizer": "standard",
          "char_filter": ["uk_filter"]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "content": {
        "type": "text",
        "analyzer": "content_analyzer"
      }
    }
  }
}

POST _reindex
{
  "source": {
    "index": "blogs"
  },
  "dest": {
    "index": "task5"
  }
}

Why not this solution? (my own) :slight_smile:

POST _reindex
{
  "source": {
    "index": "blogs"
  },
  "dest": {
    "index": "task5"
  },
  "script": {
    "source": "if (ctx._source.content != null) { ctx._source.content = ctx._source.content.replace('UK', 'United Kingdom'); }",
    "lang": "painless"
  }
}

Your solution would work, but it would be a one-time operation.
A _reindex with a script only transforms the documents that exist when it runs. For data that arrives later, you would have to rerun the script or run an update by query so that UK is replaced with United Kingdom there as well.
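
As a rough sketch of the update-by-query route mentioned above (assuming the documents already sit in task5 and that content is a single string value), something like this could be rerun whenever new data has arrived:

```
POST task5/_update_by_query
{
  "script": {
    "lang": "painless",
    "source": "if (ctx._source.content != null && ctx._source.content.contains('UK')) { ctx._source.content = ctx._source.content.replace('UK', 'United Kingdom'); } else { ctx.op = 'noop'; }"
  }
}
```

Documents without UK in content are skipped via ctx.op = 'noop', so rerunning it should leave already-fixed documents alone.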

With the char filter defined in the index settings and the analyzer set in the mapping, newly ingested data is analyzed automatically. So it 'fixes' the existing data from the old index and stays correct for new incoming data.
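
One way to check what the analyzer does, assuming task5 was created with the content_analyzer from the given solution, is the _analyze API. The sample text here is made up for illustration:

```
POST task5/_analyze
{
  "analyzer": "content_analyzer",
  "text": "Offices in the UK"
}
```

The response should list the tokens Offices, in, the, United, Kingdom. The case is kept because this analyzer has no lowercase token filter.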

Hi @suselena22

I’d like to add another aspect to Peter’s response: often there is more than one way to solve a task. That said, if your solution is different but solves the task as set, it should be fine.

Just read the exam task carefully to see whether you are asked to apply a “one-time” fix or a “general” fix.

Daniel

This is awesome, I now fully understand the difference between the custom analyzer and the script approach. So the custom analyzer is definitely the better solution here, because it covers both the current data and future data.

Thank you Peter :star_struck: :star_struck:

Thank you for the message and tips, Daniel. I now feel more confident about sitting the exam :star_struck: :star_struck:

You could easily add your script to an ingest pipeline and also fix new/future data.
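
A minimal sketch of that idea, assuming task5 already exists; the pipeline name uk_to_united_kingdom is made up for this example:

```
PUT _ingest/pipeline/uk_to_united_kingdom
{
  "description": "Replace UK with United Kingdom in the content field",
  "processors": [
    {
      "script": {
        "lang": "painless",
        "source": "if (ctx.content instanceof String) { ctx.content = ctx.content.replace('UK', 'United Kingdom'); }"
      }
    }
  ]
}

PUT task5/_settings
{
  "index.default_pipeline": "uk_to_united_kingdom"
}
```

With index.default_pipeline set, documents indexed into task5 without an explicit pipeline should pass through the script first. Note that inside an ingest pipeline the field is ctx.content, not ctx._source.content.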

That said, I don't think that this is the best solution, as it changes the data itself, and often that is not what users want. The main advantage of the analysis option is that it leaves _source untouched and only changes the internal data structures used for search.
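
To see this in practice, assuming task5 was built with the analyzer solution from the practice exam, a phrase search for United Kingdom can be run against it:

```
GET task5/_search
{
  "_source": ["content"],
  "query": {
    "match_phrase": {
      "content": "United Kingdom"
    }
  }
}
```

Blogs whose original text says UK should show up as hits, yet the content returned in _source still reads UK, because only the indexed terms were changed.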

Nice, thanks Pablo