Analyzer Problems

Hey guys, ES newbie here. I'm currently working on a dataset whose documents have the following format:

// "desc": "The Quick and Brown Fox",
// "amount": 1000

I'm passing the "desc" field through a stop analyzer, and it is working correctly, as verified by searching. However, I would like to see the analyzer's output as the ES backend sees it. Essentially, I would like the keywords (tokens) that the analyzer produces for the entire dataset as output, similar to this:

// "desc": ["quick", "brown", "fox"]
// "amount": [1000]

Thanks in advance

It won't run on your entire dataset, but are you aware of the Analyze API? For a given text, it will show you the effect of an analyzer. For example:

GET _analyze
{
  "analyzer" : "stop",
  "text" : "The Quick and Brown Fox"
}

returns:

{
  "tokens": [
    {
      "token": "quick",
      "start_offset": 4,
      "end_offset": 9,
      "type": "word",
      "position": 1
    },
    {
      "token": "brown",
      "start_offset": 14,
      "end_offset": 19,
      "type": "word",
      "position": 3
    },
    {
      "token": "fox",
      "start_offset": 20,
      "end_offset": 23,
      "type": "word",
      "position": 4
    }
  ]
}
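If you only want the bare list of tokens in the shape you described, a small script can pull them out of that response. This is just a rough sketch: the response is the one shown above, and `tokens_from_analyze` is a hypothetical helper name, not part of any Elasticsearch client.

```python
# Response body copied from the _analyze output above.
analyze_response = {
    "tokens": [
        {"token": "quick", "start_offset": 4, "end_offset": 9, "type": "word", "position": 1},
        {"token": "brown", "start_offset": 14, "end_offset": 19, "type": "word", "position": 3},
        {"token": "fox", "start_offset": 20, "end_offset": 23, "type": "word", "position": 4},
    ]
}


def tokens_from_analyze(response):
    """Return the token strings in position order (hypothetical helper)."""
    ordered = sorted(response["tokens"], key=lambda t: t["position"])
    return [t["token"] for t in ordered]


print({"desc": tokens_from_analyze(analyze_response)})
# prints {'desc': ['quick', 'brown', 'fox']}
```

Note the gaps in the positions (1, 3, 4): positions 0 and 2 belonged to "The" and "and", which the stop analyzer dropped.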

Hi Abdon,

Thanks for the reply. I did use the Analyze API; however, as you rightly said, it only shows the effect of the analyzer on the given text. I have since found what I was looking for: Term Vectors provide me with the tokens that the analyzer creates when indexing a document.
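In case it helps anyone else, here is roughly how the term vectors response can be flattened into the output I was after. Treat it as a sketch under assumptions: the index name `my-index` and document id `1` are made up, the sample response is trimmed to the parts used and simply mirrors the tokens from the Analyze output earlier in the thread, and `terms_by_field` is my own helper name.

```python
# Request in console syntax, assuming an index "my-index" and a document id "1":
#   GET /my-index/_termvectors/1
#   { "fields": ["desc"] }

# Illustrative response, trimmed to the parts used below (statistics omitted).
termvectors_response = {
    "_index": "my-index",
    "_id": "1",
    "found": True,
    "term_vectors": {
        "desc": {
            "terms": {
                "brown": {"term_freq": 1, "tokens": [{"position": 3, "start_offset": 14, "end_offset": 19}]},
                "fox": {"term_freq": 1, "tokens": [{"position": 4, "start_offset": 20, "end_offset": 23}]},
                "quick": {"term_freq": 1, "tokens": [{"position": 1, "start_offset": 4, "end_offset": 9}]},
            }
        }
    },
}


def terms_by_field(response):
    """Map each field to its indexed terms, ordered by token position.

    Assumes the response includes per-token positions; terms without
    a "tokens" list are skipped.
    """
    result = {}
    for field, vector in response.get("term_vectors", {}).items():
        positioned = []
        for term, info in vector["terms"].items():
            for tok in info.get("tokens", []):
                positioned.append((tok["position"], term))
        result[field] = [term for _, term in sorted(positioned)]
    return result


print(terms_by_field(termvectors_response))
# prints {'desc': ['quick', 'brown', 'fox']}
```

This still works one document at a time, so covering the whole dataset means looping over the document ids (or batching them, if your version offers a multi-document variant of the API).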
