Analyzer in Kibana


(Aarthini) #1

In Kibana I'm analyzing a phrase:
POST /lrptest/_analyze
{
  "analyzer": "my_analyzer",
  "text": "BLISS LOGISTICS & SHIPPING PRIVATE LIMITED"
}
and the output is:
{
  "tokens": [
    {
      "token": "bliss",
      "start_offset": 0,
      "end_offset": 5,
      "type": "<ALPHANUM>",
      "position": 0
    },
    {
      "token": "logist",
      "start_offset": 6,
      "end_offset": 15,
      "type": "<ALPHANUM>",
      "position": 1
    },
    {
      "token": "ship",
      "start_offset": 18,
      "end_offset": 26,
      "type": "<ALPHANUM>",
      "position": 2
    },
    {
      "token": "privat",
      "start_offset": 27,
      "end_offset": 34,
      "type": "<ALPHANUM>",
      "position": 3
    },
    {
      "token": "limit",
      "start_offset": 35,
      "end_offset": 42,
      "type": "<ALPHANUM>",
      "position": 4
    }
  ]
}

Why are the words "private", "logistics", and "limited" shortened?


(David Pilato) #2

It depends on what "my_analyzer" is.
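
If the definition is not at hand, the index settings should show it. A minimal sketch, assuming the analyzer was defined on the `lrptest` index used in the request above:

```console
GET /lrptest/_settings
```

A custom analyzer, if one is defined, would be listed under `settings.index.analysis.analyzer` in the response.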


(Aarthini) #3

I'm using this as the analyzer:
{
  "analysis": {
    "analyzer": {
      "my_analyzer": {
        "type": "english",
        "stopwords": "english"
      }
    }
  }
}


(David Pilato) #4

AFAIK the english analyzer uses the Porter stem token filter (https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-porterstem-tokenfilter.html), which reduces each word to its stem:

POST _analyze
{
  "text": "Limitation limited limit",
  "analyzer": "english"
}

Gives:

{
  "tokens": [
    {
      "token": "limit",
      "start_offset": 0,
      "end_offset": 10,
      "type": "<ALPHANUM>",
      "position": 0
    },
    {
      "token": "limit",
      "start_offset": 11,
      "end_offset": 18,
      "type": "<ALPHANUM>",
      "position": 1
    },
    {
      "token": "limit",
      "start_offset": 19,
      "end_offset": 24,
      "type": "<ALPHANUM>",
      "position": 2
    }
  ]
}
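
If some words should stay unstemmed, the english analyzer accepts a `stem_exclusion` list of lowercase words. The following is only a sketch under assumptions: the index name `lrptest_nostem` is hypothetical (analysis settings generally cannot be changed on an open index, so a new index is used here), and the excluded words are chosen just for illustration. The predefined stop word list is written `_english_`, which is also the default for this analyzer.

```console
# Hypothetical index; my_analyzer is the english analyzer with stemming
# switched off for the listed words only.
PUT /lrptest_nostem
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "type": "english",
          "stopwords": "_english_",
          "stem_exclusion": ["logistics", "shipping", "private", "limited"]
        }
      }
    }
  }
}

# Re-run the original phrase against the new analyzer.
POST /lrptest_nostem/_analyze
{
  "analyzer": "my_analyzer",
  "text": "BLISS LOGISTICS & SHIPPING PRIVATE LIMITED"
}
```

With this, the four listed words should come back as whole lowercase tokens, while any other word is still stemmed. To turn stemming off for every word, a custom analyzer built from the standard tokenizer with only the lowercase and stop filters would be the alternative.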

(Aarthini) #5

Thank you

