Different results with fingerprint analyzer

The first test is missing the "e" in "sante": the URI request returns "sant" instead.

GET /_analyze?analyzer=fingerprint&text="Santé the in a Monica"
=> "token": "a in monica sant the"

GET /_analyze
{
  "analyzer": "fingerprint",
  "text": "Santé the in a Monica"
}
=> "token": "a in monica sante the"

Hm, I think it's something to do with passing the text as a URI parameter. If you run it in the body (like your second example), it works as expected:

POST _analyze
{
  "analyzer": "fingerprint",
  "text": "Santé the in a Monica"
}

# POST _analyze
{
  "tokens": [
    {
      "token": "a in monica sante the",
      "start_offset": 0,
      "end_offset": 21,
      "type": "fingerprint",
      "position": 0
    }
  ]
}
GET /_analyze?analyzer=fingerprint&text="Santé the in a Monica"
# GET /_analyze?analyzer=fingerprint&text="Santé the in a Monica"
{
  "tokens": [
    {
      "token": "a in monica sant the",
      "start_offset": 0,
      "end_offset": 23,
      "type": "fingerprint",
      "position": 0
    }
  ]
}

Thanks. I'd like to be able to run it either way and get the same results. I'll use the "body" format for now.
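If you do still want the URI form, percent-encoding the text yourself before building the URL should keep the "é" intact as UTF-8 bytes. A minimal sketch in Python; the `localhost:9200` host is an assumption, adjust for your cluster:

```python
from urllib.parse import quote

# Percent-encode the text so non-ASCII characters like "é"
# survive the URI round-trip as UTF-8 bytes.
text = "Santé the in a Monica"
encoded = quote(text, safe="")

# Hypothetical local cluster address.
url = f"http://localhost:9200/_analyze?analyzer=fingerprint&text={encoded}"
print(url)
```

Sending that encoded URL (e.g. with curl or `requests.get`) should give the same "sante" token as the body form.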

Yeah. It was probably caused by a non-UTF-8 encoding of the URI parameter, IMO.
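To illustrate the mismatch: "é" is two bytes in UTF-8 but one byte in Latin-1, so if the client and server disagree on the charset of the raw URI bytes, the character gets mangled. A quick Python check of the two encodings:

```python
from urllib.parse import quote

# The same character percent-encodes differently per charset:
# UTF-8 produces two escaped bytes, Latin-1 only one.
utf8_form = quote("Santé", encoding="utf-8")
latin1_form = quote("Santé", encoding="latin-1")

print(utf8_form)    # Sant%C3%A9
print(latin1_form)  # Sant%E9
```

A server decoding `Sant%E9` as UTF-8 hits an invalid byte sequence, which is consistent with the character being dropped in the URI-parameter result.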

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.