Search text with special characters

I really need help.
I indexed my data using Logstash.
The data is as follows:

20200807	00:10:02.934	 Mes  émis à l'appli Hôte 3 3 -1

The issue is that I need to search for "3 3 -1", but the "-" character seems to cause a problem.

POST /topia72-gest2019.07.12/_search
{
  "size": 10000,
  "query": {
    "match_phrase": {
      "data": "3 3 -1"
    }
  }
}

But I also get the lines that contain 3 3 1!

Two options here:

  1. Use a custom Analyzer on your text field that preserves the characters you want to keep, or
  2. Use a keyword field that keeps your string as a single token.

Can you explain more, please?

Your choice of Analyzer dictates what characters are kept in the index and what characters are punctuation that can be thrown away. This blog is old but provides some background.
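
To see what an analyzer does to this particular string, the _analyze API is handy. A minimal sketch, assuming the "data" field uses the default standard analyzer (the actual mapping was not shown in this thread):

POST _analyze
{
  "analyzer": "standard",
  "text": "3 3 -1"
}

This should return the tokens 3, 3 and 1: the "-" is treated as punctuation and dropped, which is why the phrase query also matches lines containing 3 3 1. Running the same request with "analyzer": "whitespace" should return 3, 3 and -1 instead.

If you go the custom-analyzer route, one possible mapping is sketched below. The index name demo_text_index is hypothetical, and the built-in whitespace analyzer is only one choice; it does not lowercase, so searches on this field become case-sensitive.

PUT demo_text_index
{
  "mappings": {
    "properties": {
      "data": {
        "type": "text",
        "analyzer": "whitespace"
      }
    }
  }
}

POST demo_text_index/_search
{
  "query": {
    "match_phrase": {
      "data": "3 3 -1"
    }
  }
}

An analyzer change only affects documents indexed after the mapping is in place, so existing data would have to be reindexed into the new index.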

If your strings are very short structured fields like a product code you may decide an Analyzer is not useful and want to use a simpler keyword field that keeps all the characters.

Do you think that specifying "index": "not_analyzed" could be a solution?
I've read several articles, but I still can't find a solution!

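A keyword field is effectively the preferred way of saying that. "index": "not_analyzed" was the syntax for the old string type, which was replaced by the text and keyword types in Elasticsearch 5.0.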

I tried with this:

PUT demo_index
{
  "mappings": {
    "properties": {
      "field": {
        "type": "keyword"
      }
    }
  }
}

PUT demo_index/_doc/1
{
  "field": "Msg 3 1 -1"
}

POST demo_index/_search
{
  "query": {
    "match": {
      "field": "3 1 -1"
    }
  }
}

But it's not working!

The choice between matching on text fields and matching on keyword fields is a choice between matching words and matching full values.
If your content doesn't hold what most people would consider to be words ("quick brown foxes" etc.), then the text field is probably not for you. If no one can agree where one word begins and another ends, your index and the values people type into the search engine are not underpinned by a shared understanding of the vocabulary.
So we switched tack to the keyword field, treating the string as a whole value. Your query didn't match because it is missing part of the value ("Msg"). You will need to use a regexp or wildcard query to declare that you're only providing part of a value.

Do you have an example?

https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-wildcard-query.html
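
As a sketch of what that could look like against the demo_index created earlier in this thread (not run here), a wildcard query with a leading * declares that the value may start with anything:

POST demo_index/_search
{
  "query": {
    "wildcard": {
      "field": {
        "value": "*3 1 -1"
      }
    }
  }
}

This should match the document whose value is "Msg 3 1 -1". Be aware that patterns starting with a wildcard can be slow on large indices. If you do know the full value, a term query matches it exactly:

POST demo_index/_search
{
  "query": {
    "term": {
      "field": "Msg 3 1 -1"
    }
  }
}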
