Is there a NOOP analyzer?


(James) #1

I need to add a custom analyzer for a field during INDEX but I do not want any analysis done on the search term. I tried looking for an analyzer that would just group the entire search term as a complete phrase, like:

"dummy_analyzer":{
        "type":"pattern",
        "pattern":"00xyzzy00"  <-- a dummy string trying to never separate words
}

I have my field mapped to use my custom analyzer during index and dummy analyzer during search, but I'm not convinced that's working (here is a long description of why I think it's not working).

What is a good (/the right) way to put a custom analyzer on a field for INDEX and have the field not analyzed during search?


(Magnus Bäck) #2

Are you looking for the keyword analyzer?
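
For reference, a minimal sketch of what that could look like in the mapping, written as a Python dict so it can be dumped to JSON. The analyzer name my_custom_analyzer and its definition are hypothetical placeholders; only the field name "name" comes from this thread. It also assumes an Elasticsearch version where search_analyzer is a mapping parameter and mappings are typeless, so older versions would need a different shape.

```python
import json

# Hypothetical index body: "my_custom_analyzer" and its tokenizer/filter
# choices are placeholders, not the poster's real analyzer.
index_body = {
    "settings": {
        "analysis": {
            "analyzer": {
                "my_custom_analyzer": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["lowercase"],
                }
            }
        }
    },
    "mappings": {
        "properties": {
            "name": {
                "type": "text",
                # Used when documents are indexed.
                "analyzer": "my_custom_analyzer",
                # Used for search input: keyword emits it as one token.
                "search_analyzer": "keyword",
            }
        }
    },
}

# Print the JSON that would be sent when creating the index.
print(json.dumps(index_body, indent=2))
```

Whether the search string really reaches the analyzer in one piece still depends on the query type used.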


(Nik Everett) #3

Keyword


(James) #4

Thanks for your response. I tried keyword, but this still seems to parse individual words? Or, am I misinterpreting?

localhost:9200/myindex/_validate/query?explain=true&analyzer=keyword

with this body (I set the analyzer here as well as in the validate call, for grins):

{
   "query" : {
      "query_string" : {
         "query" : "Hello there",
         "fields" : ["name"],
         "analyzer" : "keyword"
      }
   }
}

my response is:

<explanations>
<e>
   <explanation>part.name:Hello part.name:there</explanation>
   <index>myindex</index>
   <valid>true</valid>
</e>
</explanations>

Doesn't this mean the keyword analyzer is still splitting my search into two tokens, "Hello" and "there"?


(Nik Everett) #5

Sadly, query_string always splits terms on spaces no matter what you put in the analyzer field. That is part of its query language. It just uses the analyzer to analyze the terms. I suspect match query will do what you want though. It doesn't support all of the flexibility that query_string supports but that is generally a good thing.
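
To make the difference concrete, here is a rough toy model in Python. It is not Elasticsearch's actual parser (it ignores operators, quoting and everything else in the query_string syntax), and the helper names are made up for illustration. It only mimics the order of operations described above: query_string splits on whitespace and then analyzes each piece, while match hands the whole string to the analyzer. The match body at the end reuses the "name" field and "Hello there" text from the earlier post.

```python
import json

def keyword_analyze(text):
    """Toy stand-in for the keyword analyzer: the whole input is one token."""
    return [text]

def query_string_terms(query, analyze):
    """Toy model of query_string: split on whitespace, then analyze each piece."""
    return [token for piece in query.split() for token in analyze(piece)]

def match_terms(query, analyze):
    """Toy model of match: pass the full string to the analyzer in one go."""
    return analyze(query)

# Request body for a match query that overrides the search analyzer.
match_body = {
    "query": {
        "match": {
            "name": {
                "query": "Hello there",
                "analyzer": "keyword",
            }
        }
    }
}

print(query_string_terms("Hello there", keyword_analyze))  # ['Hello', 'there']
print(match_terms("Hello there", keyword_analyze))         # ['Hello there']
print(json.dumps(match_body, indent=2))
```

The first line of output corresponds to the two part.name clauses in the validate explanation; the second is the single token the match query would look up.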


(James) #6

Wow, thank you for that! (I've been struggling with that one for the last two days.)

I'm a noob, but doesn't that seem like a bug (I will submit)? -- that the query_string query ignores the tokenizer on my analyzer?


(Nik Everett) #7

It's how it's always worked and how it will forever work. If you are willing, have a read through the query_string docs (https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-query-string-query.html) and click the edit button and fix anything that is unclear. I'm not sure I can do it because I'm so used to query_string's quirks that I assume too much when reading the docs.


(James) #8

Ok, will do. Regarding the docs, a few things additionally confused me:

  • "phrase_slop" - I read the docs to say that if I have phrase_slop = 0 that the search term will NOT be tokenized.
  • "auto_generate_phrase_queries" = true - To me says it does the same thing as phrase_slop=0.

Is my understanding correct, and if so should the docs reflect they don't work?

(also, "fields" is missing from the list of top-level parameters - I can add that)

