How to index the word_delimiter itself?

(lea) #1

When analyzing `alpha+beta delta`, I want the resulting tokens to be [ALPHA+BETA DELTA, ALPHABETADELTA, ALPHA, BETA, DELTA, ALPHA+, ALPHA+BETA]. My analyzer gives me the results I am looking for, except for [ALPHA+, ALPHA+BETA]. How can I include them?

  "index": {
    "number_of_shards": 1,
    "analysis": {
      "filter": {
        "word_joiner": {
          "type": "word_delimiter",
          "catenate_all": true,
          "preserve_original": "true"
      "analyzer": {
        "word_join_analyzer": {
          "type": "custom",
          "filter": [
          "tokenizer": "keyword"

(Val Crettaz) #2

I'm not certain the word_delimiter token filter can do what you need. You probably need something more involved. In particular, I don't see how you could get the ALPHA+ token using word_delimiter alone, since it splits on the `+` and does not emit partial catenations that keep the delimiter.
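For illustration only (not something confirmed in this thread): one possible direction is to chain an edge_ngram token filter after the word_delimiter filter, since edge n-grams of the preserved original token `ALPHA+BETA` would include prefixes such as ALPHA+ and ALPHA+BETA, though also many extra prefixes (A, AL, ALPHA+B, ...). A sketch, where the filter name `prefix_grams` and the gram limits are my own assumptions:

```json
"index": {
  "analysis": {
    "filter": {
      "word_joiner": {
        "type": "word_delimiter",
        "catenate_all": true,
        "preserve_original": true
      },
      "prefix_grams": {
        "type": "edge_ngram",
        "min_gram": 1,
        "max_gram": 20
      }
    },
    "analyzer": {
      "word_join_prefix_analyzer": {
        "type": "custom",
        "tokenizer": "keyword",
        "filter": ["word_joiner", "prefix_grams"]
      }
    }
  }
}
```

You could check the emitted tokens against a test index with the `_analyze` API; whether the extra prefix tokens are acceptable depends on your search use case.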

(lea) #3

But how can I get the other tokens without using it?

(system) #4

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.