Indexing annotations as they are, e.g. {@this.Example}, and searching by them


(perrohunter) #1

Hello, I'm trying to figure out a way to index a special type of annotation that I developed so that I can search by it later at query time. For example, I may have a couple of documents with the text:

"this is a cool {@this.Example}"
"I think this is super cool"

I want to search for "I want some cool {@this.Example}" and match only the first document. Right now I'm matching both, since the terms overlap. I was trying to subquery my way around this, but it seems my annotations get indexed in a different way that I cannot match:

{
    "query": {
        "bool": {
            "must": {
                "match": {
                    "doc_field": "{@this.Example}"
                }
            },
            "should": {
                "match": {
                    "doc_field": {
                        "query": "I want something cool",
                        "fuzziness": "1"
                    }
                }
            }
        }
    }
}

I'm using the following analyzer, but without much success:

"annotated_analyzer": {
    "type": "custom",
    "filter": [
        "lowercase",
        "english_stop",
        "porter_stem"
    ],
    "tokenizer": "whitespace"
}

Oddly enough, when I create the mapping for a field, the search_analyzer is ignored, so I'm not even sure this analyzer is being used at search time.
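
For reference, here is a minimal sketch of how the analyzer and the field mapping might be declared together, sent as the body of `PUT my_index`. Several things here are assumptions and not taken from the post: the `english_stop` definition (a `stop` filter with the `_english_` list), the typeless `mappings` layout used by recent Elasticsearch versions, and setting `search_analyzer` to the same analyzer as `analyzer`. `porter_stem` is a built-in token filter, so it needs no definition.

```json
{
  "settings": {
    "analysis": {
      "filter": {
        "english_stop": {
          "type": "stop",
          "stopwords": "_english_"
        }
      },
      "analyzer": {
        "annotated_analyzer": {
          "type": "custom",
          "tokenizer": "whitespace",
          "filter": ["lowercase", "english_stop", "porter_stem"]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "doc_field": {
        "type": "text",
        "analyzer": "annotated_analyzer",
        "search_analyzer": "annotated_analyzer"
      }
    }
  }
}
```

Running `GET my_index/_mapping` afterwards shows what was actually stored for `doc_field`, which is a quick way to check whether the analyzer settings were accepted.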

If I run _analyze on the field, I do get what I want:

GET my_index/_analyze 
{
  "field": "doc_field",
  "text": "I want some cool {@this.Example}"
}

yields

{
  "tokens": [
    {
      "token": "i",
      "start_offset": 0,
      "end_offset": 1,
      "type": "word",
      "position": 0
    },
    {
      "token": "want",
      "start_offset": 2,
      "end_offset": 6,
      "type": "word",
      "position": 1
    },
    {
      "token": "some",
      "start_offset": 7,
      "end_offset": 11,
      "type": "word",
      "position": 2
    },
    {
      "token": "cool",
      "start_offset": 12,
      "end_offset": 16,
      "type": "word",
      "position": 3
    },
    {
      "token": "{@this.example}",
      "start_offset": 17,
      "end_offset": 32,
      "type": "word",
      "position": 4
    }
  ]
}

and _validate/query says everything is fine

{
  "valid": true,
  "_shards": {
    "total": 1,
    "successful": 1,
    "failed": 0
  }
}
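
A `"valid": true` response only says that the request parses. As a possible next step, the same endpoint accepts an `explain=true` parameter, which also returns the rewritten Lucene query and so shows which terms the search-time analyzer produced. A sketch of the body for `GET my_index/_validate/query?explain=true`, reusing the index and field names from this post:

```json
{
  "query": {
    "match": {
      "doc_field": "{@this.Example}"
    }
  }
}
```

If the explanation does not list the annotation as a single term, the query text is probably being analyzed by something other than `annotated_analyzer`.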

Any ideas on how I can achieve this behavior? I'm sure I must use an analyzer for this task to make sure Elasticsearch indexes the annotation as is.

Even if I query only for {@this.Example}, I get no results. The analyzer is doing what I expect it to do, yet the search query is not hitting this token :confused:
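
One way to narrow this down is a `term` query, which is not analyzed at search time and therefore tests whether the token exists in the index at all. A sketch of the body for `GET my_index/_search`; the lowercased value is an assumption based on the `_analyze` output above, where the `lowercase` filter turns the annotation into `{@this.example}`:

```json
{
  "query": {
    "term": {
      "doc_field": "{@this.example}"
    }
  }
}
```

A hit here would suggest that indexing works and the problem is on the search side. No hit would suggest the documents were indexed with a different analyzer, for example because they were indexed before the mapping was changed.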

