Wildcard for Nested object


(Vikentyi) #1

Hello.

I have the following mapping (Elasticsearch 5.4.0):

{
      "relatedData": {
                        "properties": {
                            "subject": {
                                "type": "nested",
                                "properties": {
                                    "person": {
                                        "properties": {
                                            "firstName": {
                                                "type": "keyword",
                                                "fields": {
                                                    "english": {
                                                        "type": "text",
                                                        "analyzer": "english"
                                                    }
                                                }
                                            }
                                        }
                                    },
                                    "relation": {
                                        "type": "keyword",
                                        "fields": {
                                            "english": {
                                                "type": "text",
                                                "analyzer": "english"
                                            }
                                        }
                                    }
                                }
                            }
                        }
                    } 
               }

I get the expected result for full-text search using the analysed field .english and the nested path, so that part is fine.

But I have a problem with a wildcard query on the nested non-analyzed field:

   "bool": {
                              "must": [
                                {
                                  "nested": {
                                    "query": {
                                      "wildcard": {
                                        "relatedData.subject.relation": {
                                          "query": "author",
                                          "operator": "OR",
                                          "prefix_length": 0,
                                          "max_expansions": 50,
                                          "fuzzy_transpositions": true,
                                          "lenient": false,
                                          "zero_terms_query": "NONE",
                                          "boost": 1
                                        }
                                      }
                                    },
                                    "path": "relatedData.subject",
                                    "ignore_unmapped": false,
                                    "score_mode": "none",
                                    "boost": 1
                                  }
                                }
                              ],
                              "disable_coord": false,
                              "adjust_pure_negative": true,
                              "boost": 1
                            }
                          }

Here I get a response:

{
    "error": {
        "root_cause": [
            {
                "type": "parsing_exception",
                "reason": "[wildcard] query does not support [query]",
                "line": 276,
                "col": 53
            }
        ],
        "type": "parsing_exception",
        "reason": "[wildcard] query does not support [query]",
        "line": 276,
        "col": 53
    },
    "status": 400
}

Could you explain how to build a wildcard query for this example?

P.S. My real mapping is more complex; I have posted a simplified version here.


(David Pilato) #2

Where did you find all those parameters on a wildcard query? I can't find any reference to them in https://www.elastic.co/guide/en/elasticsearch/reference/5.6/query-dsl-wildcard-query.html


(Vikentyi) #3

I have used BoolQueryBuilder in Java. My initial query was for full-text search (this is part of the query; "relatedData.subject.relation.english" is an analysed field):

        "bool": {
                          "must": [
                            {
                              "nested": {
                                "query": {
                                  "match": {
                                    "relatedData.subject.relation.english": {
                                      "query": "author",
                                      "operator": "OR",
                                      "prefix_length": 0,
                                      "max_expansions": 50,
                                      "fuzzy_transpositions": true,
                                      "lenient": false,
                                      "zero_terms_query": "NONE",
                                      "boost": 1
                                    }
                                  }
                                },
                                "path": "relatedData.subject",
                                "ignore_unmapped": false,
                                "score_mode": "none",
                                "boost": 1
                              }
                            }
                          ],
                          "disable_coord": false,
                          "adjust_pure_negative": true,
                          "boost": 1
                        }
                      }

And I got the correct result:

{
    "took": 15,
    "timed_out": false,
    "_shards": {
        "total": 5,
        "successful": 5,
        "failed": 0
    },
    "hits": {
        "total": 1,
        "max_score": null,
        "hits": [
            {
                "_index": "elastic",
                "_type": "search_type",
                "_id": "7",
                "_score": null,
                "_source": {
                    "relatedData": {
                        "subject": [
                            {
                                "relation": "author",
                                "person": {
                                    "firstName": "aut fn"
                                }
                            }
                        ]
                    }
                },
                "sort": [
                    7
                ]
            }
        ]
    }
}

Here is a syntactically valid wildcard request, but it does not return the expected result ("relatedData.subject.relation" is a non-analysed field):

{
                            "bool": {
                              "must": [
                                {
                                  "wildcard": {
                                    "relatedData.subject.relation": {
                                      "wildcard": "*author*",
                                      "boost": 1
                                    }
                                  }
                                }
                              ],
                              "disable_coord": false,
                              "adjust_pure_negative": true,
                              "boost": 1
                            }
                          }

In my previous wildcard example I simply changed match to wildcard, which was the wrong way to do it. But what is the correct way? (The field is inside a nested object.)
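One way to reach a field inside a nested object is to wrap the wildcard clause in a nested query with the same path that the working match example uses. The following is a minimal sketch, not a tested answer for this index: Python is used only to build and print the request body, and the helper name `nested_wildcard` is hypothetical. The wildcard clause takes `value` (plus an optional `boost`), not the `query`, `operator`, and other match-specific parameters.

```python
import json


def nested_wildcard(path, field, pattern):
    """Build a nested query that wraps a wildcard clause on a nested field."""
    return {
        "nested": {
            "path": path,  # must be the path of the "nested" mapping
            "score_mode": "none",
            "query": {
                "wildcard": {
                    # full field name, including the nested path prefix
                    field: {"value": pattern, "boost": 1}
                }
            },
        }
    }


body = {
    "query": {
        "bool": {
            "must": [
                nested_wildcard(
                    "relatedData.subject",
                    "relatedData.subject.relation",
                    "*author*",
                )
            ]
        }
    }
}

# Print the JSON that would be sent as the search request body.
print(json.dumps(body, indent=2))
```

Because "relatedData.subject.relation" is a keyword field, the pattern is compared with the stored value as-is, so matching is case-sensitive. With the Java API, the equivalent would presumably be a `QueryBuilders.nestedQuery(...)` wrapping a `QueryBuilders.wildcardQuery(...)`.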


(David Pilato) #4

Now I'm confused. This query cannot produce the error message you posted earlier.

In any case, running something like "wildcard": "*author*" is one of the worst queries you can run on an Elasticsearch cluster. Basically, you are going to run a full scan of the index, which is bad.

Look at: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-wildcard-query.html

Note that this query can be slow, as it needs to iterate over many terms. In order to prevent extremely slow wildcard queries, a wildcard term should not start with one of the wildcards * or ?.
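To illustrate why a leading wildcard is costly, here is a toy model in Python. It assumes only that the terms of a field are kept in sorted order; the term list and function names are made up for the example, and this is not how Lucene is actually implemented. A pattern with a fixed prefix can seek straight to the first candidate and stop early, while a pattern such as `*author*` has to look at every term.

```python
from bisect import bisect_left

# Toy term dictionary, kept in sorted order (hypothetical values).
terms = sorted(
    ["author", "authority", "coauthor", "editor", "reviewer", "translator"]
)


def prefix_match(prefix):
    """Match 'prefix*': seek to the first candidate, stop at the first miss."""
    visited, hits = 0, []
    i = bisect_left(terms, prefix)
    while i < len(terms):
        visited += 1
        if not terms[i].startswith(prefix):
            break
        hits.append(terms[i])
        i += 1
    return hits, visited


def infix_match(fragment):
    """Match '*fragment*': no seek is possible, so every term is visited."""
    visited, hits = 0, []
    for term in terms:
        visited += 1
        if fragment in term:
            hits.append(term)
    return hits, visited


print(prefix_match("author"))  # terms visited stays small
print(infix_match("author"))   # terms visited equals len(terms)
```

In this sketch the prefix lookup touches three terms, while the infix lookup touches all six; on a real index the gap grows with the number of distinct terms in the field.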

It would help if you provide a full recreation script as described in

It will help to better understand what you are doing.
Please, try to keep the example as simple as possible. I mean try without nested things at first...


(Vikentyi) #5

Yes, I understand, but this query is a customer requirement. (((
All other simple wildcard queries work fine, but I also need it for the nested field.


(David Pilato) #6

If it's a one-time operation, then that's OK. If you mean to implement it as is, then it's bad IMO.

Better to use ngrams in your analysis chain. It will produce many more tokens and use more space on disk, but it will be more efficient at search time.
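A rough sketch of what the ngram approach could look like, under stated assumptions: the names `trigram_tokenizer`, `trigram_analyzer`, and the `ngram` sub-field are hypothetical, the gram size of 3 is an arbitrary choice, and Python is used only to build the index settings, mapping fragment, and query body. The small `ngrams` helper imitates what a fixed-size ngram tokenizer emits, to show why a match query with operator "and" behaves roughly like `*author*`.

```python
import json

# Index settings: a custom analyzer built on a fixed-size ngram tokenizer.
settings = {
    "analysis": {
        "tokenizer": {
            "trigram_tokenizer": {"type": "ngram", "min_gram": 3, "max_gram": 3}
        },
        "analyzer": {
            "trigram_analyzer": {
                "type": "custom",
                "tokenizer": "trigram_tokenizer",
                "filter": ["lowercase"],
            }
        },
    }
}

# Mapping for "relation": the existing sub-field plus a hypothetical "ngram" one.
relation_mapping = {
    "type": "keyword",
    "fields": {
        "english": {"type": "text", "analyzer": "english"},
        "ngram": {"type": "text", "analyzer": "trigram_analyzer"},
    },
}

# Nested match query: every trigram of the search text must be present.
query = {
    "nested": {
        "path": "relatedData.subject",
        "score_mode": "none",
        "query": {
            "match": {
                "relatedData.subject.relation.ngram": {
                    "query": "author",
                    "operator": "and",
                }
            }
        },
    }
}


def ngrams(text, n=3):
    """Simplified imitation of a fixed-size ngram tokenizer with lowercasing."""
    text = text.lower()
    return [text[i:i + n] for i in range(len(text) - n + 1)]


print(json.dumps(settings, indent=2))
print(ngrams("author"))
print(set(ngrams("author")) <= set(ngrams("coauthor")))
```

Two caveats with this sketch: search text shorter than the gram size produces no tokens, and requiring all trigrams is only an approximation of a substring match, since the trigrams do not have to be adjacent in the indexed value.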

