Fuzzy query


(Linlin Fu) #1

I followed the guide at https://github.com/elasticsearch/elasticsearch to
set up Elasticsearch and index the Twitter-like sample data.

As shown below, a fuzzy search for { "user": "kimc" } returns the expected
results, but a fuzzy search for { "user": "kim" } returns nothing.

What is the cause?

Thank you.

curl -XGET 'http://localhost:9200/twitter/tweet/_search?pretty=true' -d '
{
  "query" : {
    "fuzzy" : { "user": "kimc" }
  }
}'
{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 2,
    "max_score" : 1.0,
    "hits" : [ {
      "_index" : "twitter",
      "_type" : "tweet",
      "_id" : "2",
      "_score" : 1.0, "_source" : {
        "user": "kimchy",
        "postDate": "2009-11-15T14:12:12",
        "message": "Another tweet, will it be indexed?"
      }
    }, {
      "_index" : "twitter",
      "_type" : "tweet",
      "_id" : "1",
      "_score" : 0.30685282, "_source" : {
        "user": "kimchy",
        "postDate": "2009-11-15T13:12:00",
        "message": "Trying out Elastic Search, so far so good?"
      }
    } ]
  }
}

curl -XGET 'http://localhost:9200/twitter/tweet/_search?pretty=true' -d '
{
  "query" : {
    "fuzzy" : { "user": "kim" }
  }
}'
{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 0,
    "max_score" : null,
    "hits" : [ ]
  }
}

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/5f543af7-7758-459f-b5c7-c1f856690b72%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(Andrew Cholakian-2) #2

Hi, the reason is that fuzzy queries have a configurable edit distance, and
the maximum distance is 2. The edit distance between "kim" and "kimchy" is 3
(three characters must be inserted), whereas "kimc" is only 2 edits away from
"kimchy". Lucene (the underlying search engine of Elasticsearch and Solr)
cannot efficiently search for words with an edit distance greater than 2.
You may want to try either ngrams
(http://exploringelasticsearch.com/book/searching-natural-language/searching-non-word-text.html,
as in this example) or wildcards
(http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-wildcard-query.html).
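
To make the edit-distance arithmetic concrete, here is a minimal Python sketch that counts the edits between each search term and the indexed term "kimchy". It uses plain Levenshtein distance (insertions, deletions, substitutions) as an approximation; it is not Lucene's implementation and it ignores other fuzzy query options such as transposition handling. The name `MAX_FUZZY_EDITS` is made up for this illustration and simply encodes the limit of 2 stated above.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character insertions,
    deletions, or substitutions needed to turn a into b."""
    # previous[j] is the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute, or keep a matching character
            ))
        previous = current
    return previous[-1]


# Hypothetical constant for this sketch: the maximum distance named in the reply.
MAX_FUZZY_EDITS = 2

for term in ("kimc", "kim"):
    distance = levenshtein(term, "kimchy")
    print(term, distance, distance <= MAX_FUZZY_EDITS)
# prints:
# kimc 2 True
# kim 3 False
```

Under these assumptions, "kimc" falls inside the allowed distance and "kim" falls just outside it, which matches the two search results in the question.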

Please see the Fuzzy documentation for more
detail: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-fuzzy-query.html
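
As a rough illustration of the wildcard alternative, the following Python sketch builds a request body that could be sent to the same `_search` endpoint in place of the fuzzy query. The helper name `wildcard_query` and the pattern `kim*` are assumptions chosen for this thread's example; check the wildcard query documentation linked above for how patterns behave on your own mapping.

```python
import json


def wildcard_query(field: str, pattern: str) -> dict:
    """Build a search body that matches terms in `field` against `pattern`."""
    return {"query": {"wildcard": {field: pattern}}}


# "kim*" is meant to match any term that starts with "kim", such as "kimchy".
body = wildcard_query("user", "kim*")
print(json.dumps(body, indent=2))
```

The printed JSON can be passed to curl with -d in the same way as the fuzzy query bodies in the original question.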

(In reply to the message from Linlin Fu, Wednesday, December 11, 2013 6:22:11 PM UTC-8.)


(system) #3