How to get results for a misspelled query without using a fuzzy query?

Michal_Orzechowski · February 3, 2011, 1:43pm

Hi,

I am new to Elasticsearch and I am trying to handle the following case:

I have indexed some products:

curl -XPUT 'http://localhost:9200/shop/cameras/1' -d '
{
"product_name": "Digital SLR Camera Nikon D3100 14.2MP"
}
'

curl -XPUT 'http://localhost:9200/shop/cameras/2' -d '
{
"product_name": "Camera Nikon D60 10.2MP"
}
'

and I am trying to retrieve them when the query is misspelled. This works
quite well with a fuzzy query such as:

curl -XGET 'http://localhost:9200/shop/cameras/_search' -d '
{
"query" : {
"fuzzy" : {
"product_name" : {
"value" : "nikko",
"min_similarity" : 0.5
}
}
}
}
'
However, the docs for the fuzzy query warn that this solution is not
scalable. Is there another way to get results for a misspelled query
without using fuzzy queries?

Thanks in advance.
Michal

Karussell1 · February 4, 2011, 11:50pm

You could try to create your own analyzer (or take one from Solr ;)),
either via n-grams or 'phonetic terms': terms that sound alike are
transformed into the same term. See http://en.wikipedia.org/wiki/Soundex
etc.
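
To make the 'phonetic terms' idea concrete, here is a minimal Python sketch of the classic Soundex encoding described on the Wikipedia page above. It is only an illustration of the principle, not the code Elasticsearch or Solr actually run, and the function name `soundex` is just a name chosen for this example.

```python
# Simplified American Soundex: first letter plus three digits.
# Illustration only; not the implementation used by Elasticsearch or Solr.
CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(word):
    """Return the 4-character Soundex code of `word` ("" if it has no letters)."""
    word = "".join(ch for ch in word.lower() if ch.isalpha())
    if not word:
        return ""
    encoded = word[0].upper()
    previous = CODES.get(word[0], "")
    for ch in word[1:]:
        digit = CODES.get(ch, "")
        # Append a digit only when it differs from the previous code.
        if digit and digit != previous:
            encoded += digit
        # 'h' and 'w' do not separate equal codes; vowels do.
        if ch not in "hw":
            previous = digit
    # Pad with zeros and cut to the usual length of four.
    return (encoded + "000")[:4]


for term in ("nikon", "nikkon", "nicon", "nikko"):
    print(term, soundex(term))
```

With this sketch, "nikon", "nikkon" and "nicon" all encode to N250, so those misspellings would hit the same indexed term. The "nikko" from the question encodes to N200, so a purely phonetic match would miss it; that kind of typo is better served by n-grams.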

Another starting point could be to look at Solr's spellchecking mechanism
and port it (the analyzer) to ES:

http://wiki.apache.org/solr/SpellCheckComponent

On the other hand, I would still try fuzzy, or at least fall back to a
fuzzy query (in the background) when the normal query returns no results ...
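
A rough Python sketch of that fallback idea, under some assumptions: `run_query` is a hypothetical callable (not part of any client library) that sends a request body to `/shop/cameras/_search` and returns the list of hits, and the exact attempt uses a plain `term` query. The fuzzy body has the same shape as the request in the original post.

```python
def build_term_query(field, value):
    # Exact attempt. A term query is not analyzed, so the value is
    # lowercased here on the assumption that the indexed tokens are lowercase.
    return {"query": {"term": {field: value.lower()}}}


def build_fuzzy_query(field, value, min_similarity=0.5):
    # Same shape as the fuzzy request shown in the original post.
    return {"query": {"fuzzy": {field: {"value": value,
                                        "min_similarity": min_similarity}}}}


def search_with_fuzzy_fallback(run_query, field, value):
    """Try the cheap exact query first; go fuzzy only when it finds nothing.

    `run_query` is a hypothetical callable: it takes a request body (dict)
    and returns a list of hits. Returns (hits, used_fuzzy).
    """
    hits = run_query(build_term_query(field, value))
    if hits:
        return hits, False
    return run_query(build_fuzzy_query(field, value)), True
```

This way the expensive fuzzy query only runs for the (hopefully small) share of requests where the exact query comes back empty.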

Regards,
Peter.

Karussell1 · February 4, 2011, 11:51pm

Also take a look here:

http://elasticsearch-users.115913.n3.nabble.com/Terms-API-for-Spellchecker-td1691838.html

Michal_Orzechowski · February 8, 2011, 12:53pm

Thanks for the help! I am going to look into those Solr components.

kimchy · February 8, 2011, 2:22pm

Those analyzers are already provided in ES (soundex and ngram). Regarding the spell check component, it is problematic since it requires another index to be built alongside the original index, which gets really complicated in a distributed system.

The reason for not tackling this currently is that there is some really cool work being done in Lucene trunk (the upcoming 4.0) that will provide spell-check-like functionality while working on the original index.
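
As a rough illustration of why the ngram analyzer mentioned above tolerates typos, the Python sketch below splits terms into character bigrams and compares the sets. The helper names and the Jaccard-style ratio are assumptions made for this example; Elasticsearch indexes the grams as ordinary tokens and scores matches with its usual relevance formula, not with this ratio.

```python
def char_ngrams(term, n=2):
    """Return the set of character n-grams of `term` (lowercased)."""
    term = term.lower()
    return {term[i:i + n] for i in range(len(term) - n + 1)}


def ngram_overlap(a, b, n=2):
    """Share of n-grams common to both terms (Jaccard ratio, 0.0 to 1.0)."""
    grams_a, grams_b = char_ngrams(a, n), char_ngrams(b, n)
    if not grams_a or not grams_b:
        return 0.0
    return len(grams_a & grams_b) / len(grams_a | grams_b)


print(sorted(char_ngrams("nikko")))     # grams of the misspelled query
print(sorted(char_ngrams("nikon")))     # grams of the indexed term
print(ngram_overlap("nikko", "nikon"))  # three of five distinct grams shared
print(ngram_overlap("nikko", "canon"))  # no grams shared
```

The misspelled "nikko" still shares the grams "ni", "ik" and "ko" with "nikon", so a query analyzed into bigrams finds the Nikon products without any fuzzy term expansion.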

srrin · April 22, 2011, 5:53am

Hi Shay,
Is there an example of how to use these phonetic analyzers and search documents?
It may help me implement one of my client's requests.

kimchy · April 28, 2011, 9:44pm

This page has an example of how to set it up: Elasticsearch Platform — Find real-time answers at scale | Elastic. Then, you can reference the constructed analyzer by name in your mappings (where it applies).
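
For readers without the linked page at hand, here is a hedged Python sketch of what such a setup could look like: index settings that define a custom analyzer with a `phonetic` token filter, plus a mapping that references the analyzer by name. The names `product_phonetic` and `product_soundex` are made up for this example, the `string` field type reflects releases of that era (newer ones use `text`), and in later releases the phonetic filter ships as a separate plugin, so check the docs for your version before relying on it.

```python
import json

ANALYZER_NAME = "product_phonetic"  # hypothetical name, chosen for this sketch
FILTER_NAME = "product_soundex"     # hypothetical name, chosen for this sketch

# Body for creating the index: a custom analyzer built from the standard
# tokenizer, lowercasing, and a phonetic filter using the soundex encoder.
index_settings = {
    "settings": {
        "analysis": {
            "filter": {
                FILTER_NAME: {"type": "phonetic", "encoder": "soundex"},
            },
            "analyzer": {
                ANALYZER_NAME: {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["lowercase", FILTER_NAME],
                },
            },
        }
    }
}

# Mapping for the `cameras` type from the original post: the field refers
# to the analyzer defined above by its name.
mapping = {
    "cameras": {
        "properties": {
            "product_name": {"type": "string", "analyzer": ANALYZER_NAME},
        }
    }
}

print(json.dumps(index_settings, indent=2))
print(json.dumps(mapping, indent=2))
```

The printed bodies can be sent with curl in the same way as the requests earlier in the thread; a regular query on `product_name` is then analyzed with the phonetic analyzer at both index and search time.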

srrin · April 30, 2011, 3:15pm

Thank you, I will check this and get back to you if I need any clarification.

SRR
