Autocompletion

Hi everyone,

I'm looking for a way to implement an autocomplete feature using Elasticsearch.
Does anyone have tips on how to achieve this?
Kimchy, the search bar on the elasticsearch site is exactly what I'd like to do. Can you tell us how you did it?
What kind of mapping? How did you use the API to implement it?

Thanks,
Frederic

Use a facet.


I usually implement autocomplete with multi_field types. Let's say, for example, you have two document types, Articles and Comments (for a blog), and you want autocomplete on the article title and commenter name. When you map these types, set these fields up as multi_fields (see the multi_field mapping documentation), with one of the sub-fields named "autocomplete". To query these you can use a simple query_string search, and use the wildcard syntax to search only the fields that are mapped with the autocomplete sub-field. In your query, set the "fields" param to "*.autocomplete", which will run the search on any field with an autocomplete sub-field.

This way, I write my autocomplete query once, and then define which fields are searchable using my type mappings.
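For illustration, here is a minimal sketch of the mapping and query Nick describes, using the multi_field syntax of that era; the index, type, field and analyzer names (blog, article, title, autocomplete_analyzer) are placeholders rather than anything from an actual setup:

curl -XPUT 'http://localhost:9200/blog/article/_mapping' -d '{
  "article" : {
    "properties" : {
      "title" : {
        "type" : "multi_field",
        "fields" : {
          "title" : { "type" : "string" },
          "autocomplete" : { "type" : "string", "index_analyzer" : "autocomplete_analyzer" }
        }
      }
    }
  }
}'

curl -XGET 'http://localhost:9200/blog/_search' -d '{
  "query" : {
    "query_string" : {
      "fields" : ["*.autocomplete"],
      "query" : "screw"
    }
  }
}'

Any field mapped with an "autocomplete" sub-field is picked up by the "*.autocomplete" wildcard, so the query itself never changes when new fields are made searchable.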


"...use the wildcard syntax to search only the fields that are mapped
with the autocomplete sub-field"

Is your index big? What about your index workload, is updated
frecuently?
I'm really interested to see how this performs in big indices with
high updates ratings.

Interesting approach. I actually need to do a slightly different auto-complete: I have a bunch of article text and would like to autocomplete by topic. So in the example above, if there were titles "phillips screwdrivers" and "flathead screwdrivers" and the user types "screw", I want to autocomplete to "screwdrivers" in the search box. Then the user can submit a "screwdrivers" query and both titles should match. I tried a match_all query with a terms facet that did a pattern match (screw.*), but I got an OutOfMemoryException. Is there a better way to do this, or some way to estimate how much memory I need for the facet query?
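As a rough sketch of the request described above (a match_all query plus a terms facet restricted by a regex; the index and field names are hypothetical). The terms facet loads the field's terms into memory to compute the counts, which is the likely source of the OutOfMemoryException on a large index:

curl -XGET 'http://localhost:9200/articles/_search' -d '{
  "query" : { "match_all" : {} },
  "facets" : {
    "topic_suggest" : {
      "terms" : {
        "field" : "title",
        "regex" : "screw.*",
        "size" : 10
      }
    }
  }
}'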


Thanks Nick for this tip.
Actually I'm going to search over FAQ questions and answers.
Isn't the autocomplete method going to be a bit too much on answers (say 5 to 10 sentences in each answer)?

My index is going to have a maximum of 15 categories of questions, each holding an unlimited number of question/answer pairs.
I don't think we'll go over 50 questions in each category.
It's not such a big index, sorry :)

About facets, I'm not sure how they would help. Facets can help to compute stats on, for example, how many question/answer pairs contain a certain word.
As an example (the FAQ is about cars), I could display the number of questions dealing with gas, with security, and so on. Or do I misunderstand facets?

I'm actually quite interested in the multifields method. I'll check this out.

Any more opinions?
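(On the facet idea above: that is indeed what facets can do. A minimal sketch using query facets to count the question/answer pairs mentioning each topic; the index, type and field names here are hypothetical:)

curl -XGET 'http://localhost:9200/faq/question/_search?search_type=count' -d '{
  "query" : { "match_all" : {} },
  "facets" : {
    "gas" : { "query" : { "text" : { "answer" : "gas" } } },
    "security" : { "query" : { "text" : { "answer" : "security" } } }
  }
}'

Each named query facet returns the number of hits that also match its query, so one request yields all the counts.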


Just a dumb question after reading the doc on multi fields: does it mean the value is indexed twice? I mean, does it take twice the space on disk/in memory?

Yes, your field will be mapped n times and more space will be required,
depending on the mapping.

-- Tanguy


OK thanks, I'll see how it can fit, but the approach seems really interesting.
And by the way, I saw an article about autocomplete with Solr, and they also created an autocomplete field.

Another question: are NGrams useful for this usage? And if yes, how?


Ngrams are useful for matching partial strings (particularly useful for autocomplete, since you're matching partially typed words) and for matching typographical errors (generally useful). The ngram features in ES are quite powerful, but do note that if you have lots of terms analyzed into a wide range of ngrams, it takes up a lot of memory.

-- Ifty

"...are NGrams useful for this usage? And if yes, how ?..."

ngrams are the way I have gone to do autocompletion and probably the
way almost everyone here uses.
Thats why I'm asking how was performing something that IMHO is a
beast.
Wildcard queries over a large index is not the best idea unless you
dont mind the response time or has sharded with lots of machines.

To autocomplete with ngram just create a custom analyzer that do what
you want plus ngram, thats making the index bigger but
you have the possibility of search by ngram which are essentially
parts of a word.

Does this makes sense to U?
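For reference, here is the same idea expressed as index settings over the REST API; this is only a sketch, the index, analyzer and filter names are placeholders, and the Java code further down in the thread builds the same kind of structure with XContentBuilder:

curl -XPUT 'http://localhost:9200/myindex' -d '{
  "settings" : {
    "analysis" : {
      "filter" : {
        "autocomplete_ngram" : {
          "type" : "nGram",
          "min_gram" : 3,
          "max_gram" : 6
        }
      },
      "analyzer" : {
        "autocomplete_index" : {
          "type" : "custom",
          "tokenizer" : "standard",
          "filter" : ["lowercase", "autocomplete_ngram"]
        },
        "autocomplete_search" : {
          "type" : "custom",
          "tokenizer" : "standard",
          "filter" : ["lowercase"]
        }
      }
    }
  }
}'

The field you autocomplete on then gets index_analyzer pointing at the ngram analyzer and search_analyzer pointing at the plain one, so the user's input is not ngram-ed again at query time.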

Well, I think I get the point; I just need to get up to speed with ngrams and how to implement an analyzer.
I guess the best source to look at for this is Lucene, right?


I tried creating an index with a mapping, specifying a tokenizer and
analyzers.

XContentBuilder settings = jsonBuilder()
    .startObject()
        .startObject("analysis")
            .startObject("analyzer")
                .startObject("response_search_analyzer")
                    .field("tokenizer", "responseTokenizer")
                    .field("filter", "lowercase")
                .endObject()
                .startObject("response_index_analyzer")
                    .field("tokenizer", "responseTokenizer")
                    .field("filter", "lowercase", "nGram")
                .endObject()
            .endObject()
            .startObject("tokenizer")
                .startObject("responseTokenizer")
                    .field("type", "whitespace")
                .endObject()
            .endObject()
            .startObject("filter")
                .startObject("nGram")
                    .field("type", "nGram")
                    .field("min_ngram", 3)
                    .field("max_ngram", 6)
                .endObject()
            .endObject()
        .endObject()
    .endObject();

XContentBuilder mapping = jsonBuilder()
    .startObject()
        .startObject("question")
            .startObject("properties")
                .startObject("responseDescription")
                    .field("type", "string")
                    .field("search_analyzer", "response_search_analyzer")
                    .field("index_analyzer", "response_index_analyzer")
                .endObject()
            .endObject()
        .endObject()
    .endObject();

Then I create the index this way:

CreateIndexResponse response = client.admin().indices().prepareCreate("faq-ze")
        .setSettings(createSettings())
        .addMapping("question", createMapping())
        .execute().actionGet();

Resulting settings are these:

curl -XGET 'http://192.168.6.159:9202/faq-ze/_settings?pretty=1'
{
  "faq-ze" : {
    "settings" : {
      "index.analysis.analyzer.response_index_analyzer.filter.0" : "lowercase",
      "index.analysis.analyzer.response_index_analyzer.filter.1" : "nGram",
      "index.analysis.tokenizer.responseTokenizer.type" : "whitespace",
      "index.analysis.analyzer.response_index_analyzer.tokenizer" : "responseTokenizer",
      "index.analysis.analyzer.response_search_analyzer.filter" : "lowercase",
      "index.analysis.filter.nGram.min_ngram" : "3",
      "index.analysis.filter.nGram.type" : "nGram",
      "index.analysis.filter.nGram.max_ngram" : "6",
      "index.analysis.analyzer.response_search_analyzer.tokenizer" : "responseTokenizer",
      "index.number_of_shards" : "2",
      "index.number_of_replicas" : "1",
      "index.version.created" : "190499"
    }
  }
}

And the mapping:

curl -XGET 'http://192.168.6.159:9202/faq-ze/_mapping?pretty=1'
{
  "faq-ze" : {
    "category" : {
      "properties" : {
        "id" : { "type" : "long" },
        "name" : { "type" : "string" }
      }
    },
    "question" : {
      "properties" : {
        "categoryTitle" : { "type" : "string" },
        "id" : { "type" : "long" },
        "questionDisplay" : { "type" : "string" },
        "questionPopularity" : { "type" : "long" },
        "questionTitle" : { "type" : "string" },
        "responseDescription" : {
          "type" : "string",
          "index_analyzer" : "index_analyzer",
          "search_analyzer" : "search_analyzer"
        },
        "responseMedia" : { "type" : "string" },
        "responseMediaGlimpse" : { "type" : "string" },
        "responsePdf" : { "type" : "string" },
        "responsePlusLabel" : { "type" : "string" },
        "responsePlusUrl" : { "type" : "string" },
        "responseTitle" : { "type" : "string" }
      }
    }
  }
}

But an analysis attempt gives nothing:

curl -XGET 'http://192.168.6.159:9202/faq-ze/_analyze?pretty=1&text="In the future, the HCCI diesel engine (Homogenous Charge Compression Ignition) and CAI gasoline engine "&analyzer=response_index_analyzer'
curl: (52) Empty reply from server
[1] 9614 exit 52 curl -XGET

And of course, searching returns nothing:
{
  "query" : {
    "field" : {
      "responseDescription" : "ren"
    }
  }
}

Gives:
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 2,
    "successful" : 2,
    "failed" : 0
  },
  "hits" : {
    "total" : 0,
    "max_score" : null,
    "hits" : [ ]
  }
}

(This is given that my document's responseDescription field value contains 'Renault' in the middle of the text.)

Any idea?


And if I remove the index/search analyzers from the field, I get responses with my query using 'Renault' (but not 'ren').


Got it working!

First, my analysis query actually did return something, but it wasn't displayed by curl :/
I entered the request in my browser and saw a list of tokens, but they were very wrong (ngrams with the default boundaries, min: 1 and max: 2).
I had a wrong nGram filter with the wrong field names (min_ngram/max_ngram instead of min_gram/max_gram).
Here are my new settings:

XContentBuilder settings = jsonBuilder()
    .startObject()
        .startObject("analysis")
            .startObject("analyzer")
                .startObject("response_search_analyzer")
                    .field("type", "custom")
                    .field("tokenizer", "responseTokenizer")
                    .field("filter", "lowercase")
                .endObject()
                .startObject("response_index_analyzer")
                    .field("type", "custom")
                    .field("tokenizer", "standard")
                    .field("filter", "lowercase", "edgeNGram")
                .endObject()
            .endObject()
            .startObject("tokenizer")
                .startObject("responseTokenizer")
                    .field("type", "lowercase")
                .endObject()
            .endObject()
            .startObject("tokenizer")
                .startObject("responseNGRamTokenizer")
                    .field("type", "edgeNGram")
                    .field("min_ngram", 3)
                    .field("max_ngram", 6)
                .endObject()
            .endObject()
            .startObject("filter")
                .startObject("edgeNGram")
                    .field("type", "edgeNGram")
                    .field("side", "front")
                    .field("min_gram", 3)
                    .field("max_gram", 6)
                .endObject()
            .endObject()
        .endObject()
    .endObject();

The analysis works perfectly, and so does my request.
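(For reference, the analyzer output can be double-checked with the _analyze API; with the settings above, lowercase plus a front edgeNGram of 3 to 6 characters, a word like "reduce" should produce the tokens red, redu, reduc and reduce:)

curl -XGET 'http://192.168.6.159:9202/faq-ze/_analyze?analyzer=response_index_analyzer&pretty=1' -d 'reduce'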

{"query" : {
"text" : {
"responseDescription" : "red"
}
}
}

Gives me (I'm looking for a text containing 'reduce', so an autocompletion request with 'red' returns the text I'm looking for):
{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 2,
    "successful" : 2,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 0.033902764,
    "hits" : [ {
      "_index" : "faq-ze",
      "_type" : "question",
      "_id" : "2",
      "_score" : 0.033902764,
      "_source" : {
        "id" : 2,
        "responseDescription" : "We created the low pressure EGR (Exhaust Gas Recirculation) system to reduce nitrogen oxide (NOx) emissions from combustion. Using several injectors in the diesel burning process also optimizes combustion. This multi-injection diesel process reduces pollutant emissions and engine noise."
      }
    } ]
  }
}

Yipeee!

Frederic


I recommend this
thread: http://elasticsearch-users.115913.n3.nabble.com/help-needed-with-the-query-tt3177477.html#a3178856


Nice pointer, seems quite informative. I'll check it out later, but I'm interested in the different filters used, allowing less strict search. And the multifields are back in there :)
Thanks Roly!

Frederic
