Analysis Mismatch in Elastic Search


(Neil) #1

I have the Elasticsearch configuration shown below. The problem is that
I'm not getting back the results I expect. What's odd is that I'm taking
exactly what I do in Solr and trying to apply it to Elasticsearch.

Given that I have no way to analyze what's going on in Elasticsearch,
I'm at a loss as to how to troubleshoot further why things are not
working as I would expect.

Essentially, running a query_string query for "ja" in Elasticsearch does
not return "627 Lb Woman Jackies Story", even though that document exists
in the index.

Any help is appreciated.

Thanks,

Neil

elasticsearch.yml

analysis:
  analyzer:
    containsText:
      tokenizer: whitespace
      filter: [asciifolding, lowercase, autocomplete]
  filter:
    autocomplete:
      type: edgeNGram
      min_gram: 1
      max_gram: 100
      side: front

mapper:
  program:
    properties:
      autoCompleteContainsProgramTitle: {"type" : "containsText", "store" : "yes", "index" : "analyzed", "term_vector" : "with_positions_offsets"}
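
As a rough illustration of what this analyzer chain is intended to do, the sketch below imitates it in plain Python: whitespace tokenization, a crude stand-in for asciifolding (it only strips combining accents, which is less than the real filter does), lowercasing, and front edge n-grams with min_gram 1 and max_gram 100. The function names are hypothetical and this is not Elasticsearch code; it only shows that, if the analyzer were applied at index time, the title would produce the token "ja".

```python
import unicodedata


def ascii_fold(token):
    # Crude approximation of asciifolding: decompose, then drop combining marks.
    decomposed = unicodedata.normalize("NFKD", token)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def edge_ngrams(token, min_gram=1, max_gram=100):
    # Front edge n-grams: every prefix from min_gram up to max_gram characters.
    upper = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, upper + 1)]


def contains_text(text):
    # whitespace tokenizer -> asciifolding -> lowercase -> edge n-grams
    tokens = []
    for raw in text.split():
        folded = ascii_fold(raw).lower()
        tokens.extend(edge_ngrams(folded))
    return tokens


tokens = contains_text("627 Lb Woman Jackies Story")
print(tokens)
print("ja" in tokens)
```

If the real index does not contain these prefix tokens, that suggests the analyzer was never applied to the field at index time.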


curl -XGET http://localhost:9200/myindex/program/_search -d '{
  "query" : {
    "query_string" : {
      "query" : "ja",
      "fields" : [ "autoCompleteContainsProgramTitle" ],
      "default_operator" : "or",
      "analyze_wildcard" : false,
      "analyzer": "containsText"
    }
  },
  "fields" : [ "autoCompleteContainsProgramTitle"]
}'

----- response

"hits": [
  {
    "_id": "1934155",
    "_index": "myindex",
    "_score": 6.411867,
    "_type": "program",
    "fields": {
      "autoCompleteContainsProgramTitle": "J.A. Johnson"
    }
  },
  {
    "_id": "16039073",
    "_index": "myindex",
    "_score": 5.1311817,
    "_type": "program",
    "fields": {
      "autoCompleteContainsProgramTitle": "Sam ja Vjatskij urozhenec"
    }
  },
  {
    "_id": "3904322",
    "_index": "myindex",
    "_score": 5.130897,
    "_type": "program",
    "fields": {
      "autoCompleteContainsProgramTitle": "Ljudmila Chursina. Ja- Nich'ja"
    }
  },
  {
    "_id": "1954834",
    "_index": "myindex",
    "_score": 4.4931150000000004,
    "_type": "program",
    "fields": {
      "autoCompleteContainsProgramTitle": "J.A. Johnson/ Eastern Star Baptist Church"
    }
  }
]


curl -XGET http://localhost:9200/myindex/program/_search -d '{
  "query" : {
    "term" : { "programID": "1652094" }
  }
}'

----- response

{
  "_shards": {
    "failed": 0,
    "successful": 4,
    "total": 4
  },
  "hits": {
    "hits": [
      {
        "_id": "1652094",
        "_index": "myindex",
        "_score": 9.8563279999999995,
        "_source": {
          "autoCompleteContainsProgramTitle": "627 Lb Woman Jackies Story",
          "programID": "1652094"
        },
        "_type": "program"
      }
    ],
    "max_score": 9.8563279999999995,
    "total": 1
  },
  "timed_out": false,
  "took": 2
}

So if I run the same thing in solr using the following schema snippet
I get all the results I would expect.















(Neil) #2

I realized that I'm getting this error:

curl -XGET 'localhost:9200/myindex/_analyze?analyzer=containsText' -d '627 Lb Woman Jackies Story'
{"error":"ElasticSearchIllegalArgumentException[failed to find analyzer]","status":400}

So I'm confused as to why Elasticsearch would start up if this is broken.

Any ideas on why my yml isn't defining the containsText analyzer?

Or how to troubleshoot?

Thanks,

Neil


(Shay Banon) #3

First, your analysis configuration is wrong. You are missing the top-level
index setting. See some samples here:
http://www.elasticsearch.org/guide/reference/index-modules/analysis/.

Second, where did you get that mapping configuration? That's not how you
define mappings. See more here:
http://www.elasticsearch.org/guide/reference/mapping/ on mapping definition.
Ways to define mappings include providing them in the create index API
(recommended), using the put mapping API, or setting them in the config
section.
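
To make the two points above concrete, here is a minimal sketch of a create index request body that nests the analysis settings under a top-level index key and defines the mapping alongside it. It assumes the 0.x-era API used in this thread, and it assumes the intent was a string field analyzed with containsText (the original mapping put the analyzer name in "type"). The explicit "type": "custom" on the analyzer is also an assumption added for clarity. The Python code only builds and prints the JSON; nothing is sent to a server.

```python
import json

# Analysis settings nested under "index", as the reply above describes.
create_index_body = {
    "settings": {
        "index": {
            "analysis": {
                "analyzer": {
                    "containsText": {
                        "type": "custom",  # assumption: stated explicitly
                        "tokenizer": "whitespace",
                        "filter": ["asciifolding", "lowercase", "autocomplete"],
                    }
                },
                "filter": {
                    "autocomplete": {
                        "type": "edgeNGram",
                        "min_gram": 1,
                        "max_gram": 100,
                        "side": "front",
                    }
                },
            }
        }
    },
    # Mapping supplied with the create index call (the recommended route).
    "mappings": {
        "program": {
            "properties": {
                "autoCompleteContainsProgramTitle": {
                    "type": "string",            # assumption: field type
                    "analyzer": "containsText",  # assumption: intended analyzer
                    "store": "yes",
                    "index": "analyzed",
                    "term_vector": "with_positions_offsets",
                }
            }
        }
    },
}

body_json = json.dumps(create_index_body, indent=2)
print(body_json)
```

The printed body could then be saved to a file and sent with something like curl -XPUT http://localhost:9200/myindex -d @body.json (the file name is hypothetical). Documents indexed before the analyzer existed would most likely need to be indexed again for the new tokens to appear.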

(In reply to Neil's message of Tue, Oct 4, 2011 at 9:33 PM.)

