I have a small problem with my synonym filter.

I built a small sample that shows what I am doing.

My test synonyms file (test.syn, placed in my /etc/elasticsearch folder) is:

aaa,bbb,ccc,ddd
www,xxx,yyy,zzz
eee,fff,ggg,hhh => 111
sss,ttt,uuu,vvv => 222
rrr => 333,444,555
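
For readers unfamiliar with this rule format, here is a minimal Python sketch of how the five lines above are usually read: a plain comma-separated line makes all terms equivalent, while "left => right" replaces each left-hand term with the right-hand terms only. The function name parse_synonym_rules is made up for illustration, and this is only a model of the behaviour (assuming the default expanding mode), not the code Elasticsearch runs.

```python
def parse_synonym_rules(lines):
    """Map each input term to the list of terms emitted in its place."""
    mapping = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if "=>" in line:
            # Explicit mapping: left-hand terms are replaced by right-hand terms.
            left, right = line.split("=>", 1)
            inputs = [t.strip().lower() for t in left.split(",") if t.strip()]
            outputs = [t.strip().lower() for t in right.split(",") if t.strip()]
        else:
            # Equivalent terms: every term expands to the whole group.
            inputs = outputs = [t.strip().lower() for t in line.split(",") if t.strip()]
        for term in inputs:
            emitted = mapping.setdefault(term, [])
            for out in outputs:
                if out not in emitted:
                    emitted.append(out)
    return mapping


RULES = """\
aaa,bbb,ccc,ddd
www,xxx,yyy,zzz
eee,fff,ggg,hhh => 111
sss,ttt,uuu,vvv => 222
rrr => 333,444,555
""".splitlines()

synonyms = parse_synonym_rules(RULES)
print(synonyms["bbb"])  # ['aaa', 'bbb', 'ccc', 'ddd']
print(synonyms["eee"])  # ['111']
print(synonyms["rrr"])  # ['333', '444', '555']
```

Note that under this reading "eee" itself is not kept for the third rule; only "111" is emitted.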

I created an index like so:

PUT /testindex?pretty
{
  "settings": {
    "analysis": {
      "analyzer": {
        "myIndexAnalyzer": {
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "mySynonymsFilter"
          ]
        },
        "mySearchAnalyzer": {
          "tokenizer": "standard",
          "filter": [
            "lowercase"
          ]
        }
      },
      "filter": {
        "mySynonymsFilter": {
          "type": "synonym",
          "ignore_case": true,
          "synonyms_path": "test.syn"
        }
      }
    }
  },
  "mappings": {
    "testitem": {
      "properties": {
        "title": {
          "type": "string",
          "index_analyzer": "myIndexAnalyzer",
          "search_analyzer": "mySearchAnalyzer"
        }
      }
    }
  }
}

and added some data:

POST /_bulk
{ "index": { "_index": "testindex", "_type": "testitem", "_id": "1" }}
{ "title": "aaa test daten eintrag." }
{ "index": { "_index": "testindex", "_type": "testitem", "_id": "2" }}
{ "title": "bbb test daten eintrag." }
{ "index": { "_index": "testindex", "_type": "testitem", "_id": "3" }}
{ "title": "eee test daten eintrag." }

Testing myIndexAnalyzer with

POST /testindex/_analyze?analyzer=myIndexAnalyzer&pretty
{aaa test daten eintrag}

results in:

{
  "tokens": [
    {
      "token": "aaa",
      "start_offset": 1,
      "end_offset": 4,
      "type": "SYNONYM",
      "position": 1
    },
    {
      "token": "bbb",
      "start_offset": 1,
      "end_offset": 4,
      "type": "SYNONYM",
      "position": 1
    },
    {
      "token": "ccc",
      "start_offset": 1,
      "end_offset": 4,
      "type": "SYNONYM",
      "position": 1
    },
    {
      "token": "ddd",
      "start_offset": 1,
      "end_offset": 4,
      "type": "SYNONYM",
      "position": 1
    },
    {
      "token": "test",
      "start_offset": 5,
      "end_offset": 9,
      "type": "<ALPHANUM>",
      "position": 2
    }
  ]
}

That looks fine to me.

Searching this index, I expected to find records 1 and 2 when searching
for any of "aaa", "bbb", "ccc" or "ddd".

What am I doing wrong?
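
To spell out that expectation, here is a small Python simulation of index-time synonym expansion combined with a search analyzer that only lowercases. Everything in it is a stand-in: the regex tokenizer approximates "standard" plus "lowercase", the SYNONYMS table is hard-coded from two of the rules above, and the dictionary-based inverted index is only a toy model of what Elasticsearch stores.

```python
import re
from collections import defaultdict

# Hard-coded expansions for two rule lines (illustrative only).
GROUP = ["aaa", "bbb", "ccc", "ddd"]
SYNONYMS = {term: GROUP for term in GROUP}
SYNONYMS.update({term: ["111"] for term in ["eee", "fff", "ggg", "hhh"]})


def tokenize(text):
    # Crude stand-in for the standard tokenizer followed by lowercase.
    return re.findall(r"\w+", text.lower())


def index_analyze(text):
    # Index side: every token is replaced by its synonym expansion.
    tokens = []
    for tok in tokenize(text):
        tokens.extend(SYNONYMS.get(tok, [tok]))
    return tokens


def search_analyze(text):
    # Search side: no synonym filter, as in mySearchAnalyzer.
    return tokenize(text)


DOCS = {
    "1": "aaa test daten eintrag.",
    "2": "bbb test daten eintrag.",
    "3": "eee test daten eintrag.",
}

# Build a toy inverted index: token -> set of document ids.
inverted = defaultdict(set)
for doc_id, title in DOCS.items():
    for tok in index_analyze(title):
        inverted[tok].add(doc_id)


def search(query):
    hits = set()
    for tok in search_analyze(query):
        hits |= inverted.get(tok, set())
    return sorted(hits)


print(search("ccc"))  # ['1', '2']
print(search("111"))  # ['3']
print(search("eee"))  # []
```

In this model any of the four equivalent terms finds records 1 and 2, which is the behaviour I expected from the real index. It also shows that record 3 is only reachable via "111", because the explicit mapping replaces "eee" at index time and the search side does not apply synonyms.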

TIA
Ste Phan

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/64e90076-f905-4490-bfe8-3b1607e5e98a%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

I forgot to mention: if I search for "aaa" I get the record with _id = 1,
and if I search for "bbb" I get the record with _id = 2 ... nothing else.


What kind of query are you executing? Are you querying against a specific
field? A match query against the title field should work.

When using the analyze API, explicitly state the field rather than the
analyzer; that gives a more accurate picture of what really goes on.
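
As a rough sketch of both suggestions, the Python below only builds the two requests and prints them; it does not talk to a cluster. The helper names match_query and analyze_url are invented for this example, the index and field names come from the sample earlier in the thread, and the "field" URL parameter is the form used elsewhere in this thread.

```python
import json
from urllib.parse import urlencode


def match_query(field, text):
    # Request body for a match query against a single field.
    return {"query": {"match": {field: text}}}


def analyze_url(index, field):
    # Analyze by field, so the analyzer configured for that field in the
    # mapping is used instead of one named by hand.
    return "/{}/_analyze?{}".format(index, urlencode({"field": field}))


body = match_query("title", "ccc")

print("POST /testindex/_search")
print(json.dumps(body, indent=2))
print("GET " + analyze_url("testindex", "title"))
```

If the analyze call by field shows no SYNONYM tokens while the call by analyzer name does, the field is probably not using the analyzer you think it is.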

Cheers,

Ivan
On Apr 21, 2015 11:40 AM, "Ste Phan" stephan.skusa@gmail.com wrote:

[quoted copy of the original message trimmed]


I tried multi_match queries.

The small example seems to work now ... I don't know why. But my
original index has the same problem.

I am posting the synonyms file as well as the create statement.

Analyzing this via:

GET /myindex/_analyze?field=article.authors
{gumbel}

results in:

{
  "tokens": [
    {
      "token": "gumbel",
      "start_offset": 1,
      "end_offset": 7,
      "type": "<ALPHANUM>",
      "position": 1
    }
  ]
}

No synonyms in the result, contrary to what I would expect. Searching for
a synonym, "gumble" for example, gives no results.

I would be glad to hear from you ... TIA

Ste Phan


OK, I found my error ... the structure of the index definition was wrong
... sorry.
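
The thread does not say what exactly was wrong with the structure, so the following is only one plausible kind of check, written as a Python sketch: it verifies that every analyzer named in the mappings is actually defined under settings.analysis.analyzer. The function name find_undefined_analyzers and the BROKEN example body are hypothetical, and built-in analyzers such as "standard" would be reported too, since the check only knows about custom definitions.

```python
def find_undefined_analyzers(index_body):
    """List (type, field, key, analyzer) references with no custom definition."""
    defined = set(
        index_body.get("settings", {}).get("analysis", {}).get("analyzer", {})
    )
    missing = []
    for type_name, type_def in index_body.get("mappings", {}).items():
        for field, field_def in type_def.get("properties", {}).items():
            for key in ("analyzer", "index_analyzer", "search_analyzer"):
                name = field_def.get(key)
                if name is not None and name not in defined:
                    missing.append((type_name, field, key, name))
    return missing


ANALYSIS = {
    "analyzer": {
        "myIndexAnalyzer": {"tokenizer": "standard", "filter": ["lowercase"]},
        "mySearchAnalyzer": {"tokenizer": "standard", "filter": ["lowercase"]},
    }
}
MAPPINGS = {
    "testitem": {
        "properties": {
            "title": {
                "type": "string",
                "index_analyzer": "myIndexAnalyzer",
                "search_analyzer": "mySearchAnalyzer",
            }
        }
    }
}

# Hypothetical mistake: "analysis" placed at the top level, not under "settings".
BROKEN = {"analysis": ANALYSIS, "mappings": MAPPINGS}
FIXED = {"settings": {"analysis": ANALYSIS}, "mappings": MAPPINGS}

print(find_undefined_analyzers(BROKEN))
print(find_undefined_analyzers(FIXED))  # []
```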
