Hi Glen,
On a related note, I have a use case where I want to search using
wildcards on a custom analyzed field. I am currently seeing some
discrepancies from what I expect.
Basically, I have string data in a field such as "Name-55", "Name-56" etc.
I want to be able to search for "Name-5*", and get these results.
I have indexed the data as terms
"Name", "-", "55"
"Name", "-", "56"
I am using a custom pattern analyzer to achieve this. I am using a similar
custom pattern analyzer for my query string, except that I am swallowing
&, ? and *.
{
  "my_template" : {
    "template" : "",
    "order": 1,
    "settings" : {
      "analysis": {
        "analyzer": {
          "custom_index": {
            "type": "pattern",
            "pattern": "([\s]+)|((?<=\p{L})(?=\P{L})|((?<=\P{L})(?=\p{L}))|((?<=\d)(?=\D))|((?<=\D)(?=\d)))"
          },
          "custom_search": {
            "type": "pattern",
            "pattern": "([?&\s]+)|((?<=\p{L})(?=\P{L})|((?<=\P{L})(?=\p{L}))|((?<=\d)(?=\D))|((?<=\D)(?=\d)))"
          }
        }
      }
    },
    "mappings" : {
      "account" : {
        "properties" : {
          "myfield" : {
            "type" : "string",
            "store" : "yes",
            "index" : "analyzed",
            "index_analyzer" : "custom_index",
            "search_analyzer" : "custom_search"
          }
        }
      }
    }
  }
}
Using this, I see that when I search for "Name-5*", I do not get any
results returned.
However, if I search for "Name- 5*" (Note additional white-space in the
search string), then I get the results Name-55 and Name-56.
Do you have an understanding of why elasticsearch may be exhibiting this
behavior? Is there some issue in the way I have set up the patterns in my
analyzer?
Your help is much appreciated!
Thanks,
On Monday, June 30, 2014 9:21:40 AM UTC-7, Glen Smith wrote:
Totally. For example:
"analyzer": {
  "default_index": {
    "tokenizer": "standard",
    "filter": ["standard", "lowercase"]
  },
  "default_search": {
    "tokenizer": "standard",
    "filter": ["standard", "lowercase", "stop"]
  },
On Monday, June 30, 2014 12:19:55 PM UTC-4, mooky wrote:
Excellent. Thanks for the info.
Is it possible to set my custom analyser as the default analyser for an
index (i.e. instead of standard_analyzer)?
-N
On Monday, 30 June 2014 14:41:10 UTC+1, Glen Smith wrote:
You can set up an analyser for your index...
...
"my-index": {
  "analysis": {
    "analyzer": {
      "default_index": {
        "tokenizer": "standard",
        "filter": ["standard", "icu_fold_filter", "stop"]
      },
      "default_search": {
        "tokenizer": "standard",
        "filter": ["standard", "icu_fold_filter", "stop"]
      },
      "custom_index": {
        "tokenizer": "whitespace",
        "filter": ["lowercase"]
      },
      "custom_search": {
        "tokenizer": "whitespace",
        "filter": ["lowercase"]
      }
    }
  }
}
...
and then map your relevant field accordingly:
{
  "_timestamp": {
    "enabled": "true",
    "store": "yes"
  },
  "properties": {
    "my_field": {
      "type": "string",
      "index_analyzer": "custom_index",
      "search_analyzer": "custom_search"
    }
  }
}
Note that you can (and often should) set up index analysis and search
analysis differently (e.g. if you use synonyms, only expand search terms).
Hope I haven't missed the point...
On Monday, June 30, 2014 8:47:36 AM UTC-4, mooky wrote:
Hi all,
I have a google-style search capability in my app that uses the _all
field with the default (standard) analyzer (I don't configure anything - so
it's Elastic's default).
There are a few cases where we don't quite get the behaviour we want,
and I am trying to work out how I tweak the analyzer configuration.
- if the user searches using 99.97, then they get the results they
  expect, but if they search using 99.97%, they get nothing. They should get
  the results that match "99.97%". The default analyzer config loses the %, I
  guess.
- I have no idea what the text is ( : ) ) but the user wants to search
  using 托克金通贸易 - which is in the data - but currently we get zero results. It
  looks like the standard analyzer/tokenizer breaks on each character.
I think I just want a whitespace analyzer with lower-casing ....
However,
a) I am not exactly sure how to configure that, and;
b) I am not 100% sure what I am losing/gaining vs standard analyzer.
(don't need stop-words - in any case default cfg for standard analyser
doesn't have any IIRC)
(FWIW, on all our other text fields, we tend to use no analyzer)
(Elastic 1.1.1 and 1.2 ...)
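To make the two problem cases above concrete, here is a rough sketch of what whitespace tokenization plus lower-casing does to them. It is plain Java rather than Lucene, so it only approximates a whitespace tokenizer followed by a lowercase filter; the class name, the method name and the surrounding sample words ("Purity", "Ltd") are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-in for "whitespace tokenizer + lowercase filter".
final class WhitespaceLowercaseDemo {
    // Split on runs of whitespace only, then lowercase each token.
    // Punctuation such as '%' and unbroken CJK text are left intact.
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String piece : text.trim().split("\\s+")) {
            if (!piece.isEmpty()) {
                tokens.add(piece.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(analyze("Purity 99.97%"));    // [purity, 99.97%]
        System.out.println(analyze("托克金通贸易 Ltd")); // [托克金通贸易, ltd]
    }
}
```

In this approximation the % stays attached to the number and the Chinese text stays as one token, so a search for either would have to match the whole whitespace-delimited token.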
Cheers.
-M