We are indexing data with various fields such as Address, Phone Number and
Street Number. The problem we are facing is that the Address field holds
space-separated values, e.g. "Oak Avenue", "Pretoria Avenue", "B Street",
"A Street".
When we search for Pretoria Avenue, we get every document that contains
either Pretoria or Avenue, but we need the search to match Pretoria Avenue
only.
We have the same problem with Phone Number and Street Number: the values
contain "-" and "." characters, and a value is split in two wherever it
contains a "-".
Which analyzers should we use, and how, to get precise results in
this case?
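To make the symptom concrete, here is a minimal sketch in plain Python of what a tokenizing analyzer does to these values. It is a simplified model, not Lucene's actual analyzer: the `analyze` and `search_any` helpers and the sample documents are made up for illustration, and the model simply lowercases and splits on every non-alphanumeric character.

```python
import re

def analyze(text):
    """Lowercase the text and split it on anything that is not a letter or digit."""
    return [tok for tok in re.split(r"[^0-9a-z]+", text.lower()) if tok]

# Hypothetical sample documents: id -> address value.
docs = {
    1: "Oak Avenue",
    2: "Pretoria Avenue",
    3: "Pretoria Street",
}

# Inverted index: token -> set of ids of the documents containing it.
index = {}
for doc_id, address in docs.items():
    for token in analyze(address):
        index.setdefault(token, set()).add(doc_id)

def search_any(query):
    """Return ids of documents that contain at least one of the query tokens."""
    hits = set()
    for token in analyze(query):
        hits |= index.get(token, set())
    return sorted(hits)

print(search_any("Pretoria Avenue"))  # every document sharing a token with the query
print(analyze("012-5550199"))         # the hyphen splits the number into two tokens
```

Because each token is indexed on its own, a query that is analyzed the same way matches any document sharing at least one token, which is the behaviour described above.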
Besides setting "index" : "not_analyzed" in your mapping, you can also do
this through a custom analyzer, if you have one.
Not Analyzed will not tokenize the string at all. The only issue I
would see in your scenario is that a search for "Oak Ave" wouldn't
come back. You might need to set up some synonyms or create a custom
analyzer.
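As a rough illustration of the not_analyzed suggestion, the sketch below builds a mapping fragment as a Python dict and then models the resulting behaviour. The type name "contact" and the field name "address" are assumptions, the mapping uses the string-field syntax of Elasticsearch versions from around the time of this thread, and `exact_search` is only a stand-in for a term lookup, not an Elasticsearch call.

```python
import json

# Mapping fragment; "contact" and "address" are hypothetical names.
mapping = {
    "contact": {
        "properties": {
            "address": {"type": "string", "index": "not_analyzed"}
        }
    }
}
print(json.dumps(mapping, indent=2))

# Model of a not_analyzed field: each value is indexed as one exact term.
addresses = {1: "Oak Avenue", 2: "Pretoria Avenue"}
exact_index = {}
for doc_id, value in addresses.items():
    exact_index.setdefault(value, set()).add(doc_id)

def exact_search(term):
    """Return ids of documents whose whole field value equals the term."""
    return sorted(exact_index.get(term, set()))

print(exact_search("Pretoria Avenue"))  # only the exact value matches
print(exact_search("Oak Ave"))          # an abbreviation finds nothing
print(exact_search("pretoria"))         # a single word finds nothing either
```

The last two lookups show the trade-off mentioned above: once the value is a single term, anything other than the full, identically written string misses.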
Thanks for the prompt response. Use of synonyms could be difficult because
the data is huge and we cannot predict the data in advance.
We will look into custom analyzers for this, because we need documents to
be searchable if I write only "pretoria".
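The two requirements here (only "Pretoria Avenue" for the full query, yet still a hit for "pretoria" alone) can be modeled by keeping the field tokenized and requiring the query tokens to appear next to each other, in order. This is phrase matching, which nobody in the thread has suggested, so treat it as an assumption about what might fit. The sketch is plain Python, not Elasticsearch query syntax, and `analyze` and `phrase_match` are illustrative helpers.

```python
import re

def analyze(text):
    """Lowercase the text and split it on anything that is not a letter or digit."""
    return [tok for tok in re.split(r"[^0-9a-z]+", text.lower()) if tok]

def phrase_match(query, value):
    """True if the query tokens occur in the value consecutively and in order."""
    q, v = analyze(query), analyze(value)
    if not q:
        return False
    return any(v[i:i + len(q)] == q for i in range(len(v) - len(q) + 1))

# Hypothetical sample values.
addresses = ["Oak Avenue", "Pretoria Avenue", "Pretoria Street"]

print([a for a in addresses if phrase_match("Pretoria Avenue", a)])
print([a for a in addresses if phrase_match("pretoria", a)])
```

With this model the full query returns a single address, while the one-word query still returns both Pretoria entries.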
On Thursday, 5 July 2012 14:27:28 UTC+2, Shaun Farrell wrote:
In your mapping you can put
"index" : "not_analyzed"
Hi Shaun,
The multi_field type needs two searchable fields; in the example, one is
name and the second is name.untouched. This is not suitable in our
case because we search on a single field.
Is there any alternative that would satisfy our requirement? I tried to
write a custom analyzer: which filters would make the field behave like
not_analyzed while still being searchable by parts of the field string?
Thanks in advance.
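For reference, the two options discussed in this message can be sketched as index settings built from Python dicts. The names "contact", "address" and "lowercase_keyword" are hypothetical, and the layout follows the multi_field and custom-analyzer syntax of Elasticsearch versions from around the time of this thread as best I recall it, so check it against the documentation for your version. The `keyword_lowercase` function is only a model of what a keyword tokenizer followed by a lowercase filter would emit.

```python
import json

index_body = {
    "settings": {
        "analysis": {
            "analyzer": {
                # Custom analyzer: whole value as one token, lowercased.
                "lowercase_keyword": {
                    "type": "custom",
                    "tokenizer": "keyword",
                    "filter": ["lowercase"],
                }
            }
        }
    },
    "mappings": {
        "contact": {
            "properties": {
                "address": {
                    "type": "multi_field",
                    "fields": {
                        # Tokenized variant: matches single words such as "pretoria".
                        "address": {"type": "string", "index": "analyzed"},
                        # Single-token variant: matches the whole value only.
                        "untouched": {"type": "string", "analyzer": "lowercase_keyword"},
                    },
                }
            }
        }
    },
}
print(json.dumps(index_body, indent=2))

def keyword_lowercase(value):
    """Model of keyword tokenizer + lowercase filter: one lowercased token."""
    return [value.lower()]

print(keyword_lowercase("Pretoria Avenue"))
```

Note that a single-token analyzer on its own does not make parts of the string searchable; in this sketch that job is left to the tokenized variant of the field.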