I've been using Elasticsearch for a while, but I haven't really dug
into the text search side of things. I went to implement really simple
text searching earlier today and got a bit confused.
From what I understand I can't do a "Whole text value is equal to
search parameter" query because the text was analysed when it was
indexed and doesn't exist in its whole form? If someone could explain
how to get the following working I'd be very grateful:
Index a piece of text:
"Lorem ipsum dolor sit amet"
And be able to retrieve the document with the following queries:
Phrase: "ipsum dolor"
Contains: "sit"
EqualTo: "Lorem ipsum dolor sit amet"
But not retrieve the document for the following queries:
Phrase: "dolor ipsum"
EqualTo: "dolor sit"
To avoid confusion, I don't want users to be able to use operator
keywords in the search strings, so words like "and" and "or" should
either be ignored or searched for literally rather than treated as
operators.
Doing an "equal to" on an analyzed field that gets broken down into
multiple terms is not really possible, but doing phrase searches, or sloppy
phrase searches, is certainly possible.
For exact matching, you could have a multi-field mapping, with a
non-analyzed option on the field, and search against that.
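To make the suggestion concrete, here is a minimal Python sketch that models the idea rather than calling Elasticsearch. One copy of the value is tokenized (standing in for the analyzed field) and one copy is kept whole (standing in for the non-analyzed field). The names `analyze`, `Doc`, `contains`, `phrase` and `equal_to` are hypothetical, and the lowercase-and-split tokenizer is an assumption, not the behaviour of any particular analyzer.

```python
def analyze(text):
    """Toy analyzer (an assumption): lowercase, then split on whitespace."""
    return text.lower().split()


class Doc:
    """Holds one value twice, mimicking a multi-field mapping."""

    def __init__(self, text):
        self.raw = text              # non-analyzed copy, kept whole
        self.tokens = analyze(text)  # analyzed copy, an ordered list of terms


def contains(doc, query):
    # Every query term must occur somewhere in the analyzed copy.
    terms = analyze(query)
    return bool(terms) and all(t in doc.tokens for t in terms)


def phrase(doc, query):
    # Query terms must occur consecutively and in the same order.
    terms = analyze(query)
    n = len(terms)
    if n == 0:
        return False
    return any(doc.tokens[i:i + n] == terms
               for i in range(len(doc.tokens) - n + 1))


def equal_to(doc, query):
    # Exact comparison against the non-analyzed copy.
    return doc.raw == query


doc = Doc("Lorem ipsum dolor sit amet")
print(phrase(doc, "ipsum dolor"))                    # True
print(contains(doc, "sit"))                          # True
print(equal_to(doc, "Lorem ipsum dolor sit amet"))   # True
print(phrase(doc, "dolor ipsum"))                    # False
print(equal_to(doc, "dolor sit"))                    # False
```

In this model the query type decides which copy is consulted: phrase and contains queries go to the tokenized copy, equal-to goes to the untouched one. Because the query string is only ever tokenized or compared, words such as "and" and "or" are treated as ordinary text and never as operators.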
I appreciate the time you've taken to answer, Shay. Elasticsearch is
great.
I'm using the Java API.
How would I go about producing multiple mappings for a single field?
Or do you mean having two fields with only one being analysed? And how
do I go about choosing which mapping to use when querying/filtering?
You say "doing phrase searches, or sloppy phrase searches are
certainly possible." Which of my examples do these relate to and how
would I go about implementing this?
I may have misunderstood, but if phrase searches are possible (and by
this I mean "ipsum dolor" is matched, but "dolor ipsum" is not in the
above example), then why is a full "equal to" not possible on the same
analysed field?
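One way to look at this question, sketched in Python under stated assumptions: an analyzed field is searched through terms and their positions, not through the original string. A phrase query only checks that the terms sit at consecutive positions, so running the whole text as a phrase also matches any longer value that contains it. The toy positional index below is a model, not Elasticsearch code; `build_index`, `phrase_search` and the second sample document are hypothetical.

```python
from collections import defaultdict


def build_index(docs):
    """Map each term to {doc_id: [positions]} using a toy whitespace analyzer."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, term in enumerate(text.lower().split()):
            index[term].setdefault(doc_id, []).append(pos)
    return index


def phrase_search(index, query):
    """Return ids of docs where the query terms occur at consecutive positions."""
    terms = query.lower().split()
    if not terms or any(t not in index for t in terms):
        return set()
    hits = set()
    # Try every position of the first term as a possible phrase start.
    for doc_id, starts in index[terms[0]].items():
        for start in starts:
            if all(start + k in index[t].get(doc_id, [])
                   for k, t in enumerate(terms)):
                hits.add(doc_id)
    return hits


docs = {
    1: "Lorem ipsum dolor sit amet",
    2: "Lorem ipsum dolor sit amet consectetur",  # hypothetical longer value
}
index = build_index(docs)

print(phrase_search(index, "ipsum dolor"))                  # both documents
print(phrase_search(index, "dolor ipsum"))                  # no documents
print(phrase_search(index, "Lorem ipsum dolor sit amet"))   # both documents
```

The last query is the point: document 2 is not equal to the search string, yet it matches the phrase. Ruling it out needs either the untouched value or some extra information such as the token count, which is why a separate non-analyzed copy is the simpler route for "equal to".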