I have keywords like: 'ABC - DEF', 'ABC DEF', 'ABC is there any combination with DEF'.
Exact match: "ABC-DEF" gives the expected result ("ABC-DEF").
Partial match: "ABC-DE" returns too many results (all 3 documents). The last one should not appear in the search results.
Could you provide a full recreation script as described in "About the Elasticsearch category"? It will help us better understand what you are doing. Please try to keep the example as simple as possible.
A full reproduction script will help readers to understand, reproduce and if needed fix your problem. It will also most likely help to get a faster answer.
What kind of full reproduction script do you want? Do you mean the mappings, the settings, or the content?
DELETE testindex
PUT testindex
PUT testindex/_mapping
{
  "properties": {
    "filename": {
      "type": "text",
      "fields": {
        "keyword": {
          "type": "keyword",
          "ignore_above": 256
        }
      }
    }
  }
}
POST testindex/_doc
{
  "filename": "704140-0001 FIT TO P1.pdf"
}
POST testindex/_doc
{
  "filename": "704140 FIT TO P4 MARCH 1994 DATA SUBMITTLE -0051.pdf"
}
POST testindex/_doc
{
  "filename": "704140 FIT TO P1 OCTOBER 2003 POINT OF FIT -0001.pdf"
}
POST _sql?format=txt
{
  "query": """ SELECT filename FROM testindex WHERE QUERY('(filename:("704140"))')"""
}
This gives the expected exact-match result.
POST _sql?format=txt
{
  "query": """ SELECT filename FROM testindex WHERE QUERY('(filename:(70414*))')"""
}
This gives the expected partial-match result for a single word.
POST _sql?format=txt
{
  "query": """ SELECT filename FROM testindex WHERE QUERY('(filename:("704140-0001"))')"""
}
This gives the expected exact-match result.
POST _sql?format=txt
{
  "query": """ SELECT filename FROM testindex WHERE QUERY('(filename:(704140-00*))')"""
}
Here is a way to fix it. Note that with this analyzer it's now case sensitive.
If you don't want that, you'll need to provide a custom analyzer.
Also 704140-0001 won't match anymore when you query with 704140.
DELETE testindex
PUT testindex
{
  "mappings": {
    "properties": {
      "filename": {
        "type": "text",
        "analyzer": "whitespace"
      }
    }
  }
}
POST testindex/_doc
{
  "filename": "704140-0001 FIT TO P1.pdf"
}
POST testindex/_doc
{
  "filename": "704140 FIT TO P4 MARCH 1994 DATA SUBMITTLE -0051.pdf"
}
POST testindex/_doc
{
  "filename": "704140 FIT TO P1 OCTOBER 2003 POINT OF FIT -0001.pdf"
}
POST _sql?format=txt
{
  "query": """ SELECT filename FROM testindex WHERE QUERY('(filename:("704140"))')"""
}
POST _sql?format=txt
{
  "query": """ SELECT filename FROM testindex WHERE QUERY('(filename:(70414*))')"""
}
POST _sql?format=txt
{
  "query": """ SELECT filename FROM testindex WHERE QUERY('(filename:(704140-00*))')"""
}
For more information, use the _analyze API to understand how everything works behind the scenes.
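To make the effect of the whitespace analyzer concrete, here is a rough Python sketch. It is not Elasticsearch code: it only approximates the analyzer by splitting on whitespace (no lowercasing, hyphens and dots stay inside the token), and the helper names `whitespace_tokens`, `match_exact` and `match_prefix` are made up for this illustration. The _analyze API remains the authoritative way to see the real tokens.

```python
# Rough stand-in for the whitespace analyzer (an approximation, not
# Elasticsearch itself): split on whitespace only, keep case, and keep
# hyphens and dots inside the token.
def whitespace_tokens(text):
    return text.split()


# The three filenames indexed in the script above.
FILENAMES = [
    "704140-0001 FIT TO P1.pdf",
    "704140 FIT TO P4 MARCH 1994 DATA SUBMITTLE -0051.pdf",
    "704140 FIT TO P1 OCTOBER 2003 POINT OF FIT -0001.pdf",
]


def match_exact(term, filenames=FILENAMES):
    """Filenames that contain a token exactly equal to `term`."""
    return [f for f in filenames if term in whitespace_tokens(f)]


def match_prefix(prefix, filenames=FILENAMES):
    """Filenames with at least one token starting with `prefix` (like `prefix*`)."""
    return [
        f for f in filenames
        if any(tok.startswith(prefix) for tok in whitespace_tokens(f))
    ]


if __name__ == "__main__":
    print(match_prefix("704140-00"))  # only the first filename
    print(match_exact("704140"))      # the second and third, not the first
    print(match_exact("fit"))         # nothing: tokens keep their case
```

Under this approximation, `704140-0001` stays a single token, which is why `704140-00*` now matches only the first document, and also why a plain `704140` no longer matches it.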
PUT testindex/_mapping
{
  "properties": {
    "filecontent": {
      "type": "text"
    },
    "filename": {
      "type": "text",
      "fields": {
        "keyword": {
          "type": "keyword",
          "ignore_above": 256
        }
      }
    }
  }
}
POST testindex/_doc
{
  "filename": "704140-0001 FIT TO P1.pdf",
  "filecontent": "This is PO : 000704140"
}
POST testindex/_doc
{
  "filename": "704140 FIT TO P4 MARCH 1994 DATA SUBMITTLE -0051.pdf",
  "filecontent": "This is PO : 704140"
}
POST _sql?format=txt
{
  "query": """ SELECT filename FROM testindex WHERE QUERY('(filecontent:(704140))') """
}
POST _sql?format=txt
{
  "query": """ SELECT filename FROM testindex WHERE QUERY('(filecontent:(704140)) OR (filename:(704140))') """
}
When I search for a document with the keyword "704140-0001 FIT TO P1.pdf", it should search in filename (keyword).
When I search for a document with the keyword "704140", it should search in filename (keyword and text).
My mappings are the same as above; my only question is how to do exact and partial searches with a keyword like "704140-00*".
This already works in our Sphinx environment. Before moving to Elasticsearch, I want to clear up my doubt: will it work or not? If it does, how can I achieve it?
The user has one text input field, content, and a checkbox, include_filename_too, in the form.
If the user searches for foo without include_filename_too checked, it should search for foo only in filecontent.
If the user searches for foo with include_filename_too checked, it should search for foo in filecontent and filename.
Am I correct?
If I am correct, I don't understand how this works:
How do you know whether you should search (keyword) or (keyword and text)?
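Assuming the form really works the way it is restated above, the checkbox logic could be sketched as a small Python helper that builds a `_search` request body. The function name `build_search_body` is hypothetical; the field names come from the mapping earlier in the thread, and `query_string` is used because it is the query type that the SQL `QUERY()` function maps to.

```python
def build_search_body(text, include_filename_too=False):
    """Build a query_string request body for the _search endpoint.

    Always searches `filecontent`; adds `filename` only when the
    checkbox is ticked.
    """
    fields = ["filecontent"]
    if include_filename_too:
        fields.append("filename")
    return {"query": {"query_string": {"query": text, "fields": fields}}}


if __name__ == "__main__":
    print(build_search_body("foo"))
    print(build_search_body("foo", include_filename_too=True))
```

This only covers which fields are searched; it does not answer how the application would decide between the keyword and text variants of filename.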
Yes, but maybe you could also think about giving the user a better experience, I mean one where the user doesn't have to click a checkbox like include_filename_too. Just search everywhere and use the scoring system to show the most relevant results first, just like you do every day with Google.
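One possible way to "search everywhere" is a `multi_match` query over both fields, with a boost so that filename hits score higher. The sketch below only builds the request body in Python; the helper name `build_search_everywhere_body` and the boost value of 2.0 are assumptions for illustration, not something recommended in this thread.

```python
def build_search_everywhere_body(text, filename_boost=2.0):
    """Build a multi_match request body that searches both fields.

    `filename^<boost>` uses the standard per-field boost syntax, so a
    match in the filename contributes more to the relevance score.
    """
    return {
        "query": {
            "multi_match": {
                "query": text,
                "fields": [f"filename^{filename_boost}", "filecontent"],
            }
        }
    }


if __name__ == "__main__":
    print(build_search_everywhere_body("704140"))
```

The resulting dictionary can be sent as JSON to `testindex/_search`; tuning the boost is left to whatever ranking suits the application.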