Search: exact and partial match

Hi Support,

I have keywords like: 'ABC - DEF', 'ABC DEF', 'ABC is there any combination with DEF'.

Exact match: "ABC-DEF" - it gives the perfect result ("ABC-DEF").
Partial match: "ABC-DE" - it gives too many results (all 3 documents). The last one should not come up in the search.

Can you please help me with the above query?

How can I get the perfect result?

Thanks

Could you provide a full recreation script as described in About the Elasticsearch category. It will help to better understand what you are doing. Please, try to keep the example as simple as possible.

A full reproduction script will help readers to understand, reproduce and if needed fix your problem. It will also most likely help to get a faster answer.

Hi David,

Thanks for the reply.

What type of full reproduction script do you want? Do you mean mappings, settings, or content?

DELETE testindex

PUT testindex

PUT testindex/_mapping
{
  "properties": {
    "filename": {
      "type": "text",
      "fields": {
        "keyword": {
          "type": "keyword",
          "ignore_above": 256
        }
      }
    }
  }
}
POST testindex/_doc
{
  "filename": "704140-0001 FIT TO P1.pdf"
}
POST testindex/_doc
{
  "filename": "704140 FIT TO P4 MARCH 1994 DATA SUBMITTLE -0051.pdf"
}
POST testindex/_doc
{
  "filename": "704140 FIT TO P1 OCTOBER 2003 POINT OF FIT -0001.pdf"
}


POST _sql?format=txt
{
"query":""" SELECT filename FROM testindex WHERE QUERY('(filename:("704140"))')"""
}

It gives a perfect exact-match result.

POST _sql?format=txt
{
"query":""" SELECT filename FROM testindex WHERE QUERY('(filename:(70414*))')"""
}

It gives a partial-match result for a single word.

POST _sql?format=txt
{
"query":""" SELECT filename FROM testindex WHERE QUERY('(filename:("704140-0001"))')"""
}

It gives a perfect exact-match result.

POST _sql?format=txt
{
"query":""" SELECT filename FROM testindex WHERE QUERY('(filename:(704140-00*))')"""
}

It gives no result.

I want a partial-match search here too. How is that possible?

Can you please help me?

Thanks

Any hope?

Here is a way to fix it. Note that with the whitespace analyzer the search is now case sensitive.
If you don't want that, you'll need to provide a custom analyzer.
Also, 704140-0001 won't match anymore when you query with 704140, because the whole of 704140-0001 is now indexed as a single term.

DELETE testindex
PUT testindex
{
  "mappings": {
    "properties": {
      "filename": {
        "type": "text",
        "analyzer": "whitespace"
      }
    }
  }
}
POST testindex/_doc
{
  "filename": "704140-0001 FIT TO P1.pdf"
}
POST testindex/_doc
{
  "filename": "704140 FIT TO P4 MARCH 1994 DATA SUBMITTLE -0051.pdf"
}
POST testindex/_doc
{
  "filename": "704140 FIT TO P1 OCTOBER 2003 POINT OF FIT -0001.pdf"
}


POST _sql?format=txt
{
"query":""" SELECT filename FROM testindex WHERE QUERY('(filename:("704140"))')"""
}
POST _sql?format=txt
{
"query":""" SELECT filename FROM testindex WHERE QUERY('(filename:(70414*))')"""
}
POST _sql?format=txt
{
"query":""" SELECT filename FROM testindex WHERE QUERY('(filename:(704140-00*))')"""
}
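
If the case sensitivity mentioned above is a problem, one possible custom analyzer combines the whitespace tokenizer with the lowercase token filter. The Python sketch below only builds the request body for `PUT testindex`; the analyzer name `whitespace_lowercase` and the helper `build_index_body` are made-up names for illustration, and sending the request is left to whichever client you use.

```python
import json
from typing import Dict


def build_index_body(analyzer_name: str = "whitespace_lowercase") -> Dict:
    """Build an index body whose `filename` field uses a custom analyzer.

    The analyzer (hypothetical name) splits on whitespace only and then
    lowercases each token, so "704140-0001" stays one term while
    "FIT" and "fit" are indexed identically.
    """
    return {
        "settings": {
            "analysis": {
                "analyzer": {
                    analyzer_name: {
                        "type": "custom",
                        "tokenizer": "whitespace",
                        "filter": ["lowercase"],
                    }
                }
            }
        },
        "mappings": {
            "properties": {
                "filename": {"type": "text", "analyzer": analyzer_name}
            }
        },
    }


if __name__ == "__main__":
    # Body to send with: PUT testindex
    print(json.dumps(build_index_body(), indent=2))
```

This is a sketch rather than a tested mapping for your data, so check the resulting tokens on your own cluster before relying on it.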

For more information, use the _analyze API to see which tokens each analyzer produces and so understand how everything works behind the scenes.
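
To illustrate why `704140-00*` finds nothing with the default mapping but works with the whitespace analyzer, here is a small Python sketch. It does not call Elasticsearch: `standard_like_tokens` is only a rough approximation of the default `standard` analyzer (an assumption for illustration), and the real tokens should be checked with the _analyze API.

```python
import re
from typing import List


def standard_like_tokens(text: str) -> List[str]:
    # Rough stand-in for the default `standard` analyzer (assumption):
    # split on anything that is not a letter or digit, then lowercase.
    return [token.lower() for token in re.findall(r"[A-Za-z0-9]+", text)]


def whitespace_tokens(text: str) -> List[str]:
    # Stand-in for the `whitespace` analyzer: split on whitespace only,
    # keeping case and punctuation.
    return text.split()


def prefix_matches(tokens: List[str], prefix: str) -> bool:
    # A trailing-wildcard query such as 704140-00* is compared with the
    # individual indexed terms, not with the original string.
    return any(token.startswith(prefix) for token in tokens)


if __name__ == "__main__":
    filename = "704140-0001 FIT TO P1.pdf"
    for name, tokens in (
        ("standard-like", standard_like_tokens(filename)),
        ("whitespace", whitespace_tokens(filename)),
    ):
        print(name, tokens, prefix_matches(tokens, "704140-00"))
```

With the standard-like split, the hyphen separates `704140` and `0001` into two terms, so no single term starts with `704140-00`. With the whitespace split, `704140-0001` survives as one term, which is also why a plain `704140` query no longer matches that document.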

Thanks David,

Actually, I want to use the filename field for both full-text search and keyword search.

Thanks

Any example of the text you want to search for and the result you want?

Example here:

PUT testindex/_mapping
{
  "properties": {
    "filecontent": {
      "type": "text"
    },
    "filename": {
      "type": "text",
      "fields": {
        "keyword": {
          "type": "keyword",
          "ignore_above": 256
        }
      }
    }
  }
}

POST testindex/_doc
{
  "filename": "704140-0001 FIT TO P1.pdf",
  "filecontent": "This is PO : 000704140"
}
POST testindex/_doc
{
  "filename": "704140 FIT TO P4 MARCH 1994 DATA SUBMITTLE -0051.pdf",
  "filecontent": "This is PO : 704140"
}

POST _sql?format=txt
{
"query":""" SELECT filename FROM testindex WHERE QUERY('(filecontent:(704140))') """
}

POST _sql?format=txt
{
"query":""" SELECT filename FROM testindex WHERE QUERY('(filecontent:(704140)) OR (filename:(704140))') """
}

Thanks

Hi David,

I have provided an example in the reply above.
Can you please help me with it?

Thanks

What do you expect to get back?

Hi David,

I expect the following:

  1. When I search for a document with the keyword "704140-0001 FIT TO P1.pdf", it should search in filename (keyword).

  2. When I search for a document with the keyword "704140", it should search in filename (keyword and text).

I have the same mappings as above. My only question is how to run exact and partial searches with a keyword like "704140-00*".

It already works in our Sphinx environment. Before moving to Elasticsearch, I want to clear up my doubt: will it work or not? If it works, how can I achieve it?

Thanks

Any Solution?

How do you determine in which field you should search based on the user input?

I mean that I have a hard time understanding the use case.
You have one search box, right?

Why should it search in different fields in one case or another?
Why not search in both fields every time?

The more fields that match, the better the score will be, so the best documents will be at the top of the response list.

Hi David,

Thanks for the reply.

How do you determine in which field you should search based on the user input?

Yes, based on the user's selection.
I provide 2 input boxes with a check box [include filename too], so the user can search in the filename too.

You have one search box, right?

Yes

Why should it search in different fields in one case or another?

Because we have existing search functionality built on Sphinx search.

Why not search in both fields every time?

The user needs to search in the content, in the filename, or in both fields.

I hope it is clear now.

Can you please point me in the right direction?

What should the mapping be for exact and partial match?

Cases:
1 - If the search type is Partial with "ABCD-DEFG", the result should include "ABCD-DEFG" and "ABCD-DEFGHI".

2 - If the search type is Exact with "ABCD-DEFG", the result should include only "ABCD-DEFG".

Thanks

So if I'm trying to sum up:

  • The user has 1 input text field, content, and a check box, include_filename_too, in the form.
  • If the user searches for foo without include_filename_too checked, it should search for foo only in filecontent.
  • If the user searches for foo with include_filename_too checked, it should search for foo in filecontent and filename.

Am I correct?

If I am correct, I don't understand how this works:

How do you know that you should search (keyword) or (keyword and text)?

Yeah, but maybe you could also think about giving the user a better experience, without having the user click on a checkbox like include_filename_too. Just search everywhere and use the scoring system to show the most relevant results first, just like you do every day with Google.
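
Both variants discussed here (checkbox-driven or search everywhere) can be expressed with a single `multi_match` query. The Python sketch below only builds the request body; `build_search_body` is a hypothetical helper, and the choice of `most_fields` is an assumption made so that documents matching in more fields score higher.

```python
import json
from typing import Dict, List


def build_search_body(text: str, include_filename: bool = True) -> Dict:
    """Build a search body over filecontent and, optionally, filename."""
    fields: List[str] = ["filecontent"]
    if include_filename:
        # Mirrors the include_filename_too check box from the form.
        fields.append("filename")
    return {
        "query": {
            "multi_match": {
                "query": text,
                "fields": fields,
                # most_fields combines the scores of every matching field.
                "type": "most_fields",
            }
        }
    }


if __name__ == "__main__":
    # Body to send with: POST testindex/_search
    print(json.dumps(build_search_body("704140"), indent=2))
```

Leaving `include_filename` at its default gives the search-everywhere behaviour; passing `False` restricts the query to filecontent.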

Yes, you understood perfectly.

Now I just need to understand whether the above mapping will work.

With these cases:
1 - If the search type is Partial with "ABCD-DEFG", the result should include "ABCD-DEFG" and "ABCD-DEFGHI".

2 - If the search type is Exact with "ABCD-DEFG", the result should include only "ABCD-DEFG".

Thanks
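
For the two cases listed above, one possible approach is a `term` query for Exact and a `prefix` query for Partial. This is a minimal sketch, assuming that filename is mapped with the whitespace analyzer shown earlier in the thread (so "ABCD-DEFG" is indexed as a single, case-sensitive term) and that the input is one token without spaces; `build_filename_query` is a hypothetical helper that only builds the request body.

```python
import json
from typing import Dict


def build_filename_query(text: str, search_type: str) -> Dict:
    """Build an exact or partial query body against the filename field.

    Assumes filename uses the whitespace analyzer, so a value such as
    "ABCD-DEFG" is stored as one term.
    """
    if search_type == "exact":
        # Matches only documents containing exactly this term.
        return {"query": {"term": {"filename": {"value": text}}}}
    if search_type == "partial":
        # Matches any term starting with the text, e.g. "ABCD-DEFGHI".
        return {"query": {"prefix": {"filename": {"value": text}}}}
    raise ValueError("search_type must be 'exact' or 'partial'")


if __name__ == "__main__":
    # Bodies to send with: POST testindex/_search
    print(json.dumps(build_filename_query("ABCD-DEFG", "exact")))
    print(json.dumps(build_filename_query("ABCD-DEFG", "partial")))
```

With the default `standard` analyzer the hyphenated value is split into separate terms, so neither query would behave this way on the original mapping.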

But you did not answer this question:
