Help with search on website

I am building a website search based on Elasticsearch. I am very new to Elasticsearch, so I would be happy for any help.

This is the structure of my Elasticsearch content:

hits:   
    0:  
        _index: "mysite"
        _type:  "products"
        _id:    "1"
        _score: 1
        _source:    
            name:   
                0:  "This is name"
            number: "N6"
            status: "Y"

And this is my Perl script:

use strict;
use warnings;
use Search::Elasticsearch;

# $query holds the user's search text
my $e = Search::Elasticsearch->new();

my $results = $e->search(
    index => 'mysite',
    type  => 'products',
    size  => 3,
    body  => {
        query => {
            match => { name => $query }
        }
    }
);

I would like to ask how to change the script so that it matches only products that have the status "Y". It should also find substrings of the name: for example, if $query contains "is", it should find this product because its name "This is name" contains "is". Finally, it should search the number too: for example, if $query contains "N6", it should find this product as well.

So something like this: (name OR number) AND status = "Y"

Thank you very much.

Hi,

Hope you enjoy your first test ride with Elasticsearch so far. It looks like at this point you should get familiar with a few basic queries and how to combine them.

For exact term matches (like matching the status "Y" and the number "N6") the usual first "go-to" query is the Term Query that matches documents that contain the exact term specified.

For "full text" search (that's what you probably want for the "name" field) you should start looking at the Match Query or maybe the Query String Query. They have a few more options, so I'd suggest reading up on them and trying some of them out.

For the combination of queries you typically start with the Bool Query, which takes any of the above queries as sub-queries. In your example you probably need to put the term query on the status field in a "filter" clause (selecting only documents with status "Y") and the queries on number and name in the "should" clause, although there are a few other options to try out here depending on your use case. This gets a bit more involved, but the above should get you started for now.
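
A minimal sketch of the combination described above, written as a Python dict that serializes to the JSON request body (Python is used here only so the structure can be printed and checked without a cluster). The helper name is hypothetical, and the `minimum_should_match` setting is an assumption that is not part of the post: it is included because, when a `filter` clause is present, `should` clauses are otherwise optional and only influence scoring. The `filter` clause of the bool query needs Elasticsearch 2.0 or later, and whether the exact `term` values match depends on how the fields are mapped and analyzed.

```python
import json


def build_product_query(text, status="Y"):
    """Build a bool query meaning: status == `status` AND (name OR number matches `text`).

    Hypothetical helper for illustration; the field names come from the question.
    """
    return {
        "query": {
            "bool": {
                # Filter context: yes/no match, does not contribute to the score.
                "filter": {"term": {"status": status}},
                # Alternatives: full-text match on name, exact term on number.
                "should": [
                    {"match": {"name": text}},
                    {"term": {"number": text}},
                ],
                # Assumption: require at least one alternative to match.
                "minimum_should_match": 1,
            }
        }
    }


body = build_product_query("N6")
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the search request body from any client, which keeps the query independent of the client language.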

Hope this helps.

@cbuescher Thank you very much for the reply. For now I have something like:

           "query" => {
                "bool" => {
                    "filter" => [{
                        "should" => [{
                            "match" => {"name" => "is" },
                            "term" => {"number" => "N6"}
                        }],
                        "must" => {
                            "term" => {"status" => "Y"}
                        }
                    }]
                }
            } 

But I get an error: SearchPhaseExecutionException[Failed to execute phase [query], all shards failed. The error is really long, over 4600 characters.

Still, it's hard to say what's wrong without seeing the error. I think it's the match query, although I'm not familiar with the syntax of the client code you are using. Any chance you can either share the error or post the query in the REST syntax?

@cbuescher I uploaded it to pastebin --> https://pastebin.com/fZBm41NU

Maybe bool query does not support [filter]]; ?

Yes, that's it. I have a hard time reading the query in the provided syntax (it's the PHP client, no?), so I didn't see this in your example. The bool query syntax (in the JSON REST format) is:

"query": {
  "bool": {
    "must": {
      ...
    },
    "filter": {
      ...
    },
    "must_not": {
      ...
    },
    "should": [
      ...
    ]
  }
}

So your must branch actually should be a filter.

@cbuescher Must should be a filter? So:

    "query" => {
        "bool" => {
            "must" => [{
                "should" => [{
                    "match" => {"name" => "is" },
                    "term" => {"number" => "N6"}
                }],
                "filter" => {
                    "term" => {"status" => "Y"}
                }
            }]
        }
    }

Because that throws another error: No query registered for [filter]];

I think at this point you should read the documentation for the search REST API and how you translate this to your client language of choice.

https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl.html
https://www.elastic.co/guide/en/elasticsearch/client/php-api/current/_search_operations.html

@cbuescher By the way, it is Perl, not PHP :) But I created a variable that contains the query in JSON format. I think that is better because it is independent of the client language.

So my query looks like:

my $json = '{
    "query" : {
        "bool" : {
            "must" : [{
                "should" : [{
                    "match" : {"name" : "is" },
                    "term" : {"number" : "N6"}
                }],
                "filter" : {
                    "term" : {"status" : "Y"}
                }
            }]
        }
    }
}'; 

But there is an error: No query registered for [should]]; So this is an Elasticsearch error, not a Perl one.

That's why I pointed you to the documentation for the Bool Query. It will tell you what's wrong with the above example. (Hint: the should and filter clauses need to move to another level in the query syntax tree.)

@cbuescher I read the documentation and changed my code to:

    "query" : {
        "bool" : {
            "must" : {
                "term" : {"status" : "Y"}
            },
            "filter": {
                "term" : {"status" : "Y"}
            },
            "should" : [
                {   "match" : { "name" : "is"   }   },
                {   "term" :  { "number" : "N6" }   }
            ]
        }
    }

But I get the error: [bool] query does not support [filter]]; Why? In the documentation, filter is nested inside bool too.

I don't get that error using that query. Which version of Elasticsearch are you using? The documentation refers to 5.6.0, the "filter" in the bool query has been there since 2.0.0 I think.

  "status" : 200,
  "name" : "db-*",
  "version" : {
    "number" : "1.2.1",
    "build_hash" : "6c95b759f9e7ef0f8e17f77d850da43ce8a4b364",
    "build_timestamp" : "2014-06-03T15:02:52Z",
    "build_snapshot" : false,
    "lucene_version" : "4.8"
  },
  "tagline" : "You Know, for Search"

So it is 1.2.1, haha, sorry, I have to upgrade :)
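
For context, a sketch of why 1.2.1 rejected the query: the `filter` clause of the bool query only exists from 2.0 on. On 1.x the same intent was, as far as I recall, expressed by wrapping the scoring part in a `filtered` query, which was later deprecated and then removed. The helper name below is hypothetical, the body is shown as a Python dict that serializes to JSON, and upgrading remains the better route.

```python
import json


def build_filtered_query_1x(text, status="Y"):
    """Hypothetical helper: Elasticsearch 1.x style body using the `filtered` query."""
    return {
        "query": {
            "filtered": {
                # Scoring part: a bool query with only should clauses,
                # so at least one of them has to match.
                "query": {
                    "bool": {
                        "should": [
                            {"match": {"name": text}},
                            {"term": {"number": text}},
                        ]
                    }
                },
                # Non-scoring restriction on status.
                "filter": {"term": {"status": status}},
            }
        }
    }


legacy_body = build_filtered_query_1x("N6")
print(json.dumps(legacy_body, indent=2))
```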

@cbuescher
I upgraded it and now there are no errors, but it still doesn't work. This is the JSON:

    "query" : {
        "bool" : {
            "must" : {
                "term" : {"status" : "Y"}
            },
            "filter": {
                "term" : {"status" : "Y"}
            },
            "should" : [
                {   "match" : { "name" : "sheet"   }   },
                {   "term" :  { "number" : "N5" }   }
            ]
        }
    }

If must and filter are present, it doesn't find anything at all. If I leave only should, it finds something, but for example a match on name with "sheet" doesn't find "worksheet"; it should find substrings. The term query on number doesn't find anything either.

Hi,

There are a couple of things to note here, most of which have to do with the topics of Mappings and Analysis.

If you don't specify a mapping for string fields, they are indexed as analyzed "text" fields, which involves splitting the string into tokens and, for example, lowercasing them. This is the case for the "status" and "number" fields here. So the letter "Y" gets lowercased to "y", and that's what the term query would have to match. There are two options:

  1. Specify the mapping for the "status" and "number" fields to be "keyword" so they don't get analyzed.
  2. By default, a subfield called "keyword" is added to each string field that doesn't have an explicit mapping. You can use that in your term query, which would then target "status.keyword".
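
Option 1 could look roughly like the following index-creation body, shown as a Python dict that serializes to JSON. It assumes a 5.x/6.x cluster where mappings are still nested under a type name; the index name `test` and the type name `t` are placeholders, and on 7.x and later the type level is dropped.

```python
import json

# Body for creating the index, e.g. `PUT /test` (index and type names are placeholders).
index_body = {
    "mappings": {
        "t": {  # mapping type; assumed 5.x/6.x layout
            "properties": {
                "name": {"type": "text"},       # analyzed for full-text search
                "status": {"type": "keyword"},  # stored verbatim, "Y" stays "Y"
                "number": {"type": "keyword"},  # stored verbatim, "N6" stays "N6"
            }
        }
    }
}

print(json.dumps(index_body, indent=2))
```

With such a mapping, term queries would target `status` and `number` directly, without the `.keyword` suffix.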

Also, the standard analyzer splits input text into whole-word tokens (roughly at whitespace and punctuation), so you cannot find substrings with the "match" query. You can use other query types for those cases (e.g. Prefix queries). Not all of them are advisable for every use case though, so it's hard to give general advice here.
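
To make the effect of analysis concrete, here is a toy model in Python. It is a deliberate simplification and not the real standard analyzer: it only splits on word characters and lowercases. All function names are made up for illustration.

```python
import re


def toy_analyze(text):
    """Very rough stand-in for the standard analyzer: split into words, lowercase."""
    return [token.lower() for token in re.findall(r"\w+", text)]


def term_matches(indexed_text, term):
    """A term query compares the term, unanalyzed, against the indexed tokens."""
    return term in toy_analyze(indexed_text)


def match_matches(indexed_text, query_text):
    """A match query analyzes the query text too, then looks for any shared token."""
    return bool(set(toy_analyze(query_text)) & set(toy_analyze(indexed_text)))


print(toy_analyze("the quick brown fox"))
print(term_matches("Y", "Y"))                          # False: the index holds "y"
print(term_matches("Y", "y"))                          # True
print(match_matches("the quick brown fox", "Brown"))   # True: whole token
print(match_matches("the quick brown fox", "row"))     # False: substring only
```

In this model "row" never matches because the index only contains the whole token "brown", which is the same reason a match on "sheet" does not find "worksheet".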

Here's an example for a document and a matching query. I'd suggest reading up on the above topics (analysis, mappings), they are really central to understanding how search works in general.

PUT /test/t/1
{
  "name" : "the quick brown fox",
  "status" : "Y",
  "number" : "N6"
}

POST /test/t/_search
{
  "query": {
    "bool": {
      "filter": {
        "term": {
          "status.keyword": "Y"
        }
      },
      "should": [
        {
          "match": {
            "name": "brown"
          }
        },
        {
          "term": {
            "number.keyword": "N6"
          }
        }
      ]
    }
  }
}

@cbuescher Hello, sorry for the late reply. Yes, I understand; your query returns some results.

But the problem is that it also returns items whose name does not contain "brown". It looks like it returns all items in random order.

Hi,

I think your original question has been answered quite exhaustively in this thread. Would you mind opening a new issue about the problem you are facing now? In it I suggest you mention exactly which version of ES you are using, post one or two test documents, your mapping, your query and your expected result for this query. From there I'm sure someone can assist you.

@cbuescher Hello, no, you misunderstood me. For example, this code (in should):

            {
                "match": {
                    "name": "row"
                }
            }

It doesn't return any results because it doesn't search for substrings; it should find "brown" because "brown" contains the substring "row". --> I solved it by using a wildcard query (but I don't know if that is a good approach).

And if I use

"filter": {
        "term": {
          "status.keyword": "Y"
        }
      }

It works, but it shows me results randomly, not based on number or name.
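
One likely explanation for the seemingly random results, offered as an assumption rather than something confirmed in this thread: when a bool query has a `filter` (or `must`) clause, its `should` clauses become optional and only affect scoring, so every document with status "Y" is returned. Setting `minimum_should_match` to 1 makes at least one `should` clause mandatory. The toy evaluator below mimics that rule on plain Python data; it is not how Elasticsearch is implemented, and the function and sample documents are made up for illustration.

```python
def bool_matches(doc, filter_clause, should_clauses, minimum_should_match=None):
    """Toy model of bool-query matching; clauses are plain predicates on a dict."""
    if not filter_clause(doc):
        return False
    if minimum_should_match is None:
        # Default is 0 when a filter (or must) clause is present.
        minimum_should_match = 0
    hits = sum(1 for clause in should_clauses if clause(doc))
    return hits >= minimum_should_match


docs = [
    {"name": "the quick brown fox", "number": "N6", "status": "Y"},
    {"name": "a lazy dog", "number": "N7", "status": "Y"},
    {"name": "brown bear", "number": "N8", "status": "N"},
]

has_status_y = lambda d: d["status"] == "Y"
should = [
    lambda d: "brown" in d["name"].split(),  # stands in for match on name
    lambda d: d["number"] == "N5",           # stands in for term on number
]

optional = [d["number"] for d in docs if bool_matches(d, has_status_y, should)]
required = [d["number"] for d in docs if bool_matches(d, has_status_y, should, 1)]
print(optional)  # ['N6', 'N7']
print(required)  # ['N6']
```

In the request body this corresponds to adding `"minimum_should_match": 1` next to `"should"` inside the bool query.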