Limit result

Hello. I'm pretty new to Elasticsearch and this is my first post in the forum :)

I have a very simple document in this format:

{
"shop_name" : "Armani",
"country" : "italy",
"city": "milan",
"relevance" : 10
}

** DATA **

  • 15 shops in Milan
  • 5 shops in Rome
  • 100 shops in Italy
  • 200 shops in the world

Scenario 1 - Top 20 shops for people living in Milan
I want the top 10 shops in Milan, the top 5 from the rest of Italy, and as many top shops from the rest of the world as needed to reach 20 shops in the result set.

So the result set contains:

  • 10 shops from Milan, sorted by relevance
  • 5 shops from the rest of Italy, sorted by relevance
  • 5 shops from the rest of the world, sorted by relevance

Scenario 2 - Top 20 shops for people living in Rome
Same as Milan, but for Rome.

As you can see in the "DATA" section, there aren't 10 shops in Rome, but my application is not aware of that.
The query sent by my application will be the very same as the one above (except for the city), and Elasticsearch should fill the result set with as many shops as needed to reach 20:

  • 5 shops from Rome, sorted by relevance
  • 5 shops from the rest of Italy, sorted by relevance
  • 10 shops from the rest of the world, sorted by relevance (Elasticsearch should work out by itself that 10 are needed here)

Hope my question is clear.

--
Simone

I believe that you will need to use the /_msearch API for this, to get back the different results.

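Here's an example that I hope makes things clear. First, some fake data (note that the second PUT to /i/d/3 overwrites Store3 with Store4, so the index ends up with five documents):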

PUT /i/d/1
{
  "shop_name" : "Armani",
  "country" : "italy",
  "city": "milan",
  "relevance" : 10
}

PUT /i/d/2
{
  "shop_name" : "Shop2",
  "country" : "italy",
  "city": "milan",
  "relevance" : 5
}

PUT /i/d/3
{
  "shop_name" : "Store3",
  "country" : "italy",
  "city": "campione",
  "relevance" : 12
}

PUT /i/d/3
{
  "shop_name" : "Store4",
  "country" : "italy",
  "city": "campione",
  "relevance" : 3
}

PUT /i/d/4
{
  "shop_name" : "Store5",
  "country" : "switzerland",
  "city": "lugano",
  "relevance" : 10
}

PUT /i/d/5
{
  "shop_name" : "Store6",
  "country" : "switzerland",
  "city": "lugano",
  "relevance" : 18
}

The query you'll execute then:

POST /i/_msearch
{}
{"query": {"match": {"city": "milan"}}, "size": 10, "sort": {"relevance": {"order": "desc"}}}
{}
{"query": {"match": {"country": "italy"}}, "size": 5, "sort": {"relevance": {"order": "desc"}}}
{}
{"query": {"match_all": {}}, "size": 5, "sort": {"relevance": {"order": "desc"}}}
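For anyone building this request from application code: the /_msearch body is newline-delimited JSON, where each search is a header line followed by a body line, and the body must end with a newline. A minimal Python sketch of assembling the same three searches is shown here; the function name `build_msearch_body` and its parameters are made up for illustration, and the default sizes simply mirror the request above.

```python
import json


def build_msearch_body(city, country, sizes=(10, 5, 5)):
    """Return an NDJSON string for the three tiered searches (illustrative helper)."""
    sort = [{"relevance": {"order": "desc"}}]
    queries = [
        {"match": {"city": city}},        # tier 1: same city
        {"match": {"country": country}},  # tier 2: same country
        {"match_all": {}},                # tier 3: everything
    ]
    lines = []
    for query, size in zip(queries, sizes):
        lines.append(json.dumps({}))  # empty header: use the index from the URL
        lines.append(json.dumps({"query": query, "size": size, "sort": sort}))
    # The multi search API expects the final line to be newline-terminated.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_msearch_body("milan", "italy"), end="")
```

The resulting string would be sent as the POST body with the `application/x-ndjson` content type.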

And then, the results you get back from the /_msearch, which I believe is what you are looking for:

{
  "responses" : [
    {
      "took" : 81,
      "timed_out" : false,
      "_shards" : {
        "total" : 5,
        "successful" : 5,
        "failed" : 0
      },
      "hits" : {
        "total" : 2,
        "max_score" : null,
        "hits" : [
          {
            "_index" : "i",
            "_type" : "d",
            "_id" : "1",
            "_score" : null,
            "_source" : {
              "shop_name" : "Armani",
              "country" : "italy",
              "city" : "milan",
              "relevance" : 10
            },
            "sort" : [
              10
            ]
          },
          {
            "_index" : "i",
            "_type" : "d",
            "_id" : "2",
            "_score" : null,
            "_source" : {
              "shop_name" : "Shop2",
              "country" : "italy",
              "city" : "milan",
              "relevance" : 5
            },
            "sort" : [
              5
            ]
          }
        ]
      },
      "status" : 200
    },
    {
      "took" : 76,
      "timed_out" : false,
      "_shards" : {
        "total" : 5,
        "successful" : 5,
        "failed" : 0
      },
      "hits" : {
        "total" : 3,
        "max_score" : null,
        "hits" : [
          {
            "_index" : "i",
            "_type" : "d",
            "_id" : "1",
            "_score" : null,
            "_source" : {
              "shop_name" : "Armani",
              "country" : "italy",
              "city" : "milan",
              "relevance" : 10
            },
            "sort" : [
              10
            ]
          },
          {
            "_index" : "i",
            "_type" : "d",
            "_id" : "2",
            "_score" : null,
            "_source" : {
              "shop_name" : "Shop2",
              "country" : "italy",
              "city" : "milan",
              "relevance" : 5
            },
            "sort" : [
              5
            ]
          },
          {
            "_index" : "i",
            "_type" : "d",
            "_id" : "3",
            "_score" : null,
            "_source" : {
              "shop_name" : "Store4",
              "country" : "italy",
              "city" : "campione",
              "relevance" : 3
            },
            "sort" : [
              3
            ]
          }
        ]
      },
      "status" : 200
    },
    {
      "took" : 74,
      "timed_out" : false,
      "_shards" : {
        "total" : 5,
        "successful" : 5,
        "failed" : 0
      },
      "hits" : {
        "total" : 5,
        "max_score" : null,
        "hits" : [
          {
            "_index" : "i",
            "_type" : "d",
            "_id" : "5",
            "_score" : null,
            "_source" : {
              "shop_name" : "Store6",
              "country" : "switzerland",
              "city" : "lugano",
              "relevance" : 18
            },
            "sort" : [
              18
            ]
          },
          {
            "_index" : "i",
            "_type" : "d",
            "_id" : "4",
            "_score" : null,
            "_source" : {
              "shop_name" : "Store5",
              "country" : "switzerland",
              "city" : "lugano",
              "relevance" : 10
            },
            "sort" : [
              10
            ]
          },
          {
            "_index" : "i",
            "_type" : "d",
            "_id" : "1",
            "_score" : null,
            "_source" : {
              "shop_name" : "Armani",
              "country" : "italy",
              "city" : "milan",
              "relevance" : 10
            },
            "sort" : [
              10
            ]
          },
          {
            "_index" : "i",
            "_type" : "d",
            "_id" : "2",
            "_score" : null,
            "_source" : {
              "shop_name" : "Shop2",
              "country" : "italy",
              "city" : "milan",
              "relevance" : 5
            },
            "sort" : [
              5
            ]
          },
          {
            "_index" : "i",
            "_type" : "d",
            "_id" : "3",
            "_score" : null,
            "_source" : {
              "shop_name" : "Store4",
              "country" : "italy",
              "city" : "campione",
              "relevance" : 3
            },
            "sort" : [
              3
            ]
          }
        ]
      },
      "status" : 200
    }
  ]
}

Hope that helps!

This is for sure the way to go.

A couple of details:

  • The items returned are not unique. I see "Armani" in both the first and the second result set. Is there a way to avoid duplicate results?

  • I see you specified "size": 5 in the last query, but I don't know that number in advance. I want a maximum of 10 items from the first query, a maximum of 5 from the second one, and 20 results in total. This means that if the first query returns 3 items (because no other items match the query) and the second one returns 5, I want the third one to return 12 items (20 - (3 + 5)).

Simone Fumagalli writes:

* The items returned are not unique. I see "Armani" in both the first and
the second result set. Is there a way to avoid duplicate results?

I would recommend you check out the recently released field collapsing
feature for de-duplicating results based on a field:

* I see you specified "size": 5 in the last query, but I don't know that
number in advance. I want a maximum of 10 items from the first query,
a maximum of 5 from the second one, and 20 results in total. This means
that if the first query returns 3 items (because no other items match
the query) and the second one returns 5, I want the third one to return
12 items (20 - (3 + 5)).

Okay, in this case, you have a couple of options.

  1. Run the queries one at a time, so that you know how many hits each one
    returned and can adjust the size of the third query if, say, the first
    one returned only 3 hits.

  2. Return more hits than necessary from each query and display only what
    is needed on the client side. So you'd ask for 20 hits from each of the
    three queries, but only display [10, 5, 5], [3, 4, 13], or [0, 0, 20]
    etc.
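
Option 1 could look roughly like the Python sketch below. It is only an illustration under some assumptions: `search` is a hypothetical callable that takes a search body and returns the list of hits (in practice it would wrap a call to the _search endpoint), and the `must_not` / `ids` clause is an addition that is not part of the queries in this thread, used here to keep later tiers from repeating shops that were already returned.

```python
def tier_body(match, size, seen_ids):
    """Build one search body, skipping documents that were already returned."""
    bool_query = {"must": [{"match": match} if match else {"match_all": {}}]}
    if seen_ids:
        # Assumption: exclude earlier hits by _id so the tiers do not overlap.
        bool_query["must_not"] = [{"ids": {"values": sorted(seen_ids)}}]
    return {
        "query": {"bool": bool_query},
        "size": size,
        "sort": [{"relevance": {"order": "desc"}}],
    }


def top_shops(search, city, country, total=20, city_cap=10, country_cap=5):
    """Run the tiers one by one; each size depends on what came back so far."""
    results, seen = [], set()
    tiers = [
        ({"city": city}, city_cap),
        ({"country": country}, country_cap),
        (None, total),  # last tier fills whatever is still missing
    ]
    for match, cap in tiers:
        size = min(cap, total - len(results))
        if size <= 0:
            break
        hits = search(tier_body(match, size, seen))
        results.extend(hits)
        seen.update(hit["_id"] for hit in hits)
    return results
```

The cost is three round trips instead of one, which is the trade-off against the single-request approach.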

If you want to do the queries in a single request, then option #2 is the
way to go.
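
Option 2 might be sketched on the client side as follows, in Python. This is a rough illustration, not a definitive implementation: `merge_msearch` is a made-up name, it assumes the three responses arrive in the order city, country, everything, and it drops duplicates by `_id` while filling the quotas of 10, 5, and "whatever is left up to 20".

```python
def merge_msearch(response, total=20, city_cap=10, country_cap=5):
    """Pick up to `total` unique hits from the three /_msearch responses."""
    city_hits, country_hits, world_hits = (
        r["hits"]["hits"] for r in response["responses"]
    )
    picked, seen = [], set()

    def take(hits, limit):
        taken = 0
        for hit in hits:
            if taken >= limit:
                break
            if hit["_id"] in seen:
                continue  # already shown in an earlier tier
            seen.add(hit["_id"])
            picked.append(hit)
            taken += 1

    take(city_hits, city_cap)
    take(country_hits, country_cap)
    take(world_hits, total - len(picked))  # fill the remainder
    return picked
```

Requesting `total` hits from each query is enough even after de-duplication: when a tier is processed, at most `len(picked)` of its hits can already have been used, so enough unused ones remain to reach the total whenever the index holds that many matching shops. With the sample responses earlier in this thread, the merged order of `_id` values would be 1, 2, 3, 5, 4.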