Search API - Is there a way to get position of document in search response hits?

Elasticsearch 6.5.4

While this looks like a trivial task, I did not find an easy way to include the document position in the search response.

Sample query

{
  "query":{
     "bool":{
        "must":[
           {
              "term":{
                 "organizationId":{
                    "value":21,
                    "boost":1.0
                 }
              }
           }
        ]
     }
  }
}

Desired results. Note the position field:

    "hits": {
        "total": 25,
        "max_score": null,
        "hits": [
            {
                "_index": "...",
                "_type": "...",
                "_id": "...",
                "position": 0,
                "_source": {...}
            },
            {
                "_index": "...",
                "_type": "...",
                "_id": "...",
                "position": 1,
                "_source": {...}
            }
        ]
    }

I'm not sure this is possible in a simple way. Isn't the information available implicitly by the position in the hits array? Or is the hits array getting scrambled by whatever you are using to query elastic?

What do you mean by

Isn't the information available implicitly by the position in the hits array?

I'm looking for a way to get the position inside the Elasticsearch query and then use this value in an aggregation in the same query.

The goal is to get the position of a specific document (matching given criteria) in the search results.

  "query":{
    ...
  },
  "aggs":{
    "position-aggregation":{
      "terms":{
        "script":{
          "source":"if (<condition on doc>) { return doc.position } else { return null }",
          "lang":"painless"
        },
        "size":1,
        "min_doc_count":1,
        "shard_min_doc_count":0,
        "show_term_doc_count_error":false,
        "order":[
          {
            "_count":"desc"
          },
          {
            "_key":"asc"
          }
        ]
      }
    }
  }

Ah, I understand your problem now. I was referring to the fact that the position of the doc is also its position in the hits array, but since you're trying to use that inside the same query, that is of no help.
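For reference, this is what reading the position off the hits array looks like on the client side. It is only a minimal Python sketch, assuming the search response has already been parsed into a dict; the helper names `with_positions` and `position_of` and the sample data are made up for illustration, and `offset` stands for the `from` value of the request when paging.

```python
def with_positions(response, offset=0):
    """Return (position, hit) pairs; offset is the request's `from` value."""
    hits = response["hits"]["hits"]
    return [(offset + i, hit) for i, hit in enumerate(hits)]


def position_of(response, predicate, offset=0):
    """Position of the first hit whose _source satisfies predicate, else None."""
    for position, hit in with_positions(response, offset):
        if predicate(hit.get("_source", {})):
            return position
    return None


# Made-up response, for illustration only.
sample = {
    "hits": {
        "total": 3,
        "hits": [
            {"_id": "a", "_source": {"organizationId": 21, "name": "x"}},
            {"_id": "b", "_source": {"organizationId": 21, "name": "y"}},
            {"_id": "c", "_source": {"organizationId": 21, "name": "z"}},
        ],
    }
}

print(position_of(sample, lambda src: src["name"] == "y"))  # 1
```

This only works after the response has come back, so it does not help when the position is needed inside an aggregation of the same request.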

I don't know if what you're planning to do is possible. The best I could come up with would be using the _score variable, which would at least preserve the relative position of the documents.

@NashSLX thank you for your response.

Could you please elaborate on how to use the _score to get document relative position?

Well, since the score decreases as the position increases, I was thinking of something like this:

"script":{
   "source":"if (<condition on doc>) { return someFunction(_score) } else { return null }",
   "lang":"painless"
},

someFunction could be something simple like 1/_score, or something more sophisticated. This should give you the relative positioning of all documents that fulfill your condition. However, you wouldn't have the exact position and would lose all information about the "position distance" between the documents.

In addition, the exact _score values will not be consistent between different searches. If you need any of that, then I'm afraid I'm out of ideas.
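To illustrate the limitation, here is a small Python sketch with made-up ids and scores (the function name `relative_ranks` is hypothetical). It ranks only the documents that pass the condition by their `_score`, which recovers their order relative to each other but not their true position in the full result list.

```python
def relative_ranks(scored_ids):
    """Map doc id -> rank among the given docs, highest score first."""
    ordered = sorted(scored_ids, key=lambda pair: pair[1], reverse=True)
    return {doc_id: rank for rank, (doc_id, _score) in enumerate(ordered)}


# Made-up (id, _score) pairs, already in result order (positions 0..3).
hits = [("a", 3.2), ("b", 2.9), ("c", 1.1), ("d", 0.4)]

# Pretend only "b" and "d" fulfill the condition in the script.
matching = [hit for hit in hits if hit[0] in {"b", "d"}]

print(relative_ranks(matching))  # {'b': 0, 'd': 1}
```

The true positions of "b" and "d" are 1 and 3, but the score alone only tells you that "b" comes before "d".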


As I'm looking for an exact position I'm afraid using the _score won't be sufficient.

I tried using script_fields to get the position, but it did not work as expected. For some reason the position of the last element is reset to the initial value:

{
  "query":{
     "bool":{
        "must":[
           {
              "term":{
                 "organizationId":{
                    "value":21,
                    "boost":1.0
                 }
              }
           }
        ],
        "adjust_pure_negative":true,
        "boost":1.0
     }
  },
  "_source" : true,
  "script_fields" : {
    "position": {
        "script": {
            "source":"params.counter++",
            "lang":"painless",
            "params":{
                "counter": 0
            }
        }
    }
  }
}

With 4 hits from the query, I received the following response. The position of the last element is reset to the initial counter value:

"hits": {
    "total": 4,
    "hits": [
        {
            "_index": "...",
            ...,
            "fields": {
                "position": [0]
            }
        },
        {
            "_index": "...",
            ...,
            "fields": {
                "position": [1]
            }
        },
        {
            "_index": "...",
            ...,
            "fields": {
                "position": [2]
            }
        },
        {
            "_index": "...",
            ...,
            "fields": {
                "position": [0]
            }
        }
    ]
}

I don't know why, but changing the counter type from an int to an int array solved the issue:

  "script_fields" : {
    "position": {
        "script": {
            "source":"params.counter[0]++",
            "lang":"painless",
            "params":{
                "counter": [0]
            }
        }
    }
  }
