Sort by _id field

MillQK · February 19, 2019, 12:10pm

Hello! I want to use search_after to process all documents, but my documents don't have any unique field other than _id, so I want to sort by _id. However, the search_after docs include an important note about exactly this case. I have several questions:

1. Is the overhead really that big?
2. Can I create a script field with source doc['_id'].value, or will the overhead remain?
3. Can I set doc_values = true for the _id field (I'm using IDs I generate myself, not ones auto-generated by Elasticsearch)?

I would appreciate any advice.
Thanks!

gbrown · February 19, 2019, 7:27pm

Hi!

You mention wanting to use search-after to "process all documents" - before I answer your question, I want to ask if the Scroll API might be suited to your use case - this would alleviate the issues you're having with search-after. If you're processing all documents returned from a search all at once and just need the results returned in batches, consider using the scroll API instead.

Search-after is the correct choice, though, if your workload 1) has large delays between retrieving batches of results, or 2) has many clients which need to maintain independent contexts.

To answer your question:

1. The overhead is pretty significant. We generally don't make recommendations like that in our docs unless we're pretty sure that it will cause problems.
2. I don't believe using a script field would be any better. The problem is how the _id field is stored on disk in comparison to fields with doc_values enabled.
3. No, unfortunately this is not currently possible, which is why we recommend copying the _id into a regular document field.
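
As a rough illustration of paging with search_after over such a copied field, here is a minimal Python sketch. It only builds request bodies and walks the pages. The field name `doc_id` and the `search` callable (a thin wrapper around whatever client call you use to run a search and return the parsed JSON response) are assumptions for the example, not something from the docs.

```python
from typing import Any, Callable, Dict, Iterator, List, Optional

# Hypothetical name of the regular keyword field that holds a copy of _id.
ID_COPY_FIELD = "doc_id"


def build_page_request(page_size: int,
                       after: Optional[List[Any]] = None) -> Dict[str, Any]:
    """Build a search body sorted on the copied id field instead of _id."""
    body: Dict[str, Any] = {
        "size": page_size,
        "query": {"match_all": {}},
        "sort": [{ID_COPY_FIELD: "asc"}],
    }
    if after is not None:
        # Sort values of the last hit from the previous page.
        body["search_after"] = after
    return body


def iterate_hits(search: Callable[[Dict[str, Any]], Dict[str, Any]],
                 page_size: int = 1000) -> Iterator[Dict[str, Any]]:
    """Yield every hit, one page at a time, until a page comes back empty."""
    after: Optional[List[Any]] = None
    while True:
        response = search(build_page_request(page_size, after))
        hits = response["hits"]["hits"]
        if not hits:
            return
        yield from hits
        # Each hit carries its sort values; the last one seeds the next page.
        after = hits[-1]["sort"]
```

Because the copied field is unique per document, it works as the only sort key here. If you sort on something else first, it can serve as the tiebreaker at the end of the sort list.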

MillQK · February 21, 2019, 6:35am

Thanks for your great answer!

The Scroll API documentation has a note: "Scrolling is not intended for real time user requests". Real-time user requests are exactly my case, so this API is not a good fit for me.

Well, then I will copy _id to a document field as recommended.

Can you tell me, or maybe share an article explaining, why _id has these significant restrictions?
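
For reference, copying the _id into a regular field at index time might look roughly like the Python sketch below. The field name `doc_id` and the helper names are hypothetical, and the exact layout of the mappings section depends on your Elasticsearch version. Since the IDs are generated client-side, the copy can be written into the source before the document is sent. A keyword field has doc_values enabled by default, which is what makes it cheap to sort on.

```python
from typing import Any, Dict, Iterable, Iterator, Tuple

# Hypothetical name of the field that mirrors _id.
ID_COPY_FIELD = "doc_id"

# Field definition for the copy; keyword fields keep doc_values by default.
ID_COPY_PROPERTIES: Dict[str, Any] = {
    ID_COPY_FIELD: {"type": "keyword"},
}


def with_id_copy(doc_id: str, source: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the source with the id stored as a regular field."""
    enriched = dict(source)
    enriched[ID_COPY_FIELD] = doc_id
    return enriched


def to_bulk_actions(index: str,
                    docs: Iterable[Tuple[str, Dict[str, Any]]]
                    ) -> Iterator[Dict[str, Any]]:
    """Turn (id, source) pairs into index actions that carry the id copy."""
    for doc_id, source in docs:
        yield {
            "_index": index,
            "_id": doc_id,
            "_source": with_id_copy(doc_id, source),
        }
```

The action dictionaries follow the shape accepted by the bulk helper in the official Python client, but any indexing path works as long as the same value is used for both _id and the copied field.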

system · March 21, 2019, 6:50am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
