Confirm that a force merge on the index is still running

We are using a dense_vector field for semantic search. Because performance on our index was very slow, and based on several conversations here, I decided to force_merge the index down to 1 segment.

It has been running asynchronously for about 6 days (the index has about 14 million documents with 1024-dim vectors).

Is there ANY way to confirm that the merge process is still actually running?

The task info shows the following:

{
    "completed": false,
    "task": {
        "node": "vGiOGPHoQnSNmndhy2Np1A",
        "id": 7727142,
        "type": "transport",
        "action": "indices:admin/forcemerge",
        "description": "Force-merge indices [proposals.proposals.vec], maxSegments[1], onlyExpungeDeletes[false], flush[true]",
        "start_time_in_millis": 1681209571540,
        "running_time_in_nanos": 586720634185050,
        "cancellable": false,
        "headers": {}
    }
}

and /_stats/merge on the index shows the following:

{
    "_shards": {
        "total": 3,
        "successful": 3,
        "failed": 0
    },
    "_all": {
        "primaries": {
            "merges": {
                "current": 0,
                "current_docs": 0,
                "current_size_in_bytes": 0,
                "total": 25,
                "total_time_in_millis": 289020033,
                "total_docs": 21674346,
                "total_size_in_bytes": 148916612974,
                "total_stopped_time_in_millis": 0,
                "total_throttled_time_in_millis": 9923216,
                "total_auto_throttle_in_bytes": 5242880
            }
        },
        "total": {
            "merges": {
                "current": 2,
                "current_docs": 28652980,
                "current_size_in_bytes": 157314725265,
                "total": 29,
                "total_time_in_millis": 292723931,
                "total_docs": 25154344,
                "total_size_in_bytes": 166508672568,
                "total_stopped_time_in_millis": 0,
                "total_throttled_time_in_millis": 11369079,
                "total_auto_throttle_in_bytes": 36755306
            }
        }
    },
    "indices": {
        "proposals.proposals.vector_20230403": {
            "uuid": "_Xs0oTRjSKm2YEhU4r2r4w",
            "health": "green",
            "status": "open",
            "primaries": {
                "merges": {
                    "current": 0,
                    "current_docs": 0,
                    "current_size_in_bytes": 0,
                    "total": 25,
                    "total_time_in_millis": 289020033,
                    "total_docs": 21674346,
                    "total_size_in_bytes": 148916612974,
                    "total_stopped_time_in_millis": 0,
                    "total_throttled_time_in_millis": 9923216,
                    "total_auto_throttle_in_bytes": 5242880
                }
            },
            "total": {
                "merges": {
                    "current": 2,
                    "current_docs": 28652980,
                    "current_size_in_bytes": 157314725265,
                    "total": 29,
                    "total_time_in_millis": 292723931,
                    "total_docs": 25154344,
                    "total_size_in_bytes": 166508672568,
                    "total_stopped_time_in_millis": 0,
                    "total_throttled_time_in_millis": 11369079,
                    "total_auto_throttle_in_bytes": 36755306
                }
            }
        }
    }
}

I don't even need an estimate of when it is supposed to complete; I just want to make sure the process is actually running.
What metric would confirm that?

I think you are on 8.x, so see the Task management API page in the Elasticsearch Guide [8.7].

Yes, I did read the documentation at that link, and many other pages, and that was one of the endpoints I used. Unfortunately, I could not find the answers there.

For example, I could not find how to confirm that the merge process is actually still running at all. There is an attribute running_time_in_nanos. If this value keeps updating, does that definitely mean the merge is still running?
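
For reference, here is a minimal Python sketch that converts the two timing fields from the task output above into readable values. It only does unit conversion; the function name `describe_task_timing` and its return shape are made up for illustration. As far as I understand (an assumption worth verifying), running_time_in_nanos is the elapsed time since the task started, computed at query time, so a growing value shows that the task is still registered on the node but does not by itself prove that segments are being written.

```python
from datetime import datetime, timedelta, timezone


def describe_task_timing(task: dict) -> dict:
    """Convert the raw timing fields of a task entry into readable values.

    `task` is the "task" object from the task info response shown above.
    The function name and return shape are illustrative, not part of any API.
    """
    # start_time_in_millis is a Unix timestamp in milliseconds.
    started = datetime.fromtimestamp(
        task["start_time_in_millis"] / 1000, tz=timezone.utc
    )
    # running_time_in_nanos is a duration; timedelta has microsecond resolution.
    running = timedelta(microseconds=task["running_time_in_nanos"] // 1000)
    return {
        "started_utc": started.isoformat(timespec="seconds"),
        "running_days": round(running.total_seconds() / 86400, 2),
    }


# Values copied from the task info output in this thread.
task = {
    "start_time_in_millis": 1681209571540,
    "running_time_in_nanos": 586720634185050,
}
print(describe_task_timing(task))
```

Calling this on two task snapshots taken a few minutes apart shows whether the task is still listed, which is all this field can confirm on its own.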

And what is the relation between these two attributes that I got from the endpoint index_name/_stats/merge:

"primaries": {
    "merges": {
        "current": 0,
...
        "total": 25,
-------------------------------
"total": {
    "merges": {
        "current": 2,
...
        "total": 29,

Does it mean it still has to process 25 segments, or 27 segments?
And why is current 0 for primaries and 2 for total?
How is 2 related to 29?
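
One way to read these numbers, sketched in Python: in the stats output, primaries aggregates the primary shards only, while total aggregates primaries plus replicas, so subtracting one from the other gives the replica-only share. The helper name `replica_merge_stats` is hypothetical, and the input below is just a subset of the fields from the output above. My understanding is that current counts merges in flight right now and total counts merges performed so far, not segments left to process; treat that as an assumption to check against the documentation.

```python
def replica_merge_stats(stats: dict) -> dict:
    """Return the replica-only merge counters (total minus primaries).

    `stats` is one object holding "primaries" and "total" sections, such as
    the "_all" entry of the _stats/merge response. Hypothetical helper.
    """
    primaries = stats["primaries"]["merges"]
    total = stats["total"]["merges"]
    # Only subtract counters that appear in both sections.
    return {key: total[key] - primaries[key] for key in primaries if key in total}


# Subset of the values posted in this thread.
stats = {
    "primaries": {"merges": {"current": 0, "current_docs": 0, "total": 25}},
    "total": {"merges": {"current": 2, "current_docs": 28652980, "total": 29}},
}
print(replica_merge_stats(stats))
```

Under that reading, both in-flight merges and 4 of the 29 counted merges belong to replica shards, and nothing is currently merging on the primaries.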

I wish there was some kind of documentation to help decipher this output.
