Reindex & Missing Documents

I've got a fairly stock standard ELK stack with a bunch of data in it.

We control our field mappings, and sometimes we add a new mapping. Obviously old indexes aren't searchable on that new mapping, so I'd like to start reindexing the data when that happens.

The plan is to reindex into a temporary index, delete the old index, reindex from the temporary index into a new index with the same name as the old one, and then delete the temporary index.

I wrote what I thought was a fairly stock standard script to do this, which I've included below:

[CmdletBinding()]
param
(
    [string]$elasticsearchUrl,
    [string]$indexMatchRegex="logstash-2017\.07\.31",
    [switch]$whatIf=$true
)

if ($whatIf)
{
    Write-Output "Running a theoretical reindex of all indexes in [$elasticsearchUrl] that match pattern [$indexMatchRegex]";
}

$ErrorActionPreference = "Stop";

function Reindex
{
    [CmdletBinding()]
    param
    (
        [string]$elasticsearchUrl,
        [string]$sourceIndex,
        [string]$destinationIndex,
        [switch]$whatIf=$true
    )

    if ($WhatIf)
    {
        Write-Output "Would have created a new index with name [$destinationIndex] here";
    }
    else 
    {
        Write-Output "Creating a new index with name [$destinationIndex]";
        $create = Invoke-WebRequest -Method PUT -Uri ("$elasticsearchUrl/$destinationIndex" + "?pretty") -Headers @{"accept"="application/json"};
        Write-Output "Create response";
        Write-Output "-------------------------------------------";
        Write-Output $create.Content;
        Write-Output "-------------------------------------------";
    }

    $reindexPayload = "{ `"source`": { `"index`": `"$($sourceIndex)`" }, `"dest`": { `"index`": `"$destinationIndex`" } }";

    if ($WhatIf)
    {
        Write-Output "Would have created a reindex request using payload [$reindexPayload]";
    }
    else
    {
        Write-Output "Reindexing using payload [$reindexPayload]";
        $reindex = Invoke-WebRequest -Method POST -Uri "$elasticsearchUrl/_reindex?pretty" -Body $reindexPayload -Headers @{"accept"="application/json";"content-type"="application/json"} -TimeoutSec 3600;
        Write-Output "Reindex response";
        Write-Output "-------------------------------------------";
        Write-Output $reindex.Content;
        Write-Output "-------------------------------------------";
    }

    if ($WhatIf)
    {
        Write-Output "Would have deleted the old index named [$sourceIndex] here";
    }
    else 
    {
        Write-Output "Deleting old index named [$sourceIndex]";
        $delete = Invoke-WebRequest -Method DELETE -Uri ("$elasticsearchUrl/$sourceIndex" + "?pretty") -Headers @{"accept"="application/json"};
        Write-Output "Delete response";
        Write-Output "-------------------------------------------";
        Write-Output $delete.Content;
        Write-Output "-------------------------------------------";
    }
}

$indices = Invoke-RestMethod "$elasticsearchUrl/_cat/indices?pretty" -Headers @{"accept"="application/json"};
$sortedIndices = $indices | Sort-Object { $_.index };
foreach ($index in $sortedIndices)
{
    $oldIndexName = $index.index;
    if ($oldIndexName -match $indexMatchRegex)
    {
        $newIndexName = "$oldIndexName-r";
        Reindex -elasticsearchUrl $elasticsearchUrl -sourceIndex $oldIndexName -destinationIndex $newIndexName -whatIf:$whatIf;
        Reindex -elasticsearchUrl $elasticsearchUrl -sourceIndex $newIndexName -destinationIndex $oldIndexName -whatIf:$whatIf;
    }
}

When I ran this script, it worked for a few indexes, but other indexes ended up with missing documents (or, in one case, completely empty).

I'm sure I've done something silly (like not waiting for a synchronous response, or not waiting for ES to complete the requested changes), but I'm not sure where.
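One unconfirmed possibility along those lines: the reindex response may come back before the destination index has been refreshed, so the second reindex (which searches the temporary index) might only see part of the data. Below is a minimal sketch of a guard that refreshes both indexes and compares their document counts before anything is deleted. `Test-CountsMatch`, `Get-DocumentCount` and `Confirm-ReindexComplete` are hypothetical helper names that are not part of the script above; `_refresh` and `_count` are the standard Elasticsearch endpoints.

```powershell
# Pure comparison helper: true only when both counts are non-negative and equal.
function Test-CountsMatch
{
    param
    (
        [long]$sourceCount,
        [long]$destinationCount
    )

    return ($sourceCount -ge 0) -and ($sourceCount -eq $destinationCount);
}

# Hypothetical helper: refreshes the index so recent writes become searchable,
# then asks the _count API how many documents the index holds.
function Get-DocumentCount
{
    param
    (
        [string]$elasticsearchUrl,
        [string]$index
    )

    Invoke-RestMethod -Method POST -Uri "$elasticsearchUrl/$index/_refresh" | Out-Null;
    $response = Invoke-RestMethod -Method GET -Uri "$elasticsearchUrl/$index/_count";
    return [long]$response.count;
}

# Hypothetical guard: throws if the destination does not hold as many documents
# as the source, so the caller never reaches the DELETE step with a partial copy.
function Confirm-ReindexComplete
{
    param
    (
        [string]$elasticsearchUrl,
        [string]$sourceIndex,
        [string]$destinationIndex
    )

    $sourceCount = Get-DocumentCount -elasticsearchUrl $elasticsearchUrl -index $sourceIndex;
    $destinationCount = Get-DocumentCount -elasticsearchUrl $elasticsearchUrl -index $destinationIndex;

    if (-not (Test-CountsMatch -sourceCount $sourceCount -destinationCount $destinationCount))
    {
        throw "Count mismatch: [$sourceIndex] has $sourceCount documents, [$destinationIndex] has $destinationCount";
    }
}
```

Calling `Confirm-ReindexComplete` after each `_reindex` request and before the matching `DELETE` would stop the run instead of deleting the only complete copy. The `_reindex` API is also documented as accepting a `refresh` URL parameter, which may be a simpler way to get the same effect; whether that alone explains the missing documents here is an assumption.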

I'm using Elasticsearch 5.4.2.

Any help or advice would be appreciated.

Thank you.

I couldn't include the output from the execution that resulted in lost data in the main post (character limits) so I've put it here instead:

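It shows one index that worked as expected (logstash-2017.08.02.00, 24515 documents in both directions) and one that lost roughly 22,000 documents (logstash-2017.08.02.01: 24311 documents reindexed into the temporary index, but only 2333 reindexed back).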

Creating a new index with name [logstash-2017.08.02.00-r]
Create response
-------------------------------------------
{
  "acknowledged" : true,
  "shards_acknowledged" : true
}

-------------------------------------------
Reindexing using payload [{ "source": { "index": "logstash-2017.08.02.00" }, "dest": { "index": "logstash-2017.08.02.00-r" } }]
Reindex response
-------------------------------------------
{
  "took" : 4730,
  "timed_out" : false,
  "total" : 24515,
  "updated" : 0,
  "created" : 24515,
  "deleted" : 0,
  "batches" : 25,
  "version_conflicts" : 0,
  "noops" : 0,
  "retries" : {
    "bulk" : 0,
    "search" : 0
  },
  "throttled_millis" : 0,
  "requests_per_second" : -1.0,
  "throttled_until_millis" : 0,
  "failures" : [ ]
}

-------------------------------------------
Deleting old index named [logstash-2017.08.02.00]
Delete response
-------------------------------------------
{
  "acknowledged" : true
}

-------------------------------------------
Creating a new index with name [logstash-2017.08.02.00]
Create response
-------------------------------------------
{
  "acknowledged" : true,
  "shards_acknowledged" : true
}

-------------------------------------------
Reindexing using payload [{ "source": { "index": "logstash-2017.08.02.00-r" }, "dest": { "index": "logstash-2017.08.02.00" } }]
Reindex response
-------------------------------------------
{
  "took" : 5778,
  "timed_out" : false,
  "total" : 24515,
  "updated" : 0,
  "created" : 24515,
  "deleted" : 0,
  "batches" : 25,
  "version_conflicts" : 0,
  "noops" : 0,
  "retries" : {
    "bulk" : 0,
    "search" : 0
  },
  "throttled_millis" : 0,
  "requests_per_second" : -1.0,
  "throttled_until_millis" : 0,
  "failures" : [ ]
}

-------------------------------------------
Deleting old index named [logstash-2017.08.02.00-r]
Delete response
-------------------------------------------
{
  "acknowledged" : true
}

-------------------------------------------
Creating a new index with name [logstash-2017.08.02.01-r]
Create response
-------------------------------------------
{
  "acknowledged" : true,
  "shards_acknowledged" : true
}

-------------------------------------------
Reindexing using payload [{ "source": { "index": "logstash-2017.08.02.01" }, "dest": { "index": "logstash-2017.08.02.01-r" } }]
Reindex response
-------------------------------------------
{
  "took" : 4467,
  "timed_out" : false,
  "total" : 24311,
  "updated" : 0,
  "created" : 24311,
  "deleted" : 0,
  "batches" : 25,
  "version_conflicts" : 0,
  "noops" : 0,
  "retries" : {
    "bulk" : 0,
    "search" : 0
  },
  "throttled_millis" : 0,
  "requests_per_second" : -1.0,
  "throttled_until_millis" : 0,
  "failures" : [ ]
}

-------------------------------------------
Deleting old index named [logstash-2017.08.02.01]
Delete response
-------------------------------------------
{
  "acknowledged" : true
}

-------------------------------------------
Creating a new index with name [logstash-2017.08.02.01]
Create response
-------------------------------------------
{
  "acknowledged" : true,
  "shards_acknowledged" : true
}

-------------------------------------------
Reindexing using payload [{ "source": { "index": "logstash-2017.08.02.01-r" }, "dest": { "index": "logstash-2017.08.02.01" } }]
Reindex response
-------------------------------------------
{
  "took" : 560,
  "timed_out" : false,
  "total" : 2333,
  "updated" : 0,
  "created" : 2333,
  "deleted" : 0,
  "batches" : 3,
  "version_conflicts" : 0,
  "noops" : 0,
  "retries" : {
    "bulk" : 0,
    "search" : 0
  },
  "throttled_millis" : 0,
  "requests_per_second" : -1.0,
  "throttled_until_millis" : 0,
  "failures" : [ ]
}

-------------------------------------------
Deleting old index named [logstash-2017.08.02.01-r]
Delete response
-------------------------------------------
{
  "acknowledged" : true
}
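
The loss is visible in the log itself: the `total` of the second reindex for an index should equal the `total` of the first. A small sketch of that check follows, using abbreviated copies of the two responses for logstash-2017.08.02.01; `Get-ReindexTotal` and `Get-MissingDocumentCount` are hypothetical helper names, not part of the original script.

```powershell
# Hypothetical helper: reads the "total" field from a _reindex response body.
function Get-ReindexTotal
{
    param
    (
        [string]$responseJson
    )

    $parsed = $responseJson | ConvertFrom-Json;
    return [long]$parsed.total;
}

# Hypothetical helper: number of documents that went into the temporary index
# but were not copied back out of it.
function Get-MissingDocumentCount
{
    param
    (
        [string]$firstResponseJson,
        [string]$secondResponseJson
    )

    $copiedOut = Get-ReindexTotal -responseJson $firstResponseJson;
    $copiedBack = Get-ReindexTotal -responseJson $secondResponseJson;
    return $copiedOut - $copiedBack;
}

# Abbreviated copies of the two reindex responses for logstash-2017.08.02.01.
$outbound = '{ "total" : 24311, "created" : 24311 }';
$inbound = '{ "total" : 2333, "created" : 2333 }';

# Writes 21978 (24311 - 2333).
Get-MissingDocumentCount -firstResponseJson $outbound -secondResponseJson $inbound;
```

Running the same comparison inside the script, before the temporary index is deleted, would flag the problem while the full copy still exists.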
