Combining a Month's Worth of Daily Logstash Indices into a Single Monthly Index

I'm pretty new to ES and brand new to anything like reindexing.

I've looked through a handful of search_phase_execution_exception-related posts, but didn't see any that I understood to be the same as my problem.

My goal is to combine a bunch of old daily Logstash indices into a single one-shard index. For what it's worth, I'm running AWS-managed Elasticsearch 6.7.

Here's my failed attempt:

Create an index with one shard (successful):

$ curl -XPUT 'https://myinstance.us-east-1.es.amazonaws.com/logstash-lucee-tomcat-access~hudx_stage_lucee~2019.04?pretty' -H 'Content-Type: application/json' -d'
{
    "settings" : {
        "index" : {
            "number_of_shards" : 1
        }
    }
}
'
{
  "acknowledged" : true,
  "shards_acknowledged" : true,
  "index" : "logstash-lucee-tomcat-access~hudx_stage_lucee~2019.04"
}

Attempt at combining a month's worth of indices into the new one (failed):

$ curl -XPOST 'https://myinstance.us-east-1.es.amazonaws.com/_reindex?pretty' -H 'Content-Type: application/json' -d'
{
    "conflicts": "proceed",
    "source": {
        "index": "logstash-lucee-tomcat-access~hudx_stage_lucee~2019.04.*"
    },
    "dest": {
        "index": "logstash-lucee-tomcat-access~hudx_stage_lucee~2019.04",
        "op_type": "create"
    }
}
'
{
  "error" : {
    "root_cause" : [ ],
    "type" : "search_phase_execution_exception",
    "reason" : "all shards failed",
    "phase" : "query",
    "grouped" : true,
    "failed_shards" : [ ]
  },
  "status" : 503
}

Here are the indices to be reindexed:

$ curl -XGET -s 'https://myinstance.us-east-1.es.amazonaws.com/_cat/indices/logstash-lucee-tomcat-access~hudx_stage_lucee~2019.04.*?v' | sort

health status index                                                    uuid                   pri rep docs.count docs.deleted store.size pri.store.size
red    open   logstash-lucee-tomcat-access~hudx_stage_lucee~2019.04.01 Mg8VVkN5TnWXO9jF6m6vMw   5   1
red    open   logstash-lucee-tomcat-access~hudx_stage_lucee~2019.04.02 88-2t-wVQW2s8RU0mCnLow   5   1
red    open   logstash-lucee-tomcat-access~hudx_stage_lucee~2019.04.03 tw5JnZBHRYKBRXw_xOd8Tg   5   1
red    open   logstash-lucee-tomcat-access~hudx_stage_lucee~2019.04.04 Cv3GD_7GQcKLqFIs2z8h0Q   5   1
red    open   logstash-lucee-tomcat-access~hudx_stage_lucee~2019.04.05 avqvnfNSQ76K0C6hicQvhw   5   1
red    open   logstash-lucee-tomcat-access~hudx_stage_lucee~2019.04.06 TY6Pn69pSdGl5M-CqP6YhQ   5   1
red    open   logstash-lucee-tomcat-access~hudx_stage_lucee~2019.04.07 vfqupey2TWKcWJXpKrDkvw   5   1
red    open   logstash-lucee-tomcat-access~hudx_stage_lucee~2019.04.08 gFnBQSJNQ-2978bsbaSADA   5   1
red    open   logstash-lucee-tomcat-access~hudx_stage_lucee~2019.04.09 4k2YlMuSRuan6Xu_40OkkQ   5   1
red    open   logstash-lucee-tomcat-access~hudx_stage_lucee~2019.04.10 jxANAFVLQDKxrsAEjQz8Jg   5   1
red    open   logstash-lucee-tomcat-access~hudx_stage_lucee~2019.04.11 WFZwjA8pSBOyB-vgVah2bA   5   1
red    open   logstash-lucee-tomcat-access~hudx_stage_lucee~2019.04.12 7pEvln89SU6xHL_qkhZ93A   5   1
red    open   logstash-lucee-tomcat-access~hudx_stage_lucee~2019.04.13 l4tqNkUsQZmuVSq_b-vVgg   5   1
red    open   logstash-lucee-tomcat-access~hudx_stage_lucee~2019.04.14 7flcu677Rry6ISKkulpWGw   5   1
red    open   logstash-lucee-tomcat-access~hudx_stage_lucee~2019.04.15 0nPdMXc2TZ2tgJtmk3QGMA   5   1
red    open   logstash-lucee-tomcat-access~hudx_stage_lucee~2019.04.16 zhk-lgSfRL24mTxCwRDWqw   5   1
red    open   logstash-lucee-tomcat-access~hudx_stage_lucee~2019.04.17 SF5aWGccQdiAJgRqL9RDnw   5   1
red    open   logstash-lucee-tomcat-access~hudx_stage_lucee~2019.04.18 rEsmvfESQ2SCwdxzUQIwTQ   5   1
red    open   logstash-lucee-tomcat-access~hudx_stage_lucee~2019.04.19 YLJuHgZ_SdGNRvqFP4woKg   5   1
red    open   logstash-lucee-tomcat-access~hudx_stage_lucee~2019.04.20 ypAdh8VZTm2oBZR6ySTtjw   5   1
red    open   logstash-lucee-tomcat-access~hudx_stage_lucee~2019.04.21 zT8uTekuQDylCF7V7_fnmw   5   1
red    open   logstash-lucee-tomcat-access~hudx_stage_lucee~2019.04.22 8i_sR5ZZRUmPtwBVt3qOAA   5   1
red    open   logstash-lucee-tomcat-access~hudx_stage_lucee~2019.04.23 JHENsre9TF2HPZDqh4TRCg   5   1
red    open   logstash-lucee-tomcat-access~hudx_stage_lucee~2019.04.24 qsh4V1U6S2qYWrepUIdFrA   5   1
red    open   logstash-lucee-tomcat-access~hudx_stage_lucee~2019.04.25 0UM66HYJQnmL_F7nyrnteA   5   1
red    open   logstash-lucee-tomcat-access~hudx_stage_lucee~2019.04.26 ZIELeM0MTLuliFoW_EKXnA   5   1
red    open   logstash-lucee-tomcat-access~hudx_stage_lucee~2019.04.27 dpnyPzGNR1Cs0TPzbCD5bw   5   1
red    open   logstash-lucee-tomcat-access~hudx_stage_lucee~2019.04.28 TCcrZU1WTmGHnw4Zj6MVXg   5   1
red    open   logstash-lucee-tomcat-access~hudx_stage_lucee~2019.04.29 grWS9jnyTlmWVYnkOKWDGg   5   1
red    open   logstash-lucee-tomcat-access~hudx_stage_lucee~2019.04.30 YhZMK_beS3uRweigUw1wvg   5   1

I have no idea where to go from here.

Thanks,
Jamie

While the indices are in a red state, not all of their shards are allocated, which means you cannot access the data in them. You will need to wait until they are at least in a yellow state before you can reindex.
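For example, a single index's recovery can be polled with the cluster health API; the index name below is just one taken from the listing above, and the timeout value is only an example:

$ curl -XGET 'https://myinstance.us-east-1.es.amazonaws.com/_cluster/health/logstash-lucee-tomcat-access~hudx_stage_lucee~2019.04.01?wait_for_status=yellow&timeout=60s&pretty'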

Thanks, @Christian_Dahlqvist. I have some green ones I can work with as well, so I'm using those now, but I've run into another issue.

First, here's my strategy for attempting reindexing:

ES_URI=https://myinstance.us-east-1.es.amazonaws.com
DEST_INDEX=logstash-lucee-lucee~hudx_dev_lucee~exception~2019.08

SOURCE_INDEX_PATTERN=${DEST_INDEX}.*

# create destination "monthly" index with one shard
curl -XPUT "${ES_URI}/${DEST_INDEX}?pretty" -H 'Content-Type: application/json' -d'
{
    "settings" : {
        "index" : {
            "number_of_shards" : 1 
        }
    }
}
'

# reindex a month's worth of dailies into the new monthly index
curl -XPOST "${ES_URI}/_reindex?pretty" -H 'Content-Type: application/json' -d'
{
    "conflicts": "proceed",
    "source": {
        "index": "'${SOURCE_INDEX_PATTERN}'"
    },
    "dest": {
        "index": "'${DEST_INDEX}'",
        "op_type": "create"
    }
}
'

# delete the dailies
curl -XDELETE "${ES_URI}/${SOURCE_INDEX_PATTERN}"
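
Eventually I'd like to gate the delete step on a count comparison between the dailies and the monthly index. A rough sketch of that check (just an idea at this point; it assumes jq is installed and uses the same variables as above):

SRC_COUNT=$(curl -s -XGET "${ES_URI}/${SOURCE_INDEX_PATTERN}/_count" | jq .count)
DEST_COUNT=$(curl -s -XGET "${ES_URI}/${DEST_INDEX}/_count" | jq .count)

# only delete the dailies if every document made it into the monthly index
if [ "$SRC_COUNT" -eq "$DEST_COUNT" ]; then
    curl -XDELETE "${ES_URI}/${SOURCE_INDEX_PATTERN}"
else
    echo "Count mismatch: source=${SRC_COUNT} dest=${DEST_COUNT}; not deleting." >&2
fi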

However, I got unexpected output from the reindex step:

{
  "took" : 230,
  "timed_out" : false,
  "total" : 11,
  "updated" : 0,
  "created" : 11,
  "deleted" : 0,
  "batches" : 1,
  "version_conflicts" : 0,
  "noops" : 0,
  "retries" : {
    "bulk" : 0,
    "search" : 0
  },
  "throttled_millis" : 0,
  "requests_per_second" : -1.0,
  "throttled_until_millis" : 0,
  "failures" : [ ]
}

The counts are too low on the destination index:

$ count_docs () {
    local INDEX=$1
    curl -XGET "${ES_URI}/${INDEX}/_count?pretty"
}

$ count_docs $SOURCE_INDEX
{
  "count" : 13735,
  "_shards" : {
    "total" : 15014,
    "successful" : 245,
    "skipped" : 0,
    "failed" : 0
  }
}

$ count_docs $DEST_INDEX
{
  "count" : 11,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  }
}
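
My next step will probably be to pull per-index counts to see which of the dailies actually contributed documents; something along these lines (variables as above):

$ curl -s -XGET "${ES_URI}/_cat/indices/${SOURCE_INDEX_PATTERN}?v&h=health,index,docs.count" | sort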

I read that this might be related to ES (incorrectly) guessing mappings and failing to bring in documents. However, I don't know of a way to export the mappings from the source index for use in a new index. How is that done?
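One approach that looks plausible (I haven't tried it yet) is to pull the mappings from one of the daily indices with the get-mapping API and reuse them when creating the monthly index, instead of the settings-only PUT above. A rough sketch, assuming jq is available; SOURCE_DAILY is just an example name:

# pick one daily index to copy the mappings from
SOURCE_DAILY=${DEST_INDEX}.01

# build a create-index body with one shard plus that daily index's mappings
curl -s -XGET "${ES_URI}/${SOURCE_DAILY}/_mapping" \
  | jq --arg i "$SOURCE_DAILY" \
       '{settings: {index: {number_of_shards: 1}}, mappings: .[$i].mappings}' \
  > monthly-index-body.json

# create the monthly index from that body (it must not already exist)
curl -XPUT "${ES_URI}/${DEST_INDEX}?pretty" \
  -H 'Content-Type: application/json' \
  -d @monthly-index-body.json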
