Logstash Elasticsearch Reindexing Question

I'm new to Logstash and am building a proof of concept to reindex an
existing index (this index contains not logs but standard files such as
.doc, .pdf, emails, etc.). I would greatly appreciate answers to my
question below, and any thoughts you can share.

  1. Can we reindex a non-log index using Logstash and keep the mappings
     of the old index? If anyone on this forum has tried it, please share
     the steps involved.

My observation:

Running the configuration below, Elasticsearch does reindex the data, but
it also changes the mapping of the new index, and the document types were
changed to doc, text/html, etc.

I think that's because Elasticsearch parses the incoming JSON object from
the Logstash message field via Tika and determines the object type from it.

I know we could define our own index template and set the mapping to
strict, but will that resolve the issue?
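For reference, a minimal sketch of that template approach (this assumes Elasticsearch 1.x, an illustrative type name `doc`, and placeholder field names; the real template would need the complete mapping copied from the old index):

```shell
# Create an index template that applies to any index named filesdata_*.
# "dynamic": "strict" makes indexing fail loudly when a document carries
# a field that is not in the mapping, instead of auto-mapping it.
curl -XPUT 'http://10.0.0.11:9200/_template/filesdata_template' -d '
{
  "template": "filesdata_*",
  "mappings": {
    "doc": {
      "dynamic": "strict",
      "properties": {
        "title":   { "type": "string" },
        "content": { "type": "string" }
      }
    }
  }
}'
```

Note that strict dynamic mapping by itself only turns silent remapping into hard failures; the template still has to carry the full mapping of the old index for the reindex to succeed.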

Here the input is the filesdata index created by Elasticsearch and the
output is a new index. My Logstash conf file looks like this:

input {
  elasticsearch {
    host   => "10.0.0.10"
    port   => "9200"
    index  => "filesdata"
    scroll => "1m"
  }
}

output {
  elasticsearch {
    host      => "10.0.0.11"
    protocol  => "http"
    cluster   => "node1"
    node_name => "indexer"
    index     => "filesdata_15022015"
  }

  stdout { codec => rubydebug }
}

Looking forward to the community's experience in handling such a scenario.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/3e090d81-975a-4438-a355-dd04de4e772d%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Have you tried exporting the mapping and importing it into the new
cluster, then reindexing using LS?

LS makes a number of assumptions when it creates mappings for a new
index, which is probably what is happening here.
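That export/import step could be sketched roughly as follows with curl (assuming the hosts and index names from the config above; this is an outline, not a tested procedure):

```shell
# 1. Export the mapping from the source index.
curl -XGET 'http://10.0.0.10:9200/filesdata/_mapping' > filesdata_mapping.json

# 2. In ES 1.x the response is wrapped under the index name, i.e.
#    {"filesdata": {"mappings": {...}}}; strip that outer key so the
#    create-index body is just {"mappings": {...}}.

# 3. Create the destination index with that mapping before running the
#    Logstash pipeline, so LS indexes into an existing, correct mapping.
curl -XPUT 'http://10.0.0.11:9200/filesdata_15022015' -d @filesdata_create.json
```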

On 16 February 2015 at 07:54, Sumit Arora sumit101@gmail.com wrote:



Thanks Mark for your reply. Yes, I did look into the mapping, and it works
perfectly for simple documents. My mapping has nested and dynamic
templates embedded, and it's quite complex to implement reindexing with
Logstash. I ended up reindexing with the elasticsearch-reindex plugin
(https://github.com/karussell/elasticsearch-reindex), which worked well
for my use case.

On Monday, February 16, 2015 at 7:54:34 AM UTC+11, Sumit Arora wrote:

