Is there a way to reindex data in ES from start to now()?

I figured there was a built-in way to say: Hey master node, reindex on propertyA, propertyB, propertyC for independent, faster lookups.

My ES DB is now about 250 GB on a single dual-purpose master/data node. Kibana queries were taking more than 30 seconds, so it sounds like it might be time to start optimizing.

While I can tweak my Logstash instances to pass tweaked data, I was not sure whether it is possible to have Elasticsearch go over the existing data and reindex it by more keywords, etc.

I can likely update my grok filters in Logstash:

filter {
	grok {
		match => [ "path", "%{GREEDYDATA}/%{GREEDYDATA:filename}\.txt"]
	}
	grok {
		match => {
			"message" => "%{DATA:sampleinfo}[:;]%{GREEDYDATA:backupinfo}"
		}
	}
	mutate {
		gsub => ["backupinfo", "[\n\r\t]", ""]
	}
}

But I wasn't sure if I can do this from within Elasticsearch.

Something like: starting now(), reindex all X, Y, Z and turn them into A, B, C. I figured that once all Logstash instances are updated, they will start ingesting the correct information. So I would just need to update all documents from the oldest entry to now().

I will make a follow-up post in the Logstash category on how to update Logstash. The fields that need to be indexed for the fastest lookup are: timestamp, filename, sampleinfo, backupinfo.

I presume there is a way to redefine a property so that it is indexed as a different type. All the data are just variable-length character strings.

Can I execute a command in Elasticsearch to reindex everything while also ingesting new data? I was not sure if I could execute a function that reindexes a property from type X to type Y. I presume there is a way to define how the stored properties (sampleinfo, backupinfo, and the timestamp bucketing) are indexed and reindex them as something else for easier lookups?

Take a look at the reindex API, which can be run while new data is being indexed.
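For illustration, a minimal sketch of what a basic Reindex API request might look like, built here in Python without sending anything. The index names `logs-old` and `logs-new` are hypothetical placeholders, not names from this thread. `wait_for_completion=false` asks Elasticsearch to run the copy as a background task, which is usually preferable for a large index.

```python
import json


def build_reindex_request(source_index, dest_index):
    """Return (method, path, body) for a basic Reindex API call.

    Both index names are supplied by the caller; the ones used below
    are hypothetical examples.
    """
    body = {
        "source": {"index": source_index},
        "dest": {"index": dest_index},
    }
    # wait_for_completion=false runs the reindex as a background task
    # instead of blocking the HTTP request until the copy finishes.
    return "POST", "/_reindex?wait_for_completion=false", body


method, path, body = build_reindex_request("logs-old", "logs-new")
print(method, path)
print(json.dumps(body, indent=2))
```

The printed method, path, and JSON body can be pasted into Kibana Dev Tools or sent with any HTTP client.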

I might be using the wrong words here. It seems that the API is for moving data from cluster "foo" to a new cluster "bar", I think? That will be useful later for sure.

I was thinking that the way some of my fields are defined may be why the result set is really slow to fetch when querying. I was thinking there may be a command to update the property type so that queries find results faster?

I'm not sure how ES handles typing. I want to streamline the ES instance, specifically around these fields: timestamp (built in), filename, sampleinfo, backupinfo.

Hey,

Reindex from remote is only one part of the reindex API; you can also specify another index within your own cluster as the destination.
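As a rough sketch of how that could apply to the field-type question: the type of an existing field generally cannot be changed in place, so a common approach is to create a new index with an explicit mapping and reindex into it. Everything below is an assumption for illustration: the index names `logs-v1` and `logs-v2`, the field name `@timestamp` (the Logstash default), the choice of `keyword` for exact-match fields and `text` for `backupinfo`, and the typeless mapping syntax used by Elasticsearch 7 and later.

```python
import json


def build_destination_mapping():
    """Explicit mapping for the destination index (assumed field types)."""
    return {
        "mappings": {
            "properties": {
                "@timestamp": {"type": "date"},
                # keyword: stored as a single term, suited to exact
                # matches and aggregations
                "filename": {"type": "keyword"},
                "sampleinfo": {"type": "keyword"},
                # text: analyzed for full-text search
                "backupinfo": {"type": "text"},
            }
        }
    }


def build_requests(source_index, dest_index):
    """Return the two requests in order: create the index, then reindex."""
    create = ("PUT", "/" + dest_index, build_destination_mapping())
    reindex = (
        "POST",
        "/_reindex?wait_for_completion=false",
        {"source": {"index": source_index}, "dest": {"index": dest_index}},
    )
    return [create, reindex]


for method, path, body in build_requests("logs-v1", "logs-v2"):
    print(method, path)
    print(json.dumps(body, indent=2))
```

Whether `keyword` or `text` is the better fit depends on how each field is queried in Kibana, so the types above would need to be adjusted to the actual data.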
