Specify routing in Logstash


(Ashwin Tumma) #1

I want to store my data on different shards based on a field value; geo-sharding is what I am looking for. For example, all records with continent value 'NA' should go to shard 1 (North America), 'EU' should go to shard 2 (Europe), and so on.

I have learned that I can use the routing parameter when indexing to control which shard a document is routed to. Is there a way to do the same thing in Logstash?


(Ashwin Tumma) #2

I found a solution to the above query. Thanks to the literature! Posting my conf file here in case anyone else has the same issue:

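input {
  stdin {}
}
filter {
  # Parse each input line as CSV into the fields "ID" and "Continent"
  csv {
    separator => ","
    columns => ["ID", "Continent"]
  }
}
output {
  elasticsearch {
    protocol => "http"
    # Use the value of the Continent field as the routing value
    routing => "%{Continent}"
  }
  stdout {}
}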


(Christian Dahlqvist) #3

Routing ensures that all documents with the same routing ID are stored in the same shard, but as far as I can recall it does not guarantee that two different routing IDs get allocated to different shards. As your field seems to have fairly low cardinality, you run the risk of an uneven distribution of data between the shards. If the cardinality is very low and you really need to separate the data into different shards, it may be worthwhile to store the data in separate indices altogether, as this gives you greater control.
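
A minimal sketch of the separate-indices approach, assuming the same csv filter and Continent field as in the config from post #2. The "geo-" index prefix and the [@metadata][continent_lc] field name are made up for illustration. Elasticsearch index names must be lowercase, so the value is lowercased in a metadata copy, which leaves the stored Continent field unchanged. Two mutate blocks are used because add_field is only applied after a filter's own operations have run.

```
filter {
  csv {
    separator => ","
    columns => ["ID", "Continent"]
  }
  # Copy the continent into @metadata (not sent to Elasticsearch)
  mutate {
    add_field => { "[@metadata][continent_lc]" => "%{Continent}" }
  }
  # Lowercase the copy so it is valid as part of an index name
  mutate {
    lowercase => ["[@metadata][continent_lc]"]
  }
}
output {
  elasticsearch {
    # One index per continent, e.g. geo-na, geo-eu (hypothetical naming)
    index => "geo-%{[@metadata][continent_lc]}"
  }
}
```

With this layout each continent's data lives in its own index, so shard count and allocation can be tuned per continent instead of depending on how routing values happen to hash.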

