Info on Logstash 2.3.4 and Elasticsearch 2.1.0 compatibility

I am working with Logstash 2.3.4 and Elasticsearch 2.1.0 on a project where I am doing performance testing of indexing data into Elasticsearch through Logstash.

I am building a search feature over indexed data, for which I am indexing logfiles as documents.
We have around 82,000 logfiles, about 1.4 GB in total. Each logfile is indexed as one document, with the logfile contents stored as a string in a field.

Example of a document:

{
  "_index" : "global_test",
  "_type" : "logsearch",
  "_id" : "log1",
  "_score" : 1.0,
  "_source" : {
    "@timestamp" : "2016-08-12T07:26:35.571Z",
    "type" : "GLOBAL_LOG",
    "logdata" : "logfile contents go here",
    "logfile" : "logname here",
    "logfilepath" : "path/to/log/file"
  }
}
We tested the compression ratio for the full 1.4 GB of data; the results are below:

curl http://127.0.0.1:9200/_cat/indices?v

health status index pri rep docs.count docs.deleted store.size pri.store.size
yellow open globallogs_test 5 1 82443 0 715.5mb 715.5mb

As shown above, with Logstash 2.3.4 (the latest version) and Elasticsearch 2.1.0, 1.4 GB of data was compressed to 715.5 MB.

Last week I upgraded Elasticsearch to the latest version, 2.3.5, and ran the same test again with the same data. This time the 1.4 GB of data was compressed to 881.5 MB.

So I am planning to revert Elasticsearch to 2.1.0 to get back the better compression, i.e. 715.5 MB.

My question is: will there be any compatibility issues if we use Logstash 2.3.4 (latest) with Elasticsearch 2.1.0 (an older version)?

Also, why did the compression ratio change with the newer Elasticsearch version?

My config file, for reference:

input
{
  file
  {
    path => ["path/to/logs"]
    start_position => "beginning"
    sincedb_path => "/dev/null"
    type => "GLOBAL_LOG"
    max_open_files => 10000
    close_older => 300
    ignore_older => 0
  }
}

filter
{
  if [type] == "GLOBAL_LOG" {

    multiline {
      pattern => "/.*./gm"
      negate => true
      what => "previous"
    }

    ruby
    {
      code => "
        event['logfile']     = event['path'].split('/').last
        event['logfilepath'] = event['path'].strip
      "
    }

    mutate
    {
      add_field    => ["logdata", "%{message}"]
      remove_field => ["@version", "path", "host", "tags", "message"]
    }

  }
}

output
{
  if [type] == "GLOBAL_LOG"
  {
    stdout
    {
      codec => rubydebug
    }
    elasticsearch
    {
      template_name   => "template_name"
      manage_template => true
      template        => "/etc/logstash/mapping/template_name.json"
      hosts           => "127.0.0.1:9200"
      index           => "index_name"
      document_type   => "logsearch"
      document_id     => "%{[logfilepath]}"
    }
  }
}
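As a side note on the ruby filter above: its field derivation can be sketched in plain Ruby like this (the example path is hypothetical, just to show what the two fields end up containing):

```ruby
# Standalone sketch of what the ruby filter computes per event.
# The example path is hypothetical.
path = "path/to/log/file.log"

logfile     = path.split('/').last  # just the filename
logfilepath = path.strip            # full path, surrounding whitespace trimmed

puts logfile      # => "file.log"
puts logfilepath  # => "path/to/log/file.log"
```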

I would appreciate a quick response on this.

My question is: will there be any compatibility issues if we use Logstash 2.3.4 (latest) with Elasticsearch 2.1.0 (an older version)?

No. See https://www.elastic.co/support/matrix#show_compatibility.

So I am planning to revert Elasticsearch to 2.1.0 to get back the better compression, i.e. 715.5 MB.

Running an older ES version to save 165 MB of disk space doesn't make much sense to me.

Also, why did the compression ratio change with the newer Elasticsearch version?

Different Lucene, probably.
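One thing worth ruling out before comparing store sizes across versions: the number of Lucene segments at measurement time affects the reported size, so force-merging both indices down to a single segment before reading _cat/indices gives a more comparable number. A sketch, using the _forcemerge endpoint available as of ES 2.1 (index name taken from your output above):

```shell
# Merge to a single segment so store.size reflects the codec's final
# compression rather than transient per-segment overhead.
curl -XPOST 'http://127.0.0.1:9200/globallogs_test/_forcemerge?max_num_segments=1'

# Then re-check the size.
curl 'http://127.0.0.1:9200/_cat/indices?v'
```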

Thanks Magnus for the quick response.

Running an older ES version to save 165 MB of disk space doesn't make much sense to me.

The 1.4 GB is only sample data. In a production environment it might go well beyond 100 GB, in which case my thinking is that ES 2.1.0 would save even more disk space.

100 GB, 1 TB, whatever. I still don't think running an old ES release to save 10% disk space is a very good idea.

Are you using best_compression for your indices?

Yes, I do. My elasticsearch.yml contains:

index.codec: best_compression
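For what it's worth, since the config already manages an index template at /etc/logstash/mapping/template_name.json, the codec can also be set per index inside that template rather than globally in elasticsearch.yml. A sketch; the template pattern here is an assumption:

```json
{
  "template": "globallogs_*",
  "settings": {
    "index.codec": "best_compression"
  }
}
```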