Why should I upgrade to 5.0?

We have a large installation of ES 1.6.0 and are happy with it. Are there any benefits I would get from upgrading to 5.0?

Write performance is nicer.
Aggregations are much safer.
Painless, the new scripting language, is enabled by default because it is actually safe.
The indices for numerics are much nicer.
If you ask for help we might remember the code. 1.6 is rather distant for those of us in the code every day.

The big bump is that you have to reindex to go from 1.x to 5.0. You can use reindex-from-remote to "skip" a version, but you have to have a 5.0 cluster to reindex into.
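
To make the reindex-from-remote step a bit more concrete, here is a rough Python sketch that only builds the request body you would POST to `_reindex` on the new 5.0 cluster. The host name and index names are made up for illustration, and as far as I know the old cluster's address also has to be listed under `reindex.remote.whitelist` in the 5.0 cluster's `elasticsearch.yml` before this is accepted.

```python
import json


def build_reindex_from_remote(remote_host, source_index, dest_index):
    """Build the JSON body for POST _reindex on the 5.0 (destination) cluster.

    remote_host is the HTTP address of the old cluster. All values passed in
    below are hypothetical placeholders, not names from this thread.
    """
    return {
        "source": {
            "remote": {"host": remote_host},  # where to pull documents from
            "index": source_index,            # index on the old cluster
        },
        "dest": {"index": dest_index},        # index on the 5.0 cluster
    }


# Hypothetical old 1.x cluster address and index names.
body = build_reindex_from_remote("http://old-cluster:9200", "logs-2016", "logs-2016")
print(json.dumps(body, indent=2))
```

You would send that body with whatever HTTP client you already use. It is worth creating the destination index with the mappings you want first, since reindexing does not copy settings or mappings for you.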

If you're on 1.6.0, you're susceptible to both a remote code execution vulnerability and a directory traversal attack.

Regarding new features and improved performance, there are a boatload between 1.6 and 5.0. A few more beyond what Nik mentioned:

  • The ability to reindex documents inside of Elasticsearch (to get the latest performance benefits or to change the mappings)
  • The ability to profile the performance of queries and aggregations
  • A whole new class of aggregations called pipeline aggregations, so you can do derivatives, moving averages, and other processing across your data directly inside of Elasticsearch
  • An entirely new type of node called an ingest node that allows you to transform data directly inside of the Elasticsearch cluster
  • Better compression options (best_compression)
  • The ability to rollover indexes and shrink shard counts for time-series use cases
  • Lower heap usage
  • Built-in support for IPv6 datatypes
  • Faster/better handling of geo data
  • Way too many resiliency improvements to list

And many more
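
As a small illustration of the pipeline aggregations mentioned in the list above, here is a hedged Python sketch that builds a search body with a `derivative` nested inside a `date_histogram`. The field names (`timestamp`, `bytes`) and the aggregation names are hypothetical, so treat this as a sketch of the shape of the request rather than something tuned to your data.

```python
import json


def build_derivative_agg(date_field, value_field, interval="day"):
    """Build a search body that sums value_field per time bucket and then
    computes the bucket-to-bucket change with a derivative pipeline aggregation.

    Field and aggregation names are illustrative assumptions.
    """
    return {
        "size": 0,  # only aggregation results are needed, not hits
        "aggs": {
            "per_interval": {
                "date_histogram": {"field": date_field, "interval": interval},
                "aggs": {
                    # metric computed for each bucket
                    "total": {"sum": {"field": value_field}},
                    # pipeline aggregation reading the sibling metric above
                    "change": {"derivative": {"buckets_path": "total"}},
                },
            }
        },
    }


search_body = build_derivative_agg("timestamp", "bytes")
print(json.dumps(search_body, indent=2))
```

The point of the design is that the derivative is computed from the output of another aggregation inside Elasticsearch, so the client does not have to post-process the buckets itself.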

Thanks for the feedback, Nik & Shane! Are these benefits worth taking the pain of upgrading?

Shane, though we have a large installation of ES, we do not use many of the above features (except compression). Can you please point me to a document for each version that lists the features added and bugs fixed in that version? That will help me make the case for upgrading to my management.

This is probably going to seem daunting due to the size, but the 5.0.0 release notes are at https://www.elastic.co/guide/en/elasticsearch/reference/current/release-notes-5.0.0.html and release notes for all of the 2.x versions are at https://www.elastic.co/guide/en/elasticsearch/reference/2.4/es-release-notes.html . The size of these release notes should be an indication of how much further things have come in Elasticsearch since the 1.x versions. We do wrap up some of the benefits in our release blog posts, which you can find at the following locations

Regarding "are the benefits worth the pain / we don't use (m)any of these features," only you can decide the value that these may provide. I work for Elastic, so I'm inherently biased towards answering "of course it's worth upgrading! Look at all the great things that are there!" :)

But in all seriousness, there are likely at least a few compelling reasons to upgrade. I'll leave it to you to decide if any of these are "worth it" to you:

  • You'll obviously never use a feature that you don't have access to, so saying "we don't use (m)any of these features" may be at least partially because they're not available to your developers/users. You may be using other datastores that you could simplify architectures on by using a later version of Elasticsearch instead, for example. We've seen many users get a lot of additional value through these features and deploy Elasticsearch in entirely new use cases as a result.
  • Resiliency is a hard thing to quantify, but if you're running 1.6 we can pretty authoritatively say that you stand a significantly increased risk of losing data at some point. See all of the items listed as 2.0-5.0 on our resiliency page for examples. Whether losing data in Elasticsearch is a big deal to you or not obviously depends on your use case(s) and what it takes to recover that data. Some users have sophisticated methods to avoid some of these scenarios, but some are quite difficult to avoid, and you may be setting yourself up for a 3am page about a production outage, for example. What that means to your business is obviously up to you. 5.0 is so much better at resiliency than 1.6 that you or your management might even decide to use it for other purposes if you had the latest/greatest. If you're running a "large installation," I'd think that this reason alone may be very compelling.
  • With all of the performance benefits -- especially if you're storing numbers (including dates, metrics, IP addresses, or geo locations) though generally across the board -- you could benefit in reduced storage costs, CPU usage, and memory usage.
  • If your data is time-based, you could use the _rollover and/or _shrink APIs in 5.0 to optimize your shard utilization, which can yield improved performance and lower heap usage.
  • If you're at all concerned with security, 2.0 is significantly better and 5.0 is significantly better than that. This means lower chances of somebody using Elasticsearch as an attack vector for the other machines in your corporation, for example.
  • 1.6.0 will reach end-of-life shortly. It's generally not good to be running end-of-life software.
  • I'll add a bit of a plug for our company here as well: with 5.0, we have a nice set of features in X-Pack. It plugs in nicely to provide a variety of additional features, including security, alerting, monitoring, and graph functionality.
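
For the time-based bullet above, a rollover call could look roughly like the following Python sketch, which just assembles the method, path, and body. The alias name `logs_write` and the thresholds are assumptions for illustration; the alias has to point at exactly one write index for a rollover to work.

```python
import json


def build_rollover_request(alias, max_age="7d", max_docs=1000000):
    """Return (method, path, body) for a _rollover call against a write alias.

    Elasticsearch creates a new index and moves the alias when any one of the
    conditions is met. Alias name and thresholds here are hypothetical.
    """
    path = "/{}/_rollover".format(alias)
    body = {"conditions": {"max_age": max_age, "max_docs": max_docs}}
    return "POST", path, body


method, path, rollover_body = build_rollover_request("logs_write")
print(method, path)
print(json.dumps(rollover_body, indent=2))
```

Rollover only acts when the API is called, so in practice you would run this on a schedule (cron or similar) rather than once.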

If you're running a large installation in a company, you'll ultimately need to consider how important Elasticsearch is to your business and what would happen if it went away all of a sudden due to resiliency issues, or if somebody used it as an attack vector against other systems you own. That may be some indication as to whether you'd want to upgrade or not. New features that let you simplify business or technical processes, or get additional benefit out of them, may be another reason. The performance benefits may be yet another.

How much these mean to you is ultimately going to be up to you, but I do hope you decide it's worth it!

Thank you, Shane! This is really detailed information, and I can take it from here to do my own assessment. You made my life easy.