Special characters in Elasticsearch _id from Logstash


(Charlie) #1

Hello.
I am using Logstash to load data into Elasticsearch.
I have chosen to use the default "_id".

One of my tools only accepts characters in the ranges a-z, A-Z, 0-9, and the hyphen (-).
Sometimes the _id looks like this:

1alyPWYB9XD0VZRJa9J_

Is there a way to control the selection of characters used to generate this _id?
Otherwise, I will have to rewrite the IDs or patch them somewhere in between.


(Christian Dahlqvist) #2

That sounds like an unusual limitation. Elasticsearch's autogenerated IDs use a URL-safe form of base64 encoding, which includes the characters - and _ because they are needed, in addition to the standard alphanumeric characters, to reach 64 characters. You cannot change this, which means that you may need to generate your own IDs.
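
As a rough illustration (this is not Elasticsearch's actual ID generator), the URL-safe base64 alphabet can be compared against the character set the tool accepts. The sketch assumes the allowed set is exactly a-z, A-Z, 0-9 and -, as described in the first post; the names ALLOWED, URLSAFE_B64 and rejected_chars are made up for this example.

```python
import string

# Characters the downstream tool accepts, per the first post: a-z, A-Z, 0-9, -
ALLOWED = set(string.ascii_letters + string.digits + "-")

# URL-safe base64 alphabet (RFC 4648, section 5): alphanumerics plus '-' and '_'
URLSAFE_B64 = set(string.ascii_letters + string.digits + "-_")


def rejected_chars(doc_id: str) -> set:
    """Return the characters in doc_id that the tool would not accept."""
    return set(doc_id) - ALLOWED


print(sorted(URLSAFE_B64 - ALLOWED))            # ['_']
print(rejected_chars("1alyPWYB9XD0VZRJa9J_"))   # {'_'}
```

Under that assumption, the underscore is the only character in the alphabet that the tool would reject, which is why only some autogenerated IDs cause trouble.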


(Charlie) #3

Thank you. As I thought, it is not possible directly.

Using the fingerprint option to generate something else adds too much computational overhead.

Side note.
The application that imposes the limitation uses SOLR.
I wonder now whether that limitation comes from its index digestion capability. But ES is based on Lucene, the same as SOLR.


(Christian Dahlqvist) #4

Have you tried generating a UUID through the fingerprint filter plugin? This should be computationally less expensive than the cryptographic hashes and would meet your format requirements.
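
To see why a UUID fits the stated character range, here is a small Python sketch, separate from Logstash itself. It assumes a random (version 4) UUID in its canonical textual form; the helper name is_tool_safe is hypothetical.

```python
import re
import uuid

# Characters the tool accepts, per the first post: a-z, A-Z, 0-9, -
TOOL_SAFE = re.compile(r"[A-Za-z0-9-]+")


def is_tool_safe(doc_id: str) -> bool:
    """True if doc_id contains only characters the tool accepts."""
    return TOOL_SAFE.fullmatch(doc_id) is not None


# Canonical UUID text is 32 hex digits in five hyphen-separated groups.
candidate = str(uuid.uuid4())
print(candidate, is_tool_safe(candidate))
print(is_tool_safe("1alyPWYB9XD0VZRJa9J_"))  # False, because of the '_'
```

One trade-off to keep in mind: a random UUID is not derived from the event content, so loading the same event twice would produce two different IDs rather than overwriting the first document.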


(Charlie) #5

That is also a good idea, thank you.

