Document version number not advancing

dorj1234 · January 17, 2018, 1:51pm

Hello,
I just tested how versioning works by loading the same set of CSV files again, with the same loading script and everything else unchanged.

Looking at the Discover section of Kibana, I expected to see the @version number in my documents advance from "1" to "2", but that did not happen.

What could be the cause?

val · January 17, 2018, 1:55pm

How do you perform your updates? In bulk? Can you show your script?

Note that if you use the Update API and nothing changes in your document, the version is not increased (i.e. it is a noop operation).

dorj1234 · January 17, 2018, 3:21pm

Good point - I am using Logstash to re-ingest the same CSV files.
I have fingerprinted the records, so the next time I run Logstash it should generate the same document ID for the same record.

Are you saying that if the record data is identical then no new version will be created? That could be the explanation. I will test it later today and update the thread.

val · January 17, 2018, 3:25pm

If you use the Index API, then the version will increase as the Index API does not retrieve the document to perform a diff. However, if you use the Update API and the document hasn't changed, then the version won't be bumped (i.e. noop operation)
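To make the difference concrete, here is a minimal sketch of the behavior described above. It is a toy in-memory model, not the Elasticsearch client or its real implementation; the class `ToyIndex` and its method names are hypothetical and exist only for illustration.

```python
# Toy in-memory model of document versioning.
# NOT the Elasticsearch client; ToyIndex is a hypothetical name for illustration.
class ToyIndex:
    def __init__(self):
        self.docs = {}  # doc_id -> (version, source dict)

    def index(self, doc_id, source):
        """Index-style write: replaces the document and always bumps the version."""
        version = self.docs[doc_id][0] + 1 if doc_id in self.docs else 1
        self.docs[doc_id] = (version, dict(source))
        return version

    def update(self, doc_id, partial):
        """Update-style write: merges fields and skips the write if nothing changes."""
        version, source = self.docs[doc_id]
        merged = {**source, **partial}
        if merged == source:
            return version  # noop: the version is left alone
        self.docs[doc_id] = (version + 1, merged)
        return version + 1


store = ToyIndex()
row = {"column_a": "x", "run_date": "2017-10-25"}

v1 = store.index("abc", row)                     # first write, new document
v2 = store.index("abc", row)                     # same ID, identical data
v3 = store.update("abc", row)                    # update with identical data
v4 = store.update("abc", {"other": "changed"})   # update that changes a field
v5 = store.index("def", row)                     # different ID, new document

print(v1, v2, v3, v4, v5)  # prints: 1 2 2 3 1
```

Re-indexing under the same ID bumps the version even when the data is identical, an unchanged update does not, and a different ID starts a new document at version 1. Separately, as far as I know the @version field that Logstash adds to each event is not the same thing as Elasticsearch's _version document metadata, so it may be worth checking which of the two you are looking at in Kibana.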

dorj1234 · January 17, 2018, 6:45pm

I use neither directly. Indexing is done via the Logstash CSV plugin. What does it do in the background? I think it makes Index API calls.

So far versioning does not work as expected.

I have many columns in my CSV.
1. I fingerprint 2 of the columns and index the CSV file using Logstash.
2. I change one of the columns that does not participate in the fingerprint and re-run Logstash.

The result is new documents instead of incrementing the version of the existing documents.

val · January 17, 2018, 9:18pm

This is because the ID is different on each indexing run, so the existing document is not updated and a new one is created instead. What configuration do you have for the document_id setting in the elasticsearch output?

dorj1234 · January 17, 2018, 10:25pm

I use the fingerprint result as the document ID:
document_id => "%{[@metadata][fingerprint]}"

The fingerprint filter is set up as:
fingerprint {
  method => "SHA1"
  source => [ "column_a", "run_date" ]
  concatenate_sources => true
  target => "[@metadata][fingerprint]"
  key => "SOMEKEY"
}

val · January 18, 2018, 4:05am

OK, then I guess column_a is constant, but where does run_date come from?

dorj1234 · January 18, 2018, 4:50am

run_date is also a constant (example: 2017-10-25); you can call it column_b. Once run_date is in the CSV file it never changes.

When I combine column_a+column_b (run_date) using fingerprint and use the hash as document_id, then the next time I index the same record I expect the same hash ==> the same document ID, because nothing has changed (except maybe the timestamp).
Neither column_a nor column_b changes between runs.
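
The expectation above can be checked outside Logstash with a rough sketch of a keyed SHA1 hash over the two source columns. The helper `fingerprint` and the way the fields are joined are assumptions made for illustration; the fingerprint filter's exact byte layout may differ, so the hashes will not necessarily match what Logstash produces. The sample values and `column_c` are made up.

```python
import hashlib
import hmac


def fingerprint(row, sources, key):
    """Keyed SHA1 over the selected columns (hypothetical helper)."""
    # Assumed concatenation format, for illustration only.
    joined = "|".join(f"{name}|{row[name]}" for name in sorted(sources))
    return hmac.new(key.encode(), joined.encode(), hashlib.sha1).hexdigest()


sources = ["column_a", "run_date"]

first = {"column_a": "A1", "run_date": "2017-10-25", "column_c": "old"}
# Same record re-ingested with a change in a column outside the fingerprint.
second = {"column_a": "A1", "run_date": "2017-10-25", "column_c": "new"}
# Same record, but run_date picked up a trailing space during parsing.
third = {"column_a": "A1", "run_date": "2017-10-25 ", "column_c": "old"}

id1 = fingerprint(first, sources, "SOMEKEY")
id2 = fingerprint(second, sources, "SOMEKEY")
id3 = fingerprint(third, sources, "SOMEKEY")

print(id1 == id2)  # True: columns outside the fingerprint do not affect the ID
print(id1 == id3)  # False: any difference in a source column changes the ID
```

So if new documents appear on every run, one thing worth checking is whether the values of the source fields really are byte-for-byte identical between runs by the time they reach the fingerprint filter.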

system · February 15, 2018, 4:50am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
