Question about nested data

Mirakuru · July 3, 2020, 3:23pm

I'm trying to read a 100k-line file into Elasticsearch with Logstash.

I have a .txt file whose lines look like this: userid;action.

I want to use the userids as unique keys and import every matching action under each one, but there are duplicate actions, which I don't want.

Badger · July 3, 2020, 3:37pm

Set the document_id based on the userid (either as-is, or using a fingerprint filter) and then duplicate actions for the same userid will be overwritten.
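
A minimal sketch of that setup, assuming each line is split with a dissect filter and that Elasticsearch runs locally; the host and the index name `user-actions` are placeholders, not something from your setup:

```
filter {
  # Split "userid;action" into two fields (assumes exactly one ';' per line)
  dissect {
    mapping => { "message" => "%{userid};%{action}" }
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]     # assumed local instance
    index => "user-actions"         # hypothetical index name
    # Same userid => same _id, so a later event replaces the earlier document
    document_id => "%{userid}"
  }
}
```

With this, each userid ends up as exactly one document, holding whichever action was indexed last.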

Mirakuru · July 3, 2020, 3:42pm

Yes, but I also want to keep new actions nested under the userid. With what you're suggesting, wouldn't it overwrite the action?

What I need is:

add
userid = 3
action = walk

If new data matches userid 3 and the new action is unique, add it to the action list, etc.:
action = walk, talk

Badger · July 3, 2020, 3:47pm

In that case use an aggregate filter to collect all the unique actions for a userid (using a Ruby set, perhaps). If the input is not sorted, it would be like example 3 in the aggregate filter documentation; if it is sorted by userid, then example 4. Note that this means Logstash will be holding the entire input file in memory until the timeout triggers.
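
A rough sketch of the unsorted case, assuming the line has already been split into `userid` and `action` fields. It uses a plain array with Ruby's union operator instead of a Set so the result serializes as an ordinary list; the `actions` field name and the 120-second timeout are arbitrary choices for illustration:

```
filter {
  aggregate {
    task_id => "%{userid}"
    code => "
      # Array union only appends the action if it is not already present
      map['actions'] ||= []
      map['actions'] |= [event.get('action')]
      # Drop the per-line event; only the aggregated map is emitted
      event.cancel
    "
    # On timeout, emit one event per userid containing the collected actions
    push_map_as_event_on_timeout => true
    timeout_task_id_field => "userid"
    timeout => 120                  # seconds; assumed value, tune for the file
  }
}
```

The aggregate filter needs the pipeline to run with a single worker (`pipeline.workers: 1`), otherwise events for the same userid may be handled out of order. The emitted events can then be indexed with document_id set to the userid.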

Mirakuru · July 3, 2020, 3:48pm

Would using PHP with the Elasticsearch module, instead of Logstash, perform well for this?

Badger · July 3, 2020, 4:00pm

I could not say.

