Best way to add a field as "sessionize" of another field+date


(Thomas Decaux) #1

Hello guys, in my pocket I have:

  • 3 data-nodes
  • 7 daily indices
  • 12,000,000 docs/day (about 20 GB/day)

Documents are:

{"date" : "01/10/2002 10:20", "user_uid": "0032840238-3049309" ....}

I would like to scan all documents for a given day and replace user_uid with an integer (like a hash: the same user_uid always maps to the same integer; I don't know the name of this process).
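One way to picture the mapping described above is a hash of the day plus user_uid, truncated to an integer. This is only a minimal sketch: the function name daily_user_id, the choice of SHA-256, and the 63-bit width are assumptions for illustration, and the day is taken as the part of the date string before the space.

```python
import hashlib


def daily_user_id(user_uid: str, day: str, bits: int = 63) -> int:
    """Map (user_uid, day) to a non-negative integer.

    The same user_uid on the same day always yields the same value;
    a different day yields an unrelated value.
    """
    digest = hashlib.sha256(f"{day}|{user_uid}".encode("utf-8")).digest()
    # Keep the first 8 bytes and mask to `bits` bits so the result fits a signed long.
    return int.from_bytes(digest[:8], "big") & ((1 << bits) - 1)


doc = {"date": "01/10/2002 10:20", "user_uid": "0032840238-3049309"}
day = doc["date"].split(" ")[0]  # "01/10/2002", no date parsing needed

first = daily_user_id(doc["user_uid"], day)
again = daily_user_id(doc["user_uid"], day)
next_day = daily_user_id(doc["user_uid"], "02/10/2002")
```

Note that anyone who knows a user_uid and the day can recompute the integer; if that matters, a secret value that changes every day would have to be mixed into the hash input as well.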

What is the best and cleanest way to achieve this with ES 2.4?

Thank you,


(Christoph) #2

Hi,

Just a few questions:

  • Do you want to discard the original field or would it be okay to also keep the original value?
  • Why do you need to convert it? After all, "user_uid" already sounds like it is unique.
  • What's your current indexing process? Do you go through Logstash, and if so, why not do the transformation there?

(Thomas Decaux) #3

Hello,

  • I must discard it at the end
  • user_uid is a string derived from HASH(IP); my company asks me not to store that, but rather something that is reset daily
  • I have Logstash in front, but I don't have full access to it, whereas I have full access to Elasticsearch

Just curious whether it's possible via the reindex API or the update API. Otherwise, I am using a small Java script for that, but it's slow and consumes CPU and network :confused:
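For the reindex route, a request body along these lines might work. It is an untested sketch, not a confirmed recipe: it assumes inline Groovy scripting is enabled on the cluster, the index names and the new field name user_id are hypothetical, and String.hashCode() gives only a 32-bit value, so collisions are possible with many distinct users. The Python below only builds and prints the JSON body that would be sent to POST _reindex.

```python
import json


def build_reindex_body(source_index: str, dest_index: str) -> dict:
    """Build a _reindex body that swaps user_uid for a per-day integer."""
    # Groovy script (assumed to be allowed as an inline script):
    # take the day part of the date, hash day + user_uid, drop the original field.
    script = (
        "def day = ctx._source.date.split(' ')[0]; "
        "ctx._source.user_id = (day + '|' + ctx._source.user_uid).hashCode(); "
        "ctx._source.remove('user_uid')"
    )
    return {
        "source": {"index": source_index},
        "dest": {"index": dest_index},
        "script": {"inline": script},
    }


# Hypothetical index names for one daily index and its rewritten copy.
body = build_reindex_body("logs-2002.10.01", "logs-2002.10.01-anon")
print(json.dumps(body, indent=2))
```

The work then happens inside the cluster, so the documents do not travel to a client and back, which is the network cost mentioned above.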

