Is there any way to set a default value for a particular field when defining the mapping for an index? For example, I have a log that writes 10 lines (not necessarily in order), each identified by a request ID. Say doc 1, with request ID 1, is written at lines 1, 2, 3, 4, 8 and 9, and doc 2, with its own request ID, is written at lines 5, 6, 7 and 10 of the log file. (I use the request ID as the document_id so that all lines for the same request are appended/updated into one record.)
As soon as I get a "started" message for a request, I mark the status as "running" in my Logstash parsing, and when I get a log line containing "completed", it updates the status field to "finished". So "running" is set when I see the start, and "finished" is set when I see the completion.
But the requests are asynchronous, so sometimes ES receives the "completed" message before the "start" message when indexing. As a result, even though the request has completed, it ends up tagged as "running". So can I set something like status = running as a default when creating the index/mapping/template?
In that case, if no value is specified when indexing the record, the field would not be updated but would keep the default value "running", and whenever I see "completed" I would set it to "finished".
I can say that mappings cannot set default values, but I can't comment on the rest of what you're asking.
Why not store all of the states instead of overwriting the field? That way you keep every state the request went through:
field: [ "Start",
"middle",
"end"
]
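To make the idea concrete, here is a minimal Python sketch of keeping every state and deriving the current status from the whole history, so the order in which the lines arrive no longer matters. The field name `states`, the rank table, and the "running" fallback are assumptions made for illustration; this is not an Elasticsearch or Logstash feature, only the logic you would need to reproduce in your pipeline or at query time.

```python
# Hypothetical ranking: a later lifecycle stage always wins over an earlier one.
STATE_RANK = {"running": 0, "finished": 1}

def add_state(doc, state):
    """Append a state to the document's history instead of overwriting it."""
    states = doc.setdefault("states", [])
    if state not in states:
        states.append(state)
    return doc

def current_status(doc):
    """Derive the status from the full history, ignoring arrival order."""
    states = doc.get("states", [])
    if not states:
        return "running"  # assumed fallback when no state has been seen yet
    return max(states, key=STATE_RANK.__getitem__)

# The "completed" line is indexed before the "started" line for the same request.
doc = {"request_id": "1"}
add_state(doc, "finished")
add_state(doc, "running")
print(current_status(doc))
```

Because the status is computed from the set of states rather than from the last write, a late "started" line cannot push a finished request back to "running".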
Another option is not to update the existing document at all: index each state as a separate document, and when you query for the current state, sort the data or filter by the values you want. That would be more in line with how Elasticsearch behaves.
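As a rough sketch of that second option, the Python below treats each log line as its own event document and picks the latest state per request by sorting on a timestamp. The field names `request_id`, `state` and `ts` are hypothetical, and `ts` is assumed to be the time written in the log line itself rather than the time the line was indexed.

```python
from collections import Counter

# One event document per log line (field names are illustrative assumptions).
events = [
    {"request_id": "1", "state": "finished", "ts": 30},  # indexed first
    {"request_id": "1", "state": "running", "ts": 10},   # indexed later
    {"request_id": "2", "state": "running", "ts": 20},
]

def latest_state_per_request(events):
    """Return the state of the most recent event for each request id."""
    latest = {}
    for ev in events:
        seen = latest.get(ev["request_id"])
        if seen is None or ev["ts"] > seen["ts"]:
            latest[ev["request_id"]] = ev
    return {rid: ev["state"] for rid, ev in latest.items()}

states = latest_state_per_request(events)
counts = Counter(states.values())  # what a chart of current states would show
print(states, dict(counts))
```

Since the comparison uses the log timestamp, the arrival order at indexing time does not affect the result.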
The main motive for updating the docs is to show the final state, so that I can build a pie chart (or any chart) of how many requests are running, completed, or in error.
If I store a separate field for each status, like
isCompleted : completed
isRunning : running
then I am not sure how to show a pie chart with running and completed as slices, since those will be two different fields.
If each update were its own record, a pie chart would show total started and total completed, but it would not tell you which requests are which.
If you update the record with isCompleted and isRunning fields, you could build your pie chart with filters rather than a terms aggregation.
So instead of
Terms
Field: state
Size: 0
use
Filters
"isCompleted:true AND NOT isRunning:true" (Label Completed)
"isRunning:true" (Label Running)
This would give you 2 different slices, running and completed (add error for a third slice)
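Here is a small Python sketch of how filter-based slices could be counted over upserted records. It is a variant of the filter strings above, not a copy of them: it assumes the flags are only ever set to true and never cleared, so a finished request carries both flags, and "Completed" therefore takes precedence while "Running" excludes completed requests. The sample documents and field values are made up for illustration.

```python
# Upserted records, one per request id (sample data is hypothetical).
docs = [
    {"request_id": "1", "isRunning": True, "isCompleted": True},
    {"request_id": "2", "isRunning": True},
    {"request_id": "3", "isCompleted": True},  # "started" line not indexed yet
]

# One predicate per pie slice, mirroring one filter per slice in the chart.
SLICES = {
    "Completed": lambda d: d.get("isCompleted", False),
    "Running": lambda d: d.get("isRunning", False) and not d.get("isCompleted", False),
}

def slice_counts(docs):
    """Count how many documents fall into each labelled slice."""
    return {label: sum(1 for d in docs if pred(d)) for label, pred in SLICES.items()}

counts = slice_counts(docs)
print(counts)
```

With this precedence, a request whose "completed" line arrives before its "started" line still lands in the Completed slice once both flags are present.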
I guess we can bat this idea around a couple of ways. I am not sure what process you use to insert the data into Elasticsearch, so it is difficult to advise you on the best way to treat your data.
Each upsert is not a new record; I am updating the existing record, because I get 10-12 lines per request, and they are not contiguous but share a requestId.
So far, using filters instead of terms in the pie chart works for me.