First of all, I'm a complete newbie to Elasticsearch and related tools. Although I have been programming for 30 years, this is my first project using Elasticsearch, so sorry if this is not the kind of question I should be asking here.
So, the thing is:
- I am planning to use Elasticsearch to store a sequence of events representing activity in my application.
- I have designed an object structure in which I will store several types of events, similar but different, in the same collection of documents.
- Each document will have at least the following data:
{
  "id": [a sha1 Id],
  "class": [the type of event being stored],
  "version": [version of the structure of data for that event class],
  [...]
}
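To make the structure concrete, here is a minimal sketch in Python (used only for illustration). The helper name `build_event`, the `url` payload field, and the choice to hash the serialized content for the id are all assumptions; the only fixed parts are the three keys "id", "class" and "version".

```python
import hashlib
import json


def build_event(event_class: str, version: int, payload: dict) -> dict:
    """Build one event document carrying the three mandatory keys.

    Assumption: the id is the SHA-1 of the serialized content. The real
    scheme only requires "a sha1 Id", not any particular input to hash.
    """
    # Mandatory keys are written last so a payload cannot overwrite them.
    body = {**payload, "class": event_class, "version": version}
    # sort_keys makes the serialization, and therefore the id, deterministic.
    serialized = json.dumps(body, sort_keys=True).encode("utf-8")
    return {**body, "id": hashlib.sha1(serialized).hexdigest()}


# Hypothetical event: the "url" field is invented for the example.
event = build_event("http.page.consumption", 1, {"url": "/home"})
print(json.dumps(event, indent=2))
```

Every event type would share these keys, and only the rest of the payload would differ per class and version.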
My question is about the "type of event" to be stored in the key "class". Let me show a partial drill-down in the event type hierarchy:
- I am planning to store http events (users calling my web controllers) and console events (users invoking CLI application calls).
- For http I will have normal pages and API calls. Something like http.page and http.api, for example.
For pages I will record one event when the controller is creating the page and another event when the page is being rendered, which I will detect by loading an event tracker URL via an AJAX call. Something like http.page.production and http.page.consumption. For example, my PHP class for the latter is named HttpPageConsumptionEvent.
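The two spellings are mechanically interchangeable, which a small sketch can show (Python for illustration; `class_to_dotted` and `dotted_to_class` are hypothetical helpers, and the sketch assumes every word in the class name is plain CamelCase with no acronyms such as "HTTP"):

```python
import re


def class_to_dotted(class_name: str, suffix: str = "Event") -> str:
    """Turn e.g. HttpPageConsumptionEvent into http.page.consumption."""
    # Drop the trailing "Event" suffix if present.
    if class_name.endswith(suffix):
        class_name = class_name[: -len(suffix)]
    # Split on CamelCase boundaries: each word starts with a capital letter.
    words = re.findall(r"[A-Z][a-z0-9]*", class_name)
    return ".".join(word.lower() for word in words)


def dotted_to_class(dotted: str, suffix: str = "Event") -> str:
    """Turn e.g. http.page.consumption back into HttpPageConsumptionEvent."""
    return "".join(part.capitalize() for part in dotted.split(".")) + suffix


print(class_to_dotted("HttpPageConsumptionEvent"))
print(dotted_to_class("http.page.consumption"))
```

So whichever form ends up stored in "class", the other one can be derived from it in code; the open point is only which form is more convenient on the Elasticsearch side.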
In addition I have "contextualized" events, which represent the previous ones but with added data representing "who tells what". For example, if event 1 says "user logged in at date XXX", and an importer running 1 year later is transforming data, it may say "the importer is saying at date YYY that event 1 said that 'user logged in at date XXX'".
Those classes in my PHP code carry the ContextualizedEvent suffix, for example HttpPageConsumptionContextualizedEvent.
So here are my questions:
Q1) I am wondering if I should name the events in some structured way, separated by dots, like for example http.page.consumption, versus something compact like HttpPageConsumption. Will those dots help me in later analysis of the data, for example to place things into buckets? Or does it not matter at all?
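To make Q1 concrete, this is roughly the kind of request I imagine sending later, sketched as a plain Python dict. It combines a `prefix` query with a `terms` aggregation from the Elasticsearch query DSL. The helper name `class_subtree_request` and the aggregation name `events_by_class` are made up, and the sketch assumes "class" is mapped as a `keyword` field (with default dynamic mapping the field to use would probably be `class.keyword` instead).

```python
import json


def class_subtree_request(prefix: str, field: str = "class") -> dict:
    """Build a search body that selects one branch of the event hierarchy
    and counts documents per exact class value.

    Assumption: `field` is a keyword field, so values are matched as whole
    strings rather than being analyzed into separate tokens.
    """
    return {
        "size": 0,  # only the buckets are wanted, not the matching documents
        "query": {"prefix": {field: {"value": prefix}}},
        "aggs": {"events_by_class": {"terms": {"field": field}}},
    }


# Everything under http.page (production, consumption, ...).
request = class_subtree_request("http.page.")
print(json.dumps(request, indent=2))
```

With the compact spelling the equivalent prefix would be HttpPage, which also matches as a plain string prefix, so what I cannot judge is whether the dots buy anything beyond readability.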
Q2) I am wondering if I should name the events in a way that feels more natural, adding the word "event" at the end, as in http.page.consumption.event, or whether it is better not to add any word at all, as in http.page.consumption.
Q3) In case I add the word "event" to the value of the "class" key, I wonder whether it can go anywhere, for example at the end, as in http.page.consumption.event, or whether it should work like a namespace for any search-related reason or anything related to grouping or aggregating, as in event.http.page.consumption, so that event.http.page.consumption is kept apart from event.http.api..... and from event.console......, in a way that acts like a namespace structure nesting sub-concepts after the grouping words.
Q4) I wonder if I may store all the events in the same collection of documents, or whether it is better to put the http events in one place and the console events in another, or even the http.page events in one place and the http.api events in another. Or is this completely transparent from a technical point of view, so that neither solution will limit me when asking for complex data analysis (through Kibana, for example)?
Q5) When it comes to contextualized events the same problem arises, with one more word added: HttpPageConsumptionContextualizedEvent, or event.contextualized.http.page.consumption, or contextualized.event.http.page.consumption, or contextualizedEvent.http.page.consumption...
You will probably feel this is a stupid question, but I tend to be very purist when programming and I would love to kick off with a naming scheme that does not limit me later when querying.
Thanks in advance. Any tips are really very welcome.