I have a number of different servers (PHP, Java, ...). The servers use different user-tracking cookies, and all of them sit behind nginx.
I already have an access.log with all the cookies I need, and all of it is fed into ES with Logstash.
Now I want to analyze the logs to get metrics: unique visitors per page, unique user agents per page, and so on. In an RDBMS I would create a view with a complex SELECT to distinguish visitors. I'm not sure this is the right approach in ES.
Should I calculate these metrics outside ES, or is there a good way to do it in ES?
It seems I didn't express all the difficulties I see.
Each server uses different cookies to track users (PHPSESSID, JSESSIONID, metrics cookies, and so on). Many users don't like being tracked and have cookies disabled.
So each access.log line has all the common fields plus a number of cookies I intend to use for tracking.
To distinguish a visitor, I check the "main" cookie, say _ym_id (from Yandex Metrica); if it is not set, then BX_USER_ID; if that is not set, then PHPSESSID; and lastly the IP together with the user agent.
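To make the fallback order concrete, here is a minimal sketch in Python of what I mean. The function name `visitor_key`, the `name:value` prefixing, and the `ipua:` label are only illustrative assumptions, not something that exists in my setup:

```python
from typing import Mapping

# Cookie names in the priority order described above.
COOKIE_PRIORITY = ("_ym_id", "BX_USER_ID", "PHPSESSID")


def visitor_key(cookies: Mapping[str, str], ip: str, user_agent: str) -> str:
    """Return one identifier per log line using the cookie fallback chain.

    Hypothetical helper: the first non-empty cookie in COOKIE_PRIORITY wins;
    if none is set, fall back to the IP combined with the user agent.
    """
    for name in COOKIE_PRIORITY:
        value = cookies.get(name)
        if value:  # treat missing and empty cookies the same way
            # Prefix with the cookie name so values from different
            # cookies can never collide with each other.
            return f"{name}:{value}"
    return f"ipua:{ip}|{user_agent}"
```

If something like this were computed once per log line before indexing, the "unique visitors" metric would presumably reduce to counting distinct values of a single field, but I don't know whether that is the recommended way.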
I'm afraid this kind of aggregation may be too expensive for ES under heavy load. Or is this the normal way to do it?