Hi,
We are planning to use Elasticsearch in our cloud application project, for the following purposes:
1. Logging - for debugging the cloud application
2. Report generation based on logs - for management reports and a little bit of analytics
3. History/Alarms - for our cloud application users
We would like your expert advice on item #3 when Elasticsearch is used for this purpose.
For this item, the number of concurrent queries will be high (say, in the order of thousands), with complex queries such as multi-term search, pagination and sorting that need to run across at least 30 indices. As these queries are exposed to an end-user mobile application, query times also need to be fast.
In general, is ELK designed to handle many queries at the same time, or is it mainly for offline processing with a small number of users? Do we need to allocate more CPU and memory to achieve this, or are the queries not that intensive?
PS:
We already have a Cassandra database in our application, so we want to know which architecture is best for item #3. Should we keep this data in Cassandra or in Elasticsearch?
Thank you. One more question related to this. In my case:
1. The sharding key will be the user ID.
2. Data is stored for the "last" 30 days (a rolling 30-day window, not a calendar month).
3. Search has to work over any time range, say from Feb 1, 2:00 pm to Feb 27, 3:00 pm.
4. The number of records per day per user is usually ~150.
5. The number of users will be around 50,000.
In this case, how should we create our indices? If we use daily indices, we may need to search up to 30 indices; will that not hurt performance? With many Elasticsearch nodes we can run queries in parallel, but what if we have just 3 Elasticsearch nodes/machines?
That number of documents per day is super small. I think you will be fine if you make a new index per day so you can drop the index after 30 days. Stick all the data from all users into that index. I think using user_id as the routing key would be over-engineering things, so I wouldn't do it without proof it helps.
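To make the index-per-day idea concrete, here is a rough sketch of how the daily index names could be worked out for a search range and for the 30-day cleanup. The `history-` prefix and the `YYYY.MM.DD` suffix are just an assumed naming scheme, and the helper names are made up for illustration; nothing here talks to a cluster.

```python
from datetime import date, datetime, timedelta
from typing import List

INDEX_PREFIX = "history-"  # hypothetical naming scheme, not from the thread


def index_name(day: date) -> str:
    """Name of the daily index for a given day, e.g. history-2016.02.01."""
    return f"{INDEX_PREFIX}{day:%Y.%m.%d}"


def indices_for_range(start: datetime, end: datetime) -> List[str]:
    """Daily indices that can contain documents between start and end."""
    if end < start:  # tolerate a range given in reverse order
        start, end = end, start
    days = (end.date() - start.date()).days
    return [index_name(start.date() + timedelta(days=i)) for i in range(days + 1)]


def expired_indices(today: date, existing: List[str], keep_days: int = 30) -> List[str]:
    """Indices older than the retention window, i.e. candidates for deletion.

    Zero-padded YYYY.MM.DD names sort lexicographically in date order,
    so a plain string comparison against the cutoff name is enough.
    """
    cutoff = index_name(today - timedelta(days=keep_days))
    return sorted(name for name in existing if name < cutoff)
```

The list from `indices_for_range` can be joined with commas and used as the index part of a search request. Alternatively, searching a wildcard such as `history-*` with a time range filter is simpler, and probably worth trying first at this data volume.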
It all sounds fine. If you need to scale read load add more nodes and more replicas. That is super common and easy to do.
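For the history/alarms lookups described in the question (one user, a time range, a few extra terms, sorting and pagination), the request body could look roughly like the sketch below. The field names `user_id` and `timestamp` and the helper `history_query` are assumptions for illustration, so adjust them to your own mapping.

```python
from typing import Any, Dict, Optional


def history_query(
    user_id: str,
    start_iso: str,
    end_iso: str,
    page: int = 0,
    page_size: int = 20,
    extra_terms: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Build a search body: exact-match filters, newest first, one page."""
    # Filters do not contribute to scoring, which suits exact-match lookups.
    filters = [
        {"term": {"user_id": user_id}},  # assumed field name
        {"range": {"timestamp": {"gte": start_iso, "lte": end_iso}}},  # assumed field name
    ]
    for field, value in (extra_terms or {}).items():
        filters.append({"term": {field: value}})
    return {
        "query": {"bool": {"filter": filters}},
        "sort": [{"timestamp": {"order": "desc"}}],
        "from": page * page_size,  # offset of the first hit on this page
        "size": page_size,
    }
```

With about 150 records per user per day, `from`/`size` paging should stay shallow; if users ever page very deep, `search_after` would be the thing to look at instead.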