Resource limitation for each query

I don't think this is an error, but I need to solve it.

Sometimes a huge query is run against the Elasticsearch cluster, for example a query with many 'or' clauses over a wide time range (1 year or so). When that happens, the query takes most of the resources, especially file I/O, so other users cannot do anything.

In such a situation I know that scaling out is the first option, but my client doesn't want it. (Such extreme queries occur very rarely, but they do occur.) Instead, they and I want a second option: a limit on the resources that each query can use. For example, if I could allow a single query only 10% of the resources, other queries could still run on the remaining 90%. Of course, I understand that query performance may be lower.

Is there any option or possibility to do so?

I don’t think there is anything like that.
As you said, it would probably be better to understand why this query is consuming so many resources.

What kind of hardware do you have? How many indices and shards? What does a typical document look like? What do a “normal” query and a “heavy” query look like?
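In the meantime, the closest thing I am aware of is capping an individual search request rather than reserving a share of resources. The search request body accepts `timeout` and `terminate_after`. Below is a minimal sketch that only builds such a body; the helper name and the values `30s` and `100000` are illustrative assumptions, not recommendations. Note that the timeout is best-effort and neither setting reserves I/O for other users.

```python
import json


def bounded_search_body(query_string, timeout="30s", terminate_after=100000):
    """Build a search request body that caps a single request.

    timeout:         how long each shard may spend collecting hits (best-effort)
    terminate_after: maximum number of documents collected per shard
    Both default values are illustrative assumptions.
    """
    return {
        "timeout": timeout,
        "terminate_after": terminate_after,
        "query": {"query_string": {"query": query_string}},
    }


# Hypothetical heavy query in the style described in this thread.
body = bounded_search_body('type1:"blahblah" OR type2:"blahblah2"')
print(json.dumps(body, indent=2))
```

A request capped this way may return partial results, so it trades completeness for protecting other users.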

It is a 3-node Linux cluster, each node with 16 cores and 64 GB of memory, holding 6~7 TB of data (on a private cloud running on OpenStack). It is not a large system and most use cases are predefined dashboards, so under normal conditions there is no such problem.

There are 300~400 indices (10 new indices are created every day, and 0~100 indices are removed depending on free disk space), and each index has 10 shards (5 primary and 5 replica).

But my client worries about the situation where someone runs a query like this (type1:"blahblah" or type2:"blahblah2" or ... or type7:"blahblah7") over all the data through Kibana. Of course, someone who knows Kibana and Elasticsearch may split the query into shorter time ranges and fewer terms, but not everyone does. As I said, in this situation file I/O is saturated, so every other user has to wait for a response, and to them it looks like a malfunction. If resources could be limited, then other small queries could still be executed.
(Normal query: a dashboard with 5~10 ordinary charts in Kibana, with aggregations over 24 hours or 1 week.)
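To make "splitting the query into shorter ranges" concrete, here is a rough sketch of what a client script (not Kibana itself) could do: cut one wide time range into consecutive windows and build one `range` filter per window, so each request touches less data. The field name `@timestamp`, the one-week step, and the dates are assumptions for illustration only.

```python
from datetime import datetime, timedelta


def split_range(start, end, step=timedelta(days=7)):
    """Cut [start, end) into consecutive windows of at most `step` each."""
    windows = []
    cursor = start
    while cursor < end:
        upper = min(cursor + step, end)
        windows.append((cursor, upper))
        cursor = upper
    return windows


def range_filter(lo, hi, field="@timestamp"):
    """Build a range filter for one window; the field name is an assumption."""
    return {"range": {field: {"gte": lo.isoformat(), "lt": hi.isoformat()}}}


# Hypothetical one-year range, queried one week at a time.
windows = split_range(datetime(2018, 1, 1), datetime(2019, 1, 1))
filters = [range_filter(lo, hi) for lo, hi in windows]
print(len(filters), "requests instead of one")
```

Running the windows one after another spreads the I/O over time instead of hitting every shard at once, at the cost of a longer total wait for the heavy user.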

I also have some more detailed questions.

  1. I use HDDs. If I change to SSDs, how much will performance improve? Is there any benchmark article comparing HDD and SSD for Elasticsearch?
  2. I create 10 shards a day because there are 10 major data sources. One data source is very large and generates 150 GB+ of data a day. The other 9 data sources generate only 20~25 GB a day. Are there any tuning tips?
  3. This is my original question: I want to know how I can limit resources for each query. If there is no option for it in the Elasticsearch configuration, is there another way?

Thanks anyway.

It sounds like you have some room for optimisation in how you organise your data into shards. I would recommend reading this blog post on shards and sharding practices. Exactly how much switching to SSDs will improve performance is hard to tell, as it varies with data, queries and load characteristics. I would recommend that you benchmark with your own data if that is possible.
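As a back-of-the-envelope check on the sharding, the figures given earlier in this thread can be plugged into a few lines of arithmetic. This sketch assumes that shards are spread evenly over the 3 nodes, that the daily volumes are primary data sizes, and that the 20~25 GB figure is per source; none of that is confirmed above.

```python
NODES = 3
SHARDS_PER_INDEX = 10      # 5 primary + 5 replica, as described in the thread
PRIMARIES_PER_INDEX = 5


def shards_per_node(index_count):
    """Average shard count per node, assuming an even spread."""
    return index_count * SHARDS_PER_INDEX / NODES


def primary_shard_size_gb(daily_gb):
    """Average primary shard size for one daily index of the given volume."""
    return daily_gb / PRIMARIES_PER_INDEX


for count in (300, 400):
    print(f"{count} indices: about {shards_per_node(count):.0f} shards per node")

for daily_gb in (20, 25, 150):
    print(f"{daily_gb} GB/day: about {primary_shard_size_gb(daily_gb):.0f} GB per primary shard")
```

Under these assumptions the small sources end up with primary shards of only a few GB each while every node carries on the order of a thousand shards, which is the kind of imbalance worth checking against the sharding guidance.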
