Performance Problems

fgonzalez · December 7, 2023, 3:50pm

I am going to escalate your recommendations and we will try to implement them as soon as possible. I just wanted to clarify a few things in case they change any of the parameters we discussed.

The 4 TB of data is per hot node, and there are 6 hot nodes, so the hot tier holds 24 TB in total, counting the data plus its replica.

I will share more details and updated information as soon as I have them. Thank you for your help!

Sorry to bring this up again, but wouldn't splitting the 700 TB of total data across more than one index set have a significant impact on performance? In a nutshell, the current setup is like having a SQL database with a single monstrous table that holds all the data.

Thanks and regards!

fgonzalez · December 21, 2023, 2:53pm

Good afternoon,

We plan to change the shard size during the week of January 2nd. I will let you know how it goes.

Thank you and merry Christmas!

fgonzalez · January 9, 2024, 7:44am

Good morning,

We will finally implement the changes next week. I will keep you informed.

Thanks and regards!

Christian_Dahlqvist · January 9, 2024, 8:23am

I do not understand this comment/question. Elasticsearch works very differently compared to a SQL database.

fgonzalez · January 29, 2024, 1:14pm

Good afternoon,

I know the two are not comparable, but I am not sure that putting everything in the same index set is optimal. For one thing, we sometimes get warnings about reaching the limit of 1000 indexed fields. So if the information were divided by affinity into different index sets, that should in principle be more optimal, right?
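For readers following along: the limit of 1000 mentioned here is the default value of the Elasticsearch index setting `index.mapping.total_fields.limit`. The sketch below is only a rough illustration of how fields add up in a mapping. The sample mapping and the helper `count_mapped_fields` are hypothetical, and the count is an approximation of what Elasticsearch tallies (it does not handle field aliases or runtime fields).

```python
def count_mapped_fields(properties):
    """Approximate the number of mapped fields in a mapping 'properties' dict.

    Counts each field or object once, plus its sub-fields and multi-fields.
    """
    total = 0
    for spec in properties.values():
        total += 1  # the field (or object) itself
        total += count_mapped_fields(spec.get("properties", {}))  # object sub-fields
        total += len(spec.get("fields", {}))  # multi-fields such as ".keyword"
    return total


# Hypothetical mapping, for illustration only.
sample_mapping = {
    "message": {"type": "text", "fields": {"keyword": {"type": "keyword"}}},
    "source": {"type": "keyword"},
    "http": {
        "properties": {
            "status": {"type": "integer"},
            "url": {"type": "keyword"},
        }
    },
}

FIELD_LIMIT = 1000  # default of index.mapping.total_fields.limit
used = count_mapped_fields(sample_mapping)
print(f"{used} of {FIELD_LIMIT} fields used")
```

When many unrelated data types share one index set, the field names of all of them accumulate in the same mapping, which is how a single index can approach the limit.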

Regarding the change in shard size, we should have done it by now, but it has become complicated and we have not yet been able to carry it out. I will keep you informed.

Thanks and regards!

Christian_Dahlqvist · January 29, 2024, 1:30pm

Best practice in general is to group data with similar mappings into index sets, and part of this is to avoid mapping explosion with respect to the number of fields. You do not want to go too granular though as you will end up with lots of very small indices and shards, which is very inefficient and hurts performance.
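One way to make "similar mappings" concrete is to compare the sets of field names that different data sources produce. The following is a minimal sketch under stated assumptions: the source names and fields are made up, and both the Jaccard similarity measure and the 0.5 threshold are illustrative choices, not something Elasticsearch or Graylog prescribes.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets of field names (1.0 means identical)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def group_by_similarity(sources, threshold=0.5):
    """Greedily place each source into the first group whose combined field
    set is similar enough; otherwise start a new group."""
    groups = []  # each group: {"members": [...], "fields": set()}
    for name, fields in sources.items():
        for group in groups:
            if jaccard(fields, group["fields"]) >= threshold:
                group["members"].append(name)
                group["fields"] |= fields
                break
        else:
            groups.append({"members": [name], "fields": set(fields)})
    return groups


# Hypothetical data sources and the fields they emit.
sources = {
    "nginx": {"timestamp", "message", "http_status", "url", "client_ip"},
    "apache": {"timestamp", "message", "http_status", "url", "referrer"},
    "firewall": {"timestamp", "message", "src_ip", "dst_ip", "action"},
}

groups = group_by_similarity(sources)
for group in groups:
    print(group["members"], len(group["fields"]), "fields")
```

Raising the threshold produces more, smaller groups, which runs into the "too granular" problem described above; lowering it merges dissimilar sources and grows the field count per index.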

It seems like Graylog only creates a single index set, which may indeed not be optimal. That is however something you would need to address with them.

fgonzalez · January 29, 2024, 1:32pm

No, we can actually have as many index sets as we want; Graylog is not a limitation here.

Christian_Dahlqvist · January 29, 2024, 1:34pm

Then I would recommend creating multiple index sets based on how similar the mappings for different types of data are. As retention management is done at the index level, another aspect to consider is to also group data that have the same retention period together.
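As a rough illustration of combining both criteria, the sketch below buckets data sources by a mapping family and a retention period, so that each bucket could become one index set. The source names, the `family` labels, the retention values, and the helper `plan_index_sets` are all hypothetical.

```python
from collections import defaultdict


def plan_index_sets(sources):
    """Bucket sources by (mapping family, retention in days)."""
    plan = defaultdict(list)
    for name, info in sources.items():
        plan[(info["family"], info["retention_days"])].append(name)
    return dict(plan)


# Hypothetical inventory of data sources.
sources = {
    "nginx": {"family": "web", "retention_days": 30},
    "apache": {"family": "web", "retention_days": 30},
    "firewall": {"family": "network", "retention_days": 365},
    "waf": {"family": "web", "retention_days": 365},
}

plan = plan_index_sets(sources)
for (family, days), members in plan.items():
    print(f"{family}, {days} days: {members}")
```

Here `waf` shares a mapping family with the web servers but lands in its own bucket because it is kept longer. If a plan like this yields many buckets that each hold little data, some of them are probably worth merging to avoid lots of small indices and shards.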

system · February 26, 2024, 1:34pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
