Investigating high disk usage

gfeher · August 6, 2020, 9:26am

I am running Elastic App Search on https://cloud.elastic.co/ (GCP). I have about 40000 documents and I have multiple deployments with these same sets of documents. What I have noticed is that the disk usage is very different between these deployments. For example in the deployment that I have been running for months, it is:
GCP.DATA.HIGHCPU.1 Disk allocation = 13.31 GB
And in the new deployment, started recently with the same documents, it is:
GCP.DATA.HIGHCPU.1 Disk allocation = 402 MB
I do have some small technical differences between these clusters, but they shouldn't justify such a big difference, I guess. My main suspicion is that some sort of logging or automatic snapshotting is taking up all the space.

For example, I know that some of the logs are going into ".app-search-app-logs-loco_togo_production*". Is there a way to check how much space it is taking? I'd also appreciate any other suggestions on how to investigate the disk usage in my Elastic Cloud environment.

Thanks,
Gabor
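
One way to answer the "how much space is it taking" question outside of Kibana is the Elasticsearch `_cat/indices` API, which can report `store.size` per index as JSON. The following is only a minimal sketch: the function names `fetch_index_sizes` and `summarize_sizes` are made up for illustration, and the base URL and `Authorization` header value are assumptions you would replace with your own deployment's Elasticsearch endpoint and credentials.

```python
import json
import urllib.request


def fetch_index_sizes(base_url, pattern, auth_header):
    """Query _cat/indices for `pattern` and return the parsed JSON rows.

    base_url and auth_header are placeholders for your own deployment.
    bytes=b asks for sizes in plain bytes; expand_wildcards=all is meant
    to include hidden indices in the wildcard match.
    """
    url = (
        f"{base_url}/_cat/indices/{pattern}"
        "?format=json&bytes=b&h=index,store.size&expand_wildcards=all"
    )
    request = urllib.request.Request(url, headers={"Authorization": auth_header})
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def summarize_sizes(rows):
    """Return (total_bytes, [(index, bytes), ...]) sorted largest first.

    Rows without a size (for example closed indices) count as 0 bytes.
    """
    sizes = [(row["index"], int(row.get("store.size") or 0)) for row in rows]
    ranked = sorted(sizes, key=lambda item: item[1], reverse=True)
    total = sum(size for _, size in ranked)
    return total, ranked


if __name__ == "__main__":
    # Offline example with made-up numbers, shaped like the API's JSON output.
    sample = [
        {"index": ".app-search-app-logs-example-1", "store.size": "2500"},
        {"index": ".app-search-app-logs-example-2", "store.size": "100"},
    ]
    total, ranked = summarize_sizes(sample)
    print(f"total: {total} bytes")
    for name, size in ranked:
        print(f"{size:>12}  {name}")
```

Against a real cluster you would pass the output of `fetch_index_sizes(...)` with a pattern such as `.app-search-app-logs-*` into `summarize_sizes`. Running the same request with `*` as the pattern gives a ranked view of every index, which is a quick way to see whether logs or the documents themselves dominate the disk usage.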

Carlos_Redondo · August 11, 2020, 6:43pm

Very interesting question! I am also wondering whether the App Search settings can be adjusted (and if so, how and where) to limit the historical information stored.

gfeher · August 11, 2020, 7:05pm

I am making some progress with this question. It turns out that Kibana contains some management features for Elasticsearch. (It was installed by default in my case.) Inside Kibana, I looked for "Stack Management", created an "Index Lifecycle Policy" and linked it to "ent-search-ecs-logs". (I am using one of those new versions where App Search is called Enterprise Search.) Also under "Stack Management", you can select "Index Management", flip the "Include System Indices" switch and see how much space your indices are currently taking up.

I am still waiting to see if this solution has worked for me or not. I am also quite surprised that the deployments I can create within the Elastic Cloud don't have some sort of sane lifecycle policy to prevent the logs from filling up the disk space over time.
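
For readers who prefer the API over the Kibana screens, the lifecycle policy described above can be sketched as a request body. This is a hedged illustration, not the policy actually used in this thread: the 30-day retention and the policy name `app-search-logs-cleanup` are invented for the example, and it assumes the log indices are created periodically so that deleting old ones by age is meaningful.

```python
import json


def build_delete_policy(max_age_days):
    """Build an ILM policy body that deletes an index once it is old enough.

    Without rollover, min_age is counted from the index creation date.
    """
    if max_age_days <= 0:
        raise ValueError("max_age_days must be positive")
    return {
        "policy": {
            "phases": {
                "delete": {
                    "min_age": f"{max_age_days}d",
                    "actions": {"delete": {}},
                }
            }
        }
    }


def build_attach_settings(policy_name):
    """Build the index settings body that links an index to a policy."""
    return {"index.lifecycle.name": policy_name}


if __name__ == "__main__":
    # Hypothetical values chosen only for illustration.
    policy_name = "app-search-logs-cleanup"
    print(f"PUT _ilm/policy/{policy_name}")
    print(json.dumps(build_delete_policy(30), indent=2))
    print("PUT <your-log-index-pattern>/_settings")
    print(json.dumps(build_attach_settings(policy_name), indent=2))
```

The script only prints the two request bodies; they could be pasted into the Kibana Dev Tools console or sent with any HTTP client. Indices created later need the same setting in their index template, otherwise only the existing ones are covered.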

Carlos_Redondo · August 11, 2020, 7:33pm

I totally agree with you. I'll take a look at the lifecycle policy; it sounds like a very promising solution. I'm dealing with performance, trying to get the best response times from the App Search / Enterprise Search solution. So far, requests to the Query_Suggestions and Search APIs are taking ~220 ms, and we want to be under 100 ms. I have only a few documents loaded into the engine so far, and I'm varying RAM, zones, and cloud provider (AWS/Google/Azure) to benchmark each deployment and find a good balance between performance and cost.

gfeher · August 11, 2020, 9:09pm

Cool. Sorry, I don't have any info on response times for you. You may have a better chance of getting your performance questions answered if you open a separate topic for them.

system · September 8, 2020, 9:09pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
