Elastic enterprise throws 502 errors

Hi there,

We have been getting quite a few 5xx responses from Elastic Enterprise Search, both from our web client requests to App Search and when browsing Enterprise Search within the Elastic Cloud deployment UI.

Example error:

    "statusCode": 502,
    "error": "Bad Gateway",
    "message": "Enterprise Search encountered an internal server error. Please contact your system administrator if the problem persists."

Our deployment intermittently appears as unhealthy, saying our Enterprise Search instance(s) are unstable.

I am guessing that it might have to do with our minimal (2 GB RAM) resource configuration for the Enterprise Search instance.

Questions I have:

  • How do I troubleshoot the issue? I already ship logs and metrics to a monitoring deployment but this doesn't give me any useful information to troubleshoot the problem AFAIK.
  • How do I know how much resource I need to provision for Enterprise Search?
    • Do I also need 2 availability zones for Enterprise Search, or is there already replication within 1 availability zone?

thanks in advance.

As a cloud customer, you're entitled to Elastic Support services. You can engage them at support.elastic.co, and they can help dig into your logs to figure out what's going on.

As you noted, 2GB is the smallest size available for Enterprise Search, and depending on what Enterprise Search features you're using (and at what scale) this may be insufficient.

I'd recommend as a first step just bumping the size of the instance up to 4GB, and see if that resolves your issue. If not, I'd engage support, and ask for some more targeted troubleshooting/advice based on your specific use case.

Thanks for the response @Sean_Story

I have contacted support as well but was hoping perhaps for a faster response here.

"I'd recommend as a first step just bumping the size of the instance up to 4GB, and see if that resolves your issue."

That's not a bad idea : ) What I wanted to avoid was a "brute-force" fix without knowing the exact root cause (e.g. an out-of-memory log entry on the Enterprise Search nodes).

Hi @Gerardo_Zenobi, have you tried looking at the Enterprise Search logs? The log line you're referencing is a Kibana log line; Enterprise Search should be logging more specific information about any issues it's running into.

Hi @Sander_Philipse ,

It was hard to look at the logs as we were not shipping them to a separate monitoring cluster.

Since then, we have started doing so, and I have been trying to find clues in them without much success. I am still getting used to the log stream view and working out the different ways of filtering it to quickly spot issues (e.g. how to show only errors/warnings).

If you have any tips about it, they are welcome : )

Thanks in advance.
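
On the question of showing only errors/warnings: as a rough, non-authoritative sketch, if the shipped logs can be exported as newline-delimited JSON, they can be filtered offline by severity. Everything here is an assumption for illustration: the function names `get_level` and `filter_severe` are hypothetical, and the field names `log.level` and `message` are assumed to follow the Elastic Common Schema; the actual Enterprise Search log documents may use different fields.

```python
import json

# Assumed set of level names worth surfacing; adjust to what the logs contain.
SEVERE_LEVELS = {"warn", "warning", "error", "fatal", "critical"}


def get_level(record):
    """Return the lower-cased log level of a record, or '' if it has none.

    Handles both a flat "log.level" key and a nested {"log": {"level": ...}}.
    """
    level = record.get("log.level")
    if level is None:
        log = record.get("log")
        if isinstance(log, dict):
            level = log.get("level")
    return str(level).lower() if level is not None else ""


def filter_severe(lines, levels=SEVERE_LEVELS):
    """Yield parsed JSON records whose level is in `levels`.

    Blank lines and lines that are not valid JSON objects are skipped.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue
        if isinstance(record, dict) and get_level(record) in levels:
            yield record


if __name__ == "__main__":
    # Made-up sample lines, not real Enterprise Search output.
    sample = [
        '{"log.level": "INFO", "message": "request served"}',
        '{"log": {"level": "error"}, "message": "example failure"}',
    ]
    for rec in filter_severe(sample):
        print(get_level(rec), rec.get("message"))
```

The same idea (keep only records whose level is warning or above, then read the messages around the time of a 502) applies when filtering directly in the monitoring deployment, provided the level field is indexed there.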