Looking at graphql-server-express, it seems that it's an early development edition of apollo-server-express. The README just says to use apollo-server-express instead, as it should have the same API, so hopefully the switch should be easy.
- On the 2nd screenshot you said we're seeing only sample data, is this sample data selected randomly from all 836 requests on this bucket? I understood that what you're calling bucket is that blue column from Response time distribution graph (selectable by their response time range) right?
While not directly random, it's correct that we just choose one of the 836 requests in the selected blue bucket. Roughly speaking, we make a query to Elasticsearch to return one transaction that has the given type+name, is within the bounds of the response time dictated by the bucket, and was recorded within the selected time-range.
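To make that a bit more concrete, here is a rough sketch of what such a query body could look like. This is not the exact query the APM UI sends. The function name buildBucketSampleQuery is made up for illustration, and the field names (processor.event, transaction.type, transaction.name, transaction.duration.us, @timestamp) are assumptions about the APM index mapping that may differ between versions.

```javascript
// Illustrative only: builds a query body that returns ONE transaction
// matching type + name, inside a duration bucket and a time range.
// Field names are assumptions about the APM index mapping.
function buildBucketSampleQuery({ type, name, minUs, maxUs, from, to }) {
  return {
    size: 1, // a single sample is enough
    query: {
      bool: {
        filter: [
          { term: { 'processor.event': 'transaction' } },
          { term: { 'transaction.type': type } },
          { term: { 'transaction.name': name } },
          // Bounds of the selected bucket, in microseconds
          { range: { 'transaction.duration.us': { gte: minUs, lt: maxUs } } },
          // Time range selected in the UI
          { range: { '@timestamp': { gte: from, lte: to } } },
        ],
      },
    },
  };
}

// Hypothetical values for a bucket covering 500 ms to 600 ms
const sampleQuery = buildBucketSampleQuery({
  type: 'request',
  name: 'POST /graphql',
  minUs: 500000,
  maxUs: 600000,
  from: 'now-24h',
  to: 'now',
});
console.log(JSON.stringify(sampleQuery, null, 2));
```

Because the query only asks for one hit, which of the matching requests you get is not strictly random, but it is always one that falls inside the bucket.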
Can we see that nice looking dashboard like that one in "Transaction sample" for that particular request? If I copy trace.id from Discover tab and paste it on APM dashboard it's showing Transaction sample for that particular trace. So am I doing it correct?
Depends what you mean by "paste it on APM dashboard". But you should be able to do what you're asking about using the following approach:
- From Discover take note of the transaction id (and its trace id) for the transaction you want to see the waterfall for
- Go to the Transaction detail page that has the same name as the transaction whose ids you just found
- Append the ids to the URL in the following form: &transactionId=<id>&traceId=<id>, e.g. &transactionId=42cdd1306af31b5b&traceId=66e024c806473ed1e5ac3d58546d197d
This should load up the transaction with the given id and show its details.
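If you find yourself doing this often, the URL step can be scripted. The sketch below is just string handling: withTransactionIds is a made-up helper, the Kibana URL is a placeholder rather than a real address, and the fallback to "?" when the URL has no query string yet is my own assumption.

```javascript
// Illustrative helper: appends transactionId and traceId to a
// Transaction detail page URL, as described in the steps above.
function withTransactionIds(url, transactionId, traceId) {
  const params =
    `transactionId=${encodeURIComponent(transactionId)}` +
    `&traceId=${encodeURIComponent(traceId)}`;
  // Use '&' when the URL already has query parameters, otherwise '?'
  const separator = url.includes('?') ? '&' : '?';
  return url + separator + params;
}

// Placeholder URL: replace with the one from your own browser
const detailUrl =
  'https://kibana.example/app/apm#/my-service/transactions/request/my-transaction?rangeFrom=now-24h';

const deepLink = withTransactionIds(
  detailUrl,
  '42cdd1306af31b5b',
  '66e024c806473ed1e5ac3d58546d197d'
);
console.log(deepLink);
```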
In general can you give us quick recommendation for our workflow? Like how we find buggy or slow queries that are slowing down our website quickly, or what are most important things to do when using APM? etc, we maybe missing key principles of APM server.
In general, you shouldn't need to do the things described above.
What is most important for you is to make sure the transactions are named properly. Switching from graphql-server-express to apollo-server-express seems to be the easiest way to do that. If that isn't possible, adding a line of code to your application that manually names the transactions using apm.setTransactionName() is the way to go.
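As a minimal sketch of the manual approach, here is an Express-style middleware that names each transaction after the GraphQL operation. apm.setTransactionName() is the agent API mentioned above. Everything else is an assumption for illustration: the helper name nameGraphqlTransactions, the "GraphQL <operation>" naming format, and that a JSON body parser has already run so req.body.operationName is available. A stub stands in for the agent here so the snippet runs on its own.

```javascript
// Returns a middleware that names the current transaction after the
// GraphQL operation, if the request carries an operationName.
function nameGraphqlTransactions(apm) {
  return function (req, res, next) {
    const operationName = req.body && req.body.operationName;
    if (operationName) {
      // Hypothetical naming scheme; pick whatever groups well for you
      apm.setTransactionName(`GraphQL ${operationName}`);
    }
    next();
  };
}

// Stub agent so the example is runnable without the real APM agent.
// In your application you would pass the started agent instead.
const recordedNames = [];
const stubApm = { setTransactionName: (name) => recordedNames.push(name) };

const middleware = nameGraphqlTransactions(stubApm);
middleware({ body: { operationName: 'GetUser' } }, {}, () => {});
middleware({ body: {} }, {}, () => {}); // anonymous query: name is left alone
console.log(recordedNames);
```

In a real application you would register the middleware on the GraphQL route after the body parser and before the GraphQL handler.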
When all the transactions are named correctly, you should be able to use the front page for the service that's showing all the transaction groups. Make sure that you sort by Impact:
By focusing on the transaction groups with the highest impact first, you spend your time fixing the slow responses that most users experience.
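To give an intuition for why sorting by impact works, here is a small sketch. It assumes impact is proportional to the total time spent in a group (average duration times number of requests). That is a simplification of my own for illustration, not the exact formula used in the UI, and the group names and numbers are invented.

```javascript
// Ranks transaction groups by an approximate impact score:
// average duration multiplied by request count (an assumption).
function sortByImpact(groups) {
  return groups
    .map((g) => ({ ...g, impact: g.avgMs * g.count }))
    .sort((a, b) => b.impact - a.impact);
}

// Invented example data
const groups = [
  { name: 'GET /health', avgMs: 2, count: 50000 }, // fast but very frequent
  { name: 'POST /graphql', avgMs: 300, count: 900 }, // fairly slow and frequent
  { name: 'GET /report', avgMs: 4000, count: 10 }, // very slow but rare
];

const ranked = sortByImpact(groups);
for (const g of ranked) {
  console.log(g.name, g.impact);
}
```

Note how the very slow but rare endpoint ends up last: fixing it would help few users, while the moderately slow and frequently hit endpoint comes first.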
After selecting an appropriate group with a high impact, you normally see a response time distribution curve that looks like this:
As you can see from this curve, most requests are served fast, as the bars (buckets) to the left in the graph are higher. Then you see a so-called "long tail" of lower and lower buckets stretching out to the right. Those are normally the requests I would focus on, as they for some reason took longer than average.
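If it helps to see how such a distribution comes about, the sketch below groups a handful of response times into fixed-width buckets. The function bucketDurations, the 100 ms bucket width, and the sample durations are all made up for illustration. The UI picks its own bucket sizes.

```javascript
// Groups durations into fixed-width buckets and returns them in
// ascending order, like the bars of a response time distribution.
function bucketDurations(durationsMs, bucketSizeMs) {
  const counts = new Map();
  for (const d of durationsMs) {
    const start = Math.floor(d / bucketSizeMs) * bucketSizeMs;
    counts.set(start, (counts.get(start) || 0) + 1);
  }
  return [...counts.entries()]
    .sort((a, b) => a[0] - b[0])
    .map(([start, count]) => ({ start, end: start + bucketSizeMs, count }));
}

// Invented durations in milliseconds: mostly fast, a few slow outliers
const durations = [40, 55, 60, 80, 90, 120, 130, 450, 900];
const buckets = bucketDurations(durations, 100);

for (const b of buckets) {
  console.log(`${b.start}-${b.end} ms: ${b.count}`);
}
```

The first bucket holds most of the requests, while the last ones hold only one each. Those sparse buckets on the right are the long tail.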
By clicking one of those buckets to the right in the graph, you'll be presented with a span waterfall showing what a typical request in that bucket was doing. And hopefully from investigating this span waterfall you can see why it was slow and fix it.
This is the most common approach our users take to hunt down and fix slow requests, and it doesn't require you to dive into all the transactions with a manual query as described previously.
This was a very short explanation. For a longer introduction to this subject, I recommend watching some of our APM webinars. This one is our most popular: https://www.elastic.co/webinars/elasticsearch-apm-overview
Cheers,
Thomas