I've deleted only the relevant portions of your template.
APM Agent language and version: Python 6.7.2
Steps to reproduce:
Create a very large list. In our case, we had a list of objects. One of the property's was a list of list. So we essentially had a list of list of list.
We noticed then when an our code throws an exception, the elastic python agent then evaluates the local variables. But because this data in one of our particular cases is very large, as in millions of items, the APM python client code was taking up huge amounts of CPU. In addition, it would take over a minute to finally create the APM output to send to the APM server.
To simply replicate the issue, I: (see code snipit below)
created a simple script to create a list of list of lists of integers.
in one method I had my "with elasticAPMTransaction:" statement.
under this I created the list (200 items) of list (300 items) of list (40 items) aka 24M items total.
started to iterate through the outer list and picked an arbitrary spot to through an exception. This essentially mimicked the method where the real code threw an exception.
wrapped this whole thing in a profiler and evaluated the output.
This would take 60 seconds +/- to complete.
Seems like the issue is in the varmap function in elasticapm\utils_init_.py but there maybe lots of ways to solve this issue.
The snipit of code looks like:
with elastic_apm_transaction:
x = 0
my_list = self.build_lists()
for foo in my_list:
x += 1
if x >= 20:
time_at_exception = datetime.now()
print ("ABOUT TO THROW EXCEPTION AT ", time_at_exception.strftime("%A, %H:%M.%S"))
raise Exception("KA_BOOM")
You can turn it off entirely, or just limit the list length for your use case. Note that this only looks at the top level of the list, if I remember correctly, so keep that in mind as you're tuning.
Thanks for the quick response. I tried setting ELASTIC_APM_LOCAL_VAR_MAX_LENGTH to 20 and it seemingly had no affect. I don't want to set ELASTIC_APM_COLLECT_LOCAL_VARIABLES to False, because we find the variables very useful when there are errors.
Did you try ELASTIC_APM_LOCAL_VAR_LIST_MAX_LENGTH? I'm not surprised that ELASTIC_APM_LOCAL_VAR_MAX_LENGTH didn't solve your issue as we can't check the length until we've converted the variable to a string (which would be the performance bottleneck for your list of lists of lists).
Very strange. I just tried ELASTIC_APM_LOCAL_VAR_LIST_MAX_LENGTH as you suggested. I set the value to 20 and it went faster. What is strange is after wards I went back to the documentation and is says the default is 10. Setting it to 10 gets it down to 5 seconds.
I tried changing the varmap function to only use 20 elements from each list and this completes in less then a second.
You're right, it certainly seems to be 10 items by default
Does removing the config for ELASTIC_APM_LOCAL_VAR_LIST_MAX_LENGTH take it back to long runtimes?
I tried changing the varmap function to only use 20 elements from each list and this completes in less then a second.
Very odd. Did you also remove the recursive piece of varmap() with this edit? That's the only reason I could see for such a speed-up, since the default is 10.....
took out setting ELASTIC_APM_LOCAL_VAR_LIST_MAX_LENGTH
took 52 seconds.
Tried only only put the ELASTIC_APM_LOCAL_VAR_LIST_MAX_LENGTH = 10 back
took 40 seconds.
take out ELASTIC_APM_LOCAL_VAR_LIST_MAX_LENGTH, put back varmap change
took less then a second
Here is another interesting thing, looking at APM for the last one, I still see the lists each with . . . drumroll please . . . 10 items and when it cuts off a list you see the message "X more elements." So you are correct the default is 10, but the perf bug is there, it looks like it evaluates the entire list so it can calculate how many items were truncated.
def varmap(func, var, context=None, name=None, **kwargs):
"""
Executes ``func(key_name, value)`` on all values,
recursively discovering dict and list scoped
values.
"""
if context is None:
context = set()
objid = id(var)
if objid in context:
return func(name, "<...>", **kwargs)
context.add(objid)
if isinstance(var, dict):
# iterate over a copy of the dictionary to avoid "dictionary changed size during iteration" issues
ret = func(
name, dict((k, varmap(func, v, context, k, **kwargs)) for k, v in compat.iteritems(var.copy())), **kwargs
)
elif isinstance(var, (list, tuple)):
#my change here
max_elements = 20
ret = func(name, [varmap(func, f, context, name, **kwargs) for f in var[:max_elements]], **kwargs)
else:
ret = func(name, var, **kwargs)
context.remove(objid)
return ret
Apache, Apache Lucene, Apache Hadoop, Hadoop, HDFS and the yellow elephant
logo are trademarks of the
Apache Software Foundation
in the United States and/or other countries.