I have an ID field with very high cardinality, currently implemented as a string, containing content similar to a GUID.
I want to run terms aggregations on this field over a large data set, and I'd like to optimize them.
I read this article that discusses ordinals and was wondering:
If I change the field implementation to a long, would that improve query speed, memory usage, or anything else?
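One caveat I'm aware of: a real GUID is 128 bits, so it would not fit in a single 64-bit long. Below is a rough sketch of how I imagine the conversion might look, assuming my IDs parse as standard GUIDs (they are only "similar to" one, so this may not hold). The helper name `guid_to_longs` is made up for illustration.

```python
import uuid

def guid_to_longs(guid_str):
    """Hypothetical helper: split a 128-bit GUID into two signed 64-bit values."""
    value = uuid.UUID(guid_str).int          # the GUID as a 128-bit integer
    high = value >> 64                       # upper 64 bits
    low = value & 0xFFFFFFFFFFFFFFFF         # lower 64 bits

    def to_signed(x):
        # Map an unsigned 64-bit value into the signed range of a Java long.
        return x - (1 << 64) if x >= (1 << 63) else x

    return to_signed(high), to_signed(low)

print(guid_to_longs("00000000-0000-0000-0000-000000000001"))
```

This would mean two long fields instead of one, unless the IDs can be safely reduced to 64 bits.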
Hi Boaz, thanks for the info.
I will look into the formatting of whatever type I choose. I see precision_step is only for Elasticsearch 2.0+. Are there any recommendations for v1.7?
Also, I'm still wondering about this (from the link I posted):
Can switching to a numeric type help the performance of my query as well?
P.S. It's important to note that I'm running the terms aggregation on a contextual ID field that is shared between multiple records (e.g. "session_id"), not on the unique document ID itself, if that matters.
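To make the question concrete, this is roughly the shape of the request I have in mind, written as a Python dict that serializes to the search body. The aggregation name `by_session` and the bucket size of 10 are placeholders I picked for this example, not values from my actual setup.

```python
import json

def build_terms_agg(field="session_id", size=10):
    """Build a search body with a terms aggregation on the given field."""
    return {
        "size": 0,  # return no hits, only the aggregation buckets
        "aggs": {
            "by_session": {  # placeholder aggregation name
                "terms": {"field": field, "size": size}
            }
        },
    }

print(json.dumps(build_terms_agg(), indent=2))
```

The field being aggregated on is the high-cardinality one, so whatever cost ordinals carry would apply to it directly.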