Can I discover clusters of numeric values with Graph?

I have a set of data about software projects with code quality metrics. Although I have defined the metric values as integers in the index mapping, in Graph I see separate vertices named 0, 4, 10, 11, etc., while I am interested in meaningful relations like "projects of team A have high coverage, projects of team B have low coverage" (I have deliberately skewed the data to introduce this relation). Is this possible with Graph?

Generally, Graph is intended for exploring the connections between nouns (people, projects, source files, hashtags). Fields that represent quantities (e.g. a 0-100 quality measure) tend to be less useful as nodes in a graph, so we don't currently provide support for this.

To identify teams that under-perform on code quality metrics, I would be more tempted to use a range or histogram aggregation on the numeric quality field and then nest a significant terms aggregation on the team field underneath it. This would identify teams that are disproportionately represented in a particular band of quality metrics.
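
As a rough sketch of that nesting, the Python snippet below only builds the search request body; it does not contact a cluster. The field names `coverage` and `team`, the band boundaries, and the aggregation names `quality_bands` and `overrepresented_teams` are all made up for illustration, and it assumes `team` is mapped as a keyword field.

```python
import json


def build_quality_band_query(metric_field, team_field, bands):
    """Build a search body with range buckets on a numeric field, each
    containing a significant_terms sub-aggregation on the team field.

    `bands` is a list of (low, high) tuples; None leaves that side open.
    In a range aggregation "from" is inclusive and "to" is exclusive.
    """
    ranges = []
    for low, high in bands:
        band = {}
        if low is not None:
            band["from"] = low
        if high is not None:
            band["to"] = high
        ranges.append(band)

    return {
        "size": 0,  # only the aggregation results are of interest, not hits
        "aggs": {
            "quality_bands": {
                "range": {"field": metric_field, "ranges": ranges},
                "aggs": {
                    # teams unusually common in this band versus the whole index
                    "overrepresented_teams": {
                        "significant_terms": {"field": team_field}
                    }
                },
            }
        },
    }


# Hypothetical field names and band boundaries; adjust to your mapping.
body = build_quality_band_query(
    "coverage", "team", [(None, 40), (40, 80), (80, None)]
)
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of a `_search` request against the projects index. Each band in the response would then list the teams that appear in it more often than their overall share of documents would suggest.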
