Can I discover clusters of numeric values with Graph?

(Assen Kolov) #1

I have a data set about software projects with code quality metrics. Although I have mapped the metric values as integers in the index, Graph shows each value as a separate vertex (0, 4, 10, 11, etc.). What I am actually interested in are relations like "projects of team A have high coverage, projects of team B have low coverage" (I have deliberately skewed the data to introduce this relation). Is this possible with Graph?

(Mark Harwood) #2

Generally, Graph is intended for exploring the connections between nouns (people, projects, source files, hashtags). Fields that represent quantities (e.g. a 0-100 quality measure) tend to be less useful as nodes in a graph, so we don't currently provide support for this.

To identify teams that under-perform on code quality metrics, I would be more tempted to use a range or histogram aggregation on the numeric quality field and then nest a significant terms aggregation on the team field underneath it. This would identify teams that are disproportionately represented in a particular band of quality metrics.
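As a rough sketch of that nested aggregation, the request body could be built along these lines. The field names `coverage` and `team`, the band boundaries (40 and 80), and the aggregation names `quality_bands` and `overrepresented_teams` are all assumptions for illustration, not taken from the original data. It also assumes the team field is mapped so that it can be aggregated on (e.g. as a keyword).

```python
import json


def build_quality_band_query(metric_field="coverage", team_field="team"):
    """Build a search body with range buckets on a numeric metric and a
    significant_terms sub-aggregation on the team field in each bucket.

    Field names and band boundaries are hypothetical placeholders.
    """
    return {
        "size": 0,  # only the aggregation results are needed, not the hits
        "aggs": {
            "quality_bands": {
                "range": {
                    "field": metric_field,
                    # "from" is inclusive, "to" is exclusive
                    "ranges": [
                        {"key": "low", "to": 40},
                        {"key": "medium", "from": 40, "to": 80},
                        {"key": "high", "from": 80},
                    ],
                },
                "aggs": {
                    # teams that are unusually common within each band,
                    # compared with the index as a whole
                    "overrepresented_teams": {
                        "significant_terms": {"field": team_field}
                    }
                },
            }
        },
    }


body = build_quality_band_query()
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of a `_search` request against the index holding the project documents. Swapping `range` for `histogram` with an `interval` would give evenly sized bands instead of hand-picked ones.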
