Single or multiple types

jasmine · February 11, 2016, 2:31am

Hi There,

We are looking into solutions for storing and searching time-series data, and Elasticsearch has come up as one of the candidates. After reading about it and thinking about our domain, I have a question regarding mapping and type management.

In short, is it better to have multiple types, each holding a single data point, or a single type holding all the data points?

An example:
1>
HostCPU {
name: string; //hostName
value: float; //cpu usage
timestamp: date; //collection time
}

HostMemory{
name: string; //hostName
value: float; //memory usage
timestamp: date; //collection time
}

.....
other types of interests
........

or
2>
Host{
name: string; //hostName
cpuUsage: float;
memoryUsage: float;
...otherDataOfInterest..
timestamp: date;
}
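
To make the two layouts concrete, here is a minimal Python sketch that holds an option 2> document as a plain dict and splits it into option 1> style documents. The helper name `split_wide_document`, the `type` key, and the sample values are hypothetical and only for illustration; none of them come from Elasticsearch itself.

```python
def split_wide_document(wide_doc):
    """Split one option-2 style document into option-1 style documents.

    Every key other than 'name' and 'timestamp' is treated as a data point.
    """
    shared = ("name", "timestamp")
    narrow_docs = []
    for field, value in wide_doc.items():
        if field in shared:
            continue
        narrow_docs.append({
            "type": field,  # hypothetical label, e.g. "cpuUsage"
            "name": wide_doc["name"],
            "value": value,
            "timestamp": wide_doc["timestamp"],
        })
    return narrow_docs


# Sample option-2 document; the values are made up.
host_doc = {
    "name": "host-01",
    "cpuUsage": 0.42,
    "memoryUsage": 0.73,
    "timestamp": "2016-02-11T02:31:00Z",
}
narrow = split_wide_document(host_doc)
```

Each narrow document repeats `name` and `timestamp`, which is the duplication that option 2> avoids.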

With option 1> we would retrieve only the CPU or memory data as needed; with option 2> we would always get all the data, even if the user is interested in only one of the data points.
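
One caveat on option 2>: Elasticsearch search requests accept a `_source` filter that limits which fields of each hit are returned, so the response does not have to carry every data point. The sketch below only builds such a request body as a Python dict. The helper name `build_search_body` is hypothetical, and the `term` query assumes `name` is mapped so that it is not analyzed.

```python
import json


def build_search_body(host_name, fields):
    """Build a search request body that returns only the requested fields."""
    return {
        # Assumes 'name' is indexed as an exact, non-analyzed value.
        "query": {"term": {"name": host_name}},
        # Source filtering: only these fields come back in each hit.
        "_source": list(fields),
    }


body = build_search_body("host-01", ["name", "cpuUsage", "timestamp"])
print(json.dumps(body, indent=2))
```

This trims what is sent back to the client; whether it also reduces the I/O on the Elasticsearch side is something to measure rather than assume.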

With option 2> the number of documents would be much smaller and there would be less duplicated data, hence a smaller footprint, but more I/O when not all of the data is needed. Also, when multiple data points are needed, option 2> takes one search where option 1> takes several.

There are thousands of hosts to collect and search data for. For each host we have 20 or so data points of interest.
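
As a rough sketch of what those numbers mean per collection pass, the arithmetic below compares the document counts of the two options. The host count of 5000 is an assumed stand-in for "thousands", and the function name is hypothetical.

```python
def documents_per_collection(host_count, datapoints_per_host):
    """Return (option 1 count, option 2 count) for one collection pass."""
    one_doc_per_datapoint = host_count * datapoints_per_host  # option 1>
    one_doc_per_host = host_count                             # option 2>
    return one_doc_per_datapoint, one_doc_per_host


HOSTS = 5000      # assumption: "thousands of hosts"
DATAPOINTS = 20   # "20 or so" per host

option1_docs, option2_docs = documents_per_collection(HOSTS, DATAPOINTS)
print(option1_docs, option2_docs)  # 100000 5000
```

Multiply both figures by the number of collection passes per day to estimate the size of a daily index.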

I'd appreciate any feedback and any pointers to design principles to keep in mind regarding the number of indices, types, and documents, and any hard limits on those.

Thanks
Jasmine

jasontedor · February 11, 2016, 2:42am

Not directly an answer to your question, but the definitive writing on types is Index vs. Type. I hope that it helps you understand how to think about these sorts of issues.

warkolm · February 11, 2016, 2:44am

Maybe compression would help with #1.

Basically it's going to be a case of trying both and seeing what works.

jasmine · February 11, 2016, 4:39am

Thanks Jason,

We are planning to use one index (per day) with many types.
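
A minimal sketch of how a per-day index name could be derived from a collection timestamp. The `metrics` prefix, the `YYYY.MM.DD` suffix, and the helper name are assumptions for illustration, not something we have settled on.

```python
from datetime import datetime, timezone


def daily_index_name(timestamp, prefix="metrics"):
    """Return the name of the daily index a document belongs to."""
    # Hypothetical naming scheme: <prefix>-YYYY.MM.DD
    return "{}-{}".format(prefix, timestamp.strftime("%Y.%m.%d"))


collected_at = datetime(2016, 2, 11, 4, 39, tzinfo=timezone.utc)
index_name = daily_index_name(collected_at)
print(index_name)  # metrics-2016.02.11
```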

We are undecided between "many" different types, each with fewer data points, so that the documents found contain only the necessary data point; and fewer types, each with more data points, so that a search returns all the data and it is up to the consumer to pick out the data points it needs.

warkolm · February 11, 2016, 4:53am

Be careful: see "Mapping changes" in the Elasticsearch Guide [2.2].

jasmine · February 11, 2016, 5:47am

Thanks Mark, it's very useful to be aware of the pitfalls.

My main concern with multiple types vs. a single catch-all type is storage and performance. From what you said previously, both are reasonable approaches depending on usage, and we can only find out which is better by prototyping and benchmarking. On paper (in theory), neither option 1 nor option 2 is clearly right or wrong from the expert's point of view.
