Range values in a histogram


(Patrick Quinn) #1

I have data that deals with time spans, rather than discrete dates. I
store a start time and end time.

I need to produce a date histogram taking into account the time
spans. So for instance, if a document has a start date of 2012-03-01
and end date of 2012-03-10, and I produced the histogram with a day
interval, the document would be counted once on 9 different days.

I have many millions of records and the histogram interval may be much
smaller than the date range in some cases.

Is there any way to do this efficiently in Elasticsearch?

Alternatively, here's the real problem I need to solve:

Given many documents, some with time spans, some with discrete dates,
identify the time spans for which there's at least one matching
document per time interval (e.g. find blocks of time in April where
data is no more than an hour apart).
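
To make the coverage question concrete, here is a minimal client-side sketch of it in Python. It is not an Elasticsearch feature: it assumes the matching documents have already been fetched as (start, end) pairs, with a discrete date represented as start == end. The name `covered_blocks` and the sample data are hypothetical.

```python
from datetime import datetime, timedelta


def covered_blocks(spans, max_gap):
    """Merge (start, end) pairs into blocks whose internal gaps are <= max_gap.

    Assumes start <= end for every pair; a discrete date is start == end.
    """
    blocks = []
    for start, end in sorted(spans):
        if blocks and start - blocks[-1][1] <= max_gap:
            # Close enough to the current block: extend it if needed.
            if end > blocks[-1][1]:
                blocks[-1][1] = end
        else:
            # Gap too large (or first item): open a new block.
            blocks.append([start, end])
    return [tuple(b) for b in blocks]


# Hypothetical sample: one span and two discrete dates in April 2012.
sample = [
    (datetime(2012, 4, 1, 0, 0), datetime(2012, 4, 1, 6, 0)),
    (datetime(2012, 4, 1, 6, 30), datetime(2012, 4, 1, 6, 30)),
    (datetime(2012, 4, 1, 9, 0), datetime(2012, 4, 1, 9, 0)),
]
blocks = covered_blocks(sample, timedelta(hours=1))
```

With this sample the 06:30 point extends the first block, and the 09:00 point starts a new one because it is more than an hour after 06:30. Sorting and a single pass are enough because each block only needs its latest end time.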

Is that any easier for Elasticsearch to solve?

Thanks!


(Shay Banon) #2

The only way I can think of is to use multiple filter facets, each with a different range filter (or a combination of range filters on the start and end dates). You would have one filter facet per "interval".
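
A rough sketch of what "one filter facet per interval" could look like, written in Python so the request body can be generated rather than typed by hand. The field names `start` and `end` and the helper name `span_histogram_facets` are assumptions, not something from the original post; the body shape follows the facets API of that era (a `filter` facet wrapping an `and` of two `range` filters), so check it against the version you run. A document overlaps a bucket [lo, hi) when its start is before hi and its end is at or after lo.

```python
from datetime import datetime, timedelta


def span_histogram_facets(range_start, range_end, interval,
                          start_field="start", end_field="end"):
    """Build one filter facet per bucket [lo, hi) between range_start and range_end.

    start_field / end_field are assumed field names for the span boundaries.
    """
    facets = {}
    lo = range_start
    while lo < range_end:
        hi = min(lo + interval, range_end)
        facets[lo.isoformat()] = {
            "filter": {
                "and": [
                    # Span starts before the bucket ends ...
                    {"range": {start_field: {"lt": hi.isoformat()}}},
                    # ... and ends at or after the bucket starts
                    # (gte so that start == end documents are counted too).
                    {"range": {end_field: {"gte": lo.isoformat()}}},
                ]
            }
        }
        lo = hi
    return facets


# Day buckets for the example in the question (2012-03-01 to 2012-03-10).
body = {
    "query": {"match_all": {}},
    "size": 0,  # only the facet counts are needed, not the hits
    "facets": span_histogram_facets(
        datetime(2012, 3, 1), datetime(2012, 3, 10), timedelta(days=1)
    ),
}
```

For the example in the question this produces 9 facets, one per day, and a document spanning the whole range matches every one of them. The number of facets grows with the range divided by the interval, so very small intervals over long ranges will make for a large request.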

(In reply to Patrick Quinn's post of Tuesday, March 6, 2012 at 6:47 PM.)
