Histogram and Sturges' formula


(Justin MacCarthy) #1

Hi,

Has anyone figured out how to use Sturges' formula
(http://en.wikipedia.org/wiki/Histogram#cite_note-sturges-9) or something similar to
set the number of bins in a histogram?
Is it possible to use scripting to set that number?

This is a common method in many histogram functions, e.g. in R, D3.js, etc.

Thanks

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(Alexander Reelsen) #2

Hey,

can you maybe tell us what you expect to happen with the histogram
interval? Linking to paid content makes it pretty hard to understand the
requirement. :)
I suppose you want automatic bucketing instead of specifying an interval
manually?

--Alex



(Justin MacCarthy) #3

From the Wikipedia article I linked to:

Sturges' formula
Sturges' formula [9] (http://en.wikipedia.org/wiki/Histogram#cite_note-sturges-9) is
derived from a binomial distribution and implicitly assumes an
approximately normal distribution.

    k = \lceil \log_2 n + 1 \rceil

It implicitly bases the bin sizes on the range of the data and can perform
poorly if n < 30 [citation needed]. It may also perform poorly if the data
are not normally distributed.

> I suppose you want automatic bucketing instead of specifying an interval
> manually?

Yes, using the above method or similar
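
To make the request concrete, here is a minimal client-side sketch of Sturges' formula, k = ceil(log2(n) + 1), turned into a fixed histogram interval. It is only an illustration, not something Elasticsearch does for you: the helper names (sturges_bins, sturges_interval, histogram_request) are made up, n, lo and hi are assumed to come from an earlier request that returns count, min and max (e.g. a stats aggregation), and the request body assumes a version that has the histogram aggregation with its "interval" parameter.

```python
import math


def sturges_bins(n):
    """Number of bins k = ceil(log2(n) + 1) for n >= 1 observations."""
    if n < 1:
        raise ValueError("need at least one observation")
    return math.ceil(math.log2(n) + 1)


def sturges_interval(n, lo, hi):
    """Bin width that splits the range [lo, hi] into sturges_bins(n) equal bins."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    return (hi - lo) / sturges_bins(n)


def histogram_request(field, n, lo, hi):
    """Build a search body with a fixed-interval histogram aggregation.

    Hypothetical helper: n, lo and hi are assumed to have been fetched
    beforehand (count, min and max of the field).
    """
    return {
        "size": 0,  # only the aggregation is of interest, not the hits
        "aggs": {
            "hist": {
                "histogram": {
                    "field": field,
                    "interval": sturges_interval(n, lo, hi),
                }
            }
        },
    }
```

With 1000 values between 0 and 110 this gives 11 bins and an interval of 10. The cost of this approach is a second round trip, since the count and range have to be known before the histogram is requested.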

Justin



(Jörg Prante) #4

If you want to implement automatic calculation of the number of bins for a
histogram, I recommend reading the paper by Lucien Birgé and Yves Rozenholc:
http://archive.numdam.org/ARCHIVE/PS/PS_2006__10_/PS_2006__10__24_0/PS_2006__10__24_0.pdf

For a start, I would prefer to have the alternatives of equi-distant
histogram bins or equi-frequent histogram bins, instead of having to put in
a specific constant interval.
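
To illustrate the difference between the two alternatives, here is a small sketch in plain Python. It assumes the values are already available on the client and that the number of bins k is given; the function names are made up for this example, and the equi-frequent edges use a simple rank-based cut rather than any particular quantile estimator.

```python
def equal_width_edges(values, k):
    """Equi-distant bins: k bins of identical width across [min, max]."""
    if k < 1 or not values:
        raise ValueError("need k >= 1 and at least one value")
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    # k left edges plus the maximum as the final right edge
    return [lo + i * width for i in range(k)] + [hi]


def equal_frequency_edges(values, k):
    """Equi-frequent bins: edges chosen so each bin holds about len(values) / k values."""
    if k < 1 or not values:
        raise ValueError("need k >= 1 and at least one value")
    data = sorted(values)
    n = len(data)
    # the i-th edge is the value at rank i * n // k in the sorted data
    return [data[min(n - 1, (i * n) // k)] for i in range(k)] + [data[-1]]
```

For the values 1 to 8 plus one outlier of 100 and k = 3, the equi-distant edges are 1, 34, 67, 100, so eight of the nine values land in the first bin. The equi-frequent edges are 1, 4, 7, 100, which puts three values in each bin.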

Jörg


