dantuff
(Dan Tuffery)
October 2, 2013, 10:23am
1
I have an int field 'size' that stores a size value in bytes. A requirement
has come up to facet on this field using a logarithmic (power-of-ten) scale, e.g.
Up to 1MB, Up to 10MB, Up to 100MB, Up to 1GB, Over 1GB
What is the best way to achieve this kind of faceting in ElasticSearch?
--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com .
For more options, visit https://groups.google.com/groups/opt_out .
depahelix
(depahelix)
October 2, 2013, 11:53am
2
One way to do it would be to do just a tiny bit of preprocessing and build
your own buckets.
Then index the bucket labels in a not_analyzed field.
-Chris.
One way to do it would be to do just a tiny bit of pre-processing on ingest
and build your own buckets. Then index the bucket label in a not_analyzed
field so it can be faceted on as a single term. Just one idea. There
may be a better way that I don't know about.
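For illustration, the pre-processing step described above could look roughly like the following. This is only a sketch under stated assumptions: the class name `SizeBuckets`, the extra field name `size_bucket`, the use of decimal megabytes, and treating "Up to" as inclusive are all hypothetical choices, not something from the original post.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: derives a coarse bucket label from a byte count before indexing. */
final class SizeBuckets {
    // Assumption: decimal units (1MB = 1,000,000 bytes). Use 1024-based values if preferred.
    static final long MB = 1_000_000L;
    static final long GB = 1_000L * MB;

    private SizeBuckets() {}

    /** Returns the label of the first bucket whose upper limit is >= the given size. */
    static String labelFor(long sizeInBytes) {
        if (sizeInBytes <= MB) return "Up to 1MB";
        if (sizeInBytes <= 10 * MB) return "Up to 10MB";
        if (sizeInBytes <= 100 * MB) return "Up to 100MB";
        if (sizeInBytes <= GB) return "Up to 1GB";
        return "Over 1GB";
    }

    /** Builds the document source: the raw size plus the derived (hypothetical) size_bucket field. */
    static Map<String, Object> toDocument(long sizeInBytes) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("size", sizeInBytes);
        doc.put("size_bucket", labelFor(sizeInBytes));
        return doc;
    }
}
```

The `size_bucket` field would then be mapped as not_analyzed and used with a plain terms facet. Keeping the raw `size` value in the document as well means the labels can be recomputed later without going back to the original data source.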
I use a slightly different scale, but the code looks like this.
builderFor(query) builds the initial search request (a term query or whatever you need).
import static com.google.common.collect.Lists.newArrayList;

import java.util.List;

import org.elasticsearch.action.search.SearchRequestBuilder;
import org.elasticsearch.search.facet.FacetBuilders;
import org.elasticsearch.search.facet.range.RangeFacetBuilder;

// AbstractNoticeScheme, NoticeQuery and JsonString are application classes, not part of Elasticsearch.
public class NoticeValueFacetsJsonScheme extends AbstractNoticeScheme {

    private final List<Range> ranges = constructRanges();

    public JsonString facetQuery(NoticeQuery query) {
        return JsonString.of(addRangeFacetsTo(builderFor(query)).toString());
    }

    // Adds one range facet containing every precomputed range to the search request.
    private SearchRequestBuilder addRangeFacetsTo(SearchRequestBuilder searchRequestBuilder) {
        RangeFacetBuilder rangeFacet = FacetBuilders.rangeFacet("ranges").field("your-field-goes-here");
        for (Range range : ranges) {
            if (range.isRange()) {
                // both bounds are set
                rangeFacet.addRange(range.getLower(), range.getUpper());
            } else if (range.isLowerBound()) {
                // only the lower bound is set
                rangeFacet.addUnboundedTo(range.getLower());
            } else if (range.isUpperBound()) {
                // only the upper bound is set
                rangeFacet.addUnboundedFrom(range.getUpper());
            }
        }
        searchRequestBuilder.addFacet(rangeFacet);
        return searchRequestBuilder;
    }

    // Builds [0, 1), [1, 100,000), then 15 buckets evenly spaced in log10
    // between 100,000 and 1,000,000,000, and finally an open-ended top bucket.
    private List<Range> constructRanges() {
        int count = 15;
        double lower = thousands(100);
        double upper = billions(1);
        double lowerf = Math.log10(lower);
        double upperf = Math.log10(upper);
        double diff = upperf - lowerf;
        double step = diff / (double) count;

        List<Range> ranges = newArrayList();
        ranges.add(Range.between(0d, 1d));
        ranges.add(Range.between(1d, lower));
        for (int i = 1; i <= count; i++) {
            ranges.add(
                Range.between(
                    Math.pow(10.0, lowerf + (step * (i - 1))),
                    Math.pow(10.0, lowerf + (step * i))));
        }
        ranges.add(Range.from(upper));
        return ranges;
    }

    private long thousands(int i) { return i * 1000L; }
    private long millions(int i) { return i * thousands(1000); }
    private long billions(int i) { return i * millions(1000); }

    // A range with an optional lower and/or upper bound; null means "unbounded".
    public static class Range {
        private final Double lower;
        private final Double upper;

        private Range(Double lower, Double upper) {
            this.lower = lower;
            this.upper = upper;
        }

        public boolean isLowerBound() {
            return lower != null;
        }

        public boolean isUpperBound() {
            return upper != null;
        }

        public boolean isRange() {
            return isLowerBound() && isUpperBound();
        }

        public double getLower() {
            return lower;
        }

        public double getUpper() {
            return upper;
        }

        public static Range to(Double number) {
            return new Range(null, number);
        }

        public static Range from(Double number) {
            return new Range(number, null);
        }

        public static Range between(Double lower, Double upper) {
            return new Range(lower, upper);
        }
    }
}
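To make the arithmetic in constructRanges() easier to follow, here is the same log10 spacing pulled out into a standalone sketch with no Elasticsearch dependency. The class and method names (`LogScaleEdges`, `edges`) are made up for illustration. With the values from the code above (100,000 to 1,000,000,000 in 15 steps) each bucket is about 1.85 times wider than the previous one, since 10^(4/15) ≈ 1.848.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: bucket edges that are evenly spaced on a log10 scale. */
final class LogScaleEdges {
    private LogScaleEdges() {}

    /**
     * Returns count + 1 edges running from lower to upper.
     * Consecutive edges differ by the constant factor 10^((log10(upper) - log10(lower)) / count).
     */
    static List<Double> edges(double lower, double upper, int count) {
        if (lower <= 0 || upper <= lower || count < 1) {
            throw new IllegalArgumentException("need 0 < lower < upper and count >= 1");
        }
        double lowerExp = Math.log10(lower);
        double step = (Math.log10(upper) - lowerExp) / count;
        List<Double> result = new ArrayList<>();
        for (int i = 0; i <= count; i++) {
            result.add(Math.pow(10.0, lowerExp + step * i));
        }
        return result;
    }
}
```

For the scale in the original question, `LogScaleEdges.edges(1e6, 1e9, 3)` gives one edge per decade: 1MB, 10MB, 100MB and 1GB (taking 1MB as 1,000,000 bytes).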
I specifically wouldn't pre-compute the buckets at index time.
By doing this you are saying that if your ranges change you will need to
re-import all of your data, and the same applies if you come up with a
similar but not-the-same requirement. This is going to suck.
Let the stuff in the index be the base data, then find meaning by searching
and selecting and aggregating as you need to.
James
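As a rough illustration of the query-time approach, the sketch below builds the JSON body of a search request that uses a range facet over the raw 'size' field, with the bucket edges from the original question. The class name `SizeRangeFacetRequest` and the facet name `size_ranges` are hypothetical, 1MB is assumed to be 1,000,000 bytes, and the body follows the facet request format of the Elasticsearch versions current at the time of this thread (facets were later replaced by aggregations).

```java
/** Hypothetical helper: builds a range-facet search body from a list of bucket edges. */
final class SizeRangeFacetRequest {
    static final long MB = 1_000_000L; // assumption: decimal megabytes

    private SizeRangeFacetRequest() {}

    /**
     * edges must be ascending and non-empty. Produces one open-ended range below
     * the first edge, one range between each pair of neighbouring edges, and one
     * open-ended range above the last edge.
     */
    static String body(String field, long[] edges) {
        StringBuilder ranges = new StringBuilder();
        ranges.append("{\"to\":").append(edges[0]).append('}');
        for (int i = 1; i < edges.length; i++) {
            ranges.append(",{\"from\":").append(edges[i - 1])
                  .append(",\"to\":").append(edges[i]).append('}');
        }
        ranges.append(",{\"from\":").append(edges[edges.length - 1]).append('}');
        return "{\"query\":{\"match_all\":{}},"
             + "\"facets\":{\"size_ranges\":{\"range\":{\"field\":\"" + field + "\","
             + "\"ranges\":[" + ranges + "]}}}}";
    }

    public static void main(String[] args) {
        long[] edges = {MB, 10 * MB, 100 * MB, 1000 * MB};
        System.out.println(body("size", edges));
    }
}
```

Because the edges live only in the request, changing the scale (say, adding a 10GB bucket) means changing the `edges` array and nothing in the index.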
dantuff
(Dan Tuffery)
October 2, 2013, 1:51pm
6
Thanks James, that's very helpful.
dantuff
(Dan Tuffery)
October 2, 2013, 1:51pm
7
Thanks James, that's very helpful.