Difficulty trying to find a way to use facets

Hi all,

I'm using ES to store mail logs, and I'd like to use it to generate
statistics, for example per-domain bandwidth usage.

For example, suppose I have a document with fields like:

There can be many email addresses such as abc@domain.com. I just want the
sum of the size field for a particular domain. How do I accomplish that?

I had a look at the statistical facet, but I'm not sure how to use it to
sum the @fields.size field per domain.
Any help is highly appreciated. Thanks.

--
Regards,
Abhijeet Rastogi (shadyabhi)

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

It's a question of mapping and analysis.

Define a mapping for faceting (the multi-field type is the best option) and apply a custom analyzer to the "from" field: http://www.elasticsearch.org/guide/reference/index-modules/analysis/custom-analyzer/

Define this custom analyzer first, when you create your index. I think you should use a pattern tokenizer: http://www.elasticsearch.org/guide/reference/index-modules/analysis/pattern-tokenizer/
and a pattern replace token filter: http://www.elasticsearch.org/guide/reference/index-modules/analysis/pattern_replace-tokenfilter/

Have a look at http://www.elasticsearch.org/guide/reference/index-modules/analysis/pattern-analyzer/ as well. It may help you see how to define an analyzer and try it out with the analyze API.
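As a rough illustration of the kind of analyzer described above, here is a minimal Python sketch that builds index settings as a dict. It is an assumption-laden example, not a tested configuration from this thread: the names `email_domain` and `strip_local_part` are made up, and it swaps the suggested pattern tokenizer for the `keyword` tokenizer so that the whole address stays one token before the pattern replace filter strips everything up to the "@". The helper `emulate_analyzer` only imitates that behaviour locally with `re`; it does not call Elasticsearch.

```python
import json
import re

# Index settings with a custom analyzer (all names here are hypothetical).
# - keyword tokenizer: emits the whole address as a single token
# - lowercase filter: normalises case
# - pattern_replace filter: removes everything up to and including "@"
SETTINGS = {
    "settings": {
        "analysis": {
            "filter": {
                "strip_local_part": {
                    "type": "pattern_replace",
                    "pattern": "^.*@",
                    "replacement": "",
                }
            },
            "analyzer": {
                "email_domain": {
                    "type": "custom",
                    "tokenizer": "keyword",
                    "filter": ["lowercase", "strip_local_part"],
                }
            },
        }
    }
}


def emulate_analyzer(address):
    """Local imitation of the analyzer above: lowercase, then drop the local part."""
    return re.sub(r"^.*@", "", address.lower())


if __name__ == "__main__":
    # JSON body that could be sent when creating the index.
    print(json.dumps(SETTINGS, indent=2))
    print(emulate_analyzer("abc@domain.com"))
```

The analyze API mentioned above is the way to check what tokens a real index produces; the local helper is only there to make the intent of the regular expression visible.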

My 2 cents

--
David :wink:
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs


How is this about mapping and analyzers? I forgot to mention that both of
the fields I specified are not analyzed, so I already have them indexed in
ES.

--
Regards,
Abhijeet Rastogi (shadyabhi)


Hi David,

I'm not sure how to proceed, because I don't know how to apply a mapping
for faceting. I'll try to be more precise so that maybe you can be more
explicit about it. I would really appreciate it.

Suppose I have this data:

curl -X POST "http://localhost:9200/facets_test/logs" -d '{"title" : "First Line", "email_id": "abc@domain.com", "size": 1024}'
curl -X POST "http://localhost:9200/facets_test/logs" -d '{"title" : "Second Line", "email_id": "def@domain.com", "size": 2048}'
curl -X POST "http://localhost:9200/facets_test/logs" -d '{"title" : "Third Line", "email_id": "ghi@domain.com", "size": 3096}'
curl -X POST "http://localhost:9200/facets_test/logs" -d '{"title" : "Fourth Line", "email_id": "abc@domainname.com", "size": 1024}'
curl -X POST "http://localhost:9200/facets_test/logs" -d '{"title" : "Fifth Line", "email_id": "def@domainname.com", "size": 2048}'
curl -X POST "http://localhost:9200/facets_test/logs" -d '{"title" : "Fifth Line", "email_id": "def@domainname.com", "size": 3096}'

I want a facet query that gives me stats such as the sum of the size field
for each domain (in this example, domain.com and domainname.com) or, as a
bonus, for each email ID too.
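For reference, the totals being asked for can be worked out directly from the six documents above. The following is a plain Python sketch with no Elasticsearch involved; the function name `sum_size_by_domain` is made up for illustration. It shows the per-domain result that a facet query would be expected to match.

```python
from collections import defaultdict

# The six sample documents from the curl commands (title omitted).
DOCS = [
    {"email_id": "abc@domain.com", "size": 1024},
    {"email_id": "def@domain.com", "size": 2048},
    {"email_id": "ghi@domain.com", "size": 3096},
    {"email_id": "abc@domainname.com", "size": 1024},
    {"email_id": "def@domainname.com", "size": 2048},
    {"email_id": "def@domainname.com", "size": 3096},
]


def sum_size_by_domain(docs):
    """Group documents by the part of email_id after "@" and sum their size."""
    totals = defaultdict(int)
    for doc in docs:
        domain = doc["email_id"].rsplit("@", 1)[1]
        totals[domain] += doc["size"]
    return dict(totals)


print(sum_size_by_domain(DOCS))
```

With this data both domains come out at 6168 (1024 + 2048 + 3096).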

--
Regards,
Abhijeet Rastogi (shadyabhi)


Hi,

Here is a full gist to do it: https://gist.github.com/dadoonet/5320947
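The gist itself is not reproduced in this thread. As a hedged sketch of what such a request can look like, the Python below builds a search body that uses the `terms_stats` facet, which groups documents by one field (`key_field`) and computes statistics over another (`value_field`). It assumes `email_id` has been indexed with an analyzer that emits only the domain; the facet name `size_per_domain` and the function name are hypothetical.

```python
import json


def build_domain_stats_query(key_field="email_id", value_field="size"):
    """Build a search body with a terms_stats facet keyed on key_field."""
    return {
        "size": 0,  # no hits needed, only the facet
        "query": {"match_all": {}},
        "facets": {
            "size_per_domain": {
                "terms_stats": {
                    "key_field": key_field,
                    "value_field": value_field,
                }
            }
        },
    }


body = build_domain_stats_query()
print(json.dumps(body, indent=2))
```

The printed JSON could be posted with curl to http://localhost:9200/facets_test/logs/_search, in the same way as the indexing commands earlier in the thread. Each entry in the facet response should then carry the term (the domain) together with its statistics, including a total.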

Hope this helps

--
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr | @scrutmydocs


Thanks a lot for your awesome reply, David. I now understand most of it.
One thing, though: this method doesn't index the actual value of email_id.
What I mean is, I can't use term queries to search for these email_ids.

Is there a way around it?

--
Regards,
Abhijeet Rastogi (shadyabhi)


Interesting solution by David. If you want to preserve the value of the
entire email_id, you can use a multi-field where one field is not analyzed
(or uses the keyword analyzer) and the other one uses the email analyzer
created by David.

http://www.elasticsearch.org/guide/reference/mapping/multi-field-type/
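A minimal sketch of such a multi-field mapping, written as a Python dict so it can be printed as JSON. The names are assumptions for illustration: `email_domain` stands for a custom analyzer assumed to be defined in the index settings, and `domain` is a made-up sub-field name. The default sub-field (same name as the parent) is not analyzed, so term queries on the full address keep working, while the `domain` sub-field is the one to facet on.

```python
import json

# Mapping for the "logs" type. "email_domain" is a hypothetical custom
# analyzer assumed to exist in the index settings.
MAPPING = {
    "logs": {
        "properties": {
            "email_id": {
                "type": "multi_field",
                "fields": {
                    # Default field: the full address, indexed as-is.
                    "email_id": {"type": "string", "index": "not_analyzed"},
                    # Extra field: the same value run through the domain analyzer.
                    "domain": {"type": "string", "analyzer": "email_domain"},
                },
            },
            "size": {"type": "long"},
        }
    }
}

# Field to use as the facet key when grouping by domain.
FACET_KEY_FIELD = "email_id.domain"


def term_query_for_address(address):
    """Exact-match query against the not_analyzed default field."""
    return {"query": {"term": {"email_id": address}}}


print(json.dumps(MAPPING, indent=2))
print(json.dumps(term_query_for_address("abc@domain.com")))
```

The mapping has to be in place before documents are indexed; documents indexed earlier would need to be reindexed to populate the new sub-field.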

--
Ivan


Thanks, Ivan, for pointing me in the right direction. Multi-field is the
solution I wanted. :slight_smile:

Regards,
Abhijeet Rastogi (shadyabhi)

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.