Terms facet search, count of facets

I am using Elasticsearch with Logstash. I am looking to query the unique
IPs for a given period. (BTW, via Ruby and Tire)

My thought is to use a terms facet on the IP address field and count the
IPs, returning no fields. Any other ideas?

Here is the query I am using.

curl -X GET "http://localhost:9200/logstash-2012.12.02/apache_access/_search?pretty=true" \
  -d '{"facets":{"myfacet":{"terms":{"field":"@fields.client_ip","size":999999,"all_terms":"false"}}},"fields":[""]}'

A few questions:

  1. Is there a means in Elasticsearch to query for a count of facets? I am
    now just doing a ".size" of the "terms".

1b) There is a "special field" called "_index" that will return a facet
count of hits per index. Facet count or hits? How do I return this field?
(see:
http://www.elasticsearch.org/guide/reference/api/search/facets/terms-facet.html)

  2. Other than specifying the size of the facet, is there a means to return
    all terms? -1 and 0 do not work.

Thanks!

--

Hello,

Do you also need the number of accesses for each IP from that period or
just the IP itself?

You have to take care of the size when you do faceting. Besides the fact
that you can't return more than MAXINT results (so you can't do unlimited),
you need to have enough memory per shard to hold all the values of the
field. I assume you will get OOM errors with really big data sets.
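Since the count has to be done on the client side anyway, here is a minimal Ruby sketch of it, using only the standard json library (not Tire). It assumes the usual terms facet response shape, with "terms", "other" and "missing" keys; the helper name facet_summary and the sample data are made up for illustration. A non-zero "other" is a hint that "size" cut the list short, so the term count would be too low.

```ruby
require "json"

# Summarise one terms facet from a parsed search response.
# facet_summary is a hypothetical helper, not part of any library.
def facet_summary(response, facet_name)
  facet = response.fetch("facets").fetch(facet_name)
  {
    unique_terms: facet.fetch("terms").size,    # distinct values returned
    truncated:    facet.fetch("other", 0) > 0,  # hits under terms cut off by "size"
    missing:      facet.fetch("missing", 0)     # docs with no value for the field
  }
end

# Made-up sample response, trimmed to the facet part.
sample = JSON.parse(<<~SAMPLE)
  {"facets": {"myfacet": {"_type": "terms", "missing": 0, "total": 5, "other": 0,
    "terms": [{"term": "10.0.0.1", "count": 3}, {"term": "10.0.0.2", "count": 2}]}}}
SAMPLE

summary = facet_summary(sample, "myfacet")
puts "unique IPs: #{summary[:unique_terms]} (truncated: #{summary[:truncated]})"
```

If truncated comes back true, the number of unique IPs is only a lower bound for that query.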

If you only need to count the number of unique IPs, a workaround would be
to maintain a separate index with unique IPs. For example, you can run a
facet each day (or each hour/minute etc), to get the unique IPs for a
subset of your data. Then, you can index the IPs as new docs in a different
index, with the IP as the ID, and the log timestamp as data.

Then, when you want to count the unique IPs in the last X hours/days/etc,
you can query this new index by timestamp instead of faceting over the raw
logs.
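A rough Ruby sketch of that workaround, again with only the standard json library. The index name "unique_ips", the type "ip" and the field "last_seen" are assumptions for illustration, as are the two helper names. The first helper builds a bulk API body with the IP as the document ID; the second builds a search body whose hits.total would be the number of IPs seen since a given time.

```ruby
require "json"

# Build a bulk request body that stores one document per IP.
# Using the IP as _id means re-indexing an IP overwrites the old
# document instead of adding a duplicate.
# Index, type and field names here are hypothetical.
def bulk_body(ips_with_time, index = "unique_ips", type = "ip")
  lines = ips_with_time.flat_map do |ip, seen_at|
    [
      JSON.generate("index" => { "_index" => index, "_type" => type, "_id" => ip }),
      JSON.generate("last_seen" => seen_at)
    ]
  end
  lines.join("\n") + "\n" # the bulk API expects a trailing newline
end

# Search body matching only IPs seen since the given time.
# "size" => 0 asks for no hits, since only the total is of interest.
def seen_since_query(since)
  { "query" => { "range" => { "last_seen" => { "gte" => since } } }, "size" => 0 }
end

body = bulk_body({ "10.0.0.1" => "2012-12-02T10:15:00Z",
                   "10.0.0.2" => "2012-12-02T11:40:00Z" })
puts body
puts JSON.generate(seen_since_query("2012-12-02T00:00:00Z"))
```

Because each IP keeps only its latest timestamp, this counts IPs for windows that end "now"; for arbitrary past windows you would need to keep more than one timestamp per IP.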

Best regards,
Radu

http://sematext.com/ -- Elasticsearch -- Solr -- Lucene

On Mon, Dec 3, 2012 at 9:06 PM, Kubes philip@freepricealerts.com wrote:
