Why does _stats doc count differ from _search/_count doc count for an index?

Hi Team,

I'm trying to work out which documents _stats is counting, given that
the index _count is so much smaller.

This is on a test index with no replicas.

When hitting the stats:

localhost:9200/my-index/_stats
indices.my-index.primaries.docs.count
=68910 docs
(and deleted docs = 0)

whereas search/count shows:

localhost:9200/my-index/_search?search_type=count
=11485 docs
localhost:9200/my-index/_count
=11485 docs

What docs is the stats api counting that the search/count api is not
counting?

Thanks.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/cbebbad8-f9f6-4c45-99b3-479e4f6f5a23%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

You are probably using nested documents, aren't you?

Each nested doc is a Lucene doc, and the stats API counts Lucene docs.

--
David :wink:
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
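
As a rough illustration of the point above (a plain-Python sketch, not Elasticsearch code): if every nested object is stored as its own Lucene doc next to the top-level doc, one Elasticsearch doc expands to 1 plus the number of its nested objects. The function name, the sample document and its field names are made up for the example, and it assumes a single level of nesting where every listed field is mapped as nested. The numbers in the question fit this picture: 68910 = 6 x 11485, which would correspond to an average of five nested docs per top-level doc.

```python
def lucene_doc_count(doc, nested_paths):
    """Count the Lucene docs one Elasticsearch doc would expand to:
    1 for the top-level doc plus 1 for each nested object."""
    total = 1  # the top-level (root) document itself
    for path in nested_paths:
        value = doc.get(path, [])
        # A single object under a nested field still counts as one nested doc.
        if isinstance(value, dict):
            value = [value]
        total += len(value)
    return total


# Hypothetical document with two nested fields.
order = {
    "id": 1,
    "items": [{"sku": "a"}, {"sku": "b"}, {"sku": "c"}],
    "payments": [{"amount": 10}, {"amount": 5}],
}

print(lucene_doc_count(order, ["items", "payments"]))  # 1 + 3 + 2 = 6
```

Here _count or _search would report 1 for this document, while a Lucene-level count would report 6.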

Hi David,

Yes, we have some nested fields.

Ok, so the short answer is that for types that include nested fields, the
search/count API counts only the matching top-level docs and excludes nested
docs from the count, whereas the stats API counts them all at the index
level.

Are Lucene docs and Elasticsearch docs one and the same, or are these
different beasts, with Lucene docs being lower level than Elasticsearch docs?

So to count all docs, I'd need to aggregate each nested doc count along
with each top level doc?

Something like:

"aggs" : {
  "stats_total_docs" : {
    "stats" : {
      "script" : "1 + _source['my-nested-field1'].size() + _source['my-nested-field2'].size() + _source['my-nested-field3'].size()"
    }
  }
}

This would run the aggregation against every matched top level doc for a
given query.

And is there any more efficient or native search/count API equivalent for
the script counting I'm using to arrive at total doc count for nested
documents?

I'm after a way to count total docs for nested docs but at query level not
at index level.

Thanks.

(Also note, I've used stats rather than just sum for my aggregation to get
some additional info as well as just the sum of docs).
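
The arithmetic behind the scripted stats aggregation above can be sketched in plain Python. This only mimics the calculation on already-fetched documents and does not call Elasticsearch; the function name and the sample data are invented for the example, and the field names are the placeholders from the script. It assumes each nested field holds a list and that a missing field counts as zero nested docs.

```python
def total_docs_stats(matched_docs, nested_fields):
    """Per matched top-level doc, compute 1 + the size of each nested field,
    then summarise the values the way a stats aggregation reports them."""
    values = [
        1 + sum(len(doc.get(field, [])) for field in nested_fields)
        for doc in matched_docs
    ]
    if not values:
        return {"count": 0, "min": None, "max": None, "avg": None, "sum": 0}
    return {
        "count": len(values),              # number of matched top-level docs
        "min": min(values),
        "max": max(values),
        "avg": sum(values) / len(values),
        "sum": sum(values),                # top-level docs plus nested docs
    }


NESTED_FIELDS = ["my-nested-field1", "my-nested-field2", "my-nested-field3"]

# Two hypothetical matched docs: the first has 3 nested objects, the second 1.
docs = [
    {"my-nested-field1": [{}, {}], "my-nested-field2": [{}], "my-nested-field3": []},
    {"my-nested-field1": [{}]},
]

print(total_docs_stats(docs, NESTED_FIELDS))
```

In this sketch "count" plays the role of the _count result for the query, and "sum" is the total that includes nested docs.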

The only way to do that would be to use parent/child instead of nested.

My 2 cents.

--
David :wink:
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
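
For readers unfamiliar with the parent/child suggestion, here is a minimal sketch of what such a mapping could look like, written as a Python dict that serialises to the index-creation body. It assumes the `_parent` mapping setting used by Elasticsearch releases from the time of this thread (early 2015), and the type and field names (`order`, `order_item`, `sku`) are hypothetical. With this layout each child is an Elasticsearch document in its own right, so it can be matched and counted by _search and _count like any other document. Check the documentation for your version before relying on this syntax.

```python
import json

# Assumption: 1.x-era parent/child mapping, where the child type declares
# its parent type via `_parent`. All type and field names are hypothetical.
index_body = {
    "mappings": {
        "order": {
            "properties": {"customer": {"type": "string"}}
        },
        "order_item": {
            "_parent": {"type": "order"},
            "properties": {"sku": {"type": "string"}},
        },
    }
}

# The JSON that would be sent as the body when creating the index.
print(json.dumps(index_body, indent=2))
```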
