How to see the total number of real documents when using nested types


(vineeth mohan) #1

Hi,

While using nested types, I have noticed that the number of documents shown
in the head interface is not the number of documents I actually indexed.
Some of the documents counted are actually documents created from the nested types.

How can I see the real number of documents?

Also, what are the possible performance overheads of using nested types over
arrays?

Thanks
Vineeth


(David Pilato) #2

As I understand it, the document is persisted once with the nested docs, and each nested doc is also persisted.

That's why the total number of documents seen by ES grows faster than the number of documents you index.

If you use the count API, you should get back the number of documents you are looking for.

I suppose there is a performance overhead. I prefer (if my use case allows it) to use arrays rather than nested docs.

HTH
David :wink:
@dadoonet


(vineeth mohan) #3

As I have some complicated faceting needs, I am in a situation where I am
forced to use the nested type. :frowning:

I believe Elasticsearch head uses the count API to get the number of documents,
but that number is not the one I wanted.
I am still wondering why arrays couldn't have all the features that the
nested type has.

Thanks
Vineeth


(vineeth mohan) #4

Hello Shay,

Can you tell us what the overhead of using the nested type over arrays is?
Will it use more computation time or memory?

Thanks
Vineeth


(Shay Banon) #5

Each nested element ends up being another document in the index. If you
issue a "count", they are automatically filtered out (as they are for any
other type of query that is not wrapped in a nested query). The index stats
API can return the actual number of docs in the index.
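
A rough way to picture the two numbers is the toy model below. It is only a sketch in plain Python and does not call Elasticsearch; the function name `count_lucene_docs` and the sample `comments` field are made up for illustration. It assumes one index document per top-level document plus one per nested element, as described above.

```python
def count_lucene_docs(documents, nested_field):
    """Toy model: return (logical, physical) document counts.

    logical  = top-level documents, which is what a plain "count" reports
    physical = top-level documents plus one extra document per nested element
    """
    logical = len(documents)
    nested = sum(len(doc.get(nested_field, [])) for doc in documents)
    return logical, logical + nested


# Hypothetical sample data: three posts with nested "comments".
docs = [
    {"title": "post 1", "comments": [{"author": "a"}, {"author": "b"}]},
    {"title": "post 2", "comments": [{"author": "c"}]},
    {"title": "post 3"},
]

logical, physical = count_lucene_docs(docs, "comments")
print(logical, physical)  # 3 6
```

Here three indexed documents show up as six in the index, which is the kind of gap the head interface was displaying.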


(vineeth mohan) #6

Thanks, Shay.

But can you tell me why a separate type called nested was created?
Couldn't everything that the nested type offers over arrays be
implemented in the array type itself?

I am not criticizing anything; I see this as a learning opportunity and
would like to know more about it.

Thanks
Vineeth


(Shay Banon) #7

What is your question? Did not manage to get it from your post...


(vineeth mohan) #8

My question is this.

It seems to me that nested types and arrays (of maps) perform the
same function, so I don't quite see why we need nested types.
Why can't we incorporate the functionality of the nested type into arrays
and so avoid the overhead of nested types?

Once again, I am really grateful that you are taking the time to understand
my question and answer it.

Thanks
Vineeth


(Shay Banon) #9

When arrays are indexed in docs without nested types, all their elements end
up in the same doc (Lucene document). This means that any query on a specific
element within the array, or on its relationship to another element in the
array, is problematic. That's what nested types solve, at the price of
indexing each element as a separate doc.
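
The difference can be sketched in plain Python. This is a simplified model, not how Elasticsearch or Lucene is implemented; the helpers `flatten`, `match_flat` and `match_nested` and the sample `user` field are hypothetical. It assumes that a plain array of objects is indexed as one bag of values per field, while nested elements are matched one element at a time.

```python
def flatten(doc, field):
    """Merge the values of every array element into one set per field name."""
    flat = {}
    for element in doc[field]:
        for key, value in element.items():
            flat.setdefault(f"{field}.{key}", set()).add(value)
    return flat


def match_flat(doc, field, criteria):
    """Plain array: each criterion may be satisfied by a different element."""
    flat = flatten(doc, field)
    return all(
        value in flat.get(f"{field}.{key}", set())
        for key, value in criteria.items()
    )


def match_nested(doc, field, criteria):
    """Nested model: all criteria must hold within a single element."""
    return any(
        all(element.get(key) == value for key, value in criteria.items())
        for element in doc[field]
    )


doc = {"user": [{"first": "John", "last": "Smith"},
                {"first": "Alice", "last": "White"}]}
query = {"first": "Alice", "last": "Smith"}  # no single user has both values

print(match_flat(doc, "user", query))    # True
print(match_nested(doc, "user", query))  # False
```

The flat match is a false positive: "Alice" and "Smith" come from different array elements, and only the per-element model can tell them apart.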

