Lucene doesn't know about types under the hood. We index numeric types as
prefix-coded tries to make range queries efficient. The number of bytes a
long / int value takes in the index depends on the precision_step that is
used, but that is the data in the term index. If you are curious about the
stored document size: there we only know about Strings / UTF-8 bytes, so we
don't store the values as efficiently as a database would in a dedicated
column. I'm afraid you can't compare the index size to check that the right
type is applied.
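To make the precision_step point concrete, here is a rough sketch (plain Python, not Lucene or Elasticsearch code). It relies on the formula given in Lucene's NumericRangeQuery documentation, where the number of indexed terms per value is ceil(bitsPerValue / precisionStep). The helper name terms_per_value and the step values tried are only for illustration.

```python
import math


def terms_per_value(bits_per_value: int, precision_step: int) -> int:
    """Trie terms indexed for one numeric value (hypothetical helper).

    Follows the formula from Lucene's NumericRangeQuery documentation:
    ceil(bitsPerValue / precisionStep).
    """
    return math.ceil(bits_per_value / precision_step)


# Compare a 32-bit integer with a 64-bit long for a few example steps.
for step in (4, 8, 16):
    int_terms = terms_per_value(32, step)
    long_terms = terms_per_value(64, step)
    print(f"precision_step={step}: integer={int_terms} terms, long={long_terms} terms")
```

So the type does change what goes into the term index, but how much depends on the step, and none of this is the stored document.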
simon
On Thursday, February 14, 2013 2:25:21 AM UTC+1, ryano wrote:
So I have a test index where I define field1 as type: long. I then
define field2 as type: integer. I index 2 documents, each with an array of
1000 numbers that are 9 digits in length. In the first document, the array
is put in field1. In the second document, the array is put in field2.
I then go to _stats and look at the document sizes.
I would have expected the document size of the one with type: integer
to be half the size of the one with type: long, as the integer is a 32 bit
type and long is a 64 bit type. But both documents are almost exactly the
same size. And from my math (each document is around 9K, and 64bit =
8bytes*9000 characters = 7.2K), it seems that they are all being indexed as
long, 64bit.
I double checked the _mapping, and the fields are definitely set as long &
integer respectively. Any idea why the document indexed with field
type: integer wouldn't be far less in size than the one with type: long?
Thanks.
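The sizes in the question can be checked with a small sketch. It assumes the stored document is simply the JSON source kept as UTF-8 text, which is how the reply above describes it; the random values and the helper name source_size are made up for illustration. Under that assumption, 1000 nine-digit numbers cost about 9 bytes each plus separators, whatever type the mapping declares.

```python
import json
import random

# Hypothetical test data: 1000 numbers, each exactly 9 digits long.
random.seed(0)
values = [random.randint(100_000_000, 999_999_999) for _ in range(1000)]


def source_size(field_name: str, numbers: list) -> int:
    """Size in bytes of a compact JSON document holding the array as text."""
    doc = json.dumps({field_name: numbers}, separators=(",", ":"))
    return len(doc.encode("utf-8"))


# The mapped type (long vs. integer) never enters this calculation:
# the digits are stored as characters, not as 4- or 8-byte binary values.
size_long_doc = source_size("field1", values)
size_int_doc = source_size("field2", values)
print(size_long_doc, size_int_doc)
```

Both documents come out at roughly 10K of text, in the same range as the "around 9K" reported, and identical to each other.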