Maximum document size


(Antonios Giannopoulos) #1

I am new to ES, and I have been searching for the past few days to determine
whether there is a limit on the size of a single document (I know from MongoDB
that a single document can't exceed 16 MB). I didn't find any relevant topic
about a single-document size limit, so my question is: does such a limit
exist or not?
Thank you in advance.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(Adrien Grand) #2

Hi Antonio,

There is a 2 GB limit at the Lucene level, but you might run into issues
before that, depending on your memory constraints. It is, however, quite
unusual to have such large documents, because a hit on a huge document tells
you little about what actually matched your query. For instance, if you are
indexing books, splitting them into pages will help you find not only the
matching books but also the matching page numbers.
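
As a rough illustration of the book example above, here is a minimal Python sketch that turns one large text into one small document per page. The field names (book_id, title, page, text) and the use of a form feed character as the page separator are assumptions made for this example only; they do not come from Elasticsearch or from this thread.

```python
import json


def split_into_pages(book_id, title, text, page_separator="\f"):
    """Yield one small document per page instead of one huge document per book.

    The field names and the form-feed separator are hypothetical choices.
    """
    pages = text.split(page_separator)
    for number, page_text in enumerate(pages, start=1):
        if not page_text.strip():
            continue  # skip empty pages but keep the original page numbering
        yield {
            "book_id": book_id,
            "title": title,
            "page": number,
            "text": page_text.strip(),
        }


book = "Call me Ishmael.\fSome years ago...\f\fIt is a way I have."
for doc in split_into_pages("moby-dick", "Moby-Dick", book):
    # Each printed line is a small JSON document that could be indexed on its own.
    print(json.dumps(doc))
```

Each page document carries the book identifier, so a query that matches a page can still be traced back to its book.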

On Tue, Aug 6, 2013 at 10:03 AM, Antonios Giannopoulos
antgiann@gmail.com wrote:

I am new to ES, and I have been searching for the past few days to determine
whether there is a limit on the size of a single document (I know from MongoDB
that a single document can't exceed 16 MB). I didn't find any relevant topic
about a single-document size limit, so my question is: does such a limit
exist or not?
Thank you in advance.

--
Adrien Grand



(Drew Raines) #3

Antonios Giannopoulos wrote:

I am new to ES, and I have been searching for the past few days to
determine whether there is a limit on the size of a single document
(I know from MongoDB that a single document can't exceed 16 MB). I
didn't find any relevant topic about a single-document size limit,
so my question is: does such a limit exist or not? Thank you in
advance.

In addition to the info Adrien provided, from practical experience
ES handles documents of hundreds of MB just fine.

Drew



(Jörg Prante) #4

AFAIK the 2G limit in Lucene has been dropped since Lucene 4?

https://issues.apache.org/jira/browse/LUCENE-2295

So document size should only be limited by the upload API buffer (the HTTP
REST default setting is 100 MB) and the available heap space.
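
To make the HTTP limit concrete, here is a minimal Python sketch that measures the JSON size of a document before it is sent and compares it with the 100 MB default mentioned above. To my knowledge the corresponding setting is http.max_content_length, but treat that name as an assumption and check your version's documentation. The helper names (serialized_size, fits_in_request) are hypothetical, and the check ignores heap usage entirely.

```python
import json

# Assumption: 100 MB, the default HTTP request size limit quoted above.
HTTP_LIMIT_BYTES = 100 * 1024 * 1024


def serialized_size(doc):
    """Return the size in bytes of the document as UTF-8 encoded JSON."""
    return len(json.dumps(doc, ensure_ascii=False).encode("utf-8"))


def fits_in_request(doc, limit=HTTP_LIMIT_BYTES):
    """Return True if the JSON body alone is within the request size limit."""
    return serialized_size(doc) <= limit


doc = {"title": "example", "body": "x" * 1000}
print(serialized_size(doc), fits_in_request(doc))
```

The size is measured in bytes after encoding, not in characters, because non-ASCII text takes more than one byte per character in UTF-8.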

Jörg



(simonw-2) #5

The issue linked below is unrelated to the limit in the index writer.
Lucene uses a byte buffer internally that uses 32-bit integers for
addressing, which by definition limits the size of a document. We also
force a flush if a DocumentsWriter grows beyond 1950 MB (that is a safety
limit), so 2 GB is the maximum in theory, but I never tested it, so the
limit might be hit earlier.
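
The arithmetic behind those two numbers can be sketched in a few lines of Python. This assumes the 32-bit integers are signed (as Java ints are) and that "MB" means 1024 * 1024 bytes; neither detail is stated in the post, and the constant names are made up for this example.

```python
# Largest offset a signed 32-bit integer can address (assumption: signed).
MAX_INT32 = 2**31 - 1
MB = 1024 * 1024  # assumption: binary megabytes

# The safety flush threshold quoted above.
FLUSH_LIMIT_MB = 1950

print(MAX_INT32)        # 2147483647
print(MAX_INT32 // MB)  # 2047, i.e. just under 2 GB

# The flush threshold sits below the addressing limit, leaving some headroom.
headroom_bytes = MAX_INT32 - FLUSH_LIMIT_MB * MB
print(headroom_bytes > 0)  # True
```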

simon

On Thursday, August 8, 2013 12:22:58 AM UTC+2, Jörg Prante wrote:

AFAIK the 2G limit in Lucene has been dropped since Lucene 4?

https://issues.apache.org/jira/browse/LUCENE-2295

So document size should only be limited by the upload API buffer (the HTTP
REST default setting is 100 MB) and the available heap space.

Jörg


