I am new to ES, and I have spent the past few days searching to determine whether
there is a limit on the size of a single document (I know from Mongo that a
single document can't exceed 16 MB). I didn't find any relevant topic
about a single-document size limit, so my question is: does any such limit
exist or not?
Thank you in advance.
There is a 2GB limit at the Lucene level, but you might run into issues
before that depending on your memory constraints. It is, however, quite
unusual to have such large documents because they don't help much in finding
what actually matched your query. For instance, if you are indexing books,
splitting them into pages will help find not only the matching books but
also the matching page numbers.
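The page-splitting idea above could be sketched roughly as follows. This is only an illustration under stated assumptions: the index name `books`, the field names (`book_id`, `title`, `page`, `text`) and both helper functions are hypothetical, and the output is the newline-delimited JSON shape accepted by the `_bulk` endpoint (older ES versions may also expect a `_type` in the action line).

```python
import json


def book_to_page_docs(book_id, title, pages):
    """Turn one book into one small document per page (hypothetical field names)."""
    return [
        {"book_id": book_id, "title": title, "page": number, "text": text}
        for number, text in enumerate(pages, start=1)
    ]


def to_bulk_body(index_name, docs):
    """Serialize documents as newline-delimited JSON: one action line, then one source line."""
    lines = []
    for doc in docs:
        # The document ID combines book and page so a hit identifies both.
        action = {"index": {"_index": index_name, "_id": f"{doc['book_id']}-{doc['page']}"}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(doc))
    # A bulk body must end with a trailing newline.
    return "\n".join(lines) + "\n"


pages = ["Call me Ishmael.", "It was a damp, drizzly November."]
docs = book_to_page_docs("moby-dick", "Moby Dick", pages)
body = to_bulk_body("books", docs)
```

With documents shaped like this, a query hit carries both the book and the page number, and no single document gets anywhere near the limits discussed in this thread.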
On Tue, Aug 6, 2013 at 10:03 AM, Antonios Giannopoulos antgiann@gmail.com wrote:
I am new to ES, and I have spent the past few days searching to determine whether
there is a limit on the size of a single document (I know from Mongo that a
single document can't exceed 16 MB). I didn't find any relevant topic
about a single-document size limit, so my question is: does any such limit
exist or not?
Thank you in advance.
In addition to the info Adrien provided: from practical experience,
ES handles documents of hundreds of MB just fine.
The issue raised below is unrelated to the limit of the index writer.
Lucene internally uses a byte buffer that is addressed with 32-bit integers.
By definition, this limits the size of documents. We also
force a flush if a DocumentsWriter grows beyond 1950MB (a safety limit),
so 2GB is the maximum in theory, but I have never tested it, so the limit might
be hit earlier.
simon
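The arithmetic behind those two numbers can be checked with a small sketch. It assumes the buffer is addressed with signed 32-bit integers (Java's `int`) and that "MB" in the 1950MB figure means MiB; the constant and function names are made up for illustration and are not Lucene identifiers.

```python
MAX_INT32 = 2**31 - 1   # largest signed 32-bit value, just under 2 GiB in bytes
MIB = 1024 * 1024
FLUSH_LIMIT_MB = 1950   # safety limit quoted in this thread (assumed to be MiB)


def fits_in_int32_buffer(size_bytes):
    """True if every byte offset of the document fits in a signed 32-bit int."""
    return 0 <= size_bytes <= MAX_INT32


def exceeds_flush_limit(size_bytes):
    """True if a writer holding this many bytes would be past the quoted safety limit."""
    return size_bytes > FLUSH_LIMIT_MB * MIB


# Headroom between the safety limit and the theoretical maximum, in bytes.
headroom = MAX_INT32 - FLUSH_LIMIT_MB * MIB
```

Under these assumptions the headroom is a little under 98 MiB, which is consistent with the practical limit being hit somewhat before the theoretical 2GB.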
On Thursday, August 8, 2013 12:22:58 AM UTC+2, Jörg Prante wrote:
AFAIK the 2G limit in Lucene has been dropped since Lucene 4?