I am new to Elasticsearch and I would like to use it for indexing 10k - 50k documents (files with metadata).
Does anyone have experience with the hardware requirements for handling this volume of documents?
How many GB of RAM? How many CPU cores? How much disk space?
Does anyone have experience running it in a virtual environment (like Hyper-V or VMware)?
What is the time frame for indexing? I assume this is on a per-second basis?
It really depends on the size of the documents.
I'd recommend going ahead and spinning up your ELK stack and doing some test runs. See if it crumples, and turn up the resources if it does.
I'm running on a Xen cloud server with 2 CPUs and 4 GB RAM, but I'm only doing some light testing right now. I've seen zero issues with its ability to ingest small amounts of data so far.
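If it helps, here is a very rough sketch of what I mean by a test run, using the official Python client. The index name, the fake metadata fields, and the localhost URL are all placeholders I made up, and it assumes a recent (8.x) elasticsearch-py client:

```python
# Rough ingestion smoke test with the official Python client (8.x).
# The index name, the fake metadata, and the URL are placeholders.
from elasticsearch import Elasticsearch, helpers

es = Elasticsearch("http://localhost:9200")

def fake_docs(n):
    # Generate n small "file with metadata" documents (~2 KB of text each).
    for i in range(n):
        yield {
            "_index": "docs-test",
            "_source": {
                "title": f"document {i}",
                "author": "someone",
                "body": "lorem ipsum dolor sit amet " * 80,
            },
        }

# Bulk-index 10k small docs and see how the node copes.
ok, errors = helpers.bulk(es, fake_docs(10_000), raise_on_error=False)
print(f"indexed: {ok}, errors: {len(errors)}")
```

If bulk-indexing 10k small documents like that is painless, keep increasing the document size and count until it isn't.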
I tried a simple search on my PC (i5 processor, 8 GB RAM, SSD, and Windows 8) with 3k files, with good results.
I understand what you mean by "depends on the size of the documents". If I insert a large file (a 30 MB pptx) into the index, my little application crashes with an OutOfMemoryException.
Do you think it is a RAM size problem? Is there any method to calculate the "hardware" size?
Ingestion is CPU-bound, while searching is memory-bound.
If you are only trying to get the files ingested, then you will need to do some testing.
Use increasingly larger files on a single shard until you crash or see bad latency.
That should give you an idea; then increase the CPU from that point and run the same test again.
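Roughly, a test like this (again the 8.x Python client; the index name, document sizes, and URL are placeholders, and the repeated text is just a crude stand-in for the extracted content of a real file):

```python
# Rough "increasingly larger files" test on a single-shard index.
# Index name, sizes, and URL are placeholders; assumes the 8.x Python client.
import time
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

index = "size-test"
es.indices.delete(index=index, ignore_unavailable=True)
es.indices.create(
    index=index,
    settings={"number_of_shards": 1, "number_of_replicas": 0},
)

for size_mb in (1, 5, 10, 20, 30):
    # Crude stand-in for the extracted text of a large file.
    content = "lorem ipsum " * (size_mb * 1024 * 1024 // 12)
    start = time.perf_counter()
    es.index(index=index, document={"content": content})
    elapsed = time.perf_counter() - start
    print(f"{size_mb} MB document indexed in {elapsed:.2f} s")
```

Wherever the timings blow up (or the node falls over) is the point to add resources and rerun, as described above.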