Is it okay to store large data in ES?


(mp2893) #1

Hi,
This is a super simple question, but googling has not helped me yet.

Is it ok to store a large chunk of data in ES?
For example, I want to store a whole HTML page under the field name of
"original_text".
(I'm planning to store a news article in its HTML format, which is
about 100KB.)
So in the future I can access the HTML document whenever I want.
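
To make the size concrete, here is a rough sketch of the kind of document
I mean. Only the "original_text" field name is real; the other fields, the
URL and the filler HTML are made up for illustration, and nothing is
actually sent to ES here.

```python
import json


def build_article_doc(url, title, html):
    """Build the dict that would be indexed as one ES document.

    Only "original_text" comes from my plan above; "url" and "title"
    are hypothetical extra fields.
    """
    return {"url": url, "title": title, "original_text": html}


# Filler HTML of roughly 100KB, standing in for a real news article.
paragraph = "<p>Lorem ipsum dolor sit amet.</p>"
html = "<html><body>" + paragraph * 3000 + "</body></html>"

doc = build_article_doc("http://example.com/news/1", "Example article", html)

# Size of the JSON body that an index request would carry.
body = json.dumps(doc)
size_kb = len(body.encode("utf-8")) / 1024.0
print("index request body: %.1f KB" % size_kb)
```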

Is this going to cause any problem for ES?
Increased latency? Or too much cache usage? Or memory constantly being
swapped?
If I choose never to cache the result when I access the HTML document,
is it going to solve any of the issues I mentioned?
Any tips would be appreciated.
Thanks.

Ed


(Radu Gheorghe) #2

I don't see why that would be an issue. Maybe someone actually uses ES in
this kind of setup and can give you some tips, but it should work.

Swapping should only occur if you allocate more memory to ES than you
have RAM. And that's not recommended - typically half your RAM should
go to ES. OS caching should be as busy with large documents as it
would be with a larger number of smaller ones (or even less busy).
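
As a rough sketch of that rule of thumb (the helper names are made up, and
how the JVM flags actually get passed to ES depends on your version and
install, so treat this as arithmetic rather than configuration):

```python
def suggested_heap_mb(total_ram_mb, fraction=0.5):
    """Heap size in MB, following the 'half your RAM' rule of thumb."""
    return int(total_ram_mb * fraction)


def jvm_heap_flags(total_ram_mb):
    """Standard JVM min/max heap flags, set to the same value."""
    heap = suggested_heap_mb(total_ram_mb)
    return "-Xms%dm -Xmx%dm" % (heap, heap)


# Example: a machine with 8GB of RAM.
print(jvm_heap_flags(8192))  # -Xms4096m -Xmx4096m
```

The other half is left to the OS, which is what does the file caching
mentioned above.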

As for ES caching, AFAIK this applies only to filters, not queries. So
it depends on how you search.
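
To illustrate the difference, here is a sketch of a search body with one
scoring query clause and one filter clause. It uses the bool/filter syntax
of more recent ES versions (older releases spell this differently), and
the "title" and "source" fields are hypothetical. It only builds and
prints the JSON, it doesn't talk to a cluster.

```python
import json


def build_search_body(text, source_name):
    """Search body with a scored query part and a non-scored filter part.

    "title" and "source" are hypothetical field names.
    """
    return {
        "query": {
            "bool": {
                # Scored full-text part: this is the "query".
                "must": [{"match": {"title": text}}],
                # Yes/no part: this is the "filter", the cacheable kind.
                "filter": [{"term": {"source": source_name}}],
            }
        }
    }


body = build_search_body("election", "example-news")
print(json.dumps(body, indent=2))
```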

(In reply to mp2893's post of Jun 2, 3:27 pm.)


(mp2893) #3

Thanks for the info, Radu.
I'll proceed with my plan.

If I see any odd behavior in my ES, I'll start another thread here.

Regards,
Ed

(In reply to Radu Gheorghe's post of 2012/6/5.)

