One big document or many small documents?

A question about data modeling: I am using Elasticsearch as pure "JSON storage", no search, just index/update/get, with a big document (a 10 MB JSON) that contains a big nested array:

{"cars" : [ { "name" : ..., "model" : ... ... }, .... ]}

It's convenient because I need all the cars of one document to render the web page, so a simple GET does the trick. But I see that indexing takes a lot of memory (via VisualVM) and I see long GC pauses from time to time (in the Elasticsearch log file).

I am wondering whether Elasticsearch handles such big documents well.
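
For reference, a minimal sketch of the pattern described above, using the Python elasticsearch client; the index name car-pages, the id page-1, the local cluster URL, and the generated car data are illustrative assumptions, not taken from the original post:

from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# One big document holding the whole "cars" array for a page.
page = {"cars": [{"name": "car-%d" % i, "model": "model-x"} for i in range(23000)]}

# Index (or overwrite) the entire page in a single call.
es.index(index="car-pages", id="page-1", body=page)

# Rendering the page is a single GET by id, no search involved.
doc = es.get(index="car-pages", id="page-1")
cars = doc["_source"]["cars"]

The whole 10 MB document has to be parsed, indexed and stored in one operation, which is consistent with the memory pressure described below.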

No, it's not a problem. Actually, you can store a maximum of 2^31-1. You need to check your cluster setup.

@ebuildy What does your JVM heap usage look like over time? Sawtooth or not so much?

ref: https://www.elastic.co/blog/a-heap-of-trouble

Yes, GC is running, so I can see the sawtooth, but in the log:

[2018-04-19 16:01:00,043][INFO ][monitor.jvm ] [elasticsearch05] [gc][old][1857][10] duration [6.6s], collections [1]/[7.1s], total [6.6s]/[15.5s], memory [7.3gb]->[1gb]/[7.9gb], all_pools {[young] [10.3kb]->[28.2mb]/[266.2mb]}{[survivor] [33.2mb]->[0b]/[33.2mb]}{[old] [7.3gb]->[1010.7mb]/[7.6gb]}

And when I index a single document with 23,000 "cars" (nested objects), I can see heap usage grow by about 200 MB:

http://b3.ms/P5DAvxNbyMgE

(The problem in production is that there are plenty of such indexing operations.)
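
For comparison, here is a hedged sketch of the other option from the thread title, one small document per car, again with the Python client; the index name cars, the page_id field, the ids and the bulk helper usage are assumptions for illustration, and this model trades the single GET for a filtered search:

from elasticsearch import Elasticsearch, helpers

es = Elasticsearch("http://localhost:9200")

# One small document per car, tagged with the page it belongs to.
actions = (
    {"_index": "cars", "_id": "page-1-%d" % i,
     "_source": {"page_id": "page-1", "name": "car-%d" % i, "model": "model-x"}}
    for i in range(23000)
)
helpers.bulk(es, actions)

# Rendering the page now needs a filtered search instead of a single GET.
# Note: "size" is capped at 10,000 by default, so 23,000 cars per page
# would require search_after or a scroll to fetch them all.
resp = es.search(index="cars", body={
    "query": {"term": {"page_id": "page-1"}},
    "size": 10000,
})
cars = [hit["_source"] for hit in resp["hits"]["hits"]]

Updating a single car then becomes a small index/update call instead of re-indexing the whole 10 MB document.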

Yes, it works, but it's slow and it's killing the memory; why is the cluster setup relevant here?
