Ingest Attachment vs Mapper Attachment - Performance/Memory Issues

Hi All,

Earlier we were using the mapper attachments plugin with ES 2.3.2 for attachment parsing. Now, with ES 5.6.4, we are using the ingest attachment plugin via an ingest pipeline. What we have noticed is that for the same set of documents, on the same kind of hardware/configuration and with the heap set to 10G, indexing works fine on ES 2.3.2. However, on ES 5.6.4 with ingest attachment, after processing around 90% of the documents, ES crashes with an OOM error as shown below.
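For context, a 10G heap as mentioned above would typically be set like this in jvm.options on 5.6.4 (on 2.3.2 we set the equivalent via the ES_HEAP_SIZE environment variable):

    -Xms10g
    -Xmx10g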

Is there a difference in document processing or memory requirements between mapper attachments and ingest attachment?

Pipeline definition in 5.6.4:
    {
      "attachment_pipe": {
        "description": "attachment_pipe",
        "processors": [
          {
            "attachment": {
              "field": "attachment_1",
              "target_field": "attachmentval_1",
              "ignore_missing": true,
              "indexed_chars": "-1",
              "properties": [
                "content"
              ]
            }
          }
        ]
      }
    }
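For reference, the pipeline above is registered under the name attachment_pipe, and documents are then sent through it by passing the pipeline request parameter. A minimal sketch of such an indexing request (index name, document ID, and the base64 value are illustrative placeholders, not our actual data):

    PUT myindex/mytype/1?pipeline=attachment_pipe
    {
      "attachment_1": "e1xydGYxXGFuc2kNCkxvcmVtIGlwc3VtIGRvbG9yIHNpdCBhbWV0DQpccGFyIH0="
    }

The same pipeline parameter is used on our _bulk requests; the OOM below occurs on a bulk thread while the pipeline is executing.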

[2018-02-19T01:39:53,146][WARN ][o.e.m.j.JvmGcMonitorService] [C0_AMsh] [gc][13013] overhead, spent [7.5s] collecting in the last [7.9s]
[2018-02-19T01:39:55,799][INFO ][o.e.m.j.JvmGcMonitorService] [C0_AMsh] [gc][13015] overhead, spent [656ms] collecting in the last [1.6s]
[2018-02-19T01:40:03,094][INFO ][o.e.m.j.JvmGcMonitorService] [C0_AMsh] [gc][old][13016][328] duration [6.8s], collections [1]/[7.2s], total [6.8s]/[28.3m], memory [9.7gb]->[9.6gb]/[9.8gb], all_pools {[young] [1.4gb]->[1.4gb]/[1.4gb]}{[survivor] [124.1mb]->[63.8mb]/[191.3mb]}{[old] [8.1gb]->[8.1gb]/[8.1gb]}
[2018-02-19T01:40:03,094][WARN ][o.e.m.j.JvmGcMonitorService] [C0_AMsh] [gc][13016] overhead, spent [6.8s] collecting in the last [7.2s]
[2018-02-19T01:40:11,410][INFO ][o.e.m.j.JvmGcMonitorService] [C0_AMsh] [gc][old][13018][329] duration [6.3s], collections [1]/[6.6s], total [6.3s]/[28.4m], memory [9.7gb]->[9.6gb]/[9.8gb], all_pools {[young] [1.4gb]->[1.4gb]/[1.4gb]}{[survivor] [136mb]->[0b]/[191.3mb]}{[old] [8.1gb]->[8.1gb]/[8.1gb]}
[2018-02-19T01:40:11,410][WARN ][o.e.m.j.JvmGcMonitorService] [C0_AMsh] [gc][13018] overhead, spent [6.3s] collecting in the last [6.6s]
[2018-02-19T01:41:09,037][INFO ][o.e.m.j.JvmGcMonitorService] [C0_AMsh] [gc][old][13019][331] duration [13s], collections [2]/[57.6s], total [13s]/[28.7m], memory [9.6gb]->[9.7gb]/[9.8gb], all_pools {[young] [1.4gb]->[1.4gb]/[1.4gb]}{[survivor] [0b]->[128.6mb]/[191.3mb]}{[old] [8.1gb]->[8.1gb]/[8.1gb]}
[2018-02-19T01:41:18,451][INFO ][o.e.m.j.JvmGcMonitorService] [C0_AMsh] [gc][old][13020][332] duration [9.8s], collections [1]/[10s], total [9.8s]/[28.8m], memory [9.7gb]->[9.5gb]/[9.8gb], all_pools {[young] [1.4gb]->[1.4gb]/[1.4gb]}{[survivor] [128.6mb]->[0b]/[191.3mb]}{[old] [8.1gb]->[8.1gb]/[8.1gb]}
[2018-02-19T01:41:18,452][WARN ][o.e.m.j.JvmGcMonitorService] [C0_AMsh] [gc][13020] overhead, spent [9.8s] collecting in the last [10s]
[2018-02-19T01:41:09,140][ERROR][o.e.b.ElasticsearchUncaughtExceptionHandler] [] fatal error in thread [elasticsearch[C0_AMsh][bulk][T#15]], exiting
java.lang.OutOfMemoryError: Java heap space
	at java.util.Arrays.copyOfRange(Arrays.java:3664) ~[?:1.8.0_144]
	at java.lang.String.<init>(String.java:207) ~[?:1.8.0_144]
	at java.lang.StringBuilder.toString(StringBuilder.java:407) ~[?:1.8.0_144]
	at com.fasterxml.jackson.core.util.TextBuffer.contentsAsString(TextBuffer.java:356) ~[jackson-core-2.8.6.jar:2.8.6]
	at com.fasterxml.jackson.core.json.UTF8StreamJsonParser._finishAndReturnString(UTF8StreamJsonParser.java:2470) ~[jackson-core-2.8.6.jar:2.8.6]
	at com.fasterxml.jackson.core.json.UTF8StreamJsonParser.getText(UTF8StreamJsonParser.java:315) ~[jackson-core-2.8.6.jar:2.8.6]
	at org.elasticsearch.common.xcontent.json.JsonXContentParser.text(JsonXContentParser.java:86) ~[elasticsearch-5.6.4.jar:5.6.4]
	at org.elasticsearch.common.xcontent.support.AbstractXContentParser.readValue(AbstractXContentParser.java:385) ~[elasticsearch-5.6.4.jar:5.6.4]
	at org.elasticsearch.common.xcontent.support.AbstractXContentParser.readMap(AbstractXContentParser.java:333) ~[elasticsearch-5.6.4.jar:5.6.4]
	at org.elasticsearch.common.xcontent.support.AbstractXContentParser.readMap(AbstractXContentParser.java:296) ~[elasticsearch-5.6.4.jar:5.6.4]
	at org.elasticsearch.common.xcontent.support.AbstractXContentParser.map(AbstractXContentParser.java:251) ~[elasticsearch-5.6.4.jar:5.6.4]
	at org.elasticsearch.common.xcontent.XContentHelper.convertToMap(XContentHelper.java:141) ~[elasticsearch-5.6.4.jar:5.6.4]
	at org.elasticsearch.common.xcontent.XContentHelper.convertToMap(XContentHelper.java:114) ~[elasticsearch-5.6.4.jar:5.6.4]
	at org.elasticsearch.action.index.IndexRequest.sourceAsMap(IndexRequest.java:355) ~[elasticsearch-5.6.4.jar:5.6.4]
	at org.elasticsearch.ingest.PipelineExecutionService.innerExecute(PipelineExecutionService.java:164) ~[elasticsearch-5.6.4.jar:5.6.4]
	at org.elasticsearch.ingest.PipelineExecutionService.access$000(PipelineExecutionService.java:41) ~[elasticsearch-5.6.4.jar:5.6.4]
	at org.elasticsearch.ingest.PipelineExecutionService$2.doRun(PipelineExecutionService.java:88) ~[elasticsearch-5.6.4.jar:5.6.4]
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:638) ~[elasticsearch-5.6.4.jar:5.6.4]
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-5.6.4.jar:5.6.4]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_144]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_144]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_144]
