Hey,
Using: ES 5.1
I'm trying to figure out how to index multiple attachments for a single document into an Elasticsearch index.
My service (running on AWS) has a limitation: a single HTTP POST request can be at most 100 MB.
Basically, I have profiles in an ES index, and for each profile I want to store multiple searchable attachments, say up to 50 PDFs of 10 MB each.
That limit rules out the obvious approach, because I simply cannot send 500 MB of data to ES in one request.
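To put rough numbers on the limit, here is a back-of-the-envelope sketch in Python. It assumes the PDFs travel base64-encoded (as the `data` field in my mapping suggests) and that 1 MB means 1024 * 1024 bytes; the function names are just for illustration.

```python
MB = 1024 * 1024  # assumption: binary megabytes


def base64_size(raw_bytes):
    """Bytes needed for the padded base64 encoding of raw_bytes bytes."""
    return 4 * ((raw_bytes + 2) // 3)


def attachments_per_request(limit_bytes, attachment_bytes, overhead_bytes=0):
    """How many equally sized, base64-encoded attachments fit in one request."""
    return max(0, (limit_bytes - overhead_bytes) // base64_size(attachment_bytes))


def requests_needed(total_attachments, per_request):
    """Number of requests needed to send all attachments (ceiling division)."""
    return -(-total_attachments // per_request)


per_request = attachments_per_request(100 * MB, 10 * MB)
print(per_request, requests_needed(50, per_request))
```

So under these assumptions only a handful of 10 MB files fit into one POST, and the full set of 50 has to be spread over several requests whatever the mapping looks like.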
One approach would be some kind of partial update, but how do I make a partial update that pushes a NEW attachment onto the existing attachments array?
Or maybe a separate, flat attachments index whose documents reference my main index, so that I can find the matching profiles?
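To make the flat-index idea concrete, here is a rough sketch of what I have in mind, not something I have running. Each attachment becomes its own small document, so one request carries one file. The `profile_id` field and all function names are hypothetical, and I'm assuming the extracted text ends up in `attachment.content` as in my mapping.

```python
import base64


def build_attachment_doc(profile_id, filename, pdf_bytes):
    """One document per attachment, pointing back at its profile."""
    return {
        "profile_id": profile_id,  # hypothetical back-reference to the profile
        "filename": filename,
        "data": base64.b64encode(pdf_bytes).decode("ascii"),
    }


def build_search_body(text):
    """Full-text query over the extracted content, with highlighting."""
    return {
        "_source": ["profile_id", "filename"],
        "query": {"match": {"attachment.content": text}},
        "highlight": {"fields": {"attachment.content": {}}},
    }


def profile_ids_from_hits(response):
    """Collect distinct profile ids from a search response, keeping hit order."""
    seen = []
    for hit in response["hits"]["hits"]:
        pid = hit["_source"]["profile_id"]
        if pid not in seen:
            seen.append(pid)
    return seen
```

The ids collected this way would then be used in a second query against the main index to load the profiles themselves, which costs an extra round trip per search.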
I also have to support highlighting in the results, so the best approach for me would be a mapping like this:
{
  "directory.index.v7": {
    "mappings": {
      "profile.event": {
        "properties": {
          "attachments": {
            "properties": {
              "attachment": {
                "properties": {
                  "content": {
                    "type": "text",
                    "fields": {
                      "keyword": {
                        "type": "keyword",
                        "ignore_above": 256
                      }
                    }
                  },
                  "content_length": {
                    "type": "long"
                  },
                  "content_type": {
                    "type": "text",
                    "fields": {
                      "keyword": {
                        "type": "keyword",
                        "ignore_above": 256
                      }
                    }
                  },
                  "date": {
                    "type": "date"
                  },
                  "language": {
                    "type": "text",
                    "fields": {
                      "keyword": {
                        "type": "keyword",
                        "ignore_above": 256
                      }
                    }
                  }
                }
              },
              "data": {
                "type": "text",
                "fields": {
                  "keyword": {
                    "type": "keyword",
                    "ignore_above": 256
                  }
                }
              },
              "filename": {
                "type": "text",
                "fields": {
                  "keyword": {
                    "type": "keyword",
                    "ignore_above": 256
                  }
                }
              }
            }
          },
          "email": {
            "type": "text",
            "fields": {
              "raw": {
                "type": "keyword"
              }
            }
          }
        }
      }
    }
  }
}
But as I mentioned before:
- I cannot ingest all attachments at once.
- I don't know how, or whether it is even possible, to PUSH a single attachment onto the attachments attribute without resending the older ones (so as not to hit the POST limit).
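What I imagine by "push" is something like a scripted update through the Update API, where the request carries only the new attachment and a small Painless script appends it to the stored array. The sketch below only builds the request path and body, it does not send anything. The function name is hypothetical, the `inline` script key is what I believe 5.1 expects, and I have not verified this against my cluster. It also says nothing about how the text gets extracted into `attachment.content`.

```python
import base64
import json

# Painless script: create the array if it is missing, then append one element.
APPEND_SCRIPT = (
    "if (ctx._source.attachments == null) "
    "{ ctx._source.attachments = new ArrayList(); } "
    "ctx._source.attachments.add(params.attachment);"
)


def build_append_request(index, doc_type, doc_id, filename, pdf_bytes):
    """Return (path, body) for an update that appends a single attachment."""
    path = "/{}/{}/{}/_update".format(index, doc_type, doc_id)
    body = {
        "script": {
            "lang": "painless",
            "inline": APPEND_SCRIPT,  # assumption: 5.1-style key for the script source
            "params": {
                "attachment": {
                    "filename": filename,
                    "data": base64.b64encode(pdf_bytes).decode("ascii"),
                }
            },
        }
    }
    return path, body


path, body = build_append_request(
    "directory.index.v7", "profile.event", "42", "cv.pdf", b"%PDF-1.4"
)
print(path)
print(json.dumps(body, indent=2))
```

Each request built this way stays around the size of one encoded file, so it would fit under the 100 MB limit, but the stored document itself would still keep growing with every appended attachment.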
Please advise!