Hello all,
I want to store large binary files (like xyz.exe, pqr.exe) in ES and search for particular binary patterns within these files.
e.g.
data = 4DBEEF3B305367C9AEB8DBA22E3CD0ADB8EBB75C829E68DD8EEDDBA5EE5301FC2A2E087C952D325124F3B62AD548D7CD6C2C633F10E57686C6AE288476C0849985F3BE6C7CF19A48992FA121845FA3
search/query for = 829E68DD8EEDDBA
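To make the example concrete, here is a minimal Python sketch of what I mean by a match: the file bytes are hex-encoded and the pattern is looked up as a plain substring of that text. The helper names to_hex and contains_pattern are made up for illustration. This only shows the matching semantics I am after, not how ES evaluates a query.

```python
def to_hex(raw: bytes) -> str:
    """Hex-encode raw file bytes as an uppercase string (2 characters per byte)."""
    return raw.hex().upper()


def contains_pattern(hex_data: str, pattern: str) -> bool:
    """Case-insensitive substring test on the hex text.

    The match is on hex characters, so a pattern may start or end in the
    middle of a byte (the example pattern below has 15 characters).
    """
    return pattern.upper() in hex_data.upper()


# Sample values copied from the example above.
DATA = (
    "4DBEEF3B305367C9AEB8DBA22E3CD0ADB8EBB75C829E68DD8EEDDBA5EE5301FC"
    "2A2E087C952D325124F3B62AD548D7CD6C2C633F10E57686C6AE288476C08499"
    "85F3BE6C7CF19A48992FA121845FA3"
)
PATTERN = "829E68DD8EEDDBA"

if __name__ == "__main__":
    print(contains_pattern(DATA, PATTERN))
```

A linear scan like this is what I would like to avoid running over every document at query time.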
TO STORE
- I am trying to use the attachment plugin to store each file's base64-encoded data.
{
  "test": {
    "mappings": {
      "logs": {
        "properties": {
          "file": {
            "type": "attachment"
          }
        }
      }
    }
  }
}
ALSO
- I am also trying to store the data as a hex string (4DBEEF3B305367C9AEB8DBA22E3CD......) in a single field.
{
  "test": {
    "mappings": {
      "logs": {
        "properties": {
          "file": {
            "type": "string"
          }
        }
      }
    }
  }
}
TO SEARCH
- I am trying a "wildcard" query.
GET test/logs/_search
{
  "query": {
    "wildcard": {
      "file": "829E68DD8EEDDBA"
    }
  }
}
ALSO
- I am also trying a "query_string" query.
GET test/logs/_search
{
  "query": {
    "query_string": {
      "query": "829E68DD8EEDDBA"
    }
  }
}
I get the expected results, but these queries take a very long time on a big index.
What would be an alternative approach for this use case?
Should I use an "N-Gram" analyzer, or is there another way to make the search faster?
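To be clear about what I mean by the N-gram idea, here is a rough Python sketch of my understanding: every fixed-length slice of the hex text becomes an index term, a query pattern is cut into the same slices, and only documents containing all of them are checked for the full pattern. This is not Elasticsearch code. The class name NGramIndex and the gram length of 8 are my own assumptions for illustration.

```python
from collections import defaultdict


def ngrams(text: str, n: int) -> list:
    """Return every overlapping slice of length n (empty if text is shorter)."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


class NGramIndex:
    """Toy in-memory inverted index: n-gram -> set of document ids."""

    def __init__(self, n: int = 8):  # gram length 8 is an assumed value
        self.n = n
        self.postings = defaultdict(set)
        self.docs = {}

    def add(self, doc_id: str, hex_data: str) -> None:
        hex_data = hex_data.upper()
        self.docs[doc_id] = hex_data
        for gram in ngrams(hex_data, self.n):
            self.postings[gram].add(doc_id)

    def search(self, pattern: str) -> list:
        pattern = pattern.upper()
        if len(pattern) < self.n:
            # Pattern shorter than one gram: no term to look up, scan everything.
            return sorted(d for d, h in self.docs.items() if pattern in h)
        candidates = None
        for gram in ngrams(pattern, self.n):
            ids = self.postings.get(gram, set())
            candidates = ids if candidates is None else candidates & ids
            if not candidates:
                return []
        # Shared grams do not guarantee adjacency, so verify the full pattern.
        return sorted(d for d in candidates if pattern in self.docs[d])


if __name__ == "__main__":
    index = NGramIndex(n=8)
    index.add("xyz.exe", "ADB8EBB75C829E68DD8EEDDBA5EE5301FC")
    index.add("pqr.exe", "00112233445566778899AABBCCDDEEFF")
    print(index.search("829E68DD8EEDDBA"))
```

The lookups replace the scan over all terms that a wildcard query needs, at the cost of a much larger index, which is why I am unsure whether this is the right direction.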
I am planning to store 1 million binaries, each between 1 KB and 1 MB in size.
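For a sense of scale, here is a small Python calculation of how much text these files turn into under each encoding. It assumes 1 KB = 1024 bytes, standard padded base64, and a hypothetical gram length of 8. The numbers are counts of characters and grams, not actual ES index sizes.

```python
def hex_chars(size_bytes: int) -> int:
    """Hex encoding uses 2 characters per byte."""
    return 2 * size_bytes


def base64_chars(size_bytes: int) -> int:
    """Padded base64 uses 4 characters per group of up to 3 bytes."""
    return 4 * ((size_bytes + 2) // 3)


def ngram_count(text_len: int, n: int) -> int:
    """Number of overlapping n-grams in a text of the given length."""
    return max(text_len - n + 1, 0)


KB = 1024
MB = 1024 * KB
FILES = 1_000_000
GRAM = 8  # assumed gram length, for illustration only

if __name__ == "__main__":
    for label, size in (("1 KB", KB), ("1 MB", MB)):
        print(label, "per file:")
        print("  hex chars   :", hex_chars(size))
        print("  base64 chars:", base64_chars(size))
        print("  8-grams     :", ngram_count(hex_chars(size), GRAM))
        print("  total hex chars for all files:", FILES * hex_chars(size))
```

So the hex text alone is twice the raw data size, and an N-gram index would hold roughly one term occurrence per hex character.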
Any help or hints would be appreciated.
Thanks.
Ankur Mathur