Hi. I'm quite new to Elasticsearch. So far it's all been going great, but
I've run into a wall, and after a few days of no progress I thought it was
time to ask for help.
I'm trying to create a replacement search solution for a CMS. One of
the requirements is that it needs to index binary files. The
mapper-attachments plugin appears to be just the thing, but I'm struggling
to get it to work.
I've tried this with Elasticsearch 1.3.x and Mapper Attachments 2.3.2, and
with Elasticsearch 1.4.x and Mapper Attachments 2.4.2, both running under
Windows. There are no errors in the log and the plugin appears to load
correctly, so I assume I'm doing something wrong with my requests.
I've simplified my requests down to the most basic level I can, and the
issue still occurs. I tested with the Postman extension in Chrome, but I've
converted my requests to curl to help anyone who might want to try this on
Linux. The Base64 payload is a .txt file with some English text from the
BBC News site.
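In case anyone wants to sanity-check the payload before trying the requests, here is a minimal sketch, assuming a Unix-like shell with the `base64` utility (the `--decode` spelling is the GNU coreutils one and may differ on other platforms). It decodes the first 40 characters of the Base64 string used below and re-encodes the result to confirm the round trip. The variable names are just for illustration.

```shell
# First 40 characters of the Base64 payload used in the requests below.
prefix='Rm9ybWVyIGNlbGVicml0eSBwdWJsaWNpc3QgTWF4'

# Decode the prefix to confirm the payload is plain English text.
decoded=$(printf '%s' "$prefix" | base64 --decode)
echo "$decoded"   # prints: Former celebrity publicist Max

# Re-encode the decoded text; tr strips any line wrapping base64 may add.
encoded=$(printf '%s' "$decoded" | base64 | tr -d '\n')
echo "$encoded"
```

The same `base64 | tr -d '\n'` pipeline can be pointed at a whole file to produce a single-line value suitable for the JSON body.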
Create test index
curl -XPUT 'http://localhost:9200/test/'
Response
{
"acknowledged": true
}
Create mapping for person
curl -XPUT 'http://localhost:9200/test/_mapping/person' -d '{
"person" : {
"properties" : {
"my_attachment" : { "type" : "attachment" }
}
}
}'
Response
{
"acknowledged": true
}
Get mapping for person
curl -XGET 'http://localhost:9200/test/_mapping/person'
Response
{
"test": {
"mappings": {
"person": {
"properties": {
"my_attachment": {
"type": "attachment",
"path": "full",
"fields": {
"my_attachment": {
"type": "string"
},
"author": {
"type": "string"
},
"title": {
"type": "string"
},
"name": {
"type": "string"
},
"date": {
"type": "date",
"format": "dateOptionalTime"
},
"keywords": {
"type": "string"
},
"content_type": {
"type": "string"
},
"content_length": {
"type": "integer"
},
"language": {
"type": "string"
}
}
}
}
}
}
}
}
This looks good: the mapping has metadata fields for the file.
Create person id 1
curl -XPUT 'http://localhost:9200/test/person/1' -d '{
"my_attachment" :
"Rm9ybWVyIGNlbGVicml0eSBwdWJsaWNpc3QgTWF4IENsaWZmb3JkIGhhcyBoYWQgYW4gYXBwZWFsIGFnYWluc3QgaGlzIGVpZ2h0LXllYXIgc2VudGVuY2UgZm9yIHNleCBvZmZlbmNlcyByZWplY3RlZCBieSB0aGUgQ291cnQgb2YgQXBwZWFsLg0KDQpUaGUgY291cnQgcnVsZWQgdGhlIHNlbnRlbmNlIGhhbmRlZCB0byBDbGlmZm9yZCBlYXJsaWVyIHRoaXMgeWVhciB3YXMganVzdGlmaWVkIGFuZCBjb3JyZWN0Lg0KDQpDbGlmZm9yZCB3YXMgY29udmljdGVkIGluIEFwcmlsIG9mIGVpZ2h0IGhpc3RvcmljYWwgaW5kZWNlbnQgYXNzYXVsdHMgb24gd29tZW4gYW5kIG9uIGdpcmxzIGFzIHlvdW5nIGFzIDE1Lg0KDQpIaXMgbGF3eWVyIGhhZCBhcmd1ZWQgdGhlIHNlbnRlbmNlIHdhcyAidW5mYWlyIiBhbmQgY2xhaW1lZCBDbGlmZm9yZCB3YXMgbm90IGEgdGhyZWF0IHRvIHdvbWVuLg=="
}'
Response
{
"_index": "test",
"_type": "person",
"_id": "1",
"_version": 1,
"created": true
}
Looks good. Let's get that record back.
Get person id 1
curl -XGET 'http://localhost:9200/test/person/1'
Response
{
"_index": "test",
"_type": "person",
"_id": "1",
"_version": 1,
"found": true,
"_source": {
"my_attachment":
"Rm9ybWVyIGNlbGVicml0eSBwdWJsaWNpc3QgTWF4IENsaWZmb3JkIGhhcyBoYWQgYW4gYXBwZWFsIGFnYWluc3QgaGlzIGVpZ2h0LXllYXIgc2VudGVuY2UgZm9yIHNleCBvZmZlbmNlcyByZWplY3RlZCBieSB0aGUgQ291cnQgb2YgQXBwZWFsLg0KDQpUaGUgY291cnQgcnVsZWQgdGhlIHNlbnRlbmNlIGhhbmRlZCB0byBDbGlmZm9yZCBlYXJsaWVyIHRoaXMgeWVhciB3YXMganVzdGlmaWVkIGFuZCBjb3JyZWN0Lg0KDQpDbGlmZm9yZCB3YXMgY29udmljdGVkIGluIEFwcmlsIG9mIGVpZ2h0IGhpc3RvcmljYWwgaW5kZWNlbnQgYXNzYXVsdHMgb24gd29tZW4gYW5kIG9uIGdpcmxzIGFzIHlvdW5nIGFzIDE1Lg0KDQpIaXMgbGF3eWVyIGhhZCBhcmd1ZWQgdGhlIHNlbnRlbmNlIHdhcyAidW5mYWlyIiBhbmQgY2xhaW1lZCBDbGlmZm9yZCB3YXMgbm90IGEgdGhyZWF0IHRvIHdvbWVuLg=="
}
}
The attachment has been added as a string, and there are no additional
metadata fields.
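A possible further check, sketched under assumptions: `_source` always returns the JSON exactly as it was submitted, so the GET above cannot show on its own whether the plugin extracted anything. Searching for a word that only appears in the decoded text would show whether the content was indexed. The search term and the `query` variable are illustrative choices, and the request assumes the same node, index, and type as the steps above.

```shell
# Illustrative check: match a word that appears only in the decoded text,
# not in the Base64 string itself.
query='{
  "query": {
    "match": { "my_attachment": "publicist" }
  }
}'

# Assumes the node from the steps above is listening on localhost:9200.
curl -s --max-time 5 -XGET 'http://localhost:9200/test/person/_search?pretty' -d "$query" \
  || echo "request failed (is the node running?)"
```

A hit for document 1 would suggest the text was extracted and indexed even though `_source` still shows only the Base64 string. Newly indexed documents only become searchable after a refresh, so an immediate search can come back empty.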
Here's my system info, retrieved via
curl -XGET 'http://localhost:9200/_nodes'
{
"cluster_name": "elasticsearch",
"nodes": {
"QWhhRNIOTUWX_1OxGSJOvA": {
"name": "Franz Kafka",
"transport_address": "inet[/192.168.76.148:9300]",
"host": "WIN-23CNBGGKSSE",
"ip": "192.168.76.148",
"version": "1.4.0",
"build": "bc94bd8",
"http_address": "inet[/192.168.76.148:9200]",
"settings": {
"node": {
"name": "Franz Kafka"
},
"client": {
"type": "node"
},
"http": {
"cors": {
"enabled": "true",
"allow-origin":
"/https?:\/\/local.kibana(:[0-9]+)?/"
}
},
"name": "Franz Kafka",
"path": {
"data": "c:\apps\elasticsearch\data",
"work": "c:\apps\elasticsearch",
"home": "c:\apps\elasticsearch",
"conf": "c:\apps\elasticsearch\config",
"logs": "c:/apps/elasticsearch/logs"
},
"cluster": {
"name": "elasticsearch"
},
"config":
"c:\apps\elasticsearch\config\elasticsearch.yml",
"plugin": {
"mandatory": "mapper-attachments"
}
},
"os": {
"refresh_interval_in_millis": 1000,
"available_processors": 4,
"cpu": {
"vendor": "Intel",
"model": "Xeon",
"mhz": 2666,
"total_cores": 4,
"total_sockets": 1,
"cores_per_socket": 4,
"cache_size_in_bytes": -1
},
"mem": {
"total_in_bytes": 8589402112
},
"swap": {
"total_in_bytes": 17176915968
}
},
"process": {
"refresh_interval_in_millis": 1000,
"id": 6048,
"max_file_descriptors": -1,
"mlockall": false
},
"jvm": {
"pid": 6048,
"version": "1.7.0_71",
"vm_name": "Java HotSpot(TM) 64-Bit Server VM",
"vm_version": "24.71-b01",
"vm_vendor": "Oracle Corporation",
"start_time_in_millis": 1415361000462,
"mem": {
"heap_init_in_bytes": 268435456,
"heap_max_in_bytes": 1038876672,
"non_heap_init_in_bytes": 24313856,
"non_heap_max_in_bytes": 136314880,
"direct_max_in_bytes": 1038876672
},
"gc_collectors": [
"ParNew",
"ConcurrentMarkSweep"
],
"memory_pools": [
"Code Cache",
"Par Eden Space",
"Par Survivor Space",
"CMS Old Gen",
"CMS Perm Gen"
]
},
"thread_pool": {
"generic": {
"type": "cached",
"keep_alive": "30s",
"queue_size": -1
},
"index": {
"type": "fixed",
"min": 4,
"max": 4,
"queue_size": "200"
},
"bench": {
"type": "scaling",
"min": 1,
"max": 2,
"keep_alive": "5m",
"queue_size": -1
},
"get": {
"type": "fixed",
"min": 4,
"max": 4,
"queue_size": "1k"
},
"snapshot": {
"type": "scaling",
"min": 1,
"max": 2,
"keep_alive": "5m",
"queue_size": -1
},
"merge": {
"type": "scaling",
"min": 1,
"max": 2,
"keep_alive": "5m",
"queue_size": -1
},
"suggest": {
"type": "fixed",
"min": 4,
"max": 4,
"queue_size": "1k"
},
"bulk": {
"type": "fixed",
"min": 4,
"max": 4,
"queue_size": "50"
},
"optimize": {
"type": "fixed",
"min": 1,
"max": 1,
"queue_size": -1
},
"warmer": {
"type": "scaling",
"min": 1,
"max": 2,
"keep_alive": "5m",
"queue_size": -1
},
"flush": {
"type": "scaling",
"min": 1,
"max": 2,
"keep_alive": "5m",
"queue_size": -1
},
"search": {
"type": "fixed",
"min": 12,
"max": 12,
"queue_size": "1k"
},
"listener": {
"type": "fixed",
"min": 2,
"max": 2,
"queue_size": -1
},
"percolate": {
"type": "fixed",
"min": 4,
"max": 4,
"queue_size": "1k"
},
"management": {
"type": "scaling",
"min": 1,
"max": 5,
"keep_alive": "5m",
"queue_size": -1
},
"refresh": {
"type": "scaling",
"min": 1,
"max": 2,
"keep_alive": "5m",
"queue_size": -1
}
},
"network": {
"refresh_interval_in_millis": 5000,
"primary_interface": {
"address": "192.168.76.148",
"name": "eth6",
"mac_address": "00:0C:29:80:70:CA"
}
},
"transport": {
"bound_address": "inet[/0:0:0:0:0:0:0:0:9300]",
"publish_address": "inet[/192.168.76.148:9300]"
},
"http": {
"bound_address": "inet[/0:0:0:0:0:0:0:0:9200]",
"publish_address": "inet[/192.168.76.148:9200]",
"max_content_length_in_bytes": 104857600
},
"plugins": [
{
"name": "mapper-attachments",
"version": "2.4.1",
"description": "Adds the attachment type allowing to parse difference attachment formats",
"jvm": true,
"site": false
},
{
"name": "kopf",
"version": "1.3.7",
"description": "kopf - simple web administration tool for ElasticSearch",
"url": "/_plugin/kopf/",
"jvm": false,
"site": true
}
]
}
}
}
And my elasticsearch log from startup.
[2014-11-07 13:23:59,256][INFO ][node ] [Franz Kafka] version[1.4.0], pid[6928], build[bc94bd8/2014-11-05T14:26:12Z]
[2014-11-07 13:23:59,256][INFO ][node ] [Franz Kafka] initializing ...
[2014-11-07 13:23:59,319][INFO ][plugins ] [Franz Kafka] loaded [mapper-attachments], sites [kopf]
[2014-11-07 13:24:03,503][INFO ][node ] [Franz Kafka] initialized
[2014-11-07 13:24:03,503][INFO ][node ] [Franz Kafka] starting ...
[2014-11-07 13:24:03,643][INFO ][transport ] [Franz Kafka] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/192.168.76.148:9300]}
[2014-11-07 13:24:03,784][INFO ][discovery ] [Franz Kafka] elasticsearch/A4ONWcVyRIiJxaVw0Mm0uA
[2014-11-07 13:24:07,566][INFO ][cluster.service ] [Franz Kafka] new_master [Franz Kafka][A4ONWcVyRIiJxaVw0Mm0uA][WIN-23CNBGGKSSE][inet[/192.168.76.148:9300]], reason: zen-disco-join (elected_as_master)
[2014-11-07 13:24:07,705][INFO ][http ] [Franz Kafka] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/192.168.76.148:9200]}
[2014-11-07 13:24:07,705][INFO ][node ] [Franz Kafka] started
[2014-11-07 13:24:08,416][INFO ][gateway ] [Franz Kafka] recovered [1] indices into cluster_state
I've also set mapper-attachments as a mandatory plugin in the config, so
it must be loading, since the node starts up OK.
I'd really appreciate some help on this. I'm sure I'm making some newbie
mistake with the mapping or something, but the documentation isn't helping
me here.