Elasticsearch mapper-attachments 2.2

I wanted to use Apache Tika, via the mapper-attachments plugin, to import data from some Microsoft Word documents, so I started following the documentation here:


I PUT the base64 string into Elasticsearch, thinking the plugin would convert it at index time, but when I run a search afterwards the data is still in base64.

Any idea what is happening?

Output looks like this:

{
  "took": 3,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.095891505,
    "hits": [
      {
        "_index": "trying-out-mapper-attachments",
        "_type": "person",
        "_id": "1",
        "_score": 0.095891505,
        "_source": {
          "cv": "e1xydGYxXGFuc2kNCkxvcmVtIGlwc3VtIGRvbG9yIHNpdCBhbWV0DQpccGFyIH0="
        }
      }
    ]
  }
}

I was expecting to see something like this:

echo e1xydGYxXGFuc2kNCkxvcmVtIGlwc3VtIGRvbG9yIHNpdCBhbWV0DQpccGFyIH0= | base64 --decode

{\rtf1\ansi
Lorem ipsum dolor sit amet
\par }
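
The same check can be done without the shell. This is a minimal Python sketch using only the standard library; the variable names are illustrative. It shows that the stored value is a tiny RTF document containing the text that the search later matches on.

```python
import base64

# The "cv" value copied from the _source of the search hit.
cv_base64 = "e1xydGYxXGFuc2kNCkxvcmVtIGlwc3VtIGRvbG9yIHNpdCBhbWV0DQpccGFyIH0="

# Decode to raw bytes; this sample happens to be plain ASCII RTF.
raw = base64.b64decode(cv_base64)
text = raw.decode("ascii")

print(text)
```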

We never modify the source, so it is stored exactly as you sent it.

But the content has been extracted and indexed behind the scenes, so you can probably search for it.

Thanks David. I suspected the source would not be modified, and the example search from the documentation does return the unmodified source, which suggests that it is matching. How do I get to see the data that has been extracted?

POST /trying-out-mapper-attachments/person/_search
{
  "query": {
    "query_string": {
      "query": "ipsum"
    }
  }
}
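
One possible way to see the extracted text, offered as a sketch rather than a confirmed answer from this thread: since `_source` is never rewritten, the extracted content would have to be kept as a stored field and requested explicitly. The Python below only builds and prints the two request bodies; it does not contact a cluster. It assumes the 2.x plugin lets the `content` sub-field be marked `"store": true` in the mapping and that the search API's `fields` parameter returns stored fields, so check both against the documentation for your plugin version. It also assumes the mapping is applied to a fresh index before the document is indexed.

```python
import json

INDEX = "trying-out-mapper-attachments"
DOC_TYPE = "person"

# Assumption: storing the "content" sub-field keeps a copy of the
# extracted text that can be returned separately from _source.
mapping_body = {
    DOC_TYPE: {
        "properties": {
            "cv": {
                "type": "attachment",
                "fields": {
                    "content": {"type": "string", "store": True}
                },
            }
        }
    }
}

# Assumption: "fields" asks for stored fields instead of _source.
search_body = {
    "fields": ["cv.content"],
    "query": {"query_string": {"query": "ipsum"}},
}

print("PUT /%s/_mapping/%s" % (INDEX, DOC_TYPE))
print(json.dumps(mapping_body, indent=2))
print("POST /%s/%s/_search" % (INDEX, DOC_TYPE))
print(json.dumps(search_body, indent=2))
```

If these assumptions hold, each hit would carry a `fields` section with `cv.content` holding the extracted text, while `_source` would still contain the original base64 string.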