[ANN] Elasticsearch Mapper Attachment plugin 2.3.0 released

Heya,

We are pleased to announce the release of the Elasticsearch Mapper Attachment plugin, version 2.3.0.

The mapper attachments plugin adds the attachment type to Elasticsearch using Apache Tika.

Release Notes - elasticsearch-mapper-attachments - Version 2.3.0

Update:

Issues, pull requests, and feature requests are warmly welcome on the elasticsearch-mapper-attachments project repository: https://github.com/elasticsearch/elasticsearch-mapper-attachments/
For questions or comments about this plugin, feel free to use the elasticsearch mailing list: https://groups.google.com/forum/#!forum/elasticsearch

Enjoy,

-The Elasticsearch team

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/53d2de83.4624b40a.1097.0157SMTPIN_ADDED_MISSING%40gmr-mx.google.com.
For more options, visit https://groups.google.com/d/optout.

Hi dadoonet,

We have been using Elasticsearch 1.3.0 with the matching version of the mapper attachments plugin. We are now moving to Elasticsearch 1.5.2, so we need the updated mapper attachments plugin as well.

In the latest version, content is deprecated in favor of _content, so our application needs to switch from content to _content when retrieving documents. My question is: is there a way to reindex the Elasticsearch data so that older records are also returned with _content instead of content (for example by changing some configuration, reindexing, or something else)?

Just a note before answering: the change is about the document you send to Elasticsearch. Instead of sending:

{
  "file": {
    "content": "VGhpcyBpcyBhbiBlbGFzdGljc2VhcmNoIG1hcHBlciBhdHRhY2htZW50IHRlc3Qu",
    "_name": "myfilename.txt"
  }
}

You now send:

{
  "file": {
    "_content": "VGhpcyBpcyBhbiBlbGFzdGljc2VhcmNoIG1hcHBlciBhdHRhY2htZW50IHRlc3Qu",
    "_name": "myfilename.txt"
  }
}

It’s not related to generated field names at index time.
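
As a rough illustration of the new request body, here is a minimal Python sketch that builds it from raw bytes. The helper name build_attachment_doc is hypothetical, and the "file" field and file name are simply taken from the example above. It only constructs the JSON and does not talk to Elasticsearch.

```python
import base64
import json


def build_attachment_doc(raw_bytes, filename, field="file"):
    """Build an index request body using the new "_content" key.

    Hypothetical helper for illustration; "file" is the attachment
    field name used in the example above.
    """
    # The attachment payload is sent as a BASE64-encoded string.
    encoded = base64.b64encode(raw_bytes).decode("ascii")
    return {field: {"_content": encoded, "_name": filename}}


doc = build_attachment_doc(
    b"This is an elasticsearch mapper attachment test.", "myfilename.txt"
)
print(json.dumps(doc, indent=2))
```

The sample text here encodes to the same BASE64 string shown in the example documents above.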

That said, I think you would do better to handle this on the client side, unless you only have a few docs to reindex.
To answer your question: yes, you have to reindex if you want to use the new format instead of the old one.
You can try to work around it with script fields, though: http://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-script-fields.html

Does it help?

--
David Pilato - Developer | Evangelist

@dadoonet https://twitter.com/dadoonet | @elasticsearchfr https://twitter.com/elasticsearchfr | @scrutmydocs https://twitter.com/scrutmydocs

In reply to Prashant Agrawal prashant.agrawal@paladion.net, 30 Apr 2015, 12:12.


Hi,
Thanks for the quick response.

Yes, I am changing content to _content in the index request.

My worry is that our application is already deployed. With the new change, I will not be able to fetch the older records using the same application, because older records will be returned with "content" (in the source) and new records with "_content".

So what would be the best way to handle this in Elasticsearch?

~Prashant

So why not do that in your application?

If you look into _source.file.content and _source.file._content, older docs will have BASE64 content in content and null in _content and the opposite for newer docs.
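
A minimal sketch of that client-side check, assuming the application works on the _source of each hit as a Python dict. The function name attachment_base64 is hypothetical, and the sample documents use placeholder BASE64 strings.

```python
def attachment_base64(source, field="file"):
    """Return the BASE64 payload from a hit's _source, old or new format.

    Newer docs store it under "_content", older docs under "content".
    Returns None if neither key is present.
    """
    attachment = source.get(field) or {}
    payload = attachment.get("_content")
    if payload is None:
        # Fall back to the key used by docs indexed in the old format.
        payload = attachment.get("content")
    return payload


# Placeholder documents in the old and the new format.
old_doc = {"file": {"content": "b2xk", "_name": "old.txt"}}
new_doc = {"file": {"_content": "bmV3", "_name": "new.txt"}}
print(attachment_base64(old_doc))
print(attachment_base64(new_doc))
```

With a helper like this, the rest of the application does not need to know which format a given document was indexed in.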

Am I missing anything?

--
David Pilato - Developer | Evangelist

@dadoonet https://twitter.com/dadoonet | @elasticsearchfr https://twitter.com/elasticsearchfr | @scrutmydocs https://twitter.com/scrutmydocs

In reply to Prashant Agrawal prashant.agrawal@paladion.net, 30 Apr 2015, 12:33.


Yes, though that can be handled as a special case in our application.

But I was wondering whether there is anything in ES that can reindex old documents in parallel, so that the content in older records ends up under "_content" only instead of "content".

Reindexing is not a built-in feature of Elasticsearch at the moment.
You can read "Reindexing Your Data" in Elasticsearch: The Definitive Guide: http://www.elastic.co/guide/en/elasticsearch/guide/master/reindex.html

How many documents do you have to reindex?
IIRC, in your case it could be a lot, right?
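
For what a client-side reindex would involve, here is a rough Python sketch of the per-document transformation only. The function names and the target index name are hypothetical, the hits are assumed to have the usual search response shape (_type, _id, _source), and the action layout is an assumption about what a bulk indexing client would accept. Reading the old documents (for example with scan and scroll) and sending the actions in bulk are left out.

```python
import copy


def to_new_format(source, field="file"):
    """Return a copy of _source with "content" renamed to "_content"."""
    updated = copy.deepcopy(source)
    attachment = updated.get(field)
    if isinstance(attachment, dict) and "content" in attachment:
        old_value = attachment.pop("content")
        # Keep an existing "_content" if a doc somehow has both keys.
        attachment.setdefault("_content", old_value)
    return updated


def reindex_actions(hits, target_index):
    """Yield one index action per hit for a bulk indexing client."""
    for hit in hits:
        yield {
            "_index": target_index,
            "_type": hit["_type"],
            "_id": hit["_id"],
            "_source": to_new_format(hit["_source"]),
        }


# One placeholder hit in the old format.
hits = [
    {"_type": "doc", "_id": "1",
     "_source": {"file": {"content": "b2xk", "_name": "old.txt"}}},
]
for action in reindex_actions(hits, "attachments_v2"):
    print(action)
```

Every document has to be read, rewritten, and indexed again, so the cost grows with the size of the index.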

--
David Pilato - Developer | Evangelist

@dadoonet https://twitter.com/dadoonet | @elasticsearchfr https://twitter.com/elasticsearchfr | @scrutmydocs https://twitter.com/scrutmydocs

In reply to Prashant Agrawal prashant.agrawal@paladion.net, 30 Apr 2015, 12:47.


OK, so handling this on the client side would be the best solution.

Yes, we have terabytes of records, which could mean billions of documents to reindex or handle differently.