ElasticSearch Output Connector for Apache ManifoldCF


(Piergiorgio Lucidi) #1

Hi guys,

I would like to create an ElasticSearch Output Connector for Apache
ManifoldCF. Here is my proposal on the project's official mailing
list:

http://mail-archives.apache.org/mod_mbox/incubator-connectors-dev/201111.mbox/<CAEO2op8wwS=tAwmbs2ASGDKeuxLkgtkmMBaT5Gq+-pV6G9Op_g@mail.gmail.com>

Is it possible to use the Attachment plugin feature directly from the
Java client?
I would prefer to use a single Java client for all the needed
operations, including the ingestion of binaries into ElasticSearch.
That way I can implement a clean, pure Java connector without relying
on any other HTTP client.
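
As a rough illustration of the binary ingestion mentioned above: the
attachment mapper plugin expects the file content as a Base64-encoded
string in a field mapped with type `attachment`, so the document itself
can be assembled in plain Java. This is only a minimal sketch, not
connector code. The class name and the field name `file` are
assumptions, and handing the resulting JSON to the ElasticSearch Java
client is deliberately left out.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class AttachmentDocSketch {
    // Hypothetical field name; it would have to match a field that is
    // mapped with type "attachment" in the target index.
    static final String FIELD = "file";

    /** Builds a JSON document whose single field holds the Base64-encoded bytes. */
    static String buildDocument(byte[] content) {
        String encoded = Base64.getEncoder().encodeToString(content);
        // Standard Base64 output only uses [A-Za-z0-9+/=], so it needs no JSON escaping.
        return "{\"" + FIELD + "\":\"" + encoded + "\"}";
    }

    public static void main(String[] args) {
        byte[] bytes = "hello".getBytes(StandardCharsets.UTF_8);
        System.out.println(buildDocument(bytes)); // prints {"file":"aGVsbG8="}
    }
}
```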

Let me know if anyone can help me achieve this goal.
Thank you for your support.

Regards,
Piergiorgio


(Piergiorgio Lucidi) #2

I'm sorry, here is the correct link to the Mail Archive; please select
the discussion "Proposal about a roadmap for connectors":

http://mail-archives.apache.org/mod_mbox/incubator-connectors-dev/201111.mbox/browser

Piergiorgio

On Nov 4, 4:32 pm, Piergiorgio Lucidi piergiorgioluc...@gmail.com
wrote:

Hi guys,

I would like to create an ElasticSearch Output Connector for Apache
ManifoldCF. Here is my proposal on the project's official mailing
list:

http://mail-archives.apache.org/mod_mbox/incubator-connectors-dev/201111.mbox/%3CCAEO2op8wwS=tAwmbs2ASGDKeuxLkgtkmMBaT5Gq+-pV6G9O...@mail.gmail.com%3E

Is it possible to use the Attachment plugin feature directly from the
Java client?
I would prefer to use a single Java client for all the needed
operations, including the ingestion of binaries into ElasticSearch.
That way I can implement a clean, pure Java connector without relying
on any other HTTP client.

Let me know if anyone can help me achieve this goal.
Thank you for your support.

Regards,
Piergiorgio


(Otis Gospodnetić) #3

Here's a direct link: http://search-lucene.com/m/Zwl7613RFMz1

The idea is to be able to suck content out of ES and index it with
Solr, say for migration purposes?

Otis

On Nov 4, 11:43 am, Piergiorgio Lucidi piergiorgioluc...@gmail.com
wrote:

I'm sorry, here is the correct link to the Mail Archive; please select
the discussion "Proposal about a roadmap for connectors":

http://mail-archives.apache.org/mod_mbox/incubator-connectors-dev/201...

Piergiorgio


(Piergiorgio Lucidi) #4

On Nov 5, 2:23 pm, Otis Gospodnetic otis.gospodne...@gmail.com
wrote:

Here's a direct link: http://search-lucene.com/m/Zwl7613RFMz1

The idea is to be able to suck content out of ES and index it with
Solr, say for migration purposes?

No, the idea behind ManifoldCF is to get content from repositories and
put it into search servers such as Apache Solr, MetaCarta,
OpenSearchServer (and ElasticSearch? :) ).
With ManifoldCF you can schedule jobs that build indexes for your
content.

For each job you can set the source repository (a Repository
Connection: CMIS, Documentum, SharePoint, etc.) and the target server
(an Output Connection: Apache Solr, MetaCarta, OpenSearchServer).

I think we could easily add an ElasticSearch Output Connector, which
would let ManifoldCF get content from repositories and index it into
ElasticSearch.
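
To make the job model above a bit more concrete, here is a minimal
sketch of how a job pairs one source with one target. The interfaces
`RepositoryConnection` and `OutputConnection` and the method names are
hypothetical stand-ins invented for this illustration; they are not the
real ManifoldCF connector API, which is described in the documentation
linked below.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class JobSketch {
    /** Hypothetical stand-in for a repository connection (CMIS, Documentum, ...). */
    interface RepositoryConnection {
        List<String> fetchDocuments();
    }

    /** Hypothetical stand-in for an output connection (Solr, ElasticSearch, ...). */
    interface OutputConnection {
        void index(String document);
    }

    /** A job pairs one source with one target and pushes every document across. */
    static int runJob(RepositoryConnection source, OutputConnection target) {
        int count = 0;
        for (String document : source.fetchDocuments()) {
            target.index(document);
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // In-memory fakes: a fixed document list as the source, a list as the "index".
        List<String> indexed = new ArrayList<>();
        int n = runJob(() -> Arrays.asList("a.pdf", "b.doc"), indexed::add);
        System.out.println(n + " documents indexed: " + indexed);
    }
}
```

An ElasticSearch Output Connector would, in these terms, be one more
implementation of the target side.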

Here are some references about ManifoldCF:
http://incubator.apache.org/connectors/

And how to write an Output Connector:
http://incubator.apache.org/connectors/writing-output-connectors.html

Let me know if someone (or you ;) ) is interested in supporting me or
in implementing this contribution yourself.
Thank you again.

Cheers,
Piergiorgio


(Lukáš Vlček) #5

Hi,

I explored this as well some time ago and it sounds like a good idea to
me (and from a quick look at the code it did not seem like a hard
task). Although I am not familiar with MCF, I would like to help if you
decide to tackle this.

Regards,
Lukas

On Tue, Nov 8, 2011 at 10:54 AM, Piergiorgio Lucidi <
piergiorgiolucidi@gmail.com> wrote:

On Nov 5, 2:23 pm, Otis Gospodnetic otis.gospodne...@gmail.com
wrote:

Here's a direct link: http://search-lucene.com/m/Zwl7613RFMz1

The idea is to be able to suck content out of ES and index it with
Solr, say for migration purposes?

No, the idea behind ManifoldCF is to get content from repositories and
put it into search servers such as Apache Solr, MetaCarta,
OpenSearchServer (and ElasticSearch? :) ).
With ManifoldCF you can schedule jobs that build indexes for your
content.

For each job you can set the source repository (a Repository
Connection: CMIS, Documentum, SharePoint, etc.) and the target server
(an Output Connection: Apache Solr, MetaCarta, OpenSearchServer).

I think we could easily add an ElasticSearch Output Connector, which
would let ManifoldCF get content from repositories and index it into
ElasticSearch.

Here are some references about ManifoldCF:
http://incubator.apache.org/connectors/

And how to write an Output Connector:
http://incubator.apache.org/connectors/writing-output-connectors.html

Let me know if someone (or you ;) ) is interested in supporting me or
in implementing this contribution yourself.
Thank you again.

Cheers,
Piergiorgio


(mjk) #6

I am interested in an ES connector as well. If you need additional
developer cycles, let me know. I am willing to help.

On Nov 11, 8:38 am, Lukáš Vlček lukas.vl...@gmail.com wrote:

Hi,

I explored this as well some time ago and it sounds like a good idea to
me (and from a quick look at the code it did not seem like a hard
task). Although I am not familiar with MCF, I would like to help if you
decide to tackle this.

Regards,
Lukas


(Piergiorgio Lucidi) #7

Lukas, Michael,

Thank you very much for your interest in developing this for Apache
ManifoldCF.

I'm finishing the implementation of the Alfresco Connector (for
Alfresco 1.x, 2.x, 3.x and 4.x), and then I would like to start
working on the ElasticSearch Output Connector.
Your support will be appreciated, and it will be necessary to bring
everything together.

I'll let you know soon how we can start on this contribution together.
Alternatively, if one of you (or both? :) ) feels confident enough to
release an initial contribution, it could be a great starting point for
collaborating with the Apache ManifoldCF community ;)

Thank you so much for your availability.

Regards,
Piergiorgio

On Nov 19, 5:08 am, mjk mj.kelle...@gmail.com wrote:

I am interested in an ES connector as well. If you need additional
developer cycles, let me know. I am willing to help.


(Piergiorgio Lucidi) #8

Hi guys,

A first iteration of the ElasticSearch plugin is finished, and it is
now included in the latest version of Apache ManifoldCF.

I wrote a post on my website about the latest changes:
http://www.open4dev.com/journal/2012/4/23/apache-manifoldcf-05-incubating-released.html

From this post you can follow some references to the project.

The source code of the ElasticSearch connector is available at this address:
https://svn.apache.org/repos/asf/incubator/lcf/trunk/connectors/elasticsearch/

If you have any tips or suggestions, please don't hesitate to contact
me with contributions or ideas.

Thank you again.

Cheers,
Piergiorgio

On Sunday, 20 November 2011 13:57:22 UTC+1, Piergiorgio Lucidi wrote:

Lukas, Michael,

Thank you very much for your interest in developing this for Apache
ManifoldCF.

I'm finishing the implementation of the Alfresco Connector (for
Alfresco 1.x, 2.x, 3.x and 4.x), and then I would like to start
working on the ElasticSearch Output Connector.
Your support will be appreciated, and it will be necessary to bring
everything together.

I'll let you know soon how we can start on this contribution together.
Alternatively, if one of you (or both? :) ) feels confident enough to
release an initial contribution, it could be a great starting point for
collaborating with the Apache ManifoldCF community ;)

Thank you so much for your availability.

Regards,
Piergiorgio


(Shay Banon) #9

Great stuff! Thanks for the effort.

On Mon, Apr 23, 2012 at 11:22 AM, Piergiorgio Lucidi <
piergiorgiolucidi@gmail.com> wrote:

Hi guys,

A first iteration of the ElasticSearch plugin is finished, and it is
now included in the latest version of Apache ManifoldCF.

I wrote a post on my website about the latest changes:

http://www.open4dev.com/journal/2012/4/23/apache-manifoldcf-05-incubating-released.html

From this post you can follow some references to the project.

The source code of the ElasticSearch connector is available at this address:
https://svn.apache.org/repos/asf/incubator/lcf/trunk/connectors/elasticsearch/

If you have any tips or suggestions, please don't hesitate to contact
me with contributions or ideas.

Thank you again.

Cheers,
Piergiorgio

