Looking for Elasticsearch projects

For those who are not regulars on the mailing list, I am a fairly active
member who has used Elasticsearch for years.

I am leaving my full-time job to focus on other (techie and non-techie)
goals and would love to work on some interesting projects part-time. It can
be either paid assignments or free open-source projects. My main interest
is search, with a focus on development. I am not too keen on devops tasks such
as administering servers. I would rather work on my own stuff than be a
sysadmin. :)

Feel free to contact me directly via email.

Cheers,

Ivan

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CALY%3DcQC1U5cFm_wOOx378Aq0nwQRCwOQShLSaqLxmY7qMJOnEQ%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

We could always use help with CirrusSearch. It is the open source project
that links MediaWiki to Elasticsearch. We have it installed on all the
wikis at the Wikimedia Foundation, but it isn't the default search backend
on the largest ones yet.

"Selling" points:
Huge user community
Basic queries work reasonably well
Expert syntax to support power users
PHP
Elastica
I manage the Elasticsearch installation
I contribute changes we need upstream
Uses a customized highlighter (also needs contributors)
Reasonably easy development installation with Vagrant
Working on it is my full-time job, so review would be quick

Nik
(In reply to Ivan Brusic, ivan@brusic.com, Sep 2, 2014 6:51 PM)

If you want to lend a hand for interesting projects, here are some of my
current favorites:

  • building a global library catalog index with Elasticsearch of all the
    open data / metadata on academic library servers, complete with harvester
    and updater, SRU, OAI etc. A starting point for SRU implementation is
    https://github.com/xbib/elasticsearch-sru

  • implementing a plugin for Elasticsearch that turns ES into a W3C Linked
    Data Platform (see the Linked Data Platform 1.0 Primer), with HTTP PATCH
    support, JSON Patch (RFC 6902), maybe even a SPARQL-to-ES DSL translator

  • a harvester/pull plugin framework for ES, in order to supersede the river
    singleton concept, with provisioning for all kinds of different sources,
    e.g. JDBC, or web crawling

  • helping British Library Labs to find correct image legend texts in OCR
    XML from the book scanning project. See "Millions of historical images
    posted to Flickr" (BBC News). I think Elasticsearch can handle the 230G
    zipped input. I got a copy from BL. No good algorithm exists yet. Maybe
    with ES? First step would be to design an index and to index/publish the
    OCR for better search?

Not sure where the incentives are. Everlasting fame, honor, glory, world
domination, super power etc.

Jörg

(In reply to Nikolas Everett, nik9000@gmail.com, Wed, Sep 3, 2014 at 1:47 AM)

Thanks Jörg.

The incentive for an open-source project is to pad my resume, since I have
been working with obsolete technologies and processes for almost the past
three years. I implemented many changes at my company (Elasticsearch,
Maven, central logging, application-level monitoring), but there is only so
much one person can do. Plus, I love this stuff. The incentive for simply
contracting is purely money! I do not really need the cash, but I plan to
embark on some travels and it would ease my mind a bit.

Your project list reminds me of a project I have been working on, but I
could use some help. I am looking for datasets that also include example
queries and golden records for those queries. My goal is to test different
similarity algorithms using unknown data. Would love to use the Wikipedia
dump, but I never found any golden records. Perhaps Nik has something. The
only ones I have found are the TREC datasets, but I was hoping for a more
sizable example.

Ivan
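
For readers unfamiliar with the idea, here is a minimal sketch of how
similarity algorithms could be compared against golden records. It is not
taken from the thread: the function names, the document IDs, and the choice
of precision-at-k and mean average precision as metrics are all assumptions
made for illustration. Each "run" is simply the ranked list of document IDs
that one similarity setting returned for a query.

```python
from typing import Dict, List, Set


def precision_at_k(ranked: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k results that are in the golden set."""
    if k <= 0:
        return 0.0
    return sum(1 for doc in ranked[:k] if doc in relevant) / k


def average_precision(ranked: List[str], relevant: Set[str]) -> float:
    """Average of the precision values at each rank holding a golden record."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant)


def mean_average_precision(
    runs: Dict[str, List[str]], golden: Dict[str, Set[str]]
) -> float:
    """Mean of average_precision over every query that has golden records."""
    scores = [average_precision(runs.get(q, []), rel) for q, rel in golden.items()]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical golden records and one hypothetical run (made-up IDs).
golden = {"q1": {"d1", "d3"}, "q2": {"x"}}
run_a = {"q1": ["d1", "d2", "d3", "d4"], "q2": ["y", "x"]}

print(precision_at_k(run_a["q1"], golden["q1"], 2))
print(mean_average_precision(run_a, golden))
```

Scoring one run per similarity setting with the same golden set gives a
single number per setting that can be compared directly.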

(In reply to joergprante@gmail.com, Wed, Sep 3, 2014 at 12:36 PM)

(In reply to Ivan Brusic, ivan@brusic.com, Thu, Sep 4, 2014 at 12:10 AM)

Ivan, I'm actually working on something like this (and I don't think Jörg
actually meant that..). I was involved with the Apache Lucene Open Relevance
Project, but it's now discontinued, and in some spare time I have I'm trying
to take that initiative forward.

Ping me privately if that sounds interesting and we can continue discussing.

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer & Consultant
Author of RavenDB in Action http://manning.com/synhershko/

I did not realize Jörg's response was to the list and not private (as
most other responses were). I am thankful that I did not bad-mouth my
employer too badly! :)

I am very aware of the open relevancy project and its discontinued status.
I emailed the Lucene mailing list about it not too long ago. Would love to
work on something in that regard.

Cheers,

Ivan

(In reply to Itamar Syn-Hershko, itamar@code972.com, Wed, Sep 3, 2014 at 2:16 PM)

Well, this response is also public :)

I'll ping you sometime next week with more details, juggling with too many
things currently. Would definitely love to have an extra set of eyes.

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer & Consultant
Author of RavenDB in Action http://manning.com/synhershko/

(In reply to Ivan Brusic, ivan@brusic.com, Thu, Sep 4, 2014 at 12:36 AM)