Post Processing of Indexed Data with Surfiki Refine

Hi everyone, we are avid users of Elasticsearch, both for ourselves and
for our clients (http://www.intridea.com). In fact, we use Elasticsearch in
one of our products, Surfiki. More on that here: http://surfiki.com/about

However, that isn't why I am writing.

After using Elasticsearch for quite some time, we found the need to do
post-processing and inline processing on our data. Some examples are:

  1. Combining data from multiple indices into a new index, either ad hoc
    or on a scheduled basis.
    1a. Splitting data from a single index into new indices based upon some
    criteria.
  2. Statistical facets had been causing us some heap issues, and we found
    that an easier solution was to process/transform/count this data and
    expose it within a new index on a set schedule.
  3. Transforming data already indexed with additional data. For example,
    some of our data has location information (lat, long), but we also
    wanted to store metadata associated with those locations, such as State,
    City and County information. While it makes sense to do some of this
    inline, it seemed to make more sense to do it after the initial data was
    already indexed.
  4. Accessing third party APIs to append data to existing indexed data.
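
To make examples 1a and 3 a little more concrete, here is a minimal sketch
in plain Python. It is not Surfiki Refine code and makes no Elasticsearch
calls: the documents are in-memory dicts, and the REGIONS table, the
"places-" index prefix and the enrich/split_by_state helpers are all
hypothetical names made up for illustration. A real job would read the
documents from an index and write the results back to new ones.

```python
from collections import defaultdict

# Hypothetical lookup table keyed by coordinates rounded to one decimal.
# A real job would more likely call a reverse-geocoding service or API.
REGIONS = {
    (38.9, -77.0): {"state": "DC", "city": "Washington",
                    "county": "District of Columbia"},
    (37.8, -122.4): {"state": "CA", "city": "San Francisco",
                     "county": "San Francisco"},
}


def enrich(doc):
    """Return a copy of doc with state/city/county added when known."""
    key = (round(doc["lat"], 1), round(doc["long"], 1))
    enriched = dict(doc)
    enriched.update(REGIONS.get(key, {}))
    return enriched


def split_by_state(docs):
    """Group documents under a (hypothetical) target index name per state."""
    targets = defaultdict(list)
    for doc in docs:
        state = doc.get("state", "unknown").lower()
        targets["places-" + state].append(doc)
    return dict(targets)


# Stand-ins for documents that are already indexed.
docs = [
    {"id": 1, "lat": 38.8951, "long": -77.0364},
    {"id": 2, "lat": 37.7749, "long": -122.4194},
    {"id": 3, "lat": 0.0, "long": 0.0},
]

by_index = split_by_state(enrich(d) for d in docs)
for index_name, index_docs in sorted(by_index.items()):
    print(index_name, [d["id"] for d in index_docs])
```

Documents whose location is not in the lookup table are left unchanged and
end up under "places-unknown", so nothing is dropped by the split.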

For this we decided to create a product, Surfiki Refine, that combines
several open source tools with new code; the result is a Python-based
map-reduce tier. It met all of our additional processing needs and more.
We are going to release it as an open source tool in the coming week.
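
For anyone who has not worked with map-reduce in Python, the general idea
can be sketched as below. This is only an illustration of the pattern
applied to example 2 (precomputing counts and statistics so they can be
stored in a new index); it is not the Surfiki Refine API. The "category"
and "price" field names and the map_doc/reduce_values/map_reduce functions
are assumptions made up for the sketch.

```python
from collections import defaultdict


def map_doc(doc):
    """Map step: emit (key, value) pairs for one document."""
    yield doc["category"], doc["price"]


def reduce_values(key, values):
    """Reduce step: collapse all values for one key into a summary doc."""
    values = list(values)
    total = sum(values)
    return {
        "category": key,
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "total": total,
        "mean": total / len(values),
    }


def map_reduce(docs):
    """Run the map step over every doc, group by key, then reduce."""
    grouped = defaultdict(list)
    for doc in docs:
        for key, value in map_doc(doc):
            grouped[key].append(value)
    return [reduce_values(k, v) for k, v in sorted(grouped.items())]


# Stand-ins for documents that are already indexed.
docs = [
    {"category": "a", "price": 10.0},
    {"category": "a", "price": 20.0},
    {"category": "b", "price": 5.0},
]

summary = map_reduce(docs)
for row in summary:
    print(row)
```

Each summary row could then be indexed as its own document on a schedule,
so queries read the precomputed numbers instead of computing them at
search time.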

My question is this:

  1. For people who expose their indices publicly via API, is a hosted
    version of this platform of interest to you? Does hosting it at a
    minimal cost make sense, in order to have a platform ready and running
    for use against your public API? We are considering a hosted version as
    a viable commercial offering. We also have a distributed version that
    can tackle heftier data transforms and manipulation.

Either way, it will be released and you guys/gals can play with it and see
if it meets any needs or resolves any issues you are encountering when
you need to manipulate data that is already indexed.

I wanted to give a heads up and get your opinion on the above.

Attached are some screenshots of the tool/platform.

Some nice features are:

Completely browser-based
Built exclusively to work with Elasticsearch
Browser-based code editing
Job creation and management

Thanks,

Anthony Nyström
Fellow, Managing Director of Engineering
Intridea, Inc. | www.intridea.com
anthony@intridea.com
(o) 888.968.4332 x502
(c) 650.417.3203
