External datasets in ES


(Michael) #1

From reading http://www.elasticsearch.org/blog/enriching-searches-open-geo-data/
I have a few questions that I hope the community might be able to answer.

As an example, the post uses an open dataset in a static CSV file to map
German cities that meet certain conditions in Kibana.

I was wondering if it's possible to take that idea and:

  1. Combine a static CSV dataset with other ES data. Sticking with the
    cities example, I would like to live-map the visitors to my German
    website who come from cities with populations > 100k, using the same ES
    cluster and ideally the same Kibana interface.
  2. If that is possible, how do I then update the population details when a
    newer version of the dataset becomes available, without ending up with two
    copies of every German city with possibly conflicting population values?

Any ideas?

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/76a076dd-d839-4b30-bed6-f11c2577550d%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(Alexander Reelsen) #2

Hey,

This would be a bit trickier, as it requires you to merge two events
(the external dataset and your live visitor stats) into a single event as a
sort of preprocessing step. I think I would start with the geoip support
in Logstash and use your Apache logs, which at least allows you to filter
by city.
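
To make the "merge two events" idea a bit more concrete, here is a minimal sketch of such a preprocessing step in plain Python. It is not taken from the blog post or from Logstash: the column names (`city`, `population`), the sample rows, the event shape, and the helper names `load_populations` and `enrich` are all assumptions for illustration, and the population figures are placeholders rather than real data.

```python
import csv
import io

# Hypothetical sample of the open dataset; column names and values are
# placeholders, not real figures.
CITIES_CSV = """city,population
Berlin,3400000
Munich,1350000
Passau,50000
"""


def load_populations(csv_text):
    """Build a lookup table: lower-cased city name -> population."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["city"].strip().lower(): int(row["population"]) for row in reader}


def enrich(event, populations):
    """Return a copy of a visitor event with the city's population merged in.

    Assumes the event already carries a city name, for example one added
    by a geoip lookup earlier in the pipeline.
    """
    enriched = dict(event)
    city = (event.get("city") or "").strip().lower()
    enriched["population"] = populations.get(city)  # None if the city is unknown
    return enriched


populations = load_populations(CITIES_CSV)
visits = [
    {"ip": "192.0.2.10", "city": "Berlin"},
    {"ip": "192.0.2.11", "city": "Passau"},
    {"ip": "192.0.2.12", "city": "Atlantis"},
]
enriched_visits = [enrich(v, populations) for v in visits]

# Keep only visits from cities with more than 100k inhabitants.
large_city_visits = [v for v in enriched_visits if (v["population"] or 0) > 100_000]
```

Because the population ends up on each visitor event before indexing, a filter such as "population > 100k" can then be applied to the visitor data directly. The trade-off is that events that are already indexed keep the population value they were enriched with.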

I need to think a bit more about how to merge this kind of information.
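
On the second question (refreshing the dataset without getting every city twice), one approach that might work is to derive each document's ID from a stable key in the dataset, since indexing a document under an ID that already exists replaces the old document. The sketch below only builds the newline-delimited body for a bulk request and does not talk to a cluster. The index name `cities`, the field names, the helper `bulk_lines`, and the use of the city name as the key are assumptions; older Elasticsearch versions also expect a `_type` in the action line.

```python
import json


def bulk_lines(cities, index="cities"):
    """Yield bulk-request lines where each city's _id is derived from its name."""
    for city in cities:
        # Assumption: the city name is unique in the dataset. An official
        # key from the dataset would be a safer choice if one exists.
        doc_id = city["city"].strip().lower().replace(" ", "-")
        yield json.dumps({"index": {"_index": index, "_id": doc_id}})
        yield json.dumps(city)


# Placeholder values standing in for two versions of the dataset.
old_version = [{"city": "Berlin", "population": 3400000}]
new_version = [{"city": "Berlin", "population": 3500000}]

old_body = "\n".join(bulk_lines(old_version)) + "\n"
new_body = "\n".join(bulk_lines(new_version)) + "\n"
```

Both versions produce the same action line for Berlin, so loading the newer file would overwrite the existing document rather than add a second one.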

--Alex

On Thu, May 1, 2014 at 3:19 PM, michael.obrien@ul.ie wrote:



(Michael) #3

I was wondering whether you thought tribe nodes, with one cluster for the open-geo data and a separate one for the Logstash data, would be a way to go?

http://www.elasticsearch.org/blog/tribe-node/

From my reading of the post above, it would add a little more configuration but would allow data updates to be kept separate. Or have I misunderstood the concept of tribes?

From: elasticsearch@googlegroups.com [mailto:elasticsearch@googlegroups.com] On Behalf Of Alexander Reelsen
Sent: 05 May 2014 12:17
To: elasticsearch@googlegroups.com
Subject: Re: External datasets in ES



(Alexander Reelsen) #4

Hey,

The tribe node (or federated search) is intended to join different clusters
as one so that operations can be executed against both. However, this is only
useful if you do not have control over those two clusters (for example, if
they are managed by two different departments) or their data. In your case
it does not give you much of an advantage, at least as I understood your
setup. Read more about it at http://www.elasticsearch.org/blog/tribe-node/

--Alex

On Tue, May 6, 2014 at 11:16 AM, Michael.OBrien <Michael.OBrien@ul.ie> wrote:


