Reindexing tweets to change a field type in Elasticsearch using Logstash


(Pablo Ernesto Vigneaux Wilton) #1

Hi guys,

I know Twitter data is a topic with plenty of questions and answers already, but I couldn't find a solution. My problem is the same one others have run into: coordinates isn't mapped as a geo_point type.

I want to use the tweets that are already in Elasticsearch, so I believe I can reindex them with Logstash by applying a "mutate" filter. Is that correct? I tried the following conf file:

input {
   elasticsearch {
       hosts => ["localhost:9200"]
       index => "twitter_fln"
       size => 1000
       scroll => "5m"
       docinfo => true
       scan => true
   }
}

filter {
    mutate {
        update => {"coordinates.type" => "geo_point"}
    }
}

output {
    elasticsearch {
        hosts => ["localhost:9200"]
        index => "twitter_re"
        document_type => "twitter"
    }
    stdout {
        codec => "dots"
    } 
}

The index is created and the documents are saved, but the field "coordinates.type" is still mapped as "double" and not as "geo_point". I suspect that the name I'm using is wrong, possibly the structure (coordinates.type). How do I change this field's type using Logstash?

Logstash version: 2.4.0
Elasticsearch version: 2.4.1

Sorry for my English.
Any help is welcome!

Regards!


(João Duarte) #2

You have to state in the Elasticsearch mapping that the field coordinates.type will be of type geo_point.

So you can either install a custom Elasticsearch template yourself and turn off Logstash's template management (manage_template => false), or tell Logstash to install your custom template that describes that field as a geo_point. The option for that is template => "/file/path".

See more at https://www.elastic.co/guide/en/logstash/current/plugins-outputs-elasticsearch.html#plugins-outputs-elasticsearch-template

The Logstash default templates:

For ES 2.x: https://github.com/logstash-plugins/logstash-output-elasticsearch/blob/v5.3.3/lib/logstash/outputs/elasticsearch/elasticsearch-template-es2x.json
For ES 5.x: https://github.com/logstash-plugins/logstash-output-elasticsearch/blob/v5.3.3/lib/logstash/outputs/elasticsearch/elasticsearch-template-es5x.json
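
As a rough sketch of the second option, the output section could look something like the following. The template file name and the template_name value are made up for illustration. The JSON file itself is not shown; it would need an index pattern matching twitter_re and a mapping that declares the coordinate field as geo_point, in the ES 2.x template format linked above.

```
output {
    elasticsearch {
        hosts => ["localhost:9200"]
        index => "twitter_re"
        document_type => "twitter"
        # Hypothetical path to a custom template that maps the coordinate field as geo_point
        template => "/file/path/twitter_template.json"
        # Hypothetical template name; if omitted, the default name "logstash" is used
        template_name => "twitter_re"
        # Replace a template with the same name that is already stored in Elasticsearch
        template_overwrite => true
    }
}
```

Note that index templates are only applied when an index is created, so an existing twitter_re index would have to be deleted (or a new index name used) before running the reindex again.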


(Pablo Ernesto Vigneaux Wilton) #3

Hi João,

Thank you for the answer, but I have a question... Is it really necessary to change the mapping / template when re-indexing with Logstash and "mutate"? Obviously the field "coordinates.type" from the "input" is a Float, but using "mutate" the "output" is typed to geo_point... or is it incorrect? I just want to understand the correct behavior of Logstash.

I know that I need to create a new "twitter.json" mapping with "coordinates.type" as a geo_point, and apply it through a template for a new "import" / "extraction" from Twitter, but I had understood that this was not necessary for a re-indexing...


(Magnus Bäck) #4

Obviously the field "coordinates.type" from the "input" is a Float, but using "mutate" the "output" is typed to geo_point... or is it incorrect?

The mutate filter's update option changes an existing value of a field. It doesn't change its type (that's what the convert option does). Secondly, you should think of an event inside Logstash as a JSON object. Values in JSON objects are strings, bools, numbers, arrays, or objects. There is no geo point type in JSON. Consequently, it's impossible to change a field's type to geo point using any kind of filter. Mapping a field as a geo point is all done by Elasticsearch.
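
To illustrate the difference between the two options, here is a minimal sketch. The field names and values are only examples and may not match your documents.

```
filter {
    mutate {
        # update: overwrite the value of a field that already exists in the event
        update => { "source" => "reindexed" }
        # convert: change the JSON-level type of a field's value (here, to an integer)
        convert => { "retweet_count" => "integer" }
    }
}
```

Neither option produces a geo point, because convert can only target types that exist in the event itself, such as integer, float, string, and boolean.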

