Hello.
I am ingesting tweets from Twitter into Elasticsearch with the following Logstash settings:
input {
twitter {
consumer_key => "XX"
consumer_secret => "XX"
oauth_token => "XX"
oauth_token_secret => "XX"
full_tweet => true
use_samples => true
languages => ["en", "de"]
}
}
output {
elasticsearch {
hosts => ["10.0.20.51:9200"]
index => "tweets-%{+YYYY.MM.dd}"
}
}
I do not need the massive JSON document, with more than 900 fields, in my ES index.
For example, this is what a single indexed tweet looks like:
{
"_index": "tweets-2018.07.24",
"_type": "doc",
"_id": "wE6hzGQB2mGdQWLhJvXj",
"_version": 1,
"_score": null,
"_source": {
"entities": {
"hashtags": [],
"urls": [
{
"expanded_url": "https://twitter.com/i/web/status/1021759356671598592",
"display_url": "twitter.com/i/web/status/1…",
"url": "https://t.co/TVqHFnvmUG",
"indices": [
117,
140
]
}
],
"user_mentions": [],
"symbols": []
},
"text": "En todo lo que va del año hasta ahora entraba a gim 10.20 pensando que era ese el horario (y encima llegaba tarde),… https://t.co/TVqHFnvmUG",
"in_reply_to_user_id_str": null,
"extended_tweet": {
"full_text": "En todo lo que va del año hasta ahora entraba a gim 10.20 pensando que era ese el horario (y encima llegaba tarde), hoy me enteré que entrábamos a las 11🤦",
"display_text_range": [
0,
154
],
"entities": {
"hashtags": [],
"urls": [],
"user_mentions": [],
"symbols": []
}
},
"quote_count": 0,
"geo": null,
"timestamp_ms": "1532441388658",
"@timestamp": "2018-07-24T14:09:48.000Z",
"favorited": false,
"reply_count": 0,
"truncated": true,
"contributors": null,
"in_reply_to_status_id_str": null,
"place": null,
"lang": "es",
"is_quote_status": false,
"@version": "1",
"retweet_count": 0,
"favorite_count": 0,
"source": "<a href="http://twitter.com/download/android" rel="nofollow">Twitter for Android",
"filter_level": "low"
}
How can I keep only selected fields, for example:
"@timestamp"
"lang"
etc.
using a filter? This is what I have tried so far:
filter {
json {
source => "@timestamp"
}
}
This is very confusing to me.
If anyone could point me to the right place, or show how to keep only the given fields, it would be amazing.
Thanks!
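
One possible direction, sketched under assumptions and not tested against this exact setup: the prune filter can drop every top-level field whose name does not match a whitelist. This assumes the logstash-filter-prune plugin is available in the Logstash installation (it may need to be installed separately). The two field names are simply the ones mentioned in the question; the entries are regular expressions, so they are anchored with ^ and $.

```
filter {
  # Keep only the whitelisted top-level fields; all other fields are removed.
  # Assumption: the logstash-filter-prune plugin is installed.
  prune {
    # Entries are regular expressions matched against field names.
    # Add further names here as needed, e.g. "^text$".
    whitelist_names => ["^@timestamp$", "^lang$"]
  }
}
```

Note that prune only looks at top-level field names, so a nested value would first have to be copied to a top-level field (for example with a mutate filter) before it can be whitelisted. It may also be worth trying full_tweet => false on the twitter input, which makes the plugin emit a much smaller set of fields to begin with.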