Print Arabic characters in Logstash

Hello,

I created a Python script that gathers tweets using "Tweepy". It returns each tweet in a fixed dict format. My tweets may contain Arabic and Latin characters.

This is my Python script:

import tweepy
import json

credentials = {
    "consumer_key": "4qpWxxx",
    "consumer_secret": "maHAYxxx",
    "access_key": "9260767xxx",
    "access_secret": "FYlz7Rxxx"}

auth = tweepy.OAuthHandler(credentials["consumer_key"], credentials["consumer_secret"])
auth.set_access_token(credentials["access_key"], credentials["access_secret"])

############ Tweepy version 4.10.0 ############
from datetime import date
today = date.today()
since = today.strftime("%Y-%m-%d")

api = tweepy.API(auth, 
                 retry_count=10,
                 timeout=300)

maxTweets = 10  # assumed value; this name was not defined in the original script
search = tweepy.Cursor(api.search_tweets,
                       q=["الجزائر"],
                       count=maxTweets,
                       tweet_mode = "extended",
                       result_type="mixed",                     
                       since="2022-07-21",
                       include_entities=True
                       )
val=  []
for tweet in search.items(10):
    val.append(tweet)

Tweets =[]
for i in range(len(val)):    
    model={    
    "text": val[i].full_text,
    "id_str": val[i].id_str,
    "lang": val[i].lang    
    }    
    
    Tweets.append(model)    
    
resultat= {"results": Tweets}	 
print(resultat)

Then I gather the returned value using the exec input plugin of Logstash, like this:

input {
  exec {
    command => 'python python_twitter.py'
    interval => 1000
  }
}

output {
  stdout {
    codec => rubydebug
  }
}

I have two problems with this solution:
1- Tweepy automatically prints some messages (for example, "unexpected value since"). I want to suppress this output.

2- I get a charset error when the tweet is in Arabic. This error is returned by Logstash.

Any help, please.
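
For reference, here is a minimal sketch of one direction that might address both points. It rests on several assumptions that have not been verified against this exact setup: that the Tweepy messages go through Python's standard logging module under a logger named "tweepy", that the message about "since" appears because "since" is passed as a keyword argument rather than as a since: operator inside the query string, and that the charset error comes from the script's stdout not being UTF-8 when Logstash launches it (sys.stdout.reconfigure needs Python 3.7 or later). The helper names quiet_logger, build_query, to_json_line and emit are hypothetical and are not part of Tweepy or Logstash.

```python
import json
import logging
import sys


def quiet_logger(name="tweepy", level=logging.ERROR):
    """Raise the threshold of a named logger so its warnings are not printed.

    Assumption: the unwanted messages are logged under this logger name.
    """
    logger = logging.getLogger(name)
    logger.setLevel(level)
    return logger


def build_query(term, since):
    """Put the start date inside the query text instead of a keyword argument.

    Assumption: the search endpoint accepts a 'since:YYYY-MM-DD' operator.
    """
    return f"{term} since:{since}"


def to_json_line(tweets):
    """Serialize the tweets as one JSON line, keeping Arabic text unescaped."""
    return json.dumps({"results": tweets}, ensure_ascii=False)


def emit(tweets, stream=None):
    """Write the JSON line to the stream (stdout by default) as UTF-8."""
    stream = stream if stream is not None else sys.stdout
    # Text streams such as sys.stdout expose reconfigure() on Python 3.7+;
    # other stream types (for example in-memory buffers) are left untouched.
    if hasattr(stream, "reconfigure"):
        stream.reconfigure(encoding="utf-8")
    stream.write(to_json_line(tweets) + "\n")
    stream.flush()
```

In the script above, quiet_logger() would be called once before the search, build_query("الجزائر", "2022-07-21") would replace the separate since argument, and emit(Tweets) would replace print(resultat). Because the output is then valid JSON rather than a Python dict representation, the exec input could probably be given codec => json so that Logstash parses the fields directly.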
