I have tweets stored in a text field with fielddata set to true. I want to remove the English stopwords and get a count of each word in the tweets.
Data Mapping:
{
  "tweets" : {
    "mappings" : {
      "properties" : {
        "_class" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "id" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "text" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          },
          "fielddata" : true
        }
      }
    }
  }
}
Example of data:
{
  "_index" : "tweets",
  "_type" : "_doc",
  "_id" : "1503057742633869318",
  "_score" : 1.0,
  "_source" : {
    "_class" : "com.twitter.elastic.models.Tweet",
    "text" : """RT @Ceszie_: youve worked hard! the show was awesome and fantastic! you did great!
make sure you rest well okay?
#ThankyouBTS @BTS_…""",
    "id" : "1503057742633869318"
  }
},
{
  "_index" : "tweets",
  "_type" : "_doc",
  "_id" : "1503057796983451651",
  "_score" : 1.0,
  "_source" : {
    "_class" : "com.twitter.elastic.models.Tweet",
    "text" : """RT @btsyauRJ2: Summary for D1, D2 & D3 by me, i will treasure these three days forever <3 BTS BTS BTS 🥺
#ThankYouBTS @BTS_twt
#PTD_ON_ST…""",
    "id" : "1503057796983451651"
  }
}
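
For reference, here is a minimal, untested sketch of one direction this could take. It assumes a 7.x cluster (the `_type` of `_doc` above suggests one) and that reindexing into a new index is acceptable. The index name `tweets_stop`, the filter name `english_stop`, the analyzer name `tweet_no_stop` and the aggregation name `word_counts` are all placeholders made up for illustration. The idea is to analyze `text` with a custom analyzer that applies the built-in `stop` token filter with the `_english_` stopword list, then run a `terms` aggregation on that field.

```console
# Hypothetical index, filter and analyzer names; only "stop", "_english_",
# "standard" and "lowercase" are built-in Elasticsearch identifiers.
PUT tweets_stop
{
  "settings": {
    "analysis": {
      "filter": {
        "english_stop": {
          "type": "stop",
          "stopwords": "_english_"
        }
      },
      "analyzer": {
        "tweet_no_stop": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase", "english_stop"]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "text": {
        "type": "text",
        "analyzer": "tweet_no_stop",
        "fielddata": true
      }
    }
  }
}

# After reindexing the tweets into the new index, aggregate on the analyzed terms.
# "word_counts" is a hypothetical aggregation name; size 50 is an arbitrary choice.
GET tweets_stop/_search
{
  "size": 0,
  "aggs": {
    "word_counts": {
      "terms": {
        "field": "text",
        "size": 50
      }
    }
  }
}
```

One caveat with this sketch: the `doc_count` returned by a `terms` aggregation is the number of tweets containing the term, not the total number of occurrences, so "BTS BTS BTS" in the second example would count once for that tweet.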