Hey,
We're building an application that lets people save a person's Twitter handle
together with an associated tweet. The user selects the handle, and the
application then fetches the tweet through the Twitter API. This information
is afterwards stored in an Elasticsearch document. The twitter field is
initialized to {} (an empty JSON object; we're using a JavaScript ES wrapper).
The following is (part of) the mapping:

    user: { type: 'String', index: 'not_analyzed' },
    location: { type: 'String', analyzer: 'lowercase_keyword' },
    tags: { type: 'String', index: 'not_analyzed' },
    twitter: {
        properties: {
            handle: { type: 'String', index: 'not_analyzed' },
            tweet: { type: 'String', index: 'not_analyzed' },
            age: { type: 'Date' }
        },
        type: 'nested'
    }

The location and tags fields are already filled in by the time we look up the
user's handle and tweet. Once we have a handle and a tweet, we set all the
twitter fields (including the age, which is set to the current date), so the
field becomes something like:

    {
        handle: 'twitter_handle',
        tweet: 'my last tweet',
        age: '....'
    }

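
As a minimal sketch of that step: buildTwitterField is a hypothetical helper
name, not our actual code, and storing age as an ISO 8601 string is an
assumption (all that matters here is that age is set to the current date).

```javascript
// Hypothetical helper: builds the nested twitter object for a document.
// Assumption: age is stored as an ISO 8601 string.
function buildTwitterField(handle, tweet, now) {
  // Allow the caller to pass a fixed date; default to the current date.
  var date = now || new Date();
  return {
    handle: handle,
    tweet: tweet,
    age: date.toISOString()
  };
}

// Replaces the initial {} once a handle and a tweet have been found.
var twitter = buildTwitterField('twitter_handle', 'my last tweet');
console.log(twitter);
```
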
Once this twitter info has been added, the score of the document changes,
even for a query that does not search any of the twitter fields. The
following query is used to search for the documents we need:

    query: {
        bool: {
            must: [
                {
                    text: {
                        'tags': {
                            query: 'Technology',
                            operator: 'and'
                        }
                    }
                },
                {
                    text: {
                        'location': {
                            query: 'Madrid',
                            operator: 'and'
                        }
                    }
                },
                {
                    text: {
                        'user': {
                            query: queryArgs.email,
                            operator: 'and'
                        }
                    }
                }
            ]
        }
    }

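
For completeness, a sketch of how the whole search body could be assembled
with explanation turned on. buildSearchBody and textClause are hypothetical
names used only for illustration, and I'm assuming the wrapper passes the
body to Elasticsearch unchanged; explain: true is the request body flag that
makes Elasticsearch return the score explanation for each hit.

```javascript
// Hypothetical helper: wraps the three clauses of the query above in a
// search body with score explanation enabled.
function buildSearchBody(tag, location, email) {
  // Builds one { text: { <field>: { query, operator } } } clause.
  // Note: newer Elasticsearch versions call this query type "match".
  function textClause(field, value) {
    var clause = { text: {} };
    clause.text[field] = { query: value, operator: 'and' };
    return clause;
  }

  return {
    explain: true,
    query: {
      bool: {
        must: [
          textClause('tags', tag),
          textClause('location', location),
          textClause('user', email)
        ]
      }
    }
  };
}

var body = buildSearchBody('Technology', 'Madrid', 'someone@example.com');
console.log(JSON.stringify(body, null, 2));
```
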
I have turned on explanation for the query and can see that the score is
lower after adding the twitter information (to the same document that holds
the tags, location and user), but I don't understand why. Does this have to
do with the term frequency factor (tf)? If not, what exactly explains this
behaviour? Should I keep the Twitter-specific information in a separate type
in the index, linked by ID, so that it does not affect the score? More
generally, how can I avoid a change in score while still adding fields to
the document?
Thanks!