I want to build my own word cloud that shows phrases rather than only single words.
I use a terms aggregation in Elasticsearch, with a shingle token filter applied to the token stream, to get phrases in the result.
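For reference, here is a minimal sketch of the kind of setup I mean, written as Python dicts for the request bodies. The field name `text`, the filter and analyzer names, the shingle sizes, and the aggregation name are placeholders rather than my exact configuration, and the mapping layout assumes a recent Elasticsearch version with typeless mappings.

```python
# Hypothetical index body: a custom analyzer that emits shingles
# (word n-grams) of 2 to 3 words alongside the single words.
index_body = {
    "settings": {
        "analysis": {
            "filter": {
                "phrase_shingles": {  # placeholder filter name
                    "type": "shingle",
                    "min_shingle_size": 2,
                    "max_shingle_size": 3,
                    "output_unigrams": True,  # keep single words too
                }
            },
            "analyzer": {
                "shingle_analyzer": {  # placeholder analyzer name
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["lowercase", "phrase_shingles"],
                }
            },
        }
    },
    "mappings": {
        "properties": {
            "text": {  # placeholder field name
                "type": "text",
                "analyzer": "shingle_analyzer",
                # a terms aggregation on a text field needs fielddata
                "fielddata": True,
            }
        }
    },
}

# Terms aggregation that returns the most frequent tokens (words and shingles).
search_body = {
    "size": 0,
    "aggs": {"cloud_terms": {"terms": {"field": "text", "size": 50}}},
}
```

With this kind of setup the buckets mix single words and phrases, ordered only by document count.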
However, the result is not what I need.
I want phrases to score higher than single words. For example, if a text field contains "A or B and C", I would like "A or B" to rank above "A" and "B". How can I do that? Can I boost phrases over single words?
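To make the idea concrete, here is a rough sketch of the ranking I have in mind, done on the client side after the aggregation returns. The function name `rank_terms`, the `length_boost` parameter, and the sample counts are all made up for illustration; this is only one possible way to weight phrases, not something Elasticsearch does for me.

```python
def rank_terms(counts, length_boost=2.0):
    """Sort terms by count multiplied by (number of words ** length_boost).

    `counts` maps each term (single word or phrase) to its frequency.
    A larger `length_boost` pushes longer phrases further up the list.
    """
    def score(item):
        term, count = item
        n_words = len(term.split())
        return count * (n_words ** length_boost)

    return sorted(counts.items(), key=score, reverse=True)


# Made-up counts, as they might come back from the aggregation buckets.
counts = {"A": 3, "B": 3, "A or B": 2}
ranked = rank_terms(counts)
top_term = ranked[0][0]  # the phrase wins despite its lower raw count
```

Here "A or B" ends up above "A" and "B" even though it occurs less often, which is the behaviour I am after.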
My next issue is stop words.
If I remove stop words at index time, I don't think the result will suit me, because that creates phrases that never appeared in the text. For example, if I have "A or B" and remove stop words while indexing, I get "A B" in the word cloud, but "A B" may be a meaningless phrase.
So I don't remove stop words at index time. Instead, I want to remove them after the shingle filter has tokenized my text, dropping only the stop words that occur at the start or end of a phrase. Is this correct? Or does anyone have a better approach for building my own word cloud?
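A minimal sketch of the trimming step I am describing, applied to the aggregation buckets on the client side. The stop word list, the function names, and the sample buckets are assumptions for illustration only.

```python
from collections import Counter

# Illustrative stop word list; a real one would be much longer.
STOP_WORDS = {"or", "and", "the", "of"}


def trim_stop_words(phrase, stop_words=STOP_WORDS):
    """Remove stop words from the start and end of a phrase, keeping inner ones."""
    words = phrase.split()
    while words and words[0].lower() in stop_words:
        words.pop(0)
    while words and words[-1].lower() in stop_words:
        words.pop()
    return " ".join(words)


def clean_buckets(buckets):
    """Trim every phrase and merge the counts of phrases that become identical."""
    cleaned = Counter()
    for phrase, count in buckets.items():
        trimmed = trim_stop_words(phrase)
        if trimmed:  # drop phrases made only of stop words
            cleaned[trimmed] += count
    return cleaned


# Made-up buckets, as the shingle filter might produce for "A or B and C".
buckets = {"A or": 2, "or B": 2, "A or B": 2, "or": 4, "B and": 1}
cleaned = clean_buckets(buckets)
```

With these sample buckets, "A or B" keeps its inner stop word, "A or" collapses to "A", and the lone "or" disappears. One thing I am unsure about is that merging counts this way can count the same occurrence more than once, since "A or" and "A" may come from the same piece of text.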