I have a Node.js project in TypeScript that uses the library "@elastic/elasticsearch": "^8.7.0" to connect the server to Elasticsearch 8.7.1.
I'm trying to write a query that retrieves the frequency of words in the "message" field.
I tried several approaches, but the query only ever returns entire sentences.
I need it to return the individual words and how many times each one appeared in total.
This is my function that creates the index and field mapping:
private async createIndexIfNotExists(indexName: string): Promise<void> {
  const indexExists = await this.client.indices.exists({ index: indexName });
  if (indexExists) return;
  await this.client.indices.create({
    index: indexName,
    body: {
      mappings: {
        properties: {
          id: {
            type: 'keyword'
          },
          remoteJid: {
            type: 'keyword'
          },
          participant: {
            type: 'keyword'
          },
          fromMe: {
            type: 'boolean'
          },
          message: {
            type: 'text',
            fields: {
              raw: {
                type: 'keyword'
              }
            }
          },
        }
      }
    },
  });
};
These are my functions that build and insert the documents:
const handleMessage = async (
  msg: proto.IWebMessageInfo,
  wbot: Session
): Promise<void> => {
  try {
    if (msg.key.remoteJid && !msg.key.remoteJid.includes("@g.us")) return;
    const message = getBodyMessage(msg);
    const index = "groups";
    const document = {
      id: msg.key.id,
      remoteJid: msg.key.remoteJid,
      participant: msg.key.participant,
      fromMe: msg.key.fromMe,
      message: message,
    };
    const es = await esService.insertDocument(index, document);
  } catch (err: any) {
    logger.error(`CATCH HANDLEMESSAGE: ${err}`);
  }
};

public async insertDocument(index: string, document: any): Promise<any> {
  try {
    await this.createIndexIfNotExists(index);
    const response = await this.client.index({
      index,
      refresh: true,
      document
    });
    return response;
  } catch (error) {
    console.error(error);
    throw new Error('Error inserting document into Elasticsearch');
  }
};
This is my function that executes the query where the frequency of the words should appear:
public async cloudWords(): Promise<Record<string, any>> {
  const indexName = 'groups';
  const query = {
    size: 0,
    aggs: {
      palavras: {
        terms: {
          field: 'message.raw',
        }
      }
    }
  };
  const result = await this.client.search({
    index: indexName,
    body: query,
  });
  return result;
};
But it always returns whole sentences as the bucket keys:
"message": {
"took": 11,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 7,
"relation": "eq"
},
"max_score": null,
"hits": []
},
"aggregations": {
"palavras": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "Alguém conseguiu",
"doc_count": 1
},
{
"key": "Atualizar?",
"doc_count": 1
},
{
"key": "Com Quepasa por enquanto, só vai funcionar se estiver tudo com localhost, pra url do webhook ficar menor",
"doc_count": 1
},
{
"key": "Imporam limite de caracteres",
"doc_count": 1
},
{
"key": "Meu esposo vc está com uma ótima fisionomia tá bonito ❤️ só Deus sabe o tanto de saudades q estou de vc meu velho ❤️ que Deus e Jesus te abençoe hoje e sempre 🙏🏻 como eu queria vc aqui comigo más pra Deus nada é impossível força e fé sempre🙏🏻 amém amém",
"doc_count": 1
},
{
"key": "Sim",
"doc_count": 1
},
{
"key": "To na 17 ja",
"doc_count": 1
}
]
}
}
}
}
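For context on the output above: a `keyword` field such as `message.raw` is not analyzed, so each whole message is stored as a single term, and a `terms` aggregation on it can only return whole messages. One possible direction, sketched here under assumptions, is to aggregate on the analyzed `message` text field instead. Elasticsearch disables aggregations on `text` fields by default, so the sketch assumes `fielddata: true` is set in the mapping and that the extra heap usage is acceptable. The helper name `buildWordFrequencyQuery` and the `maxWords` parameter are hypothetical, introduced only for illustration.

```typescript
// Sketch of a mapping fragment for the "message" field.
// Assumption: enabling fielddata (which uses heap memory) is acceptable here.
const messageMapping = {
  message: {
    type: "text",
    fielddata: true, // allows aggregations on the analyzed tokens
    fields: {
      raw: { type: "keyword" }, // kept as in the original mapping
    },
  },
};

// Hypothetical helper that builds the aggregation body.
function buildWordFrequencyQuery(maxWords: number) {
  return {
    size: 0,
    aggs: {
      palavras: {
        terms: {
          field: "message", // the analyzed field, not "message.raw"
          size: maxWords, // number of buckets (words) to return
        },
      },
    },
  };
}

const wordQuery = buildWordFrequencyQuery(50);
console.log(JSON.stringify(wordQuery, null, 2));
```

Note that `doc_count` in a `terms` bucket is the number of documents containing the term, not the total number of occurrences, so a word repeated several times inside one message would still count once.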
My goal is to build a word cloud based on word frequency.
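To make "how many times they appeared in total" concrete, here is a minimal client-side sketch that counts total word occurrences. It assumes the message strings have already been fetched from the index, and it uses a deliberately naive tokenizer (lowercase, split on whitespace, strip common ASCII punctuation), so emoji and other symbols stay attached to words. The function name `countWords` is hypothetical.

```typescript
// Hypothetical helper: counts total occurrences of each word across messages.
function countWords(messages: string[]): Record<string, number> {
  // Object.create(null) avoids clashes with inherited keys such as "constructor".
  const counts: Record<string, number> = Object.create(null);
  for (const message of messages) {
    const tokens = message.toLowerCase().split(/\s+/);
    for (const token of tokens) {
      // Strip leading and trailing ASCII punctuation only.
      const word = token.replace(/^[.,!?;:"'()]+|[.,!?;:"'()]+$/g, "");
      if (word.length === 0) continue;
      counts[word] = (counts[word] || 0) + 1;
    }
  }
  return counts;
}

// Sample input loosely based on the messages shown in the output above.
const sampleMessages = ["Alguém conseguiu", "Atualizar?", "Sim", "To na 17 ja", "sim, atualizar"];
const wordCounts = countWords(sampleMessages);
console.log(wordCounts);
```

With this sample, "sim" and "atualizar" are each counted twice, which is the per-word total a word cloud would need.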
Can anyone help? Where am I going wrong?