Aggregations (SQL converted to ES)?


(Trevor Oakley) #1

I am new to ES and converting an old sql system which slowed down to ES. I have really struggled with aggregations. I am using the Java API. I would welcome any suggestions. I will outline the problem below.

Old SQL - update word counts, two tables (wordpair, count) and (partnumber, wordpair). The system uses a group by on the partnumber, wordpair table to count all the word pairs and this is then updated on the wordpair table to show frequency (used in word density analysis). Hence we have some joins, group by, counts. This is fairly simple in SQL.

In ES - I have tried various forms of Terms, count, and others but nothing really is working and I am unsure how to proceed.

My latest attempt is below -
SearchResponse sr = client.prepareSearch("partwordpairindexfullarray2").setSearchType("count")
.setQuery(QueryBuilders.matchAllQuery())
.addAggregation(
AggregationBuilders
.count("agg")
.field("wordpairfullarray"))
.execute().actionGet();
Terms agg1 = sr.getAggregations().get("agg");
System.out.println(sr);
eg wordpair = white+eppr2005wc**
"_index" : "wordpairindex",
"_type" : "wordpair",
"_id" : "white eppr2005wc",
"_score" : 1.0,
"_source":{"wordpair":"white eppr2005wc","wordpaircount":1,"hexwordpair":"77686974652065707072323030357763"}
}
eg wordpairlist for partnumber S.TRAY051 is 1200mm+x etc
{
"_index" : "partwordpairindexfullarray2",
"_type" : "partwordpairfullarray2",
"_id" : "AVMnbwyRJHndLaDHa1Lv",
"_score" : 1.0,
"_source":{"partnumber":"S.TRAY051","wordpairfullarray":"[{"wordpairfull":"1200mm;x"},{"wordpairfull":"x;900mm"},{"wordpairfull":"900mm;rh"},{"wordpairfull":"rh;mere"},{"wordpairfull":"mere;svelte"},{"wordpairfull":"svelte;low"},{"wordpairfull":"low;level"},{"wordpairfull":"level;offset"},{"wordpairfull":"offset;quadrant"},{"wordpairfull":"quadrant;shower"},{"wordpairfull":"shower;tray"},{"wordpairfull":"tray;s"},{"wordpairfull":"s;tray051"},{"wordpairfull":"tray051;1200mm"}]","hexmanpartnumber":"732e74726179303531","hexshopname":"313230306d6d","wordpairs":":1200mm;x:x;900mm:900mm;rh:rh;mere:mere;svelte:svelte;low:low;level:level;offset:offset;quadrant:quadrant;shower:shower;tray:tray;s:s;tray051:tray051;1200mm:1200mm;x:x;900mm:900mm;rh:rh;mere:mere;svelte:svelte;low:low;level:level;offset:offset;quadrant:quadrant;shower:shower;tray:tray;s:s;tray051:tray051;1200mm:","hexpartnumber":"732e74726179303531","shopname":"1200mm","hexwordpairs":"3a313230306d6d3b783a783b3930306d6d3a3930306d6d3b72683a72683b6d6572653a6d6572653b7376656c74653a7376656c74653b6c6f773a6c6f773b6c6576656c3a6c6576656c3b6f66667365743a6f66667365743b7175616472616e743a7175616472616e743b73686f7765723a73686f7765723b747261793a747261793b733a733b747261793035313a747261793035313b313230306d6d3a313230306d6d3b783a783b3930306d6d3a3930306d6d3b72683a72683b6d6572653a6d6572653b7376656c74653a7376656c74653b6c6f773a6c6f773b6c6576656c3a6c6576656c3b6f66667365743a6f66667365743b7175616472616e743a7175616472616e743b73686f7765723a73686f7765723b747261793a747261793b733a733b747261793035313a747261793035313b313230306d6d3a","manpartnumber":"S.TRAY051"}
}


(Tim Garrett) #2

Hi Trevor,

I know this is an old post, but the topic intrigued me and I saw you hadn't had any replies. Did you come up with a good solution to this? I was curious whether your concept of word pairs lent itself to a usage of a more built-in concept in Elasticsearch like bigram shingles (https://www.elastic.co/guide/en/elasticsearch/guide/current/shingles.html).

Thanks,
Tim


(system) #3