Hi,
I am trying to find the distinct values from one of my index. My requirement is, need to find distinct count of more than one filed and I am using group by also. I am getting the distinct value of single field. But I have to find distinct count of many fields.
This is the part of mysql query for easy understanding.
COUNT(DISTINCT CASE WHEN sm.zygosity = 'Heterozygous' THEN sm.variant_id END) het,
COUNT(DISTINCT CASE WHEN sm.zygosity = 'Homozygous' THEN sm.variant_id END) hom,
CASE WHEN sm.recessive = 'Y' THEN COUNT(DISTINCT sm.variant_id) ELSE 0 END rec,
CASE WHEN sm.dominant = 'Y' THEN COUNT(DISTINCT sm.variant_id) ELSE 0 END dom,
COUNT(DISTINCT CASE WHEN sm.varclass IN ('INTRONIC-SS-ACR-PRX', 'INTRONIC-SS-DNR-PRX', 'INTRONIC-SS-DNR', 'INTRONIC-SS-ACR') THEN sm.variant_id ELSE NULL END) as spl,
COUNT(DISTINCT CASE WHEN sm.varclass IN ('MISSENSE', 'MISSENSE-SS-PRX') THEN sm.variant_id ELSE NULL END) as missence,
COUNT(DISTINCT CASE WHEN sm.varclass IN ('NONSENSE', 'NONSENSE-SS-PRX') THEN sm.variant_id ELSE NULL END) as nonsense,
COUNT(DISTINCT CASE WHEN sm.varclass IN ('FRAMESHIFT-DEL', 'FRAMESHIFT-DEL-SS-PRX', 'FRAMESHIFT-INS', 'FRAMESHIFT-INS-SS-PRX', 'INFRAME-DEL', 'INFRAME-INS') THEN sm.variant_id ELSE NULL END) as frameshift,
COUNT(DISTINCT CASE WHEN disease_annotations.disease IS NOT NULL THEN sm.variant_id ELSE NULL END) disease_associated,
This is the ES query which I am using now. This is working for single field.
"gene_group" => Array
(
"terms" => Array
(
"field" => 'gene_name',
),
"aggs" => [
"variant_count" => Array
(
"cardinality" => Array
(
"field" => 'variant_id',
"precision_threshold" => 40000,
)
),
"distinct" => Array
(
"terms" => Array
(
"field" => 'zygosity'
)
),
],
),
Please help me for finding these distinct counts.