Distinct count of many fields in a single ES query


(Shemil R) #1

Hi,

I am trying to find the distinct values from one of my index. My requirement is, need to find distinct count of more than one filed and I am using group by also. I am getting the distinct value of single field. But I have to find distinct count of many fields.

This is the part of mysql query for easy understanding.

COUNT(DISTINCT CASE WHEN sm.zygosity = 'Heterozygous' THEN sm.variant_id END) het,
			COUNT(DISTINCT CASE WHEN sm.zygosity = 'Homozygous' THEN sm.variant_id END) hom,
			CASE WHEN sm.recessive = 'Y' THEN COUNT(DISTINCT sm.variant_id) ELSE 0 END rec,
			CASE WHEN sm.dominant = 'Y' THEN COUNT(DISTINCT sm.variant_id) ELSE 0 END dom,
			COUNT(DISTINCT CASE WHEN sm.varclass IN ('INTRONIC-SS-ACR-PRX', 'INTRONIC-SS-DNR-PRX', 'INTRONIC-SS-DNR', 'INTRONIC-SS-ACR') THEN sm.variant_id ELSE NULL END) as spl,
			COUNT(DISTINCT CASE WHEN sm.varclass IN ('MISSENSE', 'MISSENSE-SS-PRX') THEN sm.variant_id ELSE NULL END) as missence,
			COUNT(DISTINCT CASE WHEN sm.varclass IN ('NONSENSE', 'NONSENSE-SS-PRX') THEN sm.variant_id ELSE NULL END) as nonsense,
			COUNT(DISTINCT CASE WHEN sm.varclass IN ('FRAMESHIFT-DEL', 'FRAMESHIFT-DEL-SS-PRX', 'FRAMESHIFT-INS', 'FRAMESHIFT-INS-SS-PRX', 'INFRAME-DEL', 'INFRAME-INS') THEN sm.variant_id ELSE NULL END) as frameshift,
			COUNT(DISTINCT CASE WHEN disease_annotations.disease IS NOT NULL THEN sm.variant_id ELSE NULL END) disease_associated,

This is the ES query which I am using now. This is working for single field.

"gene_group" => Array
                    (
                        "terms" => Array
                            (
                                "field" => 'gene_name',
                            ),
                   
                       "aggs" => [
                       		"variant_count" => Array
		                        (
		                            "cardinality" => Array
		                                (
		                                    "field" => 'variant_id',
		                                    "precision_threshold" => 40000,
		                                )

		                        ),
		                        "distinct" => Array
		                        (
		                            "terms" => Array
	                                (
	                                    "field" => 'zygosity'
	                                )
		                        ),
		                        

                       	],
                   	),

Please help me for finding these distinct counts.


(system) #2

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.