Converting queries returning certain distinct records to ES


(heather) #1

Hello

I am currently trying to migrate an sql application to Elasticsearch.

I need to be able to select a collection of results from an index
which, for given search conditions, have distinct pairings of two certain
columns. In sql I do the following two queries:

Query 1:
SELECT column_A, column_B, GROUP_CONCAT (table_name..id) id FROM
table_name WHERE column_? = '' GROUP BY column_A, column_B,column_
?

Query 2:
SELECT table_name.* FROM table_name WHERE column_? = '' AND
(table_name.id IN (<ids_from_previous_query>))

The first query returns me a list of ids from table_name such that each id
satisfies the condition column_? = '' and the record with that
id has a distinct [column_A,column_B]

The second query then returns me all the records satisfying column_? =
'' but only from that range of ids (I realise I probably do not
need to do column_? = ' again in the second query.)

The result is that each record returned by the second query has satisfies
the condition column_? = '' and I am only returned one
record for each [column_A,column_B] paring.

Since there is not really a 'distinct' option yet I am having trouble
finding a way replicate this output with ES and wondered if anyone might
have any thoughts as how I might go about it?

At the moment I am open to any mapping / query combinations that will
achieve what I need.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/6a857778-0399-4b3c-9973-a3e353436311%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(David Pilato) #2

May be you could find a way to do that with a single query if you design your documents in another way?
Or using facets for the first query and Ids filter for the second?
It's hard to tell without a concrete example of JSON documents.

--
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr

Le 9 janvier 2014 at 01:28:06, heather@hodgetastic.com (heather@hodgetastic.com) a écrit:

Hello

I am currently trying to migrate an sql application to Elasticsearch.

I need to be able to select a collection of results from an index which, for given search conditions, have distinct pairings of two certain columns. In sql I do the following two queries:

Query 1:

SELECT column_A, column_B, GROUP_CONCAT (table_name..id) id FROM table_name WHERE column_? = '' GROUP BY column_A, column_B, column_?
Query 2:

SELECT table_name.* FROM table_name WHERE column_? = '' AND (table_name.id IN (<ids_from_previous_query>))
The first query returns me a list of ids from table_name such that each id satisfies the condition column_? = '' and the record with that id has a distinct [column_A,column_B]

The second query then returns me all the records satisfying column_? = '' but only from that range of ids (I realise I probably do not need to do column_? = ' again in the second query.)

The result is that each record returned by the second query has satisfies the condition column_? = '' and I am only returned one record for each [column_A,column_B] paring.

Since there is not really a 'distinct' option yet I am having trouble finding a way replicate this output with ES and wondered if anyone might have any thoughts as how I might go about it?

At the moment I am open to any mapping / query combinations that will achieve what I need.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/6a857778-0399-4b3c-9973-a3e353436311%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/etPan.52ce5655.354fe9f9.1449b%40MacBook-Air-de-David.local.
For more options, visit https://groups.google.com/groups/opt_out.


(heather) #3

Okay, thank you for your response, here is an attempt of an example of what
I am trying to achieve.

Lets say I have the documents;

{

id: 1

name: peter

class: 2

grade: b

hair:grey

}

{

id:2

name: paul

class:2

grade:b

hair:purple

}

{

id:3

name:john

class:1

grade:b

hair:grey

}

{

id:4

name:sandra

class:1

grade:a

hair:green

}

{

id:5

name:sarah

class:1

grade:a

hair:green

}

Initially I want to get only one student from each possible [class, grade]
combinaion so I want ES to return peter, john and sandra but not paul or
sarah . The grades will range from the letters [a,b,c,d,e] but the class
could be anything.

Additionally I might want to add a condition to this, such as only getting
students with green hair. In that case I would only want to return sandra
as while sarah has green hair - they have the same [class,grade] as sandra.

I thought about using facets for the first query but I cannot see how that
would give me a collection of the right ids to make the second query with.

On Thursday, January 9, 2014 7:57:09 AM UTC, David Pilato wrote:

May be you could find a way to do that with a single query if you design
your documents in another way?
Or using facets for the first query and Ids filter for the second?
It's hard to tell without a concrete example of JSON documents.

--
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet https://twitter.com/dadoonet | @elasticsearchfrhttps://twitter.com/elasticsearchfr

Le 9 janvier 2014 at 01:28:06, hea...@hodgetastic.com <javascript:> (
hea...@hodgetastic.com <javascript:>) a écrit:

Hello

I am currently trying to migrate an sql application to Elasticsearch.

I need to be able to select a collection of results from an index
which, for given search conditions, have distinct pairings of two certain
columns. In sql I do the following two queries:

Query 1:
SELECT column_A, column_B, GROUP_CONCAT (table_name..id) id FROM
table_name WHERE column_? = '' GROUP BY column_A, column_B,
column_?

Query 2:
SELECT table_name.* FROM table_name WHERE column_? =
'' AND (table_name.id IN (<ids_from_previous_query>))

The first query returns me a list of ids from table_name such that each id
satisfies the condition column_? = '' and the record with that
id has a distinct [column_A,column_B]

The second query then returns me all the records satisfying column_? =
'' but only from that range of ids (I realise I probably do not
need to do column_? = ' again in the second query.)

The result is that each record returned by the second query has satisfies
the condition column_? = '' and I am only returned one
record for each [column_A,column_B] paring.

Since there is not really a 'distinct' option yet I am having trouble
finding a way replicate this output with ES and wondered if anyone might
have any thoughts as how I might go about it?

At the moment I am open to any mapping / query combinations that will
achieve what I need.

You received this message because you are subscribed to the Google Groups
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to elasticsearc...@googlegroups.com <javascript:>.
To view this discussion on the web visit
https://groups.google.com/d/msgid/elasticsearch/6a857778-0399-4b3c-9973-a3e353436311%40googlegroups.com
.
For more options, visit https://groups.google.com/groups/opt_out.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/49dc9bd6-cccc-4398-aabf-7133852907e5%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(system) #4