Filter facet values on another field

Hi,

We store values together with related flags in arrays. The following example
records whether a book is available in a specific format (we need this
structure for the non-Elasticsearch part of our application):

{
  "formats" : [
    { "name" : "paperback" , "available" : true },
    { "name" : "hardcover" , "available" : false }
  ]
}

We want to create facets for "available formats" (which would contain
"paperback") and "unavailable formats" (which would contain "hardcover"),
i.e. whether a "name" is included as a facet value depends on the field
"available".

What's the best way to achieve this with Elasticsearch? Can we create a
mapping that tells the indexer to add a "name" value to a field
"available_formats" only if the related "available" value is true? Or do we
have to do this separation before the data is sent to Elasticsearch?

Many thanks!

Ron

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Can you illustrate what kind of output you are expecting?
I wonder whether you could use four filter facets here, one for each combination of name/availability.

But you would need to use nested documents first.
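
For readers following along, here is a minimal sketch of what that suggestion could look like, written as Python dictionaries that mirror the JSON request bodies. It assumes the legacy facet API that was current at the time of this thread; the type name "book", the facet labels, and the helper build_filter_facets are hypothetical and not taken from the thread.

```python
import json
from itertools import product

# Hypothetical mapping: "book" is an assumed type name. Declaring "formats"
# as nested keeps each name/available pair together at query time.
BOOK_MAPPING = {
    "book": {
        "properties": {
            "formats": {
                "type": "nested",
                "properties": {
                    "name": {"type": "string", "index": "not_analyzed"},
                    "available": {"type": "boolean"},
                },
            }
        }
    }
}


def build_filter_facets(format_names):
    """Return one filter facet per (name, available) combination."""
    facets = {}
    for name, available in product(format_names, (True, False)):
        label = "{}_{}".format(name, "available" if available else "unavailable")
        facets[label] = {
            "filter": {
                "nested": {
                    "path": "formats",
                    "filter": {
                        "and": [
                            {"term": {"formats.name": name}},
                            {"term": {"formats.available": available}},
                        ]
                    },
                }
            }
        }
    return facets


# Search body with four filter facets for the two example formats.
search_body = {
    "query": {"match_all": {}},
    "facets": build_filter_facets(["paperback", "hardcover"]),
}
print(json.dumps(search_body, indent=2))
```

Note that this approach needs the list of format names up front, since each combination becomes its own facet.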

--
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr | @scrutmydocs

Hi David,

Many thanks for your response!

The output should look like two normal facets:
"facets" : {
  "available_formats" : {
    ...
    "terms" : [ {
      "term" : "paperback",
      "count" : 10
    },
    {
      "term" : "hardcover",
      "count" : 7
    },
    {
      "term" : "audio",
      "count" : 3
    } ]
  },
  "unavailable_formats" : {
    ...
    "terms" : [ {
      "term" : "paperback",
      "count" : 4
    },
    {
      "term" : "hardcover",
      "count" : 8
    },
    {
      "term" : "audio",
      "count" : 1
    } ]
  }
}

The paperback/hardcover field is just an example. We have other cases
where there is a single flag (like 'available') but hundreds of values
(like 'paperback' or 'hardcover').

Ron

Why not index your documents in another way?
{
  "formats" : {
    "available" : [ "paperback" ],
    "unavailable" : [ "hardcover" ]
  }
}

Does it help?
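
A minimal sketch of the pre-processing this implies, assuming it runs in application code before the document is sent to Elasticsearch. The function name split_formats is hypothetical; the input and output shapes are the ones shown in this thread.

```python
def split_formats(doc):
    """Return a copy of doc with "formats" regrouped by availability."""
    grouped = {"available": [], "unavailable": []}
    for entry in doc.get("formats", []):
        # A missing or false "available" flag counts as unavailable here
        # (an assumption; adjust if missing flags need other handling).
        key = "available" if entry.get("available") else "unavailable"
        grouped[key].append(entry["name"])
    result = dict(doc)  # shallow copy, so the original document is untouched
    result["formats"] = grouped
    return result


original = {
    "formats": [
        {"name": "paperback", "available": True},
        {"name": "hardcover", "available": False},
    ]
}
print(split_formats(original))
```

The original document is left unchanged, so the application can keep using the array form while the regrouped copy goes to the index.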

--
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr | @scrutmydocs

Pre-processing the input data is indeed what we ended up with. It's just
not ideal, because it requires us to store the data in both formats (the
original one, which we need for the rest of our application, and the one for
Elasticsearch) in MongoDB, from where it's pushed via the MongoDB river.

It would have been nice if we could have defined this filtering at the
Elasticsearch indexing level. Anyway, we have now implemented the
pre-processing logic and push the data into Elasticsearch in the proposed
format.
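
With documents in that format, the two facets from earlier in the thread can be requested as plain terms facets. This is a sketch only, written as a Python dictionary mirroring the JSON search body and assuming the legacy facet API in use at the time of this thread; the helper name build_terms_facets and the default size are hypothetical, and the fields are assumed to be mapped as not_analyzed strings so that values are counted whole.

```python
import json


def build_terms_facets(size=10):
    """Build a search body with one terms facet per availability field."""
    return {
        "query": {"match_all": {}},
        "size": 0,  # only facet counts are needed, no hits
        "facets": {
            "available_formats": {
                "terms": {"field": "formats.available", "size": size}
            },
            "unavailable_formats": {
                "terms": {"field": "formats.unavailable", "size": size}
            },
        },
    }


print(json.dumps(build_terms_facets(), indent=2))
```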

Ron

Maybe you can use a ScriptFilter in the MongoDB river to handle it: Home · richardwilly98/elasticsearch-river-mongodb Wiki · GitHub

David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr | @scrutmydocs
