JSON structure & querying object hierarchies

Hi all,

Apologies if this is a silly question, but I'm still learning the
commonalities of JSON. I'm able to make a JSON structure which is
being indexed, but not queryable using object-dot-notation. eg. a
document with:

"id" : ["12345",
{
"special" : "09876"
}

If I do an open query for either value (eg. ?q=12345 or ?q=09876), I
get the document. But I cannot query for "id.special" or simply
"id" (eg. ?q=id:12345 or ?q=id.special:09876); neither matches.

What I'm looking for is to be able to query a term in a "hierarchy"
and have it match values within/below, whether in objects, arrays, etc
(eg. ?q=id:09876 would match). Am I searching incorrectly or is there
a proper way to express this in JSON that ES could handle? Is
something like this even possible?

Thanks in advance,
MJ

Can you post the JSON you index? The one you specified is not proper
JSON...

Apologies -- this got truncated. The document is actually quite
large, but the relevant portion (which does seem to get parsed fine by
ES and is valid JSON) is:

{
"id" : ["81881",
{
"special" : "051110"
}]
}

Thanks,
MJ

Instead of
{
"id" : ["81881",
{
"special" : "051110"
}]
}

Represent your JSON as
{
"id" : ["81881", "051110"]
}

Then if you query using q=id:81881 or q=id:051110, it will work.

Ideally you would want to keep your array elements homogeneous.
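
A minimal sketch of what "homogeneous" means here, in Python. The helper names (json_kind, mixed_arrays) are made up for illustration, and treating every scalar (string, number, boolean, null) as one kind is an assumption, not something ES documents. It only reports which arrays mix scalars, objects and arrays, so a document can be checked before indexing:

```python
import json


def json_kind(value):
    """Classify a decoded JSON value as 'object', 'array', or 'scalar'."""
    if isinstance(value, dict):
        return "object"
    if isinstance(value, list):
        return "array"
    return "scalar"


def mixed_arrays(value, path=""):
    """Return the dotted paths of arrays whose elements mix JSON kinds."""
    found = []
    if isinstance(value, dict):
        for key, child in value.items():
            child_path = f"{path}.{key}" if path else key
            found.extend(mixed_arrays(child, child_path))
    elif isinstance(value, list):
        # More than one kind among the elements means the array is mixed.
        if len({json_kind(item) for item in value}) > 1:
            found.append(path)
        for item in value:
            found.extend(mixed_arrays(item, path))
    return found


original = json.loads('{"id": ["81881", {"special": "051110"}]}')
suggested = json.loads('{"id": ["81881", "051110"]}')

print(mixed_arrays(original))   # ['id']
print(mixed_arrays(suggested))  # []
```

The first document is flagged because its "id" array holds both a string and an object; the second is not.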

Yes, this is probably the problem here...

-shay.banon

Thanks diptamay, I was expecting the response to be that Elastic Search
isn't intended to be used for searching in this way. Let me clarify
my use case a bit to explain my approach. A clearer example is
this:

{
"title" : ["Robin Hood",
{
"subtitle" : "Men in tights"
}]

}

Ideally, I'd like to be able to do a search on ?q=title:tights and have
this match, since the user may not distinguish the subtitle (they may
simply think it is "part of the title"). Semantically, subtitle is
a sub-object of title, and ?q=title.subtitle:tights should also
match, if I were searching at that level of specificity.

I shouldn't have to separate them into homogeneous units, since
?q=title:tights still wouldn't match, eg:

{
"title" : "Robin Hood",
"subtitle" : "Men in tights"
}

And I lose the semantic label of "subtitle" if I do:
{
"title" : ["Robin Hood", "Men in tights"]
}

I've been following Elastic Search for a little while now, and for my
problem domain (library science), having an index which is data-model
neutral but still provides the advanced structure, faceting, etc.
features is an ideal solution that we have been waiting for a very
long time. Unfortunately, if I have to limit my JSON structure to
ordered arrays and key-value objects in order to be able to search
them, then it's not a viable option.

Shay (and thanks in advance for taking the time to look at my
question, I really appreciate it a lot), is there an architectural
reason why ES can't handle this kind of JSON construct in the way I
describe above? I can imagine from a performance perspective that
indexing a hierarchy would be potentially expensive, but I'd be
willing to make that trade-off for this kind of functionality.

MJ

On Tue, 2010-06-15 at 09:52 -0700, MJ Suhonos wrote:

Thanks diptamay, I was expecting the response to be that Elastic Search
isn't intended to be used for searching in this way. Let me clarify
my use case a bit to explain my approach. A clearer example is
this:

{
"title" : ["Robin Hood",
{
"subtitle" : "Men in tights"
}]

}

Why not just:
{
"title": "Robin Hood",
"subtitle": "Men in tights"
}

Then search for ?q=title:tights%20OR%20subtitle:tights

I don't think you lose anything semantically by expressing it as above.

clint
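
A small sketch of building that kind of query string in the application layer, in Python. The function name field_or_query is hypothetical, and it assumes the Lucene query-string convention that boolean operators are written in upper case (OR). It only builds the ?q=... part and does not talk to a server:

```python
from urllib.parse import urlencode, parse_qs


def field_or_query(term, fields):
    """Build a '?q=...' string matching `term` in any of the given fields."""
    clause = " OR ".join(f"{field}:{term}" for field in fields)
    # urlencode percent-encodes ':' and turns spaces into '+'.
    return "?" + urlencode({"q": clause})


query = field_or_query("tights", ["title", "subtitle"])
print(query)  # ?q=title%3Atights+OR+subtitle%3Atights

# Decoding it again shows the query the server would see.
print(parse_qs(query[1:])["q"][0])  # title:tights OR subtitle:tights
```

This keeps "title" and "subtitle" as separate fields in the document, while the user still gets a single search box that covers both.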

The way elasticsearch works is that array elements should all keep the same
"type" of JSON: either string/int/..., or object, or another array, but not
a mix. The limitation comes from how that JSON is mapped to a Lucene
document (in the end, it gets deconstructed into a Lucene Document). I am
not sure it is much of a constraint, and I would say it makes more sense when it
comes to modeling your data (it makes more sense in statically typed languages, but
not only there).

-shay.banon
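
To illustrate the "deconstructed into a Lucene Document" point, here is a rough Python sketch that flattens a JSON document into (field name, value) pairs. This is only a mental model and not the actual elasticsearch mapping code; the function name flatten and the dotted naming of nested fields are assumptions made for the example:

```python
def flatten(value, path=""):
    """Yield (dotted_field_name, scalar_value) pairs from decoded JSON."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from flatten(child, f"{path}.{key}" if path else key)
    elif isinstance(value, list):
        # Array elements share the field name of the array itself.
        for item in value:
            yield from flatten(item, path)
    else:
        yield path, value


flat_doc = {"title": "Robin Hood", "subtitle": "Men in tights"}
mixed_doc = {"title": ["Robin Hood", {"subtitle": "Men in tights"}]}

print(list(flatten(flat_doc)))
# [('title', 'Robin Hood'), ('subtitle', 'Men in tights')]
print(list(flatten(mixed_doc)))
# [('title', 'Robin Hood'), ('title.subtitle', 'Men in tights')]
```

In the mixed case the name "title" has to act both as a plain string field and as the parent object of "title.subtitle". One plausible reading of the limitation is that a per-field type mapping cannot give a single field both roles.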

Clint, thanks -- you're absolutely right, the better way is to
construct my search queries more intelligently. It'll take some more
mapping in the application layer, but I'll give that a go.

Shay -- thanks for the explanation, and understood; I certainly agree
with the "purity" of requiring homogeneity among JSON elements, and of
course, if it forces me to think harder about my data model, so much
the better. (This is one of the reasons I'm so keen on Elastic Search:
it frees you to focus on your data model design instead of worrying
about limitations.)

Thanks all for your comments, and wonderful work once again on ES!
I'm very excited to play with faceting in 0.9. :)

MJ

Or, depending on your requirement / goal, get a more Google-style
search experience by using the "all" field (what is it called in ES?)

-N

It's still called "all"; the name of the field is "_all" (as is the
convention for internal fields). It is automatically used when no "prefix
field" is specified in the query.

-shay.banon
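
As a conceptual sketch only (this is not how elasticsearch builds "_all" internally, and the helper name all_text is made up), a catch-all field can be pictured as every value in the document joined into one searchable string. That is why a query with no field prefix, such as ?q=tights, can match text from any field:

```python
def all_text(value):
    """Join every scalar in a decoded JSON value into one searchable string."""
    if isinstance(value, dict):
        return " ".join(all_text(child) for child in value.values())
    if isinstance(value, list):
        return " ".join(all_text(item) for item in value)
    return str(value)


doc = {"title": "Robin Hood", "subtitle": "Men in tights"}
catch_all = all_text(doc)

print(catch_all)                              # Robin Hood Men in tights
print("tights" in catch_all.lower().split())  # True
```

The trade-off is that a match in the catch-all text no longer says which field it came from, so a field-specific query such as ?q=subtitle:tights is still needed for that level of precision.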
