Create Object Type with Arbitrary Fields in Dynamic:false type


(davrob) #1

Hi,

I'd like one of my fields to behave like a map / associative
array, or like a JavaScript object that has arbitrary fields.
Currently, I've turned off dynamic fields altogether by specifying
this in my mapping file:

{
  "contact" : {
    "dynamic" : false,
    "properties" : {
      ...
    }
  }
}

This is an example of the object:

{
  "id": 110004456,
  "person_name": "Smith, Josh",
  "activityDateMap": {
    "2011-09-14": {
      "date": "2011-09-14T16:58:16.937+0000",
      "numOfActivities": 3,
      "partyIdActivityTypeMap": {
        "1007": "T",
        "1005": "E",
        "1000": "E"
      }
    },
    "2011-09-21": {
      "date": "2011-09-21T16:58:16.937+0000",
      "numOfActivities": 3,
      "partyIdActivityTypeMap": {
        "1001": "E",
        "1002": "T",
        "1000": "E"
      }
    }
  }
}

The field "activityDateMap" is the one I would like to create. During
indexing, each contact.activityDateMap would have an arbitrary number
of date-based string fields, each mapping to an object that in turn has
an arbitrary map of id to type string, the "partyIdActivityTypeMap"
field.

Questions:

  1. Can I turn on dynamic mapping for this one object but not the
    whole type? I like to know exactly what I'm allowing into my index.
  2. Is having an arbitrary number of fields associated with an object
    in this way good practice / performant?

Best Regards,

David


(Shay Banon) #2

On Wed, Sep 21, 2011 at 8:27 PM, davrob2 daviroberts@gmail.com wrote:

  1. Can I turn on dynamic mapping for this one object but not the
    whole type? I like to know exactly what I'm allowing into my index.

Yes, the dynamic setting can be set on an object-level mapping. If it is not
set, it defaults to the dynamic setting of the root object mapping.
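
A minimal sketch of what that answer could look like in a mapping, written as a Python dict that is dumped to JSON. The type name "contact" and the field "activityDateMap" come from the thread; the field types for "id" and "person_name" are assumptions added only to make the example complete.

```python
import json


def contact_mapping():
    """Build a mapping where only activityDateMap accepts unmapped fields."""
    return {
        "contact": {
            # Root level: dynamic mapping stays off, as in the original post.
            "dynamic": False,
            "properties": {
                # Assumed field types, not taken from the thread.
                "id": {"type": "long"},
                "person_name": {"type": "string"},
                # Object-level override: new keys under this object
                # are allowed to be mapped dynamically.
                "activityDateMap": {
                    "type": "object",
                    "dynamic": True,
                },
            },
        }
    }


print(json.dumps(contact_mapping(), indent=2))
```

The override applies only to the one object, so the rest of the type still accepts nothing beyond the fields listed under "properties".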

  2. Is having an arbitrary number of fields associated with an object
    in this way good practice / performant?

It will create potentially many internal fields that end up being indexed.
Why do you want to do it? Maybe adding a date field to the object, and using
nested mapping will help?
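
One way to read this suggestion is to stop using dates and ids as field names and store them as values instead, so the number of distinct fields stays fixed. The sketch below only reshapes the document before indexing; the names "day", "parties", "partyId" and "activityType" are hypothetical and do not appear in the thread.

```python
def to_nested_activities(activity_date_map):
    """Turn {"2011-09-14": {...}, ...} into a list of entries with fixed field names."""
    entries = []
    for day in sorted(activity_date_map):
        entry = activity_date_map[day]
        entries.append({
            "day": day,  # former map key, now a plain value
            "date": entry["date"],
            "numOfActivities": entry["numOfActivities"],
            # Former id -> type map, now a list of small objects.
            "parties": [
                {"partyId": party_id, "activityType": kind}
                for party_id, kind in sorted(entry["partyIdActivityTypeMap"].items())
            ],
        })
    return entries


sample_map = {
    "2011-09-14": {
        "date": "2011-09-14T16:58:16.937+0000",
        "numOfActivities": 3,
        "partyIdActivityTypeMap": {"1007": "T", "1005": "E", "1000": "E"},
    },
}
nested_entries = to_nested_activities(sample_map)
```

Each element of the resulting list is a candidate for one nested object, and the mapping then needs only a handful of field names regardless of how many dates or users exist.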



(davrob) #3

What I want to achieve is:

i) For a particular week, chosen by the user, show 5 columns, one
for each day, containing the numOfActivities.
ii) If the user id, taken from the session, equals one of the user ids
in partyIdActivityTypeMap, append the type (E or T).

So for each contact, for a particular week I would have

| 18th | 19th | 20th | 21st | 22nd |
| 1    | 2    | E3   | 2    | T3   |

My idea was to create 5 scripted fields that would query:

i) For the presence of the date, using the date string.
ii) For the presence of an id, using the user's id from the session.

So potentially we have 3 years of dates giving 1000 fields, plus 5,000
users giving another 5,000 fields.
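
To make the intended output concrete, here is a sketch of the per-contact lookup in plain Python. It is not an Elasticsearch script, only the logic a scripted field would have to reproduce. The function name, the choice of "0" for days without an entry, and placing the type letter before the count (as in the table above) are assumptions.

```python
from datetime import date, timedelta


def week_row(activity_date_map, monday, user_id):
    """Return 5 cells (Mon-Fri) for one contact and one logged-in user."""
    cells = []
    for offset in range(5):
        day = (monday + timedelta(days=offset)).isoformat()
        entry = activity_date_map.get(day)
        if entry is None:
            # Assumption: show "0" when the contact has no entry for that day.
            cells.append("0")
            continue
        # Empty prefix when the user is not among the parties for that day.
        prefix = entry["partyIdActivityTypeMap"].get(str(user_id), "")
        cells.append(prefix + str(entry["numOfActivities"]))
    return cells


example_map = {
    "2011-09-21": {
        "date": "2011-09-21T16:58:16.937+0000",
        "numOfActivities": 3,
        "partyIdActivityTypeMap": {"1001": "E", "1002": "T", "1000": "E"},
    },
}
row_for_user = week_row(example_map, date(2011, 9, 19), 1002)
row_for_other = week_row(example_map, date(2011, 9, 19), 9999)
```

With the map keyed by date string and by user id, both lookups are direct key accesses, which is what motivates the map layout in the first place.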

thanks.



(davrob) #4

Any other suggestions on how to achieve what I want here would be
welcome, I'm shooting in the dark a bit.

For this use case I don't really need nested queries for the benefits
described in the docs (http://www.elasticsearch.org/guide/reference/mapping/nested-type.html),
but I might as well make the object nested because I could use those
searches in the future.

If I made the object nested, would it relieve the root object "contact"
from the performance drag of having 1000s of fields when I execute
searches that do not involve the nested field "activityDateMap"?

I think what I'm really after here, in relational terms, would
be an outer join on the activityDateMap field:

  • I want to return all the contacts in a person's list by doing a
    termFilter on listId.
  • then for each contact in the list I want to query activityDateMap
    to see if there are any events on that week involving the contact, and
    if so, see if any of them were also involving the logged in user, and
    create scripted fields accordingly.

I've been thinking about converting these maps to strings that could
be parsed as comma-separated arrays and / or with regular expressions, but
that approach seems to cause as many problems as it solves.

-David.



(Shay Banon) #5

I think I understand what you are after. If you use nested types, and use
the date as a nested field, you can do nested object level aggregation and
filtering. To improve performance, how about creating two maps,
one for "E" and one for "T", and then using a terms filter / aggregation to
aggregate them?
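
One possible reading of the "two maps" idea, sketched as a reshaping step in Python: instead of one field per user id, each day entry carries two fields whose values are the ids. The function name and the output layout are assumptions, not something spelled out in the thread.

```python
def split_by_type(party_id_activity_type_map):
    """Group party ids by activity type, e.g. {"E": [...], "T": [...]}."""
    by_type = {"E": [], "T": []}
    for party_id, kind in party_id_activity_type_map.items():
        # setdefault keeps the sketch working if another type letter shows up.
        by_type.setdefault(kind, []).append(party_id)
    return {kind: sorted(ids) for kind, ids in by_type.items()}


split_example = split_by_type({"1007": "T", "1005": "E", "1000": "E"})
```

Under this layout the 5,000 possible user ids become values of two fields rather than 5,000 separate field names, and a terms filter on the relevant field can check for the logged-in user's id.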



(davrob) #6

Hi Shay, yes nice idea, that'd cut down on the number of fields.



(davrob) #7

Hi - revisiting this there are a couple of things that are keeping me
from going with what Shay suggested:

  1. I don't think there is any way I can use faceting. Once I've
    done my query (on listId) I want ALL the rows; faceting will give me
    term counts etc. that are separate from the other fields. I don't
    want that. What I want is 5 new fields on every row that change their
    value depending on the date and the person logged in.

  2. If having many fields in activityDateMap is a problem because they
    will be indexed, surely the best thing for me to do is to turn off
    indexing for activityDateMap using "index": "no" in my mapping. This
    means the field will not be searchable, but that is fine. If I want an
    activity field that is searchable with facets etc., then I will create
    a separate nested field specifically for that purpose. The purpose
    of the activityDateMap field will be simply to hold a map which will
    return values quickly to a scripted field.
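
A minimal sketch of the store-only idea, again as a Python dict dumped to JSON. It assumes the object-level "enabled" setting is available in the Elasticsearch version in use; as far as I know, "index": "no" is a setting for leaf fields rather than for whole objects, and the thread does not confirm either option.

```python
import json


def store_only_mapping():
    """Mapping sketch: keep activityDateMap in _source without indexing it."""
    return {
        "contact": {
            "dynamic": False,
            "properties": {
                "activityDateMap": {
                    "type": "object",
                    # Assumption: "enabled": false skips parsing and indexing
                    # of everything under this object.
                    "enabled": False,
                },
            },
        }
    }


print(json.dumps(store_only_mapping(), indent=2))
```

If this setting behaves as assumed, nothing under activityDateMap creates index fields, so a scripted field would have to read the map from the stored source document rather than from indexed values.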

If anyone sees any tragic flaws in this argument, I'd appreciate their
comments.

-David.


