Converting from MySQL


(Clinton Gormley) #1

Hiya

I'm in the process of converting my queries from MySQL to ElasticSearch,
and would like to know if my mapping makes sense.

I've identified 6 common types of data that I want to store:

  • ID - a single integer/long
  • enum - a single value string, eg status: active | inactive |
    pending
  • text - freeform text, names, email addresses etc
  • date - dates, datetimes, times
  • ID's - multiple ID's, eg a list of object ancestor IDs
  • enum's - multiple enums, eg tags for a blog

Mappings:

  • ID - { type: long }
  • enum - { type: string, index: not_analyzed }
  • text - { type: string, analyzer: standard }
  • date - { type: date | datetime etc }
  • ID's - { type: string, analyzer: whitespace }
  • enum's - { type: string, analyzer: whitespace }
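
Written out in full, the mappings listed above might look roughly like this. This is a minimal sketch in Python, not a definitive mapping: the field names (`id`, `status`, `title`, `created`, `ancestor_ids`, `tags`) are hypothetical, and the `string` / `not_analyzed` settings follow the 2010-era syntax used in this thread (later Elasticsearch versions replaced them with the `text` and `keyword` types).

```python
import json

# Hypothetical field names; the settings mirror the list above.
MAPPING = {
    "properties": {
        "id": {"type": "long"},                                    # single ID
        "status": {"type": "string", "index": "not_analyzed"},     # single enum
        "title": {"type": "string", "analyzer": "standard"},       # freeform text
        "created": {"type": "date"},                               # date / datetime
        "ancestor_ids": {"type": "string", "analyzer": "whitespace"},  # multiple IDs
        "tags": {"type": "string", "analyzer": "whitespace"},          # multiple enums
    }
}


def mapping_json(mapping=MAPPING):
    """Serialize the mapping dict to a JSON string."""
    return json.dumps(mapping, indent=2, sort_keys=True)


print(mapping_json())
```

With the whitespace analyzer, multi-valued fields would be sent as one space-separated string, for example `"tags": "foo bar"`.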

So when searching for eg tags ( == enum's), I'd do something like:

{
  "bool": {
    "should": [
      { "term": { "tag": "foo" } },
      { "term": { "tag": "bar" } }
    ]
  }
}
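
For an arbitrary list of tags, the query above could be generated rather than written by hand. A small sketch, assuming a hypothetical helper name `any_tag_query` and the field name `tag` from the example; it only builds the request body and does not talk to a server.

```python
import json


def any_tag_query(tags, field="tag"):
    """Build a bool query with one 'should' term clause per tag."""
    return {"bool": {"should": [{"term": {field: t}} for t in tags]}}


query = any_tag_query(["foo", "bar"])
print(json.dumps(query, indent=2))
```

Each tag becomes its own `term` clause, so a document matches when it carries any one of the tags.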

Does this look right?

thanks

Clint

--
Web Announcements Limited is a company registered in England and Wales,
with company number 05608868, with registered address at 10 Arvon Road,
London, N5 1PR.


(Shay Banon) #2

Looks good. A point about IDs: if you can send them as a JSON array of
IDs, it's better to do that and map the field as not_analyzed.
Similarly, you can do the same for enum's.
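
To make the suggestion concrete, here is a sketch under stated assumptions: the field names `ancestor_ids` and `tags` are hypothetical, and `term_matches` is a local stand-in that only imitates how a term query treats a not_analyzed array field (each array element is one exact term). It is not part of any Elasticsearch API.

```python
# Hypothetical mapping: the field is declared once, and the document
# simply supplies a JSON array of values for it.
mapping = {
    "properties": {
        "ancestor_ids": {"type": "string", "index": "not_analyzed"},
        "tags": {"type": "string", "index": "not_analyzed"},
    }
}

# Example document with multi-valued fields sent as arrays.
doc = {"ancestor_ids": ["1", "17", "203"], "tags": ["foo", "bar baz"]}


def term_matches(document, field, value):
    """Return True if any element of the field equals the value exactly."""
    values = document.get(field, [])
    if not isinstance(values, list):
        values = [values]  # a single value behaves like a one-element array
    return value in values


print(term_matches(doc, "tags", "bar baz"))
print(term_matches(doc, "tags", "bar"))
```

One practical difference from the whitespace-analyzer approach: a value containing a space, such as `bar baz`, stays a single term instead of being split in two.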

-shay.banon


(Clinton Gormley) #3

On Tue, 2010-03-09 at 20:21 +0200, Shay Banon wrote:

> Looks good. A point about IDs, if you can have them as a JSON array
> of each id, then it's better to use it and then use not_analyzed for
> it. Similarly, you can do the same for enum's.

Ah fantastic! I had no idea that you could do that - I didn't see it
mentioned in the docs.


(Shay Banon) #4

It's here:
http://www.elasticsearch.com/docs/elasticsearch/mapping/array_type/

-shay.banon


(Clinton Gormley) #5

On Wed, 2010-03-10 at 00:34 +0200, Shay Banon wrote:

> It's here:
> http://www.elasticsearch.com/docs/elasticsearch/mapping/array_type/

Ah thanks - it should be added to the main page on
http://www.elasticsearch.com/docs/elasticsearch/mapping/

clint


(Shay Banon) #6

Oops, added.

