Best way to index/search a group of numbers


(Derf) #1

Hi,

I'm new to ES and need some basic help (I guess).

I have 5 million records. Every record is a string of 20 numbers from 01
to 60, like this:
record 1 = 01 03 06 08 13 19 22 25 28 29 30 31 40 42 44 46 53 55 57 59
record 2 = 02 05 06 08 11 12 23 25 29 31 34 36 37 43 44 52 54 55 57 58
record 3 = 04 05 12 14 19 20 24 29 31 35 38 39 40 44 47 53 54 56 57 58
...

There are no repeated combinations. Every record is unique.

My question is: what is the best way to index those records so I can find
records that contain specific numbers really fast?
For example: I must find one record that has the numbers 01, 06, 44 and 60.

I indexed the 5 million records exactly like this, as strings, and I'm
searching them this way: {"query": "01 AND 06 AND 44 AND 60"}
Right now it takes about 0.064 to answer. That seems fast, but we need a
faster way. Is that possible?

Some further information that may help:

  • When we look for a record, we only need to find the first match, not all
    of them.
  • After we find the record, we need some way to exclude it from future
    searches.

I guess the first point is to index the information in the best way.
Can someone help me with that? Sorry if this is too basic a question.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/7ce37af2-0387-4401-96d8-e36de8a7537a%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(Binh Ly) #2

Derf,

I'd probably just use the whitespace analyzer and combine it with a terms
filter. Something like this for the mapping:

{
  "mappings": {
    "doc": {
      "properties": {
        "foo": {
          "type": "string",
          "analyzer": "whitespace"
        }
      }
    }
  }
}
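
As a rough illustration of what that mapping buys you (plain Python, not Elasticsearch code): the whitespace analyzer splits the field on whitespace only, so each two-digit number should end up as its own term, leading zero included. The helper name `whitespace_tokens` is made up for this sketch.

```python
def whitespace_tokens(text):
    """Split on runs of whitespace only; no lowercasing, no stemming."""
    return text.split()


# Record 1 from the original question.
record_1 = "01 03 06 08 13 19 22 25 28 29 30 31 40 42 44 46 53 55 57 59"
tokens = whitespace_tokens(record_1)

print(len(tokens))   # 20
print(tokens[:3])    # ['01', '03', '06']
```

Because the terms keep their leading zeros, the values in the query have to be written the same way ("01", not "1").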

And then something like this for the query:

{
  "query": {
    "filtered": {
      "filter": {
        "terms": {
          "foo": [
            "01", "06", "44", "60"
          ],
          "execution": "and"
        }
      }
    }
  }
}
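
To show the idea behind an "and" terms filter, here is a minimal Python sketch (a toy model, not how Elasticsearch is implemented): each term maps to the set of records containing it, and a match must appear in every one of those sets. It also covers the two extra requirements from the question, taking only the first match and skipping records that were already returned. The names `postings`, `first_match` and `used` are hypothetical, and tracking exclusions in a plain set is an assumption of this sketch; the reply above does not say how to do that part in Elasticsearch.

```python
# The three sample records from the question, keyed by record number.
records = {
    1: "01 03 06 08 13 19 22 25 28 29 30 31 40 42 44 46 53 55 57 59",
    2: "02 05 06 08 11 12 23 25 29 31 34 36 37 43 44 52 54 55 57 58",
    3: "04 05 12 14 19 20 24 29 31 35 38 39 40 44 47 53 54 56 57 58",
}

# Toy inverted index: term -> set of record ids that contain it.
postings = {}
for rec_id, text in records.items():
    for term in text.split():
        postings.setdefault(term, set()).add(rec_id)


def first_match(wanted, excluded):
    """Return the lowest record id containing every wanted term, or None."""
    sets = [postings.get(term, set()) for term in wanted]
    if not sets:
        return None
    candidates = set.intersection(*sets) - excluded
    return min(candidates) if candidates else None


used = set()  # ids already handed out (assumption: tracked by the caller)

hit = first_match(["06", "44", "57"], used)
print(hit)                                          # 1
used.add(hit)
print(first_match(["06", "44", "57"], used))        # 2
print(first_match(["01", "06", "44", "60"], used))  # None
```

None of the three sample records contains 60, so the example query from the question finds nothing here; on the full data set the same intersection runs over the real posting lists.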


