How did soundcloud.com build their search?


(joa) #1

The soundcloud.com search appears to be an auto-suggest search over multiple
fields, maybe over multiple types/indices.

Example call with suggestions for "aa":
https://api.soundcloud.com/search/suggest?q=aa&pretty=true

{
  "tx_id" : "47f7ce51ad784cc98fe8f1201729fb87",
  "query_time_in_millis" : 0,
  "query" : "aa",
  "limit" : 3,
  "suggestions" : [ {
    "query" : "A$AP Rocky",
    "kind" : "user",
    "id" : 26482329,
    "score" : 165994
  }, {
    "query" : "A$AP Rocky, F**kin' Problems (ft. Drake, 2 Chainz & Kendrick Lamar)",
    "kind" : "track",
    "id" : 64506899,
    "score" : 97811
  }, {
    "query" : "A$AP Rocky, Wild For The Night",
    "kind" : "track",
    "id" : 74570738,
    "score" : 64564
  } ]
}

How can I build an auto-suggest function that works across multiple fields
in general? And how did they include a "kind" field in the result to
indicate where each hit comes from?

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(Alexander Reelsen) #2

Hey,

you may want to check out the completion suggester, which also lets you
return custom content with each suggestion by using payloads.
Also, from my high-level point of view, it looks like they are merely
maintaining their own suggest index instead of querying many others (I don't
know the internals, though).

See
http://www.elasticsearch.org/guide/reference/api/search/completion-suggest/

--Alex
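
To make the suggestion above concrete, here is a minimal sketch of the JSON bodies a dedicated suggest index might use, written as Python dicts. It assumes the completion suggester as documented around Elasticsearch 0.90 (a "completion" field with "payloads" enabled); later major versions dropped payloads, so check the docs for your version. The type name "entity", the field name "suggest", and the helper functions are hypothetical, and nothing here is confirmed to be what SoundCloud does.

```python
import json

# Hypothetical names, chosen only for this illustration.
TYPE_NAME = "entity"
SUGGEST_FIELD = "suggest"


def suggest_mapping():
    """Mapping for one shared suggest index covering all entity kinds."""
    return {
        TYPE_NAME: {
            "properties": {
                SUGGEST_FIELD: {"type": "completion", "payloads": True}
            }
        }
    }


def suggest_document(text, kind, entity_id, weight):
    """One document per user/track; the payload carries kind and id back."""
    return {
        SUGGEST_FIELD: {
            "input": [text],
            "output": text,
            "payload": {"kind": kind, "id": entity_id},
            "weight": weight,
        }
    }


def suggest_request(prefix, size=3):
    """Request body for the suggest endpoint, asking for the top matches."""
    return {
        "suggestions": {
            "text": prefix,
            "completion": {"field": SUGGEST_FIELD, "size": size},
        }
    }


if __name__ == "__main__":
    doc = suggest_document("A$AP Rocky", "user", 26482329, 165994)
    print(json.dumps(suggest_mapping(), indent=2))
    print(json.dumps(doc, indent=2))
    print(json.dumps(suggest_request("aa"), indent=2))
```

Because users and tracks are indexed into the same completion field, a single request covers both, and the payload is one way to get a "kind" value back with each hit.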

On Thu, Sep 12, 2013 at 9:03 PM, joa joafeldmann@gmail.com wrote:



(ddorian43) #3

http://backstage.soundcloud.com/tag/elastic-search/



(joa) #4

@ddorian43: Thanks, I've already read their blog, but it doesn't cover ES
internals.
@Alexander: Your guess was right, they're using their own suggest index. I
posted my question on their blog, and their answer came today:

The auto-suggest feature is totally disconnected from ElasticSearch. We
build a finite state transducer (FST) containing the most relevant of each
entity-type offline, and serve it from memory. The technique we use is
very similar to the one described here:

http://www.elasticsearch.org/blog/you-complete-me/

...and if we were building the feature today, I think we would definitely
try first to do it with that API.
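
For anyone who wants to experiment with the "build offline, serve from memory" idea, here is a rough sketch. It is not an FST: it swaps in a sorted list of normalized keys with binary search, which gives the same prefix lookup for small data sets. The normalization rule (lowercase, keep only letters and digits) is my assumption; it happens to make "aa" match "A$AP Rocky", but nothing in their answer confirms that is how they do it. The class and function names and the "Example Band" entry are hypothetical.

```python
import bisect
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    query: str   # text shown to the user
    kind: str    # entity type, e.g. "user" or "track"
    id: int
    score: int   # precomputed relevance


def normalize(text):
    # Assumption: lowercase and drop everything except ASCII letters/digits.
    return re.sub(r"[^a-z0-9]", "", text.lower())


class SuggestIndex:
    """Sorted-list stand-in for an in-memory FST, built once offline."""

    def __init__(self, entries):
        pairs = sorted(((normalize(e.query), e) for e in entries),
                       key=lambda pair: pair[0])
        self._keys = [key for key, _ in pairs]
        self._entries = [entry for _, entry in pairs]

    def suggest(self, prefix, limit=3):
        key = normalize(prefix)
        if not key:
            return []
        # All keys starting with `key` form one contiguous sorted range.
        lo = bisect.bisect_left(self._keys, key)
        hi = bisect.bisect_left(self._keys, key + "\uffff")
        matches = self._entries[lo:hi]
        return sorted(matches, key=lambda e: e.score, reverse=True)[:limit]


if __name__ == "__main__":
    index = SuggestIndex([
        Entry("A$AP Rocky, Wild For The Night", "track", 74570738, 64564),
        Entry("A$AP Rocky", "user", 26482329, 165994),
        Entry("A$AP Rocky, F**kin' Problems (ft. Drake, 2 Chainz & Kendrick Lamar)",
              "track", 64506899, 97811),
        Entry("Example Band", "user", 1, 10),  # hypothetical filler entry
    ])
    for hit in index.suggest("aa"):
        print(hit.kind, hit.id, hit.score, hit.query)
```

Mixing all entity types in one structure and storing "kind" on each entry is what lets a single lookup return users and tracks together, ranked by score.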


