Hey Binh,
Thanks a lot. It is really nice to hear from someone with practical
experience on this. Is it correct to say that if I had a thousand tags,
I would need to make a thousand calls like
curl -XPUT 'localhost:9200/my-index1/.percolator/tagname1'
to register each tag? Are there any pitfalls or nice tricks in your
implementation that are worth noting?
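For illustration, here is a minimal sketch of how the per-tag registrations could be batched into a single _bulk request instead of a thousand separate PUTs. The field name "body", the one-match-query-per-tag layout, and the file name tags.bulk are assumptions made for the example, not something from this thread, and it is worth verifying on your version that .percolator entries can be registered through _bulk. Tag names are assumed to be simple words with no quotes or backslashes.

```shell
#!/bin/sh
# Sketch: build one _bulk body that registers a percolator query per tag.
# Assumptions (hypothetical): index "my-index1", field "body",
# and a plain match query on the tag name itself.

# Print the two NDJSON lines (action line + query source) for each tag argument.
percolator_bulk_body() {
  for tag in "$@"; do
    printf '{"index":{"_index":"my-index1","_type":".percolator","_id":"%s"}}\n' "$tag"
    printf '{"query":{"match":{"body":"%s"}}}\n' "$tag"
  done
}

# Example use (needs a running node, so it is left commented out):
# percolator_bulk_body football art music > tags.bulk
# curl -XPOST 'localhost:9200/_bulk' --data-binary @tags.bulk
```

The match query is only a placeholder; a term/terms filter, as suggested in the quoted reply, could be written into the source line instead.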
On Wednesday, January 22, 2014 8:27:03 AM UTC+8, Binh Ly wrote:
Arthur,
You should be able to use filters in your percolator queries, so for
example you can use a term/terms filter. Also, in ES 1.0 you can shard the
percolator query index so that the percolation load is distributed
across shards for better scalability. The best way is to experiment with it:
Elasticsearch Platform — Find real-time answers at scale | Elastic
I actually worked for a company that did content classification this way,
and the percolator was a perfect fit for that use case.
On Tuesday, January 21, 2014 10:01:36 AM UTC-5, Arthur Denning wrote:
I am considering using the percolator API to classify documents, namely
by registering queries like "football" and "art" with the percolator, so
that when new documents are added, the percolator returns the right tags.
My concern is: suppose there are thousands of tags to be identified this
way, would it be a performance nightmare? Are there thousands of queries
implicitly running behind the scenes?
And what would be the recommended way to tackle this kind of
classification problem in Elasticsearch?
It seems that Lucene has a classification API. Is it already integrated
elsewhere in Elasticsearch? Is there any roadmap for its
implementation?
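To make the classification flow described above concrete, here is a rough sketch of percolating one new document against the registered tag queries. The index "my-index1", the type "my-type", and the field "body" are assumptions for the example. As far as I know, the ES 1.0 percolate response lists the ids of the matching queries under "matches", and those ids would serve as the tags. The document text is assumed to contain no quotes or backslashes.

```shell
#!/bin/sh
# Sketch: build the request body used to percolate a single document.
# Assumptions (hypothetical): the text to classify lives in a field named "body".

# Print the percolate request body for the document text given as $1.
percolate_body() {
  printf '{"doc":{"body":"%s"}}\n' "$1"
}

# Example use (needs a running node, so it is left commented out):
# percolate_body "The football final is tonight" |
#   curl -XGET 'localhost:9200/my-index1/my-type/_percolate' -d @-
```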