ElasticSearch and duplicate content

(Joffrey Hercule) #1

Hi all,
I'm an Elasticsearch newbie.

I have a small problem with duplicate content: I need to decide whether an
item must be inserted or updated.


  • A car has a brand, a model, a color, ... (10 criteria at most)
  • to find out whether a record already exists in ES, we use a points-based
    algorithm
  • so if the brand matches, we count 1 point. If the brand and the model
    both match, we count 5 points, etc.
  • after the search, we sum the points. If the total is high, we merge the
    record into the existing one; otherwise, we create a new one.
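
To make the points system concrete, here is a minimal sketch in Python. The
rule table, the field names, the threshold of 5 and the choice to add up every
rule that matches are assumptions for illustration only, not the real values.

```python
from typing import Any, Dict, FrozenSet, List, Tuple

# Hypothetical rules: (fields that must all match, points awarded).
Rule = Tuple[FrozenSet[str], int]
RULES: List[Rule] = [
    (frozenset({"brand"}), 1),
    (frozenset({"brand", "model"}), 5),
]

# Assumed cut-off: at or above this total, the two records are merged.
MERGE_THRESHOLD = 5


def match_points(candidate: Dict[str, Any], existing: Dict[str, Any],
                 rules: List[Rule] = RULES) -> int:
    """Sum the points of every rule whose fields all match in both records."""
    total = 0
    for fields, points in rules:
        if all(candidate.get(f) is not None
               and candidate.get(f) == existing.get(f) for f in fields):
            total += points
    return total


def should_merge(candidate: Dict[str, Any], existing: Dict[str, Any],
                 threshold: int = MERGE_THRESHOLD) -> bool:
    """True means merge into the existing record, False means create a new one."""
    return match_points(candidate, existing) >= threshold
```

With these assumed rules, two cars sharing only the brand score 1 point and are
kept separate, while two cars sharing brand and model reach the threshold and
are merged.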

I tried a bool/should query with match clauses, keeping the top-scoring hit,
but it was very slow: more than 2 seconds to retrieve the data with 8 bool
clauses.
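
For reference, one possible shape for such a request is sketched below as a
Python dict builder. It is not the query used in the test above. It assumes a
recent Elasticsearch version where `bool` accepts a `filter` clause, fields
indexed as exact (keyword-style) values, and a hypothetical rule table. Each
rule becomes a `constant_score` clause whose `boost` is the rule's points, so
the score of a hit is meant to reflect the points total.

```python
from typing import Any, Dict, FrozenSet, List, Tuple

# Hypothetical rules: (fields that must all match, points awarded).
RULES: List[Tuple[FrozenSet[str], int]] = [
    (frozenset({"brand"}), 1),
    (frozenset({"brand", "model"}), 5),
]


def build_dedup_query(candidate: Dict[str, Any],
                      rules: List[Tuple[FrozenSet[str], int]] = RULES
                      ) -> Dict[str, Any]:
    """Build a search body with one constant_score clause per usable rule."""
    should = []
    for fields, points in rules:
        # Skip rules that need a field the candidate does not have.
        if any(candidate.get(f) is None for f in fields):
            continue
        terms = [{"term": {f: candidate[f]}} for f in sorted(fields)]
        should.append({
            "constant_score": {
                "filter": {"bool": {"filter": terms}},
                "boost": points,
            }
        })
    return {
        "size": 1,  # only the best-scoring existing record is needed
        "query": {"bool": {"should": should, "minimum_should_match": 1}},
    }
```

The returned dict would be sent as the body of a search request. Whether this
form is faster than scored match clauses has not been measured here.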

Do you have a better idea for solving this? Thanks in advance for your help,
and sorry for my bad English!


(system) #2