Duplicate Data problem

I have a problem pushing MySQL data to Elasticsearch using the mysql_replicator plugin.

I have two tables, chat_user and chat_message.

chat_user:

    id    name    socketid
    1     raj     123
    2     kumar   1234

chat_message:

    id    chat_from    chat_to    message
    1     123          1235       hello
    2     1235         123        how can i help you?
    3     123          1235       How track my order?

Then I join these two tables using `chat_message.chat_from = chat_user.socketid OR chat_message.chat_to = chat_user.socketid`.

My query:

    SELECT * FROM `chat_message` INNER JOIN `chat_user` ON chat_message.chat_from = chat_user.socketid OR chat_message.chat_to = chat_user.socketid

Result:

    chat_from    chat_to    message                id    name    socketid
    123          1235       hello                  1     raj     123
    1235         123        how can i help you?    1     raj     123
    123          1235       How track my order?    1     raj     123

If I push this data to Elasticsearch, only the last row gets pushed:

    123    1235    How track my order?    1    raj    123

This is because the primary key contains duplicates: in the td-agent configuration file I set chat_user.id as the primary key, and it is 1 for every row.
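Elasticsearch uses the configured primary_key value as the document `_id`, so rows that share an id overwrite each other. A minimal sketch of that last-write-wins behavior, modeling the index as a dict keyed by `_id` (the data is taken from the result set above):

```python
# Model an Elasticsearch index as a dict keyed by document _id:
# indexing a row with an existing _id replaces the previous document.
rows = [
    {"id": 1, "chat_from": "123", "chat_to": "1235", "message": "hello"},
    {"id": 1, "chat_from": "1235", "chat_to": "123", "message": "how can i help you?"},
    {"id": 1, "chat_from": "123", "chat_to": "1235", "message": "How track my order?"},
]

index = {}
for row in rows:
    index[row["id"]] = row  # same _id -> the document is overwritten

print(len(index))           # 1 -- only one document survives
print(index[1]["message"])  # the last row wins
```

This is why only the last joined row shows up in Elasticsearch.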

Td-Agent Configuration File:

    ####
    ## Output descriptions:
    ##
    # HTTP input
    # POST http://localhost:8888/<tag>?json=<json>
    # POST http://localhost:8888/td.myapp.login?json={"user"%3A"me"}
    # @see http://docs.fluentd.org/articles/in_http
    <source>
      @type http
      port 8888
    </source>

    ## live debugging agent
    <source>
      @type debug_agent
      bind localhost
      port 24230
    </source>

    ####
    ## Examples:
    ##

    <source>
      @type mysql_replicator
      host localhost
      username root
      password gworks.mobi2
      database livechat
      query SELECT * FROM `chat_message` INNER JOIN `chat_user` ON chat_message.chat_from = chat_user.socketid OR chat_message.chat_to = chat_user.socketid;
      primary_key id
      interval 10s
      enable_delete yes
      tag replicator.history5.histestb.${event}.${primary_key}
    </source>

    <match replicator.**>
      @type stdout
    </match>

    <match replicator.**>
      @type mysql_replicator_elasticsearch
      host localhost
      port 9200
      tag_format (?<index_name>[^\.]+)\.(?<type_name>[^\.]+)\.(?<event>[^\.]+)\.(?<primary_key>[^\.]+)$
      flush_interval 5s
      max_retry_wait 1800
      flush_at_shutdown yes
      buffer_type file
      buffer_path /var/log/td-agent/buffer/mysql_replicator_elasticsearch.*
    </match>

Reference : https://github.com/elastic/elasticsearch/issues/18882

I need to push all the data to Elasticsearch. How can I solve this problem?

As I'm not familiar with the plugin, I don't really know how to help you here. I see, however, that you opened an issue as well for the plugin that does the replication.

Someone seems to be helping you there, so I would have preferred that you properly link that here instead of just cross-posting to various platforms.

In the result set you seem to have id set to 1 for all records. It is therefore not a suitable primary key. As you use this as document ID in Elasticsearch, you are updating the same document over and over. You need to correct your query so that the id field is unique, e.g. by making sure it corresponds to the id from the chat_message table, assuming this is unique.
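For example, selecting chat_message.id explicitly (instead of `SELECT *`) makes each joined row carry a unique id. A sketch using SQLite in place of MySQL, with the table data from the question:

```python
import sqlite3

# Recreate the two tables from the question in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE chat_user (id INTEGER, name TEXT, socketid TEXT);
    CREATE TABLE chat_message (id INTEGER, chat_from TEXT, chat_to TEXT, message TEXT);
    INSERT INTO chat_user VALUES (1, 'raj', '123'), (2, 'kumar', '1234');
    INSERT INTO chat_message VALUES
        (1, '123', '1235', 'hello'),
        (2, '1235', '123', 'how can i help you?'),
        (3, '123', '1235', 'How track my order?');
""")

# Select chat_message.id explicitly so the join result exposes the
# message id (which is unique) rather than the chat_user id.
rows = conn.execute("""
    SELECT chat_message.id AS id, chat_from, chat_to, message, name, socketid
    FROM chat_message
    INNER JOIN chat_user
      ON chat_message.chat_from = chat_user.socketid
      OR chat_message.chat_to = chat_user.socketid
    ORDER BY id
""").fetchall()

ids = [r[0] for r in rows]
print(ids)  # [1, 2, 3] -- every row now has a distinct id
```

With a distinct id per row, each row becomes its own Elasticsearch document instead of overwriting the previous one.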

I have solved my problem by removing the id field from the chat_user table.

Then I ran the join query again:

    SELECT * FROM `chat_message` INNER JOIN `chat_user` ON chat_message.chat_from = chat_user.socketid OR chat_message.chat_to = chat_user.socketid

Now I get a result with no duplication:

    id    chat_from    chat_to    message                name    socketid
    1     123          1235       hello                  raj     123
    2     1235         123        how can i help you?    raj     123
    3     123          1235       How track my order?    raj     123

Now all records are pushed to Elasticsearch. It worked for me.