I am using Elasticsearch 5.6.4.
GET /clinicaldata/_search
{
  "query": {
    "match": {
      "mrdno": "112627"
    }
  }
}
mrdno 112627 has 75 duplicate records.
Running the query above returns the following output:
{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 75,
    "max_score": 1
That is 75 records in total, and they are duplicates of one another.
I want to delete the duplicate records using Elasticsearch.
How do I write a query for that?
dadoonet
(David Pilato)
December 18, 2017, 12:32pm
2
You can probably use the delete by query feature and then index one of the docs again.
OK. Please tell me how to do that, because I am new to Elasticsearch.
dadoonet
(David Pilato)
December 18, 2017, 3:35pm
4
I have 75 duplicate records for mrdno 11657.
POST twitter/_delete_by_query?scroll_size=5000
{
  "query": {
    "term": {
      "user": "kimchy"
    }
  }
}
Starting from the example above, I am replacing the field name with my own.
My index name is test.
POST test/_delete_by_query?scroll_size=5000
{
  "query": {
    "term": {
      "mrdno": "11657"
    }
  }
}
The query above will delete the duplicate values for the mrdno column.
Is that correct? Please let me know.
dadoonet
(David Pilato)
December 18, 2017, 4:38pm
6
Please format your code using the </> icon as explained in this guide. It will make your post more readable.
Or use markdown style like:
```
CODE
```
Your query looks good, except for the index name, which should probably be clinicaldata.
clinicaldata is my index name.
POST clinicaldata/_delete_by_query?scroll_size=5000
{
  "query": {
    "term": {
      "mrdno": "11657"
    }
  }
}
Is the query above correct for deleting the duplicate records of an mrdno?
dadoonet
(David Pilato)
December 18, 2017, 5:45pm
8
It will delete ALL records that match "mrdno": "11657", not only the duplicates.
So you will need to index one document again afterwards...
I do not want to delete all the records that match mrdno 11657.
I want to delete only the duplicates.
POST clinicaldata/_delete_by_query?scroll_size=5000
{
  "query": {
    "term": {
      "mrdno": "11657"
    }
  }
}
You said that the query above will delete all the records that match mrdno 11657.
I want to delete only the duplicate records of mrdno 11657.
What changes do I have to make to the query above?
Please help.
dadoonet
(David Pilato)
December 19, 2017, 5:41am
10
I can see that you asked the same question at
I want to delete the duplicates. Is the code below correct?
The code below is written in the Logstash configuration file (a .conf file).
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    manage_template => false
    index => "test" }
  stdout { codec => rubydebug }
  document_id => "%{[@metadata][_mrdno]}"
}
I want to remove the duplicates of the mrdno column.
Please let me know.
So the answer is: there is no way to remove duplicates in one single call.
As I said, you need to:
Remove all docs
Add back one of the docs you removed
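The two steps above can be sketched in Python. This is only an illustration under stated assumptions: `hits` is taken to be the `hits.hits` array of a search response, the helper name `plan_delete_and_reindex` is made up for this example, and nothing is sent to Elasticsearch. The code only prepares the `_delete_by_query` body and picks the document to add back.

```python
import json


def plan_delete_and_reindex(hits, field, value):
    """Prepare step 1 (delete all matches) and step 2 (one doc to add back).

    `hits` is assumed to be the `hits.hits` list of a search on `field`.
    Hypothetical helper: it sends nothing, it only builds the pieces.
    """
    if not hits:
        raise ValueError("no matching documents, nothing to deduplicate")
    # Keep the source of the first hit so it can be indexed again afterwards.
    doc_to_keep = hits[0]["_source"]
    # Same query shape as in the thread: it matches ALL docs with this value.
    delete_body = {"query": {"term": {field: value}}}
    return delete_body, doc_to_keep


# Made-up sample hits standing in for a real search response.
sample_hits = [
    {"_id": "a1", "_source": {"mrdno": "11657", "age": 28}},
    {"_id": "b2", "_source": {"mrdno": "11657", "age": 28}},
]
body, keep = plan_delete_and_reindex(sample_hits, "mrdno", "11657")
print("POST clinicaldata/_delete_by_query")
print(json.dumps(body, indent=2))
print("then index again:", json.dumps(keep))
```

The search has to run before the delete, otherwise there is nothing left to copy the surviving document from.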
Is it a one-time operation or something you want to do in the long run? If the latter, what is the use case for allowing duplicates?
dadoonet
(David Pilato)
December 19, 2017, 5:54am
11
Unless you have another field which can help like a timestamp...
In Kibana, the data is displayed in the Discover tab. The Discover tab details are shown below.
Index name: clinicaldata
Selected fields: _source
Available fields: @timestamp, @version, _id, _index, _score, _type
Field names: age, cpa_addr_1, cpa_addr_2, cpa_addr_3, cpa_addr_area, cpa_addr_city, cpa_country_cd, cpa_pin_code, cpa_state_cd, mcs_case_summary, mcs_crt_dt, mcs_crt_uid, rrh_first_name, rrh_location_cd, rrh_mr_num, rrh_pat_dob, rrh_pat_sex, rrh_regn_dt
_source
cpa_addr_2:POST- EKBARNA cpa_addr_area: - mcs_crt_uid:MACTCS cpa_addr_1:VILL- EKB ARNA rrh_mr_num:3416558 cpa_pin_code:732 204 rrh_pat_sex:Fema mcs_crt_dt:2015-01-03T07:23:00.000Z @timestamp :2017-12-15T08:54:28.231Z cpa_country_cd:INDIA cpa_state_cd:WB @version :1 rrh_regn_dt:2014-11-18T18:30:00.000Z rrh_first_name:ANITA KARMAKAR rrh_pat_dob:1987-08-24T18:30:00.000Z mcs_case_summary:External File Uploaded - EXTNFILEINFO SHEET age:28 rrh_location_cd:MAIN cpa_addr_3:PS- RATUA cpa_addr_city:india
_id:AWBZYcj9fbvNIqwo2O6C _type:logs _index:clinicaldata _score:1
But when I build a Kibana visualization using a tag cloud, the data is not displayed.
These are the steps I follow to display the data in a visualization:
Create a new visualization
Select Tag Cloud as the visualization type
Select the index name
Click the Add a filter + button
In the popup that opens, select the field name from the filter list, choose "is" as the operator, type india in the value text box, and click Save
The message "No results found" is then shown
Why is the data not displayed in the Visualize tab?
Is there any mistake in the steps above? Please help.
dadoonet
(David Pilato)
December 19, 2017, 6:23am
13
So you can run a first request to get the min value of the timestamp field with a min aggregation. And then exclude this value in your search body.
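As a rough sketch of that two-request approach, the Python below only builds the two request bodies and prints them. The post does not spell the requests out, so the exact bodies are an assumption. The helper names, the aggregation name `oldest`, and the use of `@timestamp` as the timestamp field are also assumptions made for illustration.

```python
import json


def min_timestamp_request(field, value, ts_field="@timestamp"):
    """Search body that returns only the smallest timestamp among the matches."""
    return {
        "size": 0,  # no hits needed, only the aggregation result
        "query": {"term": {field: value}},
        "aggs": {"oldest": {"min": {"field": ts_field}}},
    }


def delete_all_but_oldest_request(field, value, min_ts, ts_field="@timestamp"):
    """Delete-by-query body: same match, but excluding the minimum timestamp."""
    return {
        "query": {
            "bool": {
                "filter": [{"term": {field: value}}],
                "must_not": [{"term": {ts_field: min_ts}}],
            }
        }
    }


# Step 1: GET clinicaldata/_search with this body.
print(json.dumps(min_timestamp_request("mrdno", "11657"), indent=2))

# Step 2: POST clinicaldata/_delete_by_query with this body.
# The timestamp below is a placeholder for the value returned by step 1
# (aggregations.oldest.value_as_string in the response).
print(json.dumps(
    delete_all_but_oldest_request("mrdno", "11657", "2017-12-15T08:54:28.231Z"),
    indent=2,
))
```

One caveat of this design: if several duplicates share the same minimum timestamp, all of them are excluded from the delete and therefore all of them survive.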
dadoonet
(David Pilato)
December 19, 2017, 6:51am
14
And please format your code using the </> icon as explained in this guide. It will make your post more readable.
Or use markdown style like:
```
CODE
```
OK. You said to run a first request to get the min value of the timestamp field with a min aggregation, and then to exclude this value in the search body.
Please tell me how to do each step, because I am new to Elasticsearch.
dadoonet
(David Pilato)
December 19, 2017, 7:53am
16
If you still don't know, please provide a sample data set with some docs that we can use, as described in the linked post (see its section "PLEASE READ THIS SECTION IF IT'S YOUR FIRST POST").
It will help to better understand what you are doing.
Please, try to keep the example as simple as possible.
Please read carefully the instructions. It must be easily runnable.
The code below, which is meant to remove the duplicates, is not working:
input {
  jdbc {
    jdbc_driver_library => "D:\mysql-connector-java-5.1.44\mysql-connector-java-5.1.44\mysql-connector-java-5.1.44-bin.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    jdbc_connection_string => "jdbc:mysql://localhost:3306/sample"
    jdbc_user => "root"
    jdbc_password => "root"
    jdbc_fetch_size => 10000
    schedule => "* * * * *"
    statement => "SELECT * from sample"
    #codec => "json"
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    manage_template => false
    index => "clinical" }
  document_id => "%{[@metadata][_RRH_MR_NUM]}"
  stdout { codec => rubydebug }
}
I wrote the code above in my .conf file to remove the duplicates in Elasticsearch, but it is not working.
What is the mistake in the code above? Please help.
dadoonet
(David Pilato)
December 19, 2017, 9:22am
18
Please format your code if you expect someone to read it.
I have formatted my code and am sending it again:
input {
  jdbc {
    jdbc_driver_library => "D:\mysql-connector-java-5.1.44\mysql-connector-java-5.1.44\mysql-connector-java-5.1.44-bin.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    jdbc_connection_string => "jdbc:mysql://localhost:3306/sample"
    jdbc_user => "root"
    jdbc_password => "root"
    jdbc_fetch_size => 10000
    schedule => "* * * * *"
    statement => "SELECT * from sample"
    #codec => "json"
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    manage_template => false
    index => "clinical" }
  document_id => "%{[@metadata][_RRH_MR_NUM]}"
  stdout { codec => rubydebug }
}
I wrote the code above in my .conf file to remove the duplicates in Elasticsearch, but it is not working.
Please help me: what is wrong with the code above? I have tried several times and it is not working.
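A note on what this configuration seems to be aiming for, offered as a hedged reading rather than an answer confirmed in the thread: when every document is indexed under an ID derived from the record number, a later copy of the same record overwrites the earlier one instead of creating a duplicate. In Logstash, `document_id` is an option of the `elasticsearch` output, so it would normally go inside the `elasticsearch { ... }` block, and it presumably needs to reference a field that actually exists on the event. The Python sketch below only simulates the overwrite behaviour with a dictionary. `index_events` and the sample rows are made up for illustration.

```python
def index_events(events, id_field=None):
    """Simulate indexing into a dict keyed by document ID.

    Without id_field every event gets a fresh auto-generated ID.
    With id_field the value of that field is used as the document ID.
    """
    index = {}
    for n, event in enumerate(events):
        doc_id = str(event[id_field]) if id_field else f"auto-{n}"
        index[doc_id] = event  # the same ID overwrites the earlier document
    return index


# The same row fetched on three scheduled runs of the jdbc input.
rows = [{"rrh_mr_num": 3416558, "age": 28}] * 3

print(len(index_events(rows)))                # 3 documents: duplicates pile up
print(len(index_events(rows, "rrh_mr_num")))  # 1 document: same ID, overwritten
```

This only prevents new duplicates. Documents that were already indexed under auto-generated IDs stay in the index until they are deleted.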
dadoonet
(David Pilato)
December 19, 2017, 10:22am
20
Please read what I wrote here.
And please format your code using the </> icon as explained in this guide. It will make your post more readable.
Or use markdown style like:
```
CODE
```
Edit your post. Make sure in the preview window that it is correct. Thanks.