Extremely slow Elasticsearch indexing performance with very long string fields


(Zhengda Wu) #1
    try {
        Settings settings = Settings.builder().put("cluster.name", "phm").build();
        TransportClient client = new PreBuiltTransportClient(settings)
                .addTransportAddress(new TransportAddress(InetAddress.getByName("..."), 9300))
                .addTransportAddress(new TransportAddress(InetAddress.getByName("..."), 9300))
                .addTransportAddress(new TransportAddress(InetAddress.getByName("..."), 9300));
        client.admin().indices().create(new CreateIndexRequest("newtest")).actionGet();
        System.out.println("Step1");

        PutMappingResponse putMappingResponse = client.admin().indices()
                .preparePutMapping("newtest")
                .setType("doc")
                .setSource(jsonBuilder().prettyPrint()
                    .startObject()
                        .startObject("properties")
                            // Disable the inverted index for the huge string fields
                            // (note: their values are still stored verbatim in _source)
                            .startObject("rawdata").field("type", "keyword").field("index", false).endObject()
                            .startObject("spectrum").field("type", "keyword").field("index", false).endObject()
                            .startObject("id").field("type", "integer").endObject()
                            .startObject("timestamp").field("type", "integer").endObject()
                            .startObject("health").field("type", "integer").endObject()
                            .startObject("rul").field("type", "integer").endObject()
                            .startObject("RMS").field("type", "integer").endObject()
                            .startObject("VAR").field("type", "integer").endObject()
                            .startObject("peak").field("type", "integer").endObject()
                            .startObject("CrestIndex").field("type", "integer").endObject()
                            .startObject("peakpeak").field("type", "integer").endObject()
                            .startObject("MarginIndex").field("type", "integer").endObject()
                            .startObject("SkewNess").field("type", "integer").endObject()
                            .startObject("SkewnessIndex").field("type", "integer").endObject()
                            .startObject("kurtosis").field("type", "integer").endObject()
                            .startObject("KurtosisIndex").field("type", "integer").endObject()
                            .startObject("InpulseIndex").field("type", "integer").endObject()
                            .startObject("WaveformIndex").field("type", "integer").endObject()
                        .endObject()
                    .endObject())
                .execute().actionGet();

        System.out.println("Step2");

        // Build the two long test strings with StringBuilder; repeated String
        // concatenation in a loop is O(n^2) and very slow at this size, and the
        // per-iteration println alone adds significant overhead.
        StringBuilder rawDataBuilder = new StringBuilder(1_000_000);
        for (int i = 0; i < 100000; i++) {
            rawDataBuilder.append("aaaaaaaaaa");
        }
        String raw_data = rawDataBuilder.toString();   // 1,000,000 characters

        StringBuilder spectrumBuilder = new StringBuilder(500_000);
        for (int i = 0; i < 50000; i++) {
            spectrumBuilder.append("bbbbbbbbbb");
        }
        String spectrum = spectrumBuilder.toString();  // 500,000 characters


        for (int j = 0; j < BULK_NUM; j++) {
            BulkRequestBuilder request = client.prepareBulk();
            for (int i = 0; i < BULK_SIZE; i++) {
                Map<String, Object> parseObject = new HashMap<String, Object>();
                parseObject.put("id", 10000 * j + i);
                parseObject.put("timestamp", timestamp);
                parseObject.put("health", health);
                parseObject.put("rul", rul);
                parseObject.put("RMS", RMS);
                parseObject.put("VAR", VAR);
                parseObject.put("peak", peak);
                parseObject.put("CrestIndex", CrestIndex);
                parseObject.put("peakpeak", peakpeak);
                parseObject.put("MarginIndex", MarginIndex);
                parseObject.put("SkewNess", SkewNess);
                parseObject.put("SkewnessIndex", SkewnessIndex);
                parseObject.put("kurtosis", kurtosis);
                parseObject.put("KurtosisIndex", KurtosisIndex);
                parseObject.put("InpulseIndex", InpulseIndex);
                parseObject.put("WaveformIndex", WaveformIndex);
                // Field names must match the mapping exactly ("rawdata"/"spectrum");
                // "RawData"/"Spectrum" would be dynamically mapped as indexed text
                // fields, silently defeating the "index": false setting above.
                parseObject.put("rawdata", raw_data);
                parseObject.put("spectrum", spectrum);

                request.add(new IndexRequest("newtest", "doc")
                        .source(parseObject));
            }

             BulkResponse bulkResponse = request.execute().get();

            System.out.println(j);

        }
        client.close();
    }
    catch (Exception e) {
        System.out.println("cluster error!");
        e.printStackTrace();
        exit(2);
    }

Hi everyone, I am having an issue with slow inserting. Specifically, each document I want to insert includes very long strings, around 1,000,000 characters (see the code above for details). I use a bulk size of 5 (more causes memory issues), and I map those long string fields with "index": false. Still, when testing insertion on a 3-node cluster (Intel Xeon 4-core, HDD, 16 GB RAM), the speed is around 30-50 such documents per second, which is very slow. I know from searching online that there are many configs and settings I could tune, but I am curious: given the current setup, what is the estimated maximum insert speed? Is it possible to get to, say, 10k documents per second, or is that simply not possible because something else is the bottleneck? Thank you so much. My company is dealing with this issue, and I have totally no idea where to start.
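A rough back-of-envelope estimate (my own calculation from the numbers in the post, not a measurement) suggests the raw payload size may already be the bottleneck, regardless of indexing settings. The sketch below assumes roughly 1 byte per ASCII character in the JSON body and ignores the numeric fields, which are negligible next to the two long strings:

```java
// Back-of-envelope throughput estimate for the workload described above.
public class ThroughputEstimate {
    public static void main(String[] args) {
        long rawDataChars = 100_000L * 10;  // "aaaaaaaaaa" appended 100,000 times
        long spectrumChars = 50_000L * 10;  // "bbbbbbbbbb" appended 50,000 times
        long bytesPerDoc = rawDataChars + spectrumChars;

        double mbPerDoc = bytesPerDoc / 1_000_000.0;
        double observedDocsPerSec = 50.0;   // upper end of the reported 30-50 docs/s
        double observedMbPerSec = mbPerDoc * observedDocsPerSec;

        System.out.println("bytes per doc: " + bytesPerDoc);       // 1500000
        System.out.println("observed MB/s: " + observedMbPerSec);  // 75.0
    }
}
```

Under these assumptions, 30-50 docs/s is already roughly 45-75 MB/s of document payload, which is in the neighborhood of a single HDD's sequential write bandwidth and not far from saturating a gigabit link (~125 MB/s). By the same arithmetic, 10k docs/s would mean about 15 GB/s, so on this hardware, shrinking the documents (for example, storing the raw arrays outside Elasticsearch) is likely to matter far more than tuning settings.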


(system) #2

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.