Building Elastic machine learning from source code

Hello everybody,

I have started building Elasticsearch and Kibana from source. Could someone please tell me how to integrate the machine learning code (https://github.com/elastic/ml-cpp) with Elasticsearch (https://github.com/elastic/elasticsearch)?

Thanks for your help

You have to set up a directory structure like this:

elasticsearch
elasticsearch-extra/ml-cpp

In other words, the elasticsearch-extra directory is at the same level as elasticsearch , which is a clone of the elasticsearch repo. Then ml-cpp needs to be a sub-directory of elasticsearch-extra . ml-cpp in this structure can be a symlink to wherever you currently have it cloned, which makes things easier.
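
As a minimal sketch of that layout, the helper below creates elasticsearch-extra next to an existing elasticsearch clone and symlinks ml-cpp into it. The function name link_ml_cpp and the example paths are hypothetical, and it assumes you pass an absolute path to your ml-cpp clone so the symlink resolves from anywhere.

```shell
# Sketch: lay out the workspace so the Elasticsearch build can find ml-cpp.
# Arguments (hypothetical paths, adjust to your machine):
#   $1  directory that already contains the "elasticsearch" clone
#   $2  absolute path of your existing ml-cpp clone
link_ml_cpp() {
    workspace=$1
    ml_cpp_clone=$2
    if [ ! -d "$workspace/elasticsearch" ]; then
        echo "expected an elasticsearch clone in $workspace" >&2
        return 1
    fi
    # elasticsearch-extra sits at the same level as elasticsearch
    mkdir -p "$workspace/elasticsearch-extra"
    # -s: symbolic link, -f: replace an existing link,
    # -n: do not follow an existing link that points at a directory
    ln -sfn "$ml_cpp_clone" "$workspace/elasticsearch-extra/ml-cpp"
}

# Example (hypothetical paths):
#   link_ml_cpp "$HOME/ELK" "$HOME/ELK/ml-cpp"
```

After running it, ls -l on elasticsearch-extra should show ml-cpp pointing at your clone.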

Then, when you run ./gradlew assemble in the elasticsearch directory it will build the C++ locally instead of downloading a pre-built bundle from S3. This allows you to incorporate local C++ changes into a local Elasticsearch build.

Before you can build the C++ at all you will need to set your machine up using the appropriate set of instructions from https://github.com/elastic/ml-cpp/tree/master/build-setup

Thanks for your reply. I followed the instructions, and when I run Elasticsearch I can see that it picks up the machine learning repo, so the integration is working.
The problem now is that when I run make -j 7 test in the ml-cpp repo, or when I try to run Elasticsearch, I get this error:

*** 1 failure is detected in the test module "lib.maths"
make[3]: *** [/home/halim/ELK/ml-cpp/mk/stdboosttest.mk:41: test] Error 201
make[3]: Leaving directory '/home/halim/ELK/ml-cpp/lib/maths/unittest'
/home/halim/ELK/ml-cpp/lib/maths make test FAILURE!!!
make[2]: *** [/home/halim/ELK/ml-cpp/mk/dynamiclib.mk:35: test] Error 1
make[2]: Leaving directory '/home/halim/ELK/ml-cpp/lib/maths'
make[1]: *** [/home/halim/ELK/ml-cpp/mk/toplevel.mk:105: test] Error 1
make[1]: Leaving directory '/home/halim/ELK/ml-cpp/lib'
make: *** [/home/halim/ELK/ml-cpp/mk/toplevel.mk:105: test] Error 1

Look in /home/halim/ELK/ml-cpp/lib/maths/unittest/boost_test_results.xml. That will show which test failed.
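
If the XML file is long, a quick filter can pull out just the failed checks. This is a rough sketch: the function name list_boost_failures is made up for illustration, and it assumes failures appear as Error or FatalError elements carrying file and line attributes (FatalError is the form quoted in this thread; Error is an assumption for non-critical checks).

```shell
# Sketch: print the kind, file and line of every failed check recorded in a
# Boost.Test XML results file.
# Assumption: failures are logged as <Error ...> or <FatalError ...> elements
# with file="..." and line="..." attributes.
list_boost_failures() {
    results_file=$1
    # -o prints only the matching part of each line; the sed call drops the
    # leading "<" so the output reads like:
    #   FatalError file="CBayesianOptimisationTest.cc" line="318"
    grep -o -E '<(Error|FatalError) file="[^"]*" line="[0-9]+"' "$results_file" \
        | sed 's/^<//'
}

# Example:
#   list_boost_failures lib/maths/unittest/boost_test_results.xml
```

The failure message itself follows each element in the file, so searching the file for the reported line number shows the full assertion text.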

The failure is reported at this line:

FatalError file="CBayesianOptimisationTest.cc" line="318"
critical check improvementBopt > improvementRs has failed [0 <= 0.24690042419103589]

You can run that test individually like this (assuming you previously ran make test so all the unit test code is built):

cd lib/maths/unittest
./ml_test --run_test=CBayesianOptimisationTest/testMaximumExpectedImprovement

This is what I see when I do that:

Running 1 test case...
2020-07-29 08:12:03,855221 UTC [4405] DEBUG CTestObserver.cc@23 +------------------------------------------------------------+
2020-07-29 08:12:03,855313 UTC [4405] DEBUG CTestObserver.cc@24 |  CBayesianOptimisationTest/testMaximumExpectedImprovement  |
2020-07-29 08:12:03,855323 UTC [4405] DEBUG CTestObserver.cc@25 +------------------------------------------------------------+
2020-07-29 08:12:04,364666 UTC [4405] DEBUG CBayesianOptimisationTest.cc@317 % improvement BO = 60.0624, % improvement RS = 35.4576
2020-07-29 08:12:04,925346 UTC [4405] DEBUG CBayesianOptimisationTest.cc@317 % improvement BO = 13.7483, % improvement RS = 0
2020-07-29 08:12:05,408619 UTC [4405] DEBUG CBayesianOptimisationTest.cc@317 % improvement BO = 8.06812, % improvement RS = 4.17033
2020-07-29 08:12:06,432775 UTC [4405] DEBUG CBayesianOptimisationTest.cc@317 % improvement BO = 92.8491, % improvement RS = 71.9993
2020-07-29 08:12:06,887594 UTC [4405] DEBUG CBayesianOptimisationTest.cc@317 % improvement BO = 39.7416, % improvement RS = 14.5358
2020-07-29 08:12:07,538218 UTC [4405] DEBUG CBayesianOptimisationTest.cc@317 % improvement BO = 80.5928, % improvement RS = 59.4046
2020-07-29 08:12:07,992483 UTC [4405] DEBUG CBayesianOptimisationTest.cc@317 % improvement BO = 41.0946, % improvement RS = 24.69
2020-07-29 08:12:08,441352 UTC [4405] DEBUG CBayesianOptimisationTest.cc@317 % improvement BO = 54.3243, % improvement RS = 17.3618
2020-07-29 08:12:08,980578 UTC [4405] DEBUG CBayesianOptimisationTest.cc@317 % improvement BO = 33.8824, % improvement RS = 25.7448
2020-07-29 08:12:09,472610 UTC [4405] DEBUG CBayesianOptimisationTest.cc@317 % improvement BO = 94.8113, % improvement RS = 32.3035
2020-07-29 08:12:10,072331 UTC [4405] DEBUG CBayesianOptimisationTest.cc@317 % improvement BO = 71.0107, % improvement RS = 62.1166
2020-07-29 08:12:10,405810 UTC [4405] DEBUG CBayesianOptimisationTest.cc@317 % improvement BO = 39.8728, % improvement RS = 27.7905
2020-07-29 08:12:11,071414 UTC [4405] DEBUG CBayesianOptimisationTest.cc@317 % improvement BO = 12.8327, % improvement RS = 0
2020-07-29 08:12:11,529477 UTC [4405] DEBUG CBayesianOptimisationTest.cc@317 % improvement BO = 29.7552, % improvement RS = 20.5083
2020-07-29 08:12:12,036492 UTC [4405] DEBUG CBayesianOptimisationTest.cc@317 % improvement BO = 85.1209, % improvement RS = 29.2093
2020-07-29 08:12:12,561507 UTC [4405] DEBUG CBayesianOptimisationTest.cc@317 % improvement BO = 45.6663, % improvement RS = 25.6787
2020-07-29 08:12:13,041572 UTC [4405] DEBUG CBayesianOptimisationTest.cc@317 % improvement BO = 70.5058, % improvement RS = 50.2083
2020-07-29 08:12:13,529508 UTC [4405] DEBUG CBayesianOptimisationTest.cc@317 % improvement BO = 74.2169, % improvement RS = 38.4275
2020-07-29 08:12:14,109159 UTC [4405] DEBUG CBayesianOptimisationTest.cc@317 % improvement BO = 47.1911, % improvement RS = 11.6442
2020-07-29 08:12:14,603419 UTC [4405] DEBUG CBayesianOptimisationTest.cc@317 % improvement BO = 33.9187, % improvement RS = 0
2020-07-29 08:12:14,603456 UTC [4405] DEBUG CBayesianOptimisationTest.cc@324 mean % improvement BO = 51.4633
2020-07-29 08:12:14,603464 UTC [4405] DEBUG CBayesianOptimisationTest.cc@326 mean % improvement RS = 27.5626
2020-07-29 08:12:14,603509 UTC [4405] INFO  CTestObserver.cc@35 Unit test timing - CBayesianOptimisationTest/testMaximumExpectedImprovement took 10748ms

You will see the test failure error message at some point during that output.

Did you change any code? If you did then that may be the cause of your failure.

If you didn't change any code then the problem may be related to the exact platform you're running on. What is your OS, OS version and hardware architecture? For example, if you are trying to get the ML code to compile on Linux on s390x, PowerPC or Raspberry Pi, then differences in the floating point implementation on those hardware architectures would not be surprising.
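
One way to collect those details on Linux is sketched below. The function name platform_summary is hypothetical; it relies only on uname and, where present, the PRETTY_NAME field of /etc/os-release.

```shell
# Sketch: print the kernel, hardware architecture and distribution name.
platform_summary() {
    echo "kernel: $(uname -s) $(uname -r)"
    echo "arch:   $(uname -m)"
    if [ -r /etc/os-release ]; then
        # Source the file in a subshell so its variables do not leak into
        # the calling shell.
        ( . /etc/os-release && echo "distro: ${PRETTY_NAME:-unknown}" )
    fi
}

# Example:
#   platform_summary
```

CPU feature flags, if needed, can be read from /proc/cpuinfo on Linux.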

Thanks for your reply,

Here is the output I get when I run the command:
./ml_test --run_test=CBayesianOptimisationTest/testMaximumExpectedImprovement


Running 1 test case...
2020-07-29 08:26:12,204855 UTC [15652] DEBUG CTestObserver.cc@23 +------------------------------------------------------------+
2020-07-29 08:26:12,204924 UTC [15652] DEBUG CTestObserver.cc@24 |  CBayesianOptimisationTest/testMaximumExpectedImprovement  |
2020-07-29 08:26:12,204936 UTC [15652] DEBUG CTestObserver.cc@25 +------------------------------------------------------------+
2020-07-29 08:26:12,651002 UTC [15652] DEBUG CBayesianOptimisationTest.cc@316 % improvement BO = 60.0783, % improvement RS = 35.4576
2020-07-29 08:26:13,092090 UTC [15652] DEBUG CBayesianOptimisationTest.cc@316 % improvement BO = 12.7077, % improvement RS = 0
2020-07-29 08:26:13,506686 UTC [15652] DEBUG CBayesianOptimisationTest.cc@316 % improvement BO = 8.06812, % improvement RS = 4.17033
2020-07-29 08:26:14,370432 UTC [15652] DEBUG CBayesianOptimisationTest.cc@316 % improvement BO = 92.8491, % improvement RS = 71.9993
2020-07-29 08:26:14,720186 UTC [15652] DEBUG CBayesianOptimisationTest.cc@316 % improvement BO = 39.7416, % improvement RS = 14.5358
2020-07-29 08:26:15,231838 UTC [15652] DEBUG CBayesianOptimisationTest.cc@316 % improvement BO = 80.4244, % improvement RS = 59.4046
2020-07-29 08:26:15,740521 UTC [15652] DEBUG CBayesianOptimisationTest.cc@316 % improvement BO = 0, % improvement RS = 24.69
CBayesianOptimisationTest.cc(318): fatal error: in "CBayesianOptimisationTest/testMaximumExpectedImprovement": critical check improvementBopt > improvementRs has failed [0 <= 0.24690042419103589]
2020-07-29 08:26:15,740848 UTC [15652] INFO  CTestObserver.cc@35 Unit test timing - CBayesianOptimisationTest/testMaximumExpectedImprovement took 3535ms

*** 1 failure is detected in the test module "lib.maths"

I didn't change the source code.
Here is the information about my machine:

OS information:
Ubuntu 20.04.1 LTS
64 bits
gnome version: 3.36.3

CPU information
Intel(R) Core(TM) i7-8565U CPU @ 1.80GHz
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb invpcid_single ssbd ibrs ibpb stibp ibrs_enhanced tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid mpx rdseed adx smap clflushopt intel_pt xsaveopt xsavec xgetbv1 xsaves dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp md_clear flush_l1d arch_capabilities

Thanks for the information. Maybe you have found a bug. I have opened https://github.com/elastic/ml-cpp/issues/1440 to get it investigated.
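
The two runs quoted above already differ in the first reported value (60.0624 versus 60.0783), well before the seventh iteration where the check fails. As a rough sketch, saving each run to a file and comparing only the "% improvement" payloads finds the first point of divergence without being distracted by timestamps, process IDs or source line numbers. The function names and file names here are hypothetical.

```shell
# Sketch: reduce a saved ml_test log to its "% improvement ..." payloads,
# dropping the timestamp, PID and file@line prefix.
improvement_lines() {
    sed -n -E 's/^.* DEBUG +[^ ]+@[0-9]+ (% improvement .*)$/\1/p' "$1"
}

# Print the number and both versions of the first iteration that differs
# between two saved logs. Prints nothing if the payloads are identical.
first_divergence() {
    norm_a=$(mktemp) || return 1
    norm_b=$(mktemp) || return 1
    improvement_lines "$1" > "$norm_a"
    improvement_lines "$2" > "$norm_b"
    # Join the files line by line with "|" and report the first mismatch.
    paste -d '|' "$norm_a" "$norm_b" \
        | awk -F '|' '$1 != $2 { print NR ": " $1 " <> " $2; exit }'
    rm -f "$norm_a" "$norm_b"
}

# Example (hypothetical file names):
#   ./ml_test --run_test=CBayesianOptimisationTest/testMaximumExpectedImprovement > mine.log 2>&1
#   first_divergence reference.log mine.log
```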

In the meantime, if you change line 236 of CBayesianOptimisationTest.cc from:

BOOST_AUTO_TEST_CASE(testMaximumExpectedImprovement) {

to:

BOOST_AUTO_TEST_CASE(testMaximumExpectedImprovement, *boost::unit_test::disabled()) {

then you should be able to run the rest of the unit tests and see if anything else fails on your machine.
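
Line numbers can drift between checkouts (the logs in this thread report the same check at lines 316 and 317), so matching on the test name may be more robust than relying on line 236. The helper below is a sketch of that edit; the function name disable_boost_test is hypothetical, and it assumes the test is declared exactly in the BOOST_AUTO_TEST_CASE(name) { form shown above.

```shell
# Sketch: add the disabled() decorator to one Boost test case, located by name.
# Arguments:
#   $1  source file, e.g. lib/maths/unittest/CBayesianOptimisationTest.cc
#   $2  test case name, e.g. testMaximumExpectedImprovement
disable_boost_test() {
    src_file=$1
    test_name=$2
    tmp_file=$(mktemp) || return 1
    # Parentheses and "{" are literal in a basic regular expression, and "*"
    # is literal in the replacement text.
    if sed "s/BOOST_AUTO_TEST_CASE($test_name) {/BOOST_AUTO_TEST_CASE($test_name, *boost::unit_test::disabled()) {/" \
        "$src_file" > "$tmp_file"; then
        # Write back through cat so the original file keeps its permissions.
        cat "$tmp_file" > "$src_file"
    fi
    rm -f "$tmp_file"
}

# Example:
#   disable_boost_test lib/maths/unittest/CBayesianOptimisationTest.cc testMaximumExpectedImprovement
```

Running it a second time leaves the file unchanged, because the pattern no longer matches once the decorator is present.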

Thanks for your help. I made the change you suggested, but unfortunately I still get the same error.

You would need to rebuild the unit test code after making that change, i.e. just type make in the lib/maths/unittest directory.

If you did that then maybe a different test is failing now?

I rebuilt the unit tests by running make in lib/maths/unittest after making the change in CBayesianOptimisationTest.cc, but I still have the same issue:

Running 1 test case...
2020-07-29 09:46:31,885895 UTC [20357] DEBUG CTestObserver.cc@23 +------------------------------------------------------------+
2020-07-29 09:46:31,885945 UTC [20357] DEBUG CTestObserver.cc@24 |  CBayesianOptimisationTest/testMaximumExpectedImprovement  |
2020-07-29 09:46:31,885958 UTC [20357] DEBUG CTestObserver.cc@25 +------------------------------------------------------------+
2020-07-29 09:46:32,330315 UTC [20357] DEBUG CBayesianOptimisationTest.cc@317 % improvement BO = 60.0783, % improvement RS = 35.4576
2020-07-29 09:46:32,766911 UTC [20357] DEBUG CBayesianOptimisationTest.cc@317 % improvement BO = 12.7077, % improvement RS = 0
2020-07-29 09:46:33,179550 UTC [20357] DEBUG CBayesianOptimisationTest.cc@317 % improvement BO = 8.06812, % improvement RS = 4.17033
2020-07-29 09:46:34,047593 UTC [20357] DEBUG CBayesianOptimisationTest.cc@317 % improvement BO = 92.8491, % improvement RS = 71.9993
2020-07-29 09:46:34,409967 UTC [20357] DEBUG CBayesianOptimisationTest.cc@317 % improvement BO = 39.7416, % improvement RS = 14.5358
2020-07-29 09:46:34,932113 UTC [20357] DEBUG CBayesianOptimisationTest.cc@317 % improvement BO = 80.4244, % improvement RS = 59.4046
2020-07-29 09:46:35,330756 UTC [20357] DEBUG CBayesianOptimisationTest.cc@317 % improvement BO = 0, % improvement RS = 24.69
CBayesianOptimisationTest.cc(319): fatal error: in "CBayesianOptimisationTest/testMaximumExpectedImprovement": critical check improvementBopt > improvementRs has failed [0 <= 0.24690042419103589]
2020-07-29 09:46:35,330996 UTC [20357] INFO  CTestObserver.cc@35 Unit test timing - CBayesianOptimisationTest/testMaximumExpectedImprovement took 3444ms

*** 1 failure is detected in the test module "lib.maths"

The change is to disable that test by default, but you are still running it explicitly.

Does it work now if you just run:

cd lib/maths
make test

That should run all the tests that are enabled by default, and the one that has the problem should be skipped.

If that works then running make -j 7 test from the top level of the repo will get further than it did before.

Thanks, it now works when I run make test in the lib/maths directory.
I will try make -j 7 test from the top of the repo and keep you posted.

Now when I run make -j 7 test at the top level of the repo, I get these errors:

*** 11 failures are detected in the test module "lib.api"
make[4]: *** [/home/halim/ELK/elasticsearch-extra/ml-cpp/mk/stdboosttest.mk:41: test] Error 201
make[4]: Leaving directory '/home/halim/ELK/ml-cpp/lib/api/unittest'
/home/halim/ELK/ml-cpp/lib/api make test FAILURE!!!
make[3]: *** [/home/halim/ELK/elasticsearch-extra/ml-cpp/mk/dynamiclib.mk:35: test] Error 1
make[3]: Leaving directory '/home/halim/ELK/ml-cpp/lib/api'
make[2]: *** [/home/halim/ELK/elasticsearch-extra/ml-cpp/mk/toplevel.mk:105: test] Error 1
make[2]: Leaving directory '/home/halim/ELK/ml-cpp/lib/api'
make[1]: *** [/home/halim/ELK/elasticsearch-extra/ml-cpp/mk/toplevel.mk:105: test] Error 1
make[1]: Leaving directory '/home/halim/ELK/ml-cpp/lib'
make: *** [/home/halim/ELK/elasticsearch-extra/ml-cpp/mk/toplevel.mk:105: test] Error 1

And when I try to run Elasticsearch, I get this exception:

org.elasticsearch.bootstrap.StartupException: org.elasticsearch.ElasticsearchException: Failure running machine learning native code. This could be due to running on an unsupported OS or distribution, missing OS libraries, or a problem with the temp directory. To bypass this problem by running Elasticsearch without machine learning functionality set [xpack.ml.enabled: false].

Presumably whatever caused the difference you are observing in the maths library code is having a greater effect on the downstream code that uses it.

If you run lib/api/unittest/ml_test --report_sink=report.txt and then look in report.txt you should see the names of the 11 tests that failed and which assertion failed in each.

Hopefully there is just a single underlying problem in the code that is responsible for all of these so that when we fix https://github.com/elastic/ml-cpp/issues/1440 they will all go away.
