Workplace Search / Salesforce connector missing data

Hi there,

I installed Enterprise Search 7.9.1 (Workplace) with Elasticsearch 7.9.0 on a Mac (localhost).
I was able to connect the Github and Dropbox connector and retrieve the data in the index. I also activated the Salesforce connector and retrieved Salesforce data.
According to the user guide, the SF connector will index: Contacts, Opportunities, Leads, Accounts, Campaigns and Attachments.
The problem: the index has everything except Opportunities and Attachments. Those do exist on the SF website (Lightning).
A) What can I do to fix that?

B) In general, is there a limit on the number of documents that will be indexed? The Github connector shows that it indexed 2800 documents, but I really have the feeling that I have much more data on Github.

C) How can I install Kibana and have it work on the same index?

D) How can I customize the Index and the default Workplace Search UI? I would like to add more fields in the index that I can use to generate facets. (Thus customize the default connectors e.g. Salesforce and add a new field to index).

E) How can I interact with the Index using the (dev) console?

Thx for any ideas!

Hi @Gertjan, thanks for the interest in Elastic Workplace Search!

A) There is a known issue with Lightning attachments. Salesforce represents them differently in the backend between Classic and Lightning. We hope to fix this discrepancy so that you can index either with Workplace Search. Your problem with Opportunities is not one we've seen. Do you have any error messages in your logs? Have you verified that the user used to connect to Salesforce through Workplace Search has permission to view the missing opportunities?

B) There is a limit, per sync job, of 10,000,000 documents. So that's not your issue with your Github source. Remember, the Github source isn't indexing your source code - only the Repositories, Issues, Pull Requests, and Organizations. Make sure that when you connected, you chose all of the Organizations that you want synced. If you only chose one of many, you will only have synced the one.

C) With Elasticsearch on localhost, you could also install Kibana on localhost. All of the Enterprise Search indexes start with .ent-search*. However, these are subject to change, and are not intended to be directly operated on by a user - you should use Workplace Search to apply any changes. Is there something in particular you're wanting to view/do with Kibana?

D) You cannot currently customize the fields retrieved by the out-of-the-box connectors (like Salesforce). If you want custom fields, I suggest you take a look into the Custom API Sources. If you have a support relationship with Elastic, I'd also encourage you to file an enhancement request for support of Salesforce Custom Objects/Fields.
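
As a rough illustration of the Custom API Source route, here is a minimal Python sketch that builds (but does not send) a bulk indexing request. It assumes Enterprise Search is listening on localhost:3002 and that the custom source accepts documents at the /api/ws/v1/sources/[KEY]/documents/bulk_create endpoint; check both against the documentation for your version. The key, the token and every field name other than "id" are placeholders made up for this example.

```python
import json
import urllib.request

# Assumptions: Enterprise Search on localhost:3002; key and token are
# placeholders to be copied from the custom source's page in Workplace Search.
BASE_URL = "http://localhost:3002"
CONTENT_SOURCE_KEY = "YOUR_CONTENT_SOURCE_KEY"
ACCESS_TOKEN = "YOUR_ACCESS_TOKEN"


def build_bulk_create_request(documents):
    """Build (but do not send) a bulk_create request for a custom source."""
    url = f"{BASE_URL}/api/ws/v1/sources/{CONTENT_SOURCE_KEY}/documents/bulk_create"
    return urllib.request.Request(
        url,
        data=json.dumps(documents).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {ACCESS_TOKEN}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical opportunity document; all fields except "id" are illustrative.
opportunity = {
    "id": "opp-0001",
    "title": "Example opportunity",
    "stage": "Prospecting",
    "account": "Example account",
}

request = build_bulk_create_request([opportunity])
# To actually send it: urllib.request.urlopen(request)
```

With this approach you fetch the Salesforce records yourself and decide which fields (for example a stage field) end up in the documents.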

E) Again, all the indexes are prefixed with .ent-search, but I recommend that you not take this approach as the indexes and their contents are subject to change without notice. If you explain what you're wanting to do, perhaps we can help you in a more supportable fashion.

Thanks,

Sean

Hi Sean,
Thanks for the reply.

I checked the logs and found some problems.
Salesforce:

[2020-09-18T18:26:11.916+00:00][52589][2446][connectors][INFO]: [Job 5f64fbbf39b13222c8f7a5aa] Beginning to work on purge job service_type=salesforce; number of purge ids=1838; cursors={}
[2020-09-18T18:26:11.985+00:00][52589][2642][connectors][INFO]: [Job 5f64fbbf39b13222c8f7a5aa] Successfully updated status to enqueued
[2020-09-18T18:26:12.099+00:00][52589][2642][connectors][INFO]: [Job 5f64fbbf39b13222c8f7a5aa] Successfully updated status to working
[2020-09-18T18:26:12.678+00:00][52589][2446][connectors][WARN]: ContentSource[5f64faf539b132fa19f75181, salesforce]: #yield_deleted_ids not implemented yet for Salesforce
[2020-09-18T18:26:13.311+00:00][52589][2446][connectors][WARN]: ContentSource[5f64faf539b132fa19f75181, salesforce]: #yield_deleted_ids not implemented yet for Salesforce
[2020-09-18T18:26:13.679+00:00][52589][2446][connectors][WARN]: ContentSource[5f64faf539b132fa19f75181, salesforce]: #yield_deleted_ids not implemented yet for Salesforce

This warning continues with every loop, but I can't find an error within a Salesforce job.
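
For anyone else digging through these logs, a small Python sketch like the following can help separate the WARN/ERROR entries from the INFO noise. It only assumes the entry layout visible in the excerpts above ([timestamp][pid][thread][component][LEVEL]: message); the function name and the sample text are made up for illustration.

```python
import re
from collections import Counter

# Header of one log entry, as seen in the excerpts above:
# [timestamp][pid][thread][component][LEVEL]: message
ENTRY = re.compile(
    r"\[(?P<ts>\d{4}-\d{2}-\d{2}T[^\]]+)\]"
    r"\[(?P<pid>\d+)\]\[(?P<tid>\d+)\]"
    r"\[(?P<component>[^\]]+)\]\[(?P<level>[A-Z]+)\]: "
)


def parse_entries(text):
    """Split raw log text into dicts, one per entry, even if entries share a line."""
    matches = list(ENTRY.finditer(text))
    entries = []
    for i, m in enumerate(matches):
        # The message runs until the next entry header (or the end of the text).
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        entry = m.groupdict()
        entry["message"] = text[m.end():end].strip()
        entries.append(entry)
    return entries


sample = (
    "[2020-09-18T18:26:12.099+00:00][52589][2642][connectors][INFO]: "
    "[Job 5f64fbbf39b13222c8f7a5aa] Successfully updated status to working "
    "[2020-09-18T18:26:12.678+00:00][52589][2446][connectors][WARN]: "
    "ContentSource[5f64faf539b132fa19f75181, salesforce]: "
    "#yield_deleted_ids not implemented yet for Salesforce"
)

entries = parse_entries(sample)
levels = Counter(e["level"] for e in entries)
problems = [e for e in entries if e["level"] in ("WARN", "ERROR")]
```

Here `levels` gives a count per log level and `problems` keeps only the entries worth reading.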
The salesforce connector user has admin rights and I can see all the opportunities within Salesforce with that user.
When I search an SF account in the Workplace UI and I select the account name, a 'quick view'/summary of the Account appears and in that section I see an opportunity and the status. However, it shows per account only 1 random opportunity.

I would like to see all the opportunities as facets at the left side.

Github connector shows errors: Abuse, Throttling

[2020-09-21T15:04:42.858+00:00][52589][2404][connectors][ERROR]: Encountered error in Index: '{:job_id=>BSON::ObjectId('5f68c10539b13213b6fa4b21'), :content_source_id=>BSON::ObjectId('5f637bc739b132544bf68f63'), :org_id=>BSON::ObjectId('5f62e06839b1328434d43dc0'), :config=>{:service_type=>"github", :cursors=>{}, :user_repositories=>[], :organizations=>{"*xxxx*"=>{:repositories=>[]}}}}'
/Users/xxxx/enterprise-search-7.9.1/lib/war/gems/gems/octokit-4.6.2/lib/octokit/response/raise_error.rb:16:in 'on_complete': GET https://api.github.com/orgs/*XXXX*: 403 - You have triggered an abuse detection mechanism. Please wait a few minutes before you try again. // See: https://developer.github.com/v3/#abuse-rate-limits (Octokit::AbuseDetected)

[2020-09-21T15:09:34.622+00:00][52589][2642][connectors][WARN]: [Job 5f680f0839b132d347f9dce6] Tried to remove job that does not exist!
[2020-09-21T15:09:34.622+00:00][52589][2642][connectors][INFO]: [Job 5f680f0839b132d347f9dce6] Successfully updated status to error
[2020-09-21T15:10:01.696+00:00][52589][2430][connectors][ERROR]: Encountered error in Purge: '{:job_id=>BSON::ObjectId('5f680f0839b132d347f9dce6'), :content_source_id=>BSON::ObjectId('5f637bc739b132544bf68f63'), :org_id=>BSON::ObjectId('5f62e06839b1328434d43dc0'), :config=>{:service_type=>"github", :cursors=>{}, :user_repositories=>[], :organizations=>{"xxxx"=>{:repositories=>[]}}}}'
xxxx/enterprise-search-7.9.1/lib/war/connectors/lib/connectors/content_sources/github/extractor.class:94:in 'convert_rate_limit_errors': Connectors::ThrottlingError (Connectors::ThrottlingError)
  from /Users/xxxx/enterprise-search-7.9.1/lib/war/connectors/lib/connectors/content_sources/base/extractor.class:95:in 'block in with_auth_tokens_and_retry'
  from /Users/xxxx/enterprise-search-7.9.1/lib/war/connectors/lib/connectors/content_sources/base/extractor.class:189:in 'convert_transient_server_errors'
  from /xxxx/enterprise-search-7.9.1/lib/war/connectors/lib/connectors/content_sources/base/extractor.class:94:in 'with_auth_tokens_and_retry'
  from /xxxx/enterprise-search-7.9.1/lib/war/connectors/lib/connectors/content_sources/base/extractor.class:163:in 'block in deleted_ids'

and

[2020-09-21T15:10:42.635+00:00][52589][2448][connectors][INFO]: [Job 5f68c26e39b1325fa9fa6a06] Worker *xxx*.local is attempting to claim job
[2020-09-21T15:10:42.700+00:00][52589][2432][connectors][INFO]: [Job 5f68c26e39b1325fa9fa6a06] Beginning to work on indexing job; service_type=github; cursors={}
[2020-09-21T15:10:42.759+00:00][52589][2642][connectors][INFO]: [Job 5f68c26e39b1325fa9fa6a06] Successfully updated status to enqueued
[2020-09-21T15:10:42.820+00:00][52589][2642][connectors][INFO]: [Job 5f68c26e39b1325fa9fa6a06] Successfully updated status to working
[2020-09-21T15:10:43.190+00:00][52589][2432][connectors][INFO]: [Job 5f68c26e39b1325fa9fa6a06] service type: github was suspended until 2020-09-21 15:22:18 +0000 but made no progress with cursors: {}
[2020-09-21T15:10:43.194+00:00][52589][2432][connectors][INFO]: [Job 5f68c26e39b1325fa9fa6a06] Suspending work on job until 2020-09-21 15:22:18 +0000
[2020-09-21T15:10:43.253+00:00][52589][2432][connectors][INFO]: [Job 5f68c26e39b1325fa9fa6a06] Batch of documents was empty. Skipping.
[2020-09-21T15:10:43.257+00:00][52589][2432][connectors][INFO]: [Job 5f68c26e39b1325fa9fa6a06] Batch of documents was empty. Skipping.
[2020-09-21T15:10:43.262+00:00][52589][2432][connectors][INFO]: [Job 5f68c26e39b1325fa9fa6a06] Batch of user fields was empty. Skipping.

[2020-09-21T15:12:57.892+00:00][52589][2432][connectors][ERROR]: Encountered error in Purge: '{:job_id=>BSON::ObjectId('5f68c10539b13213b6fa4b22'), :content_source_id=>BSON::ObjectId('5f637bc739b132544bf68f63'), :org_id=>BSON::ObjectId('5f62e06839b1328434d43dc0'), :config=>{:service_type=>"github", :cursors=>{}, :user_repositories=>[], :organizations=>{"*xxxx*"=>{:repositories=>[]}}}}'
/Users/xxx/enterprise-search-7.9.1/lib/war/connectors/lib/connectors/content_sources/github/extractor.class:94:in 'convert_rate_limit_errors': Connectors::ThrottlingError (Connectors::ThrottlingError)
  from /Users/xxx/Downloads/enterprise-search-7.9.1/lib/war/connectors/lib/connectors/content_sources/base/extractor.class:95:in 'block in with_auth_tokens_and_retry'
  from /Users/xxx/enterprise-search-7.9.1/lib/war/connectors/lib/connectors/content_sources/base/extractor.class:189:in 'convert_transient_server_errors'
  from /Users/xxx/enterprise-search-7.9.1/lib/war/connectors/lib/connectors/content_sources/base/extractor.class:94:in 'with_auth_tokens_and_retry'
  from /Users/xxxx/enterprise-search-7.9.1/lib/war/connectors/lib/connectors/content_sources/base/extractor.class:163:in 'block in deleted_ids'
  from /Users/xxx/enterprise-search-7.9.1/lib/war/gems/gems/statsd-instrument-2.1.1/lib/statsd/instrument.rb:284:in 'block in measure'
  from /Users/xxxx/enterprise-search-7.9.1/lib/war/gems/gems/statsd-instrument-2.1.1/lib/statsd/instrument.rb:53:in 'duration'
  from /Users/xxxx/enterprise-search-7.9.1/lib/war/gems/gems/statsd-instrument-2.1.1/lib/statsd/instrument.rb:284:in 'measure'
  from /Users/xxxx/enterprise-search-7.9.1/lib/war/connectors/lib/connectors/stats.class:4:in 'measure'
  from /Users/xxxxs/enterprise-search-7.9.1/lib/war/connectors/lib/connectors/content_sources/base/extractor.class:162:in 'deleted_ids'
  from /Users/xxxx/enterprise-search-7.9.1/lib/war/connectors/lib/connectors/work/purge.class:15:in 'block in run'
  from org/jruby/RubyYielder.java:117:in 'yield'
  from org/jruby/RubyYielder.java:122:in '<<'
  from uri:classloader:/jruby/kernel/enumerator.rb:210:in 'block in map'
  from uri:classloader:/jruby/kernel/enumerator.rb:124:in 'block in initialize'
  from /Users/adm1n1strator/Downloads/enterprise-search-7.9.1/lib/war/lib/swiftype/es/index.class:328:in 'block in search_after'
  from org/jruby/RubyArray.java:1814:in 'each'

(Sorry, I had problems formatting the logs.)

Elastic Workplace Search is great for quickly setting up a search experience by connecting and retrieving data from our own data sources. Now I want to further customize the experience.
I installed Kibana because I would like to use the Dev Tools console to get easy access to the index and to try creating some dashboards.

Would it be possible to create a pipeline that processes the data from the out-of-the-box connectors before it is ingested into the index?

Thanks, Gertjan

Thanks for pointing out the #yield_deleted_ids warning. That should be a DEBUG statement rather than a WARN. I'll get that fixed, but it shouldn't be impacting anything for you; it's just too verbose.

Upon digging into the missing Opportunities, there's definitely a gap between what our docs advertise and what we display with Opportunities. Thanks for bringing this to our attention, I can see what you're seeing. As you noted, Accounts show the first Opportunity associated with them. Leads also show their converted Opportunities.

Unfortunately, there's not a way to use opportunities as facets at the moment. If you have a support relationship with Elastic, I'd recommend raising this as an enhancement request.

The Github error looks to be exactly what it says it is: Github is rate limiting the requests. Are you sharing the same OAuth app with other projects/tools? Have you checked to see how close you are to exceeding your rate limit(s)? The Connectors::ThrottlingError (Connectors::ThrottlingError) tells us that we've acknowledged that Github is throttling Workplace Search, and are canceling the job to provide some backoff. It may succeed on its next try, but if you keep having issues it might be worth following up with Github to see what your limits are and how you're reaching them.
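
If you want to check limits from your own scripts or tools that share the OAuth app, a rough Python sketch of the usual backoff calculation is below. It is not how Workplace Search implements its backoff; it only assumes the response headers Github documents for rate limiting (Retry-After, X-RateLimit-Remaining, X-RateLimit-Reset) and that they arrive in a plain dict with exactly those spellings. The function name is made up for this example.

```python
import time


def seconds_to_wait(status, headers, now=None):
    """Return how long to pause before retrying a Github API call (0 if no wait)."""
    now = time.time() if now is None else now
    # Abuse-detection responses can carry Retry-After, given in seconds.
    if "Retry-After" in headers:
        return max(0, int(headers["Retry-After"]))
    # Regular limit: no quota left until the reset time (epoch seconds).
    if status == 403 and headers.get("X-RateLimit-Remaining") == "0":
        return max(0, int(headers["X-RateLimit-Reset"]) - int(now))
    return 0


# Example: quota exhausted, reset 538 seconds after the (fixed) current time.
wait = seconds_to_wait(
    403,
    {"X-RateLimit-Remaining": "0", "X-RateLimit-Reset": "1600700538"},
    now=1600700000,
)
```

Sleeping for the returned number of seconds before retrying keeps a second tool from eating into the quota that Workplace Search needs.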

Hi Sean,

Thanks for the feedback.
I have indeed another program (TargetProcess) that syncs regularly with Github. Will check that more in detail.
A+, Gertjan

Perfect!

Oops, and I'm realizing I didn't respond to some of your other queries...

For querying your data and building dashboards, I think you'll want to take a look at the search API. Unfortunately, that's not baked into Kibana, but you shouldn't have a hard time creating visualizations as you'd like.

Putting a pipeline in front of the out-of-the-box connectors is not currently possible, though it is another good topic for an enhancement request if you have a support relationship with Elastic.