Restore Enterprise Search Index From Snapshot

Hello,

I have a question about restoring an Enterprise Search index that was created with the 'web crawler' ingestion method to a completely new, self-managed deployment, for example after a catastrophic loss of the original host.

Can the 'ingestion method' be changed on a snapshot restored index from 'API' to 'crawler'?

I am testing this with a fresh Elasticsearch, Kibana and Enterprise Search deployment, v8.8.1.

I created an index with the web crawler ingestion method, ran the crawl, and the documents were pulled in. All good there.

I added a shared file system snapshot repository and took a snapshot, which also went fine. I confirmed the newly created web crawler index was in it.
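For readers who prefer the API over the Kibana UI, here is a minimal sketch of the two requests behind this step, using the Elasticsearch snapshot REST API. The repository name, snapshot name, mount path and `ES_URL` are hypothetical, and the node is assumed to be a local test node without security enabled. The location also has to be listed under path.repo in elasticsearch.yml.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: local test node, no auth/TLS


def repo_request(repo_name, location):
    """Build PUT _snapshot/<repo> for a shared file system ("fs") repository."""
    body = {"type": "fs", "settings": {"location": location}}
    return "PUT", f"/_snapshot/{repo_name}", body


def snapshot_request(repo_name, snapshot_name, indices=None):
    """Build the create-snapshot request.

    With indices=None the body is empty, so Elasticsearch falls back to its
    default of snapshotting all data streams and indices.
    """
    body = {}
    if indices:
        body["indices"] = ",".join(indices)
    path = f"/_snapshot/{repo_name}/{snapshot_name}?wait_for_completion=true"
    return "PUT", path, body


def send(method, path, body=None):
    """Send one JSON request to Elasticsearch and return the decoded reply."""
    data = json.dumps(body).encode() if body is not None else None
    req = urllib.request.Request(
        ES_URL + path,
        data=data,
        method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```

Against a running node, the calls would look like `send(*repo_request("my_backup", "/mnt/snapshots"))` followed by `send(*snapshot_request("my_backup", "snap-1"))`.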

Then I shut down the deployment and tore it down. I spun a new one up and reconnected the snapshot repository. The Kibana UI found it and showed the snapshot I had just taken from the former, now dead deployment.

I ran a restore of only the crawler index. The index does show back up in Kibana > Enterprise Search > Content and the documents are in it, but it comes back as a standard API ingestion index and not a crawler index.

Hi @alongaks.

Meta information about the crawler setup is stored in a different index that you will need to restore as well. You can access it through the alias .elastic-connectors, which currently points to the index .elastic-connectors-v1. Restoring that index should bring back your crawler setup.

If you also want to keep the history of syncs, you will also need to restore the index behind the alias .elastic-connectors-sync-jobs, which currently points to the index .elastic-connectors-sync-jobs-v1.

Technically, the least-effort option would be to take the record from the existing .elastic-connectors and push the same record to the alias with the same name in the new deployment. You can do it manually or easily automate it.
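A rough sketch of what automating that copy could look like: take the hits of a search against the old deployment's .elastic-connectors alias and turn each one into an index request for the new deployment. The helper name and the sample document below are hypothetical; only the alias name comes from this thread. Sending the requests is left to whatever HTTP client you use.

```python
def copy_requests(search_response, alias=".elastic-connectors"):
    """Turn search hits from the old deployment into PUT requests for the new one.

    Each hit keeps its original _id so the record is written under the same
    document id on the target side.
    """
    requests = []
    for hit in search_response["hits"]["hits"]:
        path = f"/{alias}/_doc/{hit['_id']}"
        requests.append(("PUT", path, hit["_source"]))
    return requests


# Hypothetical, heavily trimmed reply from GET /.elastic-connectors/_search
old_reply = {
    "hits": {
        "hits": [
            {"_id": "abc123", "_source": {"index_name": "search-my-crawler"}}
        ]
    }
}

for method, path, body in copy_requests(old_reply):
    print(method, path, body)
```

This only covers the connector record itself. It does not touch any other Enterprise Search state, so treat it as a starting point rather than a full migration.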

Hello, Artem

Thanks for the response! This is very helpful.

Should it work if I remove .elastic-connectors-v1 and .elastic-connectors-sync-jobs-v1 from the new deployment and then restore the same indices from the former deployment's snapshot, along with the crawler index?

I actually tried that method. It did restore the index, but still as a standard API index rather than a crawler index with its configs.

Hi @alongaks.

Technically it should work. Can you verify that your records are migrated into .elastic-connectors-v1?

You should see a document there with "index_name" field that should be the same as your content index for Crawler.
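One way to run that check without browsing the index by hand is a small search against the alias. This is a sketch only: the content index name is hypothetical, and a `match` query is used because it works whether "index_name" is mapped as keyword or text.

```python
def connector_lookup(index_name, alias=".elastic-connectors"):
    """Build a search that looks for the connector record of one content index."""
    body = {"query": {"match": {"index_name": index_name}}}
    return "POST", f"/{alias}/_search", body


def matching_connectors(search_response, index_name):
    """Return the ids of hits whose index_name is exactly the content index."""
    return [
        hit["_id"]
        for hit in search_response["hits"]["hits"]
        if hit["_source"].get("index_name") == index_name
    ]


# Hypothetical, trimmed search reply from the new deployment
reply = {
    "hits": {
        "hits": [
            {"_id": "abc123", "_source": {"index_name": "search-my-crawler"}},
            {"_id": "def456", "_source": {"index_name": "search-other"}},
        ]
    }
}

print(connector_lookup("search-my-crawler"))
print(matching_connectors(reply, "search-my-crawler"))
```

An empty list from `matching_connectors` would mean the record did not make it into the restored index.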

Yeah, it looks like the index content is there. However, the index is still showing up as a standard API index.

I ended up experimenting a little more. I stopped Enterprise Search, removed the .ent-search* indices in the new deployment, restored those from the snapshot, and then started Enterprise Search back up. This time it actually pulled in the full Enterprise Search index configs, including the 'ingest method' being 'Crawler'. Clicking into the index in the Enterprise Search UI now shows all the crawl rules, entry points, etc. :slight_smile:

Now the issue is a looping error in the Enterprise Search log:

[app-server][ERROR]: [7598cc79-4fb1-4e4d-b12e-3f47e83f515c] Exception: Unexpected internal error while processing a request: NoMethodError: undefined method `symbolize_keys' for #<String:0x5412468d>
/usr/share/enterprise-search/lib/war/shared_togo/app/models/shared_togo/crawler2/crawl_request.class:283:in `crawl_config_with_configuration': undefined method `symbolize_keys' for #<String:0x5412468d> (NoMethodError)
        from /usr/share/enterprise-search/lib/war/shared_togo/app/models/shared_togo/crawler2/concerns/formatting/crawler_concern.class:112:in `block in format_crawler_events_for_configuration'
        from org/jruby/RubyArray.java:1865:in `each'
        from /usr/share/enterprise-search/lib/war/actastic/lib/actastic/relation.class:325:in `each'
        from /usr/share/enterprise-search/lib/war/shared_togo/app/models/shared_togo/crawler2/concerns/formatting/crawler_concern.class:108:in `format_crawler_events_for_configuration'
        from /usr/share/enterprise-search/lib/war/shared_togo/app/models/shared_togo/crawler2/concerns/formatting/crawler_concern.class:90:in `format_crawler_for_configuration'
        from /usr/share/enterprise-search/lib/war/shared_togo/app/controllers/api/shared_togo/v1/internal/crawler2/overview_controller.class:5:in `index'
        from /usr/share/enterprise-search/lib/war/gems/gems/actionpack-6.0.6.1/lib/action_controller/metal/basic_implicit_render.rb:6:in `send_action'
        from /usr/share/enterprise-search/lib/war/gems/gems/actionpack-6.0.6.1/lib/abstract_controller/base.rb:195:in `process_action'
        from /usr/share/enterprise-search/lib/war/gems/gems/actionpack-6.0.6.1/lib/abstract_controller/callbacks.rb:42:in `block in process_action'
        from /usr/share/enterprise-search/lib/war/gems/gems/activesupport-6.0.6.1/lib/active_support/callbacks.rb:112:in `block in run_callbacks'
        from /usr/share/enterprise-search/lib/war/shared_togo/app/controllers/api/shared_togo/v1/internal/base_controller.class:77:in `rescue_with_logging'
        from /usr/share/enterprise-search/lib/war/gems/gems/activesupport-6.0.6.1/lib/active_support/callbacks.rb:121:in `block in run_callbacks'
        from /usr/share/enterprise-search/lib/war/gems/gems/activesupport-6.0.6.1/lib/active_support/callbacks.rb:139:in `run_callbacks'
        from /usr/share/enterprise-search/lib/war/gems/gems/actionpack-6.0.6.1/lib/abstract_controller/callbacks.rb:41:in `process_action'
        from /usr/share/enterprise-search/lib/war/gems/gems/actionpack-6.0.6.1/lib/action_controller/metal/rendering.rb:30:in `process_action'
        from /usr/share/enterprise-search/lib/war/gems/gems/actionpack-6.0.6.1/lib/action_controller/metal/instrumentation.rb:33:in `block in process_action'
        from /usr/share/enterprise-search/lib/war/gems/gems/activesupport-6.0.6.1/lib/active_support/notifications.rb:180:in `block in instrument'
        from /usr/share/enterprise-search/lib/war/gems/gems/activesupport-6.0.6.1/lib/active_support/notifications/instrumenter.rb:24:in `instrument'
        from /usr/share/enterprise-search/lib/war/gems/gems/activesupport-6.0.6.1/lib/active_support/notifications.rb:180:in `instrument'
        from /usr/share/enterprise-search/lib/war/gems/gems/actionpack-6.0.6.1/lib/action_controller/metal/instrumentation.rb:32:in `process_action'
        from /usr/share/enterprise-search/lib/war/gems/gems/actionpack-6.0.6.1/lib/action_controller/metal/rescue.rb:22:in `process_action'
        from /usr/share/enterprise-search/lib/war/gems/gems/actionpack-6.0.6.1/lib/action_controller/metal/params_wrapper.rb:245:in `process_action'
        from /usr/share/enterprise-search/lib/war/gems/gems/actionpack-6.0.6.1/lib/abstract_controller/base.rb:136:in `process'
        from /usr/share/enterprise-search/lib/war/gems/gems/actionview-6.0.6.1/lib/action_view/rendering.rb:39:in `process'
        from /usr/share/enterprise-search/lib/war/gems/gems/actionpack-6.0.6.1/lib/action_controller/metal.rb:190:in `dispatch'
        from /usr/share/enterprise-search/lib/war/gems/gems/actionpack-6.0.6.1/lib/action_controller/metal.rb:254:in `dispatch'
        from /usr/share/enterprise-search/lib/war/gems/gems/actionpack-6.0.6.1/lib/action_dispatch/routing/route_set.rb:50:in `dispatch'
        from /usr/share/enterprise-search/lib/war/gems/gems/actionpack-6.0.6.1/lib/action_dispatch/routing/route_set.rb:33:in `serve'
        from /usr/share/enterprise-search/lib/war/gems/gems/actionpack-6.0.6.1/lib/action_dispatch/journey/router.rb:49:in `block in serve'
        from org/jruby/RubyArray.java:1865:in `each'
        from /usr/share/enterprise-search/lib/war/gems/gems/actionpack-6.0.6.1/lib/action_dispatch/journey/router.rb:32:in `serve'
        from /usr/share/enterprise-search/lib/war/gems/gems/actionpack-6.0.6.1/lib/action_dispatch/routing/route_set.rb:834:in `call'
        from /usr/share/enterprise-search/lib/war/gems/gems/rack-attack-6.6.0/lib/rack/attack.rb:103:in `call'
        from /usr/share/enterprise-search/lib/war/lib/middleware/json_parser_error_middleware.class:9:in `call'
        from /usr/share/enterprise-search/lib/war/lib/middleware/rewrite_deprecated_routes_middleware.class:24:in `call'
        from /usr/share/enterprise-search/lib/war/lib/middleware/system_logging_middleware.class:37:in `call'
        from /usr/share/enterprise-search/lib/war/gems/gems/rack-attack-6.6.0/lib/rack/attack.rb:127:in `call'
        from /usr/share/enterprise-search/lib/war/lib/middleware/eweb_access_middleware.class:26:in `call'
        from /usr/share/enterprise-search/lib/war/gems/gems/warden-1.2.7/lib/warden/manager.rb:36:in `block in call'
        from org/jruby/RubyKernel.java:1237:in `catch'
        from /usr/share/enterprise-search/lib/war/gems/gems/warden-1.2.7/lib/warden/manager.rb:35:in `call'
        from /usr/share/enterprise-search/lib/war/gems/gems/rack-cors-1.0.6/lib/rack/cors.rb:98:in `call'
        from /usr/share/enterprise-search/lib/war/gems/gems/rack-2.1.4.3/lib/rack/tempfile_reaper.rb:17:in `call'
        from /usr/share/enterprise-search/lib/war/gems/gems/rack-2.1.4.3/lib/rack/etag.rb:27:in `call'
        from /usr/share/enterprise-search/lib/war/gems/gems/rack-2.1.4.3/lib/rack/conditional_get.rb:27:in `call'
        from /usr/share/enterprise-search/lib/war/gems/gems/rack-2.1.4.3/lib/rack/head.rb:14:in `call'
        from /usr/share/enterprise-search/lib/war/lib/middleware/content_type_middleware.class:21:in `call'
        from /usr/share/enterprise-search/lib/war/lib/middleware/content_security_policy_middleware.class:13:in `call'
        from /usr/share/enterprise-search/lib/war/gems/gems/actionpack-6.0.6.1/lib/action_dispatch/http/content_security_policy.rb:18:in `call'
        from /usr/share/enterprise-search/lib/war/gems/gems/rack-2.1.4.3/lib/rack/session/abstract/id.rb:269:in `context'
        from /usr/share/enterprise-search/lib/war/gems/gems/rack-2.1.4.3/lib/rack/session/abstract/id.rb:263:in `call'
        from /usr/share/enterprise-search/lib/war/gems/gems/actionpack-6.0.6.1/lib/action_dispatch/middleware/cookies.rb:654:in `call'
        from /usr/share/enterprise-search/lib/war/lib/middleware/cookie_strip_middleware.class:12:in `call'
        from /usr/share/enterprise-search/lib/war/gems/gems/actionpack-6.0.6.1/lib/action_dispatch/middleware/callbacks.rb:27:in `block in call'
        from /usr/share/enterprise-search/lib/war/gems/gems/activesupport-6.0.6.1/lib/active_support/callbacks.rb:101:in `run_callbacks'
        from /usr/share/enterprise-search/lib/war/gems/gems/actionpack-6.0.6.1/lib/action_dispatch/middleware/callbacks.rb:26:in `call'
        from /usr/share/enterprise-search/lib/war/gems/gems/actionpack-6.0.6.1/lib/action_dispatch/middleware/actionable_exceptions.rb:22:in `call'
        from /usr/share/enterprise-search/lib/war/gems/gems/actionpack-6.0.6.1/lib/action_dispatch/middleware/debug_exceptions.rb:32:in `call'
        from /usr/share/enterprise-search/lib/war/gems/gems/actionpack-6.0.6.1/lib/action_dispatch/middleware/show_exceptions.rb:33:in `call'
        from /usr/share/enterprise-search/lib/war/gems/gems/railties-6.0.6.1/lib/rails/rack/logger.rb:37:in `call_app'
        from /usr/share/enterprise-search/lib/war/gems/gems/railties-6.0.6.1/lib/rails/rack/logger.rb:26:in `block in call'
        from /usr/share/enterprise-search/lib/war/gems/gems/activesupport-6.0.6.1/lib/active_support/tagged_logging.rb:80:in `block in tagged'
        from /usr/share/enterprise-search/lib/war/gems/gems/activesupport-6.0.6.1/lib/active_support/tagged_logging.rb:28:in `tagged'
        from /usr/share/enterprise-search/lib/war/gems/gems/activesupport-6.0.6.1/lib/active_support/tagged_logging.rb:80:in `tagged'
        from /usr/share/enterprise-search/lib/war/gems/gems/railties-6.0.6.1/lib/rails/rack/logger.rb:26:in `call'
        from /usr/share/enterprise-search/lib/war/lib/middleware/silencer_middleware.class:9:in `call'
        from /usr/share/enterprise-search/lib/war/gems/gems/actionpack-6.0.6.1/lib/action_dispatch/middleware/remote_ip.rb:81:in `call'
        from /usr/share/enterprise-search/lib/war/lib/middleware/request_id_middleware.class:11:in `call'
        from /usr/share/enterprise-search/lib/war/gems/gems/rack-2.1.4.3/lib/rack/method_override.rb:24:in `call'
        from /usr/share/enterprise-search/lib/war/gems/gems/rack-2.1.4.3/lib/rack/runtime.rb:24:in `call'
        from /usr/share/enterprise-search/lib/war/gems/gems/actionpack-6.0.6.1/lib/action_dispatch/middleware/executor.rb:14:in `call'
        from /usr/share/enterprise-search/lib/war/gems/gems/rack-2.1.4.3/lib/rack/sendfile.rb:113:in `call'
        from /usr/share/enterprise-search/lib/war/gems/gems/actionpack-6.0.6.1/lib/action_dispatch/middleware/host_authorization.rb:97:in `call'
        from /usr/share/enterprise-search/lib/war/lib/middleware/stats_middleware.class:10:in `call'
        from /usr/share/enterprise-search/lib/war/lib/middleware/external_host_middleware.class:26:in `call'
        from /usr/share/enterprise-search/lib/war/gems/gems/railties-6.0.6.1/lib/rails/engine.rb:527:in `call'
        from /usr/share/enterprise-search/lib/war/vendor/fishwife-servlet/lib/fishwife/rack_servlet.rb:74:in `service'

But I'm liking the progress. I'm just not familiar with what that error means. :wink:

I made some progress on this.

Turns out I forgot about the secret_management.encryption_keys: setting that is applied during the initial configuration of the Enterprise Search app. It is part of my Ansible play, and I tend to gloss over it at this point, as it just does its thing while working through the stack install tasks.

When restoring an Enterprise Search deployment from a backup, you need to make sure your configuration file contains the right set of encryption keys to allow you to gain access to the restored dataset (on Elastic Cloud it happens automatically).
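Since this setting is easy to overlook, a small pre-flight check can confirm that the old key is present in the new config before Enterprise Search is started. The sketch below is an assumption-laden helper, not an official tool: it reads the setting with plain string handling (no YAML library), supports the inline `[key]` form and the `- key` list form, and assumes keys do not contain a `#` character. The key values shown are made up.

```python
def configured_keys(config_text):
    """Extract secret_management.encryption_keys values from enterprise-search.yml text."""
    keys, in_block = [], False
    for raw in config_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if line.startswith("secret_management.encryption_keys:"):
            inline = line.split(":", 1)[1].strip()
            if inline:
                # Inline form: secret_management.encryption_keys: [key1, key2]
                parts = inline.strip("[]").split(",")
                keys += [p.strip(" '\"") for p in parts if p.strip()]
                in_block = False
            else:
                # List form: the keys follow on "- key" lines
                in_block = True
        elif in_block and line.startswith("- "):
            keys.append(line[2:].strip(" '\""))
        else:
            in_block = False
    return keys


def has_old_key(config_text, old_key):
    """True when the key from the old deployment is configured in the new one."""
    return old_key in configured_keys(config_text)


sample = """
# enterprise-search.yml (made-up values)
kibana.host: http://localhost:5601
secret_management.encryption_keys: [0123456789abcdef]
"""
print(has_old_key(sample, "0123456789abcdef"))
```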

After doing a restore of the...

.elastic-connectors-v1
.elastic-connectors-sync-jobs-v1
.ent-search*
[crawler-index-to-restore]

... and also setting the former encryption key in the enterprise-search.yml config, the full functionality of the crawl index was restored: no more looping error in the Enterprise Search log and no errors in the UI while inside Enterprise Search > Content > crawl index.
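For reference, the same restore expressed as a single request to the Elasticsearch restore snapshot API might look like the sketch below. The repository, snapshot and crawler index names are hypothetical. The pattern `.elastic-connector*` is an assumption chosen so that it covers both connector indices named in this thread, and the existing copies of these indices have to be deleted (or closed) first, as described above.

```python
def restore_request(repo_name, snapshot_name, crawler_index):
    """Build POST _snapshot/<repo>/<snapshot>/_restore for the crawler setup."""
    indices = [
        ".elastic-connector*",  # connector config and sync job history
        ".ent-search*",         # Enterprise Search internal indices
        crawler_index,          # the content index filled by the crawler
    ]
    body = {
        "indices": ",".join(indices),
        "include_global_state": False,  # restore only the listed indices
    }
    path = f"/_snapshot/{repo_name}/{snapshot_name}/_restore"
    return "POST", path, body


method, path, body = restore_request("my_backup", "snap-1", "search-my-crawler")
print(method, path)
print(body)
```

Enterprise Search should stay stopped while this runs, with the former encryption key already in place before it is started again.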

A couple of pieces to the crawler index configs that didn't come across are the custom pipelines and the entry points. Those were not restored with the index. The crawl schedule and crawl rules were.

I suppose there may be other, supporting indexes that need to be restored to bring those in?

Below is the process I used, which seems to work for a complete restore of a crawl index. Both the 'old' and the 'new' deployment are 8.8.1 and single-node.

Disclaimer: There may be a best-practice way to do this for Enterprise Search, but I could only find the high-level snapshot restore documentation for Elasticsearch and not an Enterprise Search-centric step-by-step guide. It is possible I overlooked it somewhere.

This crawl index setup does not include search engines or meta engines; whether those get restored is another thing to test.

After re-installing a new Elasticsearch, Kibana and Enterprise Search single-node deploy…

  • Ensure the snapshot mount is connected to the new deployment and the correct config is set in elasticsearch.yml. Also add the snapshot repository in the Kibana UI.
  • Make sure you have retained the old 'secret_management.encryption_keys:' value. Insert it into the new deployment's enterprise-search.yml file on the appropriate line.
  • Stop the enterprise-search service
  • Kibana UI > Stack Management > Index Management
  • Toggle ‘Include hidden indices’
  • Search ‘.ent-search’, toggle at the bottom to show 100 results, tick the ‘select all’ box on the left next to ‘Name’, then click ‘Manage index’
  • Click ‘Delete index’ and repeat through each page. I prefer this over the API because you physically ‘see’ what is being deleted, which avoids hasty, fat-fingered keystrokes.
  • Go back to the search field, search ‘.elastic-’ and delete ‘.elastic-connectors-sync-jobs-v1’ and ‘.elastic-connectors-v1’
  • Kibana UI > Snapshot and Restore > Snapshots tab – choose the snapshot desired, click the ‘Restore’ action button on the right.
  • Toggle off ‘All data streams and indices’.
  • Click ‘Use index patterns’ just above the search field.
  • Enter the names of the recently deleted indices: type ‘.elastic-’ and select both of those. Enter the wildcard ‘.ent-search*’ and add it as a ‘custom option’. Add the crawler index name.
  • Leave the remaining options as-is and click ‘Next’.
  • No changes to the ‘Index settings’ section. Click ‘Next’.
  • Review and then click ‘Restore snapshot’.
  • Go back to the ‘Restore Status’ tab to validate the restore. It may be necessary to reload/refresh the browser window. The ‘.ent-search*’ indices did not show up in the restore status, but they can be validated in the UI > Stack Management > Index Management: toggle ‘Include hidden indices’ and search for ‘.ent-search*’. They should be there.
  • After waiting about 5 minutes to confirm the restore is done, start enterprise-search and tail the app log to ensure it starts up fine. For extra brownie points, tail the Elasticsearch log, which should reflect whether the snapshot restore is done.
  • Validate that the index is indeed there: UI > Enterprise Search > Content > Indices. Click into it and look at the documents, then try 'Manage Domains' > [domain name] > 'Crawl rules', 'Entry points', etc.
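As an alternative to waiting a fixed number of minutes in the steps above, the reply of the index recovery API (`GET /_recovery`) can be checked for completion before Enterprise Search is started. This is a sketch under assumptions: the helper name is made up, the index names are examples, and the sample reply is trimmed to the fields the check reads.

```python
def restore_finished(recovery, expected_indices):
    """Check a GET /_recovery reply for the indices that were restored.

    Returns True only when every expected index is present and all of its
    shards report stage "DONE".
    """
    for name in expected_indices:
        shards = recovery.get(name, {}).get("shards", [])
        if not shards:
            return False  # index not restored (yet)
        if any(shard.get("stage") != "DONE" for shard in shards):
            return False  # at least one shard is still recovering
    return True


# Hypothetical, trimmed reply from GET /_recovery on a single-node deployment
reply = {
    "search-my-crawler": {
        "shards": [{"id": 0, "type": "SNAPSHOT", "stage": "DONE"}]
    },
    ".elastic-connectors-v1": {
        "shards": [{"id": 0, "type": "SNAPSHOT", "stage": "INDEX"}]
    },
}

print(restore_finished(reply, ["search-my-crawler"]))
print(restore_finished(reply, ["search-my-crawler", ".elastic-connectors-v1"]))
```

The check expects concrete index names rather than wildcards, so the ‘.ent-search*’ indices would need to be listed individually.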

This works for me with the crawl indices, minus the pipelines and entry points. After a couple of other snapshot restore tests, though, the pipelines were only halfway restored (I ended up having to remake them anyway, as the full custom pipeline was not there). YMMV.
