Httpjson Crawler is not compatible with the --once option

I use Filebeat's httpjson input to export Elasticsearch index data:

D:\Download\filebeat-7.9.2-windows-x86_64>filebeat.exe run -e -c 2.yml
2020-09-27T18:24:01.195+0800    INFO    [httpjson]      httpjson/input.go:471   Continuing with pagination to URL: http://192.168.1.15:9200/_search/scroll      {"url": "http://192.168.1.15:9200/topic-service/_search?scroll=1m&size=1000"}
2020-09-27T18:24:01.361+0800    INFO    [httpjson]      httpjson/input.go:471   Continuing with pagination to URL: http://192.168.1.15:9200/_search/scroll      {"url": "http://192.168.1.15:9200/topic-service/_search?scroll=1m&size=1000"}
2020-09-27T18:24:01.538+0800    INFO    [httpjson]      httpjson/input.go:471   Continuing with pagination to URL: http://192.168.1.15:9200/_search/scroll      {"url": "http://192.168.1.15:9200/topic-service/_search?scroll=1m&size=1000"}
2020-09-27T18:24:01.690+0800    INFO    [httpjson]      httpjson/input.go:471   Continuing with pagination to URL: http://192.168.1.15:9200/_search/scroll      {"url": "http://192.168.1.15:9200/topic-service/_search?scroll=1m&size=1000"}
2020-09-27T18:24:01.861+0800    INFO    [httpjson]      httpjson/input.go:471   Continuing with pagination to URL: http://192.168.1.15:9200/_search/scroll      {"url": "http://192.168.1.15:9200/topic-service/_search?scroll=1m&size=1000"}
2020-09-27T18:24:02.034+0800    INFO    [httpjson]      httpjson/input.go:471   Continuing with pagination to URL: http://192.168.1.15:9200/_search/scroll      {"url": "http://192.168.1.15:9200/topic-service/_search?scroll=1m&size=1000"}
2020-09-27T18:24:03.014+0800    INFO    [httpjson]      httpjson/input.go:471   Continuing with pagination to URL: http://192.168.1.15:9200/_search/scroll      {"url": "http://192.168.1.15:9200/topic-service/_search?scroll=1m&size=1000"}
2020-09-27T18:24:05.108+0800    INFO    [httpjson]      httpjson/input.go:471   Continuing with pagination to URL: http://192.168.1.15:9200/_search/scroll      {"url": "http://192.168.1.15:9200/topic-service/_search?scroll=1m&size=1000"}
2020-09-27T18:24:05.114+0800    INFO    [httpjson]      httpjson/input.go:148   httpjson input worker has stopped.      {"url": "http://192.168.1.15:9200/topic-service/_search?scroll=1m&size=1000"}

The log shows that when the pagination is exhausted, the httpjson input stops automatically.

I want Filebeat itself to exit automatically once the pagination is exhausted, so I run it with the --once option. However, an error occurs: the log shows that the httpjson input never gets a chance to send its request before the program shuts down.

D:\Download\filebeat-7.9.2-windows-x86_64>filebeat.exe run --once -e -c 2.yml
2020-09-27T18:28:18.802+0800    INFO    instance/beat.go:640    Home path: [D:\Download\filebeat-7.9.2-windows-x86_64] Config path: [D:\Download\filebeat-7.9.2-windows-x86_64] Data path: [D:\Download\filebeat-7.9.2-windows-x86_64\data] Logs path: [D:\Download\filebeat-7.9.2-windows-x86_64\logs]
2020-09-27T18:28:18.802+0800    INFO    instance/beat.go:648    Beat ID: c490fa4d-038b-42ce-84d0-de261be73d83
2020-09-27T18:28:18.803+0800    INFO    [beat]  instance/beat.go:976    Beat info       {"system_info": {"beat": {"path": {"config": "D:\\Download\\filebeat-7.9.2-windows-x86_64", "data": "D:\\Download\\filebeat-7.9.2-windows-x86_64\\data", "home": "D:\\Download\\filebeat-7.9.2-windows-x86_64", "logs": "D:\\Download\\filebeat-7.9.2-windows-x86_64\\logs"}, "type": "filebeat", "uuid": "c490fa4d-038b-42ce-84d0-de261be73d83"}}}
2020-09-27T18:28:18.803+0800    INFO    [beat]  instance/beat.go:985    Build info      {"system_info": {"build": {"commit": "2ab907f5ccecf9fd82fe37105082e89fd871f684", "libbeat": "7.9.2", "time": "2020-09-22T23:19:44.000Z", "version": "7.9.2"}}}
2020-09-27T18:28:18.803+0800    INFO    [beat]  instance/beat.go:988    Go runtime info {"system_info": {"go": {"os":"windows","arch":"amd64","max_procs":8,"version":"go1.14.7"}}}
2020-09-27T18:28:18.893+0800    INFO    [beat]  instance/beat.go:992    Host info       {"system_info": {"host": {"architecture":"x86_64","boot_time":"2020-09-12T18:04:57.96+08:00","name":"DESKTOP-02J23KF","ip":["fe80::833:cd40:e16b:cf49/64","169.254.207.73/16","fe80::d93b:8a82:1253:107d/64","192.168.137.1/24","fe80::6845:a251:4bef:6ace/64","169.254.106.206/16","fe80::3ccb:2d12:63a1:402f/64","169.254.64.47/16","fe80::64c5:a48c:4baa:7eee/64","192.168.17.1/24","fe80::2da0:a0a9:646d:5f3a/64","192.168.157.1/24","fe80::e0de:9323:a621:17d/64","169.254.1.125/16","fe80::c19b:c235:f902:2f1b/64","192.168.1.64/24","fe80::c5db:8373:c96c:4f66/64","169.254.79.102/16","::1/128","127.0.0.1/8"],"kernel_version":"10.0.17763.1039 (WinBuild.160101.0800)","mac":["8c:16:45:74:33:7b","0a:00:27:00:00:10","14:4f:8a:d7:68:2a","16:4f:8a:d7:68:29","00:50:56:c0:00:01","00:50:56:c0:00:08","00:ff:77:ac:a8:1f","14:4f:8a:d7:68:29","14:4f:8a:d7:68:2d"],"os":{"family":"windows","platform":"windows","name":"Windows 10 Enterprise LTSC 2019","version":"10.0","major":10,"minor":0,"patch":0,"build":"17763.1039"},"timezone":"CST","timezone_offset_sec":28800,"id":"0b1ba798-dcd4-4cff-b7bc-a5725e3523da"}}}
2020-09-27T18:28:18.894+0800    INFO    [beat]  instance/beat.go:1021   Process info    {"system_info": {"process": {"cwd": "D:\\Download\\filebeat-7.9.2-windows-x86_64", "exe": "D:\\Download\\filebeat-7.9.2-windows-x86_64\\filebeat.exe", "name": "filebeat.exe", "pid": 34288, "ppid": 35708, "start_time": "2020-09-27T18:28:18.615+0800"}}}
2020-09-27T18:28:18.894+0800    INFO    instance/beat.go:299    Setup Beat: filebeat; Version: 7.9.2
2020-09-27T18:28:18.895+0800    INFO    [file]  fileout/file.go:101     Initialized file output. path=123\456.txt max_size_bytes=1048576000 max_backups=1000 permissions=-rw-------
2020-09-27T18:28:18.895+0800    INFO    [publisher]     pipeline/module.go:113  Beat name: DESKTOP-02J23KF
2020-09-27T18:28:18.897+0800    WARN    beater/filebeat.go:178  Filebeat is unable to load the Ingest Node pipelines for the configured modules because the Elasticsearch output is not configured/enabled. If you have already loaded the Ingest Node pipelines or are using Logstash pipelines, you can ignore this warning.
2020-09-27T18:28:18.897+0800    INFO    instance/beat.go:450    filebeat start running.
2020-09-27T18:28:18.897+0800    INFO    [monitoring]    log/log.go:118  Starting metrics logging every 30s
2020-09-27T18:28:18.897+0800    INFO    memlog/store.go:119     Loading data file of 'D:\Download\filebeat-7.9.2-windows-x86_64\data\registry\filebeat' succeeded. Active transaction id=0
2020-09-27T18:28:18.897+0800    INFO    memlog/store.go:124     Finished loading transaction log file for 'D:\Download\filebeat-7.9.2-windows-x86_64\data\registry\filebeat'. Active transaction id=0
2020-09-27T18:28:18.898+0800    WARN    beater/filebeat.go:381  Filebeat is unable to load the Ingest Node pipelines for the configured modules because the Elasticsearch output is not configured/enabled. If you have already loaded the Ingest Node pipelines or are using Logstash pipelines, you can ignore this warning.
2020-09-27T18:28:18.898+0800    INFO    [registrar]     registrar/registrar.go:109      States Loaded from registrar: 0
2020-09-27T18:28:18.898+0800    INFO    [crawler]       beater/crawler.go:71    Loading Inputs: 1
2020-09-27T18:28:18.898+0800    INFO    [httpjson]      httpjson/input.go:130   Initialized httpjson input.     {"url": "http://192.168.1.15:9200/topic-service/_search?scroll=1m&size=1000"}
2020-09-27T18:28:18.899+0800    INFO    [crawler]       beater/crawler.go:141   Starting input (ID: 9255616884077493740)
2020-09-27T18:28:18.899+0800    INFO    [httpjson]      httpjson/input.go:140   httpjson input worker has started.      {"url": "http://192.168.1.15:9200/topic-service/_search?scroll=1m&size=1000"}
2020-09-27T18:28:18.899+0800    INFO    [crawler]       beater/crawler.go:108   Loading and starting Inputs completed. Enabled inputs: 1
2020-09-27T18:28:18.899+0800    ERROR   [httpjson]      httpjson/input.go:145   failed to execute http client.Do: Post "http://192.168.1.15:9200/topic-service/_search?scroll=1m&size=1000": context canceled   {"url": "http://192.168.1.15:9200/topic-service/_search?scroll=1m&size=1000"}
2020-09-27T18:28:18.899+0800    INFO    beater/filebeat.go:447  Running filebeat once. Waiting for completion ...
2020-09-27T18:28:18.899+0800    INFO    [httpjson]      httpjson/input.go:146   httpjson input worker has stopped.      {"url": "http://192.168.1.15:9200/topic-service/_search?scroll=1m&size=1000"}
2020-09-27T18:28:18.899+0800    INFO    beater/filebeat.go:449  All data collection completed. Shutting down.
2020-09-27T18:28:18.899+0800    INFO    beater/crawler.go:148   Stopping Crawler
2020-09-27T18:28:18.900+0800    INFO    beater/crawler.go:158   Stopping 1 inputs
2020-09-27T18:28:18.900+0800    INFO    [crawler]       beater/crawler.go:163   Stopping input: 9255616884077493740
2020-09-27T18:28:18.900+0800    INFO    beater/crawler.go:178   Crawler stopped
2020-09-27T18:28:18.900+0800    INFO    beater/filebeat.go:502  Shutdown output timer started. Waiting for max 5s.
2020-09-27T18:28:18.900+0800    INFO    beater/signalwait.go:93 Continue shutdown: All enqueued events being published.
2020-09-27T18:28:18.900+0800    INFO    [registrar]     registrar/registrar.go:132      Stopping Registrar
2020-09-27T18:28:18.900+0800    INFO    [registrar]     registrar/registrar.go:166      Ending Registrar
2020-09-27T18:28:18.900+0800    INFO    [registrar]     registrar/registrar.go:137      Registrar stopped
2020-09-27T18:28:18.904+0800    INFO    [monitoring]    log/log.go:153  Total non-zero metrics  {"monitoring": {"metrics": {"beat":{"cpu":{"system":{"ticks":109,"time":{"ms":109}},"total":{"ticks":202,"time":{"ms":202},"value":0},"user":{"ticks":93,"time":{"ms":93}}},"handles":{"open":218},"info":{"ephemeral_id":"f29e5608-b6ae-431b-b6b2-1a03e064d0e6","uptime":{"ms":155}},"memstats":{"gc_next":13907136,"memory_alloc":12672400,"memory_total":36001192,"rss":45867008},"runtime":{"goroutines":15}},"filebeat":{"harvester":{"open_files":0,"running":0}},"libbeat":{"config":{"module":{"running":0}},"output":{"type":"file"},"pipeline":{"clients":0,"events":{"active":0}}},"registrar":{"states":{"current":0}},"system":{"cpu":{"cores":8}}}}}
2020-09-27T18:28:18.905+0800    INFO    [monitoring]    log/log.go:154  Uptime: 156.5813ms
2020-09-27T18:28:18.905+0800    INFO    [monitoring]    log/log.go:131  Stopping metrics logging.
2020-09-27T18:28:18.905+0800    INFO    instance/beat.go:456    filebeat stopped.

This is the configuration file:

filebeat:
  shutdown_timeout: 5s
  inputs:
    - type: httpjson
      url: http://192.168.1.15:9200/topic-service/_search?scroll=1m&size=1000
      json_objects_array: hits.hits
      http_method: POST
      pagination:
        id_field: _scroll_id
        req_field: scroll_id
        url: http://192.168.1.15:9200/_search/scroll
        extra_body_content:
          scroll: 1m

output:
  file:
    path: "123"
    filename: "456.txt"
    number_of_files: 1000
    rotate_every_kb: 1024000
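
For reference, what I expect this configuration to do is the standard Elasticsearch scroll loop, roughly like the sketch below (Python with the requests library, only for illustration and not part of my setup; the URLs and index name are the ones from the config above):

import requests

BASE = "http://192.168.1.15:9200"

# Initial search: corresponds to the input's `url` with http_method: POST.
resp = requests.post(BASE + "/topic-service/_search",
                     params={"scroll": "1m", "size": 1000}).json()
scroll_id = resp.get("_scroll_id")
hits = resp.get("hits", {}).get("hits", [])

# Follow-up pages: corresponds to the `pagination` section. The _scroll_id of the
# previous response (id_field) is sent back as scroll_id (req_field) together with
# the extra_body_content (scroll: 1m). Each element of hits.hits becomes one event
# (json_objects_array). The loop ends when a page comes back empty, which is where
# the httpjson input logs "input worker has stopped".
while hits:
    resp = requests.post(BASE + "/_search/scroll",
                         json={"scroll_id": scroll_id, "scroll": "1m"}).json()
    scroll_id = resp.get("_scroll_id")
    hits = resp.get("hits", {}).get("hits", [])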

I think this is a bug: in --once mode the log prints "All data collection completed. Shutting down." immediately after startup, and the httpjson request fails with "context canceled", so the shutdown apparently does not wait for the httpjson input to finish its pagination.
