CloudWatch logs are being sent via Functionbeat, but Kibana doesn't show them on the "Logs" tab?

It appears that my Elasticsearch + Kibana + Functionbeat + CloudWatch setup is correct as far as the connections go. However, when I generate logs into CloudWatch, I never see them in the "Logs" page of Kibana. I've gone into Kibana->Logs->Settings and set Log indices to FUNCTIONbeat-*, but still nothing.

My functionbeat.yml...

functionbeat.provider.aws.endpoint: "s3.us-gov-west-1.amazonaws.com"
functionbeat.provider.aws.deploy_bucket: "syost"

functionbeat.provider.aws.functions:
    - name: cloudwatch
      enabled: true
      type: cloudwatch_logs
      description: "lambda function for cloudwatch logs"
      role: my_custom_role
      triggers:
          - log_group_name: /aws/lambda/syost-demo-test
      virtual_private_cloud:
          security_group_ids: ["sd-xxxxx"]
          subnet_ids: ["subnet-xxxxx"]

setup.template.settings:
    index.number_of_shards: 1

setup.kibana:
    host: "x.x.x.x:5601"

output.elasticsearch:
    hosts: ["x.x.x.x:9200"]
    username: "elastic"
    password: "changeme"

processors:
    - add_host_metadata: ~
    - add_cloud_metadata: ~

Would it be possible to share:

  1. the output of the request GET _cat/indices/functionbeat-*
  2. the output of the request GET functionbeat-*/_search?size=1

Also, please ensure that in the Logs UI settings, you've specified functionbeat-* (all lowercase, as index names are case-sensitive) in the field Log indices.
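
If the Kibana Console is not convenient, a rough curl equivalent of those two requests would be the following (assuming Elasticsearch is reachable at http://x.x.x.x:9200 with the elastic/changeme credentials from your functionbeat.yml):

# list any functionbeat indices and their document counts
curl -u elastic:changeme 'http://x.x.x.x:9200/_cat/indices/functionbeat-*?v'
# fetch a single sample document
curl -u elastic:changeme 'http://x.x.x.x:9200/functionbeat-*/_search?size=1&pretty'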

@Luca_Belluccini here are the results

I'm guessing these commands get typed into the elastic console in Kibana. Here is the output...

GET _cat/indices/functionbeat-*
yellow open functionbeat-7.6.2-2020.04.21-000001 1Gj2zAb4RMy4XLJRu9ibGQ 1 1 0 0 283b 283b
GET functionbeat-*/_search?size=1
{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 0,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  }
}

Here is how I have set the index...

Yes, sorry! I forgot to mention they had to be run on Kibana Console.

The index is empty, which makes me think there is a problem on Functionbeat configuration or permissions to harvest the data (but it managed to create the index).

  1. Can you please check your Functionbeat logs?
    The logs generated by Functionbeat are written to the CloudWatch log group for the function running on Amazon Web Services (AWS). To view the logs, go to the monitoring area of the AWS Lambda console and view the CloudWatch log group for the function (or pull them with the AWS CLI, as sketched after this list).
    If possible, please increase the logging verbosity to debug (see documentation).
  2. Please verify my_custom_role is a valid ARN.
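
For reference, a minimal sketch of pulling those Functionbeat logs with the AWS CLI; the log group name below assumes the function is named cloudwatch as in your functionbeat.yml, so adjust it to whatever name you actually deployed:

# list the Lambda log groups to find the one created for the Functionbeat function
aws logs describe-log-groups --log-group-name-prefix /aws/lambda/
# dump the most recent events from that log group (name assumed from functionbeat.yml)
aws logs filter-log-events --log-group-name /aws/lambda/cloudwatch --limit 50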

I went ahead and redeployed via ./functionbeat -v -e -d "*" deploy syost-cloudwatch and then triggered the functionbeat lambda one time.

1.
Logging was turned on as such...

logging.level: debug 
logging.selectors: ["*"]

Please visit this URL for the Functionbeat log file.

2.
I typed aws iam list-roles > roles.json and grepped for the custom role I wanted, which is...

{
    "Description": "Allows Lambda functions to call AWS services on your behalf.",
    "AssumeRolePolicyDocument": {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Action": "sts:AssumeRole",
                "Effect": "Allow",
                "Principal": {
                    "Service": "lambda.amazonaws.com"
                }
            }
        ]
    },
    "MaxSessionDuration": 3600,
    "RoleId": "xxxxxxxxxxxx",
    "CreateDate": "2020-04-08T12:00:25Z",
    "RoleName": "Logic_B_Lambda_AppDev",
    "Path": "/",
    "Arn": "arn:aws-us-gov:iam::xxxxxxxxxxxx:role/Logic_B_Lambda_AppDev"
},
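
(For reference, a more direct way to pull just that ARN instead of grepping roles.json, assuming the role name shown above:)

aws iam list-roles --query "Roles[?RoleName=='Logic_B_Lambda_AppDev'].Arn" --output text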

The arn I assigned in my functionbeat.yml file is as follows...

role: arn:aws-us-gov:iam::xxxxxxxxxxxx:role/Logic_B_Lambda_AppDev

That was a direct copy and paste so that looks good.

Hello @syost,
Thanks for the checks.

We can rule out a problem on the harvesting side, because a few logs are picked up:

2020-04-21T23:44:09.629Z	DEBUG	[processors]	processing/processors.go:186	Publish event: {

It seems the function times out (Task timed out after 3.00 seconds).

2020-04-21T23:44:09.670Z	DEBUG	[elasticsearch]	elasticsearch/client.go:733	ES Ping(url=http://198.19.159.0:9200)
END RequestId: 2292216a-8c44-4c66-9dc0-75256284e9ce
REPORT RequestId: 2292216a-8c44-4c66-9dc0-75256284e9ce	Duration: 3003.15 ms	Billed Duration: 3000 ms	Memory Size: 128 MB	Max Memory Used: 88 MB	Init Duration: 1436.37 ms	
2020-04-21T23:44:12.626Z 2292216a-8c44-4c66-9dc0-75256284e9ce Task timed out after 3.00 seconds

However, I would expect several log lines after the ES Ping (this GitHub issue contains an example of the debug logs you should be able to see).

  1. Is Elasticsearch on HTTP or HTTPS?
  2. Is the Elasticsearch IP reachable from the Lambda function?

@Luca_Belluccini

  1. Elasticsearch is running on an AWS Linux WorkSpace and is using HTTP.
  2. How do I test if the Elasticsearch IP is reachable from the Lambda function?

Try to ensure the user running the Lambda function is granted access to that IP.
In particular, I notice that Functionbeat (169.254.200.21) and Elasticsearch (198.19.159.0:9200) are on two different networks.
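
One rough way to test reachability, assuming you have (or can attach) an EC2 instance in the same subnets and security groups that the Lambda uses, is to try the Elasticsearch endpoint from there; the address below is the one appearing in your logs:

# from a host attached to the same subnets/security groups as the Lambda:
# does anything answer on the Elasticsearch HTTP port within 5 seconds?
curl -m 5 -v http://198.19.159.0:9200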

I also would like to confirm your Elasticsearch cluster has at least a basic license.

@Luca_Belluccini
I don't have a license set up. I installed Elasticsearch and Kibana via yum. According to the getting-started guide, there was no indication that I needed a license.

I don't know what to tell you regarding the weird IP Functionbeat is assigning itself. I don't understand how to set that, nor how to verify whether the Lambda is granted access to it. If your team could provide some instruction on how to achieve this, that would be great. There is no setting in functionbeat.yml for this. How does Functionbeat set this IP?

When running ifconfig I get two IPs (one of which we've already tried). I tried the other IP, setting it accordingly across each config (elasticsearch.yml, kibana.yml, functionbeat.yml), but this made no difference. It did, however, change the IP for Functionbeat to 169.254.1.57.

This link mentions an X-Pack setting in elasticsearch.yml called xpack.license.self_generated.type. That setting doesn't appear anywhere in my elasticsearch.yml. My config file is strictly...

node.name: xxxxxxxxxxxx
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
bootstrap.memory_lock: true
network.host: 1.2.3.4 # real IP excluded for now
http.port: 9200
cluster.initial_master_nodes: ["xxxxxxxxxxxx"]

...and while I'm at it here is the kibana.yml file...

server.port: 5601
server.host: "1.2.3.4"
elasticsearch.hosts: ["http://1.2.3.4:9200"]

I have the Lambda CloudWatch log file for this new test I just did. It's a bit longer; please see this file for the log. Is there any way we can get some more eyes on this?

The latest logs always show a timeout, which points to a network problem.

Functionbeat runs as a Lambda function on AWS.
By default, Lambda runs your functions in an internal virtual private cloud (VPC) with connectivity to AWS services and the internet.

To access local network resources (e.g. your Elasticsearch instance), you can configure your function to connect to a VPC in your account.

When you use this feature, you manage the function's internet access and network connectivity with VPC resources.

See more in the AWS Lambda function networking troubleshooting documentation.
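
A quick way to see what VPC configuration the deployed function currently has is the AWS CLI (the function name here is assumed from your functionbeat.yml, so adjust it to the name you deployed):

# show the subnets and security groups attached to the Functionbeat Lambda, if any
aws lambda get-function-configuration --function-name cloudwatch --query VpcConfig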

Once this is solved, you might see a log message from Elasticsearch mentioning that a license is required.

As shown on our subscription page (see Functionbeat), at least a basic license must be installed on Elasticsearch.
You can check your license on Elasticsearch running GET _license.
You can install a basic license following the steps here.
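For example, roughly (assuming Elasticsearch is on http://x.x.x.x:9200; add -u user:password if security is enabled):

# show the current license
curl 'http://x.x.x.x:9200/_license?pretty'
# activate a basic license if none is active
curl -XPOST 'http://x.x.x.x:9200/_license/start_basic?acknowledge=true&pretty'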
An overview of Functionbeat can be found here.

This works with the Elasticsearch binaries provided by Elastic (e.g. via Yum as you've mentioned). You have to use the non-OSS versions, which include X-Pack.

I am going to submit a documentation enhancement to mention the license requirement.

But to be clear, the root cause at the moment seems to be connectivity between the Lambda function (installed by Functionbeat) & Elasticsearch.

@Luca_Belluccini

Hi Luca ~

If you would be so kind as to leave this post open for a few days, it would be appreciated. I'm having to adjust some VPC settings, and then I'm spinning up an EC2 instance to host the stack instead of my AWS WorkSpace. This takes time, but I want to make sure I can resolve this issue and that it gets properly documented here. Thank you for staying with me!

Thanks ~ S

There's no reason to close the post. Let us know how it goes.
If the post auto closes and you want to post an update, send a private message so I can reopen it!

@Luca_Belluccini

The IP address that gets assigned to the Beat name is confusing; I can't find any documentation on what this IP is, and it has absolutely nothing to do with our VPC as far as our IPs go.

2020-04-22T23:03:15.349Z	INFO	[publisher]	pipeline/module.go:110	Beat name: 169.x.x.x

I have created flow logs that capture inbound/outbound traffic on our VPC and filtered them down to two IPs here. One IP I believe to be my Lambda, and the other is the IP of my EC2 instance. As you can see, nothing is getting blocked...

2 097135049942 eni-551xxxxxvpc 172.xx.xx.lambda 172.xx.xx.ec2 44405 9200 6 1 60 1587596578 1587596600 ACCEPT OK
2 097135049942 eni-551xxxxxvpc 172.xx.xx.lambda 172.xx.xx.ec2 7797 9200 6 1 60 1587596578 1587596600 ACCEPT OK
2 097135049942 eni-551xxxxxvpc 172.xx.xx.lambda 172.xx.xx.ec2 46350 9200 6 1 60 1587596578 1587596600 ACCEPT OK
2 097135049942 eni-551xxxxxvpc 172.xx.xx.lambda 172.xx.xx.ec2 3443 9200 6 1 60 1587596578 1587596600 ACCEPT OK
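
(For reference, records like these can be pulled straight from the flow-log log group with the AWS CLI; the log group name below is just a placeholder:)

aws logs filter-log-events --log-group-name my-vpc-flow-logs --filter-pattern '"9200"'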

I have also set up a basic license...

[ec2-user@ip-172-xx-xx-xx ~]$ curl -XGET 'http://172.xx.xx.xx:9200/_license?pretty'
{
  "license" : {
    "status" : "active",
    "uid" : "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
    "type" : "basic",
    "issue_date" : "2020-04-22T18:20:18.697Z",
    "issue_date_in_millis" : 1587579618697,
    "max_nodes" : 1000,
    "issued_to" : "elasticsearch",
    "issuer" : "elasticsearch",
    "start_date_in_millis" : -1
  }
}

Error "no route to host"

I'm receiving some new errors now that indicate "no route to host". The IP in the error is literally the IP of the EC2 instance. I don't understand why the above logs indicated zero blocks on inbound/outbound traffic for our VPC. All three configs (elasticsearch.yml, kibana.yml, and functionbeat.yml) are configured with my EC2 IP.

2020-04-23T00:36:12.165Z	INFO	pipeline/output.go:95	Connecting to backoff(elasticsearch(http://172.xx.xx.ec2:9200))
2020-04-23T00:36:12.165Z	DEBUG	[elasticsearch]	elasticsearch/client.go:733	ES Ping(url=http://172.xx.xx.ec2:9200)
2020-04-23T00:36:12.186Z	DEBUG	[elasticsearch]	elasticsearch/client.go:737	Ping request failed with: Get http://172.xx.xx.ec2:9200: dial tcp 172.xx.xx.ec2:9200: connect: no route to host
2020-04-23T00:36:13.692Z	ERROR	pipeline/output.go:100	Failed to connect to backoff(elasticsearch(http://172.xx.xx.ec2:9200)): Get http://172.xx.xx.ec2:9200: dial tcp 172.xx.xx.ec2:9200: connect: no route to host
2020-04-23T00:36:13.692Z	INFO	pipeline/output.go:93	Attempting to reconnect to backoff(elasticsearch(http://172.xx.xx.ec2:9200)) with 1 reconnect attempt(s)

These logs for the Functionbeat Lambda keep being generated for no apparent reason. I don't understand this; it's as if it keeps retrying and retrying.

Error "failing to receive disposition for aws"

Secondly, the Functionbeat Lambda logs in CloudWatch also contain the following error, which says it fails to collect add_cloud_metadata for provider aws.

2020-04-23T00:36:17.356Z	DEBUG	[filters]	add_cloud_metadata/providers.go:162	add_cloud_metadata: received disposition for aws after 1.011768438s. result=[provider:aws, error=failed requesting aws metadata: Get http://169.254.169.254/2014-02-25/dynamic/instance-identity/document: dial tcp 169.254.169.254:80: connect: connection refused, metadata=
{}
]

However, when I deployed my Functionbeat Lambda via ./functionbeat -v -e -d "*" deploy functionbeat-lambda, it says the complete opposite, indicating that it succeeded in detecting a hosting provider...

2020-04-23T00:32:05.813Z	INFO	add_cloud_metadata/add_cloud_metadata.go:93	add_cloud_metadata: hosting provider type detected as aws, metadata={"account":{"id":"xxxxxxxxxxxxx"},"availability_zone":"us-gov-xxxxx","image":{"id":"ami-6efdxxxx"},"instance":{"id":"i-0c7616cxxxxxx"},"machine":{"type":"t2.medium"},"provider":"aws","region":"us-gov-xxxx"}

The Lambda function installed by Functionbeat must have an IAM role that allows it to access your VPC and reach the IP on which Elasticsearch is running.
Please follow the steps here and ensure the user running the Lambda function has the rights to access your VPC.

Can you try this procedure and test again?

To connect a function to a VPC

  1. Open the Lambda console.

  2. Choose the function deployed by Functionbeat.

  3. Under VPC, choose Edit.

  4. Choose Custom VPC.

  5. Choose a VPC, subnets, and security groups.
    Note
    Connect your function to private subnets to access private resources. If your function needs internet access, use NAT. Connecting a function to a public subnet does not give it internet access or a public IP address.

  6. Choose Save.
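
The same change can be sketched with the AWS CLI, if you prefer (the function name, subnet, and security group IDs below are placeholders matching your earlier config):

aws lambda update-function-configuration \
  --function-name cloudwatch \
  --vpc-config SubnetIds=subnet-xxxxx,SecurityGroupIds=sg-xxxxx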

@Luca_Belluccini

SOLVED

VPC flow logs indicate good traffic all the way to the ENI, and that is the extent of what VPC flow logs show (or so I've been told). Detecting open/closed ports requires looking at the host's firewall service (for me it's "firewalld").

The issue was a bit challenging to trace, mainly because I'm brand new to cloud development and Elasticsearch, but also because I wasn't aware we were even running a firewall. I've always used "UFW", and when I found out it wasn't installed I didn't give much thought to checking whether we were running something else ("firewalld" in this case).

The following commands opened the port and allowed Elasticsearch to ingest the data. All errors in the Lambda disappeared and the connection was established.

sudo firewall-cmd --permanent --add-port=9200/tcp
sudo firewall-cmd --reload
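
(A quick way to confirm the change took effect, using the EC2 IP from earlier:)

sudo firewall-cmd --list-ports        # 9200/tcp should now be listed
curl -m 5 'http://172.xx.xx.xx:9200'  # should return the Elasticsearch banner JSON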

I see the getting-started guide has been updated. One thing the guide doesn't mention, besides the above, is the need to run with single-node discovery, which I'm sure most users following your guide will end up doing. Apparently ES has a development mode and a production mode, and in development mode you're running on a single host not bound to a public IP. With Functionbeat, your ES instance needs to be reachable. The meanings of public/private IP in cloud terms are a bit obfuscated to me, but to allow a single-host node to bind to an external interface I had to turn on single-node discovery as well.

discovery.type: "single-node"

Cheers ~ S

The setting discovery.type is not related to the problem, as it only affects inter-node communication, not HTTP/HTTPS.

The solution was to open port 9200 (HTTP/HTTPS) in the firewall that was blocking it.

