Monitor Bastion Hosts on AWS

Anurag Mewar
6 min read · Feb 12, 2021

Production environments are usually placed inside a Virtual Private Cloud (VPC) and made unreachable from the Internet by allowing no inbound connectivity at all. A bastion host acts as a jump server: it provides access from an external network to systems restricted to the VPC, typically by serving as an SSH tunnel.

By design, a bastion host is a potential backdoor and a critical point of network security, so it must be running ONLY when needed. If you have already configured SSH logging on your bastion host, that’s great!

In this article, we will take a deep dive into monitoring how long the bastion host(s) have been running and sending alerts to a Slack channel.

The Solution Architecture

Monitoring system for a single bastion host
Monitoring system for multiple bastion hosts

To avoid deadlocks, race conditions, and other edge cases, we will programmatically schedule a separate CloudWatch event for each bastion host.

Implementing the Solution

Now that we know the architecture, let’s take a closer look at the nuts and bolts of setting it up in an AWS environment.

  • CloudWatch Rule1:

This rule uses an event pattern where the service name is EC2 and the event type is EC2 Instance State-change Notification. The states matched are stopped and running, and instance-id lists the instance IDs of the bastion host(s).
When a bastion host enters the stopped or running state, this rule triggers Lambda1.

Event pattern used:

{
  "source": ["aws.ec2"],
  "detail-type": ["EC2 Instance State-change Notification"],
  "detail": {
    "state": ["stopped", "running"],
    "instance-id": [<Bastion host instance IDs>]
  }
}
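As a rough illustration, the same pattern could be built and registered from Python. This is only a sketch: the rule name `bastion-state-change` and the helper names are hypothetical, and `events_client` is assumed to be a `boto3.client("events")` object passed in by the caller.

```python
import json


def build_state_change_pattern(instance_ids):
    """Return the Rule1 event pattern as a JSON string."""
    pattern = {
        "source": ["aws.ec2"],
        "detail-type": ["EC2 Instance State-change Notification"],
        "detail": {
            "state": ["stopped", "running"],
            "instance-id": list(instance_ids),
        },
    }
    return json.dumps(pattern)


def create_rule1(events_client, instance_ids, rule_name="bastion-state-change"):
    """Create or update Rule1 (rule_name is a hypothetical default)."""
    return events_client.put_rule(
        Name=rule_name,
        EventPattern=build_state_change_pattern(instance_ids),
        State="ENABLED",
    )
```

The rule can just as well be created from the console; the point here is only the shape of the pattern.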

Lambda1 is triggered when a bastion host enters the stopped or running state. The schedule for the instance-specific CloudWatch rule (created programmatically) is calculated based on the state of the instance.

The instance ID is fetched from the event object to determine which bastion instance triggered the function. EC2, Lambda, and CloudWatch Events clients are created using Boto3, and the current state of the bastion host is stored in a variable called stateCurrent.
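A minimal sketch of that first step, assuming the state is read straight from the state-change event (the actual function may query EC2 instead). The helper name `parse_state_change` and the sample instance ID are made up for illustration.

```python
def parse_state_change(event):
    """Return (instance_id, state) from an EC2 state-change event."""
    detail = event["detail"]
    return detail["instance-id"], detail["state"]


# Trimmed-down example of the event that Rule1 delivers to Lambda1.
sample_event = {
    "source": "aws.ec2",
    "detail-type": "EC2 Instance State-change Notification",
    "detail": {"instance-id": "i-0123456789abcdef0", "state": "running"},
}

instance_id, stateCurrent = parse_state_change(sample_event)
```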

The EC2 service stores a LaunchTime value for each instance, which can be retrieved with a DescribeInstances call. Note that if we stop the instance and then start it again, this value is updated with the new launch time.
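For reference, reading the state and LaunchTime of one instance might look like the sketch below. The helper name is hypothetical, and `ec2_client` is assumed to be a `boto3.client("ec2")` object supplied by the caller.

```python
def get_state_and_launch_time(ec2_client, instance_id):
    """Return (state_name, launch_time) for a single instance."""
    response = ec2_client.describe_instances(InstanceIds=[instance_id])
    # With exactly one instance ID requested, the instance is the first
    # entry of the first reservation in the response.
    instance = response["Reservations"][0]["Instances"][0]
    return instance["State"]["Name"], instance["LaunchTime"]
```

Boto3 returns LaunchTime as a timezone-aware datetime, so it can be subtracted directly from a timezone-aware current time.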

If the bastion host is running:
The current system time is fetched and assigned to a variable called currentTime. Based on the defined threshold (for example, t+2 days), the schedule is calculated for CloudWatch Rule2. The instance-specific CloudWatch rule is given a unique name and created using put_rule(). Lambda2 is then added as a target of CloudWatch Rule2 using put_targets(), with the instance ID configured as the payload that Rule2 passes to Lambda2. Finally, a Slack alert is sent using urllib3.
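One way to express "fire once, two days from now" is a cron expression pinned to a single minute of a single year. The sketch below takes that approach, which is an assumption and not necessarily how the original function computes its schedule. The helper names, the target ID `lambda2`, and the `input_key` parameter are hypothetical; `events_client` is assumed to be a `boto3.client("events")` object.

```python
import json
from datetime import datetime, timedelta, timezone


def one_time_cron(fire_at):
    """Format a timezone-aware datetime as a one-off cron expression (UTC)."""
    fire_at = fire_at.astimezone(timezone.utc)
    # Field order: minutes hours day-of-month month day-of-week year
    return "cron({} {} {} {} ? {})".format(
        fire_at.minute, fire_at.hour, fire_at.day, fire_at.month, fire_at.year
    )


def schedule_rule2(events_client, instance_id, lambda2_arn, input_key,
                   currentTime, threshold=timedelta(days=2), suffix="uniquename"):
    """Create the instance-specific rule and attach Lambda2 as its target."""
    rule_name = "{}-{}".format(instance_id, suffix)
    events_client.put_rule(
        Name=rule_name,
        ScheduleExpression=one_time_cron(currentTime + threshold),
        State="ENABLED",
    )
    events_client.put_targets(
        Rule=rule_name,
        Targets=[{
            "Id": "lambda2",  # hypothetical target ID
            "Arn": lambda2_arn,
            # Constant JSON input that the rule hands to Lambda2.
            "Input": json.dumps({input_key: instance_id}),
        }],
    )
    return rule_name
```

Because cron expressions have one-minute resolution, the seconds are dropped and the rule can fire slightly before the full threshold has elapsed. The rule also needs permission to invoke Lambda2, which is not shown here.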

If the bastion host is in a stopped state:
The instance state is sent as a Slack alert.

Lambda1
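
Sending the Slack alert with urllib3 could look roughly like this. It assumes a Slack incoming webhook, which accepts a JSON body with a `text` field; the message wording and helper names are made up for illustration, and the webhook URL has to be supplied by the caller.

```python
import json


def build_slack_payload(instance_id, state):
    """Return the JSON body for the alert, encoded as bytes."""
    text = "Bastion host {} is now {}.".format(instance_id, state)
    return json.dumps({"text": text}).encode("utf-8")


def send_slack_alert(webhook_url, payload):
    """POST the payload to the webhook and return the HTTP status code."""
    import urllib3  # imported here so the payload helper has no dependency on it

    http = urllib3.PoolManager()
    response = http.request(
        "POST",
        webhook_url,
        body=payload,
        headers={"Content-Type": "application/json"},
    )
    return response.status
```
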
  • CloudWatch Rule2:

It is a scheduled event rule whose schedule is calculated from the threshold defined in Lambda1.
When Lambda2 is invoked by CloudWatch Rule2, the function receives the instance ID (via the input JSON stored in the event rule) of the bastion host that triggered CloudWatch Rule1.

Each instance has its own CloudWatch Rule2 with a unique name: <instanceID>-uniquename

Input JSON stored in the event rule: {"instanceIdLambda<number>": "<instanceID>"}

Lambda2 determines the current state (using the EC2 client created from Boto3) of the instance whose ID was fetched from the event object. Based on that state, the total running time (time since launch) is calculated. If the running time exceeds the threshold value, a Slack alert is sent.

The launch time of the instance is determined and stored in a variable called launchTime. The current system time is stored in a variable called currentTime.

If the bastion host is running:

The total running time of the instance is calculated by subtracting launchTime from currentTime, and the result is stored in a variable called differenceTime.

(Suppose we want to trigger the alert when the total running time exceeds 2 days.)

If differenceTime, expressed in days as differenceTimeDays, is greater than 1.9, a Slack alert is sent using urllib3. The threshold is set slightly below 2 to cover the case where the rule fires a little early and the running time is, for example, 1.99998 days.
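The running-time check can be sketched as below. The variable names follow the text; the function names and the `THRESHOLD_DAYS` constant are hypothetical, and both timestamps are assumed to be timezone-aware datetimes.

```python
from datetime import datetime, timedelta, timezone

THRESHOLD_DAYS = 1.9  # slightly under 2 days, to tolerate an early trigger


def running_days(launchTime, currentTime):
    """Return the running time in days as a float."""
    differenceTime = currentTime - launchTime
    return differenceTime.total_seconds() / 86400


def exceeds_threshold(launchTime, currentTime, threshold_days=THRESHOLD_DAYS):
    """Return True when the instance has been running longer than the threshold."""
    differenceTimeDays = running_days(launchTime, currentTime)
    return differenceTimeDays > threshold_days
```

Using total_seconds() instead of the timedelta's days attribute keeps the fractional part, so a running time just under two days is not rounded down to 1.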

Make sure all the triggers (the instance-specific CloudWatch rules) of the Lambda2 function are enabled.

Lambda2

If you would rather send daily alerts with the total running time of each bastion host:

We will write another Lambda function (invoked daily) which will monitor:

  1. The total running time (time since launch) of each bastion host
  2. Whether the number of bastion hosts has changed

  • CloudWatch Rule3:

This scheduled event triggers Lambda3 every day at a specified time.

Lambda3 monitors whether the number of bastion hosts has changed. Any such integrity violation, along with the total running time (time since launch) of each bastion instance, is sent as a Slack alert.

An EC2 resource and client are created using Boto3, and the current state of the instance is determined. The launch time of the instance is stored in a variable called launchTime, and the current system time is stored in a variable called currentTime.

If the bastion host is running:
The total running time of the bastion host is determined using the same method as in Lambda2. The number of instances with isBastion : True is stored in a counter variable and compared with the known number of bastion hosts.

  • If there is a change in the number of bastion hosts:
    The integrity violation Slack alert is sent along with the total running time of the bastion instances.
  • If there is no change in the number of bastion hosts:
    The total running time of each bastion instance is sent as a Slack alert.

If the bastion host is not running:
The number of instances with isBastion : True is stored in a counter variable and compared with the known number of bastion instances.

  • If there is a change in the number of bastion hosts:
    The integrity violation Slack alert is sent.

Lambda3
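
The daily check could be sketched as follows. It assumes that isBastion : True is an EC2 tag (key isBastion, value True) and that terminated instances should not be counted; both are assumptions, as are the function name and the shape of the returned summary. `ec2_client` is assumed to be a `boto3.client("ec2")` object.

```python
def check_bastions(ec2_client, expected_count, currentTime):
    """Count tagged bastion hosts and compute running time for the running ones."""
    paginator = ec2_client.get_paginator("describe_instances")
    pages = paginator.paginate(
        Filters=[{"Name": "tag:isBastion", "Values": ["True"]}]
    )
    counter = 0
    running_days = {}
    for page in pages:
        for reservation in page["Reservations"]:
            for instance in reservation["Instances"]:
                state = instance["State"]["Name"]
                if state == "terminated":
                    # Assumption: terminated hosts no longer count as bastions.
                    continue
                counter += 1
                if state == "running":
                    differenceTime = currentTime - instance["LaunchTime"]
                    running_days[instance["InstanceId"]] = (
                        differenceTime.total_seconds() / 86400
                    )
    return {
        "count": counter,
        "integrity_violation": counter != expected_count,
        "running_days": running_days,
    }
```

The returned summary can then be turned into a single Slack message: the integrity violation if the count changed, plus the running time of each running bastion host.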

GitHub Repo: https://github.com/5H4D0W-R007/Monitor-Bastion-Hosts
