Introduction
In this article I will demonstrate how to use Python with Boto3, the Amazon Web Services (AWS) Software Development Kit (SDK) for Python, which lets anyone comfortable with Python work with the intricate AWS REST APIs to manage their cloud resources. Because the AWS REST API and its associated cloud services are so vast, I will focus only on the AWS Elastic Compute Cloud (EC2) service.
Here are the topics I will be covering:
- Starting an EC2 instance
- Stopping an EC2 instance
- Terminating an EC2 instance
- Backing up an EC2 instance by creating an image
- Creating an EC2 instance from an image
- Scheduling backup and clean up using cron on a server and AWS Lambda
Dependencies and Environment Setup
To start I will need to create a user in my AWS account that has programmatic access to the REST APIs. For simplicity I will be granting this user admin rights, but please note that is only for simplicity in creating this tutorial. If you are following along you should consult your organization's IT security policies before using this user in a production environment.
Step 1: In my AWS console I go to the IAM section under the Services menu, click the Users link, and finally click the Add user button, which takes me to the screen shown below. In this screen I give the user the name boto3-user and check the box for Programmatic access before clicking the next button.
Step 2: In the permissions screen I click the Attach existing policies directly tile and then select the checkbox for AdministratorAccess before clicking next as shown below.
Step 3: Click through to next since I am not adding any optional tags.
Step 4: I review the user about to be created and then click Create user.
Step 5: Finally, I download credentials as a CSV file and save them.
Next up I need to install the necessary Python 3 libraries locally within a virtual environment, like so:
$ python -m venv venv
$ source venv/bin/activate
(venv)$ pip install boto3 awscli
Lastly, I configure the credentials for the boto3 library using the awscli library, making sure to add the Access Key and Secret Key I downloaded in step 5 above.
$ aws configure
AWS Access Key ID [****************3XRQ]: **************
AWS Secret Access Key [****************UKjF]: ****************
Default region name [None]:
Default output format [None]:
Creating an EC2 Instance to Work On
In this section I am going to go over how to create an AWS region-specific boto3 session as well as instantiate an EC2 client using the active session object. Then, using that EC2 boto3 client, I will interact with that region's EC2 instances, managing startup, shutdown, and termination.
To create an EC2 instance for this article I take the following steps:
Step 1: I click the EC2 link within the Services menu to open the EC2 Dashboard and then click the Launch Instance button in the middle of the screen.
Step 2: In the Choose Amazon Machine Image (AMI) page I click the Select button next to the Amazon Linux AMI.
Step 3: Accept the default t2.micro instance type and click the Review and Launch button.
Step 4: On the review page I expand the Tags section and click Edit Tags to add tags for Name and BackUp, then click Review and Launch again to go back to the review page before finally clicking the Launch button to launch the instance.
I now have a running EC2 instance, as shown below.
boto3 Session and Client
At last, I can get into writing some code! I begin by creating an empty file, a Python module, called awsutils.py and at the top I import the boto3 library, then define a function that will create a region-specific Session object.
# awsutils.py
import boto3

def get_session(region):
    return boto3.session.Session(region_name=region)
If I fire up my Python interpreter and import the module just created above I can use the new get_session function to create a session in the same region as my EC2 instance, then instantiate an EC2.Client object from it, like so:
>>> import awsutils
>>> session = awsutils.get_session('us-east-1')
>>> client = session.client('ec2')
I can then use this EC2 client object to get a detailed description of the instance, using pprint to make the output of calling describe_instances on the client object a little easier to read.
>>> import pprint
>>> pprint.pprint(client.describe_instances())
...
I am omitting the output as it is quite verbose, but know that it contains a dictionary with a Reservations entry, which is a list of data describing the EC2 instances in that region, and a ResponseMetadata entry about the request that was just made to the AWS REST API.
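Because the instances sit inside the Reservations list, a small helper can flatten the response into something easier to scan. The following is only a sketch: summarize_instances is a hypothetical name of my own, and sample_response is a trimmed, hand-written stand-in for the real (much larger) response.

```python
def summarize_instances(response):
    """Flatten a describe_instances response into (instance_id, state) pairs."""
    summary = []
    for reservation in response.get('Reservations', []):
        for instance in reservation.get('Instances', []):
            summary.append((instance['InstanceId'], instance['State']['Name']))
    return summary


# Hand-written stand-in for a real response; only the keys used above are kept.
sample_response = {
    'Reservations': [
        {'Instances': [{'InstanceId': 'i-0c462c48bc396bdbb',
                        'State': {'Code': 16, 'Name': 'running'}}]},
    ],
    'ResponseMetadata': {'HTTPStatusCode': 200},
}

print(summarize_instances(sample_response))
```

With a live client the same helper would be called as summarize_instances(client.describe_instances()).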
Retrieving EC2 Instance Details
I can also use this same describe_instances method along with a Filters parameter to filter the selection by tag values. For example, if I want to get my recently created instance that has a Name tag with a value of demo-instance, that would look like this:
>>> demo = client.describe_instances(Filters=[{'Name': 'tag:Name', 'Values': ['demo-instance']}])
>>> pprint.pprint(demo)
...
There are many ways to filter the output of describe_instances and I refer you to the official docs for the details.
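Since the filter dictionaries are a little noisy to type by hand, here is a minimal sketch of a helper that builds them from keyword arguments. The name tag_filters is hypothetical and not part of boto3. As I understand the EC2 API, separate filters are combined with AND, while multiple values inside one filter are combined with OR.

```python
def tag_filters(**tags):
    """Build a Filters list matching resources that carry every given tag."""
    return [{'Name': f'tag:{key}', 'Values': [value]}
            for key, value in tags.items()]


filters = tag_filters(Name='demo-instance')
print(filters)
# With a real client this would be passed along as:
#   client.describe_instances(Filters=filters)
```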
Starting and Stopping an EC2 Instance
To stop the demo-instance, I use the stop_instances method of the client object, which I previously instantiated, supplying it the instance ID as a single-entry list to the InstanceIds argument as shown below:
>>> instance_id = demo['Reservations'][0]['Instances'][0]['InstanceId']
>>> instance_id
'i-0c462c48bc396bdbb'
>>> pprint.pprint(client.stop_instances(InstanceIds=[instance_id]))
{'ResponseMetadata': {'HTTPHeaders': {'content-length': '579',
                                      'content-type': 'text/xml;charset=UTF-8',
                                      'date': 'Sat, 22 Dec 2018 19:26:30 GMT',
                                      'server': 'AmazonEC2'},
                      'HTTPStatusCode': 200,
                      'RequestId': 'e04a4a64-74e4-442f-8293-261f2ca9433d',
                      'RetryAttempts': 0},
 'StoppingInstances': [{'CurrentState': {'Code': 64, 'Name': 'stopping'},
                        'InstanceId': 'i-0c462c48bc396bdbb',
                        'PreviousState': {'Code': 16, 'Name': 'running'}}]}
The output from the last command indicates that the method call is stopping the instance. If I re-retrieve the demo-instance and print the State I now see it is stopped.
>>> demo = client.describe_instances(Filters=[{'Name': 'tag:Name', 'Values': ['demo-instance']}])
>>> demo['Reservations'][0]['Instances'][0]['State']
{'Code': 80, 'Name': 'stopped'}
To start the same instance back up, there is a complementary method called start_instances that works similarly to the stop_instances method, which I demonstrate next.
>>> pprint.pprint(client.start_instances(InstanceIds=[instance_id]))
{'ResponseMetadata': {'HTTPHeaders': {'content-length': '579',
                                      'content-type': 'text/xml;charset=UTF-8',
                                      'date': 'Sat, 22 Dec 2018 19:37:02 GMT',
                                      'server': 'AmazonEC2'},
                      'HTTPStatusCode': 200,
                      'RequestId': '21c65902-6665-4137-9023-43ac89f731d9',
                      'RetryAttempts': 0},
 'StartingInstances': [{'CurrentState': {'Code': 0, 'Name': 'pending'},
                        'InstanceId': 'i-0c462c48bc396bdbb',
                        'PreviousState': {'Code': 80, 'Name': 'stopped'}}]}
The immediate output of the command is that it is pending startup. Now when I re-fetch the instance and print its state it shows that it is running again.
>>> demo = client.describe_instances(Filters=[{'Name': 'tag:Name', 'Values': ['demo-instance']}])
>>> demo['Reservations'][0]['Instances'][0]['State']
{'Code': 16, 'Name': 'running'}
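The state checks above only show the final state once the transition has finished, which in the interpreter means waiting and re-querying by hand. In a script, boto3 waiters can do that polling for me. Below is a minimal sketch using the instance_stopped and instance_running waiters; the function names stop_and_wait and start_and_wait are my own, and the client argument is assumed to be the EC2 client created earlier.

```python
def stop_and_wait(client, instance_id):
    """Stop an instance and block until EC2 reports it as stopped."""
    client.stop_instances(InstanceIds=[instance_id])
    # The waiter polls describe_instances until the target state is reached.
    client.get_waiter('instance_stopped').wait(InstanceIds=[instance_id])


def start_and_wait(client, instance_id):
    """Start an instance and block until EC2 reports it as running."""
    client.start_instances(InstanceIds=[instance_id])
    client.get_waiter('instance_running').wait(InstanceIds=[instance_id])
```

Calling stop_and_wait(client, instance_id) returns only once the instance has stopped, so a following describe_instances call should report the stopped state.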
Alternative Approach to Fetching, Starting, and Stopping
In addition to the EC2.Client class that I've been working with thus far, there is also an EC2.Instance class that is useful in cases such as this one where I only need to be concerned with one instance at a time.
Below I use the previously generated session object to get an EC2 resource object, which I can then use to retrieve and instantiate an Instance object for my demo-instance.
>>> ec2 = session.resource('ec2')
>>> instance = ec2.Instance(instance_id)
In my opinion, a major benefit to using the Instance class is that you are then working with actual objects instead of a point-in-time dictionary representation of the instance, but you lose the power of being able to perform actions on multiple instances at once that the EC2.Client class provides.
For example, to see the state of the demo-instance I just instantiated above, it is as simple as this:
>>> instance.state
{'Code': 16, 'Name': 'running'}
The Instance class has many useful methods, two of which are start and stop, which I will use to start and stop my instance, like so:
>>> pprint.pprint(instance.stop())
{'ResponseMetadata': {'HTTPHeaders': {'content-length': '579',
                                      'content-type': 'text/xml;charset=UTF-8',
                                      'date': 'Sat, 22 Dec 2018 19:58:25 GMT',
                                      'server': 'AmazonEC2'},
                      'HTTPStatusCode': 200,
                      'RequestId': 'a2f76028-cbd2-4727-be3e-ae832b12e1ff',
                      'RetryAttempts': 0},
 'StoppingInstances': [{'CurrentState': {'Code': 64, 'Name': 'stopping'},
                        'InstanceId': 'i-0c462c48bc396bdbb',
                        'PreviousState': {'Code': 16, 'Name': 'running'}}]}
After waiting about a minute for it to fully stop... I then check the state again:
>>> instance.state
{'Code': 80, 'Name': 'stopped'}
Now I can start it up again.
>>> pprint.pprint(instance.start())
{'ResponseMetadata': {'HTTPHeaders': {'content-length': '579',
                                      'content-type': 'text/xml;charset=UTF-8',
                                      'date': 'Sat, 22 Dec 2018 20:01:01 GMT',
                                      'server': 'AmazonEC2'},
                      'HTTPStatusCode': 200,
                      'RequestId': '3cfc6061-5d64-4e52-9961-5eb2fefab2d8',
                      'RetryAttempts': 0},
 'StartingInstances': [{'CurrentState': {'Code': 0, 'Name': 'pending'},
                        'InstanceId': 'i-0c462c48bc396bdbb',
                        'PreviousState': {'Code': 80, 'Name': 'stopped'}}]}
Then checking the state again after a short while...
>>> instance.state
{'Code': 16, 'Name': 'running'}
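Rather than waiting "about a minute" by hand, the Instance resource offers wait_until_stopped and wait_until_running. The sketch below wraps the stop case; stop_and_confirm is a hypothetical name of mine. Resource attributes are loaded lazily and then cached, so I call reload() explicitly before reading state to be sure the value is fresh, even though that may be redundant in some cases.

```python
def stop_and_confirm(instance):
    """Stop an EC2.Instance resource and return its refreshed state name."""
    instance.stop()
    instance.wait_until_stopped()  # polls until the instance is stopped
    instance.reload()              # refresh cached attributes such as state
    return instance.state['Name']
```

With the instance object from above, stop_and_confirm(instance) should return 'stopped' once the wait completes.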
Creating a Backup Image of an EC2.Instance
An important topic in server management is creating backups to fall back on in the event a server becomes corrupted. In this section I am going to demonstrate how to create an Amazon Machine Image (AMI) backup of my demo-instance, which AWS will then store in its Simple Storage Service (S3). This can later be used to recreate that EC2 instance, just like how I used the initial AMI to create the demo-instance.
To start I will show how to use the EC2.Client class and its create_image method to create an AMI image of the demo-instance by providing the instance ID and a descriptive name for the image.
>>> import datetime
>>> date = datetime.datetime.utcnow().strftime('%Y%m%d')
>>> date
'20181221'
>>> name = f"InstanceID_{instance_id}_Image_Backup_{date}"
>>> name
'InstanceID_i-0c462c48bc396bdbb_Image_Backup_20181221'
>>> name = f"InstanceID_{instance_id}_Backup_Image_{date}"
>>> name
'InstanceID_i-0c462c48bc396bdbb_Backup_Image_20181221'
>>> pprint.pprint(client.create_image(InstanceId=instance_id, Name=name))
{'ImageId': 'ami-00d7c04e2b3b28e2d',
 'ResponseMetadata': {'HTTPHeaders': {'content-length': '242',
                                      'content-type': 'text/xml;charset=UTF-8',
                                      'date': 'Sat, 22 Dec 2018 20:13:55 GMT',
                                      'server': 'AmazonEC2'},
                      'HTTPStatusCode': 200,
                      'RequestId': '7ccccb1e-91ff-4753-8fc4-b27cf43bb8cf',
                      'RetryAttempts': 0}}
Similarly, I can use the Instance class's create_image method to accomplish the same task, which returns an instance of an EC2.Image class that is similar to the EC2.Instance class.
>>> image = instance.create_image(Name=name + '_2')
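The naming and image creation steps above can be folded into a pair of small functions. This is a sketch under a few assumptions: backup_image_name and create_backup_image are hypothetical names, I use a timezone-aware datetime.now(timezone.utc) because datetime.utcnow() is deprecated as of Python 3.12, and I wait on the image_available waiter so the function returns only once the AMI is usable.

```python
from datetime import datetime, timezone


def backup_image_name(instance_id, when=None):
    """Build an image name such as InstanceID_<id>_Backup_Image_<YYYYMMDD>."""
    when = when or datetime.now(timezone.utc)
    return f"InstanceID_{instance_id}_Backup_Image_{when.strftime('%Y%m%d')}"


def create_backup_image(client, instance_id, when=None):
    """Create an AMI of the instance and wait until it becomes available."""
    name = backup_image_name(instance_id, when)
    image_id = client.create_image(InstanceId=instance_id, Name=name)['ImageId']
    client.get_waiter('image_available').wait(ImageIds=[image_id])
    return image_id


print(backup_image_name('i-0c462c48bc396bdbb',
                        datetime(2018, 12, 21, tzinfo=timezone.utc)))
```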
Tagging Images and EC2 Instances
A very powerful, yet extremely simple, feature of EC2 instances and AMI images is the ability to add custom tags. You can add tags both via the AWS management console, as I showed when creating the demo-instance with tags Name and BackUp, as well as programmatically with boto3 and the AWS REST API.
Since I have an EC2.Instance object still floating around in memory in my Python interpreter I will use that to display the demo-instance tags.
>>> instance.tags
[{'Key': 'BackUp', 'Value': ''}, {'Key': 'Name', 'Value': 'demo-instance'}]
Both the EC2.Instance and the EC2.Image classes have an identically functioning create_tags method for adding tags to their represented resources. Below I demonstrate adding a RemoveOn tag to the image created previously, which is paired with a date at which it should be removed. The date format used is "YYYYMMDD".
>>> remove_on = '20181222'
>>> image.create_tags(Tags=[{'Key': 'RemoveOn', 'Value': remove_on}])
[ec2.Tag(resource_id='ami-081c72fa60c8e2d58', key='RemoveOn', value='20181222')]
Again, the same can be accomplished with the EC2.Client class by providing a list of resource IDs, but with the client you can tag both images and EC2 instances at the same time if you desire by specifying their IDs in the Resources parameter of the create_tags function, like so:
>>> pprint.pprint(client.create_tags(Resources=['ami-00d7c04e2b3b28e2d'], Tags=[{'Key': 'RemoveOn', 'Value': remove_on}]))
{'ResponseMetadata': {'HTTPHeaders': {'content-length': '221',
                                      'content-type': 'text/xml;charset=UTF-8',
                                      'date': 'Sat, 22 Dec 2018 20:52:39 GMT',
                                      'server': 'AmazonEC2'},
                      'HTTPStatusCode': 200,
                      'RequestId': '645b733a-138c-42a1-9966-5c2eb0ca3ba3',
                      'RetryAttempts': 0}}
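The RemoveOn value does not have to be typed by hand. A minimal sketch of computing it from the current UTC date follows; remove_on_tags is a hypothetical helper, and the default of three days is just an illustrative retention period.

```python
from datetime import datetime, timedelta, timezone


def remove_on_tags(days=3, now=None):
    """Return a Tags list holding a RemoveOn date `days` from now (YYYYMMDD)."""
    now = now or datetime.now(timezone.utc)
    remove_on = (now + timedelta(days=days)).strftime('%Y%m%d')
    return [{'Key': 'RemoveOn', 'Value': remove_on}]


print(remove_on_tags(days=1, now=datetime(2018, 12, 21, tzinfo=timezone.utc)))
# With the image object from above: image.create_tags(Tags=remove_on_tags())
```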
Creating an EC2 Instance from a Backup Image
I would like to start this section by giving you something to think about. Put yourself in the uncomfortable mindset of a system administrator, or even worse a developer pretending to be a sysadmin because the product they are working on doesn't have one (admission... that's me), and one of your EC2 servers has become corrupted.
Eeek! It's scramble time... you now need to figure out what OS type, size, and services are running on the down server... fumble through setup and installation of the base server, plus any apps that belong on it, and pray everything comes up correctly.
Whew! Take a breath and chill because I'm about to show you how to quickly get back up and running, plus... spoiler alert... I am going to pull these one-off Python interpreter commands into a workable set of scripts at the end for you to further modify and put to use.
Ok, with that mental exercise out of the way let me get back to work. To create an EC2 instance from an image ID I use the EC2.Client class's run_instances method and specify the number of instances to kick off and the type of instance to run.
>>> pprint.pprint(client.run_instances(ImageId='ami-081c72fa60c8e2d58', MinCount=1, MaxCount=1, InstanceType='t2.micro'))
...
I am omitting the output again due to its verbosity. Please have a look at the official docs for the run_instances method, as there are a lot of parameters to choose from to customize exactly how to run the instance.
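One of those parameters is TagSpecifications, which lets the restored instance be tagged at launch time instead of in a separate call. Here is a hedged sketch; launch_from_image is a hypothetical wrapper of my own, and the t2.micro default simply mirrors the instance type used in this article.

```python
def launch_from_image(client, image_id, name, instance_type='t2.micro'):
    """Launch one instance from an AMI, tag it with a Name, and return its ID."""
    response = client.run_instances(
        ImageId=image_id,
        MinCount=1,
        MaxCount=1,
        InstanceType=instance_type,
        TagSpecifications=[{
            'ResourceType': 'instance',
            'Tags': [{'Key': 'Name', 'Value': name}],
        }],
    )
    return response['Instances'][0]['InstanceId']
```

For example, launch_from_image(client, 'ami-081c72fa60c8e2d58', 'demo-instance-restored') would return the ID of the new instance, where the Name value is just an example.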
Removing Backup Images
Ideally, I would be making backup images on a fairly frequent interval (i.e., daily at the least), and along with all these backups come three things, one of which is quite good and the other two somewhat problematic. On the good side of things I am making snapshots of known states of my EC2 server, which gives me a point in time to fall back to if things go bad. However, on the bad side I am creating clutter in my account and racking up storage charges with each additional backup I put into storage.
A way to mitigate the downsides of clutter and rising storage charges is to remove backup images after a predetermined amount of time has elapsed, and that is where the tags I created earlier are going to save me. I can query my EC2 backup images, locate the ones that have a particular RemoveOn tag, and then remove them.
I can begin by using the describe_images method on the EC2.Client class instance along with a filter for the 'RemoveOn' tag to get all images that I tagged to remove on a given date.
>>> remove_on = '20181222'
>>> images = client.describe_images(Filters=[{'Name': 'tag:RemoveOn', 'Values': [remove_on]}])
Next up I iterate over all the images and call the client method deregister_image, passing it the iterated image ID and voilà - no more image.
>>> remove_on = '20181222'
>>> for img in images['Images']:
...     client.deregister_image(ImageId=img['ImageId'])
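Matching on an exact RemoveOn date means an image is missed if the cleanup does not run that day. Because YYYYMMDD strings sort chronologically, a sketch of a more forgiving selection is to fetch every image that has a RemoveOn tag and keep those dated today or earlier. The helper names below are hypothetical, and sample_images is hand-written data shaped like the Images list of a describe_images response. As far as I know, deregistering an EBS-backed AMI leaves its snapshots behind, so the sketch also collects the snapshot IDs that could then be passed to client.delete_snapshot.

```python
def tag_value(image, key):
    """Return the value of the tag `key` on an image description, or None."""
    for tag in image.get('Tags', []):
        if tag['Key'] == key:
            return tag['Value']
    return None


def expired_image_ids(images, today):
    """IDs of images whose RemoveOn date (YYYYMMDD) is today or earlier."""
    expired = []
    for image in images:
        remove_on = tag_value(image, 'RemoveOn')
        if remove_on is not None and remove_on <= today:
            expired.append(image['ImageId'])
    return expired


def snapshot_ids(image):
    """Snapshot IDs referenced by an image's EBS block device mappings."""
    return [mapping['Ebs']['SnapshotId']
            for mapping in image.get('BlockDeviceMappings', [])
            if 'SnapshotId' in mapping.get('Ebs', {})]


# Hand-written sample data; the snapshot ID is made up for illustration.
sample_images = [
    {'ImageId': 'ami-00d7c04e2b3b28e2d',
     'Tags': [{'Key': 'RemoveOn', 'Value': '20181222'}],
     'BlockDeviceMappings': [{'Ebs': {'SnapshotId': 'snap-0123456789abcdef0'}}]},
    {'ImageId': 'ami-081c72fa60c8e2d58',
     'Tags': [{'Key': 'RemoveOn', 'Value': '20181225'}]},
]

print(expired_image_ids(sample_images, today='20181223'))
print(snapshot_ids(sample_images[0]))
```

With a real client, the images could be fetched with client.describe_images(Owners=['self'], Filters=[{'Name': 'tag-key', 'Values': ['RemoveOn']}])['Images'].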
Terminating an EC2 Instance
Well, having covered starting, stopping, creating and removing backup images, and launching an EC2 instance from a backup image, I am nearing the end of this tutorial. Now all that is left to do is clean up my demo instances by calling the EC2.Client class's terminate_instances method and passing in the instance IDs to terminate. Again, I will use describe_instances with a filter for the name of demo-instance to fetch its details and grab its instance ID. I can then use it with terminate_instances to get rid of it forever.
Note: Yes, this is a forever thing so be very careful with this method.
>>> demo = client.describe_instances(Filters=[{'Name': 'tag:Name', 'Values': ['demo-instance']}])
>>> instance_id = demo['Reservations'][0]['Instances'][0]['InstanceId']
>>> pprint.pprint(client.terminate_instances(InstanceIds=[instance_id]))
{'ResponseMetadata': {'HTTPHeaders': {'content-type': 'text/xml;charset=UTF-8',
                                      'date': 'Sat, 22 Dec 2018 22:14:20 GMT',
                                      'server': 'AmazonEC2',
                                      'transfer-encoding': 'chunked',
                                      'vary': 'Accept-Encoding'},
                      'HTTPStatusCode': 200,
                      'RequestId': '78881a08-0240-47df-b502-61a706bfb3ab',
                      'RetryAttempts': 0},
 'TerminatingInstances': [{'CurrentState': {'Code': 32,
                                            'Name': 'shutting-down'},
                           'InstanceId': 'i-0c462c48bc396bdbb',
                           'PreviousState': {'Code': 16, 'Name': 'running'}}]}
Pulling Things Together for an Automation Script
Now that I have walked through this functionality by issuing commands one by one in the Python interpreter (which I highly recommend readers do at least once on their own to experiment with things), I will pull everything together into two separate scripts called ec2backup.py and amicleanup.py.
The ec2backup.py script will simply query all available EC2 instances that have the tag BackUp, then create a backup AMI image for each one while tagging it with a RemoveOn tag with a value of 3 days into the future.
# ec2backup.py
from datetime import datetime, timedelta
import awsutils

def backup(region_id='us-east-1'):
    '''This method searches for all EC2 instances with a tag of BackUp
    and creates backup images of them, then tags the images with a
    RemoveOn tag of a YYYYMMDD value of three UTC days from now
    '''
    created_on = datetime.utcnow().strftime('%Y%m%d')
    remove_on = (datetime.utcnow() + timedelta(days=3)).strftime('%Y%m%d')
    session = awsutils.get_session(region_id)
    client = session.client('ec2')
    resource = session.resource('ec2')
    # tag-key matches any instance that has a BackUp tag, whatever its value
    reservations = client.describe_instances(Filters=[{'Name': 'tag-key', 'Values': ['BackUp']}])
    for reservation in reservations['Reservations']:
        for instance_description in reservation['Instances']:
            instance_id = instance_description['InstanceId']
            name = f"InstanceId({instance_id})_CreatedOn({created_on})_RemoveOn({remove_on})"
            print(f"Creating Backup: {name}")
            image_description = client.create_image(InstanceId=instance_id, Name=name)
            image = resource.Image(image_description['ImageId'])
            image.create_tags(Tags=[{'Key': 'RemoveOn', 'Value': remove_on}, {'Key': 'Name', 'Value': name}])

if __name__ == '__main__':
    backup()
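A single describe_instances call may not return every instance in an account with many of them, since results can be paginated. A minimal sketch of collecting the instance IDs through a boto3 paginator is shown below; backup_instance_ids is a hypothetical helper and not part of the script above.

```python
def backup_instance_ids(client, tag_key='BackUp'):
    """Yield the ID of every instance carrying the given tag key, page by page."""
    paginator = client.get_paginator('describe_instances')
    pages = paginator.paginate(
        Filters=[{'Name': 'tag-key', 'Values': [tag_key]}])
    for page in pages:
        for reservation in page['Reservations']:
            for instance in reservation['Instances']:
                yield instance['InstanceId']
```

In the backup function this could replace the nested loops with: for instance_id in backup_instance_ids(client).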
Next up is the amicleanup.py script, which queries all AMI images that have a RemoveOn tag equal to the date it is run on, in the form "YYYYMMDD", and removes them.
# amicleanup.py
from datetime import datetime
import awsutils

def cleanup(region_id='us-east-1'):
    '''This method searches for all AMI images with a tag of RemoveOn
    and a value of YYYYMMDD of the day it is run on, then removes them
    '''
    today = datetime.utcnow().strftime('%Y%m%d')
    session = awsutils.get_session(region_id)
    client = session.client('ec2')
    resource = session.resource('ec2')
    images = client.describe_images(Filters=[{'Name': 'tag:RemoveOn', 'Values': [today]}])
    for image_data in images['Images']:
        image = resource.Image(image_data['ImageId'])
        name_tag = [tag['Value'] for tag in image.tags if tag['Key'] == 'Name']
        if name_tag:
            print(f"Deregistering {name_tag[0]}")
        image.deregister()

if __name__ == '__main__':
    cleanup()
Cron Implementation
A relatively simple way to implement the functionality of these two scripts would be to schedule two cron tasks on a Linux server to run them. In the example below I have configured a cron task to run every day at 11PM to execute the ec2backup.py script, then another at 11:30PM to execute the amicleanup.py script.
0 23 * * * /path/to/venv/bin/python /path/to/ec2backup.py
30 23 * * * /path/to/venv/bin/python /path/to/amicleanup.py
AWS Lambda Implementation
A more elegant solution is to use AWS Lambda to run the two as a set of functions. There are many benefits to using AWS Lambda to run code, but for this use-case of running a couple of Python functions to create and remove backup images the most pertinent are high availability and avoidance of paying for idle resources. Both of these benefits are best realized when you compare using Lambda against running the two cron jobs described in the last section.
If I were to configure my two cron jobs to run on an existing server, then what happens if that server goes down? Not only do I have the headache of having to bring that server back up, but I also run the risk of missing a scheduled run of the cron jobs that are controlling the EC2 server backup and cleanup process. This is not an issue with AWS Lambda as it is designed with redundancy to provide extremely high availability.
The other main benefit of not having to pay for idle resources is best understood in an example where I may have spun up an instance just to manage these two scripts running once a day. Not only does this method fall under the potential availability flaw of the last item, but an entire virtual machine has now been provisioned to run two scripts once a day constituting a very small amount of compute time and lots of wasted resources sitting idle. This is a prime case for using AWS Lambda to improve operational efficiency.
Another operational efficiency resulting from using Lambda is not having to spend time maintaining a dedicated server.
To create an AWS Lambda function for the EC2 instance image backups follow these steps:
Step 1. Under the Services menu click Lambda within the Compute section.
Step 2. Click the Create function button.
Step 3. Select the Author from scratch option, type ec2backup as a function name, select Python 3.6 from the run-time options, then add the boto3-user for the role and click Create Function as shown below:
Step 4. In the designer select CloudWatch Events and add a schedule expression of cron(0 23 * * ? *) which will cause the function to run every day at 11PM UTC.
Step 5. In the code editor add the following code:
import boto3
import os
from datetime import datetime, timedelta

def get_session(region, access_id, secret_key):
    return boto3.session.Session(region_name=region,
                                 aws_access_key_id=access_id,
                                 aws_secret_access_key=secret_key)

def lambda_handler(event, context):
    '''This method searches for all EC2 instances with a tag of BackUp
    and creates backup images of them, then tags the images with a
    RemoveOn tag of a YYYYMMDD value of three UTC days from now
    '''
    created_on = datetime.utcnow().strftime('%Y%m%d')
    remove_on = (datetime.utcnow() + timedelta(days=3)).strftime('%Y%m%d')
    session = get_session(os.getenv('REGION'),
                          os.getenv('ACCESS_KEY_ID'),
                          os.getenv('SECRET_KEY'))
    client = session.client('ec2')
    resource = session.resource('ec2')
    reservations = client.describe_instances(Filters=[{'Name': 'tag-key', 'Values': ['BackUp']}])
    for reservation in reservations['Reservations']:
        for instance_description in reservation['Instances']:
            instance_id = instance_description['InstanceId']
            name = f"InstanceId({instance_id})_CreatedOn({created_on})_RemoveOn({remove_on})"
            print(f"Creating Backup: {name}")
            image_description = client.create_image(InstanceId=instance_id, Name=name)
            image = resource.Image(image_description['ImageId'])
            image.create_tags(Tags=[{'Key': 'RemoveOn', 'Value': remove_on}, {'Key': 'Name', 'Value': name}])
Step 6. In the section under the code editor add a few environment variables.
- REGION with a value of the region of the EC2 instances to back up, which is us-east-1 in this example
- ACCESS_KEY_ID with the value of the access key from the section where the boto3-user was set up
- SECRET_KEY with the value of the secret key from the section where the boto3-user was set up
Step 7. Click the Save button at the top of the page.
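Storing access keys in environment variables works, but it is not the only option. A Lambda function runs under an execution role, and boto3 falls back to that role's credentials when no keys are passed; Lambda also sets an AWS_REGION environment variable. The sketch below, which assumes the execution role has been granted the needed EC2 permissions, builds the Session arguments so that explicit keys are used only when both are present. The helper name session_kwargs is hypothetical.

```python
import os


def session_kwargs(env=None):
    """Build keyword arguments for boto3.session.Session from the environment."""
    env = os.environ if env is None else env
    kwargs = {}
    # Prefer the custom REGION variable, then the one Lambda sets itself.
    region = env.get('REGION') or env.get('AWS_REGION')
    if region:
        kwargs['region_name'] = region
    # Pass explicit keys only when both are set; otherwise boto3 uses its
    # default credential lookup (the execution role when running in Lambda).
    if env.get('ACCESS_KEY_ID') and env.get('SECRET_KEY'):
        kwargs['aws_access_key_id'] = env['ACCESS_KEY_ID']
        kwargs['aws_secret_access_key'] = env['SECRET_KEY']
    return kwargs


print(session_kwargs({'AWS_REGION': 'us-east-1'}))
```

Inside the handler this would be used as boto3.session.Session(**session_kwargs()).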
For the image clean up functionality follow the same steps with the following changes.
Step 3. I give it a name of amicleanup
Step 4. I use a slightly different schedule expression of cron(30 23 * * ? *) to run at 11:30PM UTC
Step 5. Use the following cleanup function:
import boto3
from datetime import datetime
import os

def get_session(region, access_id, secret_key):
    return boto3.session.Session(region_name=region,
                                 aws_access_key_id=access_id,
                                 aws_secret_access_key=secret_key)

def lambda_handler(event, context):
    '''This method searches for all AMI images with a tag of RemoveOn
    and a value of YYYYMMDD of the day it is run on, then removes them
    '''
    today = datetime.utcnow().strftime('%Y%m%d')
    session = get_session(os.getenv('REGION'),
                          os.getenv('ACCESS_KEY_ID'),
                          os.getenv('SECRET_KEY'))
    client = session.client('ec2')
    resource = session.resource('ec2')
    images = client.describe_images(Filters=[{'Name': 'tag:RemoveOn', 'Values': [today]}])
    for image_data in images['Images']:
        image = resource.Image(image_data['ImageId'])
        name_tag = [tag['Value'] for tag in image.tags if tag['Key'] == 'Name']
        if name_tag:
            print(f"Deregistering {name_tag[0]}")
        image.deregister()
Conclusion
In this article I have covered how to use the AWS Python SDK library boto3 to interact with EC2 resources. I demonstrated how to automate the operational management tasks of AMI image backup creation for EC2 instances and subsequent cleanup of those backup images, using scheduled cron jobs on a dedicated server or using AWS Lambda.
If you are interested in learning how to use Boto and AWS Simple Storage Service (S3) check out Scott Robinson's article here on StackAbuse.
As always, thanks for reading and don't be shy about commenting or critiquing below.