Getting Started with GitHub Actions - Software Automation

Introduction

In this guide, we'll take a look at what GitHub Actions are, how they work, and build a workflow using Python to showcase how you can use GitHub Actions to automate tasks.

Since its inception in 2008, GitHub has grown to become the de facto leader in development project hosting. A community-oriented idea to give all of our favorite open-source programs free hosting in one central place blew up. GitHub became so popular that it became synonymous with git; you'll find dozens of articles online explaining how git is not the same as GitHub, and vice-versa.

On its 10-year anniversary, a big company acquired GitHub for 7.5 billion dollars. That company's name is Microsoft. GitHub acquisition aside, by building WSL and maintaining many open-source projects like VS Code, .NET and TypeScript, just to name a few, Microsoft changed the development game, as well as the general public's opinion of the company after the invasion of privacy that was Windows 10.

Community-oriented as it still may be, GitHub's next goal was to start making some revenue - by entering the enterprise scene. Cue - GitHub Actions.

Taking a Look at Existing Enterprise Solutions

At the time of Microsoft getting its hands on GitHub, the enterprise scene for software development was already established by a few big players:

Atlassian's BitBucket allowed for seamless integration with Jira and Trello, the leaders in issue management and organization.
Amazon's CodeCommit allowed organizations using AWS to never leave the comforts of one UI and one CLI tool.
GitLab, with its DevOps-oriented approach, aimed to centralize the entire development process under one roof.

In the past few years, GitHub has managed to add many of its enterprise competitors' features, including CI/CD.

CI/CD and Automation

Modern software development relies heavily on automation, and for a simple reason - it speeds things up. New versions are automatically built, tested and deployed to the appropriate environments.

All it takes is a one-time effort to write up a couple of scripts and configure a few machines to execute them. GitHub's offering of such features comes in the form of GitHub Actions.

An Overview of GitHub Actions

At the time of writing this guide, GitHub Actions is less than two years old. Despite its young age, the feature has matured pretty well, helped by the fact that it is built directly into GitHub.

The Community

Countless users jumped aboard, got to know the ins and outs of GitHub Actions, wrote their own reusable modules (or actions) and shared them with the rest of the world. GitHub heavily relies on such contributions in its marketing model. Currently there are over 9,500 different actions which allow you to, in a few lines of code, set up your environments, run linters and testers, interact with numerous major platform APIs and so on. All without ever installing any software besides git and your favorite editor.

Workflows

We define our automated process through workflows. They are YAML files which contain, among other things, the name of our workflow, trigger events, jobs and steps of our pipeline and runners to perform them.

YAML

YAML Ain't Markup Language, or YAML (a recursive acronym), is a language mostly used for writing configuration files. It is often preferred over JSON because it is easier to write and read. Even though JSON is faster in terms of serialization, and much more strict, YAML is used in places where speed is not of great importance.

If you've never had experience with YAML, I highly encourage you to visit Learn X in Y minutes, where X=YAML.

If you're somewhat experienced, I recommend reading about some of YAML's idiosyncrasies and gotchas.

Trigger Events

The on keyword specifies one or more GitHub (note: not just git) events that will trigger the workflow. The event can be very broad, e.g. on every push to the repository, or very specific, e.g. every time a pull request gets a new comment.

For example:

name: my workflow
on:
  push:
    branches: [main, test]

Here, we've got a trigger event set for every push to either the main or the test branch. Events can also be scheduled in a cron-like fashion, such as:

name: my nightly build workflow
on:
  schedule:
    - cron: '0 22 * * *'

This is a nightly build scheduled for 10 PM (UTC) every day.
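
To make the schedule concrete, here is a minimal Python sketch that computes when a daily cron expression such as '0 22 * * *' would next fire. The helper name next_daily_run is hypothetical and it only understands the "minute hour * * *" form; it is an illustration of the timing, not how GitHub evaluates schedules internally. It assumes UTC, which is the time zone GitHub uses for scheduled workflows.

```python
from datetime import datetime, timedelta, timezone


def next_daily_run(cron, now):
    """Return the next run time for a cron expression of the form 'M H * * *'."""
    minute, hour, *rest = cron.split()
    if rest != ["*", "*", "*"]:
        raise ValueError("this sketch only handles daily schedules")
    # Today's slot at the requested hour and minute
    candidate = now.replace(hour=int(hour), minute=int(minute),
                            second=0, microsecond=0)
    # If today's slot has already passed, the next run is tomorrow
    if candidate <= now:
        candidate += timedelta(days=1)
    return candidate


now = datetime(2023, 5, 11, 23, 15, tzinfo=timezone.utc)
print(next_daily_run("0 22 * * *", now))  # 2023-05-12 22:00:00+00:00
```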

Jobs

So far, we've given our workflow a name and configured different events that trigger it. The jobs keyword lists actions that will be executed. One workflow can hold multiple jobs with multiple steps each:

jobs:
  job1:
    steps:
      .
      .
  job2:
    steps:
      .
      .

By default, all jobs run in parallel, but we can make one job wait for the execution of another using the needs keyword:

jobs:
  job1:
    steps:
      .
      .
  job2:
    needs: job1
    steps:
      .
      .
  job3:
    needs: [job1, job2]
    steps:
      .
      .

This ensures that the jobs execute one after another, each starting only once the jobs it needs have finished successfully.
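
The ordering that needs produces is a dependency graph. As a rough illustration (not GitHub's scheduler), the sketch below mirrors the three jobs above in a plain dictionary and uses Python's standard graphlib module to derive a valid start order. The dictionary and the function name execution_order are assumptions made for this example.

```python
from graphlib import TopologicalSorter

# Hypothetical mirror of the `needs` declarations above:
# each job maps to the set of jobs it has to wait for.
needs = {
    "job1": set(),
    "job2": {"job1"},
    "job3": {"job1", "job2"},
}


def execution_order(needs):
    """Return one valid order in which the jobs could start."""
    return list(TopologicalSorter(needs).static_order())


print(execution_order(needs))  # ['job1', 'job2', 'job3']
```

Jobs that don't depend on each other would have no fixed relative order here, which matches the default of running them in parallel.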

We can also independently configure each job's environment, or run a job across multiple configurations using the matrix strategy. The documentation notes:

A matrix allows you to create multiple jobs by performing variable substitution in a single job definition.

Here's an example of a matrix build configured to work on multiple platforms:

jobs:
  ubuntu_job:
    runs-on: ubuntu-latest
    steps:
      .
      .
  multi_os_job:
    runs-on: ${{ matrix.os }}
    strategy:
      matrix:
        os: [ubuntu-latest, windows-2016, macos-latest ]
    steps:
      .
      .
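
The "variable substitution" a matrix performs can be pictured as a Cartesian product over its axes. The Python sketch below expands a matrix dictionary into one set of variables per generated job. The expand_matrix helper and the second python-version axis are hypothetical additions for illustration; the example above only has the os axis.

```python
from itertools import product


def expand_matrix(matrix):
    """Expand a matrix dict into one variable mapping per generated job."""
    keys = list(matrix)
    combos = product(*(matrix[key] for key in keys))
    return [dict(zip(keys, combo)) for combo in combos]


matrix = {
    "os": ["ubuntu-latest", "windows-2016", "macos-latest"],
    "python-version": ["3.7", "3.8"],  # hypothetical second axis
}

jobs = expand_matrix(matrix)
print(len(jobs))  # 6
for variables in jobs:
    print(variables)
```

With a single axis, as in the workflow above, the expansion simply yields one job per listed operating system.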

Actions

Actions are reusable modules which can be placed in workflows like any other step. They can both take inputs and produce outputs. The community marketplace is rich with many bootstrap actions for preparing environments; we will be using a few today.

You can write your own actions either as Docker containers or in vanilla JavaScript, and then contribute them to the marketplace or keep them to yourself.

An action can easily be referenced in a workflow like any other step in the list:

jobs:
  compile_code:
    runs-on: ubuntu-latest
    steps:
      - name: check out repo
        uses: actions/checkout@v2
      - name: compile code
        run: gcc main.c
      .
      .

Here, we can see an example of using actions like any other step. Note that steps are, unlike jobs, always executed consecutively.

Runners

Runners, otherwise known as agents or workers, are machines which are tasked with executing your workflows. Each runner can be set up differently. For example, GitHub offers runners in the three most popular OS flavors - Ubuntu, Windows and macOS.

GitHub offers their own runners, but you can also opt to host your own runner with the GitHub Actions runner application configured.

Pricing

GitHub-hosted runners can execute workflows for free if the repository is public. Private repositories on the free plan are limited to 2,000 minutes per month.

Teams and Enterprises have their own pricing categories (typical) with different perks and prices, at $4/user per month and $21/user per month respectively, as of writing this guide.

For a complete overview of GitHub's plans, check out GitHub's updated pricing page.

Artifacts - Workflow Persistent Data

Since GitHub runners are only temporarily available, so is the data they process and generate. Artifacts are data that remains available on the repository page after the runners have finished; they need to be uploaded with the special upload-artifact action.

The default retention period is 90 days, but that can be changed.

The overview screen of a workflow run greets us with a lot of data, including the number of the workflow run, a list of all jobs that are queued for execution or have already executed, a visual representation of the different jobs and their connections, as well as any artifacts produced by the workflow.

GitHub Actions in Practice - A Python Benchmarker

Note: this example uses a repository created for this article, which can be found, unsurprisingly, on GitHub.

Let's combine what we've covered into a fully-fledged workflow. We will be creating a Python benchmarker workflow which we will place in .github/workflows/benchmark.yml.

The workflow will be triggered on every push to the main branch.

name: python version benchmarker

on:
  push:
    branches: [main]

The workflow consists of three stages.

The Lint Stage

The first job is tasked with linting the contents of benchmarker.py, making sure that it has a score of at least 8.0:

jobs:
  pylint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2                   # check out the repo
      - uses: actions/setup-python@v2               # set up the environment for Python
        with:
          python-version: '3.7'
      - uses: py-actions/py-dependency-install@v2   # install dependencies from requirements.txt
        with:
          path: requirements.txt
      - name: run pylint, fail under 8.0
        run: pip install pylint; pylint benchmarker.py --fail-under=8

Benchmark

We will be running the benchmark across 6 different versions and implementations of Python, failing if the code isn't compatible with all of them (configured with the fail-fast parameter of the matrix strategy, which is true by default):

  benchmark:
    runs-on: ubuntu-latest
    needs: pylint
    outputs:
      pypy2: ${{ steps.result.outputs.pypy2 }}
      pypy3: ${{ steps.result.outputs.pypy3 }}
      py2-7: ${{ steps.result.outputs.py2-7 }}
      py3-6: ${{ steps.result.outputs.py3-6 }}
      py3-7: ${{ steps.result.outputs.py3-7 }}
      py3-8: ${{ steps.result.outputs.py3-8 }}
    strategy:
      matrix:
        include:
        - python-version: pypy2
          out: pypy2
        - python-version: pypy3
          out: pypy3
        - python-version: 2.7
          out: py2-7
        - python-version: 3.6
          out: py3-6
        - python-version: 3.7
          out: py3-7
        - python-version: 3.8
          out: py3-8
    steps:
    - uses: actions/checkout@v2
    - name: setup py
      uses: actions/setup-python@v2
      with:
        python-version: ${{matrix.python-version}}
    - name: save benchmark stats
      id: result
      run: |
        echo "::set-output name=${{matrix.out}}::$(python benchmarker.py)"

Let's take a more detailed look at this, to see some finer issues you can come across when using GitHub Actions. The outputs keyword specifies key:value pairs that a job can produce and that other jobs are allowed to reference. The key is the name of the output, and the value is a reference to a particular output of a step with a given id.

In our case, the step with id: result produces an output named after the matrix's python-version value. That value had to be modified and supplied through the out parameter, since GitHub's object access syntax doesn't allow dots in object names or a digit in the first position.
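
The renaming is done by hand in the matrix above, but the rule behind it is simple enough to sketch in Python. The output_key function below is a hypothetical helper, not part of the workflow: it prefixes names that start with a digit and replaces dots with dashes, reproducing the out values used in this guide.

```python
def output_key(python_version):
    """Map a matrix python-version value to a name usable as an output key."""
    version = str(python_version)
    if version[0].isdigit():
        # Prefix so the key doesn't start with a digit, and drop the dot
        return "py" + version.replace(".", "-")
    return version


versions = ["pypy2", "pypy3", "2.7", "3.6", "3.7", "3.8"]
print([output_key(v) for v in versions])
# ['pypy2', 'pypy3', 'py2-7', 'py3-6', 'py3-7', 'py3-8']
```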

There was no inherent way of placing the outputs in a single JSON and referencing steps.result.outputs as a JSON object, although that can be done for read-only purposes, as we will see in the following stage. Each output must instead be defined explicitly.
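
The benchmark step captures whatever benchmarker.py prints to standard output. The script itself isn't reproduced in this guide, so the following is only a hypothetical stand-in: it times a made-up workload with the standard timeit module and prints a single line, which keeps the captured output easy to store as a job output. The workload and function names are assumptions.

```python
from __future__ import print_function  # keeps the sketch valid on Python 2.7 as well

import timeit


def workload():
    """Hypothetical workload: sum the squares of the first 10,000 integers."""
    return sum(i * i for i in range(10000))


def run_benchmark(repeats=5, number=100):
    """Return the best wall-clock time in seconds over several repeats."""
    return min(timeit.repeat(workload, repeat=repeats, number=number))


if __name__ == "__main__":
    # One line on stdout, so the workflow step can capture it with $(...)
    print("{:.6f}".format(run_benchmark()))
```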

Uploading to Pastebin and Creating a New Artifact

The third and final stage will read the previous stage's outputs and compile them into a single file. That file will be uploaded as an artifact as well as uploaded to Pastebin.
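
As a rough idea of the "compile them into a single file" part, the sketch below reads the JSON object with the previous stage's outputs and writes one line per interpreter. The file names matrix-outputs.json and newpaste.txt match the workflow in this section; the function names and the report layout are assumptions, since the actual script isn't shown here.

```python
import json


def format_report(outputs):
    """Turn {'py3-8': '0.12', ...} into one 'name: value' line per interpreter."""
    lines = ["{}: {}".format(name, outputs[name]) for name in sorted(outputs)]
    return "\n".join(lines) + "\n"


def compile_report(json_path="matrix-outputs.json", report_path="newpaste.txt"):
    """Read the benchmark outputs and write them to a single report file."""
    with open(json_path) as src:
        outputs = json.load(src)
    report = format_report(outputs)
    with open(report_path, "w") as dst:
        dst.write(report)
    return report


print(format_report({"py3-8": "0.2", "py2-7": "0.5"}))
```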

In order to make a post request to Pastebin we will need to configure an account and then use its API key:

  pastebin:
    runs-on: ubuntu-latest
    needs: benchmark
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-python@v2
        with:
          python-version: 3.9          
      - uses: py-actions/py-dependency-install@v2
        with: 
          path: requirements.txt
      - name: use benchmark data
        run: echo '${{ toJSON(needs.benchmark.outputs) }}' > matrix-outputs.json
      - name: pastebin API request
        env:
          PASTEBIN_API_KEY: ${{ secrets.PASTEBIN_API_KEY }}
        run: python pastebin.py
      - name: upload newly created artifact
        uses: actions/upload-artifact@v2
        with:
          name: benchmark-stats
          path: newpaste.txt

The secret is placed in the step's environment variables so that it can be easily accessed with os.environ['PASTEBIN_API_KEY'] in Python.
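
A minimal sketch of how a script might pick up that secret and prepare a request body is shown below. It only builds the form-encoded payload and sends nothing. The function name is hypothetical, and the field names api_dev_key, api_option and api_paste_code are assumed from Pastebin's public API documentation, so check them against the current docs before relying on them.

```python
import os
from urllib.parse import urlencode


def build_paste_request(text, environ=os.environ):
    """Build the form-encoded body for a new paste.

    Raises KeyError if PASTEBIN_API_KEY was not exposed to the step.
    """
    api_key = environ["PASTEBIN_API_KEY"]
    fields = {
        "api_dev_key": api_key,    # assumed field name for the API key
        "api_option": "paste",     # assumed value for creating a paste
        "api_paste_code": text,    # assumed field name for the paste body
    }
    return urlencode(fields).encode("utf-8")


# Passing a dict instead of os.environ makes the function easy to try locally
body = build_paste_request("hello", {"PASTEBIN_API_KEY": "dummy-key"})
print(body)
```

Failing loudly with a KeyError when the variable is missing is a deliberate choice here: a misconfigured secret then stops the step instead of posting with an empty key.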

Secrets management in GitHub

GitHub offers a safe place for secrets on a repository or project-wide level. To save a secret, navigate to the repository Settings and add a new value in the Secrets tab.

When Not to Choose GitHub Actions as a CI/CD Tool?

Even though we've seen the potential of this new feature of GitHub, there are some things to consider; things that may be deal breakers and make you search for an automation tool elsewhere:

GitHub's offering of runners is pretty lacking. With 2 cores and 8GB of RAM, they are good for running linters and testing; but don't even think about some serious compilation.
Workflow debugging can be an unpleasant experience. There is no way to re-run a single job; you can only re-run the entire workflow. If the final step is encountering issues, you'll either have to rewrite the workflow to make troubleshooting a bit more bearable, or wait for the entire workflow to run before getting to the point you want to troubleshoot.
No support for distributed builds.

Conclusion

GitHub Actions have matured a lot in the past few years, but not enough. Still, the potential is there. With the best API out of all git platforms, and with the innovative approach of writing actions in JavaScript, all backed up by the largest git community in the world - there is no doubt that GitHub Actions has the potential to take over the entire CI/CD game. But not yet.

For now, use this tool for simple compiling/packaging or to append tags to your commits while the enterprise still relies on the likes of Jenkins, Travis CI and GitLab CI.

Last Updated: May 11th, 2023

Dragan Bjekic, Author

David Landup, Editor