Pros and Cons of Quarantined Tests

Flaky tests, i.e., those that only fail sometimes, are the bane of any end-to-end automated test suite.

Another type of problem test is one that fails every time but tests something deemed not important enough to fix right now. If you have to ignore some failing tests, sooner or later you’re going to ignore one that you should have paid attention to. Or worse, you might decide to ignore them all, because clearly no-one is fixing the bugs.

If a test is broken, fixing it should always be the first course of action, if possible. But what if some other task has a higher priority? If you’re confident that the problem is the test and not the software being tested, it might be reasonable to allow the test to keep failing, at least temporarily.

When you frequently ignore some failing tests, the whole suite is at risk of being seen as unreliable. A common way to prevent that is to quarantine the flaky/failing tests. Quarantine in this context refers to isolating the troublesome tests from the rest of the test suite. Not for fear of contagion, except in the sense of the negative impact they can have on the perception of the rest of the tests.

I think I first came across the concept in an article by Martin Fowler. It’s a great read on the topic of flaky tests and how to identify and resolve the causes of their flakiness. This post isn’t about how to fix them so check out that article if you’re after that kind of info.

More recently, an article on the Google Testing Blog mentioned the same technique for dealing with the same types of troublesome tests.

Even though quarantining tests can be a good temporary solution, if you don’t fix the tests (or the bugs) you can end up in the situation I mentioned before: a few failing tests create the impression that the entire suite is unreliable, which can amount to a death sentence for the whole suite.

My team and I try to avoid that death sentence in a few ways:

  1. Report quarantined test results separately from the rest of the test suite.

    That way everyone can see the results of the reliable tests and know that a failure there is something that should be looked at immediately. We don’t have to try to identify the “true” failures amongst the flaky ones.

  2. Tag quarantined tests with a reason they’re quarantined.

    So flaky tests get tagged as such. Failing tests that aren’t going to get fixed for a while get reported and tagged with the issue number. Comments can be added if the tag isn’t sufficient. This isn’t enough to rescue a quarantined test from oblivion, but it can help avoid the potential problem of losing track of why a test was quarantined.

  3. Schedule a regular review of quarantined tests.

    If it’s not scheduled it’s not likely to happen. Failing tests can be assigned to someone to fix if priorities change, and time can be invested in fixing a flaky test if we decide it’s more important than we first thought.

  4. Delete the test.

    If a test stays in quarantine for a long time, it’s worth rethinking the value it provides. Maybe it turns out that unit tests, or even exploratory tests, provide enough coverage. Or the test might cover a part of the software that rarely changes, or which doesn’t get much use, so a regression there isn’t a big deal. If it seems likely someone might decide to write the test again, we might @Ignore the test and leave a comment explaining why, instead of deleting it.

How do you deal with flaky or failing tests that don’t get fixed quickly?

Updated:

Inattentional Blindness and Scripted Tests

My previous workplace was a large organisation in which many testers were employed to evaluate the quality of the software we developed and maintained. We had a collection of scripted test cases that testers would follow step-by-step. That worked reasonably well, though I was employed to help automate our processes, including testing, and that automation contributed significantly to the quality of our software.

When I started working at my current workplace I found similarly detailed scripted test cases. Part of my responsibilities included manual testing, so I thought I could test the way I was familiar with, and how the rest of my colleagues tested—just follow the test cases. It hasn’t worked well. We find bugs, for sure, but as I’ve grown in experience I’ve found more and more problems with the software that had been there for a long time, through many versions of the software and through many executions of test cases that should have revealed them.

There are at least a few things that I think explain why we, including my past self, failed to identify problems:

  1. Out-of-date test cases. Change happens constantly and we have too many tests with too much detail for our small QA team to keep up with.
  2. Treating test cases like an instruction manual. It’s relatively easy for an experienced tester to follow the steps of a scripted test case to the letter, assuming the steps are accurate. That was our standard practice. But it’s even easier to miss out on opportunities to reveal bugs if you do that.
  3. Overwhelming detail. Many test cases are so long, verbose, and complicated that it’s very easy to miss important details in the steps and expected results, especially when you’re under pressure to get the job done quickly.
  4. Unnecessarily specific detail. Often a test case instructs the tester to use a particular element of the UI in a particular way. E.g., “enter a value in the Account text field and click the Validate button at the bottom of the panel.” That sort of specificity means the other fields are likely to be ignored, as well as ignoring all the other ways that validation could be triggered. And that problem is in addition to making it hard to keep the test case up to date (because sooner or later, that Validate button is going to move, or be removed entirely).

The last three points have something in particular in common. They all trigger a cognitive phenomenon called inattentional blindness. It’s something that we all experience, whether we’re aware of it or not¹. A well-known demonstration of the phenomenon comes from a psychological study, and you can perform the experiment from the study yourself by watching a video and following the instructions at the start:


If you haven’t done so already, I strongly recommend doing the experiment before you read on: it’s something you really only get to experience once, although there are variations of it. In real life, though, it’s likely something you’ll experience again and again.

The study and others like it find that half of the time on average, people fail to notice the unexpected element. They’re asked to perform a task and they’re focused so intently on it that they fail to perceive something they’re looking right at. It’s one of the reasons using your phone while driving is so dangerous, even hands-free; if your attention is on the text/app/call it’s not on the road.

That kind of inattentional blindness is exactly what can happen when you follow a scripted test. You focus on the steps you have to follow and the results you have to check for and you fail to notice anything else. The software can behave in unexpected ways, but you might miss it if you’re only paying attention to what the test case says should happen. Even if you’re looking right at the problem. Missing the unexpected becomes more likely the longer you’re doing the task and the more anxious you are about completing it quickly.

I’m certainly not the first to draw the link between inattentional blindness and scripted tests. Michael Bolton, for one, has mentioned it a few times on his blog. But it’s particularly relevant to me now as I try to improve our testing practices. Whatever changes we make, we need to be aware of the potential for inattentional blindness.

For now it’s clear to me that the scripted, highly detailed test cases we’re used to at my workplace are getting in the way of us improving the quality of our software. Part of the solution is a more exploratory approach to testing. One suggestion is that instead of following a test case step-by-step, you could:

“glance over the test case; try to discern the task that’s being modeled or the information that’s being sought; then try to fulfill the task without referring to the test case. That puts the tester in command to try the task and stumble over problems in getting to the goal. Since the end user is not going to be following the test case either, those problems are likely to be bugs.”

For those interested, there is more information about inattentional blindness on Scholarpedia, including references to the original research publications.

How do you avoid inattentional blindness while you’re testing? I haven’t set up a commenting system here yet but this article is cross-posted to dev.to and you’re welcome to comment there.

1. Technically, we're never aware of it. That's the inattentional part.


From GitHub to GitLab pages

I started this site on GitHub Pages because:

  1. I was already using GitHub
  2. I didn’t want to have to deal with hosting
  3. It’s free

But I’d been meaning to set up my own domain for a while¹, and although GitHub Pages supports custom domains, it didn’t seem to support HTTPS on them. And there’s no way I’m setting up a website in 2018 without HTTPS.

It turns out that support for HTTPS with custom domains is gradually being rolled out now, and it’s possible to set it up yourself. But there’s no mention of this in the documentation, and word from GitHub Support doesn’t suggest it’ll be official any time soon. And it hasn’t been rolled out to my account yet.

So I looked into GitLab as an alternative. I liked what I found. Not only is there extensive documentation, but there’s also an official tutorial.

I won’t say it was easy, but unless something has gone terribly wrong you’re reading this over a secure connection certified by Let’s Encrypt.

I followed the instructions to add my domain to my GitLab Pages settings, and to configure the DNS records for my domain. There are a few helpful links to how-to pages for specific hosts, but none for my registrar. Fortunately, the instructions provided were sufficient.
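
For anyone doing the same, the records involved look roughly like the following zone-file entries. This is a sketch, not a copy-paste recipe: the A-record IP is the GitLab.com Pages address its docs listed around this time, and the verification TXT record uses a code GitLab generates for you; check the current GitLab Pages documentation for both.

```
; Illustrative zone-file entries for pointing an apex domain at GitLab Pages.
; Verify both values against GitLab's current docs before using them.
marklapierre.net.  IN  A    35.185.44.232
marklapierre.net.  IN  TXT  "gitlab-pages-verification-code=<code from GitLab>"
```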

Before I delved into HTTPS I wanted to make sure the domain setup was behaving as expected. It was not. When I tried to open a link to my site I got a 404 back. The DNS record was directing the request to the correct server, but the server wasn’t associating the request with my GitLab Pages repository. It was then that I realised the project name still ended in github.io, not gitlab.io. When I imported my GitHub repository, the import function on GitLab copied the original repo’s name, ignoring the new name it allowed me to enter. No matter, I thought, there’s an option to change the name. So I did that. But still my site threw up a 404. It seems that I’d only changed the name of my project, i.e., the display name. I had to hunt down the setting to change the name of the repository itself (Settings > Advanced Settings > Rename repository). But once I’d updated that I could finally access my site via marklapierre.net.

And, finally, the HTTPS configuration. I followed the tutorial and ran the letsencrypt-auto CLI to set up a challenge response that would confirm for Let’s Encrypt that I control marklapierre.net. Unfortunately, at some point after the instructions were written, GitLab Pages and Jekyll stopped accepting permalinks with dots in them. Fortunately, someone pointed this out in the comments on the tutorial, along with the solution: end the permalink with / rather than .html.
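
Concretely, the workaround is to serve the challenge response from a Jekyll page whose permalink ends in a slash instead of .html. A sketch of such a page, with placeholders where letsencrypt-auto supplies the real values:

```
---
layout: null
permalink: /.well-known/acme-challenge/<token from letsencrypt-auto>/
---
<challenge response from letsencrypt-auto>
```

With a trailing slash, Jekyll generates the page as an index.html inside that directory, so the dotted-filename restriction never comes into play.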

So now I have a secure site on my own domain. Huzzah!

[Update: on the same day I published this, GitHub announced support for HTTPS on custom domains. Nice.]

1. Someone else got marklapierre.com. Last update in 2014? C'mon Mark, you're killing me!


Science and software testing

Software testing, particularly manual software testing, is sometimes thought of as nothing more than following a script to confirm that the software does what it was designed to do. From that perspective, testing might seem like a boring and relatively mindless task. And to be honest, that is the traditional view of testing as part of the Waterfall method of software development in large organisations. Division of labour meant that there were some people who did nothing but follow scripts someone else had written, and report bugs that someone else would fix.

Science, on the other hand, is undeniably interesting and challenging. So if you share the impression that software testing is boring, you might be surprised to know that I find both engaging and worth spending my time and effort on¹. Having worked as a software tester, and having studied a scientific field (cognitive science), I’ve noticed some similarities that help explain why I’m drawn to both pursuits despite their apparent lack of similarity.

Science can be defined as:

“The intellectual and practical activity encompassing the systematic study of the structure and behaviour of the physical and natural world through observation and experiment.”

That doesn’t seem to describe following testing scripts at all. Even if you swap “the physical and natural world” for “the software under test”, and even if you include the task of writing scripts. But if you consider the entire process of software testing you’ll see similarities emerge. For one thing, test scripts have to be written based on something, and in today’s world of agile software development, that something is usually not requirements handed down from designers, but rather requirements explored, developed, and refined iteratively. Observation and experiment are a big part of that iterative process. This is especially the case when working on existing software that doesn’t have good documentation—how else could you figure out how the software works except through observation and experiment? Even if you have access to the code, it’s unlikely you could read the code and know exactly how the software will behave. And there isn’t always someone else around to ask.

The reality of software testing is a lot more than following a script. A more complete definition of testing is that:

“Testing is the process of evaluating a product by learning about it through exploration and experimentation, which includes to some degree: questioning, study, modeling, observation, inference, etc.”

When defined that way, it’s much clearer how testing and science are similar. Questioning, study, modeling, observation, and inference are all core aspects of science and testing.

In testing, we question whether the software does what we expect it to do. We question whether it does what customers want it to do and in the way they want. We question whether a code change has unintended effects. We study how the software behaves under various conditions. We construct models of how we believe the software performs, even if they’re only mental models. We observe how the software responds to input. And ultimately we make inferences about the quality of the software.

Another similarity between science and software testing is that neither process truly has an end. There is always more to discover through science, even at the end of a project that has produced significant insights. And there is always more to learn about any but the simplest software. In both science and testing, it’s not meaningful to think of the entire process as having a goal, but it is necessary to define a reasonable milestone as the completion of a project. We don’t finish testing when there are no bugs, because that will never happen, but we can consider testing complete when the software behaves well under a reasonable range of scenarios.

Science is often described as trying to prove things². That is not the aim of science, nor is it how science works. Science is, in part, a way of trying to better understand the world. And science is the knowledge produced by that process. The scientific method involves making a hypothesis and then gathering evidence and analysing data to draw conclusions about whether the hypothesis is supported. It’s possible to find evidence that rules out a hypothesis, but it’s not possible to find evidence that a particular hypothesis is the only explanation for the data. This is because other hypotheses might explain that evidence just as well, including hypotheses that no-one has come up with yet. But after carefully analysing the results of many experiments a clearer understanding can begin to emerge (in the form of a theory). In that way you can think of science as showing what doesn’t work until there’s a reasonably solid explanation left. It’s not about being right; it’s about being less wrong.

Similarly, testing isn’t about proving that the software is bug free; it’s about providing evidence that you can use the software without any significant issues, so that what’s left is reasonably solid. It’s also not about proving that the software does exactly what the customer wants, but it is about helping to iteratively improve the customer’s satisfaction with the software. This is an important part of software testing that’s sometimes forgotten—the aim isn’t solely to find bugs, but also to find unexpected, unusual, or confusing behaviour.

On the other hand, there are plenty of ways in which science and testing are different. But I’ll leave that for another post.

1. Not so much manual testing specifically, but a comprehensive approach to testing that includes exploratory testing and automation.

2. Do a search for "science proves". It's enough to make a scientist or philosopher or mathematician cry.


We're neither rock stars nor impostors

Recently, Rach Smith raised some important points about how we tend to talk about impostor syndrome:

  • it minimizes the impact that this experience has on people that really do suffer from it.
  • we’re labelling what should be considered positive personality traits - humility, an acceptance that we can’t be right all the time, a desire to know more, as a “syndrome” that we need to “deal with”, “get over” or “get past”.

If you haven’t read her post yet I highly recommend you do. The issue came up again during Rach’s chat with Dave on Developer on Fire.

I can’t truly say I’ve experienced impostor syndrome, although I suspect that’s mostly because I’ve often been in small teams where everyone was similarly skilled. For example, I was once one of two novice web developers in a product development team. We really didn’t know what we were doing. I did feel unqualified, but since there was no one more experienced to compare myself against I didn’t feel like an impostor. But I did suffer from low self-confidence and a huge pile of self-doubt. Fortunately, experience and education have helped me come to grips with the limits of my knowledge and ability. I’m sure that self-awareness has contributed to better performance independently of any increase in my skills.

It all got me thinking about my experience with how jobs are advertised and how interviews are conducted, about the pressure to elevate one’s technical skills, about the growing awareness of the importance of “soft” skills, and about the rock star culture that’s promoted in some parts of the industry.

Rach noted that even highly successful senior developers sometimes experience self-doubt and the awareness of gaps in their knowledge. This is something that is all too often missing from discussions about preparing for interviews, especially for highly sought-after positions. We’re always told to prepare extensively (good advice), and to project confidence (sure, projecting a lack of confidence is understandably unhelpful), but the highest-quality advice also points out the importance of awareness of the limits of one’s skills and knowledge so that they can be appropriately managed. Much of the advice I remember from my early days suggested I should do my best to cover up my weaknesses. I don’t believe that did anything but lead to feelings of insecurity and inevitably falling apart when the limits of my knowledge were revealed. Later, I received much better advice: to be able to say “I don’t know,” and then to work through the problem aloud, asking questions to fill in the gaps until I do have enough understanding to give a reasonable answer. And isn’t that more or less how we work each day? If anyone actually had the supreme skills and confidence we’re naively advised to portray during interviews, I’m pretty sure they wouldn’t find the job challenging or interesting enough (and would likely inflict their arrogance and the consequences of their boredom on the rest of us).

Another topic missing from good career advice, fortunately less common these days, is the importance of soft skills. As Rach noted, “the most accomplished developers [have] constant awareness of the ‘gap’ in their knowledge and willingness to work towards closing it.” That sort of awareness is as important a soft skill as general social and communication skills. It’s a key part of metacognition. The people I’ve experienced most joy in working with are those who freely admit their limitations and strive daily towards eliminating them. That effort shows in their contributions at work that go above and beyond the explicit requirements of their role. Among the worst people to work with are those who do the minimum work required, without any awareness of the opportunities for improvement that pass them by every day. Even worse are those who perform at a similar level while believing that they are in fact contributing much more and at a much greater degree of competence¹. The latter type of person is unlikely to experience anything that might be called “impostor syndrome”, although if anyone were truly an impostor, it would be them.

Beyond a growing understanding of the importance of interpersonal soft skills, there are many other non-technical skills that make a solid team member. For example, the O*NET database shows active learning towards the top of a list of skills seen as important for a programmer². And yet typical hiring practices overwhelmingly reflect the prioritisation of immediate technical skills. I’m confident that’s a big part of the reason “rock star” developers are those seen as having the greatest skills rather than being most able to learn or improve. And yet the former doesn’t imply the latter, especially if those great skills lie in one highly specific domain; you can learn to do one thing really well without being able to generalise that skill, and it doesn’t mean you possess other distinct but important skills. Other downsides of specialisation are a topic for another post.

Similarly, the poor attitudes and bad behaviours of some workers are accepted because of their technical skills, despite the negative impact they have on the people around them. I suspect this might be a subtle influence on feeling like an impostor; we provide a perverse incentive for people to behave in ways that no reasonable person wants to. Our industry favours those who promote themselves as the best coder, the most knowledgeable developer, the ideal technical candidate, and we (at least implicitly) discourage people from embracing their range of skills and their ability to improve.


1. The Dunning-Kruger effect in effect, so to speak.

2. Although communication skills are apparently the #1 requirement in computing-related job ads, other soft skills and transferable technical skills are far less frequently mentioned.
