Before we start diving into writing CI configuration files, we're going to cover what a CI configuration looks like and go over some basics.
We'll cover concepts like:
- Pipelines versus Stages versus Jobs
- Pipeline configuration
And we'll use the Terraform `.gitlab-ci.yml` file as the example, along with some other examples and some visuals to help us along the way.
Let's get started.
Pipelines & Stages & Jobs
In GitLab CI we have three things we need to be aware of: pipelines, stages and jobs.
A pipeline contains stages. Stages contain jobs. Jobs contain configuration that tells GitLab CI what it is you want to do.
We define a single pipeline by providing a `.gitlab-ci.yml` file. The file itself represents the pipeline. We then define stages in this pipeline (by adding them to the file) and each stage in turn has one or more jobs added to it. Each job defines the stage it belongs to, the rules that decide if the job should execute, and the script that is executed.
It's important to understand the difference between these elements, so let's visualise this with a simple example:
```mermaid
graph LR
    a1 --> discord1
    a2 --> b1
    b1 --> c1
    c1 --> d1
    subgraph Stage A
        a1[Discord Notification]
        a2[Test Code]
    end
    subgraph Stage B
        b1[Compile Code]
    end
    subgraph Stage C
        c1[Package Code]
    end
    subgraph Stage D
        d1[Deploy Code]
    end
    subgraph Discord API
        discord1[API]
    end
```
Here we have four stages: `Stage A` through `Stage D`. In `Stage A` we have two jobs: `Discord Notification` and `Test Code`. These jobs run in parallel under certain conditions, but we're not going to cover that at this point in time. In a future update to the book we will cover parallel execution. For now let's keep things simple.
Then we have stages `B` through `D`. These are going to run in precisely that order, and each executes a single job. Each stage depends on the previous one to do some work or produce some artefact that we need in the next stage(s).
So once the `Test Code` job in `Stage A` has completed, it calls the next job (`Compile Code`) in the next stage (`Stage B`). This repeats through `Stage C` and finally `Stage D`, until the whole pipeline is completed.
The whole diagram represents a complete pipeline.
So remember: a pipeline contains stages, stages contain jobs and jobs contain configuration instructing GitLab CI to execute stuff for us.
To put this into an example more closely aligned with reality, let's write out the above as actual YAML configuration:
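The original listing isn't reproduced in this extract, but a minimal sketch of the four-stage pipeline described above could look like the following. The stage names, job names and the `make`/helper commands are illustrative assumptions, not the author's exact configuration:

```yaml
stages:
  - stage-a
  - stage-b
  - stage-c
  - stage-d

discord-notification:
  stage: stage-a
  script:
    - ./notify-discord.sh   # hypothetical helper that calls the Discord API

test-code:
  stage: stage-a
  script:
    - make test             # illustrative test command

compile-code:
  stage: stage-b
  dependencies:
    - test-code             # explicit dependency on the previous stage's job
  script:
    - make build

package-code:
  stage: stage-c
  dependencies:
    - compile-code
  script:
    - make package

deploy-code:
  stage: stage-d
  dependencies:
    - package-code
  script:
    - make deploy
```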
This is valid YAML and a valid pipeline configuration. It contains the stages we mentioned above and their associated jobs.
Because each job is in its own stage, the whole thing will run in a linear manner (minus `Stage A`, which has two jobs that will attempt to run in parallel). We also further ensure a linear progression where it matters by using the `dependencies:` keyword, which creates an explicit dependency between jobs, thus forcing a linear execution.
Let's now review the contents of a real CI configuration file. It's the file we'll be writing to configure our Terraform pipeline.
We've covered the differences between a pipeline, a stage and a job. Let's now start looking at the Terraform `.gitlab-ci.yml` file and begin to understand the keywords used to construct the whole pipeline.
Here are the very first few lines of our Terraform `.gitlab-ci.yml` file, which constitute the pipeline's global configuration as well as some default values:
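The listing is omitted from this extract. Based on the discussion that follows, the global section could look roughly like this; the image tag, variable name and directory paths are assumptions:

```yaml
image:
  # assumed: the GitLab-maintained Terraform image discussed below
  name: registry.gitlab.com/gitlab-org/terraform-images/stable:latest

variables:
  TF_ROOT: ${CI_PROJECT_DIR}    # assumed: where our Terraform code lives

cache:
  key: "${CI_COMMIT_SHA}"
  paths:
    - ${TF_ROOT}/.terraform/    # share the initialised providers between jobs

before_script:
  - cd ${TF_ROOT}               # move into the Terraform directory before every script
```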
All of this is configuring the pipeline to behave in a particular way and do some tasks for us ahead of each stage. Let's review each of the configuration options above.
This configures the entire pipeline to run all `script:` configurations (explained below) in a Docker container using a specific image.
This particular image is perfect for our needs not just because it provides Terraform, but because it's suitable for use inside of GitLab CI pipelines due to some bootstrapping that's being done around Terraform. This will become clearer later on.
This configuration keyword allows us to define variables that are available for use across the entire pipeline, in all stages, and can be used for all kinds of things.
With the `cache:` keyword we can have the pipeline cache certain files and/or directories between stages/jobs, and even across pipelines themselves. For us this is important because after we call `terraform init` we need to copy the `.terraform/` directory to the other stages in the pipeline. If we didn't, we would have to call `terraform init` for every job.
In our stages we use the `script:` keyword to define the functionality of each stage and actually get our work done. The `before_script:` configuration is used to have a script execute before the script inside each of our `script:` blocks. We're using the GitLab CI provided Terraform Docker image, so we need to use this feature to move into the directory containing our Terraform code.
So, as an example, if we had the following `before_script:`:
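A minimal, hypothetical example; the `output/` directory name is an assumption:

```yaml
before_script:
  - mkdir -p output/   # ensure the directory exists before every job's script runs
```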
We're defining a script that makes sure a directory we need in each stage always exists. If we then used the following `script:` in a stage inside of our pipeline:
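For illustration, assume the job builds something into that directory (the compiler command and file names are hypothetical):

```yaml
script:
  - gcc -o output/app main.c   # illustrative build command writing into output/
```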
Before the stage's script executed, our `before_script:` would run, which means we'd effectively be getting (as I'm sure you've guessed):
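Assuming, for illustration, a `before_script:` that runs `mkdir -p output/` and a job `script:` that builds into that directory, the effective combined script would be:

```yaml
script:
  - mkdir -p output/           # injected from before_script
  - gcc -o output/app main.c   # the job's own script
```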
If you had two or more stages that needed this directory, then of course it would get repetitive having to provide the same command every time. Plus, if the name of the directory changed, using a variable would let you change the `mkdir` call in a single place.
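One way to sketch that, with an assumed `BUILD_DIR` variable (the variable and directory names are hypothetical):

```yaml
variables:
  BUILD_DIR: output/           # change the directory name in one place

before_script:
  - mkdir -p ${BUILD_DIR}
```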
Our Terraform pipeline has the following stages:
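The listing itself is missing from this extract; based on the stage names discussed below, it presumably looks like this:

```yaml
stages:
  - validate
  - plan
  - apply
  - destroy
```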
These stages are stepped through, one by one, in the order shown. We have four stages: `validate`, `plan`, `apply` and `destroy`.
I believe if we explain the jobs behind the `validate`, `plan` and `apply` stages, and their respective job configurations, then we'll have enough information to successfully write the actual files themselves. The `destroy` stage will be understandable after you've studied the others.
The GitLab CI documentation covers stages in more detail.
This is the configuration of a single job inside the `validate` stage:
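The listing is not reproduced here. A sketch consistent with the discussion below might look like this; the rule order and `RUN_ANYWAY` variable come from the text, while the exact quoting and commands are assumptions:

```yaml
validate:
  stage: validate
  rules:
    - if: '$RUN_ANYWAY == "YES"'   # manual override: always include this job
    - exists:
        - .destroy
      when: never                  # a .destroy file means skip this job entirely
    - changes:
        - "*.tf"                   # include the job when Terraform files changed
  script:
    - terraform init
    - terraform validate
```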
Let's break this down into its core components.
When we use the `rules:` keyword we're telling GitLab CI that our job (not the pipeline, or even the stage as a whole) has a list of rules, of which one must evaluate to "true" (with a short-circuit effect in place) before this job will be included in its particular stage. If none of the rules evaluate to "true", then this job does not execute, but the rest of the stage may very well run if another job inside of said stage does evaluate to "true".
Let's look at this visually with a simple, contrived example (and assume all jobs are in a single stage):
```mermaid
graph LR
    a[Job 1] --> b
    b[Job 2] --> c
    c[Job 3]
```
If we have the following rules for each job, we can make adjustments to them to alter the above flow...
a => rules: A_VAR==1
b => rules: B_VAR==2
c => rules: C_VAR==3
If we execute the pipeline and we set `A_VAR=1` and `B_VAR=2`, but we set `C_VAR=99`, then the pipeline will look like this:
```mermaid
graph LR
    a[Job 1] --> b
    b[Job 2]
```
If we flip that logic on its head entirely, setting only `C_VAR=3`, then the pipeline will look like this:
```mermaid
graph LR
    a[Job 3]
```
Put another way: if a job's rules exclude it from the stage, then GitLab CI moves on to the next job, looking for one that evaluates to "true", which is then included in the stage (which in turn means the stage is included in the pipeline).
All of our stages only have a single job defined in them.
So what rules do we have in our `validate` job?
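The listing is omitted from this extract. Pieced together from the surrounding discussion, the rules presumably look something like this (the exact quoting is an assumption, though the text notes the `if:` rule comes first):

```yaml
rules:
  - if: '$RUN_ANYWAY == "YES"'
  - exists:
      - .destroy
    when: never
  - changes:
      - "*.tf"
```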
We're using an `exists:` keyword to determine if a file (`.destroy`) exists or not. If it does, then the `when:` keyword determines what should happen, and in this case `never` means this stage should never be included in the pipeline.
Finally we're asking GitLab CI to check for changes to a list of pattern matches. In our case we're looking for changes to any files that match `*.tf` - that is, Terraform configuration files. In the event such changes do exist, this rule evaluates to true and the stage is included in the pipeline.
The final rule explains why I've opted to include the first rule, the `if:` keyword: what if there are no changes to the Terraform files? How do I run the pipeline? Including this `if:` check means I can "override" the other rules and have the stage included in all cases.
This raises another important point about `rules:` in GitLab CI: the first rule in the list to evaluate to true determines whether or not the stage is included in the pipeline. No other rules are evaluated after this point. That's why the `if:` clause is included first - it means all other rules are ignored if `RUN_ANYWAY` equals `YES`.
The `script:` keyword is basically the backbone of most CI configurations. It's how we define the actual functionality of the job within the stage. There are other things we can do with a job, like triggering other remote pipelines, but what you'll see the most is a `script:` keyword being used to execute some shell code.
In the above script we first `init` the Terraform installation. Then we `validate` that the syntax of the code is valid. If it isn't, the stage will fail and the pipeline will come to a halt.
Now let's review the same thing for the `plan` stage - its job configuration:
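The listing is not reproduced in this extract. Based on the two artefacts discussed below (`plan.cache` and a JSON report), a sketch might look like this; the `terraform show` conversion step and the rules are assumptions:

```yaml
plan:
  stage: plan
  rules:
    - if: '$RUN_ANYWAY == "YES"'
    - exists:
        - .destroy
      when: never
    - changes:
        - "*.tf"
  script:
    - terraform init
    - terraform plan -out=plan.cache
    # assumed: convert the plan to JSON for GitLab's Terraform integration
    - terraform show -json plan.cache > plan.json
  artifacts:
    paths:
      - plan.cache
    reports:
      terraform: plan.json
```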
We have a keyword here - `artifacts:` - that we haven't seen before. Let's go over what it does. With the `artifacts:` keyword we're telling GitLab CI to create two artefacts: the plan file for Terraform to use at later stages, and the JSON version that gets pushed into the back end of the GitLab CI Terraform solution.
We'll ignore the magic behind the latter part of this stage and instead just focus on the first part.
`artifacts:` is a bit like the `cache:` keyword we saw earlier - it stores items of interest for us. However, with artefacts we decide which stages get the artefacts themselves, whereas using `cache:` means whatever is cached is included in all stages. Not every stage needs whatever we store as artefacts, which is why we use them.
As we need to generate a Terraform plan so that our `apply` stage can do its job, we use the `artifacts:` keyword to store it for later retrieval.
And finally we'll review the one job we've configured for the `apply` stage:
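Again the listing is omitted here; a sketch of the `apply` job consistent with the text could be the following (the rules and exact commands are assumptions):

```yaml
apply:
  stage: apply
  rules:
    - if: '$RUN_ANYWAY == "YES"'
    - exists:
        - .destroy
      when: never
    - changes:
        - "*.tf"
  dependencies:
    - plan                         # download the plan job's artefacts (plan.cache)
  script:
    - terraform init
    - terraform apply plan.cache   # a saved plan applies without prompting
```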
Here we encounter another new keyword: `dependencies:`. In the `plan` stage we used the `artifacts:` keyword to create a downloadable artefact from our Terraform plan file (`plan.cache`). Now we're using the `dependencies:` keyword to tell the job which artefacts to download from which job. In this case it's the `plan` job, as defined in the code above.
This is how we move objects between jobs, stages and even pipelines: we use `dependencies:` (among other features available to us, too).
We've gone over the basics of a simple GitLab CI configuration file. We've now got a feel for the formatting and some of the basic keywords being used. This is enough to work with for the time being, but if you want to know more or simply explore what's available (tinkering is a good idea!) then check out the GitLab CI configuration file reference.
In the next section we're going to discuss the Terraform pipeline configuration and then actually begin writing out the files.