Hello, Geopoiesis!

Turbocharging your infrastructure-as-code

Geopoiesis is a specialized continuous integration and deployment tool for modern declarative infrastructure provisioning and management. It is designed as a companion for Terraform - an excellent, provider-agnostic open-source DevOps tool. Geopoiesis takes up where Terraform left off, providing an easy and safe way for teams to collaborate on shared infrastructure.

Geopoiesis is distributed as a single statically linked binary, designed to be self-managed on top of AWS backend primitives: S3, DynamoDB and SSM Parameter Store for storage, CloudWatch for telemetry, CloudWatch Logs for logging and KMS for encryption. Ultimately, we'd like to also support non-AWS backends.

This document assumes you are reasonably familiar with the main tenets of Terraform. If you aren't, you're probably better off starting with a basic introduction first. Ideally, you should be no stranger to AWS, either.

If you're already aware that you need a tool like Geopoiesis, feel free to skip this section and go directly to read about its core concepts. Otherwise, the rest of this article will try to make you aware of some of the everyday problems with operating Terraform, and show how Geopoiesis addresses those.

Life before Geopoiesis

While standalone Terraform offers a good enough experience for a single developer working on a small project, it quickly starts to show its limitations as the team grows. If you have been keeping your configuration and state locally, now is the time to check the code in to a version control system and put your state in remote storage.

In a multi-user environment, you will want to peer-review each change to at least see whether the code is valid and produces the right result. The most immediate way of doing that is checking out the code under review and running terraform plan locally. If all is well, you merge the change and run terraform apply on the main branch.

Not only is the above tedious, but it also requires a lot of discipline to always make sure you are pushing the new version of the main branch, and that two team members are not stepping on each other's toes. Hence, it makes sense to delegate running Terraform somewhere else, like a generic CI platform. You can run terraform plan on each commit, and terraform apply on each merge to the main branch.
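The rule described above (plan on every commit, apply only on merges to the main branch) can be sketched as a small decision function that a CI job might call. This is an illustrative sketch only: the `main` branch name, the function name and the way the CI event is represented are assumptions, while the `-input=false` and `-auto-approve` flags are standard Terraform CLI options for non-interactive use.

```python
from typing import List

MAIN_BRANCH = "main"  # assumption: the branch that deployments track


def terraform_args(branch: str, is_merge: bool) -> List[str]:
    """Pick the Terraform invocation for a CI event.

    Merges to the main branch apply changes; everything else only plans.
    """
    if is_merge and branch == MAIN_BRANCH:
        # -auto-approve skips the interactive prompt, which CI cannot answer.
        return ["terraform", "apply", "-input=false", "-auto-approve"]
    # -input=false makes Terraform fail instead of waiting for user input.
    return ["terraform", "plan", "-input=false"]
```

A CI script could pass the returned list to `subprocess.run(..., check=True)` so that a failing plan or apply fails the build.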

The above approach goes a long way towards making your life with Terraform easier: you can get status checks on your commits, thereby delegating the task of validating your configuration to the machine. You can also look at the CI output to understand the impact of proposed changes. As long as you retain your deployment history, you get crude auditing, too.

Still, a few issues remain. For example, multiple merges to the main branch can overlap, especially if your plan/apply cycle takes a long time. If you're not using remote state locking, state changes will overwrite each other and all hell will break loose. If you are using state locking, some of your changes will simply fail. This is better than arbitrarily overwriting existing state, but usually not what you want either: you'd like to serialize your apply operations instead. Also, how can you make sure that the outcome of the dry-run plan will be the same as that of the final apply? A conflicting change to your state - external or coming from a different merge - could significantly alter the outcome.

Last but not least, there is a problem with managing secrets. By its nature, Terraform requires very strong credentials to manage the infrastructure lifecycle. These credentials could likely be used to take down your entire cloud infrastructure. Storing them locally on multiple user machines is sub-optimal and - unless you're in full control of your CI/CD pipeline - giving them to a third party is even more so.

Enter Geopoiesis

Geopoiesis addresses the shortcomings of both running Terraform locally and using a more generic CI/CD platform.

VCS integration

Geopoiesis integrates with GitHub and GitHub Enterprise (GitLab integration coming soon) and uses webhooks to receive information about pushes to the repository. It runs terraform plan on each commit, and pushes a detailed result as a commit status. Like this:

In the above example, three different Geopoiesis scopes follow the repository and each reports on the change to its state that merging this Pull Request is expected to cause.

Optional manual review

When a commit is merged to a branch directly tracked by one of the Geopoiesis scopes, Geopoiesis runs terraform plan and - depending on the scope settings - can ask for a final manual approval.

The plan and the workspace are then encrypted and temporarily persisted to S3 to ensure that if the above is confirmed, terraform apply will attempt to perform exactly those changes on exactly that version of remote state.
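With plain Terraform, the same guarantee is usually obtained by saving the plan to a file and later applying that exact file. The sketch below only builds the two command lines; it does not show how Geopoiesis itself encrypts or stores the plan, and the function names and the example file name are hypothetical. The `-out` option of `terraform plan` and the plan-file argument of `terraform apply` are standard Terraform CLI features.

```python
from typing import List


def plan_command(plan_path: str) -> List[str]:
    """Dry run that records the proposed changes in a plan file."""
    return ["terraform", "plan", "-input=false", f"-out={plan_path}"]


def apply_command(plan_path: str) -> List[str]:
    """Apply exactly the changes recorded in a previously saved plan file."""
    return ["terraform", "apply", "-input=false", plan_path]
```

For example, `plan_command("run-42.tfplan")` would run at review time and `apply_command("run-42.tfplan")` after approval. Terraform is expected to refuse a saved plan if the state has changed since the plan was created, which is why the operations on that state need to be held back in the meantime.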

Note that manual approval is just the most conservative of the three Geopoiesis modes. You may want to read more about further automation here.

Change serialization

Until the above is confirmed and applied (or discarded) all other operations on the remote state linked to this scope are queued. This refers to both regular runs:

...as well as tasks, which can also perform arbitrary changes on the state:
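To make the queueing behaviour described in this section concrete, here is a minimal in-memory sketch of a per-scope queue in which one operation at a time waits for confirmation while everything else lines up behind it. It is not Geopoiesis's actual implementation: the `ScopeQueue` class, its method names and the string identifiers for runs and tasks are all hypothetical.

```python
from collections import deque
from typing import Deque, List, Optional


class ScopeQueue:
    """Handles operations for one scope strictly one at a time."""

    def __init__(self) -> None:
        self.pending: Optional[str] = None  # operation awaiting confirm/discard
        self.waiting: Deque[str] = deque()  # operations queued behind it
        self.applied: List[str] = []        # history of confirmed operations

    def submit(self, op: str) -> None:
        """Start op if the scope is idle, otherwise queue it."""
        if self.pending is None:
            self.pending = op
        else:
            self.waiting.append(op)

    def confirm(self) -> None:
        """Record the pending operation as applied, then promote the next one."""
        if self.pending is None:
            raise RuntimeError("nothing to confirm")
        self.applied.append(self.pending)
        self._advance()

    def discard(self) -> None:
        """Drop the pending operation without applying it, then promote the next one."""
        if self.pending is None:
            raise RuntimeError("nothing to discard")
        self._advance()

    def _advance(self) -> None:
        # The oldest queued operation becomes pending; the scope is idle if none is left.
        self.pending = self.waiting.popleft() if self.waiting else None
```

Submitting a run, a task and another run in that order leaves the first run pending and the other two waiting; each `confirm()` or `discard()` releases exactly one more operation, so no two operations ever touch the same state concurrently.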

Secrets management

Unlike when running Terraform locally, in Geopoiesis, environment secrets are stored in a central location. Yet unlike with SaaS CI/CD offerings, you have full control over the process. While Geopoiesis provides you with a convenient UI (see below), your settings never leave your AWS account - they are stored in SSM Parameter Store and optionally encrypted with KMS:

When saving an environment variable, you have the option to either store it in plaintext, or make it a secret. In the latter case, it is not retrievable from the UI (although you could retrieve it using a task if you're sure it's a good idea) and on the backend it is encrypted using your installation's KMS key:
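The plaintext-versus-secret choice maps naturally onto the two SSM Parameter Store types, `String` and `SecureString`. The sketch below builds the arguments for the SSM `PutParameter` call accordingly. How Geopoiesis actually names and writes its parameters is not covered here, so the function names, the example parameter path and the key alias are assumptions made for illustration.

```python
from typing import Any, Dict, Optional


def parameter_request(name: str, value: str, secret: bool,
                      kms_key_id: Optional[str] = None) -> Dict[str, Any]:
    """Build keyword arguments for the SSM PutParameter call."""
    request: Dict[str, Any] = {
        "Name": name,
        "Value": value,
        # SecureString values are encrypted with KMS; String values are not.
        "Type": "SecureString" if secret else "String",
        "Overwrite": True,
    }
    if secret and kms_key_id is not None:
        # Without KeyId, SSM falls back to the account's default key.
        request["KeyId"] = kms_key_id
    return request


def store(request: Dict[str, Any]) -> None:
    """Send the request to AWS; needs boto3 and valid credentials."""
    import boto3  # imported lazily so the builder works without boto3 installed

    boto3.client("ssm").put_parameter(**request)
```

For instance, `parameter_request("/demo/scope/API_TOKEN", "s3cr3t", secret=True, kms_key_id="alias/demo")` yields a `SecureString` request tied to the given key, while passing `secret=False` yields a plain `String` request with no `KeyId`.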

You can read more about environment management here.

State management

While Geopoiesis requires that remote Terraform state is used, it does not offer persistent state storage. This is by design: your state is likely to contain very sensitive data that you know best how to store and protect. That said, given that Geopoiesis operates in the AWS environment, we encourage you to consider the S3 backend. Note that you don't necessarily need locking if you exclusively manage your state using a single installation of Geopoiesis.

Note that however you decide to store your remote state, it is still accessible to Geopoiesis tasks.

Further reading

Much of the work on Geopoiesis was guided by two excellent books - Terraform: Up & Running by Yevgeniy Brikman, offering a practical crash course in Terraform, and Cloud Native Infrastructure by Justin Garrison and Kris Nova, taking a more strategic view of the modern approach to managing your cloud infrastructure.

If your terraforming needs don't warrant a fancy solution like Geopoiesis, be sure to at least take a look at Terragrunt, which addresses some of the problems mentioned above.