Runs

Triggering a run

With Geopoiesis running and your scope environment properly set up, you can now trigger your first Geopoiesis run:

Assuming everything went well (and there are changes to your state), you will see a screen like this:

Geopoiesis just ran terraform init and terraform plan with a full state refresh. The exact snapshot of your change has been archived to S3. At this stage you have the option to confirm or discard your plan. If you confirm, your workspace will be pulled from S3 and used to apply the plan using terraform apply.
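The plan-then-apply flow described above can be sketched as a pair of command builders. This is only an illustration under stated assumptions, not Geopoiesis's actual invocation: the function names, the plan.tfplan file name and the choice to save the plan with -out are hypothetical, and the S3 archiving step is left out entirely.

```python
from typing import List

PLAN_FILE = "plan.tfplan"  # hypothetical file name, not taken from Geopoiesis


def plan_commands(plan_file: str = PLAN_FILE) -> List[List[str]]:
    """Commands for the planning phase: init, then plan with a state refresh."""
    return [
        ["terraform", "init", "-input=false"],
        # -refresh=true is Terraform's default; it is spelled out here only
        # to mirror the "full state refresh" mentioned in the text.
        ["terraform", "plan", "-input=false", "-refresh=true", f"-out={plan_file}"],
    ]


def apply_commands(plan_file: str = PLAN_FILE) -> List[List[str]]:
    """Command for the apply phase: apply exactly the plan that was saved."""
    return [["terraform", "apply", "-input=false", plan_file]]
```

Each inner list can be handed to subprocess.run(cmd, check=True). Applying a saved plan file, rather than planning again, is what lets the confirmed change and the applied change be the same thing.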

Assuming you chose to confirm the plan, a Geopoiesis worker will pick it up from the queue, and apply it like so:

When you next visit the Runs section, it will show your applied run:

That run reconciles the state, so the next time you trigger a run, the result will be much less exciting. Note that this may have been your result the first time, too, if your state was already up to date:

Run state diagram

The above workflow represents a happy path, but there are always plenty of things that can go wrong. Below is a detailed diagram explaining all possible state transitions for a user-triggered run.

When a new run is created, it starts in the waiting state. If some other run currently holds a lock on its scope, it is marked as blocked for as long as the lock is held. At any point before actual work starts, the user can cancel the run and thus transition it to the canceled state. When the run is not blocked and a worker is available, work begins and the run enters the plan initializing state. This is where we set up your Terraform workspace and run any initialization hooks you may have set up. If that's successful, we move to the planning phase.

The planning phase can have one of four possible outcomes. The unhappy ones are the worker crashing, which results in the plan crashed state, and one of the commands exiting with a non-zero status, which moves the run to plan failed. For a more nuanced explanation of failing vs. crashing, please see the section below.

If all goes well, Geopoiesis looks at the delta reported by Terraform. If the plan does not introduce any changes, the run transitions into a terminal state called no changes. If there are changes, the state changes to unconfirmed and the worker puts the run back on the queue. The user can now either confirm or discard the Terraform plan associated with the run. Discarding the plan changes the run state to discarded, while confirming it marks the run as confirmed.
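As a rough sketch, the four planning outcomes and the manual review step could be expressed as two small functions. The function names and parameters are hypothetical and exist only for illustration; the state names are the ones used in this section.

```python
from typing import Sequence


def planning_outcome(worker_crashed: bool,
                     exit_codes: Sequence[int],
                     has_changes: bool) -> str:
    """Map what happened during planning to the resulting run state."""
    if worker_crashed:
        return "plan crashed"
    if any(code != 0 for code in exit_codes):
        return "plan failed"
    if not has_changes:
        return "no changes"  # terminal: nothing to confirm or apply
    return "unconfirmed"     # back on the queue, waiting for a decision


def review(decision: str) -> str:
    """Map the user's decision on an unconfirmed run to the next state."""
    outcomes = {"confirm": "confirmed", "discard": "discarded"}
    if decision not in outcomes:
        raise ValueError(f"unknown decision: {decision!r}")
    return outcomes[decision]
```

For example, planning_outcome(False, [0, 0], True) yields unconfirmed, after which review("confirm") yields confirmed.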

You don't necessarily need to confirm every change manually. Once you're comfortable with Geopoiesis, we suggest considering automation.

Once confirmed, the run waits for the next available worker, which then changes its state to applying. Much like the planning phase, applying a plan can have one of three outcomes. If all goes well, the run is marked as applied. If any of the commands exits with a non-zero code, the run is marked as apply failed. If the worker crashes while applying, the run is marked as apply crashed.
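The transitions described in this section can be collected into a small lookup table. This is a sketch built only from the prose here, not from Geopoiesis's source: the names TRANSITIONS, TERMINAL and can_transition are hypothetical, the blocked to waiting edge is an assumption about what happens when the lock is released, and failures during plan initializing are not modelled because the text does not describe them.

```python
from typing import Dict, FrozenSet

TRANSITIONS: Dict[str, FrozenSet[str]] = {
    "waiting": frozenset({"blocked", "canceled", "plan initializing"}),
    # Assumption: a blocked run goes back to waiting once the lock is released.
    "blocked": frozenset({"waiting", "canceled"}),
    "plan initializing": frozenset({"planning"}),
    "planning": frozenset({"plan crashed", "plan failed", "no changes", "unconfirmed"}),
    "unconfirmed": frozenset({"confirmed", "discarded"}),
    "confirmed": frozenset({"applying"}),
    "applying": frozenset({"applied", "apply failed", "apply crashed"}),
}

# Terminal states are those that can be reached but have no way out.
TERMINAL: FrozenSet[str] = frozenset(
    target for targets in TRANSITIONS.values() for target in targets
) - frozenset(TRANSITIONS)


def can_transition(current: str, target: str) -> bool:
    """Return True if the table allows moving from current to target."""
    return target in TRANSITIONS.get(current, frozenset())
```

With this table, can_transition("unconfirmed", "confirmed") is true, while can_transition("applied", "applying") is false because applied is terminal.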

In the UI you will see a shorter form of the run state: instead of plan failed or plan crashed, the labels will just say failed and crashed. This is merely intended to keep the main UI elements clean and readable, since the full state name is still available in the history section. The GraphQL API will also return the full state.
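If you consume full state names (for example from the API) and want to show the same compact labels, a helper along these lines would do. The function name is hypothetical, and treating apply failed and apply crashed the same way as their plan counterparts is an assumption; the text only names the plan variants explicitly.

```python
def short_label(state: str) -> str:
    """Drop the phase prefix from failed/crashed states, keep others as-is."""
    for prefix in ("plan ", "apply "):
        if state.startswith(prefix):
            rest = state[len(prefix):]
            if rest in ("failed", "crashed"):
                return rest
    return state
```

States such as plan initializing or no changes pass through unchanged, since only the failed and crashed labels are shortened.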

Failing vs. crashing

The Geopoiesis worker is designed to crash easily and let the garbage collector clean up any remaining artefacts. Your run will fail if and only if the planning or applying phase fails due to a problem with your Terraform config. But if one or more of the underlying backend primitives are misbehaving (e.g. heavy DynamoDB throttling, a partial AWS outage) or Geopoiesis cannot pull your code from the repo, the worker will simply crash. We assume that you're running Geopoiesis as an ECS service or in an EC2 Auto Scaling group, so you should not need to worry about restarting individual failed processes.
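The distinction can be illustrated with a small classifier. This is a minimal sketch under assumptions: the exception classes and the classify_error function are hypothetical and do not come from Geopoiesis. It only encodes the rule that a Terraform config problem means failed and anything else means crashed.

```python
class TerraformConfigError(Exception):
    """Hypothetical: the Terraform config itself is at fault."""


class InfrastructureError(Exception):
    """Hypothetical: a backend primitive or the code repo misbehaved."""


def classify_error(phase: str, error: Exception) -> str:
    """Return the run state for an error raised during 'plan' or 'apply'."""
    if phase not in ("plan", "apply"):
        raise ValueError(f"unknown phase: {phase!r}")
    # Only a problem with the Terraform config counts as a failure;
    # every other error is treated as a crash of the worker.
    if isinstance(error, TerraformConfigError):
        return f"{phase} failed"
    return f"{phase} crashed"
```

Defaulting to crashed for unrecognised errors follows the "if and only if" wording above: a run is never reported as failed unless the config is known to be the cause.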