Terraform Module: Cloud Run Worker Pools For GitHub Runners
Introduction
Running GitHub Actions on self-hosted runners calls for infrastructure that scales with workflow demand and costs little when idle. This article walks through a Terraform module that deploys a Cloud Run Worker Pool to host those runners. It covers the technical requirements, the resource definition, the input variables and outputs, and the acceptance criteria and dependencies for the work.
Understanding Cloud Run Worker Pools
Cloud Run Worker Pools run long-lived container instances that pull work from an external source instead of serving HTTP requests. That fits GitHub Actions runners, which poll GitHub for queued jobs, and the instance count can be adjusted to follow workflow demand. Hosting runners on a worker pool keeps CI/CD pipelines responsive while avoiding the cost of permanently provisioned machines, which matters most for organizations that need flexible scaling and tight cost control.
Objective: Crafting a Terraform Module
Our primary objective is to develop a Terraform module that automates the deployment of Cloud Run Worker Pools tailored for GitHub Actions runners. This module will encapsulate the necessary configurations and dependencies, making it reusable and easy to manage. The module's design prioritizes cost optimization, scalability, and seamless integration with GitHub Actions.
Technical Requirements: Setting the Stage
Before diving into the implementation, let's outline the technical specifications for our Cloud Run Worker Pool. These requirements are designed to balance performance and cost-effectiveness. This involves carefully configuring scaling options, CPU and memory allocations, timeout settings, and concurrency limits.
Cloud Run Worker Pool Configuration
To optimize performance and cost, the following configurations are crucial:
| Setting | Value | Rationale |
|---|---|---|
| Scaling | 0-10 instances | Cost optimization with room to scale |
| CPU | 2 vCPU | Standard GitHub runner spec |
| Memory | 4Gi | Sufficient for most workflows |
| Timeout | 3600s (1 hour) | Max job duration |
| Concurrency | 1 | One job per instance |
These settings ensure that our worker pool can handle varying workloads while minimizing unnecessary costs. The scaling range of 0-10 instances allows the pool to scale down to zero when idle, further reducing expenses.
Implementation Location
The Terraform module will reside in the following directory structure:
- Directory: terraform/modules/worker-pool/
- Files: main.tf, variables.tf, outputs.tf
This structure promotes modularity and maintainability, making it easier to manage and update the module in the future. The main.tf file will contain the primary resource definitions, variables.tf will define the input variables, and outputs.tf will declare the output values.
Terraform Implementation: The Core Logic
The heart of our solution lies in the Terraform implementation. We'll use the google-beta provider to interact with the Cloud Run Worker Pool API. The following code block demonstrates the main.tf file, which defines the worker pool resource and its configurations.
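# terraform/modules/worker-pool/main.tf
terraform {
  required_providers {
    google-beta = {
      source = "hashicorp/google-beta"
      # Assumption: the worker pool resource is not part of the 5.x provider
      # series, so a 6.x constraint is used here. Check the provider changelog
      # for the first release that includes it.
      version = "~> 6.0"
    }
  }
}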
# Input variables are declared in variables.tf (see the next section).
# Repeating the declarations in main.tf would make Terraform fail with a
# duplicate variable declaration error, because both files belong to the
# same module.
# Cloud Run Worker Pool (beta)
resource "google_cloud_run_v2_worker_pool" "runners" {
  provider = google-beta

  name     = var.name
  location = var.region
  project  = var.project_id

  template {
    containers {
      image = var.image

      resources {
        limits = {
          cpu    = var.cpu
          memory = var.memory
        }
      }

      # Environment variables
      env {
        name  = "GITHUB_ORG"
        value = var.github_org
      }
      env {
        name  = "RUNNER_LABELS"
        value = var.runner_labels
      }

      # Secrets from Secret Manager
      env {
        name = "GITHUB_APP_ID"
        value_source {
          secret_key_ref {
            secret  = var.secrets.app_id
            version = "latest"
          }
        }
      }
      env {
        name = "GITHUB_APP_INSTALLATION_ID"
        value_source {
          secret_key_ref {
            secret  = var.secrets.installation_id
            version = "latest"
          }
        }
      }
      env {
        name = "GITHUB_APP_PRIVATE_KEY"
        value_source {
          secret_key_ref {
            secret  = var.secrets.private_key
            version = "latest"
          }
        }
      }
    }

    # Service account
    service_account = var.service_account_email

    # Scaling configuration
    scaling {
      min_instance_count = var.min_instances
      max_instance_count = var.max_instances
    }

    # Timeout for long-running jobs
    timeout = "3600s"

    # Max retries on failure
    max_retries = 3
  }

  labels = {
    component  = "github-runner"
    managed-by = "terraform"
  }
}
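This resource definition configures the worker pool's scaling bounds, resource limits, and environment variables. It reads the GitHub App credentials from Secret Manager, so nothing sensitive is hardcoded in the configuration. Because the resource is still in beta, its schema can change between provider releases. Run terraform validate against your pinned provider version to confirm that every argument used here is accepted, including the placement of the scaling block and the timeout and max_retries settings.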
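The secret references in the resource only resolve if the worker pool's service account is allowed to read those secrets. The article tracks IAM as a separate dependency, so the following is only a minimal sketch of one way to grant that access from inside the module. The resource label runner_secret_access is a hypothetical name, and the sketch assumes that var.secrets holds Secret Manager secret IDs that live in var.project_id.

```hcl
# Sketch: grant the runner service account read access to each GitHub App secret.
# Assumes the bindings are not already managed elsewhere (for example by the IAM
# work listed under Dependencies).
resource "google_secret_manager_secret_iam_member" "runner_secret_access" {
  provider = google-beta

  # One binding per field of var.secrets (app_id, installation_id, private_key).
  for_each = var.secrets

  project   = var.project_id
  secret_id = each.value
  role      = "roles/secretmanager.secretAccessor"
  member    = "serviceAccount:${var.service_account_email}"
}
```

Iterating over var.secrets keeps the bindings in step with the credentials the container reads: a field added to the object later gets its own binding without further changes.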
Variables File: Defining Inputs
The variables.tf file defines the input variables for our module. These variables allow users to customize the deployment according to their specific needs. Each variable includes a description and a default value (where applicable). This makes the module more user-friendly and self-documenting.
# terraform/modules/worker-pool/variables.tf
variable "project_id" {
  description = "GCP Project ID"
  type        = string
}

variable "region" {
  description = "GCP Region for the worker pool"
  type        = string
  default     = "us-central1"
}

variable "name" {
  description = "Name of the worker pool"
  type        = string
  default     = "github-runners"
}

variable "image" {
  description = "Container image URL (from Artifact Registry)"
  type        = string
}

variable "service_account_email" {
  description = "Service account email for the worker pool"
  type        = string
}

variable "min_instances" {
  description = "Minimum number of instances (0 for scale-to-zero)"
  type        = number
  default     = 0
}

variable "max_instances" {
  description = "Maximum number of instances"
  type        = number
  default     = 10
}

variable "cpu" {
  description = "CPU allocation per instance"
  type        = string
  default     = "2"
}

variable "memory" {
  description = "Memory allocation per instance"
  type        = string
  default     = "4Gi"
}

variable "github_org" {
  description = "GitHub organization name"
  type        = string
  default     = "Matchpoint-AI"
}

variable "runner_labels" {
  description = "Comma-separated labels for the runner"
  type        = string
  default     = "self-hosted,cloud-run,linux,x64"
}

variable "secrets" {
  description = "Secret Manager secret IDs for GitHub App credentials"
  type = object({
    app_id          = string
    installation_id = string
    private_key     = string
  })
}
The variables defined here cover a wide range of configurations, from the GCP project ID to the GitHub organization name. This flexibility ensures that the module can be adapted to various environments and use cases.
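Input validation can catch mistakes before they reach the beta API. As a sketch, the declarations of min_instances and max_instances in variables.tf could be extended with validation blocks like the ones below. They would replace the existing declarations, not sit beside them. The ceiling of 10 mirrors the scaling range in the configuration table and is an assumption about the intended limit, not something the source enforces.

```hcl
variable "min_instances" {
  description = "Minimum number of instances (0 for scale-to-zero)"
  type        = number
  default     = 0

  validation {
    # Reject negative or fractional instance counts.
    condition     = var.min_instances >= 0 && floor(var.min_instances) == var.min_instances
    error_message = "The min_instances value must be a whole number greater than or equal to 0."
  }
}

variable "max_instances" {
  description = "Maximum number of instances"
  type        = number
  default     = 10

  validation {
    # Assumed ceiling of 10, taken from the configuration table.
    condition     = var.max_instances >= 1 && var.max_instances <= 10
    error_message = "The max_instances value must be between 1 and 10."
  }
}
```

With these in place, terraform plan fails early with a readable message when a caller passes an out-of-range value.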
Outputs File: Exposing Key Identifiers
The outputs.tf file defines the output values of our module. These outputs provide key identifiers and URIs that can be used by other Terraform configurations or applications. By exposing these values, we facilitate integration with other systems and workflows. This allows users to easily retrieve essential information about the deployed worker pool, such as its ID and URI.
# terraform/modules/worker-pool/outputs.tf
output "worker_pool_id" {
  description = "The ID of the worker pool"
  value       = google_cloud_run_v2_worker_pool.runners.id
}

output "worker_pool_name" {
  description = "The name of the worker pool"
  value       = google_cloud_run_v2_worker_pool.runners.name
}

output "worker_pool_uri" {
  description = "The URI of the worker pool"
  value       = google_cloud_run_v2_worker_pool.runners.uri
}
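The outputs expose the worker pool's ID, name, and URI for use in monitoring and in other Terraform configurations. Worker pools do not serve HTTP traffic, so check that the uri attribute is exported by your provider version before relying on it.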
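To show how the inputs and outputs fit together, here is a minimal sketch of a root configuration that calls the module. Every value is a hypothetical placeholder: the project ID, image URL, service account, and secret IDs are not from the source. The source path assumes the root configuration sits at the repository root.

```hcl
# Root configuration (all values below are hypothetical placeholders).
provider "google-beta" {
  project = "my-gcp-project"
  region  = "us-central1"
}

module "worker_pool" {
  source = "./terraform/modules/worker-pool"

  project_id            = "my-gcp-project"
  image                 = "us-central1-docker.pkg.dev/my-gcp-project/runners/github-runner:latest"
  service_account_email = "github-runner@my-gcp-project.iam.gserviceaccount.com"

  # Secret Manager secret IDs, not the secret values themselves.
  secrets = {
    app_id          = "github-app-id"
    installation_id = "github-app-installation-id"
    private_key     = "github-app-private-key"
  }
}

# Re-export a module output so it shows up in `terraform output`.
output "runner_pool_name" {
  description = "Name of the deployed worker pool"
  value       = module.worker_pool.worker_pool_name
}
```

Only the inputs without defaults are set here. Region, name, scaling bounds, CPU, memory, organization, and runner labels fall back to the defaults declared in variables.tf.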
Acceptance Criteria: Ensuring Quality
To ensure the quality and reliability of our Terraform module, we've established a set of acceptance criteria. These criteria cover functional requirements, code quality, and verification steps. Meeting these criteria is essential for a successful deployment.
Functional Requirements
The module must meet the following functional requirements:
- [ ] Worker pool Terraform resource created using google-beta provider
- [ ] Scaling configured: min=0, max=10
- [ ] Resources: 2 vCPU, 4Gi memory
- [ ] Timeout set to 1 hour
- [ ] GitHub App secrets injected via Secret Manager references
- [ ] Service account attached
- [ ] Environment variables set for GitHub org and runner labels
These requirements ensure that the worker pool is correctly configured and integrated with the necessary services.
Code Quality Requirements
The code must adhere to the following quality standards:
- [ ] Uses google-beta provider for worker pool resource
- [ ] All variables have descriptions and sensible defaults
- [ ] Outputs expose key identifiers
- [ ] Code passes terraform fmt and terraform validate
These standards promote readability, maintainability, and adherence to best practices.
Verification Steps
To verify the deployment, we'll use the following steps:
# After terraform apply
gcloud beta run worker-pools describe github-runners \
  --region=us-central1 \
  --project="${PROJECT_ID}"

# Check scaling config
gcloud beta run worker-pools describe github-runners \
  --region=us-central1 \
  --project="${PROJECT_ID}" \
  --format="value(template.scaling)"
These commands inspect the deployed worker pool. The second one prints only the scaling settings, which lets us confirm that the minimum and maximum instance counts match the values passed to the module.
Dependencies: Mapping the Landscape
Our Terraform module has several dependencies that must be addressed before deployment. These dependencies include IAM configurations, Secret Manager secrets, Artifact Registry images, and the runner image itself. Understanding these dependencies is crucial for a smooth deployment process.
Dependencies List
- Blocked By: #3 (IAM), #4 (Secrets), #5 (Artifact Registry), #7 (Runner Image)
- Blocks: #9 (Deploy worker pool)
Addressing these dependencies ensures that all necessary resources are in place before deploying the worker pool. This reduces the risk of deployment failures and ensures that the worker pool functions correctly.
Estimated Complexity: Gauging the Effort
The implementation of this Terraform module involves a moderate level of complexity. This is due to the beta nature of the Cloud Run Worker Pools API and the need to integrate with multiple Google Cloud services. However, the benefits of a scalable and cost-effective CI/CD pipeline make the effort worthwhile.
Complexity Assessment
- Effort: Medium
- Risk: Medium (beta API may have quirks)
- Files Changed: 3 files
Being aware of these complexities allows us to plan accordingly and allocate the necessary resources for the implementation.
Notes: Important Considerations
Several key considerations should be kept in mind during the implementation process. These include the beta status of Cloud Run Worker Pools, the potential need to enable the beta API, and the requirement for the google-beta provider. Keeping these notes in mind can help avoid common pitfalls and ensure a successful deployment.
Key Considerations
- Cloud Run Worker Pools is a beta feature
- May need to enable the beta API
- The google-beta provider is required
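For the point about enabling APIs, one common approach is to enable the required services from Terraform itself. The sketch below assumes that the Cloud Run Admin API (run.googleapis.com) and the Secret Manager API (secretmanager.googleapis.com) are the services that need to be active. The source does not say whether any further beta-specific enablement is required, so treat the list as a starting point. The resource label required is a hypothetical name.

```hcl
# Sketch: enable the services the worker pool depends on.
# The service list is an assumption; extend it if your project needs more.
resource "google_project_service" "required" {
  provider = google-beta

  for_each = toset([
    "run.googleapis.com",
    "secretmanager.googleapis.com",
  ])

  project = var.project_id
  service = each.value

  # Leave the APIs enabled if this configuration is destroyed, since other
  # workloads in the project may depend on them.
  disable_on_destroy = false
}
```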
Definition of Done: Setting the Goal
To mark the successful completion of this project, we've defined a clear set of criteria. These criteria encompass the creation of the Terraform module, adherence to coding standards, and successful validation. Meeting these criteria signifies that the module is ready for production use.
Completion Criteria
- [ ] Terraform module created at terraform/modules/worker-pool/
- [ ] Module uses google-beta provider correctly
- [ ] All variables documented
- [ ] Code passes validation
- [ ] PR merged to main branch
Conclusion
Implementing a Terraform module for Cloud Run Worker Pools is a significant step towards building a scalable and cost-effective CI/CD pipeline. By following the guidelines and best practices outlined in this article, you can create a robust solution that meets your organization's needs. The module's modular design and comprehensive documentation make it easy to manage and maintain, ensuring long-term value. For further reading on Terraform and Cloud Run, consider exploring the official Terraform documentation and Google Cloud Run documentation.