CI/CD & DevOps
Prerequisite:
Overview
Shipping software used to mean infrequent, high-stakes releases - large batches of changes deployed after weeks of manual testing, often on a Friday evening. When something broke, it was hard to isolate the cause among hundreds of changes. CI/CD flips this model: small changes ship continuously, every change is tested automatically, and the system fails fast when something goes wrong.
Continuous Integration (CI) means every code change is automatically built and tested as soon as it is pushed. Continuous Delivery (CD) means every change that passes CI is automatically deployable - and often automatically deployed - to production. Together, they form the backbone of modern software delivery.
Why CI/CD Matters
The core insight is that risk scales with batch size. Deploying one change is far safer than deploying a hundred at once. When a single change breaks production, you know exactly where to look. CI/CD creates a culture of small, frequent, safe releases.
The practical benefits compound: bugs caught in CI cost a fraction of bugs caught in production. Automated pipelines eliminate the manual toil of running test suites and building artifacts by hand. Developers get fast feedback - often within minutes - rather than discovering failures days later in code review.
CI Pipeline Stages
A well-designed CI pipeline has ordered, fast-failing stages. Each stage runs only if the previous one passes, and the fastest checks run first to give developers feedback quickly:
- Lint and static analysis: catch syntax errors, style violations, and obvious type errors in seconds - no test infrastructure needed
- Unit tests: test individual functions and classes in isolation using mocks; should complete in under two minutes
- Build: compile code, build the Docker image, produce the artifact that will be deployed
- Integration tests: test the application against real dependencies (databases, queues) running in containers
- Security scan: scan the built image for known CVEs; run dependency audits (pip-audit, npm audit)
Keeping the full pipeline under ten minutes is a practical target - longer pipelines discourage frequent commits.
GitHub Actions
GitHub Actions defines pipelines as YAML workflow files in .github/workflows/. A workflow has triggers (e.g., push to main, pull request opened), jobs (isolated execution units), and steps (individual shell commands or reusable actions).
Jobs run on runners - GitHub-hosted VMs or self-hosted machines. Jobs run in parallel by default; you add needs: to declare sequential dependencies. Each job starts in a fresh environment, so artifacts (built Docker images, compiled binaries) must be explicitly passed between jobs using caching or artifact upload/download.
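A minimal workflow showing this structure - triggers, two jobs, and a needs: dependency. Job names and commands are illustrative:

```yaml
# .github/workflows/ci.yml - a minimal sketch; job names and commands are illustrative
name: ci
on:
  push:
    branches: [main]
  pull_request:

jobs:
  lint:
    runs-on: ubuntu-latest        # GitHub-hosted runner; fresh VM per job
    steps:
      - uses: actions/checkout@v4
      - run: ruff check .

  test:
    needs: lint                   # runs only after lint succeeds
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: pytest
```

Without the needs: line, lint and test would run concurrently on separate runners.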
Key practices for effective GitHub Actions workflows:
- Test in the same environment as production: build and run tests inside Docker to eliminate “works in CI, fails in prod” failures
- Fail fast: put lint before tests; put fast tests before slow ones; exit immediately on the first failure
- Parallelise slow tests: split test suites across multiple runner instances with test sharding
- Cache dependencies: cache pip or npm packages keyed by the lockfile hash; a cache hit can save two minutes per run
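The caching practice above can be sketched with the actions/cache action; the cache path and lockfile name are assumptions for a pip-based project:

```yaml
# Illustrative cache step: the key is derived from the lockfile hash,
# so the cache is invalidated only when dependencies actually change.
- uses: actions/cache@v4
  with:
    path: ~/.cache/pip
    key: ${{ runner.os }}-pip-${{ hashFiles('requirements.lock') }}
    restore-keys: |
      ${{ runner.os }}-pip-
```

On a cache hit, pip install resolves from the local cache instead of downloading every package; restore-keys allows a partial match when the lockfile has changed.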
Continuous Delivery and Deployment Strategies
Once CI passes, CD takes over. The key distinction is between continuous delivery, where every change is deployable but a human approves the promotion, and continuous deployment, where every passing change ships automatically. In either model, staging validates the change in a production-like environment before it reaches users.
Rolling Updates
Replace running instances of the old version one at a time with the new version. At any point, both versions serve traffic. This is the default in Kubernetes. It is simple and requires no extra infrastructure, but both versions must be compatible - including database schemas - during the transition window.
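Kubernetes exposes this behaviour through the Deployment's strategy stanza; a sketch of a conservative rolling update:

```yaml
# Deployment strategy stanza (Kubernetes) - replica count is illustrative
spec:
  replicas: 4
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1          # at most one extra pod above desired count during rollout
      maxUnavailable: 0    # never drop below desired serving capacity
```

With these settings the rollout replaces pods one at a time, so both versions serve traffic until the last old pod is gone - which is why schema compatibility during the window matters.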
Blue-Green Deployments
Maintain two identical production environments: blue (live) and green (idle). Deploy the new version to green, run smoke tests, then atomically switch the load balancer to route all traffic from blue to green. Rollback is instant - switch back to blue. The cost is maintaining double the infrastructure.
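In Kubernetes, the atomic switch can be a one-line change to the Service selector; the names and labels here are hypothetical:

```yaml
# The Service routes all traffic to whichever colour its selector names.
# Editing "version: blue" to "version: green" is the atomic cutover;
# editing it back is the instant rollback.
apiVersion: v1
kind: Service
metadata:
  name: myapp
spec:
  selector:
    app: myapp
    version: blue    # change to "green" after smoke tests pass
  ports:
    - port: 80
      targetPort: 8080
```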
Canary Deployments
Route a small percentage of traffic (e.g., 1% or 5%) to the new version while the rest continues to the stable version. Monitor error rates and latency on the canary. If metrics look good, gradually increase the percentage; if they degrade, roll back. Canaries are the safest strategy for changes with uncertain user impact.
Feature Flags
Feature flags decouple deployment from release. You deploy code to production with the new feature disabled behind a flag. You then enable the flag for internal users first, then a percentage of users, then everyone. If something goes wrong, you disable the flag without a code deployment. This makes the deployment itself a non-event and separates it from the business decision of when to launch.
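A minimal sketch of percentage-based flag evaluation, assuming an in-process store. The class, flag names, and hashing scheme are hypothetical illustrations; production systems typically use a flag service (e.g. LaunchDarkly, Unleash) rather than hand-rolled code:

```python
import hashlib

class FeatureFlags:
    """Hypothetical in-process flag store with percentage rollouts."""

    def __init__(self):
        self._rollouts = {}  # flag name -> percentage of users enabled (0-100)

    def set_rollout(self, flag, percent):
        self._rollouts[flag] = percent

    def is_enabled(self, flag, user_id):
        percent = self._rollouts.get(flag, 0)  # unknown flags default to off
        if percent >= 100:
            return True
        # Hash the (flag, user) pair so each user lands in a stable 0-99 bucket:
        # the same user sees the same decision across requests.
        digest = hashlib.sha256(f"{flag}:{user_id}".encode()).digest()
        bucket = int.from_bytes(digest[:8], "big") % 100
        return bucket < percent

flags = FeatureFlags()
flags.set_rollout("new-checkout", 0)     # deployed to production, but dark
assert not flags.is_enabled("new-checkout", "alice")
flags.set_rollout("new-checkout", 100)   # launched - no redeployment needed
assert flags.is_enabled("new-checkout", "alice")
```

The key property is the stable hash bucket: raising the percentage only ever adds users to the enabled set, so a user never flips back and forth as the rollout progresses.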
Trunk-Based Development
CI/CD works best with trunk-based development: all developers commit to a single main branch with short-lived feature branches (merged within a day or two). Long-lived branches accumulate drift and create painful merges that slow down the pipeline. Feature flags let you merge incomplete work into main without exposing it to users.
Secrets in CI
CI pipelines often need credentials - API keys, registry passwords, cloud provider tokens. Never hard-code secrets in workflow files or commit them to the repository.
- GitHub Secrets: environment variables encrypted at rest, injected into the runner environment; visible to authorized workflow runs but not in logs
- HashiCorp Vault: a dedicated secrets manager with dynamic credentials, audit logs, and fine-grained access control; accessed from CI via short-lived tokens
- OIDC identity: GitHub Actions can authenticate to cloud providers (AWS, GCP) using workload identity federation - no long-lived credentials stored in secrets at all
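The OIDC approach can be sketched for AWS with the official credentials action; the role ARN and region are placeholder assumptions:

```yaml
# Illustrative job authenticating to AWS via OIDC - no long-lived keys stored
jobs:
  deploy:
    runs-on: ubuntu-latest
    permissions:
      id-token: write      # allow the job to request an OIDC token from GitHub
      contents: read
    steps:
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789012:role/ci-deploy  # hypothetical role
          aws-region: eu-west-1
```

AWS verifies the token against the configured identity provider and issues short-lived credentials scoped to that role, so there is nothing to rotate or leak.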
Examples
GitHub Actions workflow for a Python app. A workflow triggered on push and pull requests: a lint job runs ruff check and mypy; a test job depends on lint, starts a PostgreSQL service container, builds the app image, and runs pytest inside the container with the test database URL injected via environment variable; a build-and-push job (on push to main only) builds the final Docker image and pushes it to GitHub Container Registry.
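A sketch of such a workflow. Image names, the Python and PostgreSQL versions, and the assumption that the app image accepts pytest as a command override are all illustrative:

```yaml
name: ci
on:
  push:
    branches: [main]
  pull_request:

jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - run: pip install ruff mypy
      - run: ruff check . && mypy .

  test:
    needs: lint
    runs-on: ubuntu-latest
    services:
      postgres:                       # real database for integration tests
        image: postgres:16
        env:
          POSTGRES_PASSWORD: test
        ports: ["5432:5432"]
    steps:
      - uses: actions/checkout@v4
      - run: docker build -t myapp:test .
      - run: >
          docker run --rm --network host
          -e DATABASE_URL=postgresql://postgres:test@localhost:5432/postgres
          myapp:test pytest

  build-and-push:
    needs: test
    if: github.ref == 'refs/heads/main'   # push the image only from main
    runs-on: ubuntu-latest
    permissions:
      packages: write
    steps:
      - uses: actions/checkout@v4
      - run: echo "${{ secrets.GITHUB_TOKEN }}" | docker login ghcr.io -u ${{ github.actor }} --password-stdin
      - run: |
          docker build -t ghcr.io/org/myapp:${GITHUB_SHA} .
          docker push ghcr.io/org/myapp:${GITHUB_SHA}
```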
Dockerfile in CI. Build the image in CI using docker buildx build with --cache-from type=registry,ref=<cache image> to pull the layer cache from the registry, avoiding rebuilding unchanged layers on every run. Tag the image with the Git commit SHA for traceability - docker tag myapp:latest ghcr.io/org/myapp:${GITHUB_SHA}.
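A sketch of the full build command, assuming a dedicated buildcache tag in the registry (the image names are illustrative):

```shell
# Build with the registry as a remote layer cache; unchanged layers are
# pulled rather than rebuilt. mode=max also caches intermediate stages.
docker buildx build \
  --cache-from type=registry,ref=ghcr.io/org/myapp:buildcache \
  --cache-to type=registry,ref=ghcr.io/org/myapp:buildcache,mode=max \
  -t ghcr.io/org/myapp:${GITHUB_SHA} \
  --push .
```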
Canary deployment with traffic splitting. In Kubernetes, deploy the canary as a separate Deployment with a small replicas count sharing the same Service selector labels as the stable Deployment. Traffic splits proportionally to replica count. Use a tool like Argo Rollouts or Flagger to automate the progressive promotion and automatic rollback based on Prometheus metrics.
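A sketch of the two-Deployment setup - names, labels, and the 9:1 split are illustrative. Both Deployments carry the label app: myapp, so a Service selecting only on that label spreads traffic across all ten pods:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-stable
spec:
  replicas: 9                     # ~90% of traffic
  selector:
    matchLabels: {app: myapp, track: stable}
  template:
    metadata:
      labels: {app: myapp, track: stable}
    spec:
      containers:
        - name: myapp
          image: ghcr.io/org/myapp:v1
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-canary
spec:
  replicas: 1                     # ~10% of traffic goes to the new version
  selector:
    matchLabels: {app: myapp, track: canary}
  template:
    metadata:
      labels: {app: myapp, track: canary}
    spec:
      containers:
        - name: myapp
          image: ghcr.io/org/myapp:v2
```

Promotion means scaling the canary up and the stable Deployment down; rollback means scaling the canary to zero - which is exactly the loop Argo Rollouts or Flagger automates against live metrics.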
Read Next: