CI/CD Pipeline Errors
CI failures fall into three categories: the test or build is genuinely broken (rare), the runner environment is wrong (common), or a flaky external dependency timed out (most common). Identifying the category quickly keeps your pipeline green. The ten errors below cover GitLab CI, GitHub Actions, Jenkins, and the runner side of all three.
#131 Runner offline / job stuck pending
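Solution: On a GitLab Runner host, check the service with systemctl status gitlab-runner and read the recent logs with journalctl -u gitlab-runner -n 50; verify network connectivity from the runner to the GitLab/GitHub server; check whether the runner registration token has rotated.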
#132 secret env var not injected
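Solution: Verify the secret is defined in CI settings at the correct scope (project, group, or org); confirm the name referenced in the CI YAML matches exactly; check whether the secret is restricted to protected branches or to a specific environment. To debug without leaking the value, print only the first 3 characters with ${VAR:0:3} (a bash substring expansion, not available in plain sh).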
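If the job runs a script rather than bash, the same "show a prefix only" check can be sketched in Python. This is a minimal illustration, not a provider feature: the function name secret_preview and the one-third cap are assumptions made for this example.

```python
import os


def secret_preview(name: str, visible: int = 3) -> str:
    """Describe an environment variable without printing its full value."""
    value = os.environ.get(name)
    if value is None:
        return f"{name}: not set"
    if value == "":
        return f"{name}: set but empty"
    # Never reveal more than a third of the value, even for short secrets.
    shown = min(visible, len(value) // 3)
    return f"{name}: {value[:shown]}*** (length {len(value)})"
```

Calling print(secret_preview("DEPLOY_TOKEN")) in a job step (DEPLOY_TOKEN is a hypothetical name) separates the three cases that matter: the variable is missing, it is present but empty, or it is injected with the expected prefix and length.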
#133 docker login failure in pipeline
Solution: Most CI providers offer ephemeral registry tokens. On GitLab, use $CI_REGISTRY_USER / $CI_REGISTRY_PASSWORD; on GitHub Actions, use secrets.GITHUB_TOKEN. Avoid hard-coding credentials.
#134 flaky test (passes locally, fails in CI)
Common causes: time-dependent assertions, race conditions exposed by slower CI hardware, missing test isolation. Solution: Reproduce the failure locally in a container; add explicit waits for network/DB readiness instead of fixed sleeps; isolate state between tests.
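An explicit readiness wait can be as small as polling a TCP port until it accepts connections. The sketch below uses only the standard library; wait_for_port and its default timings are illustrative choices, and a successful TCP connect only shows that something is listening, not that the database behind it is ready to serve queries.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0,
                  interval: float = 0.5) -> bool:
    """Poll host:port until a TCP connection succeeds or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A completed connect means a process is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Refused, unreachable, or timed out: retry until the deadline.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

A test fixture could call wait_for_port("db", 5432) before the first query and fail fast with a clear message when it returns False ("db" and 5432 are placeholder values).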
#135 build cache miss (slow build)
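Solution: The cache key in the CI YAML must be identical between runs (it is often derived from a hash of requirements.txt); check whether the cache exceeds your CI provider’s size limit; for Docker, layer ordering matters: put rarely changing steps (dependency installation) before frequently changing ones (copying source code).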
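Most CI systems can compute a file-hash key natively, but the idea is easy to see in a few lines. The following sketch derives a key from lockfile contents only, so it changes exactly when a dependency pin changes; the cache_key helper and the 16-character truncation are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def cache_key(prefix: str, *lockfiles: str) -> str:
    """Build a deterministic cache key from the contents of the given lockfiles."""
    digest = hashlib.sha256()
    # Sort the paths so the key does not depend on argument order.
    for path in sorted(lockfiles):
        digest.update(Path(path).read_bytes())
    return f"{prefix}-{digest.hexdigest()[:16]}"
```

For example, cache_key("pip", "requirements.txt") stays the same across runs until requirements.txt is edited. Timestamps, branch names, and job IDs are deliberately left out, since they would cause a miss on every run.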
#136 disk full on runner
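Solution: Run docker system prune -af to remove all unused containers, networks, images, and build cache (with -a this includes images not used by any container, so later builds must pull them again); remove old workspaces in /var/lib/gitlab-runner/builds; configure cleanup at the runner level (runner concurrency, build duration limits).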
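To catch a filling disk before jobs start failing, a scheduled check on the runner host can compare free space against a threshold. This is a minimal sketch using the standard library; the 15% threshold and the function names are arbitrary assumptions, not recommended values.

```python
import shutil


def disk_free_fraction(path: str = "/") -> float:
    """Return the fraction of the filesystem containing `path` that is free."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total


def needs_cleanup(path: str = "/", min_free: float = 0.15) -> bool:
    """True when free space on the filesystem drops below the threshold."""
    return disk_free_fraction(path) < min_free
```

A cron job or systemd timer could call needs_cleanup("/var/lib/docker") and trigger the prune step only when it returns True, which avoids discarding warm image caches on every run.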
#137 job timeout (10/30/60 min hit)
Solution: Raise timeout in CI config; or fix the actual slowness (parallelize tests, smaller images, better caching).
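Parallelizing tests usually means splitting the suite across several jobs. One simple way to do that, sketched here as an assumption rather than any provider’s built-in mechanism, is to hash each test ID into a shard so that every job selects a stable, non-overlapping subset without coordination. The names shard_for and select_tests are hypothetical.

```python
import hashlib
from typing import Iterable, List


def shard_for(test_id: str, total_shards: int) -> int:
    """Map a test ID to a stable shard number in [0, total_shards)."""
    digest = hashlib.sha256(test_id.encode()).digest()
    return int.from_bytes(digest[:8], "big") % total_shards


def select_tests(test_ids: Iterable[str], shard_index: int,
                 total_shards: int) -> List[str]:
    """Return the tests that belong to the given zero-based shard."""
    return [t for t in test_ids if shard_for(t, total_shards) == shard_index]
```

On GitLab, a job using the parallel keyword receives CI_NODE_INDEX (starting at 1) and CI_NODE_TOTAL, so the shard index would be CI_NODE_INDEX minus 1. Hash-based sharding balances test counts, not durations; if a few tests dominate the runtime, splitting by recorded timings works better.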
#138 deployment failed: connection to target refused
Solution: The CI runner can’t reach the production target; check VPN and firewall rules; for Kubernetes, verify the kubeconfig is not stale (for example, after a cluster certificate rotation).
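Before digging into VPN or firewall configuration, it helps to know which kind of failure the runner actually sees. The sketch below classifies a TCP connection attempt using the standard library; probe and its return labels are made up for this example.

```python
import socket


def probe(host: str, port: int, timeout: float = 3.0) -> str:
    """Classify the result of a single TCP connection attempt."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host answered, but nothing accepted the connection.
        return "refused"
    except socket.timeout:
        # No answer at all within the timeout.
        return "timeout"
    except socket.gaierror:
        # The hostname could not be resolved.
        return "dns-failure"
    except OSError as exc:
        return f"error: {exc}"
```

As a rough guide, "refused" usually means the host is reachable but the service is down or listening on another port, "timeout" often points at a firewall silently dropping packets or a missing VPN route, and "dns-failure" suggests the runner is using a resolver that does not know the internal name.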
#139 git pull failure (auth) in pipeline
Solution: The token has expired or its scope is insufficient; for GitHub Actions, make sure the workflow grants permissions: contents: read; for self-hosted runners, verify that an SSH deploy key is registered.
#140 Ansible playbook fails: Host key verification
Solution: The first SSH connection to a host needs a known_hosts entry. For ephemeral test environments, set host_key_checking = False in the [defaults] section of ansible.cfg (don’t do this in prod), or pre-populate ~/.ssh/known_hosts.
Conclusion
- Reproduce CI failures locally in the same container image. 80% of “flaky” tests are environment differences.
- Use ephemeral CI tokens from the provider (CI_REGISTRY_PASSWORD, GITHUB_TOKEN), not hard-coded creds.
- Cache aggressively but pin keys to deterministic content (lockfile hashes).
- Don’t set timeouts higher than your SLA; if the build is too slow, fix the build.
- Self-hosted runners need monitoring like any other server: disk, CPU, runner-process health.