--- layout: handbook-page-toc title: "Chef Automation" --- ## On this page {:.no_toc .hidden-md .hidden-lg} - TOC {:toc .hidden-md .hidden-lg} Issue: [`infra/5078`](https://gitlab.com/gitlab-com/gl-infra/infrastructure/issues/5078) ## Idea/Problem Statement 1. Chef-related workflows are mostly repetitive, notably, updating a role, environment, or a cookbook involves running a set of repetitive commands on an SRE workstation 1. Not all users have access to update live Chef changes, which makes them ask an SRE to do it for them Once the Chef change is approved and merged into master, it should be assumed that applying such change to the Chef server is a safe operation, provided that it is not a production change. In this light, a CI/CD pipeline should be applying this change. ## Design ### Uploading cookbook changes A CI stage (called `publish`) would clone the publisher script repository, copy the script to the cookbook repo and run it. Since we have a lot of cookbook repositories, we need to keep the actual publishing script independent from the cookbooks so that we don’t need to update all cookbooks when a change to the publishing script is made. The publish stage only runs for a `master` branch and when the credentials required for uploading cookbooks are present as environment variables. The publishing script itself does the following: 1. It evaluates the `metadata.rb` files from before and after the merge to decide if the version has been changed 1. Assuming a change in version, it sets up all the credentials needed for a successful Berkshelf run, which we use to manage cookbooks versions and dependencies 1. It installs some required packages (e.g. rubygems, berkshelf, …) 1. It uploads the new cookbook 1. It creates an MR that includes the changes made for `Berksfile.lock` and mentions the user who initiated the merge We have 66 cookbook repositories that need updating to include the new `publish` stage, assuming it has a `.gitlab-ci.yml` file (some are very old). A custom Ruby script would clone all repositories in turn, parse `.gitlab-ci.yml` if found, add a static YAML stanza that would do the steps described in the first paragraph, dump the file, then push to branch. This would speed up the updating process but it would mean losing stuff like YAML comments and the order of some keys, but those we can live with. ### Uploading roles/environment changes A CI stage (called `apply`) would include two jobs, one for applying all changes that are not production-related, and another for the production ones. A distinction between production and non-production changes is made based on the file name prefixes. The production job is set to be executed manually, to avoid any surprises before making sure that the changes are working properly on staging. The staging job, in turn, is going to show what actions are to be executed when the production job is triggered, again, to avoid any surprises. ### Implementation Considerations #### Testing ##### Uploading cookbook changes The CI pipeline was tested on a cookbook (gitlab-ceph) that we don’t use anymore in production or staging, so no fear from pushing a new cookbook version to Chef. ##### Uploading roles/environment changes The CI pipeline was tested by updating the `description` property on staging- and production-related roles and environments, such change should have no effect whatsoever on the fleet. #### GitLab.com and Self-managed To our knowledge, no on-prem installation is using Chef for configuration management. ### Operational Considerations #### Automation Such change is not expected to have metrics exported. A failure in the CI pipeline should be enough as way of monitoring. #### Monitoring See Automation above.