---
layout: handbook-page-toc
title: "Engineering"
---

## Communication

GitLab Engineering values clear, concise, transparent, asynchronous, and frequent communication. Here are our most important modes of communication:

- [**GitLab Issue Tracker**](https://gitlab.com/gitlab-org/gitlab/issues): Please use confidential issues for topics that should only be visible to team members at GitLab.
- [**Everything starts with a Merge Request**](/handbook/communication/#everything-starts-with-a-merge-request): The most effective way to make a change to the company is to make a proposal in the form of a merge request to the handbook and assign it to the [DRI](/handbook/people-group/directly-responsible-individuals/).
- [**Engineering Management Issue Board**](https://gitlab.com/gitlab-com/www-gitlab-com/boards/980804): For observability into what [Eric Johnson](https://gitlab.com/edjdev) and his management team are working on.
- [**Engineering Management Staff Meeting**](https://docs.google.com/document/d/1-5yET9Bpuq5OqOWGN88-f2C-tlqbvjoSrCQlVol4dbU/edit): [Eric Johnson](https://gitlab.com/edjdev) and his direct reports conduct a staff meeting every Wednesday at 9am PST. Anyone at GitLab is welcome to attend and contribute to the agenda.
- [**Week-in-Review document**](https://docs.google.com/document/d/1EkfzI85aqw8chYDBf2GLRvjKEa3s0FWHMI3u0DIr-xg/edit): Every week a reminder is sent to the [#eng-week-in-review](https://gitlab.slack.com/messages/CJWA4E9UG) Slack channel to read the latest update.
- **VPE Office Hours**: Each week [Eric Johnson](https://gitlab.com/edjdev) holds open office hours on Zoom for questions, feedback, and handbook changes. It's typically Thursdays for 1 hour and alternates between EMEA and APAC-friendly timeslots. See Eric's calendar for current times.
- **Slack**: Here are some common Engineering-centric channels
  - [#evpe](https://gitlab.slack.com/messages/C9X79MNJ3)
  - [#development](https://gitlab.slack.com/messages/C02PF508L)
  - [#production](https://gitlab.slack.com/messages/C101F3796)

### Keeping yourself informed

As part of a fully distributed organization such as GitLab, it is important to stay informed about engineering led initiatives. We employ [multimodal communication](/handbook/communication/#multimodal-communication), which describes the minimum set of communication channels we'll broadcast to.

For the Engineering department, any important initiative will be announced in:

* The Engineering mailing list
  * All members of the department should become members as part of the onboarding process. If this is not the case for you, reach out to your manager.
* Slack
  * `#eng-week-in-review`
    * The channel membership is mandatory
    * Week-in-Review document updates are announced in this channel
  * `#development`
  * `#production`
  * `#security`
  * `#support-managers`
  * `#quality`
  * `#ux`
  * `#whats-happening-at-gitlab`

If you frequently check any of these channels, you can consider yourself informed. It is up to the person sharing to ensure that the same message is shared across all channels. Ideally, this message should be a one sentence summary with a link to an issue to allow for a single source of truth for any feedback.
## On this page
{:.no_toc .hidden-md .hidden-lg}

- TOC
{:toc .hidden-md .hidden-lg}

## Other Related Pages

- [Engineering Management Issue Board](https://gitlab.com/gitlab-com/www-gitlab-com/boards/980804)
- [Engineering Compensation Roadmaps](/handbook/engineering/compensation-roadmaps/)
- [Engineering Hiring](/handbook/hiring/charts/)
- [Developer onboarding](/handbook/developer-onboarding/)
- [Engineering Career Development](/handbook/engineering/career-development/)
- [Engineering Internship](/handbook/engineering/internships/)
- [Engineering Management](/handbook/engineering/management/)
- [Engineering Workflow](/handbook/engineering/workflow/)
- [Code Review](/handbook/engineering/workflow/code-review/)
- [Frequently Used Projects](/handbook/engineering/projects/)
- [Issue Triage Policies](/handbook/engineering/quality/issue-triage/)
- [Root-Cause-Analysis](/handbook/engineering/root-cause-analysis/)
- [Critical Security Releases](https://gitlab.com/gitlab-org/release/docs/blob/master/general/security/process.md#critical-security-releases)
- [Emergency Meeting Protocol](/handbook/engineering/emergency-meeting-protocol/)
- [Performance of GitLab](/handbook/engineering/performance/)
- [Monitoring of GitLab.com](/handbook/engineering/monitoring/)
- [Production Readiness Guide](https://gitlab.com/gitlab-com/infrastructure/blob/master/.gitlab/issue_templates/production_readiness.md)
- [Contributing to Go projects](https://docs.gitlab.com/ee/development/go_guide/index.html)
- [Pajamas Design System](/handbook/engineering/ux/pajamas-design-system/)
- [Engineering READMEs](/handbook/engineering/readmes/)
- [Database Engineering](/handbook/engineering/development/database/)

## Prioritizing technical decisions

Please see the [Product Management section](/handbook/product/product-management/process/#prioritization) that governs how they prioritize work; it should also guide our technical decision making.

<%= partial "includes/master-prioritization-list.md" %>

Despite the high priority of velocity to our project and our company, there is one set of things we must prioritize over it: GitLab availability & security. Neither we, nor our customers, can run an Enterprise-grade service if we are willing to risk users' productivity and data.

Our hundreds of Engineers collectively make thousands of independent decisions each day that can impact GitLab.com and our users and customers there. They all need to keep availability and security in mind as we endeavor to be the most productive engineering organization in the world. We can only move as fast as GitLab.com is available and secured. Availability of self-managed GitLab instances is also extremely important to our success, and this needs to happen in partnership with our customers' admins (whereas we are the admins for GitLab.com).

For security, we prioritize it more highly by having strict SLAs around priority labels on [security issues](/handbook/engineering/security/#severity-and-priority-labels-on-security-issues). This shows a security-first mindset, as these issues take precedence in a given timeframe.

### The Importance of Velocity

* The rate at which GitLab delivers new value to users in the form of features is a competitive advantage for the project and the company.
* As an open source project, people are welcome to fork us. However, in order to ensure that the community remains intact and the bulk of energy is directed toward one version of GitLab, it is important to move fast so that any fork is quickly out of date.
* Companies tend to slow down as they grow. It takes deliberate effort to prevent this, so it must always be top of mind.
* Once you slow down, it is incredibly painful to speed back up again.

### Incremental Velocity and Measurement

Our velocity should be incremental in nature. It's derived from our [MVC](https://about.gitlab.com/handbook/product/#the-minimally-viable-change-mvc), which encourages "delivering the smallest possible solution that offers value to our users". This could be a small new feature, but also includes code improvements, fixing bugs, etc.

To measure this, we count and define the target here: [MRs per Development Engineer](https://about.gitlab.com/handbook/engineering/development/performance-indicators/#average-mrs-development-engineers-month), which is a goal for managers and not ICs. Historically, we have seen this as high as 14-19 MRs per Product Development Engineer per Month. Ten MRs per month per Product Development Engineer translates to roughly an MR every 1 1/2 business days with time for overhead (see the sketch after the list below).

To attain this, Product Development Engineers are encouraged to:

* Fix small problems they see in the code base, without an issue, as incremental improvements. Small here translates to 1/2 day or less.
* For feature issues, break the issue into several smaller MRs that are delivered incrementally. Small here translates to less than two days.
* Help dogfood a GitLab feature by using it to fix an issue identified within the code base. As examples, fix a Code Climate issue for one file, or potential errors found by the SAST scanner.
* Raise concerns for issues where our incremental philosophy does not work because the issue cannot be broken down further.
* Raise concerns for issues that Product Development Engineers do not feel fit the [MVC](https://about.gitlab.com/handbook/product/#the-minimally-viable-change-mvc) definition.
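The arithmetic behind the cadence above can be made explicit. This is a minimal sketch only; the number of business days per month and the overhead share are illustrative assumptions, not handbook figures.

```python
# Rough arithmetic behind "roughly an MR every 1 1/2 business days with time for overhead".
# BUSINESS_DAYS_PER_MONTH and OVERHEAD_SHARE are illustrative assumptions.

BUSINESS_DAYS_PER_MONTH = 21   # assumption: ~21 working days in a typical month
TARGET_MRS_PER_MONTH = 10      # the ten-MRs-per-month figure referenced above
OVERHEAD_SHARE = 0.3           # assumption: ~30% of time spent on meetings, reviews, etc.

days_per_mr = BUSINESS_DAYS_PER_MONTH / TARGET_MRS_PER_MONTH   # ~2.1 business days in total
focused_days_per_mr = days_per_mr * (1 - OVERHEAD_SHARE)       # ~1.5 days for the MR itself

print(f"~{days_per_mr:.1f} business days per MR, "
      f"~{focused_days_per_mr:.1f} of them available for the MR work itself")
```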
### Velocity over predictability

We optimize for shipping a high volume of user/customer value with each release. We do want to ship multiple major features in every monthly release of GitLab. However, we do not strive for predictability over velocity. As such, we eschew heavyweight processes like detailed story point estimation by the whole team in favor of lightweight measurements of throughput, like the number of merge requests that were included, or rough estimates by single team members.

There is variance in how much time an issue will take versus what you estimated. This variance causes unpredictability. If you want close to 100% predictability you have to take two measures:

1. Invest more time in estimation to reduce that variance. The time spent estimating things could otherwise be used to create features.
1. Leave a reserve of time with unscheduled work so you can accommodate the variance. According to [Parkinson's law](https://en.wikipedia.org/wiki/Parkinson%27s_law), the work expands so as to fill the time available for its completion. This means that we're not adhering to our [iteration value](/handbook/values/#iteration) and that for the next cycle our estimates for comparable features will be larger.

Both measures reduce the overall velocity of shipping features. The way to prevent this is to accept that we don't want perfect predictability. Just like with our [OKRs](/company/okrs/), which are so ambitious that we expect to reach about 70% of the goal, this is also fine for shipping [planned features](/handbook/product/#how-this-impacts-planning).

_Note:_ This does not mean we place zero value on predictability. We just optimize for velocity first.

## Balance refactoring and velocity

When changing an outdated part of our code (e.g. HAML views, jQuery modules), use discretion on whether to refactor or not. For long-term maintainability, we are very interested in migrating old code to the consistent and preferred approach (e.g. Vue, GraphQL), but we're also interested in continuously shipping features that our users will love.

Aim to implement new modules or features with the preferred approach, but changing preexisting non-conforming parts is a gray area. If the weight of refactoring and other constraints (such as time) risk threatening the availability of a feature, then strongly consider refactoring at another time. On the other hand, if the code in question has hurt availability or poses a threat to it, then strongly consider prioritizing refactoring. This is a balancing act, and if you're not sure where your change should go (or whether you should do some refactoring beforehand), reach out to another Engineer or Maintainer.

If it makes sense to refactor before implementing a new feature or a change, then please:

- Create separate merge requests for the refactoring and change. This aids maintainability and code review.
- Notify your engineering manager and relevant stakeholders (preferably in an issue comment) of the relevant scope increase and rationale.

If it is decided **not** to refactor at this moment, then please:

- Make sure a descriptive "technical debt" issue exists for this refactoring.
- Notify your engineering manager so that the refactoring issue can be weighted and scheduled.

## Folding@home and COVID-19

Team members are welcome to run [Folding@home](https://foldingathome.org/) on their company provided computers. Folding@home is a distributed computing network that is [searching](https://foldingathome.org/2020/03/10/covid19-update/) for therapies for the [COVID-19](https://www.cdc.gov/coronavirus/2019-ncov/index.html) respiratory illness, among other diseases. We recommend running it at night if you have high daily compute workloads. Also keep your computer plugged in. We considered potential security and hardware implications in [this issue](https://gitlab.com/gitlab-com/www-gitlab-com/-/issues/6875).

If you would like to join a team with other GitLab team members, there is a `GitLab Team Members` team for Folding@home. When setting up or changing your Folding@home identity, you can add team `245256`. This is not a competition, but simply a way to track how much our team members have contributed overall. You can view our statistics on our [team page](https://stats.foldingathome.org/team/245256). You can discuss with other GitLab team members in the [#folding-at-home Slack channel](https://gitlab.slack.com/archives/C0109M19SAV/p1584979706000200).

## Hiring Practices

Calendar year 2020 will be a time of slower growth for GitLab Engineering compared to past years. We grew 100% in 2018, and 130% in 2019. We'll grow roughly 20% this year. But this is still fast compared to other companies, which we're grateful for. We can use the expertise and bandwidth we've built in past years to raise our bar even higher. We rely primarily on the judgment of our hiring managers to do this. But we also try to systematize as much as possible so our hiring practices are fair, transparent, and repeatable.
* We require that at least two interviewers of the engineering division give each candidate a [Strong Yes](https://about.gitlab.com/handbook/hiring/interviewing/#engineering-division) (the highest assessment available in our applicant tracking system) on their interview scorecard in order to move to the offer stage.
  * The two-star rule applies to anyone added to the pipeline as of Monday, March 16, 2020. Any candidacy in the interview process prior to this can be held to the previous one-star rule.
* Hiring managers must do a write-up as part of advancing a candidate to offer. This can be found in the Justification section of the interview plan in Greenhouse.
  * In what specific way(s) does this person make the team better?
  * What red flags were raised during the interview process?
  * What are our specific strategies to set this person up for success?
* We require that a simple majority of nice-to-have requirements (usually 5 of 9) are met for the role.
* If a candidate was previously rejected for a role, the hiring manager needs to review the previous interview experience (if our data retention policy allows), discuss with the previous hiring manager, and address why the person now meets our qualifications.

We do not run a single-veto hiring process because this impedes our ability to uplevel our teams. High performers are more likely to have been the product of a controversial hiring process because they challenge the status quo. But that does not mean every controversial hiring process yields a high performer. An important part of a hiring manager's performance is making these determinations.

### Shadowing VPE's interviews

* Interview shadowing should be done for quality and training purposes, with approval from the candidate prior to the scheduled interview time.
* The shadow will need to partner with Eric to determine the most beneficial interview to attend.
* Recruiting or the EBA will request permission from the candidate prior to including the shadow on the invite, to set candidate experience expectations.
* The shadowing GitLabber should be prepped on what (if anything) will be said to the candidate during the interview.
* Immediately following the shadowed interview (or shortly thereafter, schedules permitting), the shadow and Eric will have a 30-minute sync and submit a scorecard for the candidate.

## Engineering Management Issue Board

The VP of Engineering and their direct reports track our highest priorities in the [Engineering Management Issue Board](https://gitlab.com/gitlab-com/www-gitlab-com/boards/980804), rather than in to-do lists, Google Doc action items, or other places. The reasons for this are:

* It's a way to use our own product more (dogfooding)
* It lends itself to our preferred async method of working
* It provides transparency across the company for what senior leaders in Engineering are working on
* It allows for delegation while reducing the need for status check-ins by relying on issue notifications

Here are the mechanics of making this work:

* Use the `Engineering Management` label to get it on the board, and the department label to get it in progress (e.g. `Development Department`)
* Mention the appropriate people in the issue so they become participants and receive notifications
* We can re-prioritize in 1:1s or staff meetings periodically
* Directors can delegate items to anyone in their department
* Link to issues on this board in places where status needs to be tracked, _e.g._ 1:1 docs, staff meeting notes, etc.
* It's okay to link to other issues, boards, epics, etc. in the body of an issue to avoid duplicating content
* Set the issue due date
* An issue per quarterly OKR is expected
* If the product of an issue is _not_ an MR, please assign it back to the stakeholder to verify the output and close it for you.
* If you close out an issue with the `CEO Interest` label, please post it to [#ceo](https://gitlab.slack.com/messages/C3MAZRM8W)

## Engineering OKR process

Here is the [standard, company-wide process for OKRs](/company/okrs/). Engineering has some small deviations from (and extensions to) this process.

### OKR Kickoff

This process should begin no later than two weeks before the end of the preceding quarter, and kickoff should happen on or before the first day of the new quarter.

1. OKR owners should [**author new issues**](https://gitlab.com/gitlab-com/www-gitlab-com/issues/new) in the handbook project using the "Engineering OKR" description template
   * The issue **title** should be `FY20-Q2 Organization Type OKR: Objective phrase => 0%` (a formatting sketch follows the OKR Status section below)
     * Type should be one of "IACV", "Product", or "Team"
     * _e.g._ `FY20-Q3 Engineering Product OKR: Build our product vision => 0%`
   * Update the issue **description**
     * Add your key result phrases to the issue description. Valid key results are:
       * Raising a KPI from one specific value to another
       * Building out a new KPI
       * Failing either of the first two... Completing a high-profile project with specific outcomes
     * _e.g._ `* Raise first reply-time SLA for premium from 92% to 95% => 0%`
   * Add your manager's and your direct reports' handles to the *CC line*
   * *Assign* the issue to yourself
   * Set the **due date** to the last day in the quarter
   * Apply the appropriate **labels** to make sure it appears in your appropriate column of our [management board](https://gitlab.com/gitlab-com/www-gitlab-com/boards/980804)
   * Interlink related OKRs (usually by OKR type) of your manager and direct reports using the **related issues** field
1. Get approval **prior to the first day of the quarter** from your manager
   * For the VPE and their direct reports:
     * Do an MR to that quarter's markdown handbook page
       * `* Department: [Objective phrase](https://placeholder.com/) => 0%` _e.g._ `* Support: Raise first reply-time SLA for premium from 92% to 95% => 0%`
       * Indent department-level OKRs underneath the Engineering Division OKRs
       * One line for each objective
     * Assign the MR to the VPE and address changes asynchronously like a code review
     * Discuss in a 1:1 if needed
   * For everyone else: Ask your manager to do an async review of your issues via Slack or email and address any changes. Alternatively, discuss in a 1:1.
1. Communicate dependencies to other divisions, departments, or teams. Encourage them to take on corollary OKRs.

### OKR Status

* Update the OKR issue **whenever you have additional information**
* For direct reports of the VPE, expect to give an update in **each weekly 1:1** as part of the management issue board review.
* For individuals that do a **monthly key review meeting**, expect to give an OKR update there.
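Both kickoff and retrospection rely on the `=> N%` scoring convention in issue titles and key results. The following is a minimal sketch of that convention; the helper names are hypothetical, and averaging key result scores into an overall score is an assumption rather than a handbook rule.

```python
# Hypothetical helpers illustrating the "Title => N%" OKR convention described above.
# The simple average used for the overall score is an assumption, not a handbook rule.

def okr_title(quarter, org, okr_type, objective, score=0):
    """Format an OKR issue title, e.g. 'FY20-Q3 Engineering Product OKR: ... => 0%'."""
    return f"{quarter} {org} {okr_type} OKR: {objective} => {score}%"

def overall_score(key_result_scores):
    """Aggregate individual key result scores (percentages) into one overall score."""
    return round(sum(key_result_scores) / len(key_result_scores))

# Kickoff: the score starts at 0%.
print(okr_title("FY20-Q3", "Engineering", "Product", "Build our product vision"))
# Retrospection: update the title with the final overall score.
print(okr_title("FY20-Q3", "Engineering", "Product", "Build our product vision",
                overall_score([80, 60, 70])))
```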
### OKR Retrospection

This process should begin on the first day of the subsequent quarter, and complete no later than two weeks after.

1. OKR owners should score their OKRs in the issue
   * Update the overall score in the issue title.
   * Update the individual key result scores in the issue description.
1. OKR owners should retrospect in the issue description.
1. OKR owners should do an MR to that quarter's OKR page with just the final scores after the objective phrase/link (_e.g._ `=> 70%`) and assign it to their direct manager for review.
1. OKR owners should follow the retrospection [guidelines](https://about.gitlab.com/handbook/engineering/workflow/#retrospective) and include what went well, what went wrong, and how to improve.
1. The manager should review each individual issue, ask any questions, and merge
   * The OKR owner should incorporate any manager feedback, as in a code review

## Unlearning Previous Corporate Cultures

In GitLab Engineering we are serious about concepts like [servant leadership](https://en.wikipedia.org/wiki/Servant_leadership), [over-communication](https://www.weforum.org/agenda/2015/03/why-you-need-to-over-communicate/), and furthering our [company value of transparency](/handbook/values/#transparency). You may have joined GitLab from another organization that did not share the same values or techniques. Perhaps you're accustomed to more corporate politics? You may need to go through a period of "unlearning" to be able to take advantage of our results-focused, people-friendly environment. It takes time to develop trust in a new culture. Less common, but even more important, is to make certain you don't unintentionally bring any maladaptive behaviors to GitLab from these other environments.

We encourage you to read the engineering section of the handbook as part of your onboarding, ask questions of your peers and managers, and reflect on how you can help us better live our culture:

* [Why handbook first?](/handbook/handbook-usage/#why-handbook-first)
* [The Engineering Dual Career Track](/handbook/engineering/career-development/#individual-contribution-vs-management)
* Our most challenging core values: [Iteration](/handbook/values/#iteration) and [Transparency](/handbook/values/#transparency)
* Please keep discussions in public Slack channels (avoid direct messages and private channels)
* To calibrate, try making yourself uncomfortable every day for 3 months with how transparent and vulnerable you are being with your manager and peers

## Dogfooding

We [dogfood everything](/handbook/product/product-management/process/#dogfood-everything). Based on our [product principles](/handbook/product#product-principles), it is the Engineering division's responsibility to dogfood features or do the required discovery work to provide feedback to Product. It is Product's responsibility to prioritize improvements or rebuild functionality in GitLab.

### Dogfooding Antipatterns

An easy antipattern to fall into is to resolve your problem outside of what the product offers. Dogfooding is not:

1. Building a bot outside of GitLab.
1. Writing scripts that leverage the GitLab API (if the functionality is on our roadmap and could be shipped within the GitLab Project).
1. Using a component of GitLab that is part of our [components](https://docs.gitlab.com/ee/development/architecture.html#component-diagram) or [managed apps](https://docs.gitlab.com/ee/user/clusters/applications.html).
1. Using templates or repos that are not part of the default UI (having to type or copy-paste to add them).
### Dogfooding Process

Follow the [dogfooding process described in the Product Handbook](/handbook/product/product-management/process/#dogfooding-process) when considering building a tool outside of GitLab.

## GitLab Repositories

GitLab consists of many subprojects. A curated list of GitLab Repositories can be found at the [GitLab Engineering Projects](/handbook/engineering/projects/) page.

When adding a repository, please follow these steps:

1. Ensure that the project is under the [gitlab-org](https://gitlab.com/gitlab-org) namespace for anything related to the application or under the [gitlab-com](https://gitlab.com/gitlab-com) namespace for anything strictly company related.
1. [Add the project to the list of GitLab Repositories](https://gitlab.com/gitlab-com/www-gitlab-com/blob/master/doc/projects.md).
1. Add an MIT license to the repository. It is easiest to simply copy-paste the [MIT License](https://gitlab.com/gitlab-org/gitlab/blob/master/LICENSE) verbatim from the `gitlab` repo.
1. Add a section titled "Developer Certificate of Origin and License" to `CONTRIBUTING.md` in the repository. It is easiest to simply copy-paste the [DCO + License section](https://gitlab.com/gitlab-org/gitlab-ce/blob/master/CONTRIBUTING.md#developer-certificate-of-origin-license) verbatim from the `gitlab` repo.
1. Add any further relevant details to the Contribution Guide. See [Contribution Example](https://gitlab.com/gitlab-org/gitlab/blob/master/CONTRIBUTING.md).
1. Add a link to `CONTRIBUTING.md` from the project's `README.md`.
1. Add a [CODEOWNERS](https://docs.gitlab.com/ee/user/project/code_owners.html) file, to make it easy for contributors to figure out which teams are best suited to review their changes (see the example after this list).
   - Use teams rather than individuals as owners, to make it self-updating over time and resilient to people taking time off.
   - You can scope ownership to subdirectories or individual files, but it should contain at the very least a top-level catch-all for any new or not explicitly mentioned file.
1. If your project contains code that is distributed with GitLab or is executed in production, set up [security jobs](https://gitlab.com/help/user/application_security/security_dashboard/index#gitlab-security-dashboard-ultimate) for your project and add your project to the AppSec team's [triage rotation](https://about.gitlab.com/handbook/engineering/security/index.html#triage-rotation). The AppSec team will triage security findings from the Security Dashboard and create issues for vulnerabilities.
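As an illustration of the CODEOWNERS step above, a file along these lines covers the basics; the group handles and paths are hypothetical placeholders, not real teams.

```
# Hypothetical CODEOWNERS sketch: team handles rather than individuals,
# with a top-level catch-all for any file not explicitly listed below.
*        @example-org/example-maintainers

# Scope ownership to subdirectories where a specific team is best suited to review.
/doc/    @example-org/technical-writing
/app/    @example-org/backend
/spec/   @example-org/backend
```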
When changing the settings in an existing repository, it's important to keep [communication](#communication) in mind. In addition to discussing the change in an issue and announcing it in relevant chat channels (e.g., `#development`), consider announcing the change during the [Company Call](/handbook/communication/#company-call). This is particularly important for changes to the [GitLab](https://gitlab.com/gitlab-org/gitlab) repository.

### New Projects

When creating a new project that may stay small, or could eventually become an open-source project that we maintain, add it first to the `Sandbox` namespace at [gitlab-org/sandbox](https://gitlab.com/gitlab-org/sandbox), following the same steps above. This will ensure that if we ever need to promote a project or share it with a wider audience, it is already in a GitLab namespace.

## Access Requests

GitLab consists of many different types of applications and resources. When you require escalated permissions or privileges to a resource to conduct task(s), or support for creating resource(s) with specific endpoints, please submit an issue to the [Access Requests Issue Tracker](https://gitlab.com/gitlab-com/team-member-epics/access-requests/issues) using the template provided.

Below is a short list of supported technologies:

* For the complete list, please see the issue template(s)
* G-Suite
* Slack
* BambooHR
* 1Password
* Greenhouse
* Other, including Email Distros

## Engineering Departments, Sub-departments & Teams

* [Development Department](/handbook/engineering/development/)
  * [CI/CD Sub-department](/handbook/engineering/development/ci-cd/)
    * [Package Backend Team](/handbook/engineering/development/ci-cd/package/)
    * [Release Backend Team](/handbook/engineering/development/ci-cd/release/)
    * [Verify Backend Team](/handbook/engineering/development/ci-cd/verify/)
    * [Verify & Release Frontend Team](/handbook/engineering/development/ci-cd/fe-verify-release/)
  * [Defend Sub-department](/handbook/engineering/development/defend/)
  * [Dev Sub-department](/handbook/engineering/development/dev/)
    * [Create:Editor Backend Team](/handbook/engineering/development/dev/create-editor-be/)
    * [Create:Knowledge Backend Team](/handbook/engineering/development/dev/create-knowledge-be/)
    * [Create:Source Code Backend Team](/handbook/engineering/development/dev/create-source-code-be/)
    * [Create:Source Code Frontend Team](/handbook/engineering/development/dev/create-source-code-fe/)
    * [Create Frontend Team](/handbook/engineering/development/dev/fe-create/)
    * [Create:Static Site Editor Team](/handbook/engineering/development/dev/create-static-site-editor/)
    * [Gitaly Team](/handbook/engineering/development/dev/gitaly/)
    * [Gitter Team](/handbook/engineering/development/dev/gitter/)
    * [Manage Backend Team](/handbook/engineering/development/dev/manage/)
    * [Manage & Fulfillment Frontend Team](/handbook/engineering/development/dev/fe-manage-fulfillment/)
    * [Plan:Project Management Backend Team](/handbook/engineering/development/dev/plan-project-management-be/)
    * [Plan:Portfolio Management Backend Team](/handbook/engineering/development/dev/plan-portfolio-management-be/)
    * [Plan:Certify Backend Team](/handbook/engineering/development/dev/plan-certify-be/)
    * [Plan Frontend Team](/handbook/engineering/development/dev/fe-plan/)
  * [Enablement Sub-department](/handbook/engineering/development/enablement/)
    * [Database Team](/handbook/engineering/development/enablement/database/)
    * [Distribution Team](/handbook/engineering/development/enablement/distribution/)
    * [Ecosystem Team](/handbook/engineering/development/enablement/distribution/)
    * [Geo Team](/handbook/engineering/development/enablement/geo/)
    * [Memory Team](/handbook/engineering/development/enablement/memory/)
    * [Search Team](/handbook/engineering/development/enablement/search/)
  * [Growth Sub-department](/handbook/engineering/development/growth/)
    * [Acquisition Team](/handbook/engineering/development/growth/acquisition-conversion-be-telemetry/)
    * [Conversion Team](/handbook/engineering/development/growth/acquisition-conversion-be-telemetry/)
    * [Expansion Team](/handbook/engineering/development/growth/expansion/)
    * [Retention Team](/handbook/engineering/development/growth/retention/)
    * [Fulfillment Backend Team](/handbook/engineering/development/growth/fulfillment/)
    * [Fulfillment Frontend Team](/handbook/engineering/development/growth/fe-fulfillment/)
    * [Telemetry Backend Team](/handbook/engineering/development/growth/acquisition-conversion-be-telemetry/)
    * [Telemetry Frontend Team](/handbook/engineering/development/growth/#telemetry-frontend)
  * [Ops Sub-department](/handbook/engineering/development/ops/)
    * [Configure](/handbook/engineering/development/ops/configure/)
    * [Monitor Stage Team](/handbook/engineering/development/ops/monitor/)
    * [Monitor:APM Team](/handbook/engineering/development/ops/monitor/apm/)
    * [Monitor:Health Team](/handbook/engineering/development/ops/monitor/health/)
    * [Serverless Team](/handbook/engineering/development/ops/serverless/)
  * [Secure Sub-department](/handbook/engineering/development/secure/)
    * [Secure Backend Team](/handbook/engineering/development/secure/)
    * [Secure Frontend Team](/handbook/engineering/development/secure/fe-secure/)
* [Infrastructure Department](/handbook/engineering/infrastructure/)
  * [Secure & Defend Reliability Engineering Team](/handbook/engineering/infrastructure/team/reliability/#reliability-engineering-secure--defend)
  * [CI/CD & Enablement Reliability Engineering Team](/handbook/engineering/infrastructure/team/reliability/#reliability-engineering-cicd--enablement)
  * [Dev & Ops Reliability Engineering Team](/handbook/engineering/infrastructure/team/reliability/#reliability-engineering-dev--ops)
  * [Delivery Team](/handbook/engineering/infrastructure/team/delivery/)
* [Quality Department](/handbook/engineering/quality/)
  * [Dev Quality Engineering Team](/handbook/engineering/quality/dev-qe-team/)
  * [Ops & CI/CD Quality Engineering Team](/handbook/engineering/quality/ops-qe-team/)
  * [Secure & Enablement Quality Engineering Team](/handbook/engineering/quality/secure-enablement-qe-team/)
  * [Engineering Productivity Team](/handbook/engineering/quality/engineering-productivity-team/)
* [Security Department](/handbook/engineering/security/)
  * [Vulnerability Management](/handbook/engineering/security/vulnerability_management)
* [Support Department](/handbook/support/)
* [UX Department](/handbook/engineering/ux/)
* [Frontend Department](/handbook/engineering/frontend/)

## Headcount planning

Before the beginning of each fiscal year, and at various checkpoints throughout the year, we plan the size and shape of the Engineering and Product Management functions together to maintain symmetry. The process should take place in a single artifact (usually a spreadsheet, [current spreadsheet][FY2020-headcount-sheet]), and follow these steps:

1. **Product Management:** Supplies headcount numbers for PMs and development groups proportional to our roadmap efforts
1. **Engineering:** Supplies feedback to Product, headcount for management roles in the development department, and full plans for the Security, UX, Quality, and Infrastructure departments
1. **CEO:** Supplies feedback to Engineering and Product, or gives final approval

Note: Support is part of the engineering function but is budgeted as 'cost of sales' instead of research and development. Headcount planning is done separately according to a different model.

[FY2020-headcount-sheet]: https://docs.google.com/spreadsheets/d/1MUR2IhPxS0tQCKYMlJSpC0uA0spEYBPA7V--CzbWy8M

## Long Term Profitability Targets

The non-support departments within Engineering (Development, Infrastructure, Quality, Security, and UX) have an expense target of 20% as a percentage of revenue. The Support target is 10% as a percentage of revenue.

## Starting new teams

Our product offering is growing rapidly. Occasionally we start new teams.
Backend teams should map to our [product categories](/handbook/product/categories/). Backend teams also map 1:1 to [product managers](/handbook/product/). A dedicated team needs certain skills and a minimum size to be successful. But that doesn't block us from taking on new work. This is how we iterate our team size and structure as a feature set grows:

1. **Existing Team:** The existing PM schedules issues for the most appropriate existing engineering team
   * If there is a second PM for this new feature, they work through the first PM to preserve the 1:1 interface
1. **Shared Manager Team:** Dedicated engineer(s) are identified on existing teams and given a specialty
   * The manager must do double-duty
   * Their title can reflect both specialties of their engineers, _e.g._ Engineering Manager, Distribution & Package
   * Even if temporary, managing two teams is a valuable career opportunity for a manager looking to develop director-level skills
   * Each specialty can have its own process, for example: capitalized team label, planning meetings, standups
1. **New Dedicated Team:**
   * Engineering Manager
   * Senior/Staff Engineer
   * Two approved full-time vacancies
   * A dedicated PM

## Team Page Template

```markdown
## Vision

...

## Mission

...

## Team Members

The following people are permanent members of the [Blank] Team:

<%= direct_team(manager_role: 'Engineering Manager, [Blank]') %>

## Stable Counterparts

The following members of other functional teams are our stable counterparts:

<%= stable_counterparts(role_regexp: /[,&] Blank/, direct_manager_role: 'Engineering Manager, [Blank]') %>

## Hiring

This chart shows the progress we're making on hiring. Check out our [jobs page](/jobs/) for current openings.

<%= hiring_chart(department: '[Blank] Team') %>

## Common Links

* Issue Tracker
* Slack Channel
* ...

## How to work with us

...
```

## Fast Boot Events

New teams may benefit from holding a [Fast Boot](/handbook/engineering/fast-boot/) event to help jump-start the team. During a Fast Boot, the entire team gets together in a physical location to bond and work alongside each other.

## Mentorship and Coaching Programs

All levels of leadership at GitLab could benefit from external mentorship and coaching programs. To validate this hypothesis, we are working on a small pilot program for 6 months with [PlatoHQ](https://www.platohq.com/) and [7CTOs](https://7ctos.com/).

### Line Managers and Senior Individual Contributors

The pilot for PlatoHQ has 5 Engineering Managers participating. The program consists of both self-learning via an online portal and 1:1 sessions with a mentor. During the program, participants work on a project together. The goals for the pilot are:

* Rave reviews from participants about their professional development
* A successfully completed project to deliver career development matrices
* A compelling list of ideas implemented by engineering managers within their teams that were sourced from PlatoHQ coaches and content
* A plan to scale out further (if the above criteria are satisfied)

### Senior Leaders in Engineering

The pilot with 7CTOs is run with 3 senior leaders in Engineering. The program consists of peer mentoring sessions (forums) and effective network building.
The goals of the pilot are:

* Rave reviews from participants about their professional development
* A compelling list of ideas implemented by senior leaders that were sourced from the 7CTOs sessions
* A plan to scale out further (if the above criteria are satisfied)

The pilot programs' progression will be evaluated on February 28, 2020, and the final evaluation will be on May 1, 2020. After the evaluation there will be a decision whether to roll this out to all of Engineering.

## Collaboration

To maintain our rapid cadence of shipping a [new release on the 22nd of every month](/blog/2018/11/21/why-gitlab-uses-a-monthly-release-cycle/), we must keep the barrier low to getting things done. Since our team is distributed around the world and therefore working at different times, we need to work in parallel and asynchronously as much as possible. That also means that if you are implementing a new feature, you should feel empowered to work on the entire stack if it is most efficient for you to do so.

Nevertheless, there are features whose implementation requires knowledge that is outside the expertise of the developer or even of the [stage group](/company/team/structure/#stage-groups). For these situations, we'll require the help of an expert in the feature's domain. In order to figure out how to articulate this help, it is first necessary to evaluate the amount of work the feature will require from the expert.

If the feature only requires the expert's help at an early stage, for example designing and architecting the future solution, the approach will be slightly different. In this case, we would require the help of at least two experts in order to reach a consensual agreement about the solution. They should also be informed about the development status before the final solution is finished. This way, any discrepancy or architectural issue related to the current solution will be brought up early.

## Code Quality and Standards

We need to maintain code quality and standards. It's very important that you are familiar with the [Development Guides] in general, and the ones that relate to your group in particular:

- [UX Guides](https://docs.gitlab.com/ee/development/ux_guide/index.html)
- [Backend Guides](https://docs.gitlab.com/ee/development/README.html#backend-guides)
- [Frontend Guides](https://docs.gitlab.com/ee/development/fe_guide/index.html)
- [Database Guides](https://docs.gitlab.com/ee/development/README.html#database-guides)

Please remember that the only way to make code flexible is to make it as simple as possible:

> A lot of programmers make the mistake of thinking the way you make code flexible is by predicting as many future uses as possible, but this paradoxically leads to *less* flexible code.
>
> The only way to achieve flexibility is to make things as simple and easy to change as you can.
>
> — Nearby Cats (@BaseCase), January 16, 2019
### Quality is everyone's responsibility

It is important to remember that quality is everyone's responsibility. Everything you merge to master should be production ready. Familiarize yourself with the [definition of done].

[Development Guides]: https://docs.gitlab.com/ee/development/README.html
[definition of done]: https://gitlab.com/gitlab-org/gitlab-ce/blob/master/doc/development/contributing/merge_request_workflow.md#definition-of-done

### Release when it's ready

Our [releases page](/handbook/engineering/releases/) describes our two main release channels:

1. Self-managed users use a [monthly self-managed release](/handbook/engineering/releases/#self-managed-releases).
2. GitLab.com uses [auto-deploy releases](https://gitlab.com/gitlab-org/release/docs/blob/master/general/deploy/auto-deploy.md).

As the first of these is a monthly release, it's tempting to rush to get something into a monthly self-managed release. However, this is an anti-pattern. Most issues don't have strict deadlines. Those that do are exceptions, and should be treated as such.

Deadline pressure logically leads to a few outcomes:

1. People are at [increased risk of burnout](/handbook/paid-time-off/#recognizing-burnout).
2. We may compromise on our [definition of done](https://docs.gitlab.com/ee/development/contributing/merge_request_workflow.html#definition-of-done).
3. We [cut scope](/handbook/values/#move-fast-by-shipping-the-minimum-viable-change).
4. We miss the deadline.

Only the last two outcomes are acceptable as a general rule. Missing a 'deadline' in the form of an assigned milestone is often OK, as we put [velocity above predictability](#velocity-over-predictability), and missing the monthly self-managed release does not prevent code from reaching GitLab.com.

For these reasons, and others, we intentionally [do not define a specific date](/handbook/engineering/releases/#timelines) for code to be merged in order to reach a self-managed monthly release. The earlier it is merged, the better. This also means that:

1. We don't want merge request authors to [work extra hours](/handbook/values/#measure-results-not-hours) or otherwise rush to meet a deadline.
2. We don't want [reviewers and maintainers](/handbook/engineering/workflow/code-review/) to be put under pressure to do anything other than meet the [regular SLOs](/handbook/engineering/workflow/code-review/#first-response-slo).

If it is essential that a merge request make it into a particular release, this must be communicated well in advance to the engineer and any reviewers, to ensure they're able to make that commitment. If a severe bug needs to be fixed with short notice, it is better to revert the change that introduced it than to rush, or even to delay the release until the fix is ready. In general, there is no need to change any behavior close to the self-managed release.

## Visualization Tools

[grafana_prom]: https://www.youtube.com/watch?v=IW3zKdHrSvg
[monitoring_cs]: https://www.youtube.com/watch?v=sRQdUtc-aH4

### Grafana

- Visualization tool that analyzes metrics (e.g. CPU, memory, disk, and I/O utilization)
- [Using Grafana Repeating Panels with Prometheus][grafana_prom]
- [Grafana/Prometheus Monitoring Update for CS][monitoring_cs]

### Kibana

- Visualization tool that analyzes Elasticsearch log messages.
- [Monitoring of GitLab.com - Logs](/handbook/engineering/monitoring/#logs)

## Monitoring Tools

[sitespeed]: https://www.youtube.com/watch?v=6xo01hzW-f4
[runners]: https://www.youtube.com/watch?v=wEcoyC1cE5M
[prometheus]: https://www.youtube.com/watch?v=8Ai55-sYJA0
[instrumenting]: https://www.youtube.com/watch?v=e1-wIbQS-oE
[gitaly]: https://www.youtube.com/watch?v=R6F674Nj3wI
[sentry_doc]: https://docs.gitlab.com/ee/user/project/operations/error_tracking.html
[prom_doc]: https://docs.gitlab.com/ee/administration/monitoring/prometheus/index.html#monitoring-gitlab-with-prometheus
[speed_doc]: https://docs.gitlab.com/ee/user/project/merge_requests/browser_performance_testing.html#overview
[demo]: https://youtu.be/o02t3V3vHMs
[monitoring_gitlab]: /handbook/engineering/monitoring
[dashboards]: /handbook/engineering/monitoring#main-monitoring-dashboards
[box_monitoring]: /handbook/engineering/monitoring#selection-of-useful-dashboards-from-the-monitoring
[logs]: /handbook/engineering/monitoring#logs
[gitlab_monitoring]: /handbook/engineering/monitoring

### Prometheus

- Tool that monitors and sends alerts regarding the health of containers and microservices.
- [Documentation][prom_doc]
- [Intro to GitLab Monitoring - Debugging Runners][runners]
- [Prometheus 101][prometheus]
- [Instrumenting applications for Prometheus][instrumenting]
- [How Gitaly uses Prometheus for monitoring][gitaly]

### Sentry

- Tool that monitors systems to identify issues in real time.
- [Documentation][sentry_doc]
- [How to investigate a 500 error - Sentry / Kibana Demo][demo]
- [Diagnose Errors on GitLab.com - Searching Sentry](/handbook/support/workflows/500_errors.html#searching-sentry)

### Sitespeed.io

- Tool that helps you monitor, analyze, and optimize your website speed and performance.
- [Documentation][speed_doc]
- [How we used sitespeed.io to measure Frontend performance][sitespeed]

### Related Pages

- [Main Monitoring Dashboards][dashboards]
- [Blackbox and Whitebox Monitoring][box_monitoring]
- [Logs][logs]
- [Monitoring of GitLab.com][gitlab_monitoring]

## Pairing Engineers on P1/S1 Issues

In most cases, a single engineer and a maintainer review are adequate to handle a P1/S1 issue. However, some issues are highly difficult or complicated. Engineers should treat these issues with a high sense of urgency. For a complicated P1/S1 issue, multiple engineers should be assigned based on the level of complexity. The issue description should include the team members and their responsibilities, for example:

| Team Member | Responsibility |
| ------ | ------ |
| `Team Member 1` | `Reproduce the Problem` |
| `Team Member 2` | `Audit Code Base for other places where this may occur` |

If we have cases where three or five or X people are needed, Engineering Managers should feel the freedom to execute on a plan quickly. Following this procedure will:

- Decrease the time it takes to resolve P1/S1 issues
- Allow for a smooth handover of the issue in case of OOO or the end of the work day
- Provide support for Engineers if they are stuck on a problem
- Provide another set of eyes on topics with high urgency or on security-related fixes

## Error Budgets

We use [SRE](https://en.wikipedia.org/wiki/Site_Reliability_Engineering)-like error budgets in [OKRs](/company/okrs/2018-q4/) to incentivize risk management and help make GitLab.com ready for mission-critical customer workloads. Each backend and frontend development team is responsible for not exceeding an allocated budget of 15 points each quarter. The severity of issues caused will impact their budget accordingly:

* ~S1: 30 points
* ~S2: 15 points
* ~S3: 6 points
* ~S4: 3 points

The Infrastructure team will perform attribution as part of the root cause analysis process and record the results in the OKRs page.
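A minimal sketch of how those point values translate into a team's quarterly budget position follows; the incident counts in the example are purely illustrative.

```python
# Tally a team's quarterly error budget using the severity point values above.
# The incident list passed in the example is illustrative, not real data.

SEVERITY_POINTS = {"S1": 30, "S2": 15, "S3": 6, "S4": 3}
QUARTERLY_BUDGET = 15

def budget_status(attributed_severities):
    """Return points spent and remaining for a list of attributed incident severities."""
    spent = sum(SEVERITY_POINTS[severity] for severity in attributed_severities)
    remaining = QUARTERLY_BUDGET - spent
    state = "within budget" if remaining >= 0 else "over budget"
    return f"{spent} points attributed, {remaining} remaining ({state})"

# e.g. one S3 and two S4 incidents attributed to a team this quarter
print(budget_status(["S3", "S4", "S4"]))  # -> "12 points attributed, 3 remaining (within budget)"
```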
## Engineering Proposed Initiatives

Engineering is the primary advocate for the performance, availability, and security of the GitLab project. Product Management prioritizes all initiatives, so everyone in the engineering function should participate in the Product Management [prioritization process](/handbook/product/product-management/process/#prioritization) to ensure that our project stays ahead in these areas. The following list should provide some guidelines around the initiatives that each engineering team should advocate for during their release planning:

- Review fixes from our support team. These issues are tagged with the `support-fix` label. You can filter on open MRs [here](https://gitlab.com/gitlab-org/gitlab/merge_requests?label_name%5B%5D=support-fix).
- Working on high-priority issues as a result of [issue triaging](/handbook/engineering/quality/issue-triage/). This is our commitment to the community and we need to include some capacity to review MRs or work on defects raised by the community.
- Improvements to the performance and scalability of a feature. Again, the Product team should be involved in the definition of these issues, but Engineering may lead here by clearly defining the recommended improvements.
- Improvements to our toolchain in order to boost efficiency.

## Rails by default, VueJS where it counts

Part of our engineering culture is to keep shipping so users and customers see significant new value added to GitLab.com or their self-managed instance. To support rapid development, we focus on Rails page views by default. When an area of the application sees significant usage, we typically rewrite those screens as a [VueJS](https://vuejs.org/) single page app backed by our API, in order to maintain the best qualitative experience and quantitative performance.

## GraphQL first

When adding new functionality, we should use GraphQL where possible on the [backend] and the [frontend]. We have a long-term goal to [use GraphQL everywhere] because it increases development speed, reduces dependencies between frontend and backend engineers, and gives us a single source of truth for application data. Defaulting to GraphQL for new work means that the distance from that goal doesn't increase over time.

This does not override [the importance of velocity]: if something is significantly more work to ship using GraphQL, rather than extending an existing implementation (in a Rails controller or the REST API), we should not block ourselves on using GraphQL. Instead, we should ship the feature and create a follow-up issue to move that resource to GraphQL in future. That follow-up issue can be scheduled by the relevant Product Manager, in consultation with Engineering Managers, as with any other [engineering proposed initiative].

[backend]: https://docs.gitlab.com/ee/development/api_graphql_styleguide.html
[frontend]: https://docs.gitlab.com/ee/development/fe_guide/graphql.html
[use GraphQL everywhere]: https://gitlab.com/groups/gitlab-org/-/epics/1366
[the importance of velocity]: #the-importance-of-velocity
[engineering proposed initiative]: #engineering-proposed-initiatives

## Demos

Moved to a [dedicated page](/handbook/engineering/development/demos.html).

## Canary Testing

GitLab makes use of a 'Canary' stage.
Production Canary is a series of servers running GitLab code in a production environment. The Canary stage contains functional elements such as web, container registry, and Git servers, while sharing data elements such as Sidekiq, the database, and file storage with production. This allows UX code and most application logic code to be consumed by a smaller subset of users under real-world scenarios before being made available to all users on GitLab.com.

The production Canary stage is forcibly enabled for all users visiting GitLab Inc.-operated groups:

* [GitLab.com](https://gitlab.com/gitlab-com)
* [GitLab.org](https://gitlab.com/gitlab-org)
* [charts](https://gitlab.com/charts)

The Infrastructure department teams can globally disable use of production Canary when necessary. Individuals can also opt out of using production Canary environments; however, opting out does not apply to the groups listed above. To opt in or out, go to [GitLab Version](https://next.gitlab.com/) and move the toggle appropriately.

To verify that Canary is enabled, look for a 'Next' icon next to the GitLab logo in the header, or use the [performance bar](https://docs.gitlab.com/ee/administration/monitoring/performance/performance_bar.html) (type `pb`) in GitLab and watch for the Canary icon next to the web server name.

## Resources for Development
{: #resources}

When using any of the resources listed below, some rules apply:

* Consider the cost and whether anything can be done to reduce the cost.
* You can boot up as many machines as you need.
* It is your responsibility to clean up after yourself; if a machine is not used, remove it.
* If you observe any resource that is running for long periods of time, ask the person responsible whether the machine is still in use.
* Prepend your username to any resource you start. E.g., if your name is Jane Doe, name the resource `janedoe-machine-for-testing`.

### Google Cloud Platform (GCP)

Every team member has access to a common project on [Google Cloud Platform](https://console.cloud.google.com/). Please see the secure note with the name "Google Cloud Platform" in the shared vault in 1Password for the credentials or further details on how to gain access. Once in the console, you can spin up VM instances, Kubernetes clusters, etc. Where possible, please prefix the resource name with your name for easy identification (e.g. `myname-k8s-cluster`). Please remove any resources that you are not using, since the company is [billed monthly](https://cloud.google.com/pricing/). If you are unable to create a resource due to quota limits, file an issue on the [Infrastructure issue tracker](https://gitlab.com/gitlab-com/infrastructure).

If your group needs to have its own GCP project, please use this [issue template](https://gitlab.com/gitlab-com/gl-infra/infrastructure/issues/new?issuable_template=group_project) to request one. Your group may already have a project, which can be found on this list of [group GCP projects](https://ops.gitlab.net/gitlab-com/group-projects/tree/master/environments).

If you encounter the following error when creating a new GKE cluster, it indicates that we cannot create more clusters within that network. Please ask in #kubernetes for team members to delete unused clusters, or alternatively create your cluster in a different network.
```
The network "default" does not have available private IP space in 10.0.0.0/8
```

### Digital Ocean (DO)

Every team member has access to the [dev-resources project](https://gitlab.com/gitlab-com/dev-resources/), which allows everyone to create and delete machines on demand.

### Amazon Web Services (AWS)

In general, most team members do not have access to AWS accounts. In case you need an AWS resource, file an issue on the [Infrastructure issue tracker](https://gitlab.com/gitlab-com/infrastructure). Please supply the details on what type of access you need.

## DevOps Slack Channels

There are primarily two Slack channels in which developers may be called upon to assist the production team when something appears to be amiss with GitLab.com:

1. `#backend`: For backend-related issues (e.g. error 500s, high database load, etc.)
2. `#frontend`: For frontend-related issues (e.g. JavaScript errors, buttons not working, etc.)

Treat questions or requests from the production team with immediate urgency and high priority.