---
layout: handbook-page-toc
title: "Health Group"
---

## On this page
{:.no_toc .hidden-md .hidden-lg}

- TOC
{:toc .hidden-md .hidden-lg}

## Common links

* Slack channel: [#g_monitor_health](https://gitlab.slack.com/archives/g_monitor_health)
* Slack alias: [@monitor-health-group](https://app.slack.com/client/T02592416/C0259241C/user_groups/SLFUX86HF)
* Google group: [monitor-health-group@gitlab.com](https://groups.google.com/a/gitlab.com/forum/#!forum/monitor-health-group)

## Backend Team members

<%= direct_team(manager_role: 'Engineering Manager, Monitor:Health') %>

## Frontend Team members

<%= direct_team(manager_role: 'Frontend Engineering Manager, Monitor', role_regexp: /(?!Monitor:APM)Monitor/) %>

## Stable counterparts

<%= stable_counterparts(role_regexp: /(?!Monitor:APM)Monitor/, direct_manager_role: 'Engineering Manager, Monitor:Health', other_manager_roles: ['Engineering Manager, Monitor:APM', 'Frontend Engineering Manager, Monitor:Health', 'Frontend Engineering Manager, Monitor:APM']) %>

## Responsibilities
{: #monitoring}

This team maps to the [Health Group](/handbook/product/categories/#health-group) category and focuses on:
* Error Tracking
* Cluster Monitoring
* Synthetic Monitoring
* Incident Management
* Status Page

## Repos we own or use
* [Prometheus Ruby Mmap Client](https://gitlab.com/gitlab-org/prometheus-client-mmap) - The Ruby Prometheus instrumentation library we built and use to instrument GitLab
* [GitLab CE](https://gitlab.com/gitlab-org/gitlab-ce) and [GitLab EE](https://gitlab.com/gitlab-org/gitlab-ee) - Where much of the user-facing code lives
* [Omnibus](https://gitlab.com/gitlab-org/omnibus-gitlab) and [Charts](https://gitlab.com/charts/charts.gitlab.io) - Where much of the packaging-related work happens. (We ship GitLab fully instrumented along with a Prometheus instance.)

## Issue boards

* [Health - Planning](https://gitlab.com/groups/gitlab-org/-/boards/1131777) - Main board with all issues scoped to label "group::health"
* [Health - Workflow](https://gitlab.com/groups/gitlab-org/-/boards/1160198) - Issue board organized by workflow labels
* [Charts](https://gitlab.com/groups/gitlab-org/-/boards/1184016) - Issue board with all issues labeled "Charts"
* [Monitor Bugs](https://gitlab.com/groups/gitlab-org/-/boards/979406) - Issue board organized by Priority labels so that we make sure we meet our bug fix SLA

## Development Processes

### Surfacing blockers

To surface blockers, mention your Engineering Manager in the issues, then follow up with them via Slack and/or in your 1:1s. Also raise any blockers in your daily async standup using Geekbot.

The engineering managers want to make unblocking their teams their highest priority. Please don't hesitate to raise blockers.

### Scheduling

#### Scheduling issues in milestones

The Product Manager is responsible for scheduling issues in a given milestone. During the backlog grooming portion of our weekly meeting, all parties confirm that issues are scoped and well defined enough to implement, and determine whether they need UX involvement and/or technical investigation.

As we approach the start of the milestone, Engineering Managers are responsible for adding the ~deliverable label to communicate which issues we are committing to finish in the given milestone. Generally, the Engineering Manager will use the prioritized order of issues in the milestone to determine which issues to label as ~deliverable. The Product Manager will have follow-up conversations with the Engineering Managers if the deliverables do not meet their expectations or if there are other tradeoffs we should make.

#### Scheduling bugs

When new bugs are reported, the engineering managers ensure that they have proper Priority and Severity labels. Bugs are discussed during our backlog grooming session and are scheduled according to severity, priority, and the capacity of the teams. Ideally, we should work on a few bugs each release regardless of priority or severity.

### Weekly async issue updates

Every Friday, each engineer is expected to provide a quick async issue update by commenting on their assigned issues using the following template:

```
<!---
Please be sure to update the workflow labels of your issue to one of the following (whichever best describes the status):
- ~"workflow::In dev"
- ~"workflow::In review"
- ~"workflow::verification"
- ~"workflow::blocked"
-->
### Async issue update
1. Please provide a quick summary of the current status (one sentence).
1. When do you predict this feature to be ready for maintainer review?
1. Are there any opportunities to further break the issue or merge request into smaller pieces (if applicable)?
```

We do this to encourage our team to be more async in collaboration and to allow the community and other team members to know the progress of issues that we are actively working on.

### Interacting with community contributors

Community contributions are encouraged and prioritized at GitLab. Please check out the [Contribute page](/community/contribute/) on our website for guidelines on contributing to GitLab overall.

Within the Monitor stage, Product Management will assist a community member with questions regarding priority and scope. If a community member has technical questions on implementation, Engineering Managers will connect them with engineers within the team to collaborate with.

### Using spikes to inform design decisions

Engineers use spikes to conduct research, prototyping, and investigation to gain knowledge necessary to reduce the risk of a technical approach, better understand a requirement, or increase the reliability of a story estimate (paraphrased from [this overview](https://www.scaledagileframework.com/spikes/)). When we identify the need for a spike for a given issue, we will create a new issue, conduct the spike, and document the findings in the spike issue. We then link to the spike and summarize the key decisions in the original issue.

### Assigning MRs for code review

Engineers should typically ignore the suggestion from [Dangerbot's](https://docs.gitlab.com/ee/development/dangerbot.html) Reviewer Roulette and assign their MRs to be reviewed by a [frontend engineer](https://about.gitlab.com/company/team/?department=monitor-fe-team) or [backend engineer](https://about.gitlab.com/company/team/?department=monitor-be-team) from the Monitor stage. If the MR requires domain-specific knowledge held by another team or by a person outside the Monitor stage, the author should assign the MR to an appropriate domain expert for review. The MR author should use the Reviewer Roulette suggestion when assigning the MR to a maintainer.

Advantages of keeping most MR reviews inside the Monitor Stage include:

* Quicker reviews because the reviewers hopefully already have the context and don't need additional research to figure out how the MR is supposed to work.
* Knowledge sharing among the engineers in the Monitor Stage. There is a lot of overlap between the groups in the stage and this effort will help engineers maintain context and consistency.

### Preparing UX designs for engineering

Product designers generally try to work one milestone ahead of the engineers, to ensure scope is defined and agreed upon before engineering starts work. So, for example, if engineering is planning on getting started on an issue in 12.2, designers will assign themselves the appropriate issues during 12.1, making sure everything is ready to go before 12.2 starts.

To make sure this happens, early planning is necessary. In the example above, for instance, we'd need to know by the end of 12.0 what will be needed for 12.2 so that we can work on it during 12.1. This takes a lot of coordination between UX and the PMs. We can (and often do) try to pick up smaller things as they come up and in cases where priorities change. But, generally, we have a set of assigned tasks for each milestone in place by the time the milestone starts so anything we take on will be in addition to those existing tasks and dependent on additional capacity.

The current workflow:

* Though Product Designers make an effort to keep an eye on all issues being worked on, PMs add the UX label to specific issues needing UX input for upcoming milestones.

* The week before the milestone starts, the Product Designers divide up issues depending on interest, expertise and capacity.

* Product Designers start work on assigned issues when the milestone starts. We make an effort to start conversations early and to have them often. We collaborate closely with PMs and engineers to make sure that the proposed designs are feasible.

* In terms of what we deliver: we will provide what's needed to move forward, which may or may not include a high-fidelity design spec. Depending on requirements, we may provide a text summary of the expected scope, a Balsamiq sketch, a screengrab, or a higher-fidelity measure spec.

* When we feel like we've achieved a 70% level of confidence that we're aligned on the way forward, we change the label to ~'workflow::ready for development' as a sign that the issue is appropriately scoped and ready for engineering.

* We usually stay assigned to issues after they are ~'workflow::ready for development' to continue to answer questions while the development process is taking place.

* Finally, when development is complete, we conduct UX Reviews on the MRs to ensure that what's been implemented matches the spec.

## Service accounts we own or use

### Zoom sandbox account

In order to develop and test Zoom features for the [integration with GitLab](https://gitlab.com/groups/gitlab-org/-/epics/1439) we now have our own Zoom sandbox account.

#### Requesting access

To request access to this Zoom sandbox account, please open [an issue](https://gitlab.com/gitlab-com/team-member-epics/access-requests/issues/new?issuable_template=New%20Access%20Request) providing your **non-GitLab email address** (which can already be associated with an existing non-GitLab Zoom account).

The following people are owners of this account and can [grant access](https://zoom.us/account/user) to other GitLabbers:

* [Andrew Newdigate](https://gitlab.com/andrewn)
* [Peter Leitzen](https://gitlab.com/splattael)
* [Allison Browne](https://gitlab.com/allison.browne)

#### Granting access

1. Log in to [Zoom](http://zoom.us/) with your non-GitLab email
1. Go to [**User Management > Users**](https://zoom.us/account/user)
1. Click on `Add User`
1. Specify email addresses
1. Choose `User Type` - most likely `Pro`
1. Click `Add` - the users receive invitations via email
1. Add the linked name to [the list in "Requesting access"](#requesting-access)

#### Documentation

For more information on how to use Zoom, see their [guides](https://marketplace.zoom.us/docs/guides) and [API reference](https://marketplace.zoom.us/docs/api-reference/introduction).

## Recurring Meetings
While we try to keep our process pretty light on meetings, we do hold a [Monitor Health Backlog Grooming](https://docs.google.com/document/d/1YWpzwlLVvciuHlpT1ALfixMHRYcWt7oqD4HftkVE5w8/edit#) meeting weekly to triage and prioritize new issues, discuss our upcoming issues, and uncover any unknowns.

## Deliverable Labels
In our group, the (frontend + backend) engineering managers are responsible for adding the `~deliverable` label to any issue that the team is publicly committing, to the best of their ability, to complete in that milestone. We are not perfect, but our goal is that 100% of the issues with that label ship in the release they are scheduled for. This allows engineering to share which issues they commit to and helps set expectations for the product manager and for the community.

## Frontend Scheduling
Our goal is to move towards a continuous delivery model in which the team completes tasks on a weekly basis. In our weekly meetings, we groom our backlog to prioritize specific issues that are ready for development. Every release, the product manager will collaborate with the team to identify notable features that we want implemented. These issues will be shared in the product kickoff call and will have a frontend engineer assigned to them before the development milestone starts.

The development of these assigned issues should not typically last the entire release cycle. Once frontend engineers have completed their assigned issue, they are expected to go to the Health issue board and assign themselves to the next unassigned issue in the list that has the `frontend` and `workflow::ready for development` labels. The issues in the board are prioritized based on importance (the lower they are on the list, the lower the priority). In the event that all issues are assigned for that milestone, frontend engineers are expected to assign themselves to issues on the next milestone on the issue board list.
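
The selection rule described above can be sketched in a few lines of Ruby. This is only an illustration, not a tool the team uses: the `Issue` struct, the `next_issue_for` helper, and the sample issue titles are all hypothetical, and the sketch assumes the scoped label is spelled `workflow::ready for development` as elsewhere on this page.

```ruby
# Hypothetical model of a board entry; not part of any GitLab API.
Issue = Struct.new(:title, :labels, :assignee, :milestone, keyword_init: true)

# Labels an issue must carry before a frontend engineer picks it up.
REQUIRED_LABELS = ['frontend', 'workflow::ready for development'].freeze

# board_issues is assumed to be ordered as on the board (highest priority first).
# milestones lists the current milestone first, then upcoming ones.
# Returns the first unassigned, correctly labeled issue, or nil if none exists.
def next_issue_for(board_issues, milestones)
  milestones.each do |milestone|
    candidate = board_issues.find do |issue|
      issue.milestone == milestone &&
        issue.assignee.nil? &&
        (REQUIRED_LABELS - issue.labels).empty?
    end
    return candidate if candidate
  end
  nil
end

# Made-up sample data for illustration.
board = [
  Issue.new(title: 'Fix alert chart',
            labels: ['frontend', 'workflow::ready for development'],
            assignee: 'alice', milestone: '12.2'),
  Issue.new(title: 'Add status page link',
            labels: ['backend', 'workflow::ready for development'],
            assignee: nil, milestone: '12.2'),
  Issue.new(title: 'Improve error list',
            labels: ['frontend', 'workflow::ready for development'],
            assignee: nil, milestone: '12.3')
]

picked = next_issue_for(board, ['12.2', '12.3'])
puts picked.title
```

In this sample, every eligible frontend issue in 12.2 is already assigned, so the helper falls through to the next milestone, mirroring the last sentence of the paragraph above.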

## Monitor Stage PTO
Just like the rest of the company, we use [PTO Ninja](/handbook/paid-time-off/#pto-ninja) to track when team members are traveling, attending conferences, and taking time off. The easiest way to see who has upcoming PTO is to run the `/ninja whosout` command in the `#g_monitor_standup` Slack channel. This will show you the upcoming PTO for everyone in that channel.