---
layout: handbook-page-toc
title: "APM Group"
---

## On this page
{:.no_toc .hidden-md .hidden-lg}

- TOC
{:toc .hidden-md .hidden-lg}

## APM

According to Gartner, "Application performance monitoring (APM) is a suite of monitoring software comprising digital experience monitoring (DEM), application discovery, tracing and diagnostics, and purpose-built artificial intelligence for IT operations".
The APM team at GitLab is responsible for building a suite of monitoring solutions focusing on Logs, Metrics, and Tracing, all within the GitLab UI. You can read more about our monitoring visions for [Logs](https://about.gitlab.com/direction/monitor/apm/logging/), [Metrics](https://about.gitlab.com/direction/monitor/apm/metrics/), and [Traces](https://about.gitlab.com/direction/monitor/apm/tracing/).

## Common links

* Slack channel: [#g_monitor_apm](https://gitlab.slack.com/archives/g_monitor_apm)
* Slack alias: @monitor-apm-group
* Google groups: monitor-apm-group@gitlab.com (whole team), monitor-apm-be@gitlab.com (backend team), and monitor-apm-fe@gitlab.com (frontend team)

## Backend Team members

<%= direct_team(manager_role: 'Engineering Manager, Monitor:APM') %>

## Frontend Team members

<%= direct_team(manager_role: 'Frontend Engineering Manager, Monitor', role_regexp: /(?!Monitor:Health)Monitor/) %>

## Stable counterparts

<%= stable_counterparts(role_regexp: /(?!Monitor:Health)Monitor/, direct_manager_role: 'Engineering Manager, Monitor:Health', other_manager_roles: ['Engineering Manager, Monitor:APM', 'Frontend Engineering Manager, Monitor:Health', 'Frontend Engineering Manager, Monitor:APM']) %>

## Responsibilities
{: #monitoring}

The APM group is responsible for:

* Providing the tools required to enable monitoring of GitLab.com
* Packaging these tools to enable all customers to manage their instances easily and completely
* Building integrated monitoring solutions for customers' apps into GitLab, including metrics, logging, and tracing

This team maps to the [APM
Group](/handbook/product/categories/#apm-group) category.

## How to work with APM

### Adding new metrics to GitLab

The APM Group is responsible for providing the underlying libraries and tools to enable GitLab team-members to instrument their code. When adding new metrics, we need to consider a few facets: the impact on GitLab.com, customer deployments, and whether any default alerting rules should be provided.

Recommended process for adding new metrics:

1. Open an issue in the desired project outlining the new metrics desired
1. Label with the ~group::apm label, and ping @gl-monitoring for initial review
1. During implementation consider:
    1. The Prometheus [naming](https://prometheus.io/docs/practices/naming/) and [instrumentation](https://prometheus.io/docs/practices/instrumentation/) guidelines
    1. Impact on cardinality and performance of Prometheus
    1. Whether any alerts should be created
1. Assign to an available APM Group reviewer

## How We Work

We try to adhere to best practices from across the company in how we work. For example, our Product Manager owns the problem validation backlog and problem validation process as outlined in the [Product Development Workflow](https://about.gitlab.com/handbook/product-development-flow/) and follows the [Product Development Timeline](https://about.gitlab.com/handbook/engineering/workflow/#product-development-timeline). Engineers follow the [Engineering Workflow](https://about.gitlab.com/handbook/engineering/workflow/).

Here are some additional details on how we work.

### Adding New Issues

When adding a new issue for the Monitor:APM group, follow these guidelines:

* Add the `group::apm` and `devops::monitor` labels
* Do not add a specific milestone. New issues will be reviewed and scheduled appropriately.
* For bugs, include [priority and severity labels](https://docs.gitlab.com/ee/development/contributing/issue_workflow.html#priority-labels).
These may be updated, but it is helpful to understand the expectation.

When creating a new issue, consider whether it can be completed in a single milestone with the collaboration of at most one frontend engineer and/or one backend engineer, plus one UX team member. If your issue is larger than that, consider creating an epic or splitting it into smaller issues.

On a regular basis the product manager will review any new issues and schedule them for the correct milestone. This often happens during the Monitor:APM weekly meeting.

### Creating Issues for Discussion

Often we need to create an issue to start a discussion about a new idea or feature. These issues do not involve immediate implementation work; they exist for discussion and will later lead to new issues that implement the idea. Here is the process we use for those types of issues:

1. Create an issue with a title like "Discussion: My Great Idea". This issue can then be assigned to specific people for comments and be assigned to a specific milestone. We do not use epics for this type of discussion because we have found it is hard to keep track of epics on our main issue boards.
1. We should continue to update the description of the issue as we find new information or refine our ideas.
1. Once the discussion around the new idea gets to a point where we want to start breaking it down into implementation details, we create an epic. We use epics at this point so we can be sure to group all the issues together and still have the discussion comments in one place that can be easily referenced. We do this in one of two ways:
    1. Promote the issue to an epic.
    1. Close the original issue, create a new epic, and then add a link from the epic to the original issue.
1. Create issues to cover the different iterations of implementation. Each issue should be small enough to be completed in a single milestone.
If there are dependencies between these issues, we should be sure to include that information for planning purposes.

### Breaking Down Issues

We try to break issues into small, deliverable pieces. To do this we use the `workflow::planning breakdown` label as described in the [product development flow](/handbook/product-development-flow/#build-phase-1-plan). This lets the team know that the issue needs to be broken down before we can start implementation. Anyone on the team can look for issues in this workflow state and break them down.

### Prioritizing Issues

Before the start of a milestone, the product manager is responsible for organizing the [APM Planning Board](https://gitlab.com/groups/gitlab-org/-/boards/1065731) by putting all issues for the upcoming milestone in priority order. By using the planning board as a priority list, and by keeping it in order, we should always be able to look at the current and upcoming milestone columns to see a prioritized list of upcoming work.

### Starting a Milestone

To start the next milestone, the engineering manager will apply the [`deliverable`](https://docs.gitlab.com/ee/development/contributing/issue_workflow.html#release-scoping-labels) label to any issues that we have a high likelihood of completing. The product manager will apply the `release post item` label to the top issues for the upcoming milestone that we want to highlight in the Kickoff call.

### Assigning Issues

When an engineer is available to start a new issue, they can self-assign the next highest priority issue. Once assigned, the engineer is responsible for keeping the workflow labels up-to-date and providing async issue updates (see below). If the issue will not be complete in the current milestone, the assigned engineer is also responsible for rescheduling the issue.
### Workflow Labels

We use standard workflow labels on issues as described in the [product development flow](https://about.gitlab.com/handbook/product-development-flow/#build-phase-2-develop--test). Specifically, we use:

* `workflow::ready for development` when the issue has enough information to start development
* `workflow::In dev` while we are working on the issue
* `workflow::In review` when a merge request is in review
* `workflow::verification` after the merge request has been merged and we are testing the change in staging and production

It is the responsibility of the engineer assigned to an issue to keep its workflow label up-to-date.

We use the [APM Workflow](https://gitlab.com/groups/gitlab-org/-/boards/1165027) board to visualize the issues.

### Assigning MRs for code review

Engineers should typically ignore the suggestion from [Dangerbot's](https://docs.gitlab.com/ee/development/dangerbot.html) Reviewer Roulette and assign their MRs to be reviewed by a [frontend engineer](https://about.gitlab.com/company/team/?department=monitor-fe-team) or [backend engineer](https://about.gitlab.com/company/team/?department=monitor-be-team) from the Monitor stage. If the MR requires domain-specific knowledge held by another team or by a person outside of the Monitor Stage, the author should assign their MR to be reviewed by an appropriate domain expert. The MR author should use the Reviewer Roulette suggestion when assigning the MR to a maintainer.

Advantages of keeping most MR reviews inside the Monitor Stage include:

* Quicker reviews because the reviewers hopefully already have the context and don't need additional research to figure out how the MR is supposed to work.
* Knowledge sharing among the engineers in the Monitor Stage. There is a lot of overlap between the groups in the stage and this effort will help engineers maintain context and consistency.
### Weekly async issue updates

Every Friday, each engineer is expected to provide a quick async issue update by commenting on their assigned issues using the following template:

```
### Async issue update

1. Please provide a quick summary of the current status (one sentence).
1. When do you predict this feature to be ready for maintainer review?
1. Are there any opportunities to further break the issue or merge request into smaller pieces (if applicable)?
```

We do this to encourage our team to be more async in collaboration and to allow the community and other team members to know the progress of issues that we are actively working on.

### Rescheduling Issues

Towards the end of a milestone, if we find any issues that are not going to be completed, it is the responsibility of the assigned engineer to follow this process for moving the issue to the next milestone.

1. Add a comment to the issue with what work is remaining.
1. Add the `to schedule` label
1. Add an issue weight (see below)
1. Move to the next milestone

#### Issue Weights

We only use issue weights when we have to move an issue from one milestone to the next. This is to help us understand how much remaining work we have for any issue that had to move. For example, we may schedule an issue that just needs a final review differently than an issue that has not been started. We use a simple 1 to 10 scale to estimate the remaining work:

| Weight | Meaning                    |
| ------ | -------------------------- |
| 1      | 10% Remaining              |
| 5      | 50% Remaining              |
| 10     | 100% Remaining/Not Started |

## Recurring Meetings

While we try to keep our process pretty light on meetings, we do hold a [Monitor APM Weekly Meeting](https://docs.google.com/document/d/1Y9woIjy7ySV3lbIJHuoyROYPZhtO1L_w3XDhgGKzZt8/edit?usp=sharing) to triage and prioritize new issues, discuss our upcoming issues, and uncover any unknowns.
## Repos we own or use

* [Prometheus Ruby Mmap Client](https://gitlab.com/gitlab-org/prometheus-client-mmap) - The Ruby Prometheus instrumentation library we built, which we used to instrument GitLab.
* [GitLab](https://gitlab.com/gitlab-org/gitlab) - Where much of the user-facing code lives.
* [Omnibus](https://gitlab.com/gitlab-org/omnibus-gitlab) and [Charts](https://gitlab.com/charts/charts.gitlab.io) - Where the packaging-related work goes on, as we ship GitLab fully instrumented with a Prometheus instance.

## Issue boards

* [APM](https://gitlab.com/groups/gitlab-org/-/boards/1143499) - Main board with all issues labeled "group::apm"
* [APM - Planning](https://gitlab.com/groups/gitlab-org/-/boards/1065731) - APM issues organized by milestone
* [APM - Product](https://gitlab.com/groups/gitlab-org/-/boards/1117038) - APM issues organized by issue type label like "feature", "bug", "backstage", "security", or "Community contribution"
* [APM - Workflow](https://gitlab.com/groups/gitlab-org/-/boards/1165027) - APM issues organized by workflow label of "ready for development", "in dev", or "in review"
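
## Example: sanity-checking a new metric

The "Adding new metrics to GitLab" process asks authors to consider the Prometheus naming guidelines and the impact on cardinality. The sketch below is one rough way to reason about both before opening an issue. It is not a tool the APM group ships: the function names, the metric name, and the label sizes are all hypothetical. The name check uses the classic metric name pattern from the Prometheus data model, and the cardinality figure is only a worst-case upper bound (the product of the distinct values each label can take).

```python
import re
from math import prod
from typing import Mapping

# Classic metric name pattern from the Prometheus data model docs.
# Colons are legal but are conventionally reserved for recording rules.
METRIC_NAME_RE = re.compile(r"[a-zA-Z_:][a-zA-Z0-9_:]*")


def is_valid_metric_name(name: str) -> bool:
    """Return True if the whole name matches the classic Prometheus pattern."""
    return METRIC_NAME_RE.fullmatch(name) is not None


def worst_case_series(label_value_counts: Mapping[str, int]) -> int:
    """Upper bound on time series for one metric.

    Each unique combination of label values is a separate series, so the
    bound is the product of the distinct-value counts of all labels.
    A metric with no labels yields exactly one series.
    """
    return prod(label_value_counts.values())


if __name__ == "__main__":
    # Hypothetical metric and label sizes, for illustration only.
    name = "gitlab_example_requests_total"
    labels = {"controller": 50, "action": 8, "status": 5}

    print("valid name:", is_valid_metric_name(name))
    print("worst-case series:", worst_case_series(labels))
```

A large worst-case number does not mean the metric is unacceptable, since many label combinations never occur in practice, but it is a useful prompt to discuss with the reviewer whether a label (for example, one carrying user or project identifiers) should be dropped.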