---
layout: markdown_page
title: "Category Strategy - Geo-replication"
---

- TOC
{:toc}

## 🌏 Geo-replication

### Introduction and how you can help

* [Overall Strategy](/direction/geo)
* [Roadmap for Geo Replication](https://gitlab.com/groups/gitlab-org/-/roadmap?scope=all&utf8=%E2%9C%93&state=opened&label_name[]=group%3A%3Ageo&label_name[]=geo%3A%3Aactive)
* [Maturity: <%= data.categories["geo_replication"].maturity.capitalize %>](/direction/maturity/)
* [Documentation](https://docs.gitlab.com/ee/administration/geo/replication/)
* [Complete Maturity epic](https://gitlab.com/groups/gitlab-org/-/epics/1508)
* [All Epics](https://gitlab.com/groups/gitlab-org/-/epics?scope=all&utf8=%E2%9C%93&state=opened&label_name[]=group%3A%3Ageo)

The Geo-replication category helps distributed developer teams be more productive. With a single GitLab instance, working with large repositories can take a long time for developers located in different geographies. Geo-replication provides an easily configurable, read-only mirror (we call it a Geo node) of a GitLab installation that is complete, accurate, verifiable and efficient. This is valuable because using Geo reduces the time it takes to fetch and clone repositories, which increases developer productivity.

Please reach out to Fabian Zimmer, Product Manager for the Geo group ([Email](mailto:fzimmer@gitlab.com)) if you'd like to provide feedback or ask any questions related to this product category.

This strategy is a work in progress, and everyone can contribute:

- Please comment and contribute in the linked [issues](https://gitlab.com/groups/gitlab-org/-/issues?scope=all&utf8=%E2%9C%93&state=opened&label_name[]=group%3A%3Ageo) and [epics](https://gitlab.com/groups/gitlab-org/-/epics?scope=all&utf8=%E2%9C%93&state=opened&label_name[]=group%3A%3Ageo) on this page. Sharing your feedback directly on GitLab.com is the best way to contribute to our strategy and vision.

### Current state

Currently, Geo-replication requires a significant investment to be configured, upgraded and maintained by systems administrators. Not all parts of GitLab are replicated and there is limited control over what is replicated where.

### Where we are headed

Our goal for Geo-replication is to offer the same experience to users, regardless of their location. In the future, we want our users to be able to configure Geo within minutes - not hours.

We envision Geo-replication to be fully transparent to users. This means that a developer should not need to actively decide to use Geo, or select the right Geo node - GitLab should be able to determine what Geo node should be used to provide the best user experience. For systems administrators, it should be simple to add, configure and remove nodes.

* Pulls and clones of Git repositories should be fast everywhere - especially for large repositories.
* Users should not be required to configure Geo-replication manually - we should detect the best Geo node automatically.
* Geo-replication should be easy to set up and configure - new Geo nodes should be added with minimal effort.
* Geo-replication should be fully transparent for users - ideally a user of GitLab does not even notice that they are using a Geo node.
* Geo-replication should allow fine-grained control over what is replicated where - not every project and repository needs to be shared.
* All data types generated by GitLab should be able to be replicated and verified.

### Target audience and experience

#### [Sidney (Systems Administrator)](https://about.gitlab.com/handbook/marketing/product-marketing/roles-personas/#sidney-systems-administrator)

* 🙂 **Minimal** - Sidney is able to configure the Geo-replication setup manually; the most commonly used data types are replicated.
* 😊 **Viable** - Sidney is able to configure Geo-replication in a mostly automated fashion. Various Geo-replication tasks (status reports, maintenance etc.)
can be performed using the GitLab UI. All customer-relevant data types are replicated.
* 😁 **Complete** - Sidney can set up Geo in an automated and fast way (minutes, not hours). A dashboard provides an overview for all relevant Geo tasks.
* 😍 **Lovable** - Sidney can set up Geo in an automated and fast way and everything is configurable from the UI; all nodes appear writable.

#### [Sasha (Software Developer)](https://about.gitlab.com/handbook/marketing/product-marketing/roles-personas/#sasha-software-developer)

* 🙂 **Minimal** - Sasha has to configure Git manually to use Geo. There is no automatic proxying of requests to the primary. The WebUI on the secondary is not available.
* 😊 **Viable** - Sasha has to configure Git manually but automatic proxying to the primary is possible. The WebUI is read-only.
* 😁 **Complete** - Sasha only requires a single URL for all Git operations - the correct Geo node is used automatically.
* 😍 **Lovable** - Sasha can push to and pull from a Geo node directly. Everything is readable and writable, including the WebUI. The user experience is the same independent of location.

For more information on how we use personas and roles at GitLab, please [click here](https://about.gitlab.com/handbook/marketing/product-marketing/roles-personas/).

### What's next & why

#### Building a self-service Geo framework

As of October 2019 [only ~50% of data types](https://docs.gitlab.com/ee/administration/geo/replication/#current-limitations) (we need a better name) are replicated and of those only ~41% are fully verified. We have made some efforts to change this situation by trying [to replicate the remaining data types](https://gitlab.com/groups/gitlab-org/-/epics/893) and [by trying to verify those data types](https://gitlab.com/groups/gitlab-org/-/epics/1430). As part of those efforts we learned that replicating data types is hard and so is verifying the data.
In order to change this situation and allow for adding data types to Geo more quickly, we are investigating how to [build a scalable, self-service geo-replication and verification framework](https://gitlab.com/groups/gitlab-org/-/epics/2161). This should make it easier for other teams within GitLab to add new data types and allow us to manage GitLab's growth. Additionally, this will make it easier for the community to contribute to Geo. The goal here is to allow new features to ship with Geo support by default without impacting velocity.

#### Geo should be easy to install

Installing Geo is highly manual and cumbersome, especially in high-availability configurations. The Distribution team is working to make [deploying and configuring Geo nodes easier](https://gitlab.com/gitlab-org/omnibus-gitlab/issues/4869). The Geo team will support this effort and, at the beginning of 2020, we are going to start investigating how we can [simplify Geo's installation](https://gitlab.com/groups/gitlab-org/-/epics/1465).

We also identified that a service discovery solution could have a huge benefit in helping administrators set up clusters of Geo nodes. We are currently working to [support Geo on Kubernetes](https://gitlab.com/groups/gitlab-org/-/epics/944) to give us a greater understanding of this tool, which will help inform the right direction to take with this proposal. We will [update the service discovery proposal](https://gitlab.com/gitlab-org/gitlab-ee/issues/8932) when Kubernetes support is complete.

#### Improving the administrator UI/UX

We have identified many [small usability issues with the Geo Administrator UI](https://gitlab.com/groups/gitlab-org/-/epics/369) and will start a comprehensive review of Geo's administrator panel. This includes [generating UX scorecards](https://gitlab.com/gitlab-org/gitlab-design/issues/731) and also discovery work to evaluate which specific tasks systems administrators need or want to perform using the UI.
We will then iterate on the UI and add functionality as needed. Additionally, we will work on refactoring the frontend code to be in line with our latest design guidelines.

#### Multi-mode Geo

Currently, Geo can only officially be operated in one mode - [Read-Only](#read-only) - where the database on a Geo secondary is in a read-only mode. Customer feedback has indicated a desire for additional operational / running modes, namely making Geo readable and writable. We have explored two POCs ([1](https://gitlab.com/gitlab-org/gitlab-ee/merge_requests/9354), [2](https://gitlab.com/gitlab-org/gitlab-ee/merge_requests/10309)) and will revisit this in an effort to move Geo from Complete to Lovable. This is not expected to start before the middle of 2020.

### What is not planned right now

We are currently not planning on moving away from PostgreSQL as a backend database in favour of e.g. CockroachDB or Google Spanner. This has implications for multi-mode Geo, but for now we will continue to support PostgreSQL.

### Maturity plan

This category is currently at the `viable` maturity level, and our next maturity target is `complete` (see our [definitions of maturity levels](/direction/maturity/#legend)).

You can track the work that will move the category to `complete` in [this epic](https://gitlab.com/groups/gitlab-org/-/epics/1508).

### Competitive landscape

The top competitors for Geo-replication are:

- [GitHub Enterprise geo-replication](https://help.github.com/en/enterprise/2.17/admin/installation/about-geo-replication)
- [Azure DevOps](https://azure.microsoft.com/en-in/services/devops/)
- [Bitbucket Smart Mirroring](https://confluence.atlassian.com/bitbucket/work-with-bitbucket-smart-mirroring-838427532.html)

#### Feature overview

| Feature | GitHub | Azure DevOps | Bitbucket Smart Mirroring | GitLab |
|---|---|---|---|---|
| Mirror repositories | ✅ | ✅ | ✅ | ✅ |
| Active-active replication | ❌ | N/A | ❌ | ❌ |
| Selective sync | N/A | N/A | ✅ | ⚠️ |
| UI configuration | ❌ | ✅ | N/A | ⚠️ |
| Kubernetes support | ❌ | ❌ | ❌ | ⚠️ |
| Mirror docker registries | ❌ | N/A | ❌ | ✅ |
| LFS and file upload support | ✅ | N/A | ✅ | ✅ |
| Automatic DNS | ✅ | ✅ | ❌ | ⚠️ |
| GUI Dashboard | ✅ | ✅ | N/A | ✅ |
| Request proxying | ✅ | N/A | N/A | ⚠️ |

- ✅ Fully available
- ⚠️ Partially available
- ❌ Not available
- N/A No information available

### Analyst landscape

We do need to engage with analysts more closely to understand the current landscape better.

### Top customer success/sales issue(s)

* [https://gitlab.com/gitlab-org/gitlab-ee/issues/2870](https://gitlab.com/gitlab-org/gitlab-ee/issues/2870)
* [Active active git replication](https://gitlab.com/gitlab-org/gitlab-ee/issues/1381)

### Top user issues

* [Category issues listed by popularity](https://gitlab.com/groups/gitlab-org/-/issues?label_name%5B%5D=Geo&scope=all&sort=popularity&state=opened&utf8=%E2%9C%93)

### Top internal customer issues/epics

* [Build a scalable, self-service geo-replication and verification framework](https://gitlab.com/groups/gitlab-org/-/epics/2161)

### Top strategy item(s)

* [Simplify Geo's installation](https://gitlab.com/groups/gitlab-org/-/epics/1465)