---
title: "How to exploit parser differentials"
author: Joern Schneeweisz
author_gitlab: joernchen
author_twitter: joernchen
categories: security
image_title: '/images/blogimages/closeup-photo-of-black-and-blue-keyboard-1194713.jpg'
description: "Your guide to abusing 'language barriers' between web components."
tags: security, security research
twitter_text: "An in-depth guide to abusing 'language barriers' between web components and exploiting parser differentials"
postType: content marketing
merch_banner: merch_one
---

The move to microservices-based architecture creates more attack surface for nefarious actors, so when our [security researchers](/handbook/engineering/security/#security-research) discovered a file upload vulnerability within GitLab, we patched it right up in our [GitLab 12.7.4 security release](/releases/2020/01/30/security-release-gitlab-12-7-4-released). In this post we dive deeper into the problems that led to this vulnerability and use it to illustrate the underlying concept of parser differentials.

## File Uploads in GitLab

To understand the file upload vulnerability, we need to go a bit deeper into file uploads within GitLab and have a look at the components involved.

### GitLab Workhorse

The first relevant component is GitLab's very own reverse proxy, called [`gitlab-workhorse`](https://gitlab.com/gitlab-org/gitlab-workhorse/). `gitlab-workhorse` fulfills a variety of tasks, but for this specific example we only care about certain kinds of file uploads.

The second component is [`gitlab-rails`](https://gitlab.com/gitlab-org/gitlab), the Ruby on Rails-based heart of GitLab. It's the main application part of GitLab and implements most of the business logic.

The following source code excerpts from `gitlab-workhorse` are based on the [`8.18.0`](https://gitlab.com/gitlab-org/gitlab-workhorse/-/tags/v8.18.0) release, which was the most recent version at the time the vulnerability was identified.

Consider the following route, defined in [`internal/upstream/routes.go`](https://gitlab.com/gitlab-org/gitlab-workhorse/-/blob/9a9a83e7f92ceea5fb0e1542d604171c58615e28/internal/upstream/routes.go#L207-208), which handles file uploads for [Conan](https://conan.io/) packages:

```go
// Conan Artifact Repository
route("PUT", apiPattern+`v4/packages/conan/`, filestore.BodyUploader(api, proxy, nil)),
```

The route defined above will pass any `PUT` request to paths underneath `/api/v4/packages/conan/` to the [`BodyUploader`](https://gitlab.com/gitlab-org/gitlab-workhorse/-/blob/9a9a83e7f92ceea5fb0e1542d604171c58615e28/internal/filestore/body_uploader.go#L40-79). Also worth mentioning: any request that does not match a route in `gitlab-workhorse` is passed on to the backend without modification. That's especially important in our discussion for non-`PUT` requests to paths under `/api/v4/packages/conan/`.

Within this `BodyUploader`, some magic happens. Well, actually, it's not magic: the `BodyUploader` receives the uploaded file and lets the `gitlab-rails` backend know where the file has been placed. This happens in [`internal/filestore/file_handler.go`](https://gitlab.com/gitlab-org/gitlab-workhorse/-/blob/9a9a83e7f92ceea5fb0e1542d604171c58615e28/internal/filestore/file_handler.go#L52-81):

```go
// GitLabFinalizeFields returns a map with all the fields GitLab Rails needs in order to finalize the upload.
func (fh *FileHandler) GitLabFinalizeFields(prefix string) map[string]string {
	data := make(map[string]string)
	key := func(field string) string {
		if prefix == "" {
			return field
		}

		return fmt.Sprintf("%s.%s", prefix, field)
	}

	if fh.Name != "" {
		data[key("name")] = fh.Name
	}
	if fh.LocalPath != "" {
		data[key("path")] = fh.LocalPath
	}
	if fh.RemoteURL != "" {
		data[key("remote_url")] = fh.RemoteURL
	}
	if fh.RemoteID != "" {
		data[key("remote_id")] = fh.RemoteID
	}
	data[key("size")] = strconv.FormatInt(fh.Size, 10)
	for hashName, hash := range fh.hashes {
		data[key(hashName)] = hash
	}

	return data
}
```

So `gitlab-workhorse` will replace the uploaded file name with the path to where it has stored the file on disk, so that the `gitlab-rails` backend knows where to pick it up.

Observe the following original request, as received by `gitlab-workhorse`:

```
PUT /api/v4/packages/conan/v1/files/Hello/0.1/root+xxxxx/beta/0/export/conanfile.py HTTP/1.1
Host: localhost
User-Agent: Conan/1.22.0 (Python 3.8.1) python-requests/2.22.0
Accept-Encoding: gzip, deflate
Accept: */*
Connection: close
X-Checksum-Sha1: 93ebaf6e85e8edde99c1ed46eaa1b5e1e5f4ac78
Content-Length: 1765
Authorization: Bearer [.. shortened ..]

from conans import ConanFile, CMake, tools

class HelloConan(ConanFile):
    name = "Hello"
[.. shortened ..]
```

This is what this request will look like to `gitlab-rails` after `gitlab-workhorse` has processed it (excerpted from `api_json.log`):

```json
{
  "time": "2020-02-20T14:49:44.738Z",
  "severity": "INFO",
  "duration": 201.93,
  "db": 67.34,
  "view": 134.59,
  "status": 200,
  "method": "PUT",
  "path": "/api/v4/packages/conan/v1/files/Hello/0.1/root+xxxxx/beta/0/export/conanfile.py",
  "params": [
    {
      "key": "file.md5",
      "value": "719f0319f1fd5f6fcbc2433cc0008817"
    },
    {
      "key": "file.path",
      "value": "/var/opt/gitlab/gitlab-rails/shared/packages/tmp/uploads/582573467"
    },
    {
      "key": "file.sha1",
      "value": "93ebaf6e85e8edde99c1ed46eaa1b5e1e5f4ac78"
    },
    {
      "key": "file.sha256",
      "value": "f7059b223cd4d32002e5e34ab1ae5b4ea12f3bd0326589b00d5e910ce02c1f3a"
    },
    {
      "key": "file.sha512",
      "value": "efbe75ea58bd817d42fd9ca5ac556abd6fbe3236f66dfad81d508b5860252d32d1b1868ee03c7f4c6174a0ba6cc920a574b5865ca509f36c451113c9108f9a36"
    },
    {
      "key": "file.size",
      "value": "1765"
    }
  ],
  "host": "localhost",
  "remote_ip": "172.17.0.1, 127.0.0.1",
  "ua": "Conan/1.22.0 (Python 3.8.1) python-requests/2.22.0",
  "route": "/api/:version/packages/conan/v1/files/:package_name/:package_version/:package_username/:package_channel/:recipe_revision/export/:file_name",
  "user_id": 1,
  "username": "root",
  "queue_duration": 16.59,
  "correlation_id": "aSEqrgEfvX9"
}
```

In particular, the `params` entry `file.path` is of interest, as it denotes the file system path where `gitlab-workhorse` has placed the uploaded file.

### `gitlab-rails`

This `gitlab-workhorse`-modified request, as `gitlab-rails` will see it, is handled in [`lib/uploaded_file.rb`](https://gitlab.com/gitlab-org/gitlab/-/blob/v12.7.4-ee/lib/uploaded_file.rb#L45-66) within the `from_params` method:

```ruby
01 def self.from_params(params, field, upload_paths)
02   path = params["#{field}.path"]
03   remote_id = params["#{field}.remote_id"]
04   return if path.blank? && remote_id.blank?
05
06   file_path = nil
07   if path
08     file_path = File.realpath(path)
09
10     paths = Array(upload_paths) << Dir.tmpdir
11     unless self.allowed_path?(file_path, paths.compact)
12       raise InvalidPathError, "insecure path used '#{file_path}'"
13     end
14   end
15
16   UploadedFile.new(file_path,
17     filename: params["#{field}.name"],
18     content_type: params["#{field}.type"] || 'application/octet-stream',
19     sha256: params["#{field}.sha256"],
20     remote_id: remote_id,
21     size: params["#{field}.size"])
22 end
```

Here we can see how the uploaded file reference is handled. Lines `10-13` in the snippet above implement a whitelist of paths from which a file uploaded via `gitlab-workhorse` will be accepted. `Dir.tmpdir`, which resolves to the path `/tmp`, is added to the whitelist as well. In the subsequent lines, a new `UploadedFile` is constructed from the `file.path` and the other parameters `gitlab-workhorse` has set.

## `gitlab-workhorse` bypass

So we've seen the inner workings of both `gitlab-workhorse` and `gitlab-rails` when it comes to file uploads for Conan packages. To recap, the flow is as follows:

```mermaid
sequenceDiagram
    participant User
    participant workhorse
    participant Rails
    User->>workhorse: PUT request to conan registry
    workhorse->>workhorse: Place uploaded file on disk and re-write PUT request
    workhorse->>Rails: Pass on modified PUT request
    Rails->>Rails: Pick up file from disk and store in UploadedFile
```

From an attacker's perspective, it would be nice to meddle with the modified `PUT` request. In particular, control over the `file.path` parameter would allow us to grab arbitrary files from `/tmp` and the defined `upload_paths`. But as `gitlab-workhorse` sits right in front of `gitlab-rails`, we can't just pass those parameters or otherwise interact directly with `gitlab-rails` without going via `gitlab-workhorse`.
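
To make concrete what control over `file.path` would buy an attacker, here is a minimal Ruby sketch of the kind of path whitelist check that `from_params` relies on. The body of `allowed_path?` is not shown in this post, so the implementation below is a hypothetical stand-in and not GitLab's actual code. The temporary directories and file names are made up for illustration, and the sketch assumes a platform with symlink support.

```ruby
require 'tmpdir'

# Hypothetical stand-in for the `allowed_path?` check used by `from_params`;
# the real implementation is not shown in this post.
def allowed_path?(file_path, allowed_dirs)
  real = File.realpath(file_path) # resolves symlinks and ".." segments
  allowed_dirs.any? do |dir|
    root = File.realpath(dir)
    # Accept only the directory itself or paths strictly below it.
    real == root || real.start_with?(root + File::SEPARATOR)
  end
end

Dir.mktmpdir do |uploads|      # plays the role of a whitelisted upload directory
  Dir.mktmpdir do |elsewhere|  # a directory that is not on the whitelist
    upload = File.join(uploads, 'upload')
    secret = File.join(elsewhere, 'secret')
    File.write(upload, 'uploaded data')
    File.write(secret, 'not meant to be picked up')

    # A symlink inside the whitelisted directory that points outside of it.
    link = File.join(uploads, 'link')
    File.symlink(secret, link)

    puts allowed_path?(upload, [uploads]) # true: inside the whitelisted directory
    puts allowed_path?(secret, [uploads]) # false: outside the whitelist
    puts allowed_path?(link, [uploads])   # false: the symlink is resolved first
  end
end
```

Unlike `from_params`, this sketch deliberately does not add `Dir.tmpdir` to the list, because both example directories live underneath it. A check of this kind holds up on its own terms; the open question is how to get an attacker-chosen `file.path` to `gitlab-rails` in the first place.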

We can indeed achieve this by leveraging the fact that `gitlab-workhorse` parses HTTP requests differently than `gitlab-rails` does. In particular, we can use [`Rack::MethodOverride`](https://www.rubydoc.info/gems/rack/Rack/MethodOverride) in `gitlab-rails`, which is a default middleware in Ruby on Rails applications. The `Rack::MethodOverride` middleware allows us to send a `POST` request and let `gitlab-rails` know **"well, actually this is a `PUT` request! ¯\\\_(ツ)\_/¯ "**. With this little trick we can sneak past the `gitlab-workhorse` route that would otherwise intercept the `PUT` request, as `gitlab-workhorse` is not aware that the `POST` method has been overridden. So by specifying either a `_method=PUT` parameter or an `X-HTTP-METHOD-OVERRIDE: PUT` HTTP header, we can indeed directly point `gitlab-rails` to files on disk. The method override is used a lot in Ruby on Rails applications to allow simple `