Working with uploads · GitLab (2023)

Recommendations

  • When you create an uploader, make it a subclass of AttachmentUploader
  • Add your uploader to the tables in this document
  • Do not add new object storage buckets
  • Implement direct upload
  • If you need to process your uploads, decide where to do that

Background information

  • CarrierWave Uploaders
  • GitLab changes to CarrierWave

Where should you store your files?

CarrierWave Uploaders determine where files are stored. When you create a new Uploader class, you are deciding where to store the files of your new feature.

First, ask yourself if you need a new Uploader class. It is fine to use the same Uploader class for different mount points or different models.

If you do want or need your own Uploader class, you should make it a subclass of AttachmentUploader. You then inherit the storage location and directory scheme from that class. The directory scheme is:

File.join(model.class.underscore, mounted_as.to_s, model.id.to_s)
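To make the scheme concrete, here is a minimal plain-Ruby sketch. The Project and ImportExportUpload structs and the underscore helper are stand-ins invented for this illustration (the real code relies on Rails helpers), so treat it as an approximation of how the three path segments combine rather than as the actual implementation.

```ruby
# Hypothetical stand-ins for ActiveRecord models; only the id matters here.
Project = Struct.new(:id)
ImportExportUpload = Struct.new(:id)

# Simplified replacement for the Rails underscore helper. It only handles
# plain CamelCase names without namespaces.
def underscore(name)
  name.gsub(/([a-z\d])([A-Z])/, '\1_\2').downcase
end

# Combines model name, mount point, and model ID as in the scheme above.
def store_dir(model, mounted_as)
  File.join(underscore(model.class.name), mounted_as.to_s, model.id.to_s)
end

puts store_dir(Project.new(123), :avatar)
# prints "project/avatar/123"
puts store_dir(ImportExportUpload.new(7), :import_file)
# prints "import_export_upload/import_file/7"
```

Because the directory is derived from the model and the mount point, two features that share AttachmentUploader still end up in separate directories.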

If you look around the GitLab code base, you will find quite a few Uploaders that have their own storage location. For object storage, this means Uploaders have their own buckets. We now discourage adding new buckets for the following reasons:

  • Using a new bucket adds to development time because you need to make downstream changes in GDK, Omnibus GitLab, and CNG.
  • Using a new bucket requires GitLab.com infrastructure changes, which slows down the rollout of your new feature.
  • Using a new bucket slows down adoption of your new feature for self-managed GitLab installations: people cannot start using your new feature until their local GitLab administrator has configured the new bucket.

By using an existing bucket, you avoid all this extra work and friction. The Gitlab.config.uploads storage location, which is what AttachmentUploader uses, is guaranteed to already be configured.

Implementing direct upload support

Below we outline how to implement direct upload support.

Using direct upload is not always necessary, but it is usually a good idea. Unless the uploads handled by your feature are both infrequent and small, you probably want to implement direct upload. Project avatars are an example of a feature with small and infrequent uploads: avatars rarely change and the application imposes strict size limits on them.

If your feature handles uploads that are not both infrequent and small, then not implementing direct upload support means you are taking on technical debt. At the very least, you should make sure that you can add direct upload support later.

To support direct upload, you need two things:

  1. A pre-authorization endpoint in Rails
  2. A Workhorse routing rule

Workhorse does not know where to store your upload. To find out, it makes a pre-authorization request. It also does not know whether or where to make a pre-authorization request. For that you need the routing rule.

A note to those of us who remember when Workhorse was a separate project: it is no longer necessary to split these two steps into separate merge requests. In fact, it is probably easier to do both in one merge request.

Adding a Workhorse routing rule

Routing rules are defined in workhorse/internal/upstream/routes.go. They consist of:

  • An HTTP verb (usually "POST" or "PUT")
  • A path regular expression
  • An upload type: MIME multipart or "full request body"
  • Optionally, a match on HTTP headers such as Content-Type

Example:

u.route("PUT", apiProjectPattern+`packages/nuget/`, mimeMultipartUploader),

You should add a test for your routing rule to TestAcceleratedUpload in workhorse/upload_test.go.

You should also manually verify that Workhorse makes a pre-authorization request when you perform an upload request for your new feature. You can check this by looking at the Rails access logs. This is necessary because a mistake in your routing rule does not cause a hard failure: you just end up using the less efficient default path.
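The silent fallback is easier to see in a toy model. The Ruby sketch below is not Workhorse code (Workhorse is written in Go); the Route struct, the /api/v4/projects/ path prefix, and the :default_proxy handler name are all assumptions made for illustration. It only shows why a rule that fails to match produces no error.

```ruby
# Hypothetical, simplified routing table: verb, path pattern, handler.
Route = Struct.new(:verb, :pattern, :handler)

ROUTES = [
  # The path prefix is an assumption chosen for this sketch.
  Route.new("PUT", %r{\A/api/v4/projects/[^/]+/packages/nuget/}, :mime_multipart_uploader)
].freeze

# Returns the handler of the first matching rule. When nothing matches,
# the request silently falls through to the default handler.
def handler_for(verb, path)
  route = ROUTES.find { |r| r.verb == verb && r.pattern.match?(path) }
  route ? route.handler : :default_proxy
end

puts handler_for("PUT", "/api/v4/projects/1/packages/nuget/")
# prints "mime_multipart_uploader"
puts handler_for("PUT", "/api/v4/projects/1/packages/nugget/")
# prints "default_proxy"
```

A typo in the pattern (the second call) still yields a working request, just without the accelerated upload, which is why checking the Rails access logs for the pre-authorization request matters.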

Adding a pre-authorization endpoint

We distinguish three cases: Rails controllers, Grape API endpoints, and GraphQL resources.

To start with the bad news: direct upload for GraphQL is currently not supported. The reason is that Workhorse does not parse GraphQL queries. See also issue #280819. Consider accepting your file upload via Grape instead.

For Grape pre-authorization endpoints, look for existing examples that implement /authorize routes. One example is the POST :id/uploads/authorize endpoint. This particular example uses FileUploader, which means that the upload is stored in the storage location (bucket) of that Uploader class.

For Rails endpoints you can use the WorkhorseAuthorization concern.

Processing uploads

Some features require us to process uploads, for example to extract metadata from the uploaded file. There are a couple of different ways to implement this. The main choice is where to implement the processing, or "who is the processor".

Processor | Direct upload possible? | Can reject HTTP request? | Implementation
Sidekiq | Yes | No | Straightforward
Workhorse | Yes | Yes | Complex
Rails | No | Yes | Easy

Processing in Rails looks appealing, but it tends to lead to scaling problems down the road because you cannot use direct upload. You are then forced to rebuild your feature with processing in Workhorse. So, if the requirements of your feature allow it, doing the processing in Sidekiq strikes a good balance between complexity and the ability to scale.

CarrierWave Uploaders

GitLab uses a modified version of CarrierWave to manage uploads. Below we describe how we use CarrierWave and how we modified it.

The central concept of CarrierWave is the Uploader class. The Uploader defines where files are stored, and optionally contains validation and processing logic. To use an Uploader, you must associate it with a text column on an ActiveRecord model. This is called "mounting" and the column is called the mountpoint. For example:

class Project < ApplicationRecord
  mount_uploader :avatar, AttachmentUploader
end

Now if you upload an avatar called tanuki.png, the idea is that CarrierWave stores the string tanuki.png in the projects.avatar column for your project, and that the AttachmentUploader class contains the configuration data and directory scheme. For example, if the project ID is 123, the actual file may be in /var/opt/gitlab/gitlab-rails/uploads/-/system/project/avatar/123/tanuki.png. The directory /var/opt/gitlab/gitlab-rails/uploads/-/system/project/avatar/123/ was chosen by the Uploader using, among other things, configuration (/var/opt/gitlab/gitlab-rails/uploads), the model name (project), the model ID (123), and the mount point (avatar).

The Uploader determines the individual storage directory of your upload. The mountpoint column in your model contains the filename.

You never access the mountpoint column directly because CarrierWave defines a getter and setter on your model that operate on file handle objects.
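The getter and setter idea can be imitated in a few lines of plain Ruby. This is a toy model, not CarrierWave's implementation: ToyMount, ToyUploader, FakeFile, and the columns hash are all invented for this sketch. It only illustrates that the model's column ends up holding a filename while callers work with an uploader object.

```ruby
# Hypothetical stand-in for an uploaded file handle.
FakeFile = Struct.new(:path)

# Toy uploader: remembers the file and exposes the filename.
class ToyUploader
  attr_reader :file

  def initialize(model, mounted_as)
    @model = model
    @mounted_as = mounted_as
  end

  def assign(file)
    @file = file
  end

  # The value that would be written to the mountpoint column.
  def identifier
    File.basename(@file.path) if @file
  end
end

# Toy version of "mounting": defines a getter and a setter on the model.
module ToyMount
  def mount_uploader(column, uploader_class)
    define_method(column) do
      @uploaders ||= {}
      @uploaders[column] ||= uploader_class.new(self, column)
    end

    define_method("#{column}=") do |file|
      uploader = public_send(column)
      uploader.assign(file)
      @columns[column] = uploader.identifier # only the filename is stored
    end
  end
end

class Project
  extend ToyMount
  attr_reader :columns # stands in for the database row

  def initialize
    @columns = {}
  end

  mount_uploader :avatar, ToyUploader
end

project = Project.new
project.avatar = FakeFile.new("/tmp/tanuki.png")
puts project.columns[:avatar] # prints "tanuki.png"
puts project.avatar.class     # prints "ToyUploader"
```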

Optional Uploader behaviors

Besides determining the storage directory for your upload, a CarrierWave Uploader can implement several other behaviors via callbacks. Not all of these behaviors are usable in GitLab. In particular, you currently cannot use the version mechanism of CarrierWave. Things you can do include:

  • Filename validation
  • Incompatible with direct upload: one-time pre-processing of file contents, such as image resizing
  • Incompatible with direct upload: encryption at rest

CarrierWave pre-processing behaviors, such as image resizing or encryption, require local access to the uploaded file. This forces you to upload the processed file from Ruby. That runs counter to direct upload, which is all about not doing the upload in Ruby. If you use direct upload with an Uploader that has pre-processing behaviors, the pre-processing behaviors are silently skipped.

CarrierWave storage engines

CarrierWave has 2 storage engines:

CarrierWave class | GitLab name | Description
CarrierWave::Storage::File | ObjectStorage::Store::LOCAL | Local files, accessed through the Ruby standard library
CarrierWave::Storage::Fog | ObjectStorage::Store::REMOTE | Cloud files, accessed through the Fog gem

GitLab uses both of these engines, depending on configuration.

The typical way to choose a storage engine in CarrierWave is to use the Uploader.storage class method. In GitLab we do not do this; we have overridden Uploader#storage instead. This allows us to vary the storage engine file by file.
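The difference between a class-level and an instance-level storage choice can be sketched in plain Ruby. Everything below is hypothetical: the Store constants and their values, both uploader classes, and the Artifact struct are invented for illustration and do not reproduce GitLab's ObjectStorage code.

```ruby
# Hypothetical constants standing in for the local and remote store markers.
module Store
  LOCAL = 1
  REMOTE = 2
end

# Plain CarrierWave style: one engine for every file of this uploader.
class ClassLevelUploader
  def self.storage
    :file
  end

  def storage
    self.class.storage
  end
end

# Instance-level override: the engine depends on the individual record.
class PerFileUploader
  def initialize(record)
    @record = record
  end

  def storage
    @record.file_store == Store::REMOTE ? :fog : :file
  end
end

# Stand-in for a model with a file_store column.
Artifact = Struct.new(:file_store)

puts PerFileUploader.new(Artifact.new(Store::LOCAL)).storage  # prints "file"
puts PerFileUploader.new(Artifact.new(Store::REMOTE)).storage # prints "fog"
```

With the per-instance variant, two records of the same model can live in different storage engines at the same time, which is what a gradual migration between engines requires.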

CarrierWave file lifecycle

An Uploader is associated with two storage areas: regular storage and cache storage. Each has its own storage engine. If you assign a file to a mount point setter (project.avatar = File.open('/tmp/tanuki.png')), the file is copied or moved to cache storage as a side effect via the cache! method. To persist the file, something must call the store! method. This happens either via ActiveRecord callbacks or by calling store! on an Uploader instance.

Typically you do not need to interact with cache! and store!, but if you need to debug GitLab CarrierWave modifications it is useful to know that they are there and that they always get called. Specifically, it is good to know that CarrierWave pre-processing behaviors (process and so on) are implemented as before :cache hooks, and that in the case of direct upload these hooks are ignored and do not run.

Direct upload skips all CarrierWave before :cache hooks.
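A toy lifecycle makes the skipped hooks visible. The classes below are invented for this sketch and only mimic the shape of the cache! and store! steps; in particular, cache_direct_upload! is a hypothetical name for "the file handle already points at a temporary remote location", not a CarrierWave or GitLab method.

```ruby
# Toy uploader with a before-cache hook list (not CarrierWave itself).
class ToyLifecycleUploader
  def self.before_cache_hooks
    @before_cache_hooks ||= []
  end

  def self.before_cache(&block)
    before_cache_hooks << block
  end

  attr_reader :log

  def initialize
    @log = []
  end

  # Regular path: hooks run, then the file lands in cache storage.
  def cache!(file)
    self.class.before_cache_hooks.each { |hook| instance_exec(file, &hook) }
    @cached = file
    @log << "cached #{file}"
  end

  # Hypothetical direct upload path: the hooks are never invoked.
  def cache_direct_upload!(remote_file)
    @cached = remote_file
    @log << "cached #{remote_file} (direct upload)"
  end

  def store!
    @log << "stored #{@cached}"
  end
end

class ResizingUploader < ToyLifecycleUploader
  before_cache { |file| log << "resized #{file}" }
end

regular = ResizingUploader.new
regular.cache!("tanuki.png")
regular.store!
puts regular.log.inspect
# prints ["resized tanuki.png", "cached tanuki.png", "stored tanuki.png"]

direct = ResizingUploader.new
direct.cache_direct_upload!("tmp/tanuki.png")
direct.store!
puts direct.log.inspect
# prints ["cached tmp/tanuki.png (direct upload)", "stored tmp/tanuki.png"]
```

Neither run raises an error, which mirrors the point above: with direct upload the pre-processing simply does not happen.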

GitLab changes to CarrierWave

GitLab uses a modified version of CarrierWave to enable many things.

Migrating data between storage engines

In app/uploaders/object_storage.rb there is code for migrating user data between local storage and object storage. This code exists because, for a long time, GitLab.com stored uploads on local storage via NFS. This changed when we had to move the uploads to object storage as part of an infrastructure migration.

This is why the CarrierWave storage varies from upload to upload in GitLab, and why we have database columns like uploads.store or ci_job_artifacts.file_store.

Direct upload via Workhorse

Workhorse direct upload is a mechanism that lets us accept large uploads without spending a lot of Ruby CPU time. Workhorse is written in Go, and goroutines have a much lower resource footprint than Ruby threads.

Direct upload works as follows.

  1. Workhorse accepts a user upload request
  2. Workhorse pre-authenticates the request with Rails and receives a temporary upload location
  3. Workhorse stores the file upload from the user's request in the temporary upload location
  4. Workhorse propagates the request to Rails
  5. Rails issues a remote copy operation to copy the uploaded file from its temporary location to the final location
  6. Rails deletes the temporary upload
  7. Workhorse deletes the temporary upload a second time in case Rails timed out
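The seven steps can be walked through with an in-memory simulation. This is a sketch under stated assumptions: ToyBucket, ToyRails, ToyWorkhorse, and the key names are invented here, and a Ruby hash stands in for the object storage bucket. It shows the ordering of the steps and why the second delete is harmless, not how Workhorse or Rails actually implement them.

```ruby
# In-memory stand-in for an object storage bucket.
class ToyBucket
  def initialize
    @objects = {}
  end

  def put(key, data)
    @objects[key] = data
  end

  def copy(from, to)
    @objects[to] = @objects.fetch(from)
  end

  # Deleting a missing key does nothing, so deleting twice is safe.
  def delete(key)
    @objects.delete(key)
  end

  def keys
    @objects.keys
  end
end

class ToyRails
  def initialize(bucket)
    @bucket = bucket
  end

  # Step 2: pre-authorize and hand out a temporary location (made-up key).
  def authorize
    { temp_key: "tmp/uploads/upload-1" }
  end

  # Steps 5 and 6: remote copy to the final location, then delete the temp.
  def finalize(temp_key, final_key)
    @bucket.copy(temp_key, final_key)
    @bucket.delete(temp_key)
  end
end

class ToyWorkhorse
  def initialize(rails, bucket)
    @rails = rails
    @bucket = bucket
  end

  def handle_upload(body, final_key)         # step 1: accept the request
    temp_key = @rails.authorize[:temp_key]   # step 2
    @bucket.put(temp_key, body)              # step 3
    @rails.finalize(temp_key, final_key)     # steps 4 to 6
  ensure
    @bucket.delete(temp_key) if temp_key     # step 7: clean up regardless
  end
end

bucket = ToyBucket.new
workhorse = ToyWorkhorse.new(ToyRails.new(bucket), bucket)
workhorse.handle_upload("png bytes", "project/avatar/123/tanuki.png")
puts bucket.keys.inspect
# prints ["project/avatar/123/tanuki.png"]
```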

Typically, cache! returns an instance of CarrierWave::SanitizedFile, and store! then uploads that file using Fog.

In the case of object storage, the copy from the temporary location to the final location is implemented through Rails fooling CarrierWave with the GitLab-specific modifications. When CarrierWave tries to cache! the upload, we return a CarrierWave::Storage::Fog::File file handle that points to the temporary file. During the store! phase, CarrierWave then copies this file to its intended location.

Tables

The Scalability::Frameworks team is going to make object storage and uploads easier to use and more robust. If you add or change uploaders, it helps us if you update these tables too. This helps us keep an overview of where and how uploaders are used.

Feature bucket details

Feature | Upload technology | Uploader | Bucket structure
Job artifacts | direct upload | workhorse | /artifacts////
Pipeline artifacts | carrierwave | sidekiq | /artifacts//pipelines//artifacts/
Live job traces | fog | sidekiq | /artifacts/tmp/builds//chunks/.log
Job traces archive | carrierwave | sidekiq | /artifacts/////job.log
Autoscale runner caching | Not applicable | gitlab-runner | /gitlab-com-[platform-]runners-cache/???
Backups | Not applicable | s3cmd, awscli, or gcs | /gitlab-backup/???
Git LFS | direct upload | workhorse | /lfs-objects//
Design management files | disk buffering | rails controller | /lsf-object//
Design management thumbnails | carrierwave | sidekiq | /uploads/design_management/action/image_v432x230//
Generic file uploads | direct upload | workhorse | /uploads/@hashed/[0:2]/[2:4]///file
Generic file uploads - personal snippets | direct upload | workhorse | /uploads/personal_snippet//
Global appearance settings | disk buffering | rails controller | /uploads/appearance/...
Topics | disk buffering | rails controller | /uploads/projects/topic/...
Avatar images | direct upload | workhorse | /uploads/[user,group,project]/avatar/
Import | direct upload | workhorse | /uploads/import_export_upload/import_file//
Export | carrierwave | sidekiq | /uploads/import_export_upload/export_file//_-_export.tar.gz
GitLab Migration | carrierwave | sidekiq | /uploads/bulk_imports/???
MR diffs | carrierwave | sidekiq | /external-diffs/merge_request_diffs/mr-/diff-
Package manager assets (except for NPM) | direct upload | workhorse | /packages//packages//files/
NPM package manager assets | carrierwave | grape API | /packages//packages//files/
Debian package manager assets | direct upload | workhorse | /packages//debian_*/
Dependency Proxy cache | send_dependency | workhorse | /dependency-proxy//dependency_proxy//files/
Terraform state files | carrierwave | rails controller | /terraform//
Pages content archives | carrierwave | sidekiq | /gitlab-gprd-pages//pages_deployments//
Secure Files | carrierwave | sidekiq | /ci-secure-files//secure_files//

CarrierWave integration

File | CarrierWave usage | Categorized
app/models/project.rb | include Avatarable | Yes
app/models/projects/topic.rb | include Avatarable | Yes
app/models/group.rb | include Avatarable | Yes
app/models/user.rb | include Avatarable | Yes
app/models/terraform/state_version.rb | include FileStoreMounter | Yes
app/models/ci/job_artifact.rb | include FileStoreMounter | Yes
app/models/ci/pipeline_artifact.rb | include FileStoreMounter | Yes
app/models/pages_deployment.rb | include FileStoreMounter | Yes
app/models/lfs_object.rb | include FileStoreMounter | Yes
app/models/dependency_proxy/blob.rb | include FileStoreMounter | Yes
app/models/dependency_proxy/manifest.rb | include FileStoreMounter | Yes
app/models/packages/composer/cache_file.rb | include FileStoreMounter | Yes
app/models/packages/package_file.rb | include FileStoreMounter | Yes
app/models/concerns/packages/debian/component_file.rb | include FileStoreMounter | Yes
ee/app/models/issuable_metric_image.rb | include FileStoreMounter |
ee/app/models/vulnerabilities/remediation.rb | include FileStoreMounter |
ee/app/models/vulnerabilities/export.rb | include FileStoreMounter |
app/models/packages/debian/project_distribution.rb | include Packages::Debian::Distribution | Yes
app/models/packages/debian/group_distribution.rb | include Packages::Debian::Distribution | Yes
app/models/packages/debian/project_component_file.rb | include Packages::Debian::ComponentFile | Yes
app/models/packages/debian/group_component_file.rb | include Packages::Debian::ComponentFile | Yes
app/models/merge_request_diff.rb | mount_uploader :external_diff, ExternalDiffUploader | Yes
app/models/note.rb | mount_uploader :attachment, AttachmentUploader | Yes
app/models/appearance.rb | mount_uploader :logo, AttachmentUploader | Yes
app/models/appearance.rb | mount_uploader :header_logo, AttachmentUploader | Yes
app/models/appearance.rb | mount_uploader :favicon, FaviconUploader | Yes
app/models/project.rb | mount_uploader :bfg_object_map, AttachmentUploader |
app/models/import_export_upload.rb | mount_uploader :import_file, ImportExportUploader | Yes
app/models/import_export_upload.rb | mount_uploader :export_file, ImportExportUploader | Yes
app/models/ci/deleted_object.rb | mount_uploader :file, DeletedObjectUploader |
app/models/design_management/action.rb | mount_uploader :image_v432x230, DesignManagement::DesignV432x230Uploader | Yes
app/models/concerns/packages/debian/distribution.rb | mount_uploader :signed_file, Packages::Debian::DistributionReleaseFileUploader | Yes
app/models/bulk_imports/export_upload.rb | mount_uploader :export_file, ExportUploader | Yes
ee/app/models/user_permission_export_upload.rb | mount_uploader :file, AttachmentUploader |
app/models/ci/secure_file.rb | include FileStoreMounter |
Article information

Author: Dr. Pierre Goyette

Last Updated: 04/02/2023
