
Update migration docs to include guidance for using GitHub owned blob storage (#57122)

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Daniel Perez <dpmex4527@github.com>
Co-authored-by: Isaac Brown <101839405+isaacmbrown@users.noreply.github.com>
Co-authored-by: isaacmbrown <isaacmbrown@github.com>
Jim Boyle committed 2025-08-18 12:26:25 -04:00 (committed by GitHub)
parent f1356d109c
commit ef7d0f44ea
5 changed files with 129 additions and 13 deletions


@@ -69,30 +69,41 @@ To migrate your repositories from {% data variables.product.prodname_ghe_server
## Step 3: Set up blob storage
-{% data reusables.enterprise-migration-tool.blob-storage-intro %}
+When performing a repository migration, you must store your repository data in a place that {% data variables.product.prodname_importer_proper_name %} can access. This can be accomplished by:
+* Using local storage on the GHES instance (GHES **3.16** and later)
+* Using a blob storage provider
+### Using local storage (GHES 3.16+)
+{% data reusables.enterprise-migration-tool.local-storage-steps %}
+### Using a blob storage provider
+If your {% data variables.product.prodname_ghe_server %} instance is behind a firewall, you may need to set up blob storage with an external cloud service.
First, you must set up blob storage with a supported provider. Then, if you're using a cloud provider, you must configure your credentials for the storage provider in the {% data variables.enterprise.management_console %} or {% data variables.product.prodname_cli %}.
{% data reusables.enterprise-migration-tool.supported-blob-storage-providers %}
> [!NOTE]
> You only need to configure blob storage if you use {% data variables.product.prodname_ghe_server %} versions 3.8 or higher. If you use {% data variables.product.prodname_ghe_server %} versions 3.7 or lower, skip to [Step 4: Set up a migration source in {% data variables.product.prodname_ghe_cloud %}](#step-4-set-up-a-migration-source-in-github-enterprise-cloud).
>
> Blob storage is required to migrate repositories with large Git source or metadata. If you use {% data variables.product.prodname_ghe_server %} versions 3.7 or lower, you will not be able to perform migrations where your Git source or metadata exports exceed 2GB. To perform these migrations, update to {% data variables.product.prodname_ghe_server %} versions 3.8 or higher.
-### Setting up an AWS S3 storage bucket
+#### Setting up an AWS S3 storage bucket
{% data reusables.enterprise-migration-tool.set-up-aws-bucket %}
-### Setting up an Azure Blob Storage account
+#### Setting up an Azure Blob Storage account
{% data reusables.enterprise-migration-tool.set-up-azure-storage-account %}
-### Using local storage (GHES 3.16+)
-{% data reusables.enterprise-migration-tool.local-storage-steps %}
-### Configuring blob storage in the {% data variables.enterprise.management_console %} of {% data variables.location.product_location_enterprise %}
+#### Configuring blob storage in the {% data variables.enterprise.management_console %} of {% data variables.location.product_location_enterprise %}
{% data reusables.enterprise-migration-tool.blob-storage-management-console %}
-### Allowing network access
+#### Allowing network access
If you have configured firewall rules on your storage account, ensure you have allowed access to the IP ranges for your migration destination. See [AUTOTITLE](/migrations/using-github-enterprise-importer/migrating-between-github-products/managing-access-for-a-migration-between-github-products#configuring-ip-allow-lists-for-migrations).
@@ -239,6 +250,78 @@ If you're using {% data variables.product.prodname_ghe_server %} 3.8 or higher,
You may need to allowlist {% data variables.product.company_short %}'s IP ranges. For more information, see [AUTOTITLE](/migrations/using-github-enterprise-importer/migrating-between-github-products/managing-access-for-a-migration-between-github-products#configuring-ip-allow-lists-for-migrations).
### Uploading your migration archives to {% data variables.product.prodname_ghos %}
> [!NOTE]
> Repository migrations with {% data variables.product.prodname_ghos %} are currently in {% data variables.release-phases.public_preview %} and subject to change.
If you're using {% data variables.product.prodname_ghos %}, you will upload your archive using the following process:
1. Create a multipart upload by submitting a POST request
1. Upload the archive in parts of up to 100 MB each with PATCH requests
1. Submit a PUT request to complete the upload
#### 1. Create the multipart upload
You will submit a POST request to:
```http
https://uploads.github.com/organizations/{organization_id}/gei/archive/blobs/uploads
```
Include a JSON body like the following, with the archive name and size. The content type can remain `"application/octet-stream"` for all uploads.
```json
{
"content_type": "application/octet-stream",
"name": "git-archive.tar.gz",
"size": 262144000
}
```
This returns a JSON response like the following:
```json
{
"guid": "363b2659-b8a3-4878-bfff-eed4bcb54d35",
"node_id": "MA_kgDaACQzNjNiMjY1OS1iOGEzLTQ4NzgtYmZmZi1lZWQ0YmNiNTRkMzU",
"name": "git-archive.tar.gz",
"size": 33287,
"uri": "gei://archive/363b2659-b8a3-4878-bfff-eed4bcb54d35",
"created_at": "2024-11-13T12:35:45.761-08:00"
}
```
The `uri` in this response represents the uploaded archive and will be used to enqueue the migration when you start your repository migration. The response also includes a location in the response header, which is used to upload the file parts with PATCH requests in the next step:
```http
/organizations/{organization_id}/gei/archive/blobs/uploads?part_number=1&guid=<guid>&upload_id=<upload_id>
```
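As an illustrative sketch, this request can be made with `curl`. The authentication header and environment variable names below are assumptions for the example, not part of the documented endpoint:

```shell
# Sketch: create the multipart upload. Assumes a personal access token in
# $TOKEN and your organization's numeric ID in $ORG_ID.
curl -s -D headers.txt \
  -X POST "https://uploads.github.com/organizations/$ORG_ID/gei/archive/blobs/uploads" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"content_type": "application/octet-stream", "name": "git-archive.tar.gz", "size": 262144000}'

# The upload location for the first PATCH request is returned in a response header.
grep -i '^location:' headers.txt
```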
#### 2. Upload the archive in multiple parts
Upload your file in parts of up to 100 MB each by sending PATCH requests to the location returned in the previous response header. Send the raw data in the request body; do not use a multipart form. If the final part of your file is less than 100 MB, upload only the remaining bytes in that last request:
```http
https://uploads.github.com/{location}
```
Each PATCH request returns an empty response body, with the next location in the response header:
```http
/organizations/{organization_id}/gei/archive/blobs/uploads?part_number=2&guid=<guid>&upload_id=<upload_id>
```
Repeat this process until the upload is complete, reading up to 100 MB of the file at a time and submitting each request to the new location value with its incremented `part_number`.
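A minimal sketch of that loop, splitting the archive into 100 MB chunks with `split` and following the location header returned by each request (variable names and the authentication header are assumptions):

```shell
# Sketch: upload the archive in parts of up to 100 MB each. $LOCATION starts
# as the upload path returned by the POST request in step 1.
split -b 100m git-archive.tar.gz part-
for part in part-*; do
  LOCATION=$(curl -s -X PATCH "https://uploads.github.com${LOCATION}" \
    -H "Authorization: Bearer $TOKEN" \
    -H "Content-Type: application/octet-stream" \
    --data-binary "@${part}" \
    -D - -o /dev/null | awk 'tolower($1) == "location:" {print $2}' | tr -d '\r')
done
```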
#### 3. Complete the upload
Submit a PUT request with an empty body to the last location value from the previous step, and your upload to GitHub-owned storage is complete. The GEI URI can be constructed from the GUID returned by the initial POST request, using the following format:
```http
gei://archive/{guid}
```
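Continuing the sketch above, the final request could look like this:

```shell
# Sketch: an empty-body PUT to the last location value completes the upload.
curl -s -X PUT "https://uploads.github.com${LOCATION}" \
  -H "Authorization: Bearer $TOKEN"

# The archive URI for starting the migration uses the GUID from step 1, e.g.
# gei://archive/363b2659-b8a3-4878-bfff-eed4bcb54d35
```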
## Step 7: Start your repository migration
{% data reusables.enterprise-migration-tool.start-repository-migration-ec %}
@@ -328,6 +411,19 @@ For {% data variables.product.pat_generic %} requirements, see [AUTOTITLE](/migr
## Step 4: Set up blob storage
### Migrating repositories with {% data variables.product.prodname_ghos %}
> [!NOTE]
> Repository migrations with {% data variables.product.prodname_ghos %} are currently in {% data variables.release-phases.public_preview %} and subject to change.
If you do not want to set up a customer-owned blob storage account for storing your repository archives and provide {% data variables.product.prodname_importer_proper_name %} with access to it, you can migrate repositories using {% data variables.product.prodname_ghos %}. To do so, you must be running v1.9.0 (or higher) of {% data variables.product.prodname_gei_cli %}. {% data variables.product.prodname_ghos %} does not require additional setup and is available as an option when you run {% data variables.product.prodname_gei_cli %} commands.
For security purposes, {% data variables.product.prodname_ghos %} is explicitly write-only, and downloads from {% data variables.product.prodname_ghos %} are not possible. After a migration is complete, the repository archives are immediately deleted. If an archive is uploaded and not used in a migration, the archive is deleted after 7 days.
When you use the CLI flag for {% data variables.product.prodname_ghos %}, the repository archive is automatically exported to the destination configured in the {% data variables.enterprise.management_console %}, uploaded to GitHub-owned storage, and imported to your migration destination. When using {% data variables.product.prodname_ghos %}, we recommend configuring local storage. See [Using local storage (GHES 3.16+)](#using-local-storage-ghes-316-1).
### Migrating repositories with customer-owned blob storage
{% data reusables.enterprise-migration-tool.blob-storage-intro %}
### Setting up an AWS S3 storage bucket
@@ -419,6 +515,7 @@ gh gei generate-script --github-source-org SOURCE \
| `--target-api-url TARGET-API-URL` | {% data reusables.enterprise-migration-tool.add-target-api-url %} |
| `--no-ssl-verify` | {% data reusables.enterprise-migration-tool.ssl-flag %} |
| `--download-migration-logs` | Download the migration log for each migrated repository. For more information about migration logs, see [AUTOTITLE](/migrations/using-github-enterprise-importer/completing-your-migration-with-github-enterprise-importer/accessing-your-migration-logs-for-github-enterprise-importer#downloading-all-migration-logs-for-an-organization). |
| `--use-github-storage` | Perform a repository migration using {% data variables.product.prodname_ghos %} as the intermediate blob storage solution. |
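For example, a hypothetical invocation that generates a migration script using {% data variables.product.prodname_ghos %} might look like this (organization names and URLs are placeholders):

```shell
gh gei generate-script --github-source-org SOURCE --github-target-org DESTINATION \
  --ghes-api-url https://ghes.example.com/api/v3 \
  --use-github-storage
```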
### Reviewing the migration script
@@ -495,6 +592,7 @@ gh gei migrate-repo --github-source-org SOURCE --source-repo CURRENT-NAME --gith
| `--no-ssl-verify` | {% data reusables.enterprise-migration-tool.ssl-flag %} |
| `--skip-releases` | {% data reusables.enterprise-migration-tool.skip-releases %} |
| `--target-repo-visibility TARGET-VISIBILITY` | {% data reusables.enterprise-migration-tool.set-repository-visibility %} |
| `--use-github-storage` | Perform a repository migration using {% data variables.product.prodname_ghos %} as the intermediate blob storage solution. |
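For example, migrating a single repository with {% data variables.product.prodname_ghos %} as the intermediate storage might look like this (names and URLs are placeholders):

```shell
gh gei migrate-repo --github-source-org SOURCE --source-repo CURRENT-NAME \
  --github-target-org DESTINATION --target-repo NEW-NAME \
  --ghes-api-url https://ghes.example.com/api/v3 \
  --use-github-storage
```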
#### Aborting the migration


@@ -94,6 +94,15 @@ You will first generate an archive of the data you want to migrate and push the
Before you can run a migration, you need to set up a storage container with your chosen cloud provider to store your data.
### Using {% data variables.product.prodname_ghos %}
> [!NOTE]
> Repository migrations with {% data variables.product.prodname_ghos %} are currently in {% data variables.release-phases.public_preview %} and subject to change.
If you do not want to set up and provide {% data variables.product.prodname_importer_proper_name %} with access to a blob storage account behind your firewall, you can migrate repositories with {% data variables.product.prodname_ghos %} using the `--use-github-storage` flag. To do so, you must be running v1.9.0 (or higher) of {% data variables.product.prodname_bbs2gh_cli %}.
For security purposes, {% data variables.product.prodname_ghos %} is explicitly write-only, and downloads from {% data variables.product.prodname_ghos %} are not possible. After a migration is complete, the repository archives are immediately deleted. If an archive is uploaded and not used in a migration, the archive is deleted after 7 days.
### Setting up an AWS S3 storage bucket
{% data reusables.enterprise-migration-tool.set-up-aws-bucket %}
@@ -143,10 +152,12 @@ gh bbs2gh migrate-repo --bbs-server-url BBS-SERVER-URL \
--ssh-user SSH-USER --ssh-private-key PATH-TO-KEY
# If your Bitbucket Server instance runs on Windows:
--smb-user SMB-USER
-# If you're using AWS S3 as your blob storage provider:
+# If you are using AWS S3 as your blob storage provider:
--aws-bucket-name AWS-BUCKET-NAME
# If you are running a Bitbucket Data Center cluster or your Bitbucket Server is behind a load balancer:
--archive-download-host ARCHIVE-DOWNLOAD-HOST
# If you are using GitHub-owned blob storage:
--use-github-storage
```
{% data reusables.enterprise-migration-tool.placeholder-table %}
@@ -208,7 +219,7 @@ gh bbs2gh migrate-repo --archive-path ARCHIVE-PATH \
--bbs-server-url BBS-SERVER-URL \
--bbs-project PROJECT \
--bbs-repo CURRENT-NAME \
-# If you're using AWS S3 as your blob storage provider:
+# If you are using AWS S3 as your blob storage provider:
--aws-bucket-name AWS-BUCKET-NAME
# If you are migrating to {% data variables.enterprise.data_residency_site %}:
--target-api-url TARGET-API-URL
@@ -258,6 +269,8 @@ gh bbs2gh generate-script --bbs-server-url BBS-SERVER-URL \
--smb-user SMB-USER
# If you are running a Bitbucket Data Center cluster or your Bitbucket Server is behind a load balancer:
--archive-download-host ARCHIVE-DOWNLOAD-HOST
# If you are using GitHub-owned blob storage:
--use-github-storage
```
{% data reusables.enterprise-migration-tool.download-migration-logs-flag %}


@@ -3,4 +3,5 @@ You must store your repository data in a place that {% data variables.product.pr
First, you must set up blob storage with a supported provider. Then, if you're using a cloud provider, you must configure your credentials for the storage provider in the {% data variables.enterprise.management_console %} or {% data variables.product.prodname_cli %}.
{% data reusables.enterprise-migration-tool.supported-blob-storage-providers %}
-* Local storage on the GHES instance (GHES **3.16** and later)
+* Local storage on the GHES instance (GHES **3.16** and later). We recommend using this option with {% data variables.product.prodname_ghos %}.


@@ -1,4 +1,7 @@
-When you run a migration with local storage, archive data is written to the disk on {% data variables.location.product_location_enterprise %}, without the need for a cloud storage provider. {% data variables.product.prodname_importer_proper_name %} will automatically retrieve the stored archive from {% data variables.product.prodname_ghe_server %}, unless you have blocked egress traffic from {% data variables.product.prodname_ghe_server %}.
+When you run a migration with local storage, archive data is written to the disk on {% data variables.location.product_location_enterprise %}, without the need for a cloud storage provider.
+* If you do not have firewall rules blocking egress traffic from {% data variables.product.prodname_ghe_server %}, {% data variables.product.prodname_importer_proper_name %} can automatically retrieve the stored archive from {% data variables.product.prodname_ghe_server %}.
+* If you do have firewall rules in place and don't want to allow access to {% data variables.product.prodname_importer_proper_name %}, you can upload your archive data to {% data variables.product.prodname_ghos %} for {% data variables.product.prodname_importer_proper_name %} to access. To do so manually, see [Uploading your migration archives to GitHub-owned blob storage](/migrations/using-github-enterprise-importer/migrating-between-github-products/migrating-repositories-from-github-enterprise-server-to-github-enterprise-cloud?tool=api#uploading-your-migration-archives-to-github-owned-blob-storage) in the API version of this article.
1. From an administrative account on {% data variables.product.prodname_ghe_server %}, in the upper-right corner of any page, click {% octicon "rocket" aria-label="Site admin" %}.
{% data reusables.enterprise_site_admin_settings.management-console %}


@@ -51,6 +51,7 @@ prodname_ado2gh_cli_short: ADO2GH extension
prodname_bbs2gh: BBS2GH
prodname_bbs2gh_cli: BBS2GH extension of the GitHub CLI
prodname_bbs2gh_cli_short: BBS2GH extension
prodname_ghos: GitHub-owned blob storage
# GitHub Education
prodname_education: 'GitHub Education'