docs/content/admin/enterprise-management/configuring-clustering/rebalancing-cluster-workloads.md at 52937ae5cae45a71b748db260f0de9c15f680adf

mirror of synced 2025-12-19 09:57:42 -05:00


Laura Coursen 52937ae5ca GitHub Enterprise Server 3.9 release candidate (#36631)

Co-authored-by: Rachael Sewell <rachmari@github.com>
Co-authored-by: Rachael Rose Renk <91027132+rachaelrenk@users.noreply.github.com>
Co-authored-by: David Jarzebowski <davidjarzebowski@github.com>
Co-authored-by: Anne-Marie <102995847+am-stead@users.noreply.github.com>
Co-authored-by: Matt Pollard <mattpollard@users.noreply.github.com>
Co-authored-by: Steve Guntrip <stevecat@github.com>
Co-authored-by: Isaac Brown <101839405+isaacmbrown@users.noreply.github.com>
Co-authored-by: Sam Browning <106113886+sabrowning1@users.noreply.github.com>
Co-authored-by: Torsten Walter <torstenwalter@github.com>
Co-authored-by: Henry Mercer <henrymercer@github.com>
Co-authored-by: Sarah Edwards <skedwards88@github.com>

2023-06-08 17:40:16 +00:00


---
title: Rebalancing cluster workloads
shortTitle: Rebalance workloads
intro: "You can force your {% data variables.product.product_name %} cluster to evenly distribute job allocations for workloads on the cluster's nodes."
product: '{% data reusables.gated-features.cluster %}'
permissions: 'People with administrative SSH access to a {% data variables.product.product_name %} instance can rebalance cluster workloads on the instance.'
versions:
  feature: cluster-rebalancing
type: how_to
topics:
  - Clustering
  - Enterprise
---

About workload balance for a {% data variables.product.product_name %} cluster

A {% data variables.product.product_name %} instance in a cluster configuration assigns each task to a node according to the node's role. This assignment is called an allocation.

If a cluster node is unreachable by other nodes due to a hardware or software failure, your instance creates a new allocation to distribute jobs from the unhealthy node to another node that can handle the workload. In some situations, this distribution does not occur automatically, and a single node may run more jobs than expected.

You can manage allocations using the `ghe-cluster-balance` utility, which can display the status of existing allocations or force your instance to balance allocations. For example, you should balance allocations after you add a new node to the cluster. Optionally, you can schedule regular balancing.

You can run the following commands from any node in your cluster using the administrative shell. For more information, see "Accessing the administrative shell (SSH)."

Checking the distribution of cluster jobs

In some cases, such as hardware failure, the underlying software that manages allocations will migrate tasks from the unhealthy node to a healthy node. If the unhealthy node recovers, the task may remain assigned to the recovered node, which can result in unbalanced load. The risk of job failure may increase if allocations are unbalanced and additional nodes fail. You can check the distribution of allocations using the `ghe-cluster-balance status` utility.

To see a list of allocations, run the following command. The utility displays healthy allocations in green. If any jobs are not properly distributed, the utility displays the allocation's count in red.
```
ghe-cluster-balance status
```
If a job is not properly distributed, inspect the allocations by running the following command. Replace JOB with a single job or comma-delimited list of jobs.
```
ghe-cluster-balance status -j JOB
```
For example, to see the status of allocations for your instance's HTTP server and authorization service, you can run `ghe-cluster-balance status -j github-unicorn,authzd`.
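When several jobs need inspecting, the comma-delimited value for `-j` can be assembled from separate job names. The following is a minimal sketch, not part of the utility: `join_jobs` is a hypothetical helper name, and the last line only prints the command so that you can review it before running it on a cluster node.

```shell
# join_jobs is a hypothetical helper (not shipped with the appliance).
# It joins its arguments with commas, the format that -j expects.
# The subshell body keeps the IFS change from leaking into the caller.
join_jobs() (
  IFS=,
  printf '%s\n' "$*"
)

jobs_arg=$(join_jobs github-unicorn authzd)

# Print the resulting command for review instead of executing it.
echo "ghe-cluster-balance status -j ${jobs_arg}"
```

The job names used here are the two that this article mentions; substitute the jobs that the status output reports as unbalanced.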

Rebalancing allocations

After you determine which jobs are unbalanced across your cluster's nodes, you can rebalance allocations using the `ghe-cluster-balance rebalance` utility. The utility checks the distribution of existing jobs. If any jobs are unbalanced, the utility displays the jobs and prompts you to continue. If you continue, the utility creates new allocations to redistribute the jobs.

To perform a dry run and see the result of rebalancing without making changes, run the following command. Replace JOB with a single job or comma-delimited list of jobs.
```
ghe-cluster-balance rebalance --dry-run -j JOB
```
For example, to perform a dry run of rebalancing jobs for your instance's HTTP server and authorization service, you can run `ghe-cluster-balance rebalance --dry-run -j github-unicorn,authzd`.

To rebalance, run the following command. Replace JOB with a single job or comma-delimited list of jobs.
```
ghe-cluster-balance rebalance -j JOB
```
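The dry run and the rebalance can be chained so that the preview always comes first. The sketch below is illustrative only: `preview_then_rebalance` and the `BALANCE_CMD` override are hypothetical names, and the sketch assumes the dry run exits with a non-zero status when it fails, which this article does not confirm.

```shell
# BALANCE_CMD is a hypothetical override so the sketch can be tried
# without a cluster; by default it points at the real utility.
BALANCE_CMD=${BALANCE_CMD:-ghe-cluster-balance}

# Show the dry-run result first, then start the rebalance.
# Assumption: a failed dry run returns a non-zero exit status.
preview_then_rebalance() {
  job_list=$1
  "$BALANCE_CMD" rebalance --dry-run -j "$job_list" || return 1
  # The utility itself prompts before creating new allocations.
  "$BALANCE_CMD" rebalance -j "$job_list"
}

# Usage on a cluster node:
#   preview_then_rebalance github-unicorn,authzd
```

Because the utility prompts before it creates new allocations, the second step still waits for your confirmation.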

Scheduling allocation rebalancing

You can schedule rebalancing of jobs on your cluster by setting and applying configuration values for {% data variables.location.product_location %}.

{% note %}

Note: Currently, you can only schedule reallocation of jobs for the HTTP server, `github-unicorn`.

{% endnote %}

To configure automatic, hourly balancing of jobs, run the following command.
```
ghe-config app.cluster-rebalance.enabled true
```
Optionally, you can override the default schedule by defining a cron expression. For example, run the following command to balance jobs every three hours.
```
ghe-config app.cluster-rebalance.schedule '0 */3 * * *'
```
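The schedule value above follows the usual five-field cron layout (minute, hour, day of month, month, day of week), so `0 */3 * * *` fires at minute 0 of every third hour. As a rough sketch, a small helper can build the same kind of expression for other intervals. `every_n_hours` is a hypothetical name, and the final line only prints the `ghe-config` command rather than running it.

```shell
# every_n_hours is a hypothetical helper: it prints a cron expression
# that fires at minute 0 of every Nth hour (N between 1 and 23).
every_n_hours() {
  n=$1
  case $n in
    ''|*[!0-9]*)
      echo "expected a whole number of hours" >&2
      return 1
      ;;
  esac
  if [ "$n" -lt 1 ] || [ "$n" -gt 23 ]; then
    echo "hours must be between 1 and 23" >&2
    return 1
  fi
  printf '0 */%s * * *\n' "$n"
}

schedule=$(every_n_hours 3)

# Print the configuration command for review instead of executing it.
echo "ghe-config app.cluster-rebalance.schedule '${schedule}'"
```

In standard cron, a step value in the hour field restarts at midnight each day, so intervals that do not divide 24 evenly produce a shorter gap before midnight.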

{% data reusables.enterprise.apply-configuration %}
