[Hotfix] Update Going to Production in GitHub Models (#56312)
Co-authored-by: bkoshea <78511330+bkoshea@users.noreply.github.com>
@@ -59,7 +59,7 @@ For accounts that use a custom model with a third-party model provider, billing
## Opting in to paid usage

-> [!NOTE] Once you opt in to paid usage, you will have access to production grade rate limits and be billed for all usage thereafter. For more information about these rate limits, see [Azure AI Foundry Models quotas and limits](https://learn.microsoft.com/en-us/azure/ai-foundry/model-inference/quotas-limits) in the Azure documentation.
+{% data reusables.github-models.production-rate-limits-note %}

Enterprises and organizations can opt in to paid usage to access expanded model capabilities, including increased request allowances and larger context windows. You can manage this spending by setting a budget.
@@ -73,7 +73,7 @@ The steps to use each model are similar. In general, you will need to:
1. Optionally, use the language dropdown to select the programming language.
1. Optionally, use the SDK dropdown to select which SDK to use.

-All models can be used with the Azure AI Inference SDK, and some models support additional SDKs. If you want to easily switch between models, you should select "Azure AI Inference SDK". If you selected "REST" as the language, you won't use an SDK. Instead, you will use the API endpoint directly. {% ifversion fpt %} See [{% data variables.product.prodname_github_models %} REST API](/rest/models?apiVersion=2022-11-28). {% endif %}
+All models can be used with the Azure AI Inference SDK, and some models support additional SDKs. If you want to easily switch between models, you should select "Azure AI Inference SDK." If you selected "REST" as the language, you won't use an SDK. Instead, you will use the API endpoint directly. {% ifversion fpt %} See [{% data variables.product.prodname_github_models %} REST API](/rest/models?apiVersion=2022-11-28). {% endif %}

1. Either open a codespace, or set up your local environment:
   * To run in a codespace, click **{% octicon "codespaces" aria-hidden="true" aria-label="codespaces" %} Run codespace**, then click **Create new codespace**.
   * To run locally:
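As a concrete illustration of the SDK step in this hunk, here is a minimal sketch of calling a model through the Azure AI Inference SDK with a personal access token, in Python. The endpoint and model name are assumptions for illustration only; copy the exact values from the code sample the playground generates for the model you selected.

```python
import os

# Requires: pip install azure-ai-inference
from azure.ai.inference import ChatCompletionsClient
from azure.ai.inference.models import SystemMessage, UserMessage
from azure.core.credentials import AzureKeyCredential

# Assumed endpoint and model name; use the values shown in the playground.
ENDPOINT = "https://models.inference.ai.azure.com"
MODEL = "gpt-4o-mini"

# Authenticate with a personal access token exposed as GITHUB_TOKEN.
client = ChatCompletionsClient(
    endpoint=ENDPOINT,
    credential=AzureKeyCredential(os.environ["GITHUB_TOKEN"]),
)

response = client.complete(
    model=MODEL,
    messages=[
        SystemMessage(content="You are a helpful assistant."),
        UserMessage(content="What is the capital of France?"),
    ],
)
print(response.choices[0].message.content)
```

Because every model is reachable through the same client, switching models comes down to changing the `model` value, which is why selecting the Azure AI Inference SDK makes it easy to switch between models.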
@@ -131,16 +131,20 @@ If you prefer to experiment with AI models in your IDE, you can install the AI T
## Going to production
-The rate limits for the playground and free API usage are intended to help you experiment with models and develop your AI application. Once you are ready to bring your application to production, you can use a token from a paid Azure account instead of your {% data variables.product.company_short %} {% data variables.product.pat_generic %}. You don't need to change anything else in your code.
+The free rate limits provided in the playground and API usage are intended to help you get started with experimentation. When you are ready to move beyond the free offering, you have two options for accessing AI models:

+* You can opt in to paid usage for {% data variables.product.prodname_github_models %}, allowing your organization to access increased rate limits, larger context windows, and additional features. See [AUTOTITLE](/billing/managing-billing-for-your-products/about-billing-for-github-models).

-For more information, see the [Azure AI](https://aka.ms/azureai/github-models) documentation.
+* If you have an existing OpenAI or Azure subscription, you can bring your own API keys (BYOK) to access custom models. Billing and usage are managed directly through your provider account, such as your Azure Subscription ID. See [AUTOTITLE](/github-models/github-models-at-scale/set-up-custom-model-integration-models-byok).
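As a rough sketch of how little the calling code changes for either option, the client construction from the example above stays the same and only the credential and endpoint differ. The environment variable names and the default endpoint below are assumptions for illustration, not values defined by this article or the linked guides.

```python
import os

from azure.ai.inference import ChatCompletionsClient
from azure.core.credentials import AzureKeyCredential

# Assumed variables: a provider key for paid or BYOK use, falling back to the
# free-tier personal access token for experimentation.
api_key = os.environ.get("PROVIDER_API_KEY") or os.environ["GITHUB_TOKEN"]

# Assumed endpoint variable; for BYOK, point this at the inference endpoint
# your provider (for example, your Azure subscription) gives you.
endpoint = os.environ.get("MODELS_ENDPOINT", "https://models.inference.ai.azure.com")

client = ChatCompletionsClient(
    endpoint=endpoint,
    credential=AzureKeyCredential(api_key),
)
```

Keeping the key and endpoint in configuration is one way to move beyond the free limits without touching the rest of the calling code.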
## Rate limits

+{% data reusables.github-models.production-rate-limits-note %}
The playground and free API usage are rate limited by requests per minute, requests per day, tokens per request, and concurrent requests. If you get rate limited, you will need to wait for the rate limit that you hit to reset before you can make more requests.

Low, high, and embedding models have different rate limits. To see which type of model you are using, refer to the model's information in {% data variables.product.prodname_marketplace %}.

+For custom models accessed with your own API keys, rate limits are set and enforced by your model provider.
<table>
<tr>
<th scope="col" style="width:15%"><b>Rate limit tier</b></th>
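To illustrate the waiting behavior described in the rate limits section above, here is a minimal sketch that retries a REST request after an HTTP 429 response. The endpoint, model name, and payload shape are assumptions for illustration; use the request format from the REST code sample for your model.

```python
import os
import time

import requests

# Assumed endpoint and model name; use the values from the REST code sample
# shown in the playground for your model.
ENDPOINT = "https://models.inference.ai.azure.com/chat/completions"
MODEL = "gpt-4o-mini"


def chat(prompt: str, max_retries: int = 3) -> str:
    """Send one chat request, waiting out HTTP 429 responses before retrying."""
    headers = {
        "Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}",
        "Content-Type": "application/json",
    }
    body = {"model": MODEL, "messages": [{"role": "user", "content": prompt}]}

    for attempt in range(max_retries + 1):
        response = requests.post(ENDPOINT, headers=headers, json=body, timeout=30)
        if response.status_code == 429 and attempt < max_retries:
            # Wait for the limit to reset; honor Retry-After if the server sends it.
            wait_seconds = int(response.headers.get("Retry-After", 2 ** attempt * 10))
            time.sleep(wait_seconds)
            continue
        response.raise_for_status()
        return response.json()["choices"][0]["message"]["content"]

    raise RuntimeError("Still rate limited after retries")


print(chat("Say hello in one short sentence."))
```

Backing off like this only helps with per-minute and concurrency limits; if you hit the daily limits regularly, the options under Going to production are the intended path.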
@@ -0,0 +1 @@
+> [!NOTE] Once you opt in to paid usage, you will have access to production-grade rate limits and be billed for all usage thereafter. For more information about these rate limits, see [Azure AI Foundry Models quotas and limits](https://learn.microsoft.com/en-us/azure/ai-foundry/model-inference/quotas-limits) in the Azure documentation.