diff --git a/content/billing/managing-billing-for-your-products/about-billing-for-github-models.md b/content/billing/managing-billing-for-your-products/about-billing-for-github-models.md
index 77421a9bc9..524b79b979 100644
--- a/content/billing/managing-billing-for-your-products/about-billing-for-github-models.md
+++ b/content/billing/managing-billing-for-your-products/about-billing-for-github-models.md
@@ -59,7 +59,7 @@ For accounts that use a custom model with a third-party model provider, billing
 
 ## Opting in to paid usage
 
-> [!NOTE] Once you opt in to paid usage, you will have access to production grade rate limits and be billed for all usage thereafter. For more information about these rate limits, see [Azure AI Foundry Models quotas and limits](https://learn.microsoft.com/en-us/azure/ai-foundry/model-inference/quotas-limits) in the Azure documentation.
+{% data reusables.github-models.production-rate-limits-note %}
 
 Enterprises and organizations can opt in to paid usage to access expanded model capabilities, including increased request allowances and larger context windows. You can manage their spending by setting a budget.
diff --git a/content/github-models/use-github-models/prototyping-with-ai-models.md b/content/github-models/use-github-models/prototyping-with-ai-models.md
index eaf1aa9196..05a49aa606 100644
--- a/content/github-models/use-github-models/prototyping-with-ai-models.md
+++ b/content/github-models/use-github-models/prototyping-with-ai-models.md
@@ -73,7 +73,7 @@ The steps to use each model are similar. In general, you will need to:
 1. Optionally, use the language dropdown to select the programming language.
 1. Optionally, use the SDK dropdown to select which SDK to use.
 
-   All models can be used with the Azure AI Inference SDK, and some models support additional SDKs. If you want to easily switch between models, you should select "Azure AI Inference SDK". If you selected "REST" as the language, you won't use an SDK. Instead, you will use the API endpoint directly. {% ifversion fpt %} See [{% data variables.product.prodname_github_models %} REST API](/rest/models?apiVersion=2022-11-28). {% endif %}
+   All models can be used with the Azure AI Inference SDK, and some models support additional SDKs. If you want to easily switch between models, you should select "Azure AI Inference SDK." If you selected "REST" as the language, you won't use an SDK. Instead, you will use the API endpoint directly. {% ifversion fpt %} See [{% data variables.product.prodname_github_models %} REST API](/rest/models?apiVersion=2022-11-28). {% endif %}
 1. Either open a codespace, or set up your local environment:
    * To run in a codespace, click **{% octicon "codespaces" aria-hidden="true" aria-label="codespaces" %} Run codespace**, then click **Create new codespace**.
    * To run locally:
@@ -131,16 +131,20 @@ If you prefer to experiment with AI models in your IDE, you can install the AI T
 
 ## Going to production
 
-The rate limits for the playground and free API usage are intended to help you experiment with models and develop your AI application. Once you are ready to bring your application to production, you can use a token from a paid Azure account instead of your {% data variables.product.company_short %} {% data variables.product.pat_generic %}. You don't need to change anything else in your code.
-
-For more information, see the [Azure AI](https://aka.ms/azureai/github-models) documentation.
+The free rate limits for the playground and API usage are intended to help you get started with experimentation. When you are ready to move beyond the free offering, you have two options for accessing AI models:
+* You can opt in to paid usage for {% data variables.product.prodname_github_models %}, allowing your organization to access increased rate limits, larger context windows, and additional features. See [AUTOTITLE](/billing/managing-billing-for-your-products/about-billing-for-github-models).
+* If you have an existing OpenAI or Azure subscription, you can bring your own API keys (BYOK) to access custom models. Billing and usage are managed directly through your provider account, such as your Azure subscription ID. See [AUTOTITLE](/github-models/github-models-at-scale/set-up-custom-model-integration-models-byok).
 
 ## Rate limits
 
+{% data reusables.github-models.production-rate-limits-note %}
+
 The playground and free API usage are rate limited by requests per minute, requests per day, tokens per request, and concurrent requests. If you get rate limited, you will need to wait for the rate limit that you hit to reset before you can make more requests.
 
 Low, high, and embedding models have different rate limits. To see which type of model you are using, refer to the model's information in {% data variables.product.prodname_marketplace %}.
 
+For custom models accessed with your own API keys, rate limits are set and enforced by your model provider.
+
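+If a request is rejected because you hit a rate limit, you can catch the error and retry after the limit resets. The following Python example is an illustrative sketch using the Azure AI Inference SDK; it assumes your {% data variables.product.pat_generic %} is stored in the `GITHUB_TOKEN` environment variable, and the endpoint, model name, and wait time are example values that you should replace with the values shown for your model and rate limit tier.
+
+```python
+import os
+import time
+
+from azure.ai.inference import ChatCompletionsClient
+from azure.ai.inference.models import UserMessage
+from azure.core.credentials import AzureKeyCredential
+from azure.core.exceptions import HttpResponseError
+
+# Example endpoint and model name; copy the values shown for your model in the playground.
+client = ChatCompletionsClient(
+    endpoint="https://models.inference.azure.com",
+    credential=AzureKeyCredential(os.environ["GITHUB_TOKEN"]),
+)
+
+for attempt in range(3):
+    try:
+        response = client.complete(
+            messages=[UserMessage(content="What is the capital of France?")],
+            model="gpt-4o-mini",
+        )
+        print(response.choices[0].message.content)
+        break
+    except HttpResponseError as err:
+        # A 429 status means the request was rate limited; wait before retrying.
+        if err.status_code == 429 and attempt < 2:
+            time.sleep(60)
+        else:
+            raise
+```
+
+A production application might use exponential backoff and respect any `Retry-After` header included in the response instead of a fixed wait.
+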
 | Rate limit tier |
 |---|
diff --git a/data/reusables/github-models/production-rate-limits-note.md b/data/reusables/github-models/production-rate-limits-note.md
new file mode 100644
index 0000000000..9244066066
--- /dev/null
+++ b/data/reusables/github-models/production-rate-limits-note.md
@@ -0,0 +1 @@
+> [!NOTE] Once you opt in to paid usage, you will have access to production-grade rate limits and be billed for all usage thereafter. For more information about these rate limits, see [Azure AI Foundry Models quotas and limits](https://learn.microsoft.com/en-us/azure/ai-foundry/model-inference/quotas-limits) in the Azure documentation.