---
title: Prototyping with AI models
shortTitle: Prototype with AI models
intro: Find and experiment with AI models for free.
versions:
redirect_from:
---
If you want to develop a generative AI application, you can use {% data variables.product.prodname_github_models %} to find and experiment with AI models for free. Once you are ready to bring your application to production, you can switch to a token from a paid Azure account. See the Azure AI documentation.
Organization owners can integrate their preferred custom models into {% data variables.product.prodname_github_models %} by using the organization's own LLM API keys. See AUTOTITLE.
See also AUTOTITLE.
## Finding AI models
To find an AI model:
{% data reusables.models.steps-to-open-model-playground %}
The model opens in the model playground. Details of the model are displayed in the sidebar on the right. If the sidebar is not displayed, expand it by clicking the {% octicon "sidebar-expand" aria-label="Show parameters setting" %} icon at the right of the playground.
> [!NOTE]
> Access to OpenAI's models is in {% data variables.release-phases.public_preview %} and subject to change.
## Experimenting with AI models in the playground
The AI model playground is a free resource that allows you to adjust model parameters and submit prompts to see how a model responds.
> [!NOTE]
> - The model playground is in {% data variables.release-phases.public_preview %} and subject to change.
> - The playground is rate limited. See Rate limits below.
To adjust parameters for the model, in the playground, select the Parameters tab in the sidebar.
To see code that corresponds to the parameters that you selected, switch from the Chat tab to the Code tab.
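As an illustration of how those parameters map onto a request, the following Python sketch sends a chat completion request with the temperature, top_p, and max_tokens parameters set explicitly. This is a minimal sketch of a chat-completions-style request, not the exact code the playground generates: the endpoint URL, model identifier, field names, and `GITHUB_TOKEN` environment variable are illustrative placeholders, so copy the real values from the Code tab for the model you selected.

```python
import os

import requests

# Placeholder values -- copy the real endpoint and model identifier from the
# playground's Code tab for the model you selected.
ENDPOINT = "https://models.github.ai/inference/chat/completions"
MODEL = "openai/gpt-4o-mini"

payload = {
    "model": MODEL,
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain beam search in two sentences."},
    ],
    # These fields correspond to the sliders on the Parameters tab.
    "temperature": 0.8,
    "top_p": 0.95,
    "max_tokens": 400,
}

response = requests.post(
    ENDPOINT,
    json=payload,
    # Assumes your personal access token is saved in the GITHUB_TOKEN environment variable.
    headers={"Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}"},
    timeout=30,
)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```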
### Comparing models
You can submit a prompt to two models at the same time and compare the responses.
With one model open in the playground, click Compare, then, in the dropdown menu, select a model for comparison. The selected model opens in a second chat window. When you type a prompt in either chat window, the prompt is mirrored to the other window. The prompts are submitted simultaneously so that you can compare the responses from each model.
Any parameters you set are used for both models.
## Evaluating AI models
Once you've started testing prompts in the playground, you can evaluate model performance using structured metrics. Evaluations help you compare multiple prompt configurations across different models and determine which setup performs best.
In the Comparisons view, you can apply evaluators like similarity, relevance, and groundedness to measure how well each output meets your expectations. You can also define your own evaluation criteria with a custom prompt evaluator.
For step-by-step instructions, see Evaluating outputs.
## Experimenting with AI models using the API
> [!NOTE]
> The free API usage is in {% data variables.release-phases.public_preview %} and subject to change.
{% data variables.product.company_short %} provides free API usage so that you can experiment with AI models in your own application.
The steps to use each model are similar. In general, you will need to:
1. {% data reusables.models.steps-to-open-model-playground %}

   The model opens in the model playground.
1. Click the Code tab.
1. Optionally, use the language dropdown to select the programming language.
1. Optionally, use the SDK dropdown to select which SDK to use.

   All models can be used with the Azure AI Inference SDK, and some models support additional SDKs. If you want to easily switch between models, you should select "Azure AI Inference SDK." If you selected "REST" as the language, you won't use an SDK. Instead, you will use the API endpoint directly. {% ifversion fpt %} See {% data variables.product.prodname_github_models %} REST API. {% endif %}
1. Either open a codespace, or set up your local environment:
   - To run in a codespace, click {% octicon "codespaces" aria-hidden="true" aria-label="codespaces" %} Run codespace, then click Create new codespace.
   - To run locally:
     - Create a {% data variables.product.company_short %} {% data variables.product.pat_generic %}. The token needs to have `models:read` permissions. See AUTOTITLE.
     - Save your token as an environment variable.
     - Install the dependencies for the SDK, if required.
1. Use the example code to make a request to the model.
The free API usage is rate limited. See Rate limits below.
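As an example of that final step, here is a minimal sketch in Python using the Azure AI Inference SDK (the `azure-ai-inference` package). It assumes your {% data variables.product.pat_generic %} is saved in the `GITHUB_TOKEN` environment variable; the endpoint URL and model name are placeholders, so use the values shown on the Code tab for your chosen model.

```python
import os

# pip install azure-ai-inference
from azure.ai.inference import ChatCompletionsClient
from azure.ai.inference.models import SystemMessage, UserMessage
from azure.core.credentials import AzureKeyCredential

# Placeholder endpoint -- use the value shown on the Code tab.
client = ChatCompletionsClient(
    endpoint="https://models.inference.ai.azure.com",
    credential=AzureKeyCredential(os.environ["GITHUB_TOKEN"]),
)

response = client.complete(
    model="gpt-4o-mini",  # placeholder model name
    messages=[
        SystemMessage(content="You are a helpful assistant."),
        UserMessage(content="Write a haiku about version control."),
    ],
    temperature=0.7,
    max_tokens=200,
)

print(response.choices[0].message.content)
```

Because the Azure AI Inference SDK uses the same call shape for every supported model, trying a different model is largely a matter of changing the model name in the request.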
## Saving and sharing your playground experiments
You can save and share your progress in the playground with presets. Presets save:
- Your current state
- Your parameters
- Your chat history (optional)
To create a preset for your current context, select Preset: PRESET-NAME {% octicon "triangle-down" aria-hidden="true" aria-label="triangle-down" %} at the top right of the playground, then click {% octicon "plus" aria-hidden="true" aria-label="plus" %} Create new preset. You need to name your preset, and you can also choose to provide a preset description, include your chat history, and allow your preset to be shared.
There are two ways to load a preset:
- Select the Preset: PRESET-NAME {% octicon "triangle-down" aria-hidden="true" aria-label="triangle-down" %} dropdown menu, then click the preset you want to load.
- Open a shared preset URL
After you load a preset, you can edit, share, or delete the preset:
- To edit the preset, change the parameters and prompt the model. Once you are satisfied with your changes, select the Preset: PRESET-NAME {% octicon "triangle-down" aria-hidden="true" aria-label="triangle-down" %} dropdown menu, then click {% octicon "pencil" aria-hidden="true" aria-label="pencil" %} Edit preset and save your updates.
- To share the preset, select the Preset: PRESET-NAME {% octicon "triangle-down" aria-hidden="true" aria-label="triangle-down" %} dropdown menu, then click {% octicon "share" aria-hidden="true" aria-label="share" %} Share preset to get a shareable URL.
- To delete the preset, select the Preset: PRESET-NAME {% octicon "triangle-down" aria-hidden="true" aria-label="triangle-down" %} dropdown menu, then click {% octicon "trash" aria-hidden="true" aria-label="trash" %} Delete preset and confirm the deletion.
## Using the prompt editor
The prompt editor in {% data variables.product.prodname_github_models %} is designed to help you iterate, refine, and perfect your prompts. This dedicated view provides a focused and intuitive experience for crafting and testing inputs, enabling you to:
- Quickly test and refine prompts without the complexity of multi-turn interactions.
- Fine-tune prompts for precision and relevance in your projects.
- Use a specialized space for single-turn scenarios to ensure consistent and optimized results.
To access the prompt editor, click {% octicon "stack" aria-hidden="true" aria-label="stack" %} Prompt editor at the top right of the playground.
## Experimenting with AI models in {% data variables.product.prodname_vscode %}
> [!NOTE]
> The AI Toolkit extension for {% data variables.product.prodname_vscode %} is in {% data variables.release-phases.public_preview %} and is subject to change.
If you prefer to experiment with AI models in your IDE, you can install the AI Toolkit extension for {% data variables.product.prodname_vscode %}, then test models with adjustable parameters and context.
1. In {% data variables.product.prodname_vscode %}, install the pre-release version of the AI Toolkit for {% data variables.product.prodname_vscode %}.
1. To open the extension, click the AI Toolkit icon in the activity bar.
1. Authorize the AI Toolkit to connect to your {% data variables.product.prodname_dotcom %} account.
1. In the "My models" section of the AI Toolkit panel, click Open Model Catalog, then find a model to experiment with.
   - To use a model hosted remotely through {% data variables.product.prodname_github_models %}, on the model card, click Try in playground.
   - To download and use a model locally, on the model card, click Download. Once the download is complete, on the same model card, click Load in playground.
1. In the sidebar, provide any context instructions and inference parameters for the model, then send a prompt.
## Going to production
The free rate limits in the playground and the API are intended to help you get started with experimentation. When you are ready to move beyond the free offering, you have two options for accessing AI models:
- You can opt in to paid usage for {% data variables.product.prodname_github_models %}, allowing your organization to access increased rate limits, larger context windows, and additional features. See AUTOTITLE.
- If you have an existing OpenAI or Azure subscription, you can bring your own API keys (BYOK) to access custom models. Billing and usage are managed directly through your provider account, such as your Azure Subscription ID. See AUTOTITLE.
## Rate limits
{% data reusables.github-models.production-rate-limits-note %}
The playground and free API usage are rate limited by requests per minute, requests per day, tokens per request, and concurrent requests. If you get rate limited, you will need to wait for the rate limit that you hit to reset before you can make more requests.
Low, high, and embedding models have different rate limits. To see which type of model you are using, refer to the model's information in {% data variables.product.prodname_marketplace %}.
For custom models accessed with your own API keys, rate limits are set and enforced by your model provider.
| Rate limit tier | Rate limits | Copilot Free | Copilot Pro | Copilot Business | Copilot Enterprise |
|---|---|---|---|---|---|
| Low | Requests per minute | 15 | 15 | 15 | 20 |
| | Requests per day | 150 | 150 | 300 | 450 |
| | Tokens per request | 8000 in, 4000 out | 8000 in, 4000 out | 8000 in, 4000 out | 8000 in, 8000 out |
| | Concurrent requests | 5 | 5 | 5 | 8 |
| High | Requests per minute | 10 | 10 | 10 | 15 |
| | Requests per day | 50 | 50 | 100 | 150 |
| | Tokens per request | 8000 in, 4000 out | 8000 in, 4000 out | 8000 in, 4000 out | 16000 in, 8000 out |
| | Concurrent requests | 2 | 2 | 2 | 4 |
| Embedding | Requests per minute | 15 | 15 | 15 | 20 |
| | Requests per day | 150 | 150 | 300 | 450 |
| | Tokens per request | 64000 | 64000 | 64000 | 64000 |
| | Concurrent requests | 5 | 5 | 5 | 8 |
| Azure OpenAI o1-preview | Requests per minute | Not applicable | 1 | 2 | 2 |
| | Requests per day | Not applicable | 8 | 10 | 12 |
| | Tokens per request | Not applicable | 4000 in, 4000 out | 4000 in, 4000 out | 4000 in, 8000 out |
| | Concurrent requests | Not applicable | 1 | 1 | 1 |
| Azure OpenAI o1 and o3 | Requests per minute | Not applicable | 1 | 2 | 2 |
| | Requests per day | Not applicable | 8 | 10 | 12 |
| | Tokens per request | Not applicable | 4000 in, 4000 out | 4000 in, 4000 out | 4000 in, 8000 out |
| | Concurrent requests | Not applicable | 1 | 1 | 1 |
| Azure OpenAI o1-mini, o3-mini, and o4-mini | Requests per minute | Not applicable | 2 | 3 | 3 |
| | Requests per day | Not applicable | 12 | 15 | 20 |
| | Tokens per request | Not applicable | 4000 in, 4000 out | 4000 in, 4000 out | 4000 in, 4000 out |
| | Concurrent requests | Not applicable | 1 | 1 | 1 |
| DeepSeek-R1, DeepSeek-R1-0528, and MAI-DS-R1 | Requests per minute | 1 | 1 | 2 | 2 |
| | Requests per day | 8 | 8 | 10 | 12 |
| | Tokens per request | 4000 in, 4000 out | 4000 in, 4000 out | 4000 in, 4000 out | 4000 in, 4000 out |
| | Concurrent requests | 1 | 1 | 1 | 1 |
| xAI Grok-3 | Requests per minute | 1 | 1 | 2 | 2 |
| | Requests per day | 15 | 15 | 20 | 30 |
| | Tokens per request | 4000 in, 4000 out | 4000 in, 4000 out | 4000 in, 8000 out | 4000 in, 16000 out |
| | Concurrent requests | 1 | 1 | 1 | 1 |
| xAI Grok-3-Mini | Requests per minute | 2 | 2 | 3 | 3 |
| | Requests per day | 30 | 30 | 40 | 50 |
| | Tokens per request | 4000 in, 8000 out | 4000 in, 8000 out | 4000 in, 12000 out | 4000 in, 12000 out |
| | Concurrent requests | 1 | 1 | 1 | 1 |
These limits are subject to change without notice.
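If your application does hit one of these limits, requests are rejected until the relevant window resets, so it is worth building in a retry. The sketch below shows one common pattern: back off and retry when the API responds with HTTP 429. The endpoint URL, wait times, and use of a Retry-After header are illustrative assumptions rather than documented behavior of the free API.

```python
import os
import time

import requests

# Illustrative endpoint -- use the value shown on the playground's Code tab.
ENDPOINT = "https://models.github.ai/inference/chat/completions"


def complete_with_retry(payload: dict, max_attempts: int = 5) -> dict:
    """POST a chat completion request, backing off and retrying on HTTP 429."""
    delay = 2.0  # initial wait in seconds; doubled after each rate-limited attempt
    for _ in range(max_attempts):
        response = requests.post(
            ENDPOINT,
            json=payload,
            headers={"Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}"},
            timeout=30,
        )
        if response.status_code != 429:
            response.raise_for_status()
            return response.json()
        # Rate limited: honor a Retry-After header (in seconds) if the server
        # sent one, otherwise fall back to the exponential backoff delay.
        wait = float(response.headers.get("Retry-After", delay))
        time.sleep(wait)
        delay *= 2
    raise RuntimeError("Still rate limited after retrying; wait for the limit to reset.")
```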
## Leaving feedback
To ask questions and share feedback, see this GitHub Models discussion post. To learn how others are using {% data variables.product.prodname_github_models %}, visit the GitHub Community discussions for Models.

