---
title: REST API endpoints for models inference
shortTitle: Inference
intro: Use the REST API to submit a chat completion request to a specified model, with or without organizational attribution.
versions: # DO NOT MANUALLY EDIT. CHANGES WILL BE OVERWRITTEN BY A 🤖
  fpt: '*'
topics:
  - API
autogenerated: rest
allowTitleToDifferFromFilename: true
---

## About {% data variables.product.prodname_github_models %} inference
You can use the REST API to run inference requests using the {% data variables.product.prodname_github_models %} platform. The API requires the `models: read` scope when using a {% data variables.product.pat_v2 %} or when authenticating using a {% data variables.product.prodname_github_app %}.
The API supports:
* Accessing top models from OpenAI, DeepSeek, Microsoft, Llama, and more.
* Running chat-based inference requests with full control over sampling and response parameters.
* Streaming or non-streaming completions.
* Organizational attribution and usage tracking.
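
As a rough sketch of what a chat-based inference request looks like, the snippet below assembles a request body with the sampling and streaming parameters mentioned above. The endpoint URL and the model identifier shown are assumptions for illustration; the authoritative values and full parameter list are in the generated reference that follows.

```python
import json

# Assumed endpoint for illustration -- confirm against the reference below.
# Requests must be authenticated with a token that has the `models: read` scope,
# e.g. an `Authorization: Bearer <token>` header.
ENDPOINT = "https://models.github.ai/inference/chat/completions"

def build_chat_request(model, messages, stream=False, temperature=1.0):
    """Assemble a chat completion request body with common parameters."""
    return {
        "model": model,            # e.g. "openai/gpt-4o" (assumed identifier)
        "messages": messages,      # list of {"role": ..., "content": ...} dicts
        "stream": stream,          # True to stream the completion as it is generated
        "temperature": temperature,
    }

body = build_chat_request(
    "openai/gpt-4o",
    [{"role": "user", "content": "What is the capital of France?"}],
)
print(json.dumps(body, indent=2))
```

The same body shape is used for streaming and non-streaming calls; only the `stream` flag changes.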
<!-- Content after this section is automatically generated -->