---
title: Analyzing your code with CodeQL queries
intro: 'You can run queries against a {% data variables.product.prodname_codeql %} database extracted from a codebase.'
product: '{% data reusables.gated-features.codeql %}'
shortTitle: Analyzing code
versions:
  fpt: '*'
  ghes: '*'
  ghae: '*'
  ghec: '*'
topics:
  - Advanced Security
  - Code scanning
  - CodeQL
redirect_from:
  - /code-security/codeql-cli/analyzing-databases-with-the-codeql-cli
  - /code-security/codeql-cli/using-the-codeql-cli/analyzing-databases-with-the-codeql-cli
---

About analyzing databases with the {% data variables.product.prodname_codeql_cli %}

{% data reusables.code-scanning.codeql-cli-version-ghes %}

To analyze a codebase, you run queries against a {% data variables.product.prodname_codeql %} database extracted from the code.

{% data variables.product.prodname_codeql %} analyses produce interpreted results that can be displayed as alerts or paths in source code. For information about writing queries to run with database analyze, see "Using custom queries with the {% data variables.product.prodname_codeql_cli %}."

{% note %}

Other query-running commands

Queries run with database analyze have strict metadata requirements. You can also execute queries using the following plumbing-level subcommands:

  • AUTOTITLE, which outputs non-interpreted results in an intermediate binary format called BQRS

  • AUTOTITLE, which will output BQRS files, or print results tables directly to the command line. Viewing results directly in the command line may be useful for iterative query development using the CLI.

Queries run with these commands don't have the same metadata requirements. However, to save human-readable data you have to process each BQRS results file using the AUTOTITLE plumbing subcommand. Therefore, for most use cases it's easiest to use database analyze to directly generate interpreted results.

{% endnote %}

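Before starting an analysis, you must set up the {% data variables.product.prodname_codeql_cli %} and create a {% data variables.product.prodname_codeql %} database for the source code you want to analyze.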

The simplest way to run codeql database analyze is using {% data variables.product.prodname_codeql %} packs. You can also run the command using queries from a local checkout of the {% data variables.product.prodname_codeql %} repository, which is useful if you want to customize the {% data variables.product.prodname_codeql %} core queries.

Running codeql database analyze

When you run database analyze, it:

  1. Optionally downloads any referenced {% data variables.product.prodname_codeql %} packages that are not available locally.
  2. Executes one or more query files, by running them over a {% data variables.product.prodname_codeql %} database.
  3. Interprets the results, based on certain query metadata, so that alerts can be displayed in the correct location in the source code.
  4. Reports the results of any diagnostic and summary queries to standard output.

You can analyze a database by running the following command:

codeql database analyze <database> --format=<format> --output=<output> <query-specifiers>...

{% note %}

Note: If you analyze more than one {% data variables.product.prodname_codeql %} database for a single commit, you must specify a SARIF category for each set of results generated by this command. When you upload the results to {% data variables.product.product_name %}, {% data variables.product.prodname_code_scanning %} uses this category to store the results for each language separately. If you forget to do this, each upload overwrites the previous results.

codeql database analyze <database> --format=<format> \
    --sarif-category=<language-specifier> --output=<output> \
    {% ifversion codeql-packs %}<packs,queries>{% else %}<queries>{% endif %}

{% endnote %}

You must specify <database>, --format, and --output. You can specify additional options depending on what analysis you want to do.

| Option | Required | Usage |
|--------|:--------:|-------|
| `<database>` | {% octicon "check" aria-label="Required" %} | Specify the path for the directory that contains the {% data variables.product.prodname_codeql %} database to analyze. |
| `<packs,queries>` | {% octicon "x" aria-label="Optional" %} | Specify {% data variables.product.prodname_codeql %} packs or queries to run. To run the standard queries used for {% data variables.product.prodname_code_scanning %}, omit this parameter. To see the other query suites included in the {% data variables.product.prodname_codeql_cli %} bundle, look in `/<extraction-root>/qlpacks/codeql/<language>-queries/codeql-suites`. For information about creating your own query suite, see AUTOTITLE in the documentation for the {% data variables.product.prodname_codeql_cli %}. |
| `--format` | {% octicon "check" aria-label="Required" %} | Specify the format for the results file generated during analysis. A number of different formats are supported, including CSV, SARIF, and graph formats. For upload to {% data variables.product.company_short %} this should be: {% ifversion fpt or ghae or ghec %}sarif-latest{% else %}sarifv2.1.0{% endif %}. For more information, see "AUTOTITLE." |
| `--output` | {% octicon "check" aria-label="Required" %} | Specify where to save the SARIF results file. |
| `--sarif-category` | {% octicon "question" aria-label="Required with multiple results sets" %} | Optional for single database analysis. Required to define the language when you analyze multiple databases for a single commit in a repository.<br><br>Specify a category to include in the SARIF results file for this analysis. A category is used to distinguish multiple analyses for the same tool and commit, but performed on different languages or different parts of the code. |
| `--sarif-add-baseline-file-info` | {% octicon "x" aria-label="Optional" %} | Recommended. Use to submit file coverage information to the {% data variables.code-scanning.tool_status_page %}. For more information, see "AUTOTITLE." |
| `--sarif-add-query-help` | {% octicon "x" aria-label="Optional" %} | Use if you want to include any available markdown-rendered query help for custom queries used in your analysis. Any query help for custom queries included in the SARIF output will be displayed in the code scanning UI if the relevant query generates an alert. For more information, see "Including query help for custom {% data variables.product.prodname_codeql %} queries in SARIF files." |{% ifversion codeql-packs %}
| `<packs>` | {% octicon "x" aria-label="Optional" %} | Use if you want to include {% data variables.product.prodname_codeql %} query packs in your analysis. For more information, see "Downloading and using {% data variables.product.prodname_codeql %} query packs." |
| `--download` | {% octicon "x" aria-label="Optional" %} | Use if some of your {% data variables.product.prodname_codeql %} query packs are not yet on disk and need to be downloaded before running queries. |{% endif %}
| `--threads` | {% octicon "x" aria-label="Optional" %} | Use if you want to use more than one thread to run queries. The default value is 1. You can specify more threads to speed up query execution. To set the number of threads to the number of logical processors, specify 0. |
| `--verbose` | {% octicon "x" aria-label="Optional" %} | Use to get more detailed information about the analysis process and diagnostic data from the database creation process. |

{% note %}

Upgrading databases

For databases that were created by {% data variables.product.prodname_codeql_cli %} v2.3.3 or earlier, you will need to explicitly upgrade the database before you can run an analysis with a newer version of the {% data variables.product.prodname_codeql_cli %}. If this step is necessary, then you will see a message telling you that your database needs to be upgraded when you run database analyze.

For databases that were created by {% data variables.product.prodname_codeql_cli %} v2.3.4 or later, the CLI will implicitly run any required upgrades. Explicitly running the upgrade command is not necessary.

{% endnote %}

For full details of all the options you can use when analyzing databases, see "AUTOTITLE."

Basic example of analyzing a {% data variables.product.prodname_codeql %} database

This example analyzes a {% data variables.product.prodname_codeql %} database stored at /codeql-dbs/example-repo and saves the results as a SARIF file: /temp/example-repo-js.sarif. It uses --sarif-category to include extra information in the SARIF file that identifies the results as JavaScript. This is essential when you have more than one {% data variables.product.prodname_codeql %} database to analyze for a single commit in a repository.

$ codeql database analyze /codeql-dbs/example-repo \
    javascript-code-scanning.qls --sarif-category=javascript \
    --format={% ifversion fpt or ghae or ghec %}sarif-latest{% else %}sarifv2.1.0{% endif %} --output=/temp/example-repo-js.sarif

> Running queries.
> Compiling query plan for /codeql-home/codeql/qlpacks/codeql-javascript/AngularJS/DisablingSce.ql.
...
> Shutting down query evaluator.
> Interpreting results.
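When one commit produces several databases, each analysis needs its own category and output file. The following is a minimal POSIX shell sketch of that pattern. It only prints the commands so that you can inspect them before running anything. The helper name `print_analyze_commands`, the `<db_root>/<language>` directory layout, and the use of `sarif-latest` are assumptions for illustration; omitting the query argument relies on the default described in the options table.

```shell
# Print one "codeql database analyze" command per language, each with a
# distinct --sarif-category and output file. Nothing is executed here.
# Assumed layout (hypothetical): one database per language at <db_root>/<language>.
print_analyze_commands() {
    db_root=$1
    out_dir=$2
    shift 2
    for lang in "$@"; do
        printf 'codeql database analyze %s/%s --format=sarif-latest --sarif-category=%s --output=%s/%s.sarif\n' \
            "$db_root" "$lang" "$lang" "$out_dir" "$lang"
    done
}

# Example: two databases created for the same commit.
print_analyze_commands /codeql-dbs/example-repo /temp javascript python
```

Once the printed commands look right, they can be run directly or piped to `sh`. Paths containing spaces would need quoting, which this sketch does not add.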

{% ifversion code-scanning-tool-status-page %}

Adding file coverage information to your results for monitoring

You can optionally submit file coverage information to {% data variables.product.product_name %} for display on the {% data variables.code-scanning.tool_status_page %} for {% data variables.product.prodname_code_scanning %}. For more information about file coverage information, see "AUTOTITLE."

To include file coverage information with your {% data variables.product.prodname_code_scanning %} results, add the --sarif-add-baseline-file-info flag to the codeql database analyze invocation in your CI system, for example:

$ codeql database analyze /codeql-dbs/example-repo \
    javascript-code-scanning.qls --sarif-category=javascript \
    --sarif-add-baseline-file-info \
    --format={% ifversion fpt or ghae or ghec %}sarif-latest{% else %}sarifv2.1.0{% endif %} \
    --output=/temp/example-repo-js.sarif

{% endif %}

Examples of running database analyses

The following examples show how to run database analyze using {% data variables.product.prodname_codeql %} packs, and how to use a local checkout of the {% data variables.product.prodname_codeql %} repository. These examples assume your {% data variables.product.prodname_codeql %} databases have been created in a directory that is a sibling of your local copies of the {% data variables.product.prodname_codeql %} repository.

{% ifversion codeql-packs %}

Running a {% data variables.product.prodname_codeql %} query pack

{% note %}

Note

The {% data variables.product.prodname_codeql %} package management functionality, including {% data variables.product.prodname_codeql %} packs, is currently available as a beta release and is subject to change. During the beta release, {% data variables.product.prodname_codeql %} packs are available only using {% data variables.product.prodname_registry %} - the {% data variables.product.prodname_dotcom %} {% data variables.product.prodname_container_registry %}. To use this beta functionality, install the latest version of the {% data variables.product.prodname_codeql_cli %} bundle from: https://github.com/github/codeql-action/releases.

{% endnote %}

To run an existing {% data variables.product.prodname_codeql %} query pack from the {% data variables.product.prodname_dotcom %} {% data variables.product.prodname_container_registry %}, you can specify one or more pack names:

codeql database analyze <database> microsoft/coding-standards@1.0.0 github/security-queries --format=sarifv2.1.0 --output=query-results.sarif --download

This command runs the default query suite of two {% data variables.product.prodname_codeql %} query packs: microsoft/coding-standards version 1.0.0 and the latest version of github/security-queries on the specified database. For further information about default suites, see "AUTOTITLE."

The --download flag is optional. Using it ensures that the query pack is downloaded if it isn't yet available locally. {% endif %}

Running a single query

To run a single query over a {% data variables.product.prodname_codeql %} database for a JavaScript codebase, you could use the following command from the directory containing your database:

codeql database analyze --download <javascript-database> codeql/javascript-queries:Declarations/UnusedVariable.ql --format=csv --output=js-analysis/js-results.csv

This command runs a simple query that finds potential bugs related to unused variables, imports, functions, or classes. It is one of the JavaScript queries included in the {% data variables.product.prodname_codeql %} repository. You could run more than one query by specifying a space-separated list of similar paths.

The analysis generates a CSV file (js-results.csv) in a new directory (js-analysis).

Alternatively, if you have the {% data variables.product.prodname_codeql %} repository checked out, you can execute the same queries by specifying the path to the query directly:

codeql database analyze <javascript-database> ../ql/javascript/ql/src/Declarations/UnusedVariable.ql --format=csv --output=js-analysis/js-results.csv

You can also run your own custom queries with the database analyze command. For more information about preparing your queries to use with the {% data variables.product.prodname_codeql_cli %}, see "AUTOTITLE."

Running all queries in a directory

You can run all the queries located in a directory by providing the directory path, rather than listing all the individual query files. Paths are searched recursively, so any queries contained in subfolders will also be executed.

{% note %}

Important

You should avoid specifying the root of a core {% data variables.product.prodname_codeql %} query pack when executing database analyze, as it might contain some special queries that aren't designed to be used with the command. Instead, run the query pack to include the pack's default queries in the analysis, or run one of the code scanning query suites.

{% endnote %}

For example, to execute all Python queries contained in the Functions directory in the codeql/python-queries query pack you would run:

codeql database analyze <python-database> codeql/python-queries:Functions --format=sarif-latest --output=python-analysis/python-results.sarif --download

Alternatively, if you have the {% data variables.product.prodname_codeql %} repository checked out, you can execute the same queries by specifying the path to the directory directly:

codeql database analyze <python-database> ../ql/python/ql/src/Functions/ --format=sarif-latest --output=python-analysis/python-results.sarif

When the analysis has finished, a SARIF results file is generated. Specifying --format=sarif-latest ensures that the results are formatted according to the most recent SARIF specification supported by {% data variables.product.prodname_codeql %}.

{% ifversion codeql-packs %}

Running a subset of queries in a {% data variables.product.prodname_codeql %} pack

If you are using {% data variables.product.prodname_codeql_cli %} v2.8.1 or later, you can include a path at the end of a pack specification to run a subset of queries inside the pack. This applies to any command that locates or runs queries within a pack.

The complete way to specify a set of queries is in the form scope/name@range:path, where:

  • scope/name is the qualified name of a {% data variables.product.prodname_codeql %} pack.

  • range is a semver range.

  • path is a file system path to a single query, a directory containing queries, or a query suite file.

When you specify a scope/name, the range and path are optional. If you omit a range then the latest version of the specified pack is used. If you omit a path then the default query suite of the specified pack is used.

The path can be a \*.ql query file, a directory containing one or more queries, or a .qls query suite file. If you omit a pack name, then you must provide a path, which will be interpreted relative to the working directory of the current process.

If you specify a scope/name and path, then the path cannot be absolute. It is considered relative to the root of the {% data variables.product.prodname_codeql %} pack.
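To make the optional parts of a specifier concrete, here is a small POSIX shell sketch that splits a scope/name@range:path string and reports the defaults described above. `parse_spec` is a hypothetical helper written for illustration only. It is not how the CLI parses its arguments, and it does not handle the `path:` prefix form or specifiers without a pack name.

```shell
# Split a specifier of the form scope/name[@range][:path] into its parts.
# Hypothetical helper for illustration; the CLI performs its own parsing.
parse_spec() {
    pack=$1
    range=
    path=
    # Everything after the first ":" is the path inside the pack.
    case $pack in
        *:*) path=${pack#*:}; pack=${pack%%:*} ;;
    esac
    # Everything after the first "@" in what remains is the semver range.
    case $pack in
        *@*) range=${pack#*@}; pack=${pack%%@*} ;;
    esac
    printf 'pack=%s range=%s path=%s\n' \
        "$pack" "${range:-latest}" "${path:-default suite}"
}

parse_spec 'codeql/cpp-queries@~0.0.3:codeql-suites/cpp-security-and-quality.qls'
# prints: pack=codeql/cpp-queries range=~0.0.3 path=codeql-suites/cpp-security-and-quality.qls
parse_spec 'codeql/cpp-queries'
# prints: pack=codeql/cpp-queries range=latest path=default suite
```

The words "latest" and "default suite" are only labels for an omitted range and an omitted path. They are not values that the CLI accepts.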

To analyze a database using all queries in the experimental/Security folder within the codeql/cpp-queries {% data variables.product.prodname_codeql %} pack you can use:

codeql database analyze --format=sarif-latest --output=results <db> \
    codeql/cpp-queries:experimental/Security

To run the RedundantNullCheckParam.ql query in the codeql/cpp-queries {% data variables.product.prodname_codeql %} pack use:

codeql database analyze --format=sarif-latest --output=results <db> \
    'codeql/cpp-queries:experimental/Likely Bugs/RedundantNullCheckParam.ql'

To analyze your database using the cpp-security-and-quality.qls query suite from a version of the codeql/cpp-queries {% data variables.product.prodname_codeql %} pack that is >= 0.0.3 and < 0.1.0 (the highest compatible version will be chosen) you can use:

codeql database analyze --format=sarif-latest --output=results <db> \
   'codeql/cpp-queries@~0.0.3:codeql-suites/cpp-security-and-quality.qls'

If you need to reference a query file, directory, or suite whose path contains a literal @ or :, you can prefix the query specification with path: like so:

codeql database analyze --format=sarif-latest --output=results <db> \
    path:C:/Users/ci/workspace@2/security/query.ql

For more information about {% data variables.product.prodname_codeql %} packs, see AUTOTITLE. {% endif %}

Running query suites

To run a query suite on a {% data variables.product.prodname_codeql %} database for a C/C++ codebase, you could use the following command from the directory containing your database:

codeql database analyze <cpp-database> codeql/cpp-queries:codeql-suites/cpp-code-scanning.qls --format=sarifv2.1.0 --output=cpp-results.sarif --download

This command downloads the codeql/cpp-queries {% data variables.product.prodname_codeql %} query pack, runs the analysis, and generates a file in the SARIF version 2.1.0 format that is supported by all versions of {% data variables.product.prodname_dotcom %}. You can upload this file to {% data variables.product.prodname_dotcom %} by running codeql github upload-results or by using the code scanning API. For more information, see "AUTOTITLE" or "AUTOTITLE".
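As a rough sketch of the upload step, the helper below prints a codeql github upload-results command for a results file. The helper name and all example values are hypothetical. The `--repository`, `--ref`, `--commit`, and `--sarif` options are shown as I understand them, so confirm them with `codeql github upload-results --help` for your CLI version. Authentication, which typically uses a token supplied through the environment or standard input, is not covered here.

```shell
# Print (not run) an upload command for a SARIF file.
# All argument values in the example call below are hypothetical.
print_upload_command() {
    repo=$1      # owner/name of the repository
    ref=$2       # Git ref that was analyzed, for example refs/heads/main
    commit=$3    # full SHA of the analyzed commit
    sarif=$4     # SARIF file produced by "codeql database analyze"
    printf 'codeql github upload-results --repository=%s --ref=%s --commit=%s --sarif=%s\n' \
        "$repo" "$ref" "$commit" "$sarif"
}

print_upload_command my-org/example-repo refs/heads/main \
    0123456789abcdef0123456789abcdef01234567 cpp-results.sarif
```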

{% data variables.product.prodname_codeql %} query suites are .qls files that use directives to select queries to run based on certain metadata properties. The standard {% data variables.product.prodname_codeql %} packs have metadata that specifies the location of the query suites used by code scanning, so the {% data variables.product.prodname_codeql_cli %} knows where to find these suite files automatically, and you don't have to specify the full path on the command line. For more information, see "AUTOTITLE."

For information about creating custom query suites, see "AUTOTITLE."

Diagnostic and summary information

When you create a {% data variables.product.prodname_codeql %} database, the extractor stores diagnostic data in the database. The code scanning query suites include additional queries to report on this diagnostic data and calculate summary metrics. When the database analyze command completes, the CLI generates the results file and reports any diagnostic and summary data to standard output. If you choose to generate SARIF output, the additional data is also included in the SARIF file.

If the analysis found fewer results for standard queries than you expected, review the results of the diagnostic and summary queries to check whether the {% data variables.product.prodname_codeql %} database is likely to be a good representation of the codebase that you want to analyze.

Integrating a {% data variables.product.prodname_codeql %} pack into a code scanning workflow in {% data variables.product.prodname_dotcom %}

You can use {% data variables.product.prodname_codeql %} query packs in your code scanning setup. This allows you to select query packs published by various sources and use them to analyze your code. For more information, see "Using {% data variables.product.prodname_codeql %} query packs in the {% data variables.product.prodname_codeql %} action" or "Downloading and using {% data variables.product.prodname_codeql %} query packs in your CI system."

Including query help for custom {% data variables.product.prodname_codeql %} queries in SARIF files

If you use the {% data variables.product.prodname_codeql_cli %} to run code scanning analyses on third party CI/CD systems, you can include the query help for your custom queries in SARIF files generated during an analysis. After uploading the SARIF file to {% data variables.product.prodname_dotcom %}, the query help is shown in the code scanning UI for any alerts generated by the custom queries.

From {% data variables.product.prodname_codeql_cli %} v2.7.1 onwards, you can include markdown-rendered query help in SARIF files by providing the --sarif-add-query-help option when running codeql database analyze. For more information, see AUTOTITLE.

You can write query help for custom queries directly in a markdown file and save it alongside the corresponding query. Alternatively, for consistency with the standard {% data variables.product.prodname_codeql %} queries, you can write query help in the .qhelp format. Query help written in .qhelp files can't be included in SARIF files or processed by code scanning, so it must be converted to markdown before running the analysis. For more information, see "Query help files" and "AUTOTITLE."
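The conversion step can be sketched as follows. The helper prints a codeql generate query-help command that writes a markdown file next to the .qhelp file, using the same base name. The helper name and the example path are hypothetical, and the option names are given as I understand them, so check `codeql generate query-help --help` before relying on them.

```shell
# Print (not run) a command that converts one .qhelp file to markdown,
# writing the result alongside it with the same base name.
print_query_help_command() {
    qhelp=$1
    md=${qhelp%.qhelp}.md   # queries/MyQuery.qhelp -> queries/MyQuery.md
    printf 'codeql generate query-help --format=markdown --output=%s %s\n' \
        "$md" "$qhelp"
}

# Hypothetical custom query help file.
print_query_help_command queries/MyQuery.qhelp
# prints: codeql generate query-help --format=markdown --output=queries/MyQuery.md queries/MyQuery.qhelp
```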

Results

You can save analysis results in a number of different formats, including SARIF and CSV.

The SARIF format is designed to represent the output of a broad range of static analysis tools. For more information, see AUTOTITLE.

If you choose to generate results in CSV format, then each line in the output file corresponds to an alert. Each line is a comma-separated list with the following information.

| Property | Description | Example |
|----------|-------------|---------|
| Name | Name of the query that identified the result. | Inefficient regular expression |
| Description | Description of the query. | A regular expression that requires exponential time to match certain inputs can be a performance bottleneck, and may be vulnerable to denial-of-service attacks. |
| Severity | Severity of the query. | error |
| Message | Alert message. | This part of the regular expression may cause exponential backtracking on strings containing many repetitions of '\\\\'. |
| Path | Path of the file containing the alert. | /vendor/codemirror/markdown.js |
| Start line | Line of the file where the code that triggered the alert begins. | 617 |
| Start column | Column of the start line that marks the start of the alert code. Not included when equal to 1. | 32 |
| End line | Line of the file where the code that triggered the alert ends. Not included when the same value as the start line. | 64 |
| End column | Where available, the column of the end line that marks the end of the alert code. Otherwise the end line is repeated. | 617 |

Results files can be integrated into your own code-review or debugging infrastructure. For example, SARIF file output can be used to highlight alerts in the correct location in your source code using a SARIF viewer plugin for your IDE.

Further reading