mirror of
https://github.com/opentffoundation/opentf.git
synced 2025-12-20 10:19:27 -05:00
496 lines
25 KiB
Markdown
# OpenTofu Diagnostics Guide

"Diagnostics" is the general term we use to describe the error and warning
messages that OpenTofu returns when there are problems with the configuration,
or when interactions with external systems fail.

This document is an overview of how we typically use diagnostics in OpenTofu.
It includes both some technical information about how we represent diagnostics
in code, and some more subjective information about the writing style we most
often use in diagnostic messages.

## Diagnostics in Code

Diagnostics are modelled using the types from
[the `tfdiags` package](https://pkg.go.dev/github.com/opentofu/opentofu/internal/tfdiags).

In particular:
- [`tfdiags.Diagnostics`](https://pkg.go.dev/github.com/opentofu/opentofu/internal/tfdiags#Diagnostics)
  represents a set of zero or more diagnostics.

  A total lack of diagnostics is usually represented by a `nil` value of this
  type.

  When constructing sets of diagnostics to return we typically don't worry
  about the order they are returned in, even though we return them using a
  slice type. The UI-layer code uses
  [`tfdiags.Diagnostics.Sort`](https://pkg.go.dev/github.com/opentofu/opentofu/internal/tfdiags#Diagnostics.Sort)
  to place all of the collected diagnostics into a predictable order before
  rendering them, and so that function effectively turns the set of
  diagnostics into an ordered list of diagnostics _just in time_.

- [`tfdiags.Diagnostic`](https://pkg.go.dev/github.com/opentofu/opentofu/internal/tfdiags#Diagnostic)
  is an interface type that all diagnostic values implement.

  In practice values of this type are often created automatically as an
  implementation detail of [`Diagnostics.Append`](https://pkg.go.dev/github.com/opentofu/opentofu/internal/tfdiags#Diagnostics.Append),
  which accepts various types that _don't_ directly implement
  `Diagnostic` and then automatically wraps them in a type that does.
  In particular:

  - We often use [`hcl.Diagnostic`](https://pkg.go.dev/github.com/hashicorp/hcl/v2#Diagnostic)
    to describe problems related to the configuration or operations that are
    strongly related to parts of the configuration, because it is the most
    fully-fledged type of diagnostic we allow, including support for source
    ranges and relevant expressions as described later.

    It's also acceptable to append a whole `hcl.Diagnostics` (the HCL
    equivalent of `tfdiags.Diagnostics`) in which case each diagnostic
    will be wrapped and appended in turn. This is common when calling
    HCL's own functions and passing on its diagnostics verbatim.
  - Normal `error` values can be appended to a `tfdiags.Diagnostics`, but
    that's mainly for historical reasons -- adapting code that was present
    before the diagnostic models were added -- and should not be used in new
    code because it typically results in low-quality diagnostics that don't
    meet the style guidelines later in this document.

    One exception is for "should never happen" cases: we sometimes use
    `error` directly in that case to avoid overwhelming the surrounding
    code with the construction of a full diagnostic.

  Package `tfdiags` also includes some functions for constructing other kinds
  of diagnostics, including:

  - [`tfdiags.Sourceless`](https://pkg.go.dev/github.com/opentofu/opentofu/internal/tfdiags#Sourceless)
    is good for diagnostics that don't relate to any part of the configuration,
    such as when reporting incorrect usage of a command line argument.
  - [`tfdiags.AttributeValue`](https://pkg.go.dev/github.com/opentofu/opentofu/internal/tfdiags#AttributeValue) and
    [`tfdiags.WholeContainingBody`](https://pkg.go.dev/github.com/opentofu/opentofu/internal/tfdiags#WholeContainingBody)
    produce special "contextual diagnostics" that must be transformed by
    calling [`InConfigBody`](https://pkg.go.dev/github.com/opentofu/opentofu/internal/tfdiags#Diagnostics.InConfigBody)
    on the resulting `Diagnostics` value. This is a special mechanism used
    when the subsystem generating the diagnostic does not have direct access
    to the configuration itself, such as when a provider returns a diagnostic
    via the provider wire protocol.
- [`tfdiags.Severity`](https://pkg.go.dev/github.com/opentofu/opentofu/internal/tfdiags#Severity)
  (and its HCL equivalent [`hcl.DiagnosticSeverity`](https://pkg.go.dev/github.com/hashicorp/hcl/v2#DiagnosticSeverity))
  is how we distinguish between "error" and "warning" diagnostics.

  The `tfdiags.Diagnostics.HasErrors` method returns true if the set
  contains at least one diagnostic with the severity `tfdiags.Error`.

The most common pattern for handling diagnostics in code is:
1. Declare `var diags tfdiags.Diagnostics` at the very start of a function.
2. During the function's body, whenever calling another function that might
   produce its own diagnostics, capture them into a separate variable
   (often called `moreDiags`, or `hclDiags` if the return type is
   `hcl.Diagnostics`) and then immediately append them to the main `diags`
   using `tfdiags.Diagnostics.Append`.

   If subsequent code depends on the success of the call, check
   `moreDiags.HasErrors()` (or similar) and return early if it returns `true`.
3. If the function generates any diagnostics of its own, append them directly
   to `diags`.
4. At all exit points of the function, return `diags` regardless of whether
   it has been assigned to or whether it contains errors. This ensures that
   we always return any warnings that might have been produced and avoids
   the risk of missing certain return paths under future maintenance if we
   introduce additional diagnostics later.

Here's a code-example version of the above advice:

```go
func Example() (anything, tfdiags.Diagnostics) {
	var diags tfdiags.Diagnostics

	somethingElse, moreDiags := otherFunction()
	diags = diags.Append(moreDiags)
	if moreDiags.HasErrors() {
		// NOTE: it isn't _always_ necessary to return immediately when there
		// are errors, as long as the callee clearly documents what it
		// guarantees about an errored result and the caller is able to
		// work within those limitations. Collecting multiple errors to
		// return together is often desirable.
		//
		// If the caller cannot continue at all though, or if continuing is
		// likely to cause redundant errors that just restate the same problem
		// in more confusing terms, then...
		return nil, diags
	}
	if isProblematic(somethingElse) {
		// A function might need to generate its own diagnostics if it detects
		// a problem directly.
		diags = diags.Append(&hcl.Diagnostic{
			Severity: hcl.DiagError,
			// ...
		})
		return nil, diags
	}

	// ...

	// The final return statement should include diags even if no errors
	// were detected along the way, because it might contain warnings.
	return something, diags
}
```

Some functions diverge from this pattern for special reasons, such as capturing
multiple sets of child function diagnostics and then using some logic to decide
which ones to append, or processing multiple items in a loop and appending
new diagnostics for each iteration. The above is just a general example of the
most common case, not a fixed template to follow in all cases.

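As one illustration of the loop variant mentioned above, here is a minimal sketch that collects diagnostics across iterations instead of returning at the first error. The `Diagnostic` and `Diagnostics` types are simplified stand-ins defined only so the example compiles on its own; they are not the real `tfdiags` types, and `checkName` and `checkAll` are hypothetical helpers.

```go
package main

import "fmt"

// Severity, Diagnostic and Diagnostics are simplified stand-ins for the
// tfdiags types, declared here only to keep the sketch self-contained.
type Severity rune

const (
	Error   Severity = 'E'
	Warning Severity = 'W'
)

type Diagnostic struct {
	Severity Severity
	Summary  string
	Detail   string
}

type Diagnostics []Diagnostic

// Append returns the receiver extended with the given diagnostics.
func (d Diagnostics) Append(more ...Diagnostic) Diagnostics {
	return append(d, more...)
}

// HasErrors reports whether at least one diagnostic has Error severity.
func (d Diagnostics) HasErrors() bool {
	for _, diag := range d {
		if diag.Severity == Error {
			return true
		}
	}
	return false
}

// checkName is a hypothetical per-item validation function.
func checkName(name string) Diagnostics {
	var diags Diagnostics
	if name == "" {
		diags = diags.Append(Diagnostic{
			Severity: Error,
			Summary:  "Invalid name",
			Detail:   "A name must not be empty.",
		})
	}
	return diags
}

// checkAll visits every item and keeps going after an error, so that all
// of the problems are reported together in a single result.
func checkAll(names []string) Diagnostics {
	var diags Diagnostics
	for _, name := range names {
		moreDiags := checkName(name)
		diags = diags.Append(moreDiags...)
	}
	return diags // returned unconditionally, as in the main pattern
}

func main() {
	diags := checkAll([]string{"a", "", "b", ""})
	fmt.Println(len(diags), diags.HasErrors())
}
```

Continuing after an error is reasonable here because each item is independent; if later iterations depended on earlier ones, returning early would likely produce clearer output.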
## Information in a Diagnostic

The general model of `tfdiags.Diagnostic` has the following parts, though not
all implementations of the interface make use of all of them:

- Severity: either `tfdiags.Error` or `tfdiags.Warning`.
- Description: the main human-readable text describing the problem. This
  has the following fields:

  - Summary: A short, terse description of the general type of problem
    that has occurred.
  - Detail: A longer description of the problem, sometimes including multiple
    paragraphs of information.
  - Address: The address of some object that the error relates to, which
    is most often a resource instance address.

  OpenTofu does not currently have a localized UI, so built-in diagnostics
  always have their summary and detail written in US English. There's more
  subjective guidance about the content of these fields in sections below.
- Source location information: optional references to parts of the configuration
  that the problem relates to. This has the following fields:

  - Subject: source range for the part of the configuration that caused the
    problem or that the problem is directly about.
  - Context: optional source range of a larger section of configuration that
    might make the cause of the problem easier to quickly understand if
    included in the diagnostic message. The Context source range must always
    contain the Subject source range within it.

  The UI uses the context and subject together to display a source code
  snippet. The lines of code included in the snippet cover both the context
  and the subject, and then the subject itself is rendered with an underline
  if we're rendering into a terminal that supports that style.

  We don't use "context" very often, but it can be useful if the problem
  we're describing is that just one part of a larger source element is
  problematic. For example, if one of the operands to the `+` operator
  isn't a number then that operand would be the "subject" but the entire
  addition operation could be returned as "context", so that both of the
  operands and the `+` symbol will definitely be included in the rendered
  diagnostic too.
- Expression-related information: optional information about an expression whose
  evaluation caused the problem. This has the following fields:

  - Expression: The `hcl.Expression` representing the expression itself.
  - EvalContext: The `hcl.EvalContext` that the expression was being evaluated
    in.

  The diagnostic renderer for the UI uses this information, when available,
  to offer some extra hints about the values of any symbols that were used
  in the expression, because it's often the dynamic values that cause a
  problem, rather than the syntax used to obtain them.
- Extra info: this is a rather underspecified collection of assorted other
  information that's only relevant in very specific contexts. Refer to the
  `tfdiags` package documentation for more information.

  There's _some_ guidance on this later in this document, but it's focused
  only on a few main cases.

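The rule that the Context range must contain the Subject range can be sketched as follows. The `Range` type here is a simplified stand-in that uses byte offsets only, not the real `hcl.Range`, and `snippetRange` is a hypothetical helper. Falling back to the subject when the context is missing or invalid is an assumption made for this example, not a description of the actual renderer.

```go
package main

import "fmt"

// Range is a simplified stand-in for a source range, using byte offsets only.
type Range struct {
	Filename   string
	Start, End int // half-open byte range [Start, End)
}

// Contains reports whether other lies entirely within r in the same file.
func (r Range) Contains(other Range) bool {
	return r.Filename == other.Filename &&
		r.Start <= other.Start && other.End <= r.End
}

// snippetRange returns the range a source snippet should cover: the context
// when it is present and contains the subject, or just the subject otherwise.
func snippetRange(subject Range, context *Range) Range {
	if context != nil && context.Contains(subject) {
		return *context
	}
	return subject
}

func main() {
	// For an expression like `1 + "a"` starting at byte 10, the operand "a"
	// is the subject and the whole addition is the context.
	subject := Range{"main.tf", 14, 17}
	context := Range{"main.tf", 10, 17}
	fmt.Println(snippetRange(subject, &context))
}
```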
## Diagnostic Description Writing Style

Although there is some variation in diagnostic writing style, particularly in
parts of the system like state storage backends which were originally written by
third parties, most of the _built-in_ diagnostics follow a relatively consistent
writing style that is in turn based on the writing style used by HCL itself in
its own diagnostics, because HCL and OpenTofu diagnostics often mix together
in the same set of problems.

The "summary" should typically be a very short and concise description of
what was wrong and what was wrong about it. Our summaries typically don't
include any user-chosen information such as symbol names, because that means a
particular kind of problem is always described using the same text and so
readers can become familiar enough with the summaries of problems they see
frequently to skip reading the rest of the diagnostic when skimming.

The following are some real examples of summaries currently used across both
HCL and OpenTofu:

- Unsupported operator
- Duplicate argument
- Invalid index
- Unexpected end of template
- Invalid template interpolation value
- Invalid default value for variable
- Required variable not set
- Invalid "count" attribute

The "detail" text is where we tend to put most of the information, and so
there's a lot more variation here, but ideally a good diagnostic detail
should mention the following information, usually in the following order:

- What was wrong and what was wrong about it: similar to the summary but this
  time including information about specifically what was wrong, such as the
  name of the input variable whose default value was invalid.
- Why the situation is problematic, if knowing that relies on some
  characteristic of OpenTofu's design that might not be obvious to a newcomer.
- What should be done to fix it, or (if it's unclear what the author's intention
  was) a question-sentence that implies a _possible_ solution, often starting
  with the words "Did you mean" and ending with a question mark.

While the summary message is often terse and uses only minimal punctuation,
the detail message should always be written in full sentences including
end-of-sentence punctuation (`.`, `?`). If "what was wrong about it" is
coming from the string representation of an `error` value, we typically
present it with a prefix ending with a colon and then append a period `.`
after the error string, and format the error itself using `tfdiags.FormatError`,
like this:

```go
Detail: fmt.Sprintf("Unsuitable value for thingy: %s.", tfdiags.FormatError(err))
```

If the second and third items in the above take more than a few words, it's
helpful to split them into their own paragraphs for easier scanning. When
writing multiple paragraphs in a detail message they should be separated by
`\n\n` -- two newline characters.

In many cases our diagnostics only include a subset of this information, either
because the reason why it's problematic is relatively clear or because we don't
have any specific suggestion for how to solve the problem. The following
is an example of a real diagnostic message from OpenTofu, at the time of writing
this documentation, which includes all of these parts:

```
Error: Invalid for_each argument

The "for_each" map includes keys derived from resource attributes that cannot
be determined until apply, and so OpenTofu cannot determine the full set of keys
that will identify the instances of this resource.

When working with unknown values in for_each, it's better to define the map keys
statically in your configuration and place apply-time results only in the map
values.

Alternatively, you could use the planning option -exclude=aws_instance.example
to first apply without this object, and then apply normally to converge.
```

The text immediately after "Error:" above is the summary for this diagnostic.
The paragraphs that follow are all a single "detail" string.

That was a particularly extreme diagnostic message with lots of information to
communicate. Most diagnostics are not so complicated; the following is an
example with less information to communicate:

```
Error: Invalid value for input variable

The given value is not suitable for var.example declared
at example.tf:12,1: a string is required.
```

This example also illustrates a situation where there are two different source
locations that could be relevant: the input variable's declaration or the
expression that's used to define its value. Because this message is talking
about a problem with the _value_, the diagnostic should have the source
"Subject" set to the expression that defined it, but it also mentions the
location of the declaration as part of the detail text as some additional
context.

Some notes about other specific situations that sometimes arise:

- If a diagnostic message includes a suggestion for a shell command to run
  or a URL to visit for more information, use a paragraph that ends with a
  colon, followed by a single newline, four spaces for indentation, and then the
  command or URL:

  ```
  To view the root module output values, run:
      tofu output
  ```

  The goal of this formatting is to make it very clear what part of the
  message is intended to be copied and used elsewhere, by placing it on a
  line of its own without any surrounding punctuation. The indented text
  should ideally be formatted so that the user can copy it _verbatim_ into
  whatever place it will be used.

  The diagnostic renderer also has a special case where it will not try to
  word-wrap a line that begins with spaces, and so this layout has the
  useful side-effect of avoiding introducing extra newline characters into
  a command line that is intended to be copied.

- There are some terminology choices we use to refer to some OpenTofu-specific
  ideas and concepts that disagree slightly with terminology used in the code.
  These differences are the result of learning from feedback from folks who
  had been confused by the original terminology, even though the code still
  often uses the original terminology:

  - Instead of referring to "unknown values" or "computed values" we say that
    values are "known after apply" or "cannot be determined until apply".
  - In HCL the word "variable" means anything that's available to refer to
    in the current evaluation context, which is confusing because OpenTofu
    itself uses that word to refer only to input variables.

    Sometimes messages are generated by HCL itself and so it's unavoidably
    confusing, but when we're generating messages _inside OpenTofu_ we
    use the two words "input variable" to refer to an input variable,
    and "symbol" or "object" (depending on whether we're talking about
    the name itself or what the name refers to) as the general word for
    something you can refer to in an expression.
  - For consistency with our use of "input variable" to distinguish from
    HCL's more general meaning of "variable", we also tend to write
    "local value" and "output value" when referring to those concepts, rather
    than using the shorthands "locals" and "outputs".
  - HCL distinguishes between "attributes" meaning the named keys inside an
    object type, and "arguments" meaning the names used for individual
    settings inside a configuration block.

    OpenTofu itself uses those words a little more interchangeably because
    in _many_ cases the configuration arguments in a block directly
    correspond to the attributes of an object created by evaluating that
    block.

    However, if a particular error message is talking about a configuration
    setting inside a block it's better to use "argument" rather than
    "attribute" because that's then consistent with error messages that
    HCL itself might generate.

    Go uses the term "field" to describe an element of a struct type, and
    JavaScript and JSON use the word "property" to describe an element of
    an object type. We don't use either of those words in OpenTofu: the
    elements of an object are its _attributes_, and the settings available
    in a configuration block are its _arguments_. The string values that
    identify elements of a map are called "keys".
  - The `cty` terminology "marks" or "value marks" refers to an implementation
    detail that should never be mentioned directly in an error message.

    Instead, we use specific terminology related to what each mark type
    is representing: "sensitive values", "ephemeral values", etc.
  - `aws_instance` is an example of a "resource _type_", not of a "resource",
    even though the provider protocol uses the single noun "resource" to refer
    to both ideas.

    A "resource" is what's declared by a `resource`, `data`, or `ephemeral`
    block. A "resource _instance_" is what such a block can declare zero
    or more of, when using the `count`, `for_each`, or `enabled` arguments.
  - Although there are certainly some historical diagnostic messages that
    predate this adjustment of terminology, new error messages should use
    "managed resource" to refer to the kind of resource that's declared
    using a `resource` block, "data resource" for `data` blocks, and
    "ephemeral resource" for an `ephemeral` block.

    In the code we refer to these three as "resource _modes_", but that is
    internal terminology that should never appear in a diagnostic message.
- When a file or directory path appears as part of a diagnostic message, it
  should typically be presented relative to the current working directory and
  should use the syntax conventions of the platform where OpenTofu is running.

  In particular, we return paths using backslashes as the separator when we
  are running on Windows, but normal slashes otherwise. Using the Go
  `filepath` package is a good way to get this right, though you might need
  to add some complexity to your tests to make them pass on all platforms.
- If an error message is describing a "should never happen" case, we typically
  end the detail string with the sentence "This is a bug in OpenTofu.". This
  hopefully prompts the reader that this wasn't directly caused by something
  they did, and so they should probably open a bug report in the
  OpenTofu repository instead of just trying to solve it themselves.

  For this kind of error message we often relax our preference against
  mentioning implementation details in the error message, because the most
  likely next step is for the user to copy-paste the entire message into their
  bug report text and so the final reader of the message is OpenTofu
  maintainers rather than OpenTofu users.

  For example, it can be okay to use internal terminology like "cty marks" and
  use the `GoString` representations of values in a "This is a bug in
  OpenTofu" detail message, if that's the most concise way to capture the
  information the OpenTofu maintainers would need to debug the problem.

## Diagnostics caused by unknown or sensitive values

When a diagnostic has expression information associated with it, the diagnostic
renderer for the UI includes some additional information about the values
that were in scope, like this:

```
var.greeting is "Hello"
var.items is list of string with 5 elements
```

By default, this renderer will not mention any symbol which refers to an unknown
or sensitive value. That was not historically true: originally, this could
say something like "var.example is a string, known only after apply".

Those who are less familiar with these concepts often misunderstood the
"known only after apply" part of the message as being _the problem itself_,
rather than just context to help diagnose the problem, and so the UI no longer
mentions "unknown-ness" or "sensitive-ness" in most cases.

However, there are some diagnostic messages that _are_ directly caused by the
presence of an unknown or sensitive value, in which case it's helpful to
mention that in the summary of values that were in scope.

To allow for this, we set the "extra info" field of a diagnostic to contain
an implementation of one of the following interfaces:

- [`tfdiags.DiagnosticExtraBecauseUnknown`](https://pkg.go.dev/github.com/opentofu/opentofu/internal/tfdiags#DiagnosticExtraBecauseUnknown)
  for a problem that's caused by an unknown value.

  (Remember that the _text_ of the error message should refer to this as "known
  only after apply", or similar.)
- [`tfdiags.DiagnosticExtraBecauseSensitive`](https://pkg.go.dev/github.com/opentofu/opentofu/internal/tfdiags#DiagnosticExtraBecauseSensitive)
  for situations where a sensitive value was used in a location that OpenTofu
  cannot permit it, such as in the instance key of a resource instance.

These extra markers should be used only when mentioning the unknown or sensitive
values in the diagnostic message is likely to help with debugging a problem.
If the problem is not directly caused by unknown or sensitive values then
neither of these should be used, to avoid creating a distracting
[red herring](https://en.wikipedia.org/wiki/Red_herring) for the reader.

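The following sketch illustrates the idea of an opt-in marker carried in the "extra info" field. Everything in it is a stand-in: the `becauseUnknown` interface, its `DiagnosticCausedByUnknown` method name, the `diagnostic` struct and the `describeValue` helper are assumptions made for illustration, so check the `tfdiags` package documentation for the real interface definitions.

```go
package main

import "fmt"

// becauseUnknown is a local stand-in for the idea behind
// tfdiags.DiagnosticExtraBecauseUnknown. The method name is an assumption.
type becauseUnknown interface {
	DiagnosticCausedByUnknown() bool
}

// diagnostic is a simplified stand-in with only the fields this sketch needs.
type diagnostic struct {
	Summary string
	Extra   any
}

// causedByUnknown is a hypothetical marker value for the Extra field.
type causedByUnknown bool

func (c causedByUnknown) DiagnosticCausedByUnknown() bool { return bool(c) }

// describeValue returns the line a renderer might show for a symbol, or an
// empty string when the symbol should not be mentioned at all.
func describeValue(d diagnostic, name, typeName string, known bool, value string) string {
	if known {
		return fmt.Sprintf("%s is %s", name, value)
	}
	if extra, ok := d.Extra.(becauseUnknown); ok && extra.DiagnosticCausedByUnknown() {
		return fmt.Sprintf("%s is a %s, known only after apply", name, typeName)
	}
	// Unknown values are omitted unless they directly caused the problem.
	return ""
}

func main() {
	plain := diagnostic{Summary: "Invalid index"}
	marked := diagnostic{Summary: "Invalid for_each argument", Extra: causedByUnknown(true)}

	fmt.Printf("%q\n", describeValue(plain, "var.example", "string", false, ""))
	fmt.Printf("%q\n", describeValue(marked, "var.example", "string", false, ""))
}
```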
## Consolidation of Diagnostics

The UI layer has some special rules for finding sets of similar diagnostics
and showing them as just a single diagnostic referring to the first example
of a problem, with a short extra note about how many other similar diagnostics
there are.

```
(and 2 similar warnings elsewhere)
```

The main implementation of this behavior is in
[`tfdiags.Diagnostics.Consolidate`](https://pkg.go.dev/github.com/opentofu/opentofu/internal/tfdiags#Diagnostics.Consolidate),
but we allow end-users to customize (using command line options) whether this
consolidation applies to errors or warnings separately. By default, we
consolidate only warnings.

For a severity that is subject to consolidation, the main behavior is to group
together diagnostics that have the same "summary" text, and this is part of
why we tend to use terse, fixed strings in the summary field.

There are two extra mechanisms for customizing this behavior for specific
diagnostic messages:

- If the "extra info" of a diagnostic contains an implementation of
  [`tfdiags.DiagnosticExtraDoNotConsolidate`](https://pkg.go.dev/github.com/opentofu/opentofu/internal/tfdiags#DiagnosticExtraDoNotConsolidate)
  then that diagnostic is not eligible for consolidation at all, regardless
  of how similar it might be to other diagnostics in the same set.
- If the "extra info" of a diagnostic contains an implementation of
  [`tfdiags.Keyable`](https://pkg.go.dev/github.com/opentofu/opentofu/internal/tfdiags#Keyable)
  then the string returned by its `ExtraInfoKey` method is used _in addition to_
  the summary text for deciding what to consolidate.

  For example, if there were three warnings with the same summary text but
  two of them have the same `ExtraInfoKey` and the third has a different
  one then only the first two would be able to consolidate.

  The `ExtraInfoKey` is an internal key used for comparison only and is never
  exposed in the UI, so it can be set to whatever makes sense to define
  separate consolidation groups for diagnostics with a specific summary.
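The grouping rules described above can be approximated with the following sketch. It is a deliberate simplification and not the real `Consolidate` implementation: the `warning` and `group` types are hypothetical, the `Key` field stands in for the string an `ExtraInfoKey` method would return, and the `DoNotConsolidate` field stands in for the opt-out marker.

```go
package main

import "fmt"

// warning is a simplified stand-in for a warning diagnostic.
type warning struct {
	Summary          string
	Key              string // stand-in for an ExtraInfoKey result; "" if none
	DoNotConsolidate bool   // stand-in for the opt-out extra info marker
}

// group is the first diagnostic of its kind plus a count of similar ones.
type group struct {
	First   warning
	Similar int
}

// consolidate groups warnings that share both summary and key, preserving
// the order in which each group first appeared.
func consolidate(ws []warning) []group {
	type groupKey struct{ summary, key string }
	index := make(map[groupKey]int)
	var groups []group
	for _, w := range ws {
		if w.DoNotConsolidate {
			// Always shown on its own, however similar it is to others.
			groups = append(groups, group{First: w})
			continue
		}
		k := groupKey{w.Summary, w.Key}
		if i, ok := index[k]; ok {
			groups[i].Similar++
			continue
		}
		index[k] = len(groups)
		groups = append(groups, group{First: w})
	}
	return groups
}

func main() {
	groups := consolidate([]warning{
		{Summary: "Deprecated argument", Key: "a"},
		{Summary: "Deprecated argument", Key: "a"},
		{Summary: "Deprecated argument", Key: "b"},
	})
	for _, g := range groups {
		fmt.Printf("%s (key %q): %d similar elsewhere\n", g.First.Summary, g.First.Key, g.Similar)
	}
}
```

With the three warnings in `main`, the two sharing key `"a"` form one group and the warning with key `"b"` stays separate, matching the example given in the list above.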