OpenTelemetry has various Go packages split across several Go modules that
often need to be carefully upgraded together. And in particular, we are
using the "semconv" package in conjunction with the OpenTelemetry SDK's
"resource" package in a way that requires that they both agree on which
version of the OpenTelemetry Semantic Conventions are being followed.
To help avoid "dependency hell" situations when upgrading, this centralizes
all of our direct calls into the OpenTelemetry SDK and tracing API into
packages under internal/tracing, by exposing a few thin wrapper functions
that other packages can use to access the same functionality indirectly.
We only use a relatively small subset of the OpenTelemetry library surface
area, so we don't need too many of these reexports and they should not
represent a significant additional maintenance burden.
For the semconv and resource interaction in particular this also factors
that out into a separate helper function with a unit test, so we should
notice quickly whenever they become misaligned. This complements the
end-to-end test previously added in opentofu/opentofu#3447 to give us
faster feedback about this particular problem, while the end-to-end test
has the broader scope of making sure there aren't any errors at all when
initializing OpenTelemetry tracing.
Finally, this also replaces the constants we previously had in package
traceaddrs with functions that return attribute.KeyValue values directly.
This matches the API style used by the OpenTelemetry semconv packages, and
makes the calls to these helpers from elsewhere in the system a little
more concise.
Signed-off-by: Martin Atkins <mart@degeneration.co.uk>
This completes some of the missing connections for contexts in the provider
source codepaths by introducing context.Context parameters and wiring them
through so we can eliminate a few more context.TODO() placeholders.
For consistency's sake this adds context.Context to all four of the
getproviders.Source implementations that directly interact with stuff
outside of OpenTofu (network services or filesystem), even though not
all of them currently make use of it, just because interactions with
outside stuff tends to encourage cross-cutting concerns like logging and
tracing and so this ensures we have contexts propagated in there for such
future uses.
Signed-off-by: Martin Atkins <mart@degeneration.co.uk>
Previously we were using a third-party library, but that doesn't have any
support for passing context.Context through its API and so isn't suitable
for our goals of adding OpenTelemetry tracing for all outgoing network
requests.
We now have our own fork that is updated to use context.Context. It also
has a slightly reduced scope no longer including various details that
are tightly-coupled to our cliconfig mechanism and so better placed in the
main OpenTofu codebase so we can evolve it in future without making
lockstep library releases.
The "registry-address" library also uses svchost and uses some of its types
in its public API, so this also incorporates v2 of that library that is
updated to use our own svchost module.
Unfortunately this commit is a mix of mechanical updates to the new
libraries and some new code dealing with the functionality that is removed
in our fork of svchost. The new code is primarily in the "svcauthconfig"
package, which is similar in purpose "ociauthconfig" but for OpenTofu's
own auth mechanism instead of the OCI Distribution protocol's auth
mechanism.
This includes some additional plumbing of context.Context where it was
possible to do so without broad changes to files that would not otherwise
have been included in this commit, but there are a few leftover spots that
are context.TODO() which we'll address separately in later commits.
This removes the temporary workaround from d079da6e9e, since we are now
able to plumb the OpenTelemetry span tree all the way to the service
discovery requests.
Signed-off-by: Martin Atkins <mart@degeneration.co.uk>
The primary reason for this change is that registry.NewClient was
originally imposing its own decision about service discovery request
policy on every other user of the shared disco.Disco object by modifying
it directly.
We have been moving towards using a dependency inversion style where
package main is responsible for deciding how everything should be
configured based on global CLI arguments, environment variables, and the
CLI configuration, and so this commit moves to using that model for the
HTTP clients used by the module and provider registry client code.
This also makes explicit what was previously hidden away: that all service
discovery requests are made using the same HTTP client policy as for
requests to module registries, even if the service being discovered is not
a registry. This doesn't seem to have been the intention of the code as
previously written, but was still its ultimate effect: there is only one
disco.Disco object shared across all discovery callers and so changing its
configuration in any way changes it for everyone.
This initial rework is certainly not perfect: these components were not
originally designed to work in this way and there are lots of existing
test cases relying on them working the old way, and so this is a compromise
to get the behavior we now need (using consistent HTTP client settings
across all callers) without disrupting too much existing code.
Signed-off-by: Martin Atkins <mart@degeneration.co.uk>
When context.Context was new, APIs using it arrived sporadically and so
the Go team introduced context.TODO() as an explicit way to say "I need a
context but I don't yet have a useful one to provide".
It took quite a while for there to be an established pattern for contexts
in tests, but now there is finally testing.T.Context which returns a
context that gets cancelled once the test is complete, and so that's a good
parent context to use for all contexts belonging to a test case.
This commit therefore mechanically replaces every use of context.TODO in
our test cases throughout the codebase with a call to t.Context instead.
There were a small number of tests that were using a mixture of
context.TODO and context.Background as placeholders and so those are also
updated to use t.Context consistently. There are probably still some
remaining uses of context.Background in our tests, but we'll save those
for another day.
As of this commit there are still various uses of context.TODO left in
_non-test_ code, but we need to take more care in how we update those so
those are intentionally excluded here.
Signed-off-by: Martin Atkins <mart@degeneration.co.uk>
This allows removing the last remaining context.TODO() from this package,
and allows incoming tracing spans to reach the underlying provider
installation operations in all cases.
Signed-off-by: Martin Atkins <mart@degeneration.co.uk>
As discussed in opentofu/opentofu#2656, this consolidates the two concerns
of the PackageAuthentication interface into a single function that deals
both with package authentication _and_ with reporting all of the package
hashes that were used to make the authentication decision.
This means that any .zip archive that OpenTofu directly verifies during
installation can now have its hash recorded in the dependency lock file
even if that package didn't come from the provider's origin registry, which
is beneficial when the first installation of a provider comes from a
secondary ("mirror") source because it creates an additional hook by which
that dependency lock file entry can be "upgraded" to be complete in a
future "tofu init" run against the origin registry, or by the
"tofu providers lock" command.
Signed-off-by: Martin Atkins <mart@degeneration.co.uk>
Previously we treated PackageLocation only as pure data, describing a
location from which something could be retrieved. Unexported functions in
package providercache then treated PackageLocation values as a closed
union dispatched using a type switch.
That strategy has been okay for our relatively-generic package locations
so far, but in a future commit we intend to add a new kind of package
location referring to a manifest in an OCI distribution repository, and
installing from _that_ will require a nontrivial amount of
OCI-distribution-specific logic that will be more tightly coupled to the
getproviders.Source that will return such locations, and so we're switching
to a model where _the location object itself_ is responsible for knowing
how to install a provider package from its location, as a method of
PackageLocation.
The implementation of installing from each location type now moves from
package providercache to package getproviders, which is arguably a better
thematic home for that functionality anyway.
For now these remain tested only indirectly through the tests in package
providercache, since we didn't previously have any independent tests for
these unexported functions. We might want to add more tightly-scoped unit
tests for these to package getproviders in future, but for now this is not
materially worse than it was before.
Signed-off-by: Martin Atkins <mart@degeneration.co.uk>
This is another step towards breaking the huge functions in this package
into smaller parts that have a clearer set of inputs and outputs.
For the moment the goal is to modify the existing code as little as
possible to make this easier to review, and so the new function
tryInstallPackageFromCacheDir has an unfortunately-large number of
arguments. Future refactoring can hopefully improve on this further.
One significant change to the structure of this code is that because it's
now in a separate function working on only one provider at a time we can
rely on early return for error handling, letting the caller be responsible
for collecting any errors into the "errs" map, and so we don't need quite
as much nesting as the previous code had.
This should not change the observable behavior in any way, which is
reinforced by there being no changes to any tests in this commit.
Signed-off-by: Martin Atkins <mart@degeneration.co.uk>
This function has grown very large over time as the provider installation
requirements got more complicated. This is a first level of decomposition
of the three main steps into one separate function each.
The "ensureProviderVersionsInstall" method remains too large itself, but
for now that has just acquired a nolint directive so that we can approach
this gradually in the interests of making it easier to review.
This should not change the observable behavior of the provider installer
in any way, which is reinforced by the fact that there are no test changes
in this commit.
Signed-off-by: Martin Atkins <mart@degeneration.co.uk>
* Rename module name from "github.com/hashicorp/terraform" to "github.com/placeholderplaceholderplaceholder/opentf".
Signed-off-by: Jakub Martin <kubam@spacelift.io>
* Gofmt.
Signed-off-by: Jakub Martin <kubam@spacelift.io>
* Regenerate protobuf.
Signed-off-by: Jakub Martin <kubam@spacelift.io>
* Fix comments.
Signed-off-by: Jakub Martin <kubam@spacelift.io>
* Undo issue and pull request link changes.
Signed-off-by: Jakub Martin <kubam@spacelift.io>
* Undo comment changes.
Signed-off-by: Jakub Martin <kubam@spacelift.io>
* Fix comment.
Signed-off-by: Jakub Martin <kubam@spacelift.io>
* Undo some link changes.
Signed-off-by: Jakub Martin <kubam@spacelift.io>
* make generate && make protobuf
Signed-off-by: Jakub Martin <kubam@spacelift.io>
---------
Signed-off-by: Jakub Martin <kubam@spacelift.io>
Currently Terraform will use an entry from the global plugin cache only if
it matches a checksum already recorded in the dependency lock file. This
allows Terraform to produce a complete lock file entry on the first
encounter with a new provider, whereas using the cache in that case would
cause the lock file to only cover the single package in the cache and
thereefore be unusable on any other operating system or CPU architecture.
This temporary CLI config option is a pragmatic exception to support those
who cannot currently correctly use the dependency lock file but who still
want to benefit from the plugin cache. With this setting enabled,
Terraform has permission to produce a dependency lock file that is only
suitable for the current system if that would allow use of an existing
entry in the plugin cache.
We are introducing this option to resolve a conflict between the needs of
folks who are using the dependency lock file as expected and the needs of
folks who cannot use the dependency lock file for some reason. The hope
then is to give respite to those who need this exception in the meantime
while we understand better why they cannot use the dependency lock file
and improve its design so that everyone will be able to use it
successfully in a future version of Terraform. This option will become a
silent no-op in a future version of Terraform, once the dependency lock
file behavior is sufficient for all supported Terraform development
workflows.
When we originally introduced the trust-on-first-use checksum locking
mechanism in v0.14, we had to make some tricky decisions about how it
should interact with the pre-existing optional read-through global cache
of provider packages:
The global cache essentially conflicts with the checksum locking because
if the needed provider is already in the cache then Terraform skips
installing the provider from upstream and therefore misses the opportunity
to capture the signed checksums published by the provider developer. We
can't use the signed checksums to verify a cache entry because the origin
registry protocol is still using the legacy ziphash scheme and that is
only usable for the original zipped provider packages and not for the
unpacked-layout cache directory. Therefore we decided to prioritize the
existing cache directory behavior at the expense of the lock file behavior,
making Terraform produce an incomplete lock file in that case.
Now that we've had some real-world experience with the lock file mechanism,
we can see that the chosen compromise was not ideal because it causes
"terraform init" to behave significantly differently in its lock file
update behavior depending on whether or not a particular provider is
already cached. By robbing Terraform of its opportunity to fetch the
official checksums, Terraform must generate a lock file that is inherently
non-portable, which is problematic for any team which works with the same
Terraform configuration on multiple different platforms.
This change addresses that problem by essentially flipping the decision so
that we'll prioritize the lock file behavior over the provider cache
behavior. Now a global cache entry is eligible for use if and only if the
lock file already contains a checksum that matches the cache entry. This
means that the first time a particular configuration sees a new provider
it will always be fetched from the configured installation source
(typically the origin registry) and record the checksums from that source.
On subsequent installs of the same provider version already locked,
Terraform will then consider the cache entry to be eligible and skip
re-downloading the same package.
This intentionally makes the global cache mechanism subordinate to the
lock file mechanism: the lock file must be populated in order for the
global cache to be effective. For those who have many separate
configurations which all refer to the same provider version, they will
need to re-download the provider once for each configuration in order to
gather the information needed to populate the lock file, whereas before
they would have only downloaded it for the _first_ configuration using
that provider.
This should therefore remove the most significant cause of folks ending
up with incomplete lock files that don't work for colleagues using other
platforms, and the expense of bypassing the cache for the first use of
each new package with each new configuration. This tradeoff seems
reasonable because otherwise such users would inevitably need to run
"terraform providers lock" separately anyway, and that command _always_
bypasses the cache. Although this change does decrease the hit rate of the
cache, if we subtract the never-cached downloads caused by
"terraform providers lock" then this is a net benefit overall, and does
the right thing by default without the need to run a separate command.