Following the lead of similar earlier work on testing the installation of
provider packages from OCI repositories, this new test exercises the new
OCI-based module source address syntax in an end-to-end fashion by directly
running "tofu init".
For the reasons described inline, this test uses a local test server as its
target OCI Registry and therefore needs to rely on a Go standard library
feature for overriding the trusted TLS certs, which only works on Unix
systems other than macOS. As a result, this test will only run when the
e2etest suite is run on Linux systems. This is the same compromise we
previously made for the provider installation flavor of this test, with
the same assumption that our module installer isn't doing anything
particularly platform-specific and that we're doing this in e2etest only
because that's an effective way to test that "package main" is wiring all
of the internal components together correctly.
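As a rough illustration of the trust override described above, here is a
minimal sketch rather than the actual test code. It assumes the standard
library feature in question is the SSL_CERT_FILE environment variable that
crypto/x509 honors on Unix systems other than macOS. The function name
startTrustedTestServer is hypothetical.

```go
package main

import (
	"encoding/pem"
	"net/http"
	"net/http/httptest"
	"os"
)

// startTrustedTestServer (hypothetical helper) starts a local TLS test
// server and writes its self-signed certificate to a temporary PEM file.
// It returns the server and an environment entry of the form
// "SSL_CERT_FILE=<path>" that a child process can use so that Go's
// crypto/x509 loads that file as its certificate file. This override is
// assumed to be the feature the message refers to, and it is not honored
// on macOS or Windows.
func startTrustedTestServer(handler http.Handler) (*httptest.Server, string, error) {
	server := httptest.NewTLSServer(handler)

	// Encode the server's certificate in the PEM format that
	// SSL_CERT_FILE expects.
	certPEM := pem.EncodeToMemory(&pem.Block{
		Type:  "CERTIFICATE",
		Bytes: server.Certificate().Raw,
	})

	f, err := os.CreateTemp("", "testserver-*.crt")
	if err != nil {
		server.Close()
		return nil, "", err
	}
	defer f.Close()

	if _, err := f.Write(certPEM); err != nil {
		server.Close()
		return nil, "", err
	}
	return server, "SSL_CERT_FILE=" + f.Name(), nil
}
```

A caller would append the returned entry to the environment of the child
"tofu init" process, for example via the Env field of an exec.Cmd, and
point the OCI source address at the host and port from server.URL.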
Signed-off-by: Martin Atkins <mart@degeneration.co.uk>
Most of the OCI registry interactions are unit tested in the most relevant
packages, but the overall system will only work correctly if all of the
components are correctly wired together by "package main", and that's one
part of the system that needs to be tested concretely rather than via
test doubles.
Therefore this adds an end-to-end test in our existing e2etest package
that runs "tofu init" with a CLI configuration that forces using an OCI
mirror with a TLS server provided locally by our test program. It exercises
the main happy path of provider installation in the same way that an
end-user would interact with it, to help avoid accidentally regressing
the interactions between these packages in future versions.
Unfortunately the technique this test uses to force the OpenTofu CLI
binary to trust the test server doesn't work on macOS or Windows and so
for now this test is Linux-specific. That's certainly non-ideal, but
pragmatic since we'll be relying mainly on the platform-agnostic unit tests
to cover this behavior, and we're unlikely to ever stop running the
e2etests on Linux as part of our pull request checks so even those
developing on macOS or Windows can still notice if this test becomes
broken before merging a change.
Signed-off-by: Martin Atkins <mart@degeneration.co.uk>
When we originally introduced the trust-on-first-use checksum locking
mechanism in v0.14, we had to make some tricky decisions about how it
should interact with the pre-existing optional read-through global cache
of provider packages:
The global cache essentially conflicts with the checksum locking because
if the needed provider is already in the cache then Terraform skips
installing the provider from upstream and therefore misses the opportunity
to capture the signed checksums published by the provider developer. We
can't use the signed checksums to verify a cache entry because the origin
registry protocol is still using the legacy ziphash scheme and that is
only usable for the original zipped provider packages and not for the
unpacked-layout cache directory. Therefore we decided to prioritize the
existing cache directory behavior at the expense of the lock file behavior,
making Terraform produce an incomplete lock file in that case.
Now that we've had some real-world experience with the lock file mechanism,
we can see that the chosen compromise was not ideal because it causes
"terraform init" to behave significantly differently in its lock file
update behavior depending on whether or not a particular provider is
already cached. Because a cache hit robs Terraform of its opportunity to
fetch the official checksums, Terraform must generate a lock file that is
inherently non-portable, which is problematic for any team that works with
the same Terraform configuration on multiple different platforms.
This change addresses that problem by essentially flipping the decision so
that we'll prioritize the lock file behavior over the provider cache
behavior. Now a global cache entry is eligible for use if and only if the
lock file already contains a checksum that matches the cache entry. This
means that the first time a particular configuration sees a new provider
it will always be fetched from the configured installation source
(typically the origin registry) and record the checksums from that source.
On subsequent installs of a provider version that is already locked,
Terraform will then consider the cache entry to be eligible and skip
re-downloading the same package.
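The eligibility rule can be sketched roughly as follows. This is an
illustration under stated assumptions, not the installer's real code: the
Hash type and the cacheEntryEligible function are hypothetical names, and
checksums are treated as opaque strings.

```go
package main

// Hash is a hypothetical stand-in for a provider package checksum as it
// would be recorded in the lock file. It is treated as an opaque string.
type Hash string

// cacheEntryEligible reports whether a global cache entry may be used in
// place of a download from the configured installation source. Following
// the rule described above, the entry is eligible if and only if the lock
// file already contains a checksum that matches the cache entry.
func cacheEntryEligible(lockedHashes []Hash, cacheEntryHash Hash) bool {
	for _, locked := range lockedHashes {
		if locked == cacheEntryHash {
			return true
		}
	}
	// Nothing in the lock file matches, including the case where the
	// provider is not locked at all yet, so the package must be fetched
	// from the installation source to capture its checksums.
	return false
}
```

With an empty lock file the function always returns false, which
corresponds to the first-use case where the cache is bypassed.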
This intentionally makes the global cache mechanism subordinate to the
lock file mechanism: the lock file must be populated in order for the
global cache to be effective. Those who have many separate
configurations that all refer to the same provider version will
need to re-download the provider once for each configuration in order to
gather the information needed to populate the lock file, whereas before
they would have only downloaded it for the _first_ configuration using
that provider.
This should therefore remove the most significant cause of folks ending
up with incomplete lock files that don't work for colleagues using other
platforms, at the expense of bypassing the cache for the first use of
each new package with each new configuration. This tradeoff seems
reasonable because otherwise such users would inevitably need to run
"terraform providers lock" separately anyway, and that command _always_
bypasses the cache. Although this change does decrease the hit rate of the
cache, if we subtract the never-cached downloads caused by
"terraform providers lock" then this is a net benefit overall, and does
the right thing by default without the need to run a separate command.
We have various mechanisms that aim to ensure that the installed provider
plugins are consistent with the lock file and that the lock file is
consistent with the provider requirements, and we do have existing unit
tests for them, but all of those cases mock or fake out at least part of
the process and in the past that's caused us to miss usability
regressions, where we still catch the error but do so at the wrong layer
and thus generate an error message lacking useful additional context.
Here we'll add some new end-to-end tests to supplement the existing unit
tests, making sure things work as expected when we assemble the system
together as we would in a release. These tests cover a number of different
ways in which the plugin selections can grow inconsistent.
These new tests all run only when we're in a context where we're allowed
to access the network, because they exercise the real plugin installer
codepath. We could technically build this to use a local filesystem mirror
or other such override to avoid that, but the point here is to make sure
we see the expected behavior in the main case, and so it's worth the
small additional cost of downloading the null provider from the real
registry.
This is part of a general effort to move all of Terraform's non-library
package surface under internal in order to reinforce that these are for
internal use within Terraform only.
If you were previously importing packages under this prefix into an
external codebase, you could pin to an earlier release tag as an interim
solution until you've made a plan to achieve the same functionality some
other way.