The "publish" job has no dependencies on other jobs, so nothing prevents
it from being accidentally started before the scheduled publication
date. Although publication still requires confirmation via an SSH
connection to a dedicated, locked-down runner, performing that action
prematurely may have drastic consequences. Therefore, it is worth
implementing additional safeguards.
Add an extra check to the "publish" job to ensure it can only be run on
the scheduled publication day. In exceptional circumstances, this check
can be overridden by setting the FORCE_PUBLICATION CI variable to any
non-empty value.
(cherry picked from commit ce977f53b9)
The "merge-tag" and "update-stable-tag" jobs currently use the
"manual_release_job_qa" YAML anchor, which makes them depend on the
"staging" job. Meanwhile, both of these jobs require the tag they were
created for to be public for them to work. While this is harmless, as
these jobs will simply fail if they are run too early, it still makes
sense for them to depend on the "publish" job instead, if only to reduce
confusion in the pipeline view. Adjust the "needs" key for the
"merge-tag" and "update-stable-tag" jobs accordingly.
(cherry picked from commit 722290dce6)
The commit.txt file produced by each Cloudsmith build job is required to
run the corresponding publication job. Therefore, the artifact lifetime
for the former must be long enough to prevent the file from expiring
before the publication job is run. Set the lifetime of the artifacts
created by Cloudsmith build jobs to one month to ensure that the
publication jobs can access them.
(cherry picked from commit ce09f8d0f8)
Setting "artifacts: false" for the dependency on the "publish-private"
job prevents the url-*.txt files produced by that job from being pulled
from GitLab when the jobs that build EVN & -S Cloudsmith packages are
run, effectively breaking the latter. Fix by making these jobs depend
on the artifacts of the "publish-private" job.
(cherry picked from commit b36f17238b)
The "nsec3-delegation" test was added in a release branch, before commit
e40db975d9 introduced the current system test naming convention. Rename
the test to comply with that convention.
Backport of MR !11753
Merge branch 'backport-michal/rename-nsec3-delegation-test-9.20' into 'bind-9.20'
See merge request isc-projects/bind9!11754
The "nsec3-delegation" test was added in a release branch, before commit
e40db975d9 introduced the current system test naming convention. Rename
the test to comply with that convention.
(cherry picked from commit 48bf3d3e65)
Fixed a crash that could occur when running rndc reconfig to change a zone's update policy (e.g., from allow-update to update-policy) while DNS UPDATE requests were being processed for that zone.
ISC would like to thank Vitaly Simonovich for bringing this issue to our attention.
Fixes #5817
Backport of MR !11707
Merge branch 'backport-5817-fix-crash-via-SSU-table-desynchronization-9.20' into 'bind-9.20'
See merge request isc-projects/bind9!11738
Race rndc reconfig (toggling between allow-update and update-policy)
against a stream of DNS UPDATEs for 5 seconds and verify that named
does not crash.
Before the fix, the race between send_update() and update_action()
reading the SSU table independently could trigger an assertion
failure (INSIST) when the zone's update policy changed between the
two reads.
(cherry picked from commit c503b6eee8)
Pass the SSU table through the update event struct from
send_update() to update_action() instead of reading it from the
zone twice. If rndc reconfig changed the zone's update policy
between the two reads (e.g., from allow-update to update-policy),
send_update() would skip the maxbytype allocation but
update_action() would see a non-NULL ssutable, triggering
INSIST(ssutable == NULL || maxbytype != NULL) and crashing named.
The ssutable reference is now taken once in send_update() and
transferred to update_action() via the event struct, ensuring
both functions see the same value.
(cherry picked from commit c172416559)
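The single-read pattern described above can be sketched as follows. This is a minimal illustration, not the BIND 9 code: every type and function name here (`ssutable_t`, `zone_t`, `update_event_t`, `send_update_sketch()`, `update_action_sketch()`) is a hypothetical stand-in, and the reference count is a plain integer rather than a real refcount API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; these are not the BIND 9 types. */
typedef struct ssutable {
	int refs;
} ssutable_t;

typedef struct zone {
	ssutable_t *ssutable; /* may be swapped by a reconfig at any time */
} zone_t;

typedef struct update_event {
	ssutable_t *ssutable;    /* snapshot taken once by the sender */
	unsigned int *maxbytype; /* allocated only when ssutable != NULL */
} update_event_t;

/* Sender: read the zone's table exactly once and stash it in the event. */
void
send_update_sketch(zone_t *zone, update_event_t *ev) {
	ev->ssutable = zone->ssutable;
	ev->maxbytype = NULL;
	if (ev->ssutable != NULL) {
		ev->ssutable->refs++; /* the reference travels with the event */
		ev->maxbytype = calloc(4, sizeof(*ev->maxbytype));
		assert(ev->maxbytype != NULL);
	}
}

/* Receiver: use the snapshot from the event and never re-read the zone. */
bool
update_action_sketch(update_event_t *ev) {
	ssutable_t *ssutable = ev->ssutable;
	/* The invariant that used to fail when the two reads disagreed. */
	bool consistent = (ssutable == NULL || ev->maxbytype != NULL);

	if (ssutable != NULL) {
		ssutable->refs--; /* release the transferred reference */
	}
	free(ev->maxbytype);
	ev->maxbytype = NULL;
	ev->ssutable = NULL;
	return consistent;
}
```

Because the receiver only looks at the event, a policy change on the zone between the two steps cannot make the table pointer and the `maxbytype` allocation disagree.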
Calling `rndc modzone` didn't work properly for a zone that was configured in
the configuration file. It could crash if BIND 9 was built without LMDB or if
there was already an NZF file for the zone. In addition, `rndc modzone` failed
in subsequent attempts. These problems are now fixed.
Closes #5826
Merge branch '5826-fix-modzone-issues-ytatuya' into 'bind-9.20'
See merge request isc-projects/bind9!11743
If a zone is in named.conf, not originally added by rndc addzone,
rndc modzone for that zone succeeds once, but subsequent modzone
attempts fail. This is because do_modzone removes the zone config
from the global or view options, so once the config has been removed
the next attempt fails with 'not found'.
The fix is to ensure re-adding the updated zone config to the
global or view options. This also works as a more complete fix
for the issue 85453d3 attempted to solve, ensuring rndc showzone
shows the latest config: it now works for multiple attempts of
modzone, and with named that is not built with LMDB.
The change in this commit relies on UNCONST in a few places.
That's not clean, but 'add/mod/delzone' generally seems to
need it (for example, delete_zoneconf uses it to modify the list
of zones). In that sense, this change follows the convention
(in the longer term, there may have to be a better API so that we
can modify config options that were once parsed).
This reverts commit 85453d393d.
This commit doesn't seem to be a complete solution of what
it appears to fix: showzone succeeds and shows the modified
config after first modzone, but subsequent attempts of modzone
fail (though not because of the commit being reverted), let
alone showing the correct new config.
Reverting the change for now; a more comprehensive fix will be
provided in the next commit.
If named is built without LMDB and has a zone in named.conf,
then rndc modzone for that zone triggers an assertion failure
unless there's already an NZF file. This is because load_nzf
doesn't create 'nzf_config' when NZF is missing, while a valid
nzf_config is assumed in do_modzone when it tries to add the
modified zone config to add_parser.
The crash is fixed by skipping the call to cfg_parser_mapadd when
nzf_config is NULL. Skipping it should be okay since the config stored
in add_parser would be needed only for subsequently deleting a zone by
rndc delzone when the zone was originally added by rndc addzone, but
in this case the zone was not 'added'. Checking if nzf_config is NULL
before using it also seems to be consistent with other parts of the
implementation.
A helper macro that returns the current value of a pointer and sets
it to NULL in one expression, useful for transferring ownership in
designated initializers.
Backport of MR !11724
Merge branch 'backport-ondrej/TAKE_OWNERSHIP-macro-9.20' into 'bind-9.20'
See merge request isc-projects/bind9!11736
A helper macro that returns the current value of a pointer and sets
it to NULL in one expression, useful for transferring ownership in
designated initializers.
(cherry picked from commit 0f3be0beb8)
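A macro of this kind might look like the sketch below. The definition is an assumption, not the one added by this commit: the name `TAKE_OWNERSHIP_SKETCH`, the `request_t` type, and `request_init_sketch()` are hypothetical, and the macro relies on GNU C statement expressions and `__typeof__`, which GCC and Clang support.

```c
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical definition, not copied from BIND 9: yield the current value
 * of the pointer lvalue and reset that lvalue to NULL in one expression.
 * Uses GNU C statement expressions and __typeof__ (GCC and Clang).
 */
#define TAKE_OWNERSHIP_SKETCH(ptr)            \
	({                                    \
		__typeof__(ptr) tmp_ = (ptr); \
		(ptr) = NULL;                 \
		tmp_;                         \
	})

typedef struct request {
	char *name; /* owned by the request once it is initialized */
} request_t;

/* Move ownership of *namep into a freshly initialized request. */
request_t
request_init_sketch(char **namep) {
	return (request_t){ .name = TAKE_OWNERSHIP_SKETCH(*namep) };
}
```

After the call, the caller's pointer is NULL, so an accidental second free or use through the old variable is easier to spot.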
The usage text still said the default number of NSEC3 iterations is 10,
but the default has been 0 for a while.
Backport of MR !11727
Merge branch 'backport-matthijs-dnssec-signzone-help-nsec3iter-9.20' into 'bind-9.20'
See merge request isc-projects/bind9!11734
dns_view_flushnode() was called in the delete_expired() async
callback, which runs after the query that detected the NTA expiry.
This created a race: the query would proceed with stale cached data
from the NTA period before the flush had a chance to run, resulting
in transient SERVFAIL with EDE 22 (No Reachable Authority).
Skip dns_view_flushnode() in the older branches as the solutions for
older branches are too complicated and this was not a critical bug.
Also simplify the expiry comparison in delete_expired() to a direct
pointer comparison (nta == pval) instead of comparing expiry
timestamps.
Backport of MR !11729
Merge branch 'backport-ondrej/refactor-nta-using-RCU-delete-order-fix-9.20' into 'bind-9.20'
See merge request isc-projects/bind9!11730
When an NTA already exists for a name, the old code retrieved
and reused the existing NTA object, then reset its timer via
settimer(). This is incorrect because isc_timer_start() and
isc_timer_stop() require the timer to be manipulated from its
owning loop (enforced by REQUIRE(timer->loop == isc_loop()) in
lib/isc/timer.c), and the caller may be running on a different
loop than the one that created the original NTA.
Instead, delete the old NTA (shutting down its timer on the
correct loop) and insert a fresh one that is owned by the
current loop.
dns_view_flushnode() was called in the delete_expired() async
callback, which runs after the query that detected the NTA expiry.
This created a race: the query would proceed with stale cached data
from the NTA period before the flush had a chance to run, resulting
in transient SERVFAIL with EDE 22 (No Reachable Authority).
Skip dns_view_flushnode() in the older branches as the solutions for
older branches are too complicated and this was not a critical bug.
(cherry picked from commit da8e1c956a)
When a configured NTA for a name expired, any possibly cached
data for the name (with "insecure" DNSSEC validation result)
was not flushed from the resolver's cache. This has been fixed.
Closes #5747
Backport of MR !11597
Merge branch 'backport-5747-nta-expiry-cache-flush-bug-fix-9.20' into 'bind-9.20'
See merge request isc-projects/bind9!11715
When an NTA expires, the name's node should be flushed from the view's
cache, as is done when the NTA is manually removed using an rndc
command.
(cherry picked from commit 1899a3318c)
Move the write to fctx->vresult after LOCK(&fctx->lock). The field was
being set before acquiring the lock, but dns_resolver_logfetch() reads
it under the same lock from another thread.
Backport of MR !11717
Merge branch 'backport-ondrej/fix-data-race-on-fctx-result-in-validated-9.20' into 'bind-9.20'
See merge request isc-projects/bind9!11721
Move the write to fctx->vresult after LOCK(&fctx->lock). The field was
being set before acquiring the lock, but dns_resolver_logfetch() reads
it under the same lock from another thread.
(cherry picked from commit a2bd833909)
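The change amounts to moving a store inside the critical section. A minimal sketch with POSIX mutexes, assuming a hypothetical `fetch_sketch_t` structure in place of the real fetch context:

```c
#include <pthread.h>

/* Hypothetical stand-in for the fetch context; not the BIND 9 struct. */
typedef struct fetch_sketch {
	pthread_mutex_t lock;
	int vresult; /* shared: also read by a logging thread under the lock */
} fetch_sketch_t;

/* Writer: store the validation result only while holding the lock. */
void
validated_sketch(fetch_sketch_t *fctx, int result) {
	pthread_mutex_lock(&fctx->lock);
	fctx->vresult = result; /* this store used to precede the lock call */
	pthread_mutex_unlock(&fctx->lock);
}

/* Reader: takes the same lock, so it cannot race with the writer. */
int
logfetch_sketch(fetch_sketch_t *fctx) {
	pthread_mutex_lock(&fctx->lock);
	int result = fctx->vresult;
	pthread_mutex_unlock(&fctx->lock);
	return result;
}
```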
The SRTT (Smoothed Round-Trip Time) update for remote servers was not
atomic: concurrent callers could each read the same value and one
update would be silently lost. Additionally, the aging decay applied
once per second could run multiple times if several threads entered the
function simultaneously.
Use compare-and-swap loops for the SRTT update and for the aging
timestamp to ensure no updates are lost.
Backport of MR !11718
Merge branch 'backport-ondrej/fix-non-atomic-srtt-aging-9.20' into 'bind-9.20'
See merge request isc-projects/bind9!11723
The SRTT update loaded the old value, computed a new one, and stored it
back as separate operations. Two concurrent callers could each read the
same old value and one update would be silently lost.
Use a CAS loop for the read-modify-write on entry->srtt. For the aging
path, also CAS on entry->lastage to prevent multiple threads from aging
the same entry in the same second.
(cherry picked from commit 4d15494b94)
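The compare-and-swap approach can be sketched with C11 atomics. This is an illustration only: the `entry_sketch_t` type, the weighting formula, and the 98% decay are assumptions chosen for the example and are not taken from this commit.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for an address entry; not the BIND 9 struct. */
typedef struct entry_sketch {
	_Atomic uint32_t srtt;    /* smoothed round-trip time */
	_Atomic uint32_t lastage; /* second in which aging was last applied */
} entry_sketch_t;

/* Blend a new sample into srtt; factor is the old value's weight in tenths. */
void
adjust_srtt_sketch(entry_sketch_t *entry, uint32_t rtt, uint32_t factor) {
	assert(factor <= 10);
	uint32_t old = atomic_load(&entry->srtt);
	uint32_t new_srtt;
	do {
		new_srtt = (uint32_t)(((uint64_t)old * factor +
				       (uint64_t)rtt * (10 - factor)) / 10);
		/* A failed CAS reloads old with the current value; retry. */
	} while (!atomic_compare_exchange_weak(&entry->srtt, &old, new_srtt));
}

/* Age srtt at most once per second; returns true if this caller did it. */
bool
age_srtt_sketch(entry_sketch_t *entry, uint32_t now) {
	uint32_t last = atomic_load(&entry->lastage);
	if (last == now ||
	    !atomic_compare_exchange_strong(&entry->lastage, &last, now))
	{
		return false; /* already aged, or another thread won the race */
	}
	uint32_t old = atomic_load(&entry->srtt);
	/* The new value is recomputed from old on every retry. */
	while (!atomic_compare_exchange_weak(
		&entry->srtt, &old, (uint32_t)((uint64_t)old * 98 / 100)))
	{
	}
	return true;
}
```

Only the thread whose CAS on `lastage` succeeds applies the decay, and every SRTT store is conditional on the value it was computed from, so a concurrent update forces a retry instead of being overwritten.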
A 'dns_dtenv_t' pointer is passed to an async function without taking
a reference first, which can potentially cause a use-after-free error.
Take a reference, then detach in the async function.
Closes #5820
Backport of MR !11705
Merge branch 'backport-5820-dns_dtenv-reference-bug-fix-9.20' into 'bind-9.20'
See merge request isc-projects/bind9!11714
The 'env' pointer is passed to an async function without taking
a reference first, which can potentially cause a use-after-free
error. Take a reference, then detach in the async function.
(cherry picked from commit 48d7401f0d)
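The attach-before-scheduling pattern can be sketched as below. All names are hypothetical (`env_sketch_t`, `loop_sketch_t`, and the `*_sketch` functions); the one-slot "loop" merely stands in for a real asynchronous event loop, and the counter exists only so the example can observe when the object is freed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct env_sketch {
	int references;
} env_sketch_t;

typedef void (*async_cb_t)(void *arg);

/* A one-slot stand-in for an event loop: stores a callback, runs it later. */
typedef struct loop_sketch {
	async_cb_t cb;
	void *arg;
} loop_sketch_t;

static int destroyed_count_sketch = 0; /* lets callers observe destruction */

env_sketch_t *
env_create_sketch(void) {
	env_sketch_t *env = malloc(sizeof(*env));
	if (env != NULL) {
		env->references = 1;
	}
	return env;
}

void
env_detach_sketch(env_sketch_t **envp) {
	env_sketch_t *env = *envp;
	*envp = NULL;
	assert(env->references > 0);
	if (--env->references == 0) {
		free(env);
		destroyed_count_sketch++;
	}
}

/* Async callback: uses env, then drops the reference taken for it. */
static void
flush_cb_sketch(void *arg) {
	env_sketch_t *env = arg;
	env_detach_sketch(&env);
}

/* Take a reference before handing the pointer to the deferred callback. */
void
schedule_flush_sketch(loop_sketch_t *loop, env_sketch_t *env) {
	env->references++;
	loop->cb = flush_cb_sketch;
	loop->arg = env;
}

void
loop_run_sketch(loop_sketch_t *loop) {
	async_cb_t cb = loop->cb;
	loop->cb = NULL;
	if (cb != NULL) {
		cb(loop->arg);
	}
}
```

Even if the original owner detaches before the callback runs, the object stays alive until the callback releases its own reference.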
Change the convention for system test directory names to always use an
underscore rather than a hyphen. Names using underscores are valid Python
package names and can be used with standard `import` facilities in
Python, which allows easier code reuse.
Backport of MR !11710
Merge branch 'backport-nicki/system-test-dir-underscore-names-9.20' into 'bind-9.20'
See merge request isc-projects/bind9!11711
All system tests previously using a hyphen have been renamed to use
underscore instead. A couple of symlinks were corrected and one path in
`nsec3-answer` adjusted accordingly.
(cherry picked from commit 67aca1f8c6)
Change the convention for system test directory names to always use an
underscore rather than a hyphen. Names using underscores are valid Python
package names and can be used with standard `import` facilities in
Python, which allows easier code reuse.
The temporary directories for test execution and their convenience
symlinks have been switched to using hyphens rather than underscores to
keep the pytest collection, filtering and .gitignore working as
expected.
(cherry picked from commit 9f4c1d1993)
isc_buffer_init() is given MAX_DNS_MESSAGE_SIZE (65535) as capacity but
only h2->content_length bytes are allocated. This makes the buffer
believe it has more space than actually allocated. A secondary bounds
check (new_bufsize <= h2->content_length) prevents actual overflow, but
the buffer invariant is violated.
Pass h2->content_length as the capacity to match the allocation.
Backport of MR !11662
Merge branch 'backport-ondrej/fix-isc_buffer_init-capacity-mismatch-in-DoH-9.20' into 'bind-9.20'
See merge request isc-projects/bind9!11709
isc_buffer_init() is given MAX_DNS_MESSAGE_SIZE (65535) as capacity but
only h2->content_length bytes are allocated. This makes the buffer
believe it has more space than actually allocated. A secondary bounds
check (new_bufsize <= h2->content_length) prevents actual overflow, but
the buffer invariant is violated.
Pass h2->content_length as the capacity to match the allocation.
(cherry picked from commit 8e240bbb5f)
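The invariant at stake is that a buffer's recorded capacity never exceeds the memory behind it. A minimal sketch with a hypothetical `buffer_sketch_t` type (not the `isc_buffer_t` API):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical buffer type; not the isc_buffer_t implementation. */
typedef struct buffer_sketch {
	unsigned char *base;
	size_t length; /* capacity: must never exceed the real allocation */
	size_t used;
} buffer_sketch_t;

/* The caller passes the size that was actually allocated for base. */
void
buffer_init_sketch(buffer_sketch_t *b, void *base, size_t length) {
	b->base = base;
	b->length = length;
	b->used = 0;
}

size_t
buffer_available_sketch(const buffer_sketch_t *b) {
	assert(b->used <= b->length);
	return b->length - b->used;
}

/* Append n bytes, refusing any write that would exceed the capacity. */
bool
buffer_put_sketch(buffer_sketch_t *b, const void *src, size_t n) {
	if (n > buffer_available_sketch(b)) {
		return false;
	}
	memcpy(b->base + b->used, src, n);
	b->used += n;
	return true;
}
```

When the capacity matches the allocation, the buffer's own bounds check is sufficient and no secondary check has to compensate for an inflated length.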
Under specific error conditions during query processing, resources were not
being properly released, which could eventually lead to unnecessary memory
consumption for the server. A potential resource leak in the resolver
has been fixed.
Backport of MR !11658
Merge branch 'backport-ondrej/fix-pthread-primitives-usage-9.20' into 'bind-9.20'
See merge request isc-projects/bind9!11706
The keylist_lock rwlock is initialized at startup but never destroyed
on exit, unlike the sibling namelock mutex which is properly cleaned up.
(cherry picked from commit 5dc19a7d92)
The error cleanup in fctx_create() was missing isc_mutex_destroy() and
dns_ede_invalidate() calls. When error paths (cleanup_nameservers,
cleanup_fcount, cleanup_qmessage, cleanup_adb) were taken after the
mutex and edectx were initialized, the fctx memory was freed without
properly destroying these resources first.
(cherry picked from commit 5b1750f15f)
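The usual shape of such a fix is a cleanup ladder in which every resource initialized so far is torn down in reverse order. A minimal sketch, assuming a hypothetical `fctx_sketch_t` that holds only a POSIX mutex; the `fail_late` flag simulates an error after the mutex has been initialized, and the counter exists only to make the cleanup observable.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical fetch context; not the BIND 9 struct. */
typedef struct fctx_sketch {
	pthread_mutex_t lock;
} fctx_sketch_t;

static int live_locks_sketch = 0; /* initialized but not yet destroyed */

fctx_sketch_t *
fctx_create_sketch(bool fail_late) {
	fctx_sketch_t *fctx = malloc(sizeof(*fctx));
	if (fctx == NULL) {
		return NULL;
	}
	if (pthread_mutex_init(&fctx->lock, NULL) != 0) {
		goto cleanup_fctx;
	}
	live_locks_sketch++;

	if (fail_late) {
		/* Any error past this point must also destroy the mutex. */
		goto cleanup_lock;
	}
	return fctx;

cleanup_lock:
	pthread_mutex_destroy(&fctx->lock);
	live_locks_sketch--;
cleanup_fctx:
	free(fctx);
	return NULL;
}

void
fctx_destroy_sketch(fctx_sketch_t *fctx) {
	pthread_mutex_destroy(&fctx->lock);
	live_locks_sketch--;
	free(fctx);
}
```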
The lock is acquired for reading but the error path from
dns_rdata_fromstruct() incorrectly unlocks it as a write lock.
(cherry picked from commit 96a22451d7)
Fail with a specific error code if we detect a deadlock in the validator.
Closes #5769
Backport of MR !11622
Merge branch 'backport-5769-deadlock-validator-9.20' into 'bind-9.20'
See merge request isc-projects/bind9!11702
We return DNS_R_NOVALIDSIG if we detected a deadlock. Then in
'validate_async_done()', this result value is used to check if we
need to fall back to insecure. As part of that we create a new fetch
but that fails because of the detected deadlock. This results in a loop
of deadlock detected, fallback to insecure, deadlock detected, ...
Add a new result value, ISC_R_DEADLOCK, and return this instead when
we have detected a deadlock. This will be treated as a generic error,
as there is no special handling for this result value.
(cherry picked from commit bc1d177cc2)
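Why a distinct result code breaks the loop can be sketched as follows. The enum and both functions are hypothetical stand-ins, not the BIND 9 result codes or validator functions; the point is only that the fallback decision keys on the "no valid signature" result, so a deadlock must not be reported with that value.

```c
#include <stdbool.h>

/* Hypothetical result codes; not the BIND 9 definitions. */
typedef enum {
	RESULT_SUCCESS_SKETCH,
	RESULT_NOVALIDSIG_SKETCH,
	RESULT_DEADLOCK_SKETCH,
} result_sketch_t;

/* Report a detected deadlock with its own code, not as a signature error. */
result_sketch_t
validate_sketch(bool deadlock_detected, bool signature_ok) {
	if (deadlock_detected) {
		return RESULT_DEADLOCK_SKETCH;
	}
	return signature_ok ? RESULT_SUCCESS_SKETCH : RESULT_NOVALIDSIG_SKETCH;
}

/* Only a genuine signature failure triggers the insecure fallback. */
bool
should_fall_back_to_insecure_sketch(result_sketch_t result) {
	return result == RESULT_NOVALIDSIG_SKETCH;
}
```

With the dedicated code, a deadlock falls through to generic error handling instead of starting another fetch that would hit the same deadlock.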
Calling `rndc modzone` on a zone that was configured in the configuration file caused a crash. This has been fixed.
ISC would like to thank Nathan Reilly for reporting this.
Closes #5800
Backport of MR !11683
Merge branch 'backport-5800-rndc-modzone-non-dynamic-zone-crash-9.20' into 'bind-9.20'
See merge request isc-projects/bind9!11698
'rndc modzone' deletes the old configuration. If the new zone config is
not stored, a subsequent 'rndc showzone' fails. This is not an issue in
the 9.21 version, because of the effective config behavior.