unbound/validator/val_neg.h
Wouter Wijngaards 218f5cfc92
Fast Reload Option (#1042)
* - fast-reload, add unbound-control fast_reload

* - fast-reload, make a thread to service the unbound-control command.

* - fast-reload, communication sockets for information transfer.

* - fast-reload, fix compile for unbound-dnstap-socket.

* - fast-reload, set nonblocking communication to keep the server thread
  responding to DNS requests.

* - fast-reload, poll routine to test for readiness, timeout fails connection.

* - fast-reload, detect loop in sock_poll_timeout routine.

* - fast-reload, send done and exited notification.

* - fast-reload, defines for constants in ipc.

* - fast-reload, ipc socket recv and send resists partial reads and writes and
  can continue byte by byte. Also it can continue after an interrupt.

* - fast-reload, send exit command to thread when done.

* - fast-reload, output strings for client on string list.

* - fast-reload, add newline to terminal output.

* - fast-reload, send client string to remote client.

* - fast-reload, better debug output.

* - fast-reload, print queue structure, for output to the remote client.

* - fast-reload, move print items to print queue from fast_reload_thread struct.

* - fast-reload, keep list of pending print queue items in daemon struct.

* - fast-reload, comment explains in_list for printq to print remainder.

* - fast-reload, unit test testdata/fast_reload_thread.tdir that tests the
  thread output.

* - fast-reload, fix test link for fast_reload_printq_list_delete function.

* - fast-reload, reread config file from disk.

* - fast-reload, unshare forwards, making the structure locked, with an rwlock.

* - fast-reload, for nonthreaded, the unbound-control commands forward,
  forward_add and forward_delete should be distributed to other processes,
  but when threaded, they should not be distributed to other threads because
  the structure is not thread specific any more.

* - fast-reload, unshared stub hints, making the structure locked, with an rwlock.

* - fast-reload, helpful comments for hints lookup function return value.

* - fast-reload, fix bug in fast reload printout, the strlist appendlist routine,
  and printout time statistics after the reload is done.

* - fast-reload, keep track of reloadtime and deletestime and print them.

* - fast-reload, keep track of constructtime and print it.

* - fast-reload, construct new items.

* - fast-reload, better comment.

* - fast-reload, reload the config and swap trees for forwards and stub hints.

* - fast-reload, in forwards_swap_tree set protection of trees with locks.

* - fast-reload, in hints_swap_tree also swap the node count of the trees.

* - fast-reload, reload ipc to stop and start threads.

* - fast-reload, unused forward declarations removed.

* - fast-reload, unit test that fast reload works with forwards and stubs.

* - fast-reload, fix clang analyzer warnings.

* - fast-reload, small documentation entry in unbound-control -h output.

* - fast-reload, printout memory use by fast reload, in bytes.

* - fast-reload, compile without threads.

* - fast-reload, document fast_reload in man page.

* - fast-reload, print ok when done successfully.

* - fast-reload, option for fast-reload commandline, +v verbosity option,
  with timing and memory use output.

* - fast-reload, option for fast-reload commandline, +p does not pause threads.

* - fast-reload, option for fast-reload commandline, +d drops mesh queries.

* - fast-reload, fix to poll every thread with nopause to make certain that
  resources are not held by the threads and can be deleted.

* - fast-reload, fix to use atomic store for config variables with nopause.

* - fast-reload, reload views.

* - fast-reload, when tag defines are different, it drops the queries.

* - fast-reload, fix tag define check.

* - fast-reload, document that tag change causes drop of queries.

* - fast-reload, fix space in documentation man page.

* - fast-reload, copy respip client information to query state, put views tree
  in module env for lookup.

* - fast-reload, nicer respip view comparison.

* - fast-reload, respip global set is in module env.

* - fast-reload, document that respip_client_info acl info is copied.

* - fast-reload, reload the respip_set.

* - fast-reload, document no pause and pick up of use_response_ip boolean.

* - fast-reload, fix test compile.

* - fast-reload, reload local zones.

* Update locking management for iter_fwd and iter_hints methods. (#1054)

fast reload, move most of the locking management to iter_fwd and
iter_hints methods. The caller still has the ability to handle its
own locking, if desired, for atomic operations on sets of different
structs.

Co-authored-by: Wouter Wijngaards <wcawijngaards@users.noreply.github.com>

* - fast-reload, reload access-control.

* - fast-reload, reload access control interface, such as interface-action.

* - fast-reload, reload tcp-connection-limit.

* - fast-reload, improve comments on acl_list and tcl_list swap tree.

* - fast-reload, fixup references to old tcp connection limits in open tcp
  connections.

* - fast-reload, fixup to clean tcp connection also for different linked order.

* - fast-reload, if no tcp connection limits existed, no need to remove
  references for that.

* - fast-reload, document more options that work and do not work.

* - fast-reload, reload auth_zone and rpz data.

* - fast-reload, fix auth_zones_get_mem.

* - fast-reload, fix compilation of testbound for the new comm_timer_get_mem
  reference in remote control.

* - fast-reload, change use_rpz with reload.

* - fast-reload, list changes in auth zones and stop zonemd callbacks for
  deleted auth zones.

* - fast-reload, note xtree is not swapped, and why it is not swapped.

* - fast-reload, for added auth zones, pick up zone transfer and zonemd tasks.

* - fast-reload, unlock xfr when done with transfer pick up.

* - fast-reload, unlock z when picking up the xfr for it during transfer task
  pick up.

* - fast-reload, pick up task changes for added, deleted and modified auth zones.

* - fast-reload, remove xfr of auth zone deletion without tasks.

* - fast-reload, pick up zone transfer config.

* - fast-reload, the main worker thread picks up the transfer tasks and also
  performs setup of the xfer struct.

* - fast-reload, keep writelock on newzone when auth zone changes.

* - fast-reload, change cachedb_enabled setting.

* - fast-reload, pick up edns-strings config.

* - fast-reload, note that settings are not updated.

* - fast-reload, pick up dnstap config.

* - fast-reload, dnstap options that need to be loaded without +p.

* - fast-reload, fix auth zone reload

* - fast-reload, remove debug for auth zone test.

* - fast-reload, fix auth zone reload with zone transfer.

* - fast-reload, fix auth zone reload lock order.

* - fast-reload, remove debug from fast reload test.

* - fast-reload, remove unused function.

* - fast-reload, fix the worker trust anchor probe timer lock acquisition in
  the probe answer callback routine for trust anchor probes.

* - fast-reload, reload trust anchors.

* - fast-reload, fix trust anchor reload lock on autr global data and test
  for trust anchor reload.

* - fast-reload, adjust cache sizes.

* - fast-reload, reload cache sizes when changed.

* - fast-reload, reload validator env changes.

* - fast-reload, reload mesh changes.

* - fast-reload, check for incompatible changes.

* - fast-reload, improve error text for incompatible change.

* - fast-reload, fix check config option compatibility.

* - fast-reload, improve error text for nopause change.

* - fast-reload, fix spelling of incompatible options.

* - fast-reload, reload target-fetch-policy, outbound-msg-retry, max-sent-count
  and max-query-restarts.

* - fast-reload, check nopause config change for target-fetch-policy.

* - fast-reload, reload do-not-query-address, private-address and capt-exempt.

* - fast-reload, check nopause config change for do-not-query-address,
  private-address and capt-exempt.

* - fast-reload, check fast reload not possible due to interface and
  outgoing-interface changes.

* - fast-reload, reload nat64 settings.

* - fast-reload, reload settings stored in the infra structure.

* - fast-reload, fix modstack lookup and remove outgoing-range check.

* - fast-reload, more explanation for config parse failure.

* - fast-reload, reload worker outside network changes.

* - fast-reload, detect incompatible changes in network settings.

* fast-reload, commit test files.

* - fast-reload, fix warnings for call types in windows compile.

* - fast-reload, fix warnings and comm_point_internal for tcp wouldblock calls.

* - fast-reload, extend lock checks for repeat thread ids.

* - fast-reload, additional test cases, cache change and tag changes.

* - fast-reload, fix documentation for auth_zone_verify_zonemd_with_key.

* - fast-reload, fix copy_cfg type casts and memory leak on config parse failure.

* - fast-reload, fix use of WSAPoll.

* Review comments for the fast reload feature (#1259)

* - fast-reload review, respip set can be null from a view.

* - fast-reload review, typos.

* - fast-reload review, keep clang static analyzer happy.

* - fast-reload review, don't forget to copy tag_actions.

* - fast-reload review, less indentation.

* - fast-reload review, don't leak respip_actions when reloading.

* - fast-reload review, protect NULL pointer dereference in get_mem
  functions.

* - fast-reload review, add fast_reload_most_options.tdir to test most
  options with high verbosity when fast reloading.

* - fast-reload review, don't skip new line on long error printouts.

* - fast-reload review, typo.

* - fast-reload review, use new_z for consistency.

* - fast-reload review, nit for unlock ordering to make eye comparison
  with the lock counterpart easier.

* - fast-reload review, in case of error the sockets are already closed.

* - fast-reload review, indentation.

* - fast-reload review, add static keywords.

* - fast-reload review, update unbound-control usage text.

* - fast-reload review, updates to the man page.

* - fast-reload, the fast-reload command is experimental.

* - fast-reload, fix compile of doqclient for fast reload functions.

* Changelog comment for #1042
- Merge #1042: Fast Reload. The unbound-control fast_reload is added.
  It reads changed config in a thread, then only briefly pauses the
  service threads, that keep running. DNS service is only interrupted
  briefly, less than a second.

---------

Co-authored-by: Yorgos Thessalonikefs <yorgos@nlnetlabs.nl>
2025-03-31 15:25:24 +02:00


/*
* validator/val_neg.h - validator aggressive negative caching functions.
*
* Copyright (c) 2008, NLnet Labs. All rights reserved.
*
* This software is open source.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
*
* Redistributions of source code must retain the above copyright notice,
* this list of conditions and the following disclaimer.
*
* Redistributions in binary form must reproduce the above copyright notice,
* this list of conditions and the following disclaimer in the documentation
* and/or other materials provided with the distribution.
*
* Neither the name of the NLNET LABS nor the names of its contributors may
* be used to endorse or promote products derived from this software without
* specific prior written permission.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
* "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
* LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
* A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
* HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
* SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED
* TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
* PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
* LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
* NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
* SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/
/**
* \file
*
* This file contains helper functions for the validator module.
* The functions help with aggressive negative caching.
* This creates new denials of existence, and proofs for absence of types
* from cached NSEC records.
*/
#ifndef VALIDATOR_VAL_NEG_H
#define VALIDATOR_VAL_NEG_H
#include "util/locks.h"
#include "util/rbtree.h"
struct sldns_buffer;
struct val_neg_data;
struct config_file;
struct reply_info;
struct rrset_cache;
struct regional;
struct query_info;
struct dns_msg;
struct ub_packed_rrset_key;
/**
* The negative cache. It is shared between the threads, so locked.
* Kept as validator-environ-state. It refers back to the rrset cache for
* data elements. It can be out of date and contain conflicting data
* from zone content changes.
* It contains a tree of zones, every zone has a tree of data elements.
* The data elements are part of one big LRU list, with one memory counter.
*/
struct val_neg_cache {
	/** the big lock on the negative cache. Because we use a rbtree
	 * for the data (quick lookup), we need a big lock */
	lock_basic_type lock;
	/** The zone rbtree. contents sorted canonical, type val_neg_zone */
	rbtree_type tree;
	/** the first in linked list of LRU of val_neg_data */
	struct val_neg_data* first;
	/** last in lru (least recently used element) */
	struct val_neg_data* last;
	/** current memory in use (bytes) */
	size_t use;
	/** max memory to use (bytes) */
	size_t max;
	/** max nsec3 iterations allowed */
	size_t nsec3_max_iter;
	/** number of times neg cache records were used to generate NOERROR
	 * responses. */
	size_t num_neg_cache_noerror;
	/** number of times neg cache records were used to generate NXDOMAIN
	 * responses. */
	size_t num_neg_cache_nxdomain;
};
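
The "one big LRU list, with one memory counter" described above can be illustrated with a small standalone sketch. The types `lru_item` and `lru_cache` and the functions below are hypothetical stand-ins, not unbound code; they model only the `first`/`last` links and the `use`/`max` counters, and they omit the lock that the real cache needs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for val_neg_data and val_neg_cache; only the
 * LRU links and the memory counters are modelled here. */
struct lru_item {
	struct lru_item* prev;
	struct lru_item* next;
	size_t size;
};

struct lru_cache {
	struct lru_item* first; /* most recently used */
	struct lru_item* last;  /* least recently used */
	size_t use;             /* bytes accounted for */
	size_t max;             /* byte budget */
};

/* Unlink an item from the list; the memory counter is not changed. */
static void lru_unlink(struct lru_cache* c, struct lru_item* it)
{
	if(it->prev) it->prev->next = it->next;
	else c->first = it->next;
	if(it->next) it->next->prev = it->prev;
	else c->last = it->prev;
	it->prev = NULL;
	it->next = NULL;
}

/* Put an unlinked item at the front (most recently used position). */
static void lru_push_front(struct lru_cache* c, struct lru_item* it)
{
	it->prev = NULL;
	it->next = c->first;
	if(c->first) c->first->prev = it;
	else c->last = it;
	c->first = it;
}

/* Add a new item and account for its memory. */
static void lru_add(struct lru_cache* c, struct lru_item* it)
{
	lru_push_front(c, it);
	c->use += it->size;
}

/* Mark an item as just used: move it to the front of the list. */
static void lru_touch(struct lru_cache* c, struct lru_item* it)
{
	assert(c->first != NULL);
	if(c->first == it) return;
	lru_unlink(c, it);
	lru_push_front(c, it);
}
```

Because every element sits on the same list regardless of its zone, a single counter is enough to decide when the cache as a whole is over budget.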
/**
* Per Zone aggressive negative caching data.
*/
struct val_neg_zone {
	/** rbtree node element, key is this struct: the name, class */
	rbnode_type node;
	/** name; the key */
	uint8_t* name;
	/** length of name */
	size_t len;
	/** labels in name */
	int labs;
	/** pointer to parent zone in the negative cache */
	struct val_neg_zone* parent;
	/** the number of in_use elements among this one and all elements
	 * that have this one as a parent (or parent of a parent, and so on).
	 * No element has a count of zero; such elements are removed. */
	int count;
	/** if 0: NSEC zone, else NSEC3 hash algorithm in use */
	int nsec3_hash;
	/** nsec3 iteration count in use */
	size_t nsec3_iter;
	/** nsec3 salt in use */
	uint8_t* nsec3_salt;
	/** length of salt in bytes */
	size_t nsec3_saltlen;
	/** tree of NSEC data for this zone, sorted canonical
	 * by NSEC owner name */
	rbtree_type tree;
	/** class of node; host order */
	uint16_t dclass;
	/** if this element is in use, boolean */
	uint8_t in_use;
};
/**
* Data element for aggressive negative caching.
* The tree of these elements acts as an index onto the rrset cache.
* It shows the NSEC records that (may) exist and are (possibly) secure.
* The rbtree allows for logN search for a covering NSEC record.
* To make tree insertion and deletion logN too, all the parent (one label
* less than the name) data elements are also in the rbtree, with a usage
* count for every data element.
* There is no actual data stored in this data element, if it is in_use,
* then the data can (possibly) be found in the rrset cache.
*/
struct val_neg_data {
	/** rbtree node element, key is this struct: the name */
	rbnode_type node;
	/** name; the key */
	uint8_t* name;
	/** length of name */
	size_t len;
	/** labels in name */
	int labs;
	/** pointer to parent node in the negative cache */
	struct val_neg_data* parent;
	/** the number of in_use elements among this one and all elements
	 * that have this one as a parent (or parent of a parent, and so on).
	 * No element has a count of zero; such elements are removed. */
	int count;
	/** the zone that this denial is part of */
	struct val_neg_zone* zone;
	/** previous in LRU */
	struct val_neg_data* prev;
	/** next in LRU (next element was less recently used) */
	struct val_neg_data* next;
	/** if this element is in use, boolean */
	uint8_t in_use;
};
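
The logN search for a covering NSEC record mentioned above is, in essence, a predecessor lookup: find the greatest stored owner name that sorts at or before the query name. The sketch below shows that idea under loud assumptions: a sorted array replaces the rbtree, `strcmp` replaces canonical DNS name ordering, and `find_covering` is a hypothetical helper, not an unbound function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return the index of the greatest entry that sorts at or before qname,
 * or -1 if every entry sorts after it. owners[] must be sorted ascending.
 * strcmp stands in for canonical DNS name ordering (assumption). */
static int find_covering(const char* const* owners, size_t n,
	const char* qname)
{
	size_t lo = 0, hi = n; /* search window is [lo, hi) */
	int best = -1;
	assert(owners != NULL || n == 0);
	while(lo < hi) {
		size_t mid = lo + (hi - lo) / 2;
		if(strcmp(owners[mid], qname) <= 0) {
			best = (int)mid; /* candidate; look for a later one */
			lo = mid + 1;
		} else {
			hi = mid;
		}
	}
	return best;
}
```

Finding the predecessor is only the first half of a denial proof. The NSEC record found this way still has to be fetched from the rrset cache, and its next-owner field must sort after the query name before it can be said to cover that name.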
/**
* Create negative cache
* @param cfg: config options.
* @param maxiter: max nsec3 iterations allowed.
* @return an empty neg cache, or NULL on failure.
*/
struct val_neg_cache* val_neg_create(struct config_file* cfg, size_t maxiter);
/**
* see how much memory is in use by the negative cache.
* @param neg: negative cache
* @return number of bytes in use.
*/
size_t val_neg_get_mem(struct val_neg_cache* neg);
/**
* Destroy negative cache. There must no longer be any other threads.
* @param neg: negative cache.
*/
void neg_cache_delete(struct val_neg_cache* neg);
/**
* Comparison function for rbtree val neg data elements
*/
int val_neg_data_compare(const void* a, const void* b);
/**
* Comparison function for rbtree val neg zone elements
*/
int val_neg_zone_compare(const void* a, const void* b);
/**
* Insert NSECs from this message into the negative cache for reference.
* @param neg: negative cache
* @param rep: reply with NSECs.
* Errors are ignored; on error the NSECs are simply not stored.
*/
void val_neg_addreply(struct val_neg_cache* neg, struct reply_info* rep);
/**
* Insert NSECs from this referral into the negative cache for reference.
* @param neg: negative cache
* @param rep: referral reply with NS, NSECs.
* @param zone: bailiwick for the referral.
* Errors are ignored; on error the NSECs are simply not stored.
*/
void val_neg_addreferral(struct val_neg_cache* neg, struct reply_info* rep,
	uint8_t* zone);
/**
* For the given query, try to get a reply out of the negative cache.
* The reply still needs to be validated.
* @param neg: negative cache.
* @param qinfo: query
* @param region: where to allocate reply.
* @param rrset_cache: rrset cache.
* @param buf: temporary buffer.
* @param now: to check TTLs against.
* @param addsoa: if true, produce result for external consumption.
* if false, do not add SOA - for unbound-internal consumption.
* @param topname: do not look higher than this name,
* so that the result cannot be taken from a zone above the current
* trust anchor. Which could happen with multiple islands of trust.
* if NULL, then no trust anchor is used, but also the algorithm becomes
* more conservative, especially for opt-out zones, since the receiver
* may have a trust-anchor below the optout and thus the optout cannot
* be used to create a proof from the negative cache.
* @param cfg: config options.
* @return a reply message if something was found.
* This reply may still need validation.
* NULL if nothing found (or out of memory).
*/
struct dns_msg* val_neg_getmsg(struct val_neg_cache* neg,
	struct query_info* qinfo, struct regional* region,
	struct rrset_cache* rrset_cache, struct sldns_buffer* buf, time_t now,
	int addsoa, uint8_t* topname, struct config_file* cfg);
/**** functions exposed for unit test ****/
/**
* Insert data into the data tree of a zone
* Does not do locking.
* @param neg: negative cache
* @param zone: zone to insert into
* @param nsec: record to insert.
*/
void neg_insert_data(struct val_neg_cache* neg,
	struct val_neg_zone* zone, struct ub_packed_rrset_key* nsec);
/**
* Delete a data element from the negative cache.
* May delete other data elements to keep tree coherent, or
* only mark the element as 'not in use'.
* Does not do locking.
* @param neg: negative cache.
* @param el: data element to delete.
*/
void neg_delete_data(struct val_neg_cache* neg, struct val_neg_data* el);
/**
* Find the given zone, from the SOA owner name and class
* Does not do locking.
* @param neg: negative cache
* @param nm: what to look for.
* @param len: length of nm
* @param dclass: class to look for.
* @return zone or NULL if not found.
*/
struct val_neg_zone* neg_find_zone(struct val_neg_cache* neg,
	uint8_t* nm, size_t len, uint16_t dclass);
/**
* Create a new zone.
* Does not do locking.
* @param neg: negative cache
* @param nm: name of the zone to create.
* @param nm_len: length of name.
* @param dclass: class of zone, host order.
* @return zone or NULL if out of memory.
*/
struct val_neg_zone* neg_create_zone(struct val_neg_cache* neg,
	uint8_t* nm, size_t nm_len, uint16_t dclass);
/**
* take a zone into use. increases counts of parents.
* Does not do locking.
* @param zone: zone to take into use.
*/
void val_neg_zone_take_inuse(struct val_neg_zone* zone);
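
How taking an element into use interacts with the `count` field can be sketched as follows. This is an illustration under assumptions, not the unbound implementation: `struct node`, `node_take_inuse` and `node_release` are hypothetical names, and the sketch keeps only the `parent`, `count` and `in_use` fields. It does no locking, like the functions in this section.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for val_neg_zone with only the fields used here. */
struct node {
	struct node* parent;
	int count;  /* in-use elements at or below this node */
	int in_use; /* boolean */
};

/* Take a node into use and add one to the count of the node and of
 * every ancestor. Does nothing if the node is already in use. */
static void node_take_inuse(struct node* n)
{
	struct node* p;
	assert(n != NULL);
	if(n->in_use) return;
	n->in_use = 1;
	for(p = n; p; p = p->parent)
		p->count++;
}

/* Reverse operation: release the node and decrement the chain. Nodes
 * whose count drops to zero would be removed from the tree by the caller. */
static void node_release(struct node* n)
{
	struct node* p;
	assert(n != NULL);
	if(!n->in_use) return;
	n->in_use = 0;
	for(p = n; p; p = p->parent)
		p->count--;
}
```

Keeping the counts on the whole parent chain means that a node can be dropped as soon as its own count reaches zero, without scanning the tree for remaining children.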
/**
* Adjust the size of the negative cache.
* @param neg: negative cache
* @param max: new size for max mem.
*/
void val_neg_adjust_size(struct val_neg_cache* neg, size_t max);
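
One plausible way to apply a smaller memory limit is to evict from the least recently used end of the list until usage fits the new budget. The sketch below assumes that behaviour; whether the real function evicts immediately or only on the next insertion is not stated in this header. `struct item`, `struct cache`, `evict_last` and `cache_adjust_size` are hypothetical names, and a real implementation would also free the evicted elements and hold the cache lock.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for val_neg_data and val_neg_cache. */
struct item {
	struct item* prev;
	struct item* next;
	size_t size;
};

struct cache {
	struct item* first; /* most recently used */
	struct item* last;  /* least recently used */
	size_t use;         /* bytes accounted for */
	size_t max;         /* byte budget */
};

/* Remove the least recently used item and return it (NULL if empty). */
static struct item* evict_last(struct cache* c)
{
	struct item* it = c->last;
	if(!it) return NULL;
	c->last = it->prev;
	if(c->last) c->last->next = NULL;
	else c->first = NULL;
	it->prev = NULL;
	assert(c->use >= it->size);
	c->use -= it->size;
	return it;
}

/* Set a new budget and evict from the LRU tail until usage fits.
 * Returns the number of evicted items. */
static size_t cache_adjust_size(struct cache* c, size_t max)
{
	size_t evicted = 0;
	c->max = max;
	while(c->use > c->max && evict_last(c))
		evicted++;
	return evicted;
}
```

Raising the limit needs no work beyond storing the new value, which is why only the shrinking case loops.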
#endif /* VALIDATOR_VAL_NEG_H */