Auto-configure '-slabs' values (#1276)

- Auto-configure '-slabs' values to a power of 2 value close to num-threads
  by default for multi-threaded environments.

Co-authored-by: Wouter Wijngaards <wcawijngaards@users.noreply.github.com>
Yorgos Thessalonikefs 2025-04-29 15:21:47 +02:00 committed by GitHub
parent a904a3a2c2
commit fcc21885e4
6 changed files with 111 additions and 29 deletions
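
The rounding rule described in the commit message can be sketched as a standalone C helper. This is an illustrative sketch, not the code from the commit: the names auto_slabs and is_pow2 are hypothetical, and clamping zero or negative thread counts to 1 is a simplifying assumption. It starts at 4 and doubles only while the thread count lies more than about a third of the way to the next power of 2.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: nonzero if x is a power of 2. */
static int is_pow2(size_t x)
{
	return x != 0 && (x & (x - 1)) == 0;
}

/* Hypothetical helper (not part of the commit): returns the slab count
 * that the auto-configuration rule would pick for a thread count. */
static size_t auto_slabs(int num_threads)
{
	/* Assumption: treat zero or negative thread counts as 1. */
	size_t n = (size_t)(num_threads > 0 ? num_threads : 1);
	size_t pow2 = 4; /* starting point, the previous fixed default */

	/* Double while n is more than (pow2 - 1) / 3 above pow2;
	 * otherwise stay with the lower power of 2. */
	while (pow2 < n && n > pow2 + (pow2 - 1) / 3)
		pow2 <<= 1;
	assert(is_pow2(pow2));
	return pow2;
}
```

Under this sketch, 1 to 5 threads give 4 slabs, 6 to 10 give 8, 11 to 21 give 16, and 22 gives 32.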

@ -259,9 +259,12 @@ A plain number is in bytes, append 'k', 'm' or 'g' for kilobytes, megabytes
or gigabytes (1024*1024 bytes in a megabyte).
.TP
.B msg\-cache\-slabs: \fI<number>
-Number of slabs in the message cache. Slabs reduce lock contention by threads.
-Must be set to a power of 2. Setting (close) to the number of cpus is a
-reasonable guess.
+Number of slabs in the message cache.
+Slabs reduce lock contention by threads.
+Must be set to a power of 2.
+Setting (close) to the number of cpus is a fairly good setting.
+If left unconfigured, it will be configured automatically to be a power of 2
+close to the number of configured threads in multi-threaded environments.
.TP
.B num\-queries\-per\-thread: \fI<number>
The number of queries that every thread will service simultaneously.
@ -400,8 +403,12 @@ A plain number is in bytes, append 'k', 'm' or 'g' for kilobytes, megabytes
or gigabytes (1024*1024 bytes in a megabyte).
.TP
.B rrset\-cache\-slabs: \fI<number>
-Number of slabs in the RRset cache. Slabs reduce lock contention by threads.
+Number of slabs in the RRset cache.
+Slabs reduce lock contention by threads.
Must be set to a power of 2.
+Setting (close) to the number of cpus is a fairly good setting.
+If left unconfigured, it will be configured automatically to be a power of 2
+close to the number of configured threads in multi-threaded environments.
.TP
.B cache\-max\-ttl: \fI<seconds>
Time to live maximum for RRsets and messages in the cache. Default is
@ -436,8 +443,12 @@ Time to live for entries in the host cache. The host cache contains
roundtrip timing, lameness and EDNS support information. Default is 900.
.TP
.B infra\-cache\-slabs: \fI<number>
-Number of slabs in the infrastructure cache. Slabs reduce lock contention
-by threads. Must be set to a power of 2.
+Number of slabs in the infrastructure cache.
+Slabs reduce lock contention by threads.
+Must be set to a power of 2.
+Setting (close) to the number of cpus is a fairly good setting.
+If left unconfigured, it will be configured automatically to be a power of 2
+close to the number of configured threads in multi-threaded environments.
.TP
.B infra\-cache\-numhosts: \fI<number>
Number of hosts for which information is cached. Default is 10000.
@ -1494,9 +1505,12 @@ A plain number is in bytes, append 'k', 'm' or 'g' for kilobytes, megabytes
or gigabytes (1024*1024 bytes in a megabyte).
.TP
.B key\-cache\-slabs: \fI<number>
-Number of slabs in the key cache. Slabs reduce lock contention by threads.
-Must be set to a power of 2. Setting (close) to the number of cpus is a
-reasonable guess.
+Number of slabs in the key cache.
+Slabs reduce lock contention by threads.
+Must be set to a power of 2.
+Setting (close) to the number of cpus is a fairly good setting.
+If left unconfigured, it will be configured automatically to be a power of 2
+close to the number of configured threads in multi-threaded environments.
.TP
.B neg\-cache\-size: \fI<number>
Number of bytes size of the aggressive negative cache. Default is 1 megabyte.
@ -1903,9 +1917,12 @@ The ratelimit structure is small, so this data structure likely does
not need to be large.
.TP 5
.B ratelimit\-slabs: \fI<number>
-Give power of 2 number of slabs, this is used to reduce lock contention
-in the ratelimit tracking data structure. Close to the number of cpus is
-a fairly good setting.
+Number of slabs in the ratelimit tracking data structure.
+Slabs reduce lock contention by threads.
+Must be set to a power of 2.
+Setting (close) to the number of cpus is a fairly good setting.
+If left unconfigured, it will be configured automatically to be a power of 2
+close to the number of configured threads in multi-threaded environments.
.TP 5
.B ratelimit\-factor: \fI<number>
Set the amount of queries to rate limit when the limit is exceeded.
@ -1974,9 +1991,12 @@ The ip ratelimit structure is small, so this data structure likely does
not need to be large.
.TP 5
.B ip\-ratelimit\-slabs: \fI<number>
-Give power of 2 number of slabs, this is used to reduce lock contention
-in the ip ratelimit tracking data structure. Close to the number of cpus is
-a fairly good setting.
+Number of slabs in the ip ratelimit tracking data structure.
+Slabs reduce lock contention by threads.
+Must be set to a power of 2.
+Setting (close) to the number of cpus is a fairly good setting.
+If left unconfigured, it will be configured automatically to be a power of 2
+close to the number of configured threads in multi-threaded environments.
.TP 5
.B ip\-ratelimit\-factor: \fI<number>
Set the amount of queries to rate limit when the limit is exceeded.
@ -2610,9 +2630,12 @@ The shared secret cache is used when a same client is making multiple queries
using the same public key. It saves a substantial amount of CPU.
.TP
.B dnscrypt\-shared\-secret\-cache\-slabs: \fI<number>
-Give power of 2 number of slabs, this is used to reduce lock contention
-in the dnscrypt shared secrets cache. Close to the number of cpus is
-a fairly good setting.
+Number of slabs in the dnscrypt shared secrets cache.
+Slabs reduce lock contention by threads.
+Must be set to a power of 2.
+Setting (close) to the number of cpus is a fairly good setting.
+If left unconfigured, it will be configured automatically to be a power of 2
+close to the number of configured threads in multi-threaded environments.
.TP
.B dnscrypt\-nonce\-cache\-size: \fI<memory size>
Give the size of the data structure in which the client nonces are kept in.
@ -2621,9 +2644,12 @@ The nonce cache is used to prevent dnscrypt message replaying. Client nonce
should be unique for any pair of client pk/server sk.
.TP
.B dnscrypt\-nonce\-cache\-slabs: \fI<number>
-Give power of 2 number of slabs, this is used to reduce lock contention
-in the dnscrypt nonce cache. Close to the number of cpus is
-a fairly good setting.
+Number of slabs in the dnscrypt nonce cache.
+Slabs reduce lock contention by threads.
+Must be set to a power of 2.
+Setting (close) to the number of cpus is a fairly good setting.
+If left unconfigured, it will be configured automatically to be a power of 2
+close to the number of configured threads in multi-threaded environments.
.SS "EDNS Client Subnet Module Options"
.LP
The ECS module must be configured in the \fBmodule\-config:\fR directive e.g.,

@ -670,6 +670,7 @@ authtest_addzone(struct auth_zones* az, const char* name, char* fname)
auth_zone_set_zonefile(z, fname);
z->for_upstream = 1;
cfg = config_create();
+config_auto_slab_values(cfg);
free(cfg->chrootdir);
cfg->chrootdir = NULL;

@ -131,6 +131,7 @@ void infra_test(void)
unit_show_feature("infra cache");
unit_assert(ipstrtoaddr("127.0.0.1", 53, &one, &onelen));
+config_auto_slab_values(cfg);
slab = infra_create(cfg);
/* insert new record */
unit_assert( infra_host(slab, &one, onelen, zone, zonelen, now,

@ -267,6 +267,7 @@ static void zonemd_verify_test(char* zname, char* zfile, char* tastr,
env.cfg = config_create();
if(!env.cfg)
fatal_exit("out of memory");
+config_auto_slab_values(env.cfg);
env.now = &now;
env.cfg->val_date_override = cfg_convert_timeval(date_override);
if(!env.cfg->val_date_override)

@ -169,10 +169,10 @@ config_create(void)
cfg->edns_buffer_size = 1232; /* from DNS flagday recommendation */
cfg->msg_buffer_size = 65552; /* 64 k + a small margin */
cfg->msg_cache_size = 4 * 1024 * 1024;
-cfg->msg_cache_slabs = 4;
+cfg->msg_cache_slabs = 0;
cfg->jostle_time = 200;
cfg->rrset_cache_size = 4 * 1024 * 1024;
-cfg->rrset_cache_slabs = 4;
+cfg->rrset_cache_slabs = 0;
cfg->host_ttl = 900;
cfg->bogus_ttl = 60;
cfg->min_ttl = 0;
@ -182,7 +182,7 @@ config_create(void)
cfg->prefetch = 0;
cfg->prefetch_key = 0;
cfg->deny_any = 0;
-cfg->infra_cache_slabs = 4;
+cfg->infra_cache_slabs = 0;
cfg->infra_cache_numhosts = 10000;
cfg->infra_cache_min_rtt = 50;
cfg->infra_cache_max_rtt = 120000;
@ -291,7 +291,7 @@ config_create(void)
cfg->keep_missing = 366*24*3600; /* one year plus a little leeway */
cfg->permit_small_holddown = 0;
cfg->key_cache_size = 4 * 1024 * 1024;
-cfg->key_cache_slabs = 4;
+cfg->key_cache_slabs = 0;
cfg->neg_cache_size = 1 * 1024 * 1024;
cfg->local_zones = NULL;
cfg->local_zones_nodefault = NULL;
@ -341,8 +341,8 @@ config_create(void)
cfg->ip_ratelimit_cookie = 0;
cfg->ip_ratelimit = 0;
cfg->ratelimit = 0;
-cfg->ip_ratelimit_slabs = 4;
-cfg->ratelimit_slabs = 4;
+cfg->ip_ratelimit_slabs = 0;
+cfg->ratelimit_slabs = 0;
cfg->ip_ratelimit_size = 4*1024*1024;
cfg->ratelimit_size = 4*1024*1024;
cfg->ratelimit_for_domain = NULL;
@ -367,9 +367,9 @@ config_create(void)
cfg->dnscrypt_provider_cert_rotated = NULL;
cfg->dnscrypt_secret_key = NULL;
cfg->dnscrypt_shared_secret_cache_size = 4*1024*1024;
-cfg->dnscrypt_shared_secret_cache_slabs = 4;
+cfg->dnscrypt_shared_secret_cache_slabs = 0;
cfg->dnscrypt_nonce_cache_size = 4*1024*1024;
-cfg->dnscrypt_nonce_cache_slabs = 4;
+cfg->dnscrypt_nonce_cache_slabs = 0;
cfg->pad_responses = 1;
cfg->pad_responses_block_size = 468; /* from RFC8467 */
cfg->pad_queries = 1;
@ -454,6 +454,11 @@ struct config_file* config_create_forlib(void)
cfg->val_log_squelch = 1;
cfg->minimal_responses = 0;
cfg->harden_short_bufsize = 1;
+/* Need to explicitly define the slabs from their 0 default value */
+cfg->ip_ratelimit_slabs = 1;
+cfg->ratelimit_slabs = 1;
+cfg->dnscrypt_shared_secret_cache_slabs = 1;
+cfg->dnscrypt_nonce_cache_slabs = 1;
return cfg;
}
@ -1448,6 +1453,41 @@ create_cfg_parser(struct config_file* cfg, char* filename, const char* chroot)
init_cfg_parse();
}
+void
+config_auto_slab_values(struct config_file* cfg)
+{
+#define SET_AUTO_SLAB(var, name, val) \
+do { \
+if(cfg->var == 0) { \
+cfg->var = val; \
+verbose(VERB_QUERY, "setting "name": %lu", (unsigned long)val); \
+} \
+} while(0);
+#ifdef THREADS_DISABLED
+size_t pow_2_threads = 1;
+#else
+size_t pow_2_threads = 4; /* pow2 start */
+while (pow_2_threads < (size_t)(cfg->num_threads?cfg->num_threads:1) &&
+/* 1/3 of the distance to the next pow2 value stays with the
+* lower value */
+(size_t)cfg->num_threads > pow_2_threads + (pow_2_threads - 1)/3) {
+pow_2_threads <<= 1;
+}
+log_assert((pow_2_threads & (pow_2_threads - 1)) == 0); /* powerof2? */
+#endif /* THREADS_DISABLED */
+SET_AUTO_SLAB(msg_cache_slabs, "msg-cache-slabs", pow_2_threads);
+SET_AUTO_SLAB(rrset_cache_slabs, "rrset-cache-slabs", pow_2_threads);
+SET_AUTO_SLAB(infra_cache_slabs, "infra-cache-slabs", pow_2_threads);
+SET_AUTO_SLAB(key_cache_slabs, "key-cache-slabs", pow_2_threads);
+SET_AUTO_SLAB(ip_ratelimit_slabs, "ip-ratelimit-slabs", pow_2_threads);
+SET_AUTO_SLAB(ratelimit_slabs, "ratelimit-slabs", pow_2_threads);
+SET_AUTO_SLAB(dnscrypt_shared_secret_cache_slabs,
+"dnscrypt-shared-secret-cache-slabs", pow_2_threads);
+SET_AUTO_SLAB(dnscrypt_nonce_cache_slabs,
+"dnscrypt-nonce-cache-slabs", pow_2_threads);
+}
int
config_read(struct config_file* cfg, const char* filename, const char* chroot)
{
@ -1512,6 +1552,7 @@ config_read(struct config_file* cfg, const char* filename, const char* chroot)
}
}
globfree(&g);
+config_auto_slab_values(cfg);
return 1;
}
#endif /* HAVE_GLOB */
@ -1535,6 +1576,7 @@ config_read(struct config_file* cfg, const char* filename, const char* chroot)
return 0;
}
+config_auto_slab_values(cfg);
return 1;
}

@ -966,6 +966,17 @@ struct config_file* config_create(void);
*/
struct config_file* config_create_forlib(void);
+/**
+* If _slabs values are not explicitly configured, 0 value, put them in a
+* pow2 value close to the number of threads used.
+* Starts at the current default 4.
+* If num_threads is in between two pow2 values, 1/3 of the way stays with
+* the lower pow2 value.
+* Exported for unit testing.
+* @param config: where the _slabs values reside.
+*/
+void config_auto_slab_values(struct config_file* config);
/**
* Read the config file from the specified filename.
* @param config: where options are stored into, must be freshly created.