review fixes.

git-svn-id: file:///svn/unbound/trunk@1897 be551aaa-1e26-0410-a405-d3ace91eadb9
2026-02-09 22:03:15 -05:00 · 2009-11-10 16:02:05 +00:00 · 2009-11-10 16:02:05 +00:00 · 5bc9a80e40
commit 5bc9a80e40
parent 6e8e4e87b7
5 changed files with 7 additions and 164 deletions
--- a/daemon/worker.c
+++ b/daemon/worker.c
@ -1121,9 +1121,10 @@ worker_init(struct worker* worker, struct config_file *cfg,
 			worker_probe_timer_cb, worker);
 		if(!worker->env.probe_timer) {
 			log_err("could not create 5011-probe timer");
+		} else {
+			/* let timer fire, then it can reset itself */
+			comm_timer_set(worker->env.probe_timer, &tv);
 		}
-		/* let timer fire, then it can reset itself */
-		comm_timer_set(worker->env.probe_timer, &tv);
 	}
 	if(!worker->env.mesh || !worker->env.scratch_buffer) {
 		worker_delete(worker);
--- a/doc/Changelog
+++ b/doc/Changelog
@ -2,6 +2,7 @@
 	- Thanks to Surfnet found bug in new dnssec-retry code that failed
 	  to combine well when combined with DLV and a particular failure. 
 	- Fixed unbound-control -h output about argument optionality.
+	- review comments.

 5 November 2009: Wouter
 	- lint fixes and portability tests.
--- a/doc/TODO
+++ b/doc/TODO
@ -62,167 +62,8 @@ o infra and lame cache: easier size config (in Mb), show usage in graphs.
 - store time of dump in cachedumps, so that on a load the ttls can be
  compared to the absolute time, and now-expired items can be dealt with.

-1.3.x:
- spoofed delegpt fixes - if DNSKEY prime fails
-	- set DNSKEY bogus and DNSKEY query msg bogus.
-	- make NS set bogus too - if not validated as secure.
-	- check where queries go - otherwise reduce TTL on NS.
-	- also make DS NSEC bogus. Also DS msg cache entry.
- mark bogus under stringent conditions
-	- if DS at parent and validly signed. Then DNSKEY must exist.
-	- Also for trust anchor points themselves. DNSKEY must exist.
-	- so if then DNSKEY keyprime fails
-	  - then it is not simply a server that only answers qtype A.
-	  - then parent is agreeing (somewhat) with the DS record
-	  - but it could still be a lame domain, these exist 
-	    The objective is to keep tries for genuinely lame domains to a
-	    minimum, while detecting forgeries quickly. exponential backoff.
-	- for unbound we can check if we got something to verify while
-	  building that chain of trust.  If so - not lame, agressive retry.
-	  - but security-lame zones also exist and should not pose 
-	    too high a burden. Exponential backoff again.
-	    (fe. badly signed or dnskey reply too large fails).
-	- the delegation NS for the domain is bogus.
-	  The referral retried, with exponential backoff.
-	  This exponential backoff should go towards values which are close
-	  to the TTLs that are used now (on lame delegations for example).
-	  so that the extra traffic is manageable.
-	- for unbound, reset the TTL on the NS rrset. Let it timeout.
-	  Set NS rrset bogus - no more queries to the domain are done.
-	  Also set DNSKEY and DS (rrset, NSEC, msg) bogus and ttl like that.
-	  (to the same absolute value, so a clean retry is done).
-	  TTL of NS is (rounddown) timeout in seconds.
-	  Until the NS times out and referral is done again.
-	  Make sure multiple validations for chains of trust do not result
-	  in a flood of queries or backoff too quickly.
- bogus exponential backoff cache. hash(name,t,c), size(1M, 5%).
-	TTL of 24h.  Backoff from 200msec to 24h.
-	x2 on bogus(18 tries), x8 backoff on lameness(6 tries), 
-	  when servfail for DNSKEY.
-	remove entry when validated as secure.
-	delegptspoofrecheck on lameness when harden-referral-path NS
-	query has servfail, then build chain of trust down (check DS,
-	then perform DNSKEY query) if that DNSKEY query fails servfail,
-	perform the x8 lameness retry fallback.
-
-* keep a list of guilty IP addresses in the qstate, which contains both
-  the child side guilty IPs and the parent guilty IPs.  Valid signed DSes
-  are not made guilty in the global cache.  The child IP is made guilty
-  in the global cache.
-* Retry to higher trust anchors.
-  * option not to retry to higher from this ta.
-  * keep longest must-be-secure name. Do no accept insecure above this point.
-  * if failed ta, blame all lower tas for their DNSKEY (get IP from cached
-    rrset),  if failure is insecure - nothing, if at bogus - blame that too.
-    lower tas have isdata=false, so the IP address for the dnskeyrrset in
-    the cache is set to avoid in qstate. Nothing in infracache, no childretry.
-
-Retry harder to get valid DNSSEC data.
-Triggered by a trust anchor or by a signed DS record for a zone.
-* If data is fetched and validation fails for it
-  or DNSKEY is fetched and validated into chain-of-trust fails for it
-  or DS is fetched and validated into chain-of-trust fails for it
-  Then
-  	blame(signer zone, IP origin of the data/DNSKEY/DS, x2, isdata)
-* If data was not fetched (SERVFAIL, lame, ...), and the data
-  is under a signed DS then:
-  	blame(thatDSname, IP origin of the data/DNSKEY/DS, x8)
-  x8 because the zone may be lame.
-  This means a chain of trust is built also for unfetched data, to
-  determine if a signed DS is present.  If insecure, nothing is done.
-* If DNSKEY was not fetched for chain of trust (SERVFAIL, lame, ...),
-  Then
-  	blame(DNSKEYname, IP origin of the data/DNSKEY/DS, x8)
-  x8 because the zone may be lame.
-* blame(zonename, guiltyIP, multiplier, isdata):
-  * if isdata:
-    Set the guiltyIP,zonename as DNSSEC-bogus-data=true in lameness cache.
-    Thusly marked servers are avoided if possible, used as last resort.
-    The guilt TTL is the infra cache ttl (15 minutes).  
-    The dnssec retry scheme works without this cache entry.
-  * If the key cache entry 'being-backed-off' is true and isdata then:
-	The parent is backedoff, it must be the childs fault. Retry to child.
-	if the child-dnskey is bogus, then retry is useless, stop.
-	Perform a child-retry - purge dataonly, childside, mark
-	data-IPaddress from child as to avoid-forquery. counterperquery,
-	max is 3, if reached, set this data element RRset&msg to the 
-	current backoff TTL end-time or bogus-ttl(60 seconds) whichever is less
-	and done.
-  * if no retry entry exists for the zone key, create one with 24h TTL, 10 ms.
-    else the backoff *= multiplier.
-  * If the backoff is less than a second, remove entries from cache and
-    restart query.  Else set the TTL for the entries to that value.
-  * Entries to set or remove: DNSKEY RRset&msg, DS RRset&msg, NS RRset&msg, 
-  	in-zone glue (A and AAAA) RRset&msg, and key-cache-entry TTL.
-	The the data element RRset&msg to the backoff TTL or bogusttl.
-	If TTL>1sec set key-cache-entry flag 'being-backed-off' to true.
-	when entry times out that flag is reset to false again.
-* Storage extra is:
-  IP address per RRset and message.  A lot of memory really, since that is
-  132 bytes per RRset and per message.  Store plain IP: 4/16 bytes, len byte.
-  port number 2bytes. +19bytes per RRset, per msg.
-  guilt flag in infra(lameness) cache.
-  being-backed-off flag for key cache, also backoff time value and its TTL.
-  child-retry-count and guilty-ip-list in qstate.
-* Load on authorities:
-  For lame servers: 7 tries per day (one per three hours on average).
-  Others get up to 23 tries per day (one per hour on average).
-  +1 for original try makes 8/24 hours and 24/24 hours.
-  Unless the cache entry falls out of the cache due to memory. In that
-  case it can be tried more often, this is similar to the NS entry falling
-  out of the cache due to memory, in that case it also has to be retried.
-* Performance analysis:
-  * domain is sold.  Unbound sees invalid signature (expired) or the old
-    servers refuse the queries.  Retry within the second, if parent has
-    new DS and NS available instantly works again (no downtime).
-  * domain is bogus signed.  Parent gets 1 query per hour.
-	Domain itself gets couple tries per queryname, per minute.
-  * domain partly bogus.  Parent gets 1 query per hour.
-	Domain itself gets couple tries per bogus queryname, per minute.
-  * spoof attempt.  Unbound tries a couple times.  If not spoofed again,
-    it works, if spoofed every time unbound backs off and stops trying.
-	But childretry is attempted more often, once per minute.
-  * parent has inconsistently signed DS records.  Together with a subzone that
-    is badly managed.  Unbound backs up to the root once per hour.
-  * parent has bad DS records, different sets on different servers, but they
-    are signed ok.  Works as for every query a list of bad nameserver, parent
-    and child side is kept, walks through them.  But as backoff increases
-    and becomes bigger than the TTL on the DS records, unbound will blackout.
-    The parent really has to be fixed...
-    The issue is that it is validly signed, but bad data. Unbound will very
-    conservatively retry it.
-  * domain is sold, but decommission is faster than the setup of new server.
-    Unbound does exponential backoff, if new setup is fast, it'll pickup the
-    new data fast.
-  * key rollover failed.  The zone has bad keys.  Like it was bogus signed.
-  * one nameserver has bad data.  Unbound goes back to the parent but also
-    marks that server as guilty.  Picks data from other server right after,
-    retry without blackout for the user.  
-    When parent starts to get backed off, if the nameserver is childside,
-    queryretries for childservers are made when queries fail.
-  * domain was sold, but unbound has old entries in the cache.  These somehow
-    need (re)validation (were queried with +cd, now -cd).  The entries are
-    bogus.  
-    Unbound performs childretry for these entries.  Works once the keys
-    have been successfully reprimed with parentretry.
-  * unbound is configured to talk to upstream caches.  These caches have
-    inconsistent bad data.  If one is bad, it is marked bad for that zone.  
-    If all are bad, there may not be any way for unbound to remove the
-    bad entries from the upstream caches.  It simply fails.
-    Recommendation: make the upstream caches validate as well.
-  * Old data that was valid with a long TTL remains in the cache.
-    Valid data has a TTL and this is the protocol.
-  * listing bad servers and trying again may not be good enough, since
-    a combinatorial explosion for DSxDNSKEYxdata is possible for every
-    signature validation (using different nameservers for DS, DNSKEY and
-    data, assuming only the right combination has a chain of trust to data).
-    The parentretries perform DS and DNSKEY searching.
-    childretries perform data searching.
-
-
 later
 - selective verbosity; ubcontrol trace example.com
- option to log only bogus domainname encountered, for demos
 - cache fork-dump, pre-load
 - for fwds, send queries to N servers in fwd-list, use first reply.
  document high scalable, high available unbound setup onepager.
--- a/iterator/iterator.c
+++ b/iterator/iterator.c
@ -946,8 +946,9 @@ processInitRequest(struct module_qstate* qstate, struct iter_qstate* iq,
 		delnamelen = iq->qchase.qname_len;
 	}
 	if(iq->qchase.qtype == LDNS_RR_TYPE_DS || iq->refetch_glue) {
-		/* remove first label from delname, root goes to hints */
-		if(dname_is_root(delname))
+		/* remove first label from delname, root goes to hints,
+		 * but only to fetch glue, not for qtype=DS. */
+		if(dname_is_root(delname) && iq->refetch_glue)
 			delname = NULL; /* go to root priming */
 		else 	dname_remove_label(&delname, &delnamelen);
 		iq->refetch_glue = 0; /* if CNAME causes restart, no refetch */
--- a/testcode/checklocks.c
+++ b/testcode/checklocks.c
@ -87,7 +87,6 @@ static void lock_error(struct checked_lock* lock,
 		(lock->type==check_lock_rwlock)?"rwlock": "badtype")), err);
 	log_err("complete status display:");
 	total_debug_info();
-	abort();
 	fatal_exit("bailing out");
 }