mirror of
https://github.com/isc-projects/bind9.git
synced 2026-04-28 09:37:10 -04:00
updated drafts
parent bb7babe2ae
commit 843e4dfd2c
8 changed files with 2578 additions and 2148 deletions
|
|
@ -5,7 +5,7 @@ Expires September 2001 T. Lindgreen
|
|||
|
||||
Parent stores the child's zone KEYs
|
||||
|
||||
draft-ietf-dnsext-parent-stores-zone-keys-00.txt
|
||||
draft-ietf-dnsext-parent-stores-zone-keys-01.txt
|
||||
|
||||
|
||||
Status of This Document
|
||||
|
|
@ -28,7 +28,7 @@ Status of This Document
|
|||
Comments should be sent to the authors or the DNSEXT WG mailing
|
||||
list namedroppers@ops.ietf.org.
|
||||
|
||||
This document updates RFC 2535 [2].
|
||||
This document updates RFC 2535.
|
||||
|
||||
|
||||
Copyright Notice
|
||||
|
|
@ -51,9 +51,9 @@ Abstract
|
|||
|
||||
|
||||
|
||||
Gieben Expires September 2001 [Page 2]
|
||||
Gieben & Lindgreen Expires November 2001 [Page 2]
|
||||
|
||||
Internet Draft Parent Stores Zone KEYS March 2001
|
||||
Internet Draft Parent Stores Zone KEYS May 2001
|
||||
|
||||
simple key rollover and resigning mechanism. For large TLDs this is
|
||||
extremely important.
|
||||
|
|
@ -69,29 +69,29 @@ Internet Draft Parent Stores Zone KEYS March 2001
|
|||
|
||||
Table of Contents
|
||||
|
||||
Status of This Document....................................2
|
||||
Abstract...................................................2
|
||||
Status of This Document....................................
|
||||
Abstract...................................................
|
||||
|
||||
Table of Contents..........................................3
|
||||
1 Introduction.............................................3
|
||||
2 Proposal.................................................4
|
||||
2.1. TTL of the KEY and SIG at the parent..................4
|
||||
2.2. No NULL KEY...........................................5
|
||||
3 Impact on a secure aware resolver/forwarder..............5
|
||||
3.1 Impact of key rollovers on resolver/forwarder..........5
|
||||
4 Scheduled key rollover...................................6
|
||||
5 Unscheduled key rollover.................................6
|
||||
6 Zone resigning...........................................7
|
||||
7. Consequences for KEY and NXT records....................7
|
||||
7.1. KEY bit in NXT records................................7
|
||||
7.2. Authority of KEY records..............................8
|
||||
7.3. Selecting KEY sets....................................8
|
||||
8. The zone-KEY and local KEY records......................8
|
||||
9. Security Considerations.................................8
|
||||
Table of Contents..........................................
|
||||
1 Introduction.............................................
|
||||
2 Proposal.................................................
|
||||
2.1. TTL of the KEY and SIG at the parent..................
|
||||
2.2. No NULL KEY...........................................
|
||||
3 Impact on a secure aware resolver/forwarder..............
|
||||
3.1 Impact of key rollovers on resolver/forwarder..........
|
||||
4 Scheduled key rollover...................................
|
||||
5 Unscheduled key rollover.................................
|
||||
6 Zone resigning...........................................
|
||||
7. Consequences for KEY and NXT records....................
|
||||
7.1. KEY bit in NXT records................................
|
||||
7.2. Authority of KEY records..............................
|
||||
7.3. Selecting KEY sets....................................
|
||||
8. The zone-KEY and local KEY records......................
|
||||
9. Security Considerations.................................
|
||||
|
||||
Authors' Addresses.........................................9
|
||||
References.................................................9
|
||||
Full Copyright Statement...................................9
|
||||
Authors' Addresses.........................................
|
||||
References.................................................
|
||||
Full Copyright Statement...................................
|
||||
|
||||
|
||||
1. Introduction
|
||||
|
|
@ -99,8 +99,8 @@ Table of Contents
|
|||
DNSSEC on the ccTLDs and gTLDs.
|
||||
|
||||
In this document we are considering a secure zone, somewhere under a
|
||||
secure entry point and on-tree [1] validation between the secure
|
||||
entry point and the zone in question. The resolver we are
|
||||
secure entry point and on-tree [RFC 3090] validation between the
|
||||
secure entry point and the zone in question. The resolver we are
|
||||
considering is security aware and is preconfigured with the KEY of
|
||||
the secure entry point. We also make a distinction between a
|
||||
scheduled and an unscheduled key rollover. A scheduled rollover is
|
||||
|
|
@ -109,12 +109,12 @@ Table of Contents
|
|||
|
||||
|
||||
|
||||
Gieben Expires September 2001 [Page 3]
|
||||
Gieben & Lindgreen Expires November 2001 [Page 3]
|
||||
|
||||
Internet Draft Parent Stores Zone KEYS March 2001
|
||||
Internet Draft Parent Stores Zone KEYS May 2001
|
||||
|
||||
|
||||
RFC 2535 [2] states that a zone KEY must be present in the apex of a
|
||||
RFC 2535 states that a zone KEY must be present in the apex of a
|
||||
zone. This can be at the delegation point in the parent's
|
||||
zonefile, or in the child's zonefile, or in both. This key is only
|
||||
valid if it is signed by the parent, so there is also the question
|
||||
|
|
@ -122,8 +122,8 @@ Internet Draft Parent Stores Zone KEYS March 2001
|
|||
|
||||
The original idea was to have the zone KEY RR and the parent's SIG to
|
||||
reside in the child's zone and perhaps also in the parent's zone.
|
||||
There is a draft proposal [3], that describes how a keyrollover can
|
||||
be handled.
|
||||
There is a draft proposal [RFC 2535], that describes how a
|
||||
keyrollover can be handled.
|
||||
|
||||
At NLnet Labs we found that storing the parent's signature over the
|
||||
child's zone KEY in the child's zone:
|
||||
|
|
@ -138,14 +138,16 @@ Internet Draft Parent Stores Zone KEYS March 2001
|
|||
|
||||
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
|
||||
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
|
||||
document are to be interpreted as described in RFC 2119 [2].
|
||||
document are to be interpreted as described in RFC 2119.
|
||||
|
||||
|
||||
2. Proposal
|
||||
The core of the new proposal is that the parent zone stores the
|
||||
parent's signature over the child's zone KEY and also the child's
|
||||
zone KEY itself. The child zone may also contain its zone KEY, in
|
||||
which case is must be selfsigned.
|
||||
zone KEY itself, and is authoritative for both KEY and SIG. The
|
||||
child zone may also contain its zone KEY, in which case it must be
|
||||
selfsigned. The child zone must not hold the parent's SIG, and must
|
||||
also not set the AA-bit on requests for its zone KEY.
|
||||
|
||||
The main advantage of this proposal is that all signatures signed by
|
||||
a key are in the same zone file as the producing key. This allows for
|
||||
|
|
@ -162,15 +164,15 @@ Internet Draft Parent Stores Zone KEYS March 2001
|
|||
|
||||
2.1. TTL of the KEY and SIG at the parent
|
||||
Each zone in DNS expresses in its SOA record the maximum and minimum
|
||||
|
||||
|
||||
|
||||
Gieben & Lindgreen Expires November 2001 [Page 4]
|
||||
|
||||
Internet Draft Parent Stores Zone KEYS May 2001
|
||||
|
||||
TTL values that they allow in the zone. Thus it is possible that the
|
||||
parent will sign with a value that is unacceptable to the child. The
|
||||
|
||||
|
||||
|
||||
Gieben Expires September 2001 [Page 4]
|
||||
|
||||
Internet Draft Parent Stores Zone KEYS March 2001
|
||||
|
||||
parent MUST follow the TTL request of the child as long as that is
|
||||
within the allowed range for the parent.
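
The TTL rule above can be sketched as follows. This is an illustration
only: the function and parameter names are hypothetical, and the text
shown here does not say what the parent does with a request that falls
outside its range, so clamping to the nearest bound is an assumption.

```python
def ttl_for_child_key(requested_ttl: int,
                      parent_min_ttl: int,
                      parent_max_ttl: int) -> int:
    """Pick the TTL the parent puts on the child's KEY and SIG records."""
    if parent_min_ttl > parent_max_ttl:
        raise ValueError("parent TTL range is empty")
    # In range: the parent MUST follow the child's request unchanged.
    if parent_min_ttl <= requested_ttl <= parent_max_ttl:
        return requested_ttl
    # Out of range (assumption, not stated in the draft): clamp to the
    # nearest bound of the parent's allowed range.
    return min(max(requested_ttl, parent_min_ttl), parent_max_ttl)


print(ttl_for_child_key(3600, 300, 86400))
```

A request of 3600 seconds against a parent range of 300 to 86400 is
returned unchanged; only requests outside that range are altered.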
|
||||
|
||||
|
|
@ -192,19 +194,40 @@ Internet Draft Parent Stores Zone KEYS March 2001
|
|||
Section 3.4 "Determination of Zone Secure/Unsecured Status":
|
||||
|
||||
" A zone KEY RR with the "no-key" type field value (both key type
|
||||
flag bits 0 and 1 on) indicates that the zone named is unsecured
|
||||
while a zone KEY RR with a key present indicates that the zone named
|
||||
is secure. The secured versus unsecured status of a zone may vary
|
||||
with different cryptographic algorithms. Even for the same
|
||||
algorithm, conflicting zone KEY RRs may be present. "
|
||||
flag bits 0 and 1 on) indicates that the zone named is unsecured
|
||||
while a zone KEY RR with a key present indicates that the zone named
|
||||
is secure. The secured versus unsecured status of a zone may vary
|
||||
with different cryptographic algorithms. Even for the same
|
||||
algorithm, conflicting zone KEY RRs may be present. "
|
||||
|
||||
This is rewritten as:
|
||||
|
||||
" A zone is considered secured by on-tree validation [1] when the
|
||||
there is a zone KEY from that zone present at its parent. If there
|
||||
is no zone KEY present, and the resolver is also unaware of
|
||||
alternative algorithms used and/or possible off-tree validation, the
|
||||
zone is considered unsecured. "
|
||||
" A zone is considered secured by on-tree validation [RFC 3090] when
|
||||
there is a zone KEY from that zone present at its parent. If
|
||||
there is no zone KEY present, and the resolver is also unaware of
|
||||
alternative algorithms used and/or possible off-tree validation, the
|
||||
zone is considered unsecured. "
|
||||
|
||||
To clarify further: a zone is secure when the resolver expects it to
be. There are two possibilities:
|
||||
1. When its parent is secure and holds a signed KEY for this child.
|
||||
2. When the zone is a secure entry point, i.e. the resolver is
|
||||
preconfigured with the KEY of this zone.
|
||||
|
||||
RFC 3090 calls this globally secured.
|
||||
|
||||
When a zone contains SIGs and a selfsigned KEY and this KEY is
|
||||
preconfigured in the resolvers of interest, the zone can be
considered locally secured (the RFC 3090 definition).
|
||||
|
||||
If a zone is not globally or locally secured, it must be considered unsecured.
|
||||
|
||||
|
||||
|
||||
|
||||
Gieben & Lindgreen Expires November 2001 [Page 5]
|
||||
|
||||
Internet Draft Parent Stores Zone KEYS May 2001
|
||||
|
||||
|
||||
3. Impact on a secure aware resolver/forwarder
|
||||
|
|
@ -222,13 +245,6 @@ Internet Draft Parent Stores Zone KEYS March 2001
|
|||
|
||||
3.1. Impact of key rollovers on resolver/forwarder
|
||||
When a zone is in the process of a key rollover, there could be a
|
||||
|
||||
|
||||
|
||||
Gieben Expires September 2001 [Page 5]
|
||||
|
||||
Internet Draft Parent Stores Zone KEYS March 2001
|
||||
|
||||
discrepancy between the KEY and the SIG in the apex of the zone and
|
||||
the KEY and SIG that are stored in the cache of a resolver.
|
||||
|
||||
|
|
@ -257,13 +273,20 @@ Internet Draft Parent Stores Zone KEYS March 2001
|
|||
4. Scheduled key rollover
|
||||
When the signatures, produced by the key to be rolled over, are all
|
||||
in one zone file, there are two parties involved. Let us look at an
|
||||
example where a TLD rolls over its zone KEY. The new key needs to be
|
||||
signed with the root's key before it can be used to sign the TLD zone
|
||||
and the zone KEYs of the TLD's children. The steps that need to be
|
||||
taken by TLD and root are:
|
||||
possible example where a TLD rolls over its zone KEY. The new key
|
||||
needs to be signed with the root's key before it can be used to sign
|
||||
the TLD zone and the zone KEYs of the TLD's children. The steps that
|
||||
need to be taken by TLD and root are:
|
||||
- the TLD adds the new key to its KEY set in its zonefile. This
|
||||
zone and KEY set are signed with the old zone KEY
|
||||
- then the TLD signals the parent
|
||||
|
||||
|
||||
|
||||
Gieben & Lindgreen Expires November 2001 [Page 6]
|
||||
|
||||
Internet Draft Parent Stores Zone KEYS May 2001
|
||||
|
||||
- the root copies the new KEY set, consisting of both the new and
|
||||
the old key, in its zonefile, resigns it and signals the TLD
|
||||
- the TLD removes the old key from its KEY set, resigns its zone
|
||||
|
|
@ -280,25 +303,18 @@ Internet Draft Parent Stores Zone KEYS March 2001
|
|||
|
||||
|
||||
5. Unscheduled key rollover
|
||||
|
||||
|
||||
|
||||
Gieben Expires September 2001 [Page 6]
|
||||
|
||||
Internet Draft Parent Stores Zone KEYS March 2001
|
||||
|
||||
Although nobody hopes that this will ever happen, we must be able to
|
||||
cope with possible key compromises. When such an event occurs, an
|
||||
immediate keyrollover is needed and must be completed in the shortest
|
||||
possible time. With two parties involved, it will still be awkward,
|
||||
but not impossible to update two zonefiles overnight. "Out-of-band"
|
||||
communication between the two parties will be necessary, since the
|
||||
compromised old key can not be trusted. We think that between two
|
||||
parties this is doable, but this complicated procedure [5] is beyond
|
||||
the scope of this document.
|
||||
compromised old key can not be trusted. We think that between two
|
||||
parties this is doable, but this complicated procedure is beyond the
|
||||
scope of this document.
|
||||
|
||||
An alternative to an emergency key-rollover is becoming unsecured as
|
||||
an emercengy measure. This has already been mentioned above in
|
||||
an emergency measure. This has already been mentioned above in
|
||||
section 3.1. This only involves an emergency change in the parent's
|
||||
zonefile (deleting the child's zone KEY), and allows the child and
|
||||
its underlying zones time to clean up before becoming secured again,
|
||||
|
|
@ -322,6 +338,13 @@ Internet Draft Parent Stores Zone KEYS March 2001
|
|||
|
||||
To cope with 1, secure aware resolvers MUST be aware that during a
|
||||
key-rollover there may be a conflict, and that in that case the
|
||||
|
||||
|
||||
|
||||
Gieben & Lindgreen Expires November 2001 [Page 7]
|
||||
|
||||
Internet Draft Parent Stores Zone KEYS May 2001
|
||||
|
||||
parent always holds the active KEY set. To cope with 2, the local
|
||||
resolver/caching forwarder should be preconfigured with the zone-KEY
|
||||
and thus looks at its own zone as if it were a secure entry-point. For
|
||||
|
|
@ -329,24 +352,17 @@ Internet Draft Parent Stores Zone KEYS March 2001
|
|||
zonefile.
|
||||
|
||||
7.1. KEY bit in NXT records
|
||||
RFC 2535 [3], section 5.2 states:
|
||||
RFC 2535, section 5.2 states:
|
||||
|
||||
" The NXT RR type bit map format currently defined is one bit per
|
||||
RR type present for the owner name. A one bit indicates that at
|
||||
least one RR of that type is present for the owner name. A zero
|
||||
indicates that no such RR is present. [....] "
|
||||
" The NXT RR type bit map format currently defined is one bit per RR
|
||||
type present for the owner name. A one bit indicates that at least
|
||||
one RR of that type is present for the owner name. A zero indicates
|
||||
that no such RR is present. [....] "
|
||||
|
||||
As the zone KEY is present in a child zone, and signed by the
|
||||
zone KEY (thus selfsigned), the definition of NXT RR type bit states
|
||||
|
||||
|
||||
|
||||
Gieben Expires September 2001 [Page 7]
|
||||
|
||||
Internet Draft Parent Stores Zone KEYS March 2001
|
||||
|
||||
in RFC 2535 [3], section 5.2 that the KEY bit must be set. We do not
|
||||
see a compelling reason to change this default behavior.
|
||||
As the zone KEY is present in a child zone, and signed by the zone
|
||||
KEY (thus selfsigned), the definition of NXT RR type bit states in
|
||||
RFC 2535, section 5.2 that the KEY bit must be set. We do not see a
|
||||
compelling reason to change this default behavior.
|
||||
|
||||
7.2. Authority of KEY records
|
||||
The parent of a zone generates the signature for the key belonging to
|
||||
|
|
@ -371,15 +387,22 @@ Internet Draft Parent Stores Zone KEYS March 2001
|
|||
mechanism, like publishing it in a newspaper.
|
||||
|
||||
7.3. Selecting KEY sets
|
||||
As the zone KEY set is present in two places, there may be a
|
||||
possibility to find conflicting KEY sets, and this will at least
|
||||
really happen during a key-rollover.
|
||||
As the zone KEY set is present in two places, there is a possibility
|
||||
of two conflicting KEY sets, this will happen during a key-rollover
|
||||
and may happen at other times.
|
||||
|
||||
With one exception, a resolver MUST always select the KEY set from
|
||||
the parent in case of a conflict, as this is the active KEY set. For
|
||||
this reason, the parent sets the AA-bit on requests, while the child
|
||||
does not.
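
The selection rule, together with the one exception this section goes
on to describe (a preconfigured KEY for a zone treated as a secure
entry point), could be sketched like this. The type alias, function
name, and the representation of a KEY set as a frozen set of strings
are all hypothetical and chosen only for illustration.

```python
from typing import FrozenSet, Optional

# Hypothetical stand-in for a zone KEY RRset: a set of key identifiers.
KeySet = FrozenSet[str]


def select_key_set(parent_keys: KeySet,
                   child_keys: KeySet,
                   preconfigured: Optional[KeySet] = None) -> KeySet:
    """Choose which zone KEY set a resolver should trust."""
    # Exception: the resolver regards the child as a secure entry
    # point, so its preconfigured KEY takes precedence.
    if preconfigured is not None:
        return preconfigured
    # Conflict: the parent always holds the active KEY set.
    if parent_keys != child_keys:
        return parent_keys
    # No conflict: both sets are identical, so either will do.
    return child_keys


# Mid-rollover: the child has already dropped the old key, the parent
# has not, and the resolver has nothing preconfigured.
chosen = select_key_set(frozenset({"old", "new"}), frozenset({"new"}))
print(sorted(chosen))
```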
|
||||
|
||||
|
||||
|
||||
|
||||
Gieben & Lindgreen Expires November 2001 [Page 8]
|
||||
|
||||
Internet Draft Parent Stores Zone KEYS May 2001
|
||||
|
||||
The one exception is when a resolver regards the child's zone as a
|
||||
secure-entry point, in which case it has the zone KEY preconfigured.
|
||||
In other words: a preconfigured KEY has even more authority than what
|
||||
|
|
@ -389,28 +412,22 @@ Internet Draft Parent Stores Zone KEYS March 2001
|
|||
|
||||
8. The zone KEY and local KEY records.
|
||||
It must be recognized that the zone KEY RR, which is signed by a
|
||||
non-local organisation, is something special. The external signature
|
||||
non-local organization, is something special. The external signature
|
||||
over the public part of the key provides the local zone-administrator
|
||||
with the authority to use the corresponding private part to sign
|
||||
everything local, and thus to make his/her own zone secure. Please
|
||||
also note that the external signer, and NOT the local zone is
|
||||
authoritative for the zone KEY RRset.
|
||||
|
||||
|
||||
|
||||
|
||||
Gieben Expires September 2001 [Page 8]
|
||||
|
||||
Internet Draft Parent Stores Zone KEYS March 2001
|
||||
|
||||
Part of the RRs that the zone-administrator may wish to sign are KEY
|
||||
RRs for local use, for instance for IPSEC.
|
||||
|
||||
To make sure, that the local zone is authoritative for its own local
|
||||
KEY RRs, and that they are not exported and signed externally, these
local KEY records SHOULD NOT be part of the zone KEY RRset.
|
||||
Therefore, they SHOULD be placed under a label in the zonefile, f.i.
|
||||
keys.child.parent.
|
||||
Therefore, they could be placed under a label in the zonefile, e.g.
keys.child.parent, or for this kind of key a new RR type could be
|
||||
defined (e.g. PUBKEY).
|
||||
|
||||
Besides being kept clear of local KEY records, the zone KEY RRset
|
||||
SHOULD also be kept clear of any other obsolete or otherwise not
|
||||
|
|
@ -423,15 +440,38 @@ Internet Draft Parent Stores Zone KEYS March 2001
|
|||
progress. During a keyrollover a new KEY RR must be added to this
|
||||
RRset. Once the new KEY becomes the active zone KEY, the old KEY
|
||||
becomes obsolete and SHOULD be removed as soon as practically
|
||||
possible.
|
||||
possible. Information stored in caches SHOULD NOT be an issue on when
|
||||
to remove the old zone KEY.
|
||||
|
||||
|
||||
9. Security Considerations
|
||||
This document addresses the operational difficulties that arise if
|
||||
DNSSEC is deployed as it stands now, with the child's zone KEY not
|
||||
stored at the parent. By putting that key in the parent's zone the
|
||||
communication between the two is kept to a minimum thus reducing the
|
||||
risk of errors. All security considerations from RFC 2535 apply.
|
||||
This document addresses the operational difficulties that arise when
|
||||
DNSSEC is deployed. By putting the child's zone KEY at the parent we
|
||||
solve a lot of problems by minimizing the amount of communication
|
||||
between the two. There is one security issue: the parent must not
|
||||
ever create a valid parental SIG over a KEY RR, from which the
|
||||
private part is (also) known to someone other than the legitimate
|
||||
administrator of the child zone. This can happen in two ways:
|
||||
1. The private KEY at the child has been compromised.
|
||||
2. The parent has been fooled and thus insufficiently checked
|
||||
|
||||
|
||||
|
||||
Gieben & Lindgreen Expires November 2001 [Page 9]
|
||||
|
||||
Internet Draft Parent Stores Zone KEYS May 2001
|
||||
|
||||
whether the KEY RR is really from the child.
|
||||
|
||||
For the security it doesn't matter if the SIG and the KEY are located
|
||||
at the child or at the parent, but if they are located at the parent
|
||||
it is much easier to replace the SIG. And by keeping the parental SIG
|
||||
lifetime short, the parent helps to protect the child against
|
||||
possible key compromises. The selfsigned zone KEY stored in the
|
||||
child's zone can have a long SIG expiration lifetime, this has no
|
||||
impact on the child's security.
|
||||
|
||||
All security considerations from RFC 2535 apply.
|
||||
|
||||
|
||||
Authors' Addresses
|
||||
|
|
@ -445,26 +485,14 @@ Authors' Addresses
|
|||
|
||||
References
|
||||
|
||||
[1] Lewis, E. "DNS Security Extension Clarification on Zone
|
||||
[RFC 3090] Lewis, E. "DNS Security Extension Clarification on Zone
|
||||
Status", RFC 3090
|
||||
www.ietf.org/rfc/rfc3090.txt
|
||||
[2] Bradner, S. "Key words for use in RFCs to Indicate Requirement
|
||||
[RFC 2119] Bradner, S. "Key words for use in RFCs to Indicate Requirement
|
||||
Levels", RFC 2119
|
||||
www.ietf.org/rfc/rfc2119.txt
|
||||
[3] Eastlake, D. "DNS Security Extensions", RFC 2535
|
||||
[RFC 2535] Eastlake, D. "DNS Security Extensions", RFC 2535
|
||||
www.ietf.org/rfc/rfc2535.txt
|
||||
[4] Andrews, M., Eastlake, D. "Domain Name System (DNS) Security
|
||||
|
||||
|
||||
|
||||
Gieben Expires September 2001 [Page 9]
|
||||
|
||||
Internet Draft Parent Stores Zone KEYS March 2001
|
||||
|
||||
Key Rollover"
|
||||
www.ietf.org/internet-drafts/draft-ietf-dnsop-rollover-01.txt
|
||||
[5] Gieben, R. "Chain of trust"
|
||||
secnl.nlnetlabs.nl/thesis/thesis.html
|
||||
|
||||
|
||||
Full Copyright Statement
|
||||
|
|
@ -485,6 +513,13 @@ Full Copyright Statement
|
|||
followed, or as required to translate it into languages other than
|
||||
English.
|
||||
|
||||
|
||||
|
||||
Gieben & Lindgreen Expires November 2001 [Page 10]
|
||||
|
||||
Internet Draft Parent Stores Zone KEYS May 2001
|
||||
|
||||
|
||||
The limited permissions granted above are perpetual and will not
|
||||
be revoked by the Internet Society or its successors or assigns.
|
||||
|
||||
|
|
@ -1,898 +0,0 @@
|
|||
Internet Engineering Task Force (IETF) Mark Welter
|
||||
INTERNET-DRAFT Brian W. Spolarich
|
||||
draft-ietf-idn-dude-01.txt WALID, Inc.
|
||||
March 02, 2001 Expires September 02, 2001
|
||||
|
||||
|
||||
DUDE: Differential Unicode Domain Encoding
|
||||
|
||||
|
||||
Status of this memo
|
||||
|
||||
This document is an Internet-Draft and is in full conformance with all
|
||||
provisions of Section 10 of RFC2026.
|
||||
|
||||
Internet-Drafts are working documents of the Internet Engineering Task
|
||||
Force (IETF), its areas, and its working groups. Note that other
|
||||
groups may also distribute working documents as Internet-Drafts.
|
||||
|
||||
Internet-Drafts are draft documents valid for a maximum of six months
|
||||
and may be updated, replaced, or obsoleted by other documents at any
|
||||
time. It is inappropriate to use Internet-Drafts as reference
|
||||
material or to cite them other than as "work in progress."
|
||||
|
||||
The list of current Internet-Drafts can be accessed at
|
||||
http://www.ietf.org/ietf/1id-abstracts.txt
|
||||
|
||||
The list of Internet-Draft Shadow Directories can be accessed at
|
||||
http://www.ietf.org/shadow.html.
|
||||
|
||||
The distribution of this document is unlimited.
|
||||
|
||||
Copyright (c) The Internet Society (2000). All Rights Reserved.
|
||||
|
||||
Abstract
|
||||
|
||||
This document describes a transformation method for representing
|
||||
Unicode character codepoints in host name parts in a fashion that is
|
||||
completely compatible with the current Domain Name System. It provides
|
||||
for very efficient representation of typical Unicode sequences as
|
||||
host name parts, while preserving simplicity. It is proposed as a
|
||||
potential candidate for an ASCII-Compatible Encoding (ACE) for supporting
|
||||
the deployment of an internationalized Domain Name System.
|
||||
|
||||
|
||||
Table of Contents
|
||||
|
||||
1. Introduction
|
||||
1.1 Terminology
|
||||
2. Hostname Part Transformation
|
||||
2.1 Post-Converted Name Prefix
|
||||
2.2 Radix Selection
|
||||
2.3 Hostname Preparation
|
||||
2.4 Definitions
|
||||
2.5 DUDE Encoding
|
||||
2.5.1 Extended Variable Length Hex Encoding
|
||||
2.5.2 DUDE Compression Algorithm
|
||||
2.5.3 Forward Transformation Algorithm
|
||||
2.6 DUDE Decoding
|
||||
2.6.1 Extended Variable Length Hex Decoding
|
||||
2.6.2 DUDE Decompression Algorithm
|
||||
2.6.3 Reverse Transformation Algorithm
|
||||
3. Examples
|
||||
4. Optional Case Preservation
|
||||
5. Security Considerations
|
||||
6. References
|
||||
|
||||
|
||||
1. Introduction
|
||||
|
||||
DUDE describes an encoding scheme of the ISO/IEC 10646 [ISO10646]
|
||||
character set (whose character code assignments are synchronized
|
||||
with Unicode [UNICODE3]), and the procedures for using this scheme
|
||||
to transform host name parts containing Unicode character sequences
|
||||
into sequences that are compatible with the current DNS protocol
|
||||
[STD13]. As such, it satisfies the definition of a 'charset' as
|
||||
defined in [IDNREQ].
|
||||
|
||||
1.1 Terminology
|
||||
|
||||
The key words "MUST", "SHALL", "REQUIRED", "SHOULD", "RECOMMENDED", and
|
||||
"MAY" in this document are to be interpreted as described in RFC 2119
|
||||
[RFC2119].
|
||||
|
||||
Hexadecimal values are shown preceded with an "0x". For example,
|
||||
"0xa1b5" indicates two octets, 0xa1 followed by 0xb5. Binary values are
|
||||
shown preceded with an "0b". For example, a nine-bit value might be
|
||||
shown as "0b101101111".
|
||||
|
||||
Examples in this document use the notation from the Unicode Standard
|
||||
[UNICODE3] as well as the ISO 10646 names. For example, the letter "a"
|
||||
may be represented as either "U+0061" or "LATIN SMALL LETTER A".
|
||||
|
||||
DUDE converts strings with internationalized characters into
|
||||
strings of US-ASCII that are acceptable as host name parts in current
|
||||
DNS host naming usage. The former are called "pre-converted" and the
|
||||
latter are called "post-converted". This specification defines both
|
||||
a forward and reverse transformation algorithm.
|
||||
|
||||
|
||||
2. Hostname Part Transformation
|
||||
|
||||
According to [STD13], hostname parts must start and end with a letter
|
||||
or digit, and contain only letters, digits, and the hyphen character
|
||||
("-"). This, of course, excludes most characters used by non-English
|
||||
speakers, as well as many other characters in the ASCII
|
||||
character repertoire. Further, domain name parts must be 63 octets or
|
||||
shorter in length.
|
||||
|
||||
2.1 Post-Converted Name Prefix
|
||||
|
||||
This document defines the string 'dq--' as a prefix to identify
|
||||
DUDE-encoded sequences. For the purposes of comparison in the IDN
|
||||
Working Group activities, the 'dq--' prefix should be used solely to
|
||||
identify DUDE sequences. However, should this document proceed beyond
|
||||
draft status the prefix should be changed to whatever prefix, if any,
|
||||
is the final consensus of the IDN working group.
|
||||
|
||||
Note that the prepending of a fixed identifier sequence is only one
|
||||
mechanism for differentiating ASCII character encoded international
|
||||
domain names from 'ordinary' domain names. One method, as proposed in
|
||||
[IDNRACE], is to include a character prefix or suffix that does not
|
||||
appear in any name in any zone file. A second method is to insert a
|
||||
domain component which pushes off any international names one or more
|
||||
levels deeper into the DNS hierarchy. There are trade-offs between
|
||||
these two methods which are independent of the Unicode to ASCII
|
||||
transcoding method finally chosen. We do not address the international
|
||||
vs. 'ordinary' name differentiation issue in this paper.
|
||||
|
||||
2.2 Radix Selection
|
||||
|
||||
There are many proposed methods for representing Unicode characters
|
||||
within the allowed target character set, which can be split into groups
|
||||
on the basis of the underlying radix. We have chosen a method with
|
||||
radix 16 because both UTF-32 and ASCII are represented by even multiples
|
||||
of four bits. This allows a Unicode character to be encoded as a
|
||||
whole number of ASCII characters, and permits easier manipulation of
|
||||
the resulting encoded data by humans.
|
||||
|
||||
2.3 Hostname Preparation
|
||||
|
||||
The hostname part is assumed to have at least one character disallowed
|
||||
by [STD13], and to have been processed for logically equivalent
|
||||
character mapping, filtering of disallowed characters (if any), and
|
||||
compatibility composition/decomposition before presentation to the DUDE
|
||||
conversion algorithm.
|
||||
|
||||
While it is possible to invent a transcoding mechanism that relies
|
||||
on certain Unicode characters being deemed illegal within domain names
|
||||
and hence available to the transcoding mechanism for improving encoding
|
||||
efficiency, we feel that such a proposal would complicate matters
|
||||
excessively.
|
||||
|
||||
2.4 Definitions
|
||||
|
||||
For clarity:
|
||||
|
||||
'integer' is an unsigned binary quantity;
|
||||
'byte' is an 8-bit integer quantity;
|
||||
'nibble' is a 4-bit integer quantity.
|
||||
|
||||
2.5 DUDE Encoding
|
||||
|
||||
The idea behind this scheme is to provide compression by encoding the
|
||||
contiguous least significant nibbles of a character that differ from the
|
||||
preceding character. Using a variant of the variable length hex encoding
|
||||
described in [IDNDUERST] and elsewhere, by encoding leading zero nibbles
|
||||
this technique allows recovery of the differential length. The encoding
|
||||
is, with some practice, easy to perform manually.
|
||||
|
||||
2.5.1 Extended Variable Length Hex Encoding
|
||||
|
||||
The variable length hex encoding algorithm was introduced by Duerst in
|
||||
[IDNDUERST]. It encodes an integer value in a slight modification of
|
||||
traditional hexadecimal notation, the difference being that the most
|
||||
significant digit is represented with an alternate set of "digits"
|
||||
-- 'g' through 'v' are used to represent 0 through 15. The result is a
|
||||
variable length encoding which can efficiently represent integers of
|
||||
arbitrary length.
|
||||
|
||||
This specification extends the variable length hex encoding algorithm
|
||||
to support the compression scheme defined below by potentially not
|
||||
suppressing leading zero nibbles.
|
||||
|
||||
The extended variable length nibble encoding of an integer, C,
to length N, is defined as follows:

1. Start with I, the Nth least significant nibble from the least
   significant nibble of C;

2. Emit the Ith character of the sequence [ghijklmnopqrstuv];

3. Continue from the most to least significant, encoding each
   remaining nibble J by emitting the Jth character of the
   sequence [0123456789abcdef].
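
The three steps above can be sketched in Python as follows. The
function and constant names are hypothetical; only the two alphabets
and the nibble order come from the text.

```python
LEAD_DIGITS = "ghijklmnopqrstuv"   # alphabet for the most significant nibble
HEX_DIGITS = "0123456789abcdef"    # alphabet for every remaining nibble


def evlh_encode(c: int, n: int) -> str:
    """Encode the n least significant nibbles of c, keeping leading zeros."""
    if c < 0 or n < 1:
        raise ValueError("c must be non-negative and n at least 1")
    # Nibbles from the Nth least significant down to the least significant.
    nibbles = [(c >> (4 * k)) & 0xF for k in range(n - 1, -1, -1)]
    # Step 2: the first nibble uses the g..v alphabet.
    out = LEAD_DIGITS[nibbles[0]]
    # Step 3: the rest use ordinary lowercase hexadecimal digits.
    out += "".join(HEX_DIGITS[j] for j in nibbles[1:])
    return out


print(evlh_encode(0x61, 2))  # prints "m1"
print(evlh_encode(0x5, 3))   # prints "g05", leading zero nibbles are kept
```

Because only the first emitted character comes from the g..v alphabet,
a decoder can tell where each encoded integer begins.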
|
||||
|
||||
2.5.2 DUDE Compression Algorithm
|
||||
|
||||
1. Let PREV = 0;
|
||||
|
||||
2. If there are no more characters in the input, terminate successfully;
|
||||
|
||||
4. Let C be the next character in the input;
|
||||
|
||||
5. If C != '-' , then go to step 7;
|
||||
|
||||
6. Consume the input character, emit '-', and go to step 2;
|
||||
|
||||
7. Let D be the result of PREV exclusive ORed with C;
|
||||
|
||||
8. Find the least positive value N such that
|
||||
D bitwise ANDed with M is zero
|
||||
where M = the bitwise complement of (16**N) - 1;
|
||||
|
||||
9. Let V be C ANDed with the bitwise complement of M;
|
||||
|
||||
10. Variable length hex encode V to length N and emit the result;
|
||||
|
||||
11. Let PREV = C and go to step 2.
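A minimal C++ sketch of the compression steps above is given here for
illustration only. The name dude_compress is hypothetical, input values
are assumed to be 20-bit code points, and PREV is left unchanged by a
'-' exactly as the steps state.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: compress one hostname part given as code points.
std::string dude_compress(const std::vector<std::uint32_t>& in)
{
    static const char lead[] = "ghijklmnopqrstuv";
    static const char hex[]  = "0123456789abcdef";
    std::string out;
    std::uint32_t prev = 0;                          // step 1
    for (std::uint32_t c : in) {                     // steps 2 and 4
        if (c == '-') {                              // steps 5 and 6
            out += '-';
            continue;
        }
        assert(c <= 0xFFFFF);                        // assumption: 20-bit input
        std::uint32_t d = prev ^ c;                  // step 7
        int n = 1;                                   // step 8: smallest N with
        while (n < 8 && (d >> (4 * n)) != 0)         // no difference above it
            ++n;
        out += lead[(c >> (4 * (n - 1))) & 0xF];     // steps 9 and 10
        for (int i = n - 2; i >= 0; --i)
            out += hex[(c >> (4 * i)) & 0xF];
        prev = c;                                    // step 11
    }
    return out;
}
```

Applied to U+0645 U+0648 U+0642 U+0639, the sketch produces "m45oij9":
the first character needs three nibbles, the next two differ only in
the last nibble, and the fourth differs in the last two.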
|
||||
|
||||
|
||||
2.5.3 Forward Transformation Algorithm
|
||||
|
||||
The DUDE transformation algorithm accepts a string in UTF-32
[UNICODE3] format as input. It is assumed that prior nameprep
processing has disallowed the private use code points in
0x100000 through 0x10FFFF, so that we are left with the task of
encoding 20-bit integers. The encoding algorithm is as follows:
|
||||
|
||||
1. Break the hostname string into dot-separated hostname parts.
|
||||
For each hostname part which contains one or more characters
|
||||
disallowed by [STD13], perform steps 2 and 3 below;
|
||||
|
||||
2. Compress the hostname part using the method described in section
|
||||
2.5.2 above, and encode using the encoding described in section
|
||||
2.5.1;
|
||||
|
||||
3. Prepend the post-converted name prefix 'dq--' (see section 2.1
|
||||
above) to the resulting string.
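The test in step 1, whether a hostname part contains a character
disallowed by [STD13], might be sketched as below. The name needs_dude
is hypothetical, and the allowed set is assumed to be ASCII letters,
digits, and hyphen.

```cpp
#include <cassert>   // assert() is used only in the checks after this block
#include <cstdint>
#include <vector>

// Hypothetical helper: true if the part must be compressed and prefixed.
bool needs_dude(const std::vector<std::uint32_t>& part)
{
    for (std::uint32_t c : part) {
        bool letter = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
        bool digit  = (c >= '0' && c <= '9');
        if (!letter && !digit && c != '-')
            return true;             // found a character outside the set
    }
    return false;                    // part can be passed through unchanged
}
```

With this test, a part such as "com" is left alone, while a part
containing U+4E2D is compressed and given the 'dq--' prefix.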
|
||||
|
||||
|
||||
2.6 DUDE Decoding
|
||||
|
||||
2.6.1 Extended Variable Length Hex Decoding
|
||||
|
||||
Decoding extended variable length hex encoded strings is identical
to decoding the standard variable length hex encoding, and is defined
as follows:
|
||||
|
||||
1. Let CL be the lower case of the first input character,
|
||||
|
||||
If CL is not in set [ghijklmnopqrstuv],
|
||||
return error,
|
||||
else
|
||||
consume the input character;
|
||||
|
||||
2. Let R = CL - 'g',
|
||||
Let N = 1;
|
||||
|
||||
3. If no more input characters exist, go to step 9.
|
||||
|
||||
4. Let CL be the lower case of the next input character;
|
||||
|
||||
5. If CL is not in the set [0123456789abcdef], go to Step 9;
|
||||
|
||||
6. Consume the next input character,
|
||||
Let N = N + 1;
|
||||
Let R = R * 16;
|
||||
|
||||
7. If CL is in set [0123456789],
then let R = R + (CL - '0')
else let R = R + (CL - 'a') + 10;
|
||||
|
||||
8. Go to step 3;
|
||||
|
||||
9. Let MASK be the bitwise complement of (16**N) - 1;
|
||||
|
||||
10. Return decoded result R as well as MASK.
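The decoding steps above can be sketched in C++ as follows. The
structure EvlhexResult, the function name, and the rejection of values
longer than eight nibbles are assumptions made for this illustration;
ASCII character codes are also assumed.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <string>

struct EvlhexResult {
    bool ok;                // false on malformed input
    std::uint32_t value;    // R in the steps above
    std::uint32_t mask;     // MASK in the steps above
    std::size_t next;       // index of the first unconsumed character
};

// Hypothetical helper: decode one value starting at s[pos].
EvlhexResult evlhex_decode(const std::string& s, std::size_t pos)
{
    assert(pos <= s.size());
    EvlhexResult res = { false, 0, 0, pos };
    if (pos >= s.size())
        return res;
    int cl = std::tolower(static_cast<unsigned char>(s[pos]));
    if (cl < 'g' || cl > 'v')                        // step 1
        return res;
    std::uint32_t r = static_cast<std::uint32_t>(cl - 'g');   // step 2
    int n = 1;
    ++pos;
    while (pos < s.size()) {                         // steps 3 through 8
        cl = std::tolower(static_cast<unsigned char>(s[pos]));
        std::uint32_t digit;
        if (cl >= '0' && cl <= '9')      digit = cl - '0';
        else if (cl >= 'a' && cl <= 'f') digit = cl - 'a' + 10;
        else break;
        if (n == 8)                                  // would overflow 32 bits
            return res;
        r = r * 16 + digit;
        ++n;
        ++pos;
    }
    res.ok = true;
    res.value = r;
    res.mask = (n == 8) ? 0 : ~((std::uint32_t(1) << (4 * n)) - 1);  // step 9
    res.next = pos;
    return res;                                      // step 10
}
```

Decoding "m45oij9" from position 0 consumes "m45" and stops at 'o',
returning the value 0x645 with a mask that clears the low three nibbles.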
|
||||
|
||||
2.6.2 DUDE Decompression Algorithm
|
||||
|
||||
1. Let PREV = 0;
|
||||
|
||||
2. If there are no more input characters then terminate successfully;
|
||||
|
||||
3. Let C be the next input character;
|
||||
|
||||
4. If C == '-', append '-' to the result string, consume the character,
|
||||
and go to step 2,
|
||||
|
||||
5. Let VPART, MASK be the next extended variable length hex decoded
|
||||
value and mask;
|
||||
|
||||
6. If VPART > 0xFFFFF then return error status,
|
||||
|
||||
7. Let CU = ( PREV bitwise-AND MASK) + VPART,
|
||||
Let PREV = CU;
|
||||
|
||||
8. Append the UTF-32 character CU to the result string;
|
||||
|
||||
9. Go to step 2.
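A C++ sketch of the decompression loop follows, with the hex decoding
of section 2.6.1 written inline so that the fragment stands alone. The
name dude_decompress and the boolean error return are assumptions, not
part of this specification.

```cpp
#include <cassert>   // assert() is used only in the checks after this block
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: returns false on malformed input.
bool dude_decompress(const std::string& in, std::vector<std::uint32_t>& out)
{
    out.clear();
    std::uint32_t prev = 0;                              // step 1
    std::size_t i = 0;
    while (i < in.size()) {                              // step 2
        if (in[i] == '-') {                              // steps 3 and 4
            out.push_back('-');
            ++i;
            continue;
        }
        // Step 5: decode VPART and MASK as in section 2.6.1.
        int cl = std::tolower(static_cast<unsigned char>(in[i]));
        if (cl < 'g' || cl > 'v')
            return false;
        std::uint32_t vpart = static_cast<std::uint32_t>(cl - 'g');
        int n = 1;
        ++i;
        while (i < in.size()) {
            cl = std::tolower(static_cast<unsigned char>(in[i]));
            std::uint32_t digit;
            if (cl >= '0' && cl <= '9')      digit = cl - '0';
            else if (cl >= 'a' && cl <= 'f') digit = cl - 'a' + 10;
            else break;
            if (n == 8)                      // assumption: 32-bit values only
                return false;
            vpart = vpart * 16 + digit;
            ++n;
            ++i;
        }
        std::uint32_t mask =
            (n == 8) ? 0 : ~((std::uint32_t(1) << (4 * n)) - 1);
        if (vpart > 0xFFFFF)                             // step 6
            return false;
        prev = (prev & mask) + vpart;                    // step 7
        out.push_back(prev);                             // step 8
    }
    return true;
}
```

Given "m45oij9", the sketch recovers U+0645 U+0648 U+0642 U+0639: each
short group replaces only the low nibbles of the previous character.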
|
||||
|
||||
|
||||
2.6.3 Reverse Transformation Algorithm
|
||||
|
||||
1. Break the string into dot-separated components and apply Steps
2 and 3 to each component that begins with the prefix 'dq--';
components without the prefix are left unchanged;

2. Remove the post-converted name prefix 'dq--' (see Section 2.1);

3. Decompress the component using the decompression algorithm
described in section 2.6.2 (which in turn invokes the decoding
algorithm described in section 2.6.1);

4. Concatenate the decoded components with dot separators and return.
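Recognizing and removing the prefix can be sketched as below. The name
strip_dude_prefix is hypothetical; the case-insensitive comparison is
an assumption based on the DNS treating names case-insensitively.

```cpp
#include <cassert>   // assert() is used only in the checks after this block
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: if label starts with "dq--" (in any case), store
// the remainder in rest and return true; otherwise return false.
bool strip_dude_prefix(const std::string& label, std::string& rest)
{
    static const char tag[] = "dq--";
    const std::size_t taglen = sizeof(tag) - 1;
    if (label.size() < taglen)
        return false;
    for (std::size_t i = 0; i < taglen; ++i)
        if (std::tolower(static_cast<unsigned char>(label[i])) != tag[i])
            return false;
    rest = label.substr(taglen);
    return true;
}
```

A component for which this returns false, such as "com", is copied to
the output unchanged.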
|
||||
|
||||
3. Examples
|
||||
|
||||
The examples below illustrate the encoding algorithm. Allowed RFC1035
|
||||
characters, including period [U+002E] and dash [U+002D] are shown as
|
||||
literals in the UTF-16 version of the example. DUDE is compared to
|
||||
LACE as proposed in [IDNLACE]. A comprehensive comparison of ACE
|
||||
proposals is outside of the scope of this document. However, we believe
|
||||
that DUDE shows a good balance between efficiency (resulting in shorter
|
||||
ACE sequences for typical names) and complexity.
|
||||
|
||||
|
||||
3.1 'www.walid.com' [Arabic]:
|
||||
|
||||
UTF-16: U+0645 U+0648 U+0642 U+0639 . U+0648 U+0644 U+064A U+062F .
|
||||
U+0634 U+0631 U+0643 U+0629
|
||||
|
||||
DUDE: dq--m45oij9.dq--m48kqif.dq--m34hk3i9
|
||||
|
||||
LACE: bq--aqdekscche.bq--aqdeqrckf5.bq--aqddimkdfe
|
||||
|
||||
3.2 'Abugazalah-Intellectual-Property.com' [Arabic]:
|
||||
|
||||
UTF-16: U+0623 U+0628 U+0648 U+063A U+0632 U+0627 U+0644 U+0629 -
|
||||
U+0644 U+0644 U+0645 U+0644 U+0643 U+064A U+0629 - U+0627
|
||||
U+0644 U+0641 U+0643 U+0631 U+064A U+0629 . U+0634 U+0631
|
||||
U+0643 U+0629
|
||||
|
||||
DUDE: dq--m23ok8jaii7k4i9-m44klkjqi9-m27k4hjj1kai9.dq--m34hk3i9
|
||||
|
||||
LACE: bq--badcgkcihizcorbjaeac2bygircekrcdjiuqcabna4dcorcbimyuuki.
|
||||
bq--aqddimkdfe
|
||||
|
||||
3.3 'King-Hussain.person.jr' [Arabic]
|
||||
|
||||
UTF-16: U+0627 U+0644 U+0645 U+0644 U+0643 - U+062D U+0633 U+064A
|
||||
U+0646 . U+0634 U+062E U+0635 . U+0627 U+0644 U+0623 U+0631
|
||||
U+062F U+0646
|
||||
|
||||
DUDE: dq--m27k4lkj-m2dj3kam.dq--m34iej5.dq--m27k4i3j1ifk6
|
||||
|
||||
LACE: bq--audcorcfirbqcabnaudegljtjjda.bq--amddilrv.
|
||||
bq--aydcorbdgexum
|
||||
|
||||
3.4 'Jordanian-Dental-Center.com.jr' [Arabic]
|
||||
|
||||
UTF-16: U+0645 U+0631 U+0643 U+0632 - U+0627 U+0644 U+0623 U+0631 U+062F
|
||||
U+0646 - U+0644 U+0644 U+0623 U+0633 U+0646 U+0627 U+0646 .
|
||||
U+0634 U+0631 U+0643 U+0629 . U+0627 U+0644 U+0623 U+0631 U+062F
|
||||
U+0646
|
||||
|
||||
DUDE: dq--m45j1k3j2-m27k4i3j1ifk6-m44ki3j3k6i7k6.dq--m34hk3i9.
|
||||
dq--m27k4i3j1ifk6
|
||||
|
||||
LACE: bq--aqdekmkdgiaqaligaytuiizrf5dacabna4deirbdgndcorq.
|
||||
bq--aqddimkdfe.bq--aydcorbdgexum
|
||||
|
||||
3.5 'Mahindra.com' [Hindi]:
|
||||
|
||||
UTF-16: U+092E U+0939 U+093F U+0928 U+094D U+0926 U+094D U+0930
|
||||
U+093E . U+0935 U+094D U+092F U+093E U+092A U+093E U+0930
|
||||
|
||||
DUDE: dq--p2ej9vi8kdi6kdj0u.dq--p35kdifjeiajeg
|
||||
|
||||
LACE: bq--bees4oj7fbgsmtjqhy.bq--a4etktjphyvd4ma
|
||||
|
||||
3.6 'Webdunia.com' [Hindi]:
|
||||
|
||||
UTF-16: U+0935 U+0947 U+092C U+0926 U+0941 U+0928 U+093F U+092F
|
||||
U+093E . U+0935 U+094D U+092F U+093E U+092A U+093E U+0930
|
||||
|
||||
DUDE: dq--p35k7icmk1i8jfifje.dq--p35kdifjeiajeg
|
||||
|
||||
LACE: bq--beetkrzmezasqpzphy.bq--a4etktjphyvd4ma
|
||||
|
||||
3.7 'Chinese Finance.com' [Traditional Chinese]
|
||||
|
||||
UTF-16: U+4E2D U+83EF U+8CA1 U+7D93 . c o m
|
||||
|
||||
DUDE: dq--ke2do3efsa1nd93.com
|
||||
|
||||
LACE: bq--75hc3a7prsqx3ey.com
|
||||
|
||||
3.8 'Chinese Readers.net' [Chinese]
|
||||
|
||||
UTF-16: U+842C U+7DAD U+8B80 U+8005 . U+7DB2 U+7D61
|
||||
|
||||
DUDE: dq--o42cndadob80g05.dq--ndb2m1
|
||||
|
||||
LACE: bq--76ccy7nnroaiabi.bq--aj63eyi
|
||||
|
||||
3.9 'Russian-Standard.com.ru' [Russian]
|
||||
|
||||
UTF-16: U+0440 U+0443 U+0441 U+0441 U+043A U+0438 U+0439 -
|
||||
U+0441 U+0442 U+0430 U+043D U+0434 U+0430 U+0440 U+0442 .
|
||||
U+043A U+043E U+043C . U+0440 U+0444
|
||||
|
||||
DUDE: dq--k40jhhjaop-k3ausk1ij0tkgk0i.dq--k3aus.dq--k40k
|
||||
|
||||
LACE: bq--a4ceaq2bie5dqoibaawqqbcbiiyd2nbqibba.bq--amcdupr4.
|
||||
bq--aiceara
|
||||
|
||||
3.10 'Vladimir-Putin.person.ru' [Russian]
|
||||
|
||||
UTF-16: U+0432 U+043B U+0430 U+0434 U+0438 U+043C U+0438 U+0440 -
|
||||
U+043F U+0443 U+0442 U+0438 U+043D . U+043B U+0438 U+0447
|
||||
U+043D U+043E U+0441 U+0442 U+044C . U+0440 U+0444 U+0020
|
||||
|
||||
DUDE: dq--k32rgkosok0-k3fk3ij8t.dq--k3bok7jduk1is.dq--k40k
|
||||
|
||||
LACE: bq--bacdeozqgq4dyocaaeac2bieh5bueob5.
|
||||
bq--bacdwochhu7ecqsm.bq--aiceara
|
||||
|
||||
|
||||
4. Optional Case Preservation
|
||||
|
||||
An extension to the DUDE concept recognizes that the first
|
||||
character emitted by the variable length hex encoding algorithm is
|
||||
always alphabetic. We encode the case (if any) of the original Unicode
|
||||
character in the case of the initial "hex" character. Because the DNS
|
||||
performs case-insensitive comparisons, mixed case international domain
|
||||
names behave in exactly the same way as traditional domain names.
|
||||
In particular, this enables reverse lookups to return names in the
|
||||
preferred case.
|
||||
|
||||
In contrast to other proposals as of this writing, such a case preserving
|
||||
version of DUDE will interoperate with the non case preserving version.
|
||||
|
||||
Despite the foregoing, we feel that the additional complexity of tracking
|
||||
character case through the nameprep processing is not warranted by the
|
||||
marginal utility of the result.
|
||||
|
||||
5. Security Considerations
|
||||
|
||||
Much of the security of the Internet relies on the DNS and any
|
||||
change to the characteristics of the DNS may change the security of
|
||||
much of the Internet. Therefore DUDE makes no changes to the DNS itself.
|
||||
|
||||
DUDE is designed so that distinct Unicode sequences map to distinct
|
||||
domain name sequences (modulo the Unicode and DNS equivalence rules).
|
||||
Therefore use of DUDE with DNS will not negatively affect security below
|
||||
the application level.
|
||||
|
||||
If an application has security reliance on the Unicode string S, produced
|
||||
by an inverse ACE transformation of a name T, the application must verify
|
||||
that the nameprepped and ACE encoded result of S is DNS-equivalent to T.
|
||||
|
||||
6. Change History
|
||||
|
||||
The statement that we intended to submit a Nameprep draft was removed in
light of the changes made between the first and second nameprep drafts.
|
||||
|
||||
The details of DUDE extensions for case preservation etc. have been
|
||||
removed. Basic DUDE was changed to operate over the relevant 20 bit
|
||||
UTF32 code points.
|
||||
|
||||
Examples have been extended.
|
||||
|
||||
ACE security issues were clarified.
|
||||
|
||||
7. References
|
||||
|
||||
[IDNCOMP] Paul Hoffman, "Comparison of Internationalized Domain Name
|
||||
Proposals", draft-ietf-idn-compare;
|
||||
|
||||
[IDNrACE] Paul Hoffman, "RACE: Row-Based ASCII Compatible Encoding for
|
||||
IDN", draft-ietf-idn-race;
|
||||
|
||||
[IDNLACE] Mark Davis, "LACE: Length-Based ASCII Compatible Encoding for
|
||||
IDN", draft-ietf-idn-lace;
|
||||
|
||||
[IDNREQ] James Seng, "Requirements of Internationalized Domain Names",
|
||||
draft-ietf-idn-requirement;
|
||||
|
||||
[IDNNAMEPREP] Paul Hoffman and Marc Blanchet, "Preparation of
|
||||
Internationalized Host Names", draft-ietf-idn-nameprep;
|
||||
|
||||
[IDNDUERST] M. Duerst, "Internationalization of Domain Names",
|
||||
draft-duerst-dns-i18n;
|
||||
|
||||
[ISO10646] ISO/IEC 10646-1:1993. International Standard -- Information
|
||||
technology -- Universal Multiple-Octet Coded Character Set (UCS) --
|
||||
Part 1: Architecture and Basic Multilingual Plane. Five amendments and
|
||||
a technical corrigendum have been published up to now. UTF-16 is
|
||||
described in Annex Q, published as Amendment 1. 17 other amendments are
|
||||
currently at various stages of standardization;
|
||||
|
||||
[RFC2119] Scott Bradner, "Key words for use in RFCs to Indicate
|
||||
Requirement Levels", March 1997, RFC 2119;
|
||||
|
||||
[STD13] Paul Mockapetris, "Domain names - implementation and
|
||||
specification", November 1987, STD 13 (RFC 1035);
|
||||
|
||||
[UNICODE3] The Unicode Consortium, "The Unicode Standard -- Version
|
||||
3.0", ISBN 0-201-61633-5. Described at
|
||||
<http://www.unicode.org/unicode/standard/versions/Unicode3.0.html>.
|
||||
|
||||
|
||||
A. Acknowledgements
|
||||
|
||||
The structure (and some of the structural text) of this document is
|
||||
intentionally borrowed from the LACE IDN draft (draft-ietf-idn-lace-00)
|
||||
by Mark Davis and Paul Hoffman.
|
||||
|
||||
B. IANA Considerations
|
||||
|
||||
There are no IANA considerations in this document.
|
||||
|
||||
|
||||
C. Author Contact Information
|
||||
|
||||
Mark Welter
|
||||
Brian W. Spolarich
|
||||
WALID, Inc.
|
||||
State Technology Park
|
||||
2245 S. State St.
|
||||
Ann Arbor, MI 48104
|
||||
+1-734-822-2020
|
||||
|
||||
mwelter@walid.com
|
||||
briansp@walid.com
|
||||
|
||||
D. DUDE C++ Implementation
|
||||
|
||||
#include <stdio.h>
|
||||
#include <string.h>
|
||||
#include <ctype.h>
|
||||
#include <limits.h>
|
||||
|
||||
#define IDN_ERROR INT_MIN
|
||||
|
||||
#define DUDETAG "dq--"
|
||||
|
||||
typedef unsigned int uchar_t;
|
||||
|
||||
bool idn_isRFC1035(const uchar_t * in, int len)
|
||||
{
|
||||
const uchar_t * end = in + len;
|
||||
|
||||
while (in < end)
|
||||
{
|
||||
if ((*in > 127) ||
|
||||
!strchr("abcdefghijklmnopqrstuvwxyz0123456789-.", tolower(*in)))
|
||||
return false;
|
||||
in++;
|
||||
}
|
||||
return true;
|
||||
}
|
||||
|
||||
static const char *hexchar = "0123456789abcdef";
|
||||
static const char *leadchar = "ghijklmnopqrstuv";
|
||||
|
||||
/*
|
||||
dudehex -- convert an integer, v, into n DUDE hex characters.
|
||||
The result is placed in ostr. The buffer ends at the byte before
|
||||
eop, and false is returned to indicate insufficient buffer space.
|
||||
*/
|
||||
static bool dudehex(char * & ostr, const char * eop,
|
||||
unsigned int v, int n)
|
||||
{
|
||||
if ((ostr + n) >= eop)
|
||||
return false;
|
||||
|
||||
n--; // convert to zero origin
|
||||
|
||||
*ostr++ = leadchar[(v >> (n << 2)) & 0x0F];
|
||||
|
||||
while (n > 0)
|
||||
{
|
||||
n--;
|
||||
*ostr++ = hexchar[(v >> (n << 2)) & 0x0F];
|
||||
}
|
||||
return true;
|
||||
}
|
||||
|
||||
/*
|
||||
idn_dudeseg converts istr, a utf-32 domain name segment into DUDE.
|
||||
eip points at the character after the input segment.
|
||||
ostr points at an output buffer which ends just before eop.
|
||||
If there is insufficient buffer space, the function return is false.
|
||||
Invalid surrogate sequences will also cause a return of false.
|
||||
*/
|
||||
static bool idn_dudeseg(const uchar_t * istr, const uchar_t * eip,
|
||||
char * & ostr, char * eop)
|
||||
{
|
||||
const uchar_t * ip = istr;
|
||||
unsigned p = 0;
|
||||
|
||||
    while (ip < eip)
    {
        if (*ip == '-')
        {
            if (ostr >= eop)            // no room left for the hyphen
                return false;
            *ostr++ = '-';
        }
        else // if (validnc(*ip))
        {
            unsigned int c = *ip;

            unsigned d = p ^ c;         // d now has the difference (xor)
                                        // between the current and previous char

            int n = 1;                  // Count the number of significant nibbles
            while (d >>= 4)
                n++;

            if (!dudehex(ostr, eop, c, n))
                return false;           // insufficient buffer space
            p = c;
        }
        ip++;
    }
    *ostr = 0;
    return true;
|
||||
}
|
||||
|
||||
|
||||
/*
|
||||
idn_UTF32toDUDE converts a UTF-32 domain name into DUDE.
|
||||
in, a UTF-32 vector of length inlen is the input domain name.
|
||||
outstr is a char output buffer of length outmax.
|
||||
On success, the number of output characters is returned.
|
||||
On failure, a negative number is returned.
|
||||
|
||||
It is assumed that the input has been nameprepped.
|
||||
|
||||
If this routine is used in a registration context, segment and
|
||||
overall length restrictions must be checked by the user.
|
||||
*/
|
||||
|
||||
int idn_UTF32toDUDE(const uchar_t * in, int inlen, char *outstr, int outmax)
|
||||
{
|
||||
const uchar_t *ip = in;
|
||||
const uchar_t *eip = in + inlen;
|
||||
const uchar_t *ep = ip;
|
||||
char *op = outstr;
|
||||
char *eop = outstr + outmax - 1;
|
||||
|
||||
while (ip < eip)
|
||||
{
|
||||
ep = ip;
|
||||
while ((ep < eip) && (*ep != '.'))
|
||||
ep++;
|
||||
|
||||
        if (idn_isRFC1035(ip, ep - ip))
        {
            // segment is already a legal hostname part: copy it verbatim
            if ((ep - ip) >= (eop - op))
            {
                *outstr = '\0';
                return IDN_ERROR;
            }
            while (ip < ep)
                *op++ = *ip++;
        }
        else
        {
            const char * tagp = DUDETAG;    // prefix the segment
            while (*tagp)                   // with the tag (dq--)
            {
                if (op >= eop)
                {
                    *outstr = '\0';
                    return IDN_ERROR;
                }
                *op++ = *tagp++;
            }

            if (!idn_dudeseg(ip, ep, op, eop))
            {
                *outstr = '\0';
                return IDN_ERROR;
            }
        }
|
||||
|
||||
if (op >= eop) // check for output buffer overflow
|
||||
{
|
||||
*outstr = '\0';
|
||||
return IDN_ERROR;
|
||||
}
|
||||
if (ep < eip)
|
||||
*op++ = *ep; // copy '.'
|
||||
|
||||
ip = ep + 1;
|
||||
}
|
||||
|
||||
*op = '\0';
|
||||
|
||||
    return (op - outstr);   // number of characters written, excluding the NUL
|
||||
}
|
||||
|
||||
/*
|
||||
  idn_DUDEsegtoUTF32 converts instr, a DUDE encoded domain name segment
  of length inlen, into UTF-32.
  outstr points at an output buffer with room for maxlen characters.
  On success, the number of characters written is returned.
  On malformed input or insufficient buffer space, IDN_ERROR is returned.
*/
static int idn_DUDEsegtoUTF32(const char * instr, int inlen,
                              uchar_t * outstr, int maxlen)
{
    const char * ip = instr;
    const char * eip = instr + inlen;
    uchar_t * op = outstr;
    uchar_t * eop = op + maxlen - 1;    // keep one slot for the terminator

    unsigned prev = 0;

    while (ip < eip)
    {
        if (op >= eop)
            return IDN_ERROR;

        if (*ip == '-')
        {
            *op++ = '-';
            ip++;                       // consume the hyphen
            continue;
        }

        char c0 = tolower((unsigned char) *ip);
        if ((c0 < 'g') || (c0 > 'v'))
            return IDN_ERROR;

        ip++;

        unsigned r = c0 - 'g';
        int n = 1;
        while (ip < eip)
        {
            char cl = tolower((unsigned char) *ip);
            unsigned digit;
            if ((cl >= '0') && (cl <= '9'))
                digit = cl - '0';
            else if ((cl >= 'a') && (cl <= 'f'))
                digit = (cl - 'a') + 10;
            else
                break;

            if (n >= 8)                 // value no longer fits in 32 bits
                return IDN_ERROR;
            r = (r << 4) + digit;
            ip++;
            n++;
        }

        if (r > 0x0fffff)               // VPART must fit in 20 bits
            return IDN_ERROR;

        // mask keeps the nibbles of prev that were not transmitted
        unsigned mask = (n >= 8) ? 0u : (~0u << (n << 2));

        unsigned cu = (prev & mask) + r;
        prev = cu;

        *op++ = cu;
    }
    *op = '\0';
    return (op - outstr);
}
|
||||
|
||||
int idn_DUDEtoUTF32(const char * in, int inlen, uchar_t * outstr, int outmax)
|
||||
{
|
||||
const char *ip = in;
|
||||
const char *eip = in + inlen;
|
||||
const char *ep = ip;
|
||||
uchar_t *op = outstr;
|
||||
uchar_t *eop = outstr + outmax - 1;
|
||||
|
||||
while (ip < eip)
|
||||
{
|
||||
ep = ip;
|
||||
        while ((ep < eip) && (*ep != '.'))
|
||||
ep++;
|
||||
|
||||
const char * tip = ip;
|
||||
const char * tagp = DUDETAG;
|
||||
while (*tagp && (tip < ep) && (tolower(*tagp) == tolower(*tip)))
|
||||
{
|
||||
tip++;
|
||||
tagp++;
|
||||
}
|
||||
|
||||
if (*tagp)
|
||||
{ // tag doesn't match, copy segment verbatim
|
||||
while (ip < ep)
|
||||
{
|
||||
if (op >= eop)
|
||||
return IDN_ERROR;
|
||||
*op++ = *ip++;
|
||||
}
|
||||
}
|
||||
else
|
||||
{
|
||||
ip = tip;
|
||||
int rv = idn_DUDEsegtoUTF32(ip, ep - ip, op, eop - op);
|
||||
|
||||
if (rv < 0)
|
||||
return IDN_ERROR;
|
||||
|
||||
op += rv;
|
||||
}
|
||||
|
||||
        if (ep >= eip)                  // last segment: no separator follows
            break;

        if (op >= eop)
            return IDN_ERROR;
        *op++ = '.';                    // copy the separator

        ip = ep + 1;
    }

    if (op > eop)
        return IDN_ERROR;

    *op = '\0';

    return (op - outstr);
|
||||
}
|
||||
|
||||
/*
|
||||
DUDE test driver
|
||||
*/
|
||||
|
||||
void printres(char *title, int rv, char *buff);
|
||||
void printres(char *title, int rv, uchar_t *buff);
|
||||
|
||||
int main(int argc, char *argv[])
|
||||
{
|
||||
char inbuff[512];
|
||||
|
||||
while (fgets(inbuff, sizeof(inbuff), stdin))
|
||||
{
|
||||
char cbuff[128];
|
||||
uchar_t wbuff[128];
|
||||
uchar_t iwbuff[128];
|
||||
uchar_t *wsp = wbuff;
|
||||
uchar_t wc;
|
||||
int in;
|
||||
int nr;
|
||||
|
||||
char * inp = inbuff;
|
||||
wsp = wbuff;
|
||||
while (sscanf(inp, "%x%n", &in, &nr) > 0)
|
||||
{
|
||||
inp += nr;
|
||||
*wsp++ = in;
|
||||
}
|
||||
fprintf(stdout, "\n");
|
||||
|
||||
int rv;
|
||||
rv = idn_UTF32toDUDE(wbuff, wsp - wbuff, cbuff, sizeof(cbuff));
|
||||
printres("toDUDE", rv, cbuff);
|
||||
|
||||
if (rv >= 0)
|
||||
{
|
||||
            rv = idn_DUDEtoUTF32(cbuff, rv, iwbuff,
                                 sizeof(iwbuff) / sizeof(iwbuff[0]));
|
||||
printres("toUTF32", rv, iwbuff);
|
||||
}
|
||||
|
||||
}
|
||||
return 0;
|
||||
}
|
||||
|
||||
void printres(char *title, int rv, char *buff)
|
||||
{
|
||||
fprintf(stdout, "%s (%d) : ", title, rv);
|
||||
if (rv >= 0)
|
||||
{
|
||||
unsigned char *dp = (unsigned char *) buff;
|
||||
while (*dp)
|
||||
{
|
||||
fprintf(stdout, "%c", *dp++);
|
||||
}
|
||||
}
|
||||
fprintf(stdout, "\n");
|
||||
}
|
||||
|
||||
void printres(char *title, int rv, uchar_t *buff)
|
||||
{
|
||||
fprintf(stdout, "%s (%d) : ", title, rv);
|
||||
if (rv >= 0)
|
||||
{
|
||||
uchar_t *dp = buff;
|
||||
while (*dp)
|
||||
{
|
||||
fprintf(stdout, " %05x", *dp++);
|
||||
}
|
||||
}
|
||||
fprintf(stdout, "\n");
|
||||
}
|
||||
864
doc/draft/draft-ietf-idn-dude-02.txt
Normal file
|
|
@ -0,0 +1,864 @@
|
|||
INTERNET-DRAFT Mark Welter
|
||||
draft-ietf-idn-dude-02.txt Brian W. Spolarich
|
||||
Expires 2001-Dec-07 Adam M. Costello
|
||||
2001-Jun-07
|
||||
|
||||
Differential Unicode Domain Encoding (DUDE)
|
||||
|
||||
Status of this Memo
|
||||
|
||||
This document is an Internet-Draft and is in full conformance with
|
||||
all provisions of Section 10 of RFC2026.
|
||||
|
||||
Internet-Drafts are working documents of the Internet Engineering
|
||||
Task Force (IETF), its areas, and its working groups. Note
|
||||
that other groups may also distribute working documents as
|
||||
Internet-Drafts.
|
||||
|
||||
Internet-Drafts are draft documents valid for a maximum of six
|
||||
months and may be updated, replaced, or obsoleted by other documents
|
||||
at any time. It is inappropriate to use Internet-Drafts as
|
||||
reference material or to cite them other than as "work in progress."
|
||||
|
||||
The list of current Internet-Drafts can be accessed at
|
||||
http://www.ietf.org/ietf/1id-abstracts.txt
|
||||
|
||||
The list of Internet-Draft Shadow Directories can be accessed at
|
||||
http://www.ietf.org/shadow.html
|
||||
|
||||
Distribution of this document is unlimited. Please send comments to
|
||||
the authors or to the idn working group at idn@ops.ietf.org.
|
||||
|
||||
Abstract
|
||||
|
||||
DUDE is a reversible transformation from a sequence of nonnegative
|
||||
integer values to a sequence of letters, digits, and hyphens (LDH
|
||||
characters). DUDE provides a simple and efficient ASCII-Compatible
|
||||
Encoding (ACE) of Unicode strings [UNICODE] for use with
|
||||
Internationalized Domain Names [IDN] [IDNA].
|
||||
|
||||
Contents
|
||||
|
||||
1. Introduction
|
||||
2. Terminology
|
||||
3. Overview
|
||||
4. Base-32 characters
|
||||
5. Encoding procedure
|
||||
6. Decoding procedure
|
||||
7. Example strings
|
||||
8. Security considerations
|
||||
9. References
|
||||
A. Acknowledgements
|
||||
B. Author contact information
|
||||
C. Mixed-case annotation
|
||||
D. Differences from draft-ietf-idn-dude-01
|
||||
E. Example implementation
|
||||
|
||||
1. Introduction
|
||||
|
||||
The IDNA draft [IDNA] describes an architecture for supporting
|
||||
internationalized domain names. Each label of a domain name may
|
||||
begin with a special prefix, in which case the remainder of the
|
||||
label is an ASCII-Compatible Encoding (ACE) of a Unicode string
|
||||
satisfying certain constraints. For the details of the constraints,
|
||||
see [IDNA] and [NAMEPREP03]. The prefix has not yet been specified,
|
||||
but see http://www.i-d-n.net/ for prefixes to be used for testing
|
||||
and experimentation.
|
||||
|
||||
DUDE is intended to be used as an ACE within IDNA, and has been
|
||||
designed to have the following features:
|
||||
|
||||
* Completeness: Every sequence of nonnegative integers maps to an
|
||||
LDH string. Restrictions on which integers are allowed, and on
|
||||
sequence length, may be imposed by higher layers.
|
||||
|
||||
* Uniqueness: Every sequence of nonnegative integers maps to at
|
||||
most one LDH string.
|
||||
|
||||
* Reversibility: Any Unicode string mapped to an LDH string can
|
||||
be recovered from that LDH string.
|
||||
|
||||
* Efficient encoding: The ratio of encoded size to original size
|
||||
is small. This is important in the context of domain names
|
||||
because [RFC1034] restricts the length of a domain label to 63
|
||||
characters.
|
||||
|
||||
* Simplicity: The encoding and decoding algorithms are reasonably
|
||||
simple to implement. The goals of efficiency and simplicity are
|
||||
at odds; DUDE places greater emphasis on simplicity.
|
||||
|
||||
An optional feature is described in appendix C "Mixed-case
|
||||
annotation".
|
||||
|
||||
2. Terminology
|
||||
|
||||
The key words "must", "shall", "required", "should", "recommended",
|
||||
and "may" in this document are to be interpreted as described in
|
||||
RFC 2119 [RFC2119].
|
||||
|
||||
LDH characters are the letters A-Z and a-z, the digits 0-9, and
|
||||
hyphen-minus.
|
||||
|
||||
A quartet is a sequence of four bits (also known as a nibble or
|
||||
nybble).
|
||||
|
||||
A quintet is a sequence of five bits.
|
||||
|
||||
Hexadecimal values are shown preceded by "0x". For example, 0x60
is decimal 96.
|
||||
|
||||
As in the Unicode Standard [UNICODE], Unicode code points are
|
||||
denoted by "U+" followed by four to six hexadecimal digits, while a
|
||||
range of code points is denoted by two hexadecimal numbers separated
|
||||
by "..", with no prefixes.
|
||||
|
||||
XOR means bitwise exclusive or. Given two nonnegative integer
|
||||
values A and B, A XOR B is the nonnegative integer value whose
|
||||
binary representation is 1 in whichever places the binary
|
||||
representations of A and B disagree, and 0 wherever they agree.
|
||||
For the purpose of applying this rule, recall that an integer's
|
||||
representation begins with an infinite number of unwritten zeros.
|
||||
In some programming languages, care may need to be taken that A and
|
||||
B are stored in variables of the same type and size.
|
||||
|
||||
3. Overview
|
||||
|
||||
DUDE encodes a sequence of nonnegative integral values as a sequence
|
||||
of LDH characters, although implementations will of course need to
|
||||
represent the output characters somehow, typically as ASCII octets.
|
||||
When DUDE is used to encode Unicode characters, the input values are
|
||||
Unicode code points (integral values in the range 0..10FFFF, but not
|
||||
D800..DFFF, which are reserved for use by UTF-16).
|
||||
|
||||
Each value in the input sequence is represented by one or more LDH
|
||||
characters in the encoded string. The value 0x2D is represented
|
||||
by hyphen-minus (U+002D). Each non-hyphen-minus character in
|
||||
the encoded string represents a quintet. A sequence of quintets
|
||||
represents the bitwise XOR between each non-0x2D integer and the
|
||||
previous one.
|
||||
|
||||
4. Base-32 characters
|
||||
|
||||
"a" = 0 = 0x00 = 00000 "s" = 16 = 0x10 = 10000
|
||||
"b" = 1 = 0x01 = 00001 "t" = 17 = 0x11 = 10001
|
||||
"c" = 2 = 0x02 = 00010 "u" = 18 = 0x12 = 10010
|
||||
"d" = 3 = 0x03 = 00011 "v" = 19 = 0x13 = 10011
|
||||
"e" = 4 = 0x04 = 00100 "w" = 20 = 0x14 = 10100
|
||||
"f" = 5 = 0x05 = 00101 "x" = 21 = 0x15 = 10101
|
||||
"g" = 6 = 0x06 = 00110 "y" = 22 = 0x16 = 10110
|
||||
"h" = 7 = 0x07 = 00111 "z" = 23 = 0x17 = 10111
|
||||
"i" = 8 = 0x08 = 01000 "2" = 24 = 0x18 = 11000
|
||||
"j" = 9 = 0x09 = 01001 "3" = 25 = 0x19 = 11001
|
||||
"k" = 10 = 0x0A = 01010 "4" = 26 = 0x1A = 11010
|
||||
"m" = 11 = 0x0B = 01011 "5" = 27 = 0x1B = 11011
|
||||
"n" = 12 = 0x0C = 01100 "6" = 28 = 0x1C = 11100
|
||||
"p" = 13 = 0x0D = 01101 "7" = 29 = 0x1D = 11101
|
||||
"q" = 14 = 0x0E = 01110 "8" = 30 = 0x1E = 11110
|
||||
"r" = 15 = 0x0F = 01111 "9" = 31 = 0x1F = 11111
|
||||
|
||||
The digits "0" and "1" and the letters "o" and "l" are not used, to
|
||||
avoid transcription errors.
|
||||
|
||||
A decoder must accept both the uppercase and lowercase forms of
|
||||
the base-32 characters (including mixtures of both forms). An
|
||||
encoder should output only lowercase forms or only uppercase forms
|
||||
(unless it uses the feature described in appendix C "Mixed-case
|
||||
annotation").
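The table above can be held in a single 32-character string, as in the
following C++ sketch. The names kBase32, base32_char, and base32_value
are hypothetical, and the sketch always emits lowercase.

```cpp
#include <cassert>
#include <cctype>

// The 32 base-32 characters in value order; "l", "o", "0", "1" are absent.
static const char kBase32[] = "abcdefghijkmnpqrstuvwxyz23456789";

// Quintet (0..31) to its lowercase base-32 character.
char base32_char(unsigned quintet)
{
    assert(quintet < 32);
    return kBase32[quintet];
}

// Base-32 character (either case) to its quintet, or -1 if not base-32.
int base32_value(char ch)
{
    int lower = std::tolower(static_cast<unsigned char>(ch));
    for (int v = 0; v < 32; ++v)
        if (kBase32[v] == lower)
            return v;
    return -1;
}
```

For instance, base32_char(11) is 'm' because "l" is skipped, and
base32_value('M') is 11 because a decoder accepts both cases.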
|
||||
|
||||
5. Encoding procedure
|
||||
|
||||
All ordering of bits, quartets, and quintets is big-endian (most
|
||||
significant first).
|
||||
|
||||
let prev = 0x60
|
||||
for each input integer n (in order) do begin
|
||||
if n == 0x2D then output hyphen-minus
|
||||
else begin
|
||||
let diff = prev XOR n
|
||||
represent diff in base 16 as a sequence of quartets,
|
||||
as few as are sufficient (but at least one)
|
||||
prepend 0 to the last quartet and 1 to each of the others
|
||||
output a base-32 character corresponding to each quintet
|
||||
let prev = n
|
||||
end
|
||||
end
|
||||
|
||||
If an encoder encounters an input value larger than expected (for
|
||||
example, the largest Unicode code point is U+10FFFF, and nameprep
|
||||
[NAMEPREP03] can never output a code point larger than U+EFFFD),
|
||||
the encoder may either encode the value correctly, or may fail, but
|
||||
it must not produce incorrect output. The encoder must fail if it
|
||||
encounters a negative input value.
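The encoding procedure can be sketched in C++ as follows. This is an
illustration under stated assumptions, not the example implementation
of appendix E: the name dude02_encode is hypothetical, and input values
are held in unsigned 32-bit integers, so negative values cannot occur
and every representable value is encoded.

```cpp
#include <cassert>   // assert() is used only in the checks after this block
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: encode a sequence of integers as base-32 text.
std::string dude02_encode(const std::vector<std::uint32_t>& in)
{
    static const char base32[] = "abcdefghijkmnpqrstuvwxyz23456789";
    std::string out;
    std::uint32_t prev = 0x60;
    for (std::uint32_t n : in) {
        if (n == 0x2D) {                 // hyphen-minus is copied through;
            out += '-';                  // prev is not updated
            continue;
        }
        std::uint32_t diff = prev ^ n;
        int quartets = 1;                // as few as suffice, at least one
        while (quartets < 8 && (diff >> (4 * quartets)) != 0)
            ++quartets;
        for (int i = quartets - 1; i >= 0; --i) {
            std::uint32_t q = (diff >> (4 * i)) & 0xF;
            // prepend 1 to every quartet except the last, which gets 0
            out += base32[((i > 0) ? 0x10 : 0x00) | q];
        }
        prev = n;
    }
    return out;
}
```

With this sketch, the single value 0x61 encodes as "b" (0x61 XOR 0x60
is 1), and the pair 0x2C7EF 0x2C7EF encodes as "u6z2ra", the final "a"
representing a difference of zero.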
|
||||
|
||||
6. Decoding procedure
|
||||
|
||||
let prev = 0x60
|
||||
while the input string is not exhausted do begin
|
||||
if the next character is hyphen-minus
|
||||
then consume it and output 0x2D
|
||||
else begin
|
||||
consume characters and convert them to quintets until
|
||||
encountering a quintet whose first bit is 0
|
||||
fail upon encountering a non-base-32 character or end-of-input
|
||||
strip the first bit of each quintet
|
||||
concatenate the resulting quartets to form diff
|
||||
let prev = prev XOR diff
|
||||
output prev
|
||||
end
|
||||
end
|
||||
encode the output sequence and compare it to the input string
|
||||
fail if they do not match (case-insensitively)
|
||||
|
||||
The comparison at the end is necessary to guarantee the uniqueness
|
||||
property (there cannot be two distinct encoded strings representing
|
||||
the same sequence of integers). This check also frees the decoder
|
||||
from having to check for overflow while decoding the base-32
|
||||
characters. (If the decoder is one step of a larger decoding
|
||||
process, it may be possible to defer the re-encoding and comparison
|
||||
to the end of that larger decoding process.)
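A C++ sketch of the decoding procedure, including the final re-encoding
check, is shown below. All names are hypothetical. Values are assumed
to fit in 32 bits; longer groups wrap around during decoding and are
then rejected by the comparison, as the text above describes.

```cpp
#include <cassert>   // assert() is used only in the checks after this block
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

static const char kBase32[] = "abcdefghijkmnpqrstuvwxyz23456789";

// The encoding procedure of section 5, needed for the final comparison.
static std::string encode_sketch(const std::vector<std::uint32_t>& in)
{
    std::string out;
    std::uint32_t prev = 0x60;
    for (std::uint32_t n : in) {
        if (n == 0x2D) { out += '-'; continue; }
        std::uint32_t diff = prev ^ n;
        int quartets = 1;
        while (quartets < 8 && (diff >> (4 * quartets)) != 0)
            ++quartets;
        for (int i = quartets - 1; i >= 0; --i)
            out += kBase32[((i > 0) ? 0x10 : 0x00) | ((diff >> (4 * i)) & 0xF)];
        prev = n;
    }
    return out;
}

// Returns the quintet for a base-32 character, or -1 if it is not one.
static int quintet_of(char ch)
{
    int lower = std::tolower(static_cast<unsigned char>(ch));
    for (int v = 0; v < 32; ++v)
        if (kBase32[v] == lower)
            return v;
    return -1;
}

// Hypothetical helper: returns false if decoding fails.
bool dude02_decode(const std::string& in, std::vector<std::uint32_t>& out)
{
    out.clear();
    std::uint32_t prev = 0x60;
    std::size_t i = 0;
    while (i < in.size()) {
        if (in[i] == '-') { out.push_back(0x2D); ++i; continue; }
        std::uint32_t diff = 0;
        for (;;) {
            if (i >= in.size()) return false;    // end of input mid-group
            int q = quintet_of(in[i++]);
            if (q < 0) return false;             // non-base-32 character
            diff = (diff << 4) | static_cast<std::uint32_t>(q & 0xF);
            if ((q & 0x10) == 0) break;          // first bit 0 ends the group
        }
        prev ^= diff;
        out.push_back(prev);
    }
    // Re-encode and compare case-insensitively to enforce uniqueness.
    std::string again = encode_sketch(out);
    if (again.size() != in.size()) return false;
    for (std::size_t k = 0; k < in.size(); ++k)
        if (again[k] != std::tolower(static_cast<unsigned char>(in[k])))
            return false;
    return true;
}
```

The comparison rejects, for example, "sb": it decodes to the same value
as "b" but carries a redundant leading zero quartet.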
|
||||
|
||||
7. Example strings
|
||||
|
||||
The first several examples are nonsense strings of mostly unassigned
|
||||
code points intended to exercise the corner cases of the algorithm.
|
||||
|
||||
(A) u+0061
|
||||
DUDE: b
|
||||
|
||||
(B) u+2C7EF u+2C7EF
|
||||
DUDE: u6z2ra
|
||||
|
||||
(C) u+1752B u+1752A
|
||||
DUDE: tzxwmb
|
||||
|
||||
(D) u+63AB1 u+63ABA
|
||||
DUDE: yv47bm
|
||||
|
||||
(E) u+261AF u+261BF
|
||||
DUDE: uyt6rta
|
||||
|
||||
(F) u+C3A31 u+C3A8C
|
||||
DUDE: 6v4xb5p
|
||||
|
||||
(G) u+09F44 u+0954C
|
||||
DUDE: 39ue4si
|
||||
|
||||
(H) u+8D1A3 u+8C8A3
|
||||
DUDE: 27t6dt3sa
|
||||
|
||||
(I) u+6C2B6 u+CC266
|
||||
DUDE: y6u7g4ss7a
|
||||
|
||||
(J) u+002D u+002D u+002D u+E848F
|
||||
DUDE: ---82w8r
|
||||
|
||||
(K) u+BD08E u+002D u+002D u+002D
|
||||
DUDE: 57s8q---
|
||||
|
||||
(L) u+A9A24 u+002D u+002D u+002D u+C05B7
|
||||
DUDE: 434we---y393d
|
||||
|
||||
(M) u+7FFFFFFF
|
||||
DUDE: z999993r or explicit failure
|
||||
|
||||
The next several examples are realistic Unicode strings that could
|
||||
be used in domain names. They exhibit single-row text, two-row
|
||||
text, ideographic text, and mixtures thereof. These examples are
|
||||
names of Japanese television programs, music artists, and songs,
|
||||
merely because one of the authors happened to have them handy.
|
||||
|
||||
(N) 3<nen>b<gumi><kinpachi><sensei> (Latin, kanji)
|
||||
u+0033 u+5E74 u+0062 u+7D44 u+91D1 u+516B u+5148 u+751F
|
||||
DUDE: xdx8whx8tgz7ug863f6s5kuduwxh
|
||||
|
||||
(O) <amuro><namie>-with-super-monkeys (Latin, kanji, hyphens)
|
||||
u+5B89 u+5BA4 u+5948 u+7F8E u+6075 u+002D u+0077 u+0069 u+0074
|
||||
u+0068 u+002D u+0073 u+0075 u+0070 u+0065 u+0072 u+002D u+006D
|
||||
u+006F u+006E u+006B u+0065 u+0079 u+0073
|
||||
DUDE: x58jupu8nuy6gt99m-yssctqtptn-tmgftfth-trcbfqtnk
|
||||
|
||||
(P) maji<de>koi<suru>5<byou><mae> (Latin, hiragana, kanji)
|
||||
u+006D u+0061 u+006A u+0069 u+3067 u+006B u+006F u+0069 u+3059
|
||||
u+308B u+0035 u+79D2 u+524D
|
||||
DUDE: pnmdvssqvssnegvsva7cvs5qz38hu53r
|
||||
|
||||
(Q) <pafii>de<runba> (Latin, katakana)
|
||||
u+30D1 u+30D5 u+30A3 u+30FC u+0064 u+0065 u+30EB u+30F3 u+30D0
|
||||
DUDE: vs5bezgxrvs3ibvs2qtiud
|
||||
|
||||
(R) <sono><supiido><de> (hiragana, katakana)
|
||||
u+305D u+306E u+30B9 u+30D4 u+30FC u+30C9 u+3067
|
||||
DUDE: vsvpvd7hypuivf4q
|
||||
|
||||
8. Security considerations
|
||||
|
||||
Users expect each domain name in DNS to be controlled by a single
|
||||
authority. If a Unicode string intended for use as a domain label
|
||||
could map to multiple ACE labels, then an internationalized domain
|
||||
name could map to multiple ACE domain names, each controlled by
|
||||
a different authority, some of which could be spoofs that hijack
|
||||
service requests intended for another. Therefore DUDE is designed
|
||||
so that each Unicode string has a unique encoding.
|
||||
|
||||
However, there can still be multiple Unicode representations of the
|
||||
"same" text, for various definitions of "same". This problem is
|
||||
addressed to some extent by the Unicode standard under the topic of
|
||||
canonicalization, and this work is leveraged for domain names by
|
||||
"nameprep" [NAMEPREP03].
|
||||
|
||||
9. References
|
||||
|
||||
[IDN] Internationalized Domain Names (IETF working group),
|
||||
http://www.i-d-n.net/, idn@ops.ietf.org.
|
||||
|
||||
[IDNA] Patrik Faltstrom, Paul Hoffman, "Internationalizing Host
|
||||
Names In Applications (IDNA)", draft-ietf-idn-idna-01.
|
||||
|
||||
[NAMEPREP03] Paul Hoffman, Marc Blanchet, "Preparation
|
||||
of Internationalized Host Names", 2001-Feb-24,
|
||||
draft-ietf-idn-nameprep-03.
|
||||
|
||||
[RFC952] K. Harrenstien, M. Stahl, E. Feinler, "DOD Internet Host
|
||||
Table Specification", 1985-Oct, RFC 952.
|
||||
|
||||
[RFC1034] P. Mockapetris, "Domain Names - Concepts and Facilities",
|
||||
1987-Nov, RFC 1034.
|
||||
|
||||
[RFC1123] Internet Engineering Task Force, R. Braden (editor),
|
||||
"Requirements for Internet Hosts -- Application and Support",
|
||||
1989-Oct, RFC 1123.
|
||||
|
||||
[RFC2119] Scott Bradner, "Key words for use in RFCs to Indicate
|
||||
Requirement Levels", 1997-Mar, RFC 2119.
|
||||
|
||||
[SFS] David Mazieres et al, "Self-certifying File System",
|
||||
http://www.fs.net/.
|
||||
|
||||
[UNICODE] The Unicode Consortium, "The Unicode Standard",
|
||||
http://www.unicode.org/unicode/standard/standard.html.
|
||||
|
||||
A. Acknowledgements
|
||||
|
||||
The basic encoding of integers to quartets to quintets to base-32
|
||||
comes from earlier IETF work by Martin Duerst. DUDE uses a slight
|
||||
variation on the idea.
|
||||
|
||||
Paul Hoffman provided helpful comments on this document.
|
||||
|
||||
The idea of avoiding 0, 1, o, and l in base-32 strings was taken
|
||||
from SFS [SFS].
|
||||
|
||||
B. Author contact information
|
||||
|
||||
Mark Welter <mwelter@walid.com>
|
||||
Brian W. Spolarich <briansp@walid.com>
|
||||
WALID, Inc.
|
||||
State Technology Park
|
||||
2245 S. State St.
|
||||
Ann Arbor, MI 48104
|
||||
+1 734 822 2020
|
||||
|
||||
Adam M. Costello <amc@cs.berkeley.edu>
|
||||
University of California, Berkeley
|
||||
http://www.cs.berkeley.edu/~amc/
|
||||
|
||||
C. Mixed-case annotation
|
||||
|
||||
In order to use DUDE to represent case-insensitive Unicode strings,
|
||||
higher layers need to case-fold the Unicode strings prior to DUDE
|
||||
encoding. The encoded string can, however, use mixed-case base-32
|
||||
(rather than all-lowercase or all-uppercase as recommended in
|
||||
section 4 "Base-32 characters") as an annotation telling how to
|
||||
convert the folded Unicode string into a mixed-case Unicode string
|
||||
for display purposes.
|
||||
|
||||
Each Unicode code point (unless it is U+002D hyphen-minus) is
|
||||
represented by a sequence of base-32 characters, the last of which
|
||||
is always a letter (as opposed to a digit). If that letter is
|
||||
uppercase, it is a suggestion that the Unicode character be mapped
|
||||
to uppercase (if possible); if the letter is lowercase, it is a
|
||||
suggestion that the Unicode character be mapped to lowercase (if
|
||||
possible).
|
||||
|
||||
DUDE encoders and decoders are not required to support these
|
||||
annotations, and higher layers need not use them.
|
||||
|
||||
Example: In order to suggest that example O in section 7 "Example
|
||||
strings" be displayed as:
|
||||
|
||||
<amuro><namie>-with-SUPER-MONKEYS
|
||||
|
||||
one could capitalize the DUDE encoding as:
|
||||
|
||||
x58jupu8nuy6gt99m-yssctqtptn-tMGFtFtH-tRCBFQtNK
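
A reader of such an annotated string can recover the per-code-point
case suggestions without fully decoding it. The sketch below is a
hypothetical helper (dude_case_flags is not defined by the draft) that
assumes its input is already a well-formed DUDE string and performs no
validation. It relies on the fact that the last character of each
sequence carries a quintet of the form 0xxxx, which is always one of
the letters a-k, m, n, p, q, r in either case.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Writes one flag per encoded code point into flags[0..max-1]:    */
/* 1 if the final base-32 character of its sequence is uppercase,  */
/* 0 otherwise (hyphen-minus always yields 0).  Returns the number */
/* of flags written.                                               */
static size_t dude_case_flags(const char *s, unsigned char *flags,
                              size_t max)
{
  static const char finals[] = "abcdefghijkmnpqr";  /* values 0..15 */
  size_t count = 0;

  assert(s != NULL && flags != NULL);

  for (; *s != '\0'; ++s) {
    int lower = tolower((unsigned char) *s);

    /* Characters with values 16..31 continue the current sequence. */
    if (*s != '-' && strchr(finals, lower) == NULL) continue;

    if (count == max) return count;
    flags[count++] = (unsigned char) (isupper((unsigned char) *s) != 0);
  }

  return count;
}
```

Applied to the capitalized string above, this yields 24 flags: zeros
for the five kanji, "-with-", and the second hyphen, and ones for the
letters of SUPER and MONKEYS.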
|
||||
|
||||
D. Differences from draft-ietf-idn-dude-01
|
||||
|
||||
Four changes have been made since draft-ietf-idn-dude-01 (DUDE-01):
|
||||
|
||||
1) DUDE-01 computed the XOR of each integer with the previous one
|
||||
in order to decide how many bits of each integer to encode, but
|
||||
now the XOR itself is encoded, so there is no need for a mask.
|
||||
|
||||
2) DUDE-01 made the first quintet of each sequence different from
|
||||
the rest, while now it is the last quintet that differs, so it's
|
||||
easier for the decoder to detect the end of the sequence.
|
||||
|
||||
3) The base-32 map has changed to avoid 0, 1, o, and l, to help
|
||||
humans avoid transcription errors.
|
||||
|
||||
4) The initial value of the previous code point has changed from 0
|
||||
to 0x60, making the encodings of a few domain names shorter and
|
||||
none longer.
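
The effect of change 4 can be checked with a little arithmetic. The
sketch below uses a hypothetical helper, quintet_count, to count the
base-32 characters needed for a code point given the previous one.
With an initial value of 0, u+0061 needs two characters (difference
0x61); with 0x60 it needs one (difference 0x01). A code point such as
u+007A needs two characters either way.

```c
#include <assert.h>

/* Number of base-32 characters used to encode codept when the */
/* previous code point is prev: one per 4 bits of prev ^ codept, */
/* with a minimum of one.                                       */
static unsigned int quintet_count(unsigned long prev,
                                  unsigned long codept)
{
  unsigned long diff = prev ^ codept;
  unsigned int k = 1;

  while ((diff >>= 4) != 0) ++k;

  /* An unsigned long holds at most two nibbles per byte. */
  assert(k <= 2 * sizeof diff);
  return k;
}
```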
|
||||
|
||||
|
||||
E. Example implementation
|
||||
|
||||
|
||||
|
||||
/******************************************/
|
||||
/* dude.c 0.2.3 (2001-May-31-Thu) */
|
||||
/* Adam M. Costello <amc@cs.berkeley.edu> */
|
||||
/******************************************/
|
||||
|
||||
/* This is ANSI C code (C89) implementing */
|
||||
/* DUDE (draft-ietf-idn-dude-02). */
|
||||
|
||||
|
||||
/************************************************************/
|
||||
/* Public interface (would normally go in its own .h file): */
|
||||
|
||||
#include <limits.h>
|
||||
|
||||
enum dude_status {
|
||||
dude_success,
|
||||
dude_bad_input,
|
||||
dude_big_output /* Output would exceed the space provided. */
|
||||
};
|
||||
|
||||
enum case_sensitivity { case_sensitive, case_insensitive };
|
||||
|
||||
#if UINT_MAX >= 0x1FFFFF
|
||||
typedef unsigned int u_code_point;
|
||||
#else
|
||||
typedef unsigned long u_code_point;
|
||||
#endif
|
||||
|
||||
enum dude_status dude_encode(
|
||||
unsigned int input_length,
|
||||
const u_code_point input[],
|
||||
const unsigned char uppercase_flags[],
|
||||
unsigned int *output_size,
|
||||
char output[] );
|
||||
|
||||
/* dude_encode() converts Unicode to DUDE (without any */
|
||||
/* signature). The input must be represented as an array */
|
||||
/* of Unicode code points (not code units; surrogate pairs */
|
||||
/* are not allowed), and the output will be represented as */
|
||||
/* null-terminated ASCII. The input_length is the number of code */
|
||||
/* points in the input. The output_size is an in/out argument: */
|
||||
/* the caller must pass in the maximum number of characters */
|
||||
/* that may be output (including the terminating null), and on */
|
||||
/* successful return it will contain the number of characters */
|
||||
/* actually output (including the terminating null, so it will be */
|
||||
/* one more than strlen() would return, which is why it is called */
|
||||
/* output_size rather than output_length). The uppercase_flags */
|
||||
/* array must hold input_length boolean values, where nonzero */
|
||||
/* means the corresponding Unicode character should be forced */
|
||||
/* to uppercase after being decoded, and zero means it is */
|
||||
/* caseless or should be forced to lowercase. Alternatively, */
|
||||
/* uppercase_flags may be a null pointer, which is equivalent */
|
||||
/* to all zeros. The encoder always outputs lowercase base-32 */
|
||||
/* characters except when nonzero values of uppercase_flags */
|
||||
/* require otherwise. The return value may be any of the */
|
||||
/* dude_status values defined above; if not dude_success, then */
|
||||
/* output_size and output may contain garbage. On success, the */
|
||||
/* encoder will never need to write an output_size greater than */
|
||||
/* input_length*k+1 if all the input code points are less than 1 */
|
||||
/* << (4*k) and k >= 2, because of how the encoding is defined.   */
|
||||
|
||||
enum dude_status dude_decode(
|
||||
enum case_sensitivity case_sensitivity,
|
||||
char scratch_space[],
|
||||
const char input[],
|
||||
unsigned int *output_length,
|
||||
u_code_point output[],
|
||||
unsigned char uppercase_flags[] );
|
||||
|
||||
/* dude_decode() converts DUDE (without any signature) to */
|
||||
/* Unicode. The input must be represented as null-terminated */
|
||||
/* ASCII, and the output will be represented as an array of */
|
||||
/* Unicode code points. The case_sensitivity argument influences */
|
||||
/* the check on the well-formedness of the input string; it */
|
||||
/* must be case_sensitive if case-sensitive comparisons are */
|
||||
/* allowed on encoded strings, case_insensitive otherwise. */
|
||||
/* The scratch_space must point to space at least as large */
|
||||
/* as the input, which will get overwritten (this allows the */
|
||||
/* decoder to avoid calling malloc()). The output_length is */
|
||||
/* an in/out argument: the caller must pass in the maximum */
|
||||
/* number of code points that may be output, and on successful */
|
||||
/* return it will contain the actual number of code points */
|
||||
/* output. The uppercase_flags array must have room for at */
|
||||
/* least output_length values, or it may be a null pointer if */
|
||||
/* the case information is not needed. A nonzero flag indicates */
|
||||
/* that the corresponding Unicode character should be forced to */
|
||||
/* uppercase by the caller, while zero means it is caseless or */
|
||||
/* should be forced to lowercase. The return value may be any */
|
||||
/* of the dude_status values defined above; if not dude_success, */
|
||||
/* then output_length, output, and uppercase_flags may contain */
|
||||
/* garbage. On success, the decoder will never need to write */
|
||||
/* an output_length greater than the length of the input (not */
|
||||
/* counting the null terminator), because of how the encoding is */
|
||||
/* defined. */
|
||||
|
||||
|
||||
/**********************************************************/
|
||||
/* Implementation (would normally go in its own .c file): */
|
||||
|
||||
#include <string.h>
|
||||
|
||||
/* Character utilities: */
|
||||
|
||||
/* base32[q] is the lowercase base-32 character representing */
|
||||
/* the number q from the range 0 to 31. Note that we cannot */
|
||||
/* use string literals for ASCII characters because an ANSI C */
|
||||
/* compiler does not necessarily use ASCII. */
|
||||
|
||||
static const char base32[] = {
|
||||
97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, /* a-k */
|
||||
109, 110, /* m-n */
|
||||
112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, /* p-z */
|
||||
50, 51, 52, 53, 54, 55, 56, 57 /* 2-9 */
|
||||
};
|
||||
|
||||
/* base32_decode(c) returns the value of a base-32 character, in the */
|
||||
/* range 0 to 31, or the constant base32_invalid if c is not a valid */
|
||||
/* base-32 character. */
|
||||
|
||||
enum { base32_invalid = 32 };
|
||||
|
||||
static unsigned int base32_decode(char c)
|
||||
{
|
||||
if (c < 50) return base32_invalid;
|
||||
if (c <= 57) return c - 26;
|
||||
if (c < 97) c += 32;
|
||||
if (c < 97 || c == 108 || c == 111 || c > 122) return base32_invalid;
|
||||
return c - 97 - (c > 108) - (c > 111);
|
||||
}
|
||||
|
||||
/* unequal(case_sensitivity,s1,s2) returns 0 if the strings s1 and s2 */
|
||||
/* are equal, 1 otherwise. If case_sensitivity is case_insensitive, */
|
||||
/* then ASCII A-Z are considered equal to a-z respectively. */
|
||||
|
||||
static int unequal( enum case_sensitivity case_sensitivity,
|
||||
const char s1[], const char s2[] )
|
||||
{
|
||||
char c1, c2;
|
||||
|
||||
if (case_sensitivity != case_insensitive) return strcmp(s1,s2) != 0;
|
||||
|
||||
for (;;) {
|
||||
c1 = *s1;
|
||||
c2 = *s2;
|
||||
if (c1 >= 65 && c1 <= 90) c1 += 32;
|
||||
if (c2 >= 65 && c2 <= 90) c2 += 32;
|
||||
if (c1 != c2) return 1;
|
||||
if (c1 == 0) return 0;
|
||||
++s1, ++s2;
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
/* Encoder: */
|
||||
|
||||
enum dude_status dude_encode(
|
||||
unsigned int input_length,
|
||||
const u_code_point input[],
|
||||
const unsigned char uppercase_flags[],
|
||||
unsigned int *output_size,
|
||||
char output[] )
|
||||
{
|
||||
unsigned int max_out, in, out, k, j;
|
||||
u_code_point prev, codept, diff, tmp;
|
||||
char shift;
|
||||
|
||||
prev = 0x60;
|
||||
max_out = *output_size;
|
||||
|
||||
for (in = out = 0; in < input_length; ++in) {
|
||||
|
||||
/* At the start of each iteration, in and out are the number of */
|
||||
/* items already input/output, or equivalently, the indices of */
|
||||
/* the next items to be input/output. */
|
||||
|
||||
codept = input[in];
|
||||
|
||||
if (codept == 0x2D) {
|
||||
/* Hyphen-minus stands for itself. */
|
||||
if (max_out - out < 1) return dude_big_output;
|
||||
output[out++] = 0x2D;
|
||||
continue;
|
||||
}
|
||||
|
||||
diff = prev ^ codept;
|
||||
|
||||
/* Compute the number of base-32 characters (k): */
|
||||
for (tmp = diff >> 4, k = 1; tmp != 0; ++k, tmp >>= 4);
|
||||
|
||||
if (max_out - out < k) return dude_big_output;
|
||||
shift = uppercase_flags && uppercase_flags[in] ? 32 : 0;
|
||||
/* shift controls the case of the last base-32 digit. */
|
||||
|
||||
/* Each quintet has the form 1xxxx except the last is 0xxxx. */
|
||||
/* Computing the base-32 digits in reverse order is easiest. */
|
||||
|
||||
out += k;
|
||||
output[out - 1] = base32[diff & 0xF] - shift;
|
||||
|
||||
for (j = 2; j <= k; ++j) {
|
||||
diff >>= 4;
|
||||
output[out - j] = base32[0x10 | (diff & 0xF)];
|
||||
}
|
||||
|
||||
prev = codept;
|
||||
}
|
||||
|
||||
/* Append the null terminator: */
|
||||
if (max_out - out < 1) return dude_big_output;
|
||||
output[out++] = 0;
|
||||
|
||||
*output_size = out;
|
||||
return dude_success;
|
||||
}
|
||||
|
||||
|
||||
/* Decoder: */
|
||||
|
||||
enum dude_status dude_decode(
|
||||
enum case_sensitivity case_sensitivity,
|
||||
char scratch_space[],
|
||||
const char input[],
|
||||
unsigned int *output_length,
|
||||
u_code_point output[],
|
||||
unsigned char uppercase_flags[] )
|
||||
{
|
||||
u_code_point prev, q, diff;
|
||||
char c;
|
||||
unsigned int max_out, in, out, scratch_size;
|
||||
enum dude_status status;
|
||||
|
||||
prev = 0x60;
|
||||
max_out = *output_length;
|
||||
|
||||
for (c = input[in = 0], out = 0; c != 0; c = input[++in], ++out) {
|
||||
|
||||
/* At the start of each iteration, in and out are the number of */
|
||||
/* items already input/output, or equivalently, the indices of */
|
||||
/* the next items to be input/output. */
|
||||
|
||||
if (max_out - out < 1) return dude_big_output;
|
||||
|
||||
if (c == 0x2D) output[out] = c; /* hyphen-minus is literal */
|
||||
else {
|
||||
/* Base-32 sequence. Decode quintets until 0xxxx is found: */
|
||||
|
||||
for (diff = 0; ; c = input[++in]) {
|
||||
q = base32_decode(c);
|
||||
if (q == base32_invalid) return dude_bad_input;
|
||||
diff = (diff << 4) | (q & 0xF);
|
||||
if (q >> 4 == 0) break;
|
||||
}
|
||||
|
||||
prev = output[out] = prev ^ diff;
|
||||
}
|
||||
|
||||
/* Case of last character determines uppercase flag: */
|
||||
if (uppercase_flags) uppercase_flags[out] = c >= 65 && c <= 90;
|
||||
}
|
||||
|
||||
/* Enforce the uniqueness of the encoding by re-encoding */
|
||||
/* the output and comparing the result to the input: */
|
||||
|
||||
scratch_size = ++in;
|
||||
status = dude_encode(out, output, uppercase_flags,
|
||||
&scratch_size, scratch_space);
|
||||
if (status != dude_success || scratch_size != in ||
|
||||
unequal(case_sensitivity, scratch_space, input)
|
||||
) return dude_bad_input;
|
||||
|
||||
*output_length = out;
|
||||
return dude_success;
|
||||
}
|
||||
|
||||
|
||||
/******************************************************************/
|
||||
/* Wrapper for testing (would normally go in a separate .c file): */
|
||||
|
||||
#include <assert.h>
|
||||
#include <stdio.h>
|
||||
#include <stdlib.h>
|
||||
#include <string.h>
|
||||
|
||||
/* For testing, we'll just set some compile-time limits rather than */
|
||||
/* use malloc(), and set a compile-time option rather than using a */
|
||||
/* command-line option. */
|
||||
|
||||
enum {
|
||||
unicode_max_length = 256,
|
||||
ace_max_size = 256,
|
||||
test_case_sensitivity = case_insensitive
|
||||
/* suitable for host names */
|
||||
};
|
||||
|
||||
|
||||
static void usage(char **argv)
|
||||
{
|
||||
fprintf(stderr,
|
||||
"%s -e reads code points and writes a DUDE string.\n"
|
||||
"%s -d reads a DUDE string and writes code points.\n"
|
||||
"Input and output are plain text in the native character set.\n"
|
||||
"Code points are in the form u+hex separated by whitespace.\n"
|
||||
"A DUDE string is a newline-terminated sequence of LDH characters\n"
|
||||
"(without any signature).\n"
|
||||
"The case of the u in u+hex is the force-to-uppercase flag.\n"
|
||||
, argv[0], argv[0]);
|
||||
exit(EXIT_FAILURE);
|
||||
}
|
||||
|
||||
|
||||
static void fail(const char *msg)
|
||||
{
|
||||
fputs(msg,stderr);
|
||||
exit(EXIT_FAILURE);
|
||||
}
|
||||
|
||||
static const char too_big[] =
|
||||
"input or output is too large, recompile with larger limits\n";
|
||||
static const char invalid_input[] = "invalid input\n";
|
||||
static const char io_error[] = "I/O error\n";
|
||||
|
||||
|
||||
/* The following string is used to convert LDH */
|
||||
/* characters between ASCII and the native charset: */
|
||||
|
||||
static const char ldh_ascii[] =
|
||||
"................"
|
||||
"................"
|
||||
".............-.."
|
||||
"0123456789......"
|
||||
".ABCDEFGHIJKLMNO"
|
||||
"PQRSTUVWXYZ....."
|
||||
".abcdefghijklmno"
|
||||
"pqrstuvwxyz";
|
||||
|
||||
|
||||
int main(int argc, char **argv)
|
||||
{
|
||||
enum dude_status status;
|
||||
int r;
|
||||
char *p;
|
||||
|
||||
if (argc != 2) usage(argv);
|
||||
if (argv[1][0] != '-') usage(argv);
|
||||
if (argv[1][2] != 0) usage(argv);
|
||||
|
||||
if (argv[1][1] == 'e') {
|
||||
u_code_point input[unicode_max_length];
|
||||
unsigned long codept;
|
||||
unsigned char uppercase_flags[unicode_max_length];
|
||||
char output[ace_max_size], uplus[3];
|
||||
unsigned int input_length, output_size, i;
|
||||
|
||||
/* Read the input code points: */
|
||||
|
||||
input_length = 0;
|
||||
|
||||
for (;;) {
|
||||
r = scanf("%2s%lx", uplus, &codept);
|
||||
if (ferror(stdin)) fail(io_error);
|
||||
if (r == EOF || r == 0) break;
|
||||
|
||||
if (r != 2 || uplus[1] != '+' || codept > (u_code_point)-1) {
|
||||
fail(invalid_input);
|
||||
}
|
||||
|
||||
if (input_length == unicode_max_length) fail(too_big);
|
||||
|
||||
if (uplus[0] == 'u') uppercase_flags[input_length] = 0;
|
||||
else if (uplus[0] == 'U') uppercase_flags[input_length] = 1;
|
||||
else fail(invalid_input);
|
||||
|
||||
input[input_length++] = codept;
|
||||
}
|
||||
|
||||
/* Encode: */
|
||||
|
||||
output_size = ace_max_size;
|
||||
status = dude_encode(input_length, input, uppercase_flags,
|
||||
&output_size, output);
|
||||
if (status == dude_bad_input) fail(invalid_input);
|
||||
if (status == dude_big_output) fail(too_big);
|
||||
assert(status == dude_success);
|
||||
|
||||
/* Convert to native charset and output: */
|
||||
|
||||
for (p = output; *p != 0; ++p) {
|
||||
i = *p;
|
||||
assert(i <= 122 && ldh_ascii[i] != '.');
|
||||
*p = ldh_ascii[i];
|
||||
}
|
||||
|
||||
r = puts(output);
|
||||
if (r == EOF) fail(io_error);
|
||||
return EXIT_SUCCESS;
|
||||
}
|
||||
|
||||
if (argv[1][1] == 'd') {
|
||||
char input[ace_max_size], scratch[ace_max_size], *pp;
|
||||
u_code_point output[unicode_max_length];
|
||||
unsigned char uppercase_flags[unicode_max_length];
|
||||
unsigned int input_length, output_length, i;
|
||||
|
||||
/* Read the DUDE input string and convert to ASCII: */
|
||||
|
||||
fgets(input, ace_max_size, stdin);
|
||||
if (ferror(stdin)) fail(io_error);
|
||||
if (feof(stdin)) fail(invalid_input);
|
||||
input_length = strlen(input);
|
||||
if (input_length == 0 || input[input_length - 1] != '\n') fail(too_big);
|
||||
input[--input_length] = 0;
|
||||
|
||||
for (p = input; *p != 0; ++p) {
|
||||
pp = strchr(ldh_ascii, *p);
|
||||
if (pp == 0) fail(invalid_input);
|
||||
*p = pp - ldh_ascii;
|
||||
}
|
||||
|
||||
/* Decode: */
|
||||
|
||||
output_length = unicode_max_length;
|
||||
status = dude_decode(test_case_sensitivity, scratch, input,
|
||||
&output_length, output, uppercase_flags);
|
||||
if (status == dude_bad_input) fail(invalid_input);
|
||||
if (status == dude_big_output) fail(too_big);
|
||||
assert(status == dude_success);
|
||||
|
||||
/* Output the result: */
|
||||
|
||||
for (i = 0; i < output_length; ++i) {
|
||||
r = printf("%s+%04lX\n",
|
||||
uppercase_flags[i] ? "U" : "u",
|
||||
(unsigned long) output[i] );
|
||||
if (r < 0) fail(io_error);
|
||||
}
|
||||
|
||||
return EXIT_SUCCESS;
|
||||
}
|
||||
|
||||
usage(argv);
|
||||
return EXIT_SUCCESS; /* not reached, but quiets compiler warning */
|
||||
}
|
||||
|
||||
|
||||
|
||||
INTERNET-DRAFT expires 2001-Dec-07
|
||||
|
|
@ -5,9 +5,9 @@
|
|||
|
||||
|
||||
INTERNET-DRAFT Hongbo Shi
|
||||
draft-ietf-idn-iptr-01.txt Waseda University
|
||||
17 November 2000 Jiang Ming Liang
|
||||
Expires: 17 May 2001 i-DNS.net
|
||||
draft-ietf-idn-iptr-02.txt Waseda University
|
||||
17 May 2001 Jiang Ming Liang
|
||||
Expires: 17 November 2001 i-DNS.net
|
||||
|
||||
|
||||
Internationalized PTR Resource Record (IPTR)
|
||||
|
|
@ -61,7 +61,7 @@ Shi, Jiang [Page 1]
|
|||
|
||||
|
||||
|
||||
INTERNET-DRAFT Internationalized PTR Resource Record 14 Nov. 2000
|
||||
INTERNET-DRAFT Internationalized PTR Resource Record 17 May 2001
|
||||
|
||||
|
||||
mapping architecture. This document describes a new RR TYPE named IPTR
|
||||
|
|
@ -121,7 +121,7 @@ Shi, Jiang [Page 2]
|
|||
|
||||
|
||||
|
||||
INTERNET-DRAFT Internationalized PTR Resource Record 14 Nov. 2000
|
||||
INTERNET-DRAFT Internationalized PTR Resource Record 17 May 2001
|
||||
|
||||
|
||||
properties:
|
||||
|
|
@ -181,7 +181,7 @@ Shi, Jiang [Page 3]
|
|||
|
||||
|
||||
|
||||
INTERNET-DRAFT Internationalized PTR Resource Record 14 Nov. 2000
|
||||
INTERNET-DRAFT Internationalized PTR Resource Record 17 May 2001
|
||||
|
||||
|
||||
Mapping IPv6 to IDNs can be similarly supported. This document recom-
|
||||
|
|
@ -241,7 +241,7 @@ Shi, Jiang [Page 4]
|
|||
|
||||
|
||||
|
||||
INTERNET-DRAFT Internationalized PTR Resource Record 14 Nov. 2000
|
||||
INTERNET-DRAFT Internationalized PTR Resource Record 17 May 2001
|
||||
|
||||
|
||||
not find the corresponding LANGUAGE IDN finally, then the correspond-
|
||||
|
|
@ -301,7 +301,7 @@ Shi, Jiang [Page 5]
|
|||
|
||||
|
||||
|
||||
INTERNET-DRAFT Internationalized PTR Resource Record 14 Nov. 2000
|
||||
INTERNET-DRAFT Internationalized PTR Resource Record 17 May 2001
|
||||
|
||||
|
||||
|
||||
|
|
@ -361,7 +361,7 @@ Shi, Jiang [Page 6]
|
|||
|
||||
|
||||
|
||||
INTERNET-DRAFT Internationalized PTR Resource Record 14 Nov. 2000
|
||||
INTERNET-DRAFT Internationalized PTR Resource Record 17 May 2001
|
||||
|
||||
|
||||
Thus,
|
||||
|
|
@ -381,123 +381,27 @@ INTERNET-DRAFT Internationalized PTR Resource Record 14 Nov. 2000
|
|||
|
||||
is allowed.
|
||||
|
||||
8. Open Issues
|
||||
8. Changes
|
||||
|
||||
1. Is it necessary for an IDN-aware server to send back all of
|
||||
the corresponding IDNs to a resolver? That is,
|
||||
Following the discussion at the IETF 49 meeting in San Diego, we
|
||||
deleted the chapter "Open Issues" of our previous draft (version
|
||||
01).
|
||||
|
||||
And,
|
||||
|
||||
+------------------------------------------------------+
|
||||
Header | OPCODE=SQUERY, RESPONSE, AA |
|
||||
+------------------------------------------------------+
|
||||
Question | QNAME=4.3.2.1.IN-ADDR.ARPA.,QCLASS=IN,QTYPE=IPTR |
|
||||
+------------------------------------------------------+
|
||||
Answer | 4.3.2.1.IN-ADDR.ARPA. IPTR "zh-CN" "name1-in-utf8" |
|
||||
| 4.3.2.1.IN-ADDR.ARPA. IPTR "zh-CN" "name2-in-utf8" |
|
||||
| 4.3.2.1.IN-ADDR.ARPA. IPTR "zh-CN" "name3-in-utf8" |
|
||||
| 4.3.2.1.IN-ADDR.ARPA. IPTR "zh-TW" "name4-in-utf8" |
|
||||
| 4.3.2.1.IN-ADDR.ARPA. IPTR "ko-KR" "name5-in-utf8" |
|
||||
| 4.3.2.1.IN-ADDR.ARPA. IPTR "ko-KR" "name6-in-utf8" |
|
||||
+------------------------------------------------------+
|
||||
Authority | ... |
|
||||
+------------------------------------------------------+
|
||||
Additional | ... |
|
||||
+------------------------------------------------------+
|
||||
4.3.2.1.IN-ADDR.ARPA IPTR "zh-TW" "[samefoo.sample] in utf8"
|
||||
IPTR "zh-TW" "[difffoo.sample] in utf8"
|
||||
IPTR "zh-CN" "[samefoo.sample] in utf8"
|
||||
IPTR "ja-JP" "[samefoo.sample] in utf8"
|
||||
IPTR "ko-KR" "[samefoo.sample] in utf8"
|
||||
|
||||
is allowed.
|
||||
|
||||
Or, just using current fixed/cyclic/random options to return
|
||||
one of the corresponding IDNs per LANGUAGE? In short, "one IP
|
||||
one IDN per LANGUAGE". Such like
|
||||
8. Changes
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
Shi, Jiang [Page 7]
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
INTERNET-DRAFT Internationalized PTR Resource Record 14 Nov. 2000
|
||||
|
||||
|
||||
|
||||
+------------------------------------------------------+
|
||||
Header | OPCODE=SQUERY, RESPONSE, AA |
|
||||
+------------------------------------------------------+
|
||||
Question | QNAME=4.3.2.1.IN-ADDR.ARPA.,QCLASS=IN,QTYPE=IPTR |
|
||||
+------------------------------------------------------+
|
||||
Answer | 4.3.2.1.IN-ADDR.ARPA. IPTR "zh-CN" "name1-in-utf8" |
|
||||
| 4.3.2.1.IN-ADDR.ARPA. IPTR "zh-TW" "name4-in-utf8" |
|
||||
| 4.3.2.1.IN-ADDR.ARPA. IPTR "ko-KR" "name5-in-utf8" |
|
||||
| 4.3.2.1.IN-ADDR.ARPA. IPTR "ko-KR" "name6-in-utf8" |
|
||||
+------------------------------------------------------+
|
||||
Authority | ... |
|
||||
+------------------------------------------------------+
|
||||
Additional | ... |
|
||||
+------------------------------------------------------+
|
||||
|
||||
|
||||
|
||||
|
||||
2. If QTYPE is IPTR, should an IDN aware server send all of the
|
||||
corresponding IDNs to the resolver? Is this kind of behavior
|
||||
friendly to implement the resolver? How about letting a server
|
||||
just feedback the corresponding PTR record, if a server
|
||||
doesn't find the corresponding LANGUAGE IDN that a client
|
||||
requires.
|
||||
|
||||
In the following case, it is wasteful to return all the
|
||||
corresponding IDNs to the clients.
|
||||
|
||||
4.3.2.1.IN-ADDR.ARPA IPTR "zh-TW" "[foo1.example] in utf8"
|
||||
IPTR "zh-TW" "[foo2.example] in utf8"
|
||||
...
|
||||
IPTR "zh-CN" "[foo1.example] in utf8"
|
||||
IPTR "zh-CN" "[foo2.example] in utf8"
|
||||
...
|
||||
IPTR "ja-JP" "[foo1.example] in utf8"
|
||||
IPTR "ja-JP" "[foo2.example] in utf8"
|
||||
...
|
||||
IPTR "ko-KR" "[foo1.example] in utf8"
|
||||
IPTR "ko-KR" "[foo2.example] in utf8"
|
||||
...
|
||||
|
||||
The benefit of the IPTR is introducing LANGUAGE. It SHOULD be
|
||||
used in query from clients, then servers can select minimum
|
||||
size of corresponding IDNs. For working this effectively, you
|
||||
should introduce default LANGUAGE if no corresponding LANGUAGE
|
||||
exists. The default MUST be ASCII. So that default IPTR can be
|
||||
natural extension of PTR. I.E.
|
||||
|
||||
|
||||
|
||||
Shi, Jiang [Page 8]
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
INTERNET-DRAFT Internationalized PTR Resource Record 14 Nov. 2000
|
||||
|
||||
|
||||
4.3.2.1.in-addr.arpa. IN PTR ASCII-domain-name
|
||||
|
||||
is equivalent to
|
||||
|
||||
4.3.2.1.in-addr.arpa. IN IPTR "default" ASCII-domain-name
|
||||
|
||||
Of course, ASCII includes ACE.
|
||||
|
||||
|
||||
3. According to the consideration above, how about the following
|
||||
thinking? That means a response MAY include not only a
|
||||
corresponding IDN in a specific LANGUAGE but also the LANGUAGE
|
||||
tags of the corresponding IDNs. And the client will load these
|
||||
LANGUAGE tags in the DNS cache for the next IPTR query.
|
||||
Following the discussion at the IETF 49 meeting in San Diego, we
|
||||
deleted the chapter "Open Issues" of our previous draft (version
|
||||
01).
|
||||
|
||||
References
|
||||
|
||||
|
|
@ -507,8 +411,20 @@ References
|
|||
[IDNE] Marc Blanchet & Paul Hoffman, "Internationalized domain
|
||||
names using EDNS", draft-ietf-idn-idne.
|
||||
|
||||
[NAMEPREP] Paul Hoffman & Marc Blanchet, "Preparation of Interna-
|
||||
tionalized Host Names", draft-ietf-idn-nameprep.
|
||||
[NAMEPREP] Paul Hoffman & Marc Blanchet, "Preparation of
|
||||
|
||||
|
||||
|
||||
Shi, Jiang [Page 7]
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
INTERNET-DRAFT Internationalized PTR Resource Record 17 May 2001
|
||||
|
||||
|
||||
Internationalized Host Names", draft-ietf-idn-nameprep.
|
||||
|
||||
[RFC1034] P. Mockapetris, "DOMAIN NAMES - CONCEPTS AND FACILITIES",
|
||||
November 1987, RFC1034
|
||||
|
|
@ -532,18 +448,6 @@ References
|
|||
August 1999, RFC 2671.
|
||||
|
||||
[ISO 639] ISO 639:1988 (E/F) - Code for the representation of names
|
||||
|
||||
|
||||
|
||||
Shi, Jiang [Page 9]
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
INTERNET-DRAFT Internationalized PTR Resource Record 14 Nov. 2000
|
||||
|
||||
|
||||
of languages - The International Organization for Standardization,
|
||||
1st edition, 1988 17 pages Prepared by ISO/TC 37 - Terminology
|
||||
(principles and coordination).
|
||||
|
|
@ -568,6 +472,18 @@ Authors' Information
|
|||
Tokyo, 169-8555 Japan
|
||||
shi@goto.info.waseda.ac.jp
|
||||
|
||||
|
||||
|
||||
|
||||
Shi, Jiang [Page 8]
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
INTERNET-DRAFT Internationalized PTR Resource Record 17 May 2001
|
||||
|
||||
|
||||
Jiang Ming Liang
|
||||
i-DNS.net
|
||||
8 Temasek Boulevard
|
||||
|
|
@ -595,6 +511,30 @@ Authors' Information
|
|||
|
||||
|
||||
|
||||
Shi, Jiang [Page 10]
|
||||
|
||||
Shi, Jiang [Page 9]
|
||||
|
||||
|
||||
|
|
@ -2,7 +2,7 @@
|
|||
|
||||
IPng Working Group Richard Draves
|
||||
Internet Draft Microsoft Research
|
||||
Document: draft-ietf-ipngwg-default-addr-select-03.txt March 3, 2001
|
||||
Document: draft-ietf-ipngwg-default-addr-select-04.txt May 14, 2001
|
||||
Category: Standards Track
|
||||
|
||||
Default Address Selection for IPv6
|
||||
|
|
@ -54,8 +54,8 @@ Abstract
|
|||
These addresses may also be "preferred" or "deprecated" [3]. Privacy
|
||||
considerations have introduced the concepts of "public addresses"
|
||||
|
||||
Draves Standards Track - Expires September 2001 1
|
||||
draft-ietf-ipngwg-default-addr-select-03 March 3, 2001
|
||||
Draves Standards Track - Expires December 2001 1
|
||||
draft-ietf-ipngwg-default-addr-select-04 May 14, 2001
|
||||
|
||||
|
||||
and "temporary addresses" [4]. The mobility architecture introduces
|
||||
|
|
@ -106,14 +106,14 @@ draft-ietf-ipngwg-default-addr-select-03 March 3, 2001
|
|||
transition scenarios, but they are certainly not a panacea.
|
||||
|
||||
The selection rules specified in this document MUST NOT be construed
|
||||
to override an application or upper-layer's explicit choice of
|
||||
destination or source address.
|
||||
to override an application or upper-layer's explicit choice of a
|
||||
legal destination or source address.
|
||||
|
||||
|
||||
|
||||
|
||||
Draves Standards Track - Expires September 2001 2
|
||||
draft-ietf-ipngwg-default-addr-select-03 March 3, 2001
|
||||
Draves Standards Track - Expires December 2001 2
|
||||
draft-ietf-ipngwg-default-addr-select-04 May 14, 2001
|
||||
|
||||
|
||||
1.1. Conventions used in this document
|
||||
|
|
@ -132,27 +132,30 @@ draft-ietf-ipngwg-default-addr-select-03 March 3, 2001
|
|||
mechanism for administrative policy override.
|
||||
|
||||
In this implementation architecture, applications use APIs [8] like
|
||||
getaddrinfo() and getipnodebyname() that return a list of addresses
|
||||
to the application. This list might contain both IPv6 and IPv4
|
||||
addresses (sometimes represented as IPv4-mapped addresses). The
|
||||
application then passes a destination address to the network stack
|
||||
with connect() or sendto(). The application might use only the first
|
||||
address in the list, or it might loop over the list of addresses to
|
||||
find a working address. In any case, the network layer is never in a
|
||||
situation where it needs to choose a destination address from
|
||||
several alternatives. The application might also specify a source
|
||||
address with bind(), but often the source address is left
|
||||
unspecified. Therefore the network layer does often choose a source
|
||||
address from several alternatives.
|
||||
getaddrinfo() that return a list of addresses to the application.
|
||||
This list might contain both IPv6 and IPv4 addresses (sometimes
|
||||
represented as IPv4-mapped addresses). The application then passes a
|
||||
destination address to the network stack with connect() or sendto().
|
||||
The application might use only the first address in the list, or it
|
||||
might loop over the list of addresses to find a working address. In
|
||||
any case, the network layer is never in a situation where it needs
|
||||
to choose a destination address from several alternatives. The
|
||||
application might also specify a source address with bind(), but
|
||||
often the source address is left unspecified. Therefore the network
|
||||
layer does often choose a source address from several alternatives.
|
||||
|
||||
As a consequence, we intend that implementations of getaddrinfo()
|
||||
and getipnodebyname() will use the destination address selection
|
||||
algorithm specified here to sort the list of IPv6 and IPv4 addresses
|
||||
that they return. Separately, the IPv6 network layer will use the
|
||||
source address selection algorithm when an application or upper-
|
||||
layer has not specified a source address. Application of this
|
||||
framework to source address selection in an IPv4 network layer may
|
||||
be possible but this is not explored further here.
|
||||
will use the destination address selection algorithm specified here
|
||||
to sort the list of IPv6 and IPv4 addresses that they return.
|
||||
Separately, the IPv6 network layer will use the source address
|
||||
selection algorithm when an application or upper-layer has not
|
||||
specified a source address. Application of this framework to source
|
||||
address selection in an IPv4 network layer may be possible but this
|
||||
is not explored further here.
|
||||
|
||||
Well-behaved applications should iterate through the list of
|
||||
addresses returned from getaddrinfo() until they find a working
|
||||
address.
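The iteration described above can be sketched in Python with the standard socket module. This is only an illustration: the helper name connect_first_working and the timeout parameter are assumptions made for this sketch, not part of the draft or of any API it defines.

```python
import socket


def connect_first_working(host, port, timeout=5.0):
    """Try each address from getaddrinfo() in order; return the first connected socket.

    Hypothetical helper for illustration only.
    """
    last_error = None
    # getaddrinfo() returns candidates in whatever order the resolver chose.
    for family, socktype, proto, _canonname, sockaddr in socket.getaddrinfo(
        host, port, type=socket.SOCK_STREAM
    ):
        sock = None
        try:
            sock = socket.socket(family, socktype, proto)
            sock.settimeout(timeout)
            sock.connect(sockaddr)
            return sock  # first working address wins
        except OSError as exc:
            # Remember the failure and move on to the next address.
            last_error = exc
            if sock is not None:
                sock.close()
    raise last_error or OSError("getaddrinfo() returned no addresses")
```

An application that only tried the first entry would fail whenever that one address is unreachable, even if a later entry in the list would have worked.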
|
||||
|
||||
The algorithms use several criteria in making their decisions. The
|
||||
combined effect is to prefer destination/source address pairs for
|
||||
|
|
@ -161,19 +164,19 @@ draft-ietf-ipngwg-default-addr-select-03 March 3, 2001
|
|||
deprecated source addresses, avoid the use of transitional addresses
|
||||
when native addresses are available, and all else being equal prefer
|
||||
address pairs having the longest possible common prefix. For source
|
||||
address selection, temporary addresses [4] are preferred over public
|
||||
address selection, public addresses [4] are preferred over temporary
|
||||
addresses. In mobile situations [5], home addresses are preferred
|
||||
over care-of addresses. If an address is simultaneously a home
|
||||
address and a care-of address (indicating the mobile node is "at
|
||||
home" for that address), then the home/care-of address is preferred
|
||||
|
||||
Draves Standards Track - Expires December 2001 3
|
||||
draft-ietf-ipngwg-default-addr-select-04 May 14, 2001
|
||||
|
||||
|
||||
over addresses that are solely a home address or solely a care-of
|
||||
address.
|
||||
|
||||
|
||||
Draves Standards Track - Expires September 2001 3
|
||||
draft-ietf-ipngwg-default-addr-select-03 March 3, 2001
|
||||
|
||||
|
||||
The framework optionally allows for the possibility of
|
||||
administrative configuration of policy that can override the default
|
||||
behavior of the algorithms. The policy override takes the form of a
|
||||
|
|
@ -220,18 +223,18 @@ draft-ietf-ipngwg-default-addr-select-03 March 3, 2001
|
|||
assigned link-local scope. IPv4 private addresses [10], which have
|
||||
the prefixes 10/8, 172.16/12, and 192.168/16, are assigned site-
|
||||
local scope. IPv4 loopback addresses [11, section 4.2.2.11], which
|
||||
have the prefix 127/8, are assigned link-local scope. Other IPv4
|
||||
addresses are assigned global scope.
|
||||
have the prefix 127/8, are assigned link-local scope (analogously to
|
||||
the treatment of the IPv6 loopback address [9, section 4]). Other
|
||||
IPv4 addresses are assigned global scope.
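As a rough sketch, the IPv4 scope assignments listed in this paragraph could be expressed as a prefix lookup. The function name ipv4_scope and the table layout are assumptions for illustration. The address class whose sentence is cut off at the start of this hunk is deliberately left out, so a complete table would need that entry added.

```python
import ipaddress

# Prefix-to-scope mapping taken from the paragraph above.
# The prefixes do not overlap, so lookup order does not matter.
_IPV4_SCOPES = [
    (ipaddress.ip_network("127.0.0.0/8"), "link-local"),    # loopback
    (ipaddress.ip_network("10.0.0.0/8"), "site-local"),     # private
    (ipaddress.ip_network("172.16.0.0/12"), "site-local"),  # private
    (ipaddress.ip_network("192.168.0.0/16"), "site-local"), # private
]


def ipv4_scope(address):
    """Return the scope name this sketch assigns to an IPv4 address string."""
    addr = ipaddress.IPv4Address(address)
    for network, scope in _IPV4_SCOPES:
        if addr in network:
            return scope
    # Everything not listed above is treated as global.
    return "global"
```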
|
||||
|
||||
|
||||
Draves Standards Track - Expires December 2001 4
|
||||
draft-ietf-ipngwg-default-addr-select-04 May 14, 2001
|
||||
|
||||
|
||||
IPv4 addresses should be treated as having "preferred" configuration
|
||||
status.
|
||||
|
||||
|
||||
|
||||
Draves Standards Track - Expires September 2001 4
|
||||
draft-ietf-ipngwg-default-addr-select-03 March 3, 2001
|
||||
|
||||
|
||||
2.3. IPv6 Addresses with Embedded IPv4 Addresses
|
||||
|
||||
IPv4-compatible addresses [2] and 6to4 addresses [12] contain an
|
||||
|
|
@ -244,7 +247,7 @@ draft-ietf-ipngwg-default-addr-select-03 March 3, 2001
|
|||
2.4. Loopback Address and Other Format Prefixes
|
||||
|
||||
The loopback address should be treated as having link-local
|
||||
scope [9] and "preferred" configuration status.
|
||||
scope [9, section 4] and "preferred" configuration status.
|
||||
|
||||
NSAP addresses and other addresses with as-yet-undefined format
|
||||
prefixes should be treated as having global scope and "preferred"
|
||||
|
|
@ -281,15 +284,15 @@ draft-ietf-ipngwg-default-addr-select-03 March 3, 2001
|
|||
2002::/16 30 2
|
||||
::/96 20 3
|
||||
::ffff:0:0/96 10 4
|
||||
|
||||
|
||||
Draves Standards Track - Expires December 2001 5
|
||||
draft-ietf-ipngwg-default-addr-select-04 May 14, 2001
|
||||
|
||||
|
||||
|
||||
One effect of the default policy table is to prefer using native
|
||||
source addresses with native destination addresses, 6to4 [12] source
|
||||
|
||||
|
||||
Draves Standards Track - Expires September 2001 5
|
||||
draft-ietf-ipngwg-default-addr-select-03 March 3, 2001
|
||||
|
||||
|
||||
addresses with 6to4 destination addresses, and v4-compatible [2]
|
||||
source addresses with v4-compatible destination addresses. Another
|
||||
effect of the default policy table is to prefer communication using
|
||||
|
|
@ -340,14 +343,13 @@ draft-ietf-ipngwg-default-addr-select-03 March 3, 2001
|
|||
not in the candidate set for the destination, then the network layer
|
||||
MUST treat this as an error. If the application or upper-layer
|
||||
specifies a source address that is in the candidate set for the
|
||||
|
||||
Draves Standards Track - Expires December 2001 6
|
||||
draft-ietf-ipngwg-default-addr-select-04 May 14, 2001
|
||||
|
||||
|
||||
destination, then the network layer MUST respect that choice. If the
|
||||
application or upper-layer does not specify a source address, then
|
||||
|
||||
|
||||
Draves Standards Track - Expires September 2001 6
|
||||
draft-ietf-ipngwg-default-addr-select-03 March 3, 2001
|
||||
|
||||
|
||||
the network layer uses the source address selection algorithm
|
||||
specified in the next section.
|
||||
|
||||
|
|
@ -399,11 +401,9 @@ draft-ietf-ipngwg-default-addr-select-03 March 3, 2001
|
|||
Similarly, if SB is assigned to the interface that will be used to
|
||||
send to D and SA is assigned to a different interface, then prefer
|
||||
SB.
|
||||
|
||||
|
||||
|
||||
Draves Standards Track - Expires September 2001 7
|
||||
draft-ietf-ipngwg-default-addr-select-03 March 3, 2001
|
||||
Draves Standards Track - Expires December 2001 7
|
||||
draft-ietf-ipngwg-default-addr-select-04 May 14, 2001
|
||||
|
||||
|
||||
Rule 6: Prefer matching label.
|
||||
|
|
@ -411,15 +411,23 @@ draft-ietf-ipngwg-default-addr-select-03 March 3, 2001
|
|||
Similarly, if Label(SB) = Label(D) and Label(SA) <> Label(D), then
|
||||
choose SB.
|
||||
|
||||
Rule 7: Prefer temporary addresses.
|
||||
If SA is a temporary address and SB is a public address, then prefer
|
||||
SA. Similarly, if SB is a temporary address and SA is a public
|
||||
Rule 7: Prefer public addresses.
|
||||
If SA is a public address and SB is a temporary address, then prefer
|
||||
SA. Similarly, if SB is a public address and SA is a temporary
|
||||
address, then prefer SB.
|
||||
An implementation may support a per-connection configuration
|
||||
mechanism (for example, a socket option) to reverse the sense of
|
||||
this preference and prefer public addresses over temporary
|
||||
this preference and prefer temporary addresses over public
|
||||
addresses.
|
||||
|
||||
This rule avoids applications potentially failing due to the
|
||||
relatively short lifetime of temporary addresses or due to the
|
||||
possibility of the reverse lookup of a temporary address either
|
||||
failing or returning a randomized name. Implementations for which
|
||||
privacy considerations outweigh these application compatibility
|
||||
concerns MAY reverse the sense of this rule and by default prefer
|
||||
temporary addresses over public addresses.
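A minimal sketch of the -04 wording of Rule 7 (public addresses preferred by default, with an optional reversal) might look as follows. The function name and the prefer_temporary flag are hypothetical. The flag stands in for the per-connection configuration mechanism mentioned above and is not an actual socket option.

```python
def rule7_prefer_public(sa_is_temporary, sb_is_temporary, prefer_temporary=False):
    """Apply Rule 7 to candidate source addresses SA and SB.

    Returns "SA" or "SB" when the rule decides, or None when both
    candidates are of the same kind and the next rule must break the tie.
    """
    if sa_is_temporary == sb_is_temporary:
        return None  # both public or both temporary: rule does not apply
    # Exactly one candidate is temporary at this point.
    if prefer_temporary:
        sa_wins = sa_is_temporary        # reversed sense: temporary wins
    else:
        sa_wins = not sa_is_temporary    # default sense: public wins
    return "SA" if sa_wins else "SB"
```

Returning None rather than an arbitrary winner keeps the rule usable as one step in the ordered chain of tie-breakers.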
|
||||
|
||||
Rule 8: Use longest matching prefix.
|
||||
If CommonPrefixLen(SA, D) > CommonPrefixLen(SB, D), then choose SA.
|
||||
Similarly, if CommonPrefixLen(SB, D) > CommonPrefixLen(SA, D), then
|
||||
|
|
@ -450,6 +458,12 @@ draft-ietf-ipngwg-default-addr-select-03 March 3, 2001
|
|||
the source address selection algorithm. Source address selection for
|
||||
IPv4 addresses is not specified in this document.
|
||||
|
||||
|
||||
|
||||
Draves Standards Track - Expires December 2001 8
|
||||
draft-ietf-ipngwg-default-addr-select-04 May 14, 2001
|
||||
|
||||
|
||||
We say that Source(D) is undefined if there is no source address
|
||||
available for destination D. For IPv6 addresses, this is only the
|
||||
case if CandidateSource(D) is the empty set.
|
||||
|
|
@ -459,11 +473,6 @@ draft-ietf-ipngwg-default-addr-select-03 March 3, 2001
|
|||
result, then the remaining rules are not relevant and should be
|
||||
ignored. Subsequent rules act as tie-breakers for earlier rules.
|
||||
|
||||
|
||||
Draves Standards Track - Expires September 2001 8
|
||||
draft-ietf-ipngwg-default-addr-select-03 March 3, 2001
|
||||
|
||||
|
||||
Rule 1: Avoid unusable destinations.
|
||||
If there is no route to DB or if Source(DB) is undefined, then sort
|
||||
DA before DB. Similarly, if there is no route to DA or if Source(DA)
|
||||
|
|
@ -508,7 +517,11 @@ draft-ietf-ipngwg-default-addr-select-03 March 3, 2001
|
|||
Source(DB)), then sort DA before DB. Similarly, if
|
||||
CommonPrefixLen(DA, Source(DA)) < CommonPrefixLen(DB, Source(DB)),
|
||||
then sort DB before DA.
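The comparison in this rule can be sketched in Python. The sketch assumes that CommonPrefixLen counts the leading bits two IPv6 addresses share, since the definition is not part of this hunk. The function names are illustrative only.

```python
import ipaddress


def common_prefix_len(a, b):
    """Number of leading bits shared by two IPv6 addresses (0 to 128)."""
    diff = int(ipaddress.IPv6Address(a)) ^ int(ipaddress.IPv6Address(b))
    # The highest set bit of the XOR marks the first position that differs.
    return 128 - diff.bit_length()


def longest_prefix_order(da, source_da, db, source_db):
    """Return "DA" or "DB" for the destination that sorts first, or None on a tie."""
    len_a = common_prefix_len(da, source_da)
    len_b = common_prefix_len(db, source_db)
    if len_a > len_b:
        return "DA"
    if len_b > len_a:
        return "DB"
    return None  # equal lengths: leave the decision to the next rule
```

For example, common_prefix_len("2001::1", "2001::2") is 126, because the two addresses differ only in their last two bits.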
|
||||
|
||||
|
||||
Draves Standards Track - Expires December 2001 9
|
||||
draft-ietf-ipngwg-default-addr-select-04 May 14, 2001
|
||||
|
||||
|
||||
Rule 9: Otherwise, leave the order unchanged.
|
||||
Sort DA before DB.
|
||||
|
||||
|
|
@ -517,11 +530,6 @@ draft-ietf-ipngwg-default-addr-select-03 March 3, 2001
|
|||
implementation somehow knows which destination addresses will result
|
||||
in the "best" communications performance.
|
||||
|
||||
|
||||
Draves Standards Track - Expires September 2001 9
|
||||
draft-ietf-ipngwg-default-addr-select-03 March 3, 2001
|
||||
|
||||
|
||||
6. Interactions with Routing
|
||||
|
||||
This specification of source address selection assumes that routing
|
||||
|
|
@ -550,36 +558,35 @@ draft-ietf-ipngwg-default-addr-select-03 March 3, 2001
|
|||
|
||||
The destination address selection algorithm needs information about
|
||||
potential source addresses. One possible implementation strategy is
|
||||
for getipnodebyname() and getaddrinfo() to call down to the IPv6
|
||||
network layer with a list of destination addresses, sort the list in
|
||||
the network layer with full current knowledge of available source
|
||||
addresses, and return the sorted list to getipnodebyname() or
|
||||
getaddrinfo(). This is simple and gives the best results but it
|
||||
introduces the overhead of another system call. One way to reduce
|
||||
this overhead is to cache the sorted address list in the resolver,
|
||||
so that subsequent calls for the same name do not need to resort the
|
||||
list.
|
||||
for getaddrinfo() to call down to the IPv6 network layer with a list
|
||||
of destination addresses, sort the list in the network layer with
|
||||
full current knowledge of available source addresses, and return the
|
||||
sorted list to getaddrinfo(). This is simple and gives the best
|
||||
results but it introduces the overhead of another system call. One
|
||||
way to reduce this overhead is to cache the sorted address list in
|
||||
the resolver, so that subsequent calls for the same name do not need
|
||||
to resort the list.
|
||||
|
||||
Another implementation strategy is to call down to the network layer
|
||||
to retrieve source address information and then sort the list of
|
||||
addresses directly in the context of getipnodebyname() or
|
||||
getaddrinfo(). To reduce overhead in this approach, the source
|
||||
address information can be cached, amortizing the overhead of
|
||||
retrieving it across multiple calls to getipnodebyname() and
|
||||
getaddrinfo(). In this approach, the implementation may not have
|
||||
knowledge of the outgoing interface for each destination, so it MAY
|
||||
use a looser definition of the candidate set during destination
|
||||
addresses directly in the context of getaddrinfo(). To reduce
|
||||
overhead in this approach, the source address information can be
|
||||
cached, amortizing the overhead of retrieving it across multiple
|
||||
calls to getaddrinfo(). In this approach, the implementation may not
|
||||
have knowledge of the outgoing interface for each destination, so it
|
||||
|
||||
|
||||
Draves Standards Track - Expires December 2001 10
|
||||
draft-ietf-ipngwg-default-addr-select-04 May 14, 2001
|
||||
|
||||
|
||||
MAY use a looser definition of the candidate set during destination
|
||||
address ordering.
|
||||
|
||||
In any case, if the implementation uses cached and possibly stale
|
||||
information in its implementation of destination address selection,
|
||||
or if the ordering of a cached list of destination addresses is
|
||||
possibly stale, then it should ensure that the destination address
|
||||
|
||||
Draves Standards Track - Expires September 2001 10
|
||||
draft-ietf-ipngwg-default-addr-select-03 March 3, 2001
|
||||
|
||||
|
||||
ordering returned to the application is no more than one second out
|
||||
of date. For example, an implementation might make a system call to
|
||||
check if any routing table entries or source address assignments
|
||||
|
|
@ -588,7 +595,7 @@ draft-ietf-ipngwg-default-addr-select-03 March 3, 2001
|
|||
underlying state is changed. By caching the current invalidation
|
||||
counter value with derived state and then later comparing against
|
||||
the current value, the implementation can detect if the derived
|
||||
state is stale.
|
||||
state is potentially stale.
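One way to picture the invalidation-counter technique is the sketch below. The classes AddressState and SortedListCache, and the sort_func callback, are hypothetical names invented for this illustration. A real implementation would keep the counter in the network layer and read it with a system call.

```python
class AddressState:
    """Hypothetical stand-in for the network layer's source-address state."""

    def __init__(self):
        self.invalidation_counter = 0
        self.source_addresses = []

    def set_source_addresses(self, addresses):
        self.source_addresses = list(addresses)
        self.invalidation_counter += 1  # every change to the state bumps the counter


class SortedListCache:
    """Caches sorted destination lists together with the counter value seen at sort time."""

    def __init__(self, state, sort_func):
        self._state = state
        self._sort = sort_func
        self._entries = {}  # name -> (counter value, sorted list)

    def lookup(self, name, destinations):
        current = self._state.invalidation_counter
        cached = self._entries.get(name)
        if cached is not None and cached[0] == current:
            return cached[1]  # counter unchanged: cached ordering is still usable
        # Counter moved (or nothing cached): the derived state may be stale, so re-sort.
        result = self._sort(destinations, self._state.source_addresses)
        self._entries[name] = (current, result)
        return result
```

Comparing a single integer is cheap, so the cache can be checked on every lookup without repeating the full sort.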
|
||||
|
||||
8. Security Considerations
|
||||
|
||||
|
|
@ -605,16 +612,14 @@ draft-ietf-ipngwg-default-addr-select-03 March 3, 2001
|
|||
the attack does not specify a particular source address for its
|
||||
reply packets.) By using different addresses for itself, the
|
||||
unfriendly node can cause the target node to expose the target's own
|
||||
addresses. For example, the unfriendly node might correlate the
|
||||
target's current IPv6 temporary address with its IPv4 address by
|
||||
sending requests with a global source address and an IPv4-compatible
|
||||
source address.
|
||||
addresses.
|
||||
|
||||
9. Examples
|
||||
|
||||
This section contains a number of examples, first of default
|
||||
behavior and then demonstrating the utility of policy table
|
||||
configuration.
|
||||
configuration. These examples are provided for illustrative
|
||||
purposes; they should not be construed as normative.
|
||||
|
||||
9.1. Default Source Address Selection
|
||||
|
||||
|
|
@ -628,16 +633,15 @@ draft-ietf-ipngwg-default-addr-select-03 March 3, 2001
|
|||
Destination: 2001::1
|
||||
Sources: fe80::1 vs fec0::1
|
||||
Result: fec0::1 (prefer appropriate scope)
|
||||
|
||||
|
||||
Draves Standards Track - Expires December 2001 11
|
||||
draft-ietf-ipngwg-default-addr-select-04 May 14, 2001
|
||||
|
||||
|
||||
Destination: fec0::1
|
||||
Sources: fe80::1 vs 2001::1
|
||||
Result: 2001::1 (prefer appropriate scope)
|
||||
|
||||
|
||||
Draves Standards Track - Expires September 2001 11
|
||||
draft-ietf-ipngwg-default-addr-select-03 March 3, 2001
|
||||
|
||||
|
||||
Destination: ff05::1
|
||||
Sources: fe80::1 vs fec0::1 vs 2001::1
|
||||
Result: fec0::1 (prefer appropriate scope)
|
||||
|
|
@ -659,12 +663,12 @@ draft-ietf-ipngwg-default-addr-select-03 March 3, 2001
|
|||
Result: 3ffe::2 (prefer home address)
|
||||
|
||||
Destination: 2002:836b:2179::1
|
||||
Sources: 2002:836b:2179::2 vs 2001::d5e3:7953:13eb:22e8 (temporary)
|
||||
Result: 2002:836b:2179::2 (prefer matching label)
|
||||
Sources: 2002:836b:2179::d5e3:7953:13eb:22e8 (temporary) vs 2001::2
|
||||
Result: 2002:836b:2179::d5e3:7953:13eb:22e8 (prefer matching label)
|
||||
|
||||
Destination: 2001::1
|
||||
Destination: 2001::d5e3:0:0:1
|
||||
Sources: 2001::2 vs 2001::d5e3:7953:13eb:22e8 (temporary)
|
||||
Result: 2001::d5e3:7953:13eb:22e8 (prefer temporary address)
|
||||
Result: 2001::2 (prefer public address)
|
||||
|
||||
9.2. Default Destination Address Selection
|
||||
|
||||
|
|
@ -687,15 +691,16 @@ draft-ietf-ipngwg-default-addr-select-03 March 3, 2001
|
|||
Result: 2001::1 (src 2001::2) then 10.1.2.3 (src 10.1.2.4) (prefer
|
||||
higher precedence)
|
||||
|
||||
|
||||
Draves Standards Track - Expires December 2001 12
|
||||
draft-ietf-ipngwg-default-addr-select-04 May 14, 2001
|
||||
|
||||
|
||||
Sources: 2001::2 or fec0::2 or fe80::2
|
||||
Destinations: 2001::1 vs fec0::1 vs fe80::1
|
||||
Result: fe80::1 (src fe80::2) then fec0::1 (src fec0::2) then
|
||||
2001::1 (src 2001::2) (prefer smaller scope)
|
||||
|
||||
Draves Standards Track - Expires September 2001 12
|
||||
draft-ietf-ipngwg-default-addr-select-03 March 3, 2001
|
||||
|
||||
|
||||
|
||||
Sources: 2001::2 (care-of address) or 3ffe::1 (home address) or
|
||||
fec0::2 (care-of address) or fe80::2 (care-of address)
|
||||
Destinations: 2001::1 vs fec0::1
|
||||
|
|
@ -742,18 +747,18 @@ draft-ietf-ipngwg-default-addr-select-03 March 3, 2001
|
|||
|
||||
Sources: 2001::2 or fe80::1 or 169.254.13.78
|
||||
Destinations: 2001::1 vs 131.107.65.121
|
||||
|
||||
|
||||
|
||||
Draves Standards Track - Expires December 2001 13
|
||||
draft-ietf-ipngwg-default-addr-select-04 May 14, 2001
|
||||
|
||||
|
||||
Unchanged Result: 2001::1 (src 2001::2) then 131.107.65.121 (src
|
||||
169.254.13.78) (prefer matching scope)
|
||||
|
||||
Sources: fe80::1 or 131.107.65.117
|
||||
Destinations: 2001::1 vs 131.107.65.121
|
||||
|
||||
|
||||
|
||||
Draves Standards Track - Expires September 2001 13
|
||||
draft-ietf-ipngwg-default-addr-select-03 March 3, 2001
|
||||
|
||||
|
||||
Unchanged Result: 131.107.65.121 (src 131.107.65.117) then 2001::1
|
||||
(src fe80::1) (prefer matching scope)
|
||||
|
||||
|
|
@ -801,17 +806,17 @@ draft-ietf-ipngwg-default-addr-select-03 March 3, 2001
|
|||
contracted for service with a special high-performance ISP. This is
|
||||
in addition to the normal Internet connection that both sites have
|
||||
with different ISPs. The high-performance ISP is expensive and the
|
||||
|
||||
|
||||
Draves Standards Track - Expires December 2001 14
|
||||
draft-ietf-ipngwg-default-addr-select-04 May 14, 2001
|
||||
|
||||
|
||||
two sites wish to use it only for their business-critical traffic
|
||||
with each other.
|
||||
|
||||
Each site has two global prefixes, one from the high-performance ISP
|
||||
and one from their normal ISP. Site A has prefix 2001:aaaa:aaaa::/48
|
||||
|
||||
|
||||
Draves Standards Track - Expires September 2001 14
|
||||
draft-ietf-ipngwg-default-addr-select-03 March 3, 2001
|
||||
|
||||
|
||||
from the high-performance ISP and prefix 2007:0:aaaa::/48 from its
|
||||
normal ISP. Site B has prefix 2001:bbbb:bbbb::/48 from the high-
|
||||
performance ISP and prefix 2007:0:bbbb::/48 from its normal ISP. All
|
||||
|
|
@ -853,6 +858,18 @@ draft-ietf-ipngwg-default-addr-select-03 March 3, 2001
|
|||
desired behavior via policy table configuration. For example, they
|
||||
can use the following policy table:
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
Draves Standards Track - Expires December 2001 15
|
||||
draft-ietf-ipngwg-default-addr-select-04 May 14, 2001
|
||||
|
||||
|
||||
Prefix Precedence Label
|
||||
::1 50 0
|
||||
2001:aaaa:aaaa::/48 45 5
|
||||
|
|
@ -864,12 +881,6 @@ draft-ietf-ipngwg-default-addr-select-03 March 3, 2001
|
|||
|
||||
This policy table produces the following behavior:
|
||||
|
||||
|
||||
|
||||
Draves Standards Track - Expires September 2001 15
|
||||
draft-ietf-ipngwg-default-addr-select-03 March 3, 2001
|
||||
|
||||
|
||||
Sources: 2001:aaaa:aaaa::a or 2007:0:aaaa::a or fe80::a
|
||||
Destinations: 2001:bbbb:bbbb::b vs 2007:0:bbbb::b
|
||||
New Result: 2001:bbbb:bbbb::b (src 2001:aaaa:aaaa::a) then
|
||||
|
|
@ -900,48 +911,47 @@ References
|
|||
uration", RFC 2462 , December 1998.
|
||||
|
||||
4 T. Narten, R. Draves, "Privacy Extensions for Stateless Address
|
||||
Autoconfiguration in IPv6", draft-ietf-ipngwg-addrconf-privacy-
|
||||
01.txt, July 2000.
|
||||
Autoconfiguration in IPv6", RFC 3041, January 2001.
|
||||
|
||||
5 D. Johnson, C. Perkins, "Mobility Support in IPv6", draft-ietf-
|
||||
mobileip-ipv6-12.txt, April 2000.
|
||||
mobileip-ipv6-13.txt, November 2000.
|
||||
|
||||
6 S. Cheshire. "Dynamic Configuration of IPv4 Link-local
|
||||
Addresses", draft-ietf-zeroconf-ipv4-linklocal-01.txt, November
|
||||
2000.
|
||||
6 S. Cheshire, B. Aboba, "Dynamic Configuration of IPv4 Link-local
|
||||
Addresses", draft-ietf-zeroconf-ipv4-linklocal-02.txt, March
|
||||
2001.
|
||||
|
||||
7 S. Bradner, "Key words for use in RFCs to Indicate Requirement
|
||||
Levels", BCP 14, RFC 2119, March 1997.
|
||||
|
||||
|
||||
|
||||
Draves Standards Track - Expires December 2001 16
|
||||
draft-ietf-ipngwg-default-addr-select-04 May 14, 2001
|
||||
|
||||
|
||||
|
||||
8 R. Gilligan, S. Thomson, J. Bound, W. Stevens, "Basic Socket
|
||||
Interface Extensions for IPv6", RFC 2553, March 1999.
|
||||
|
||||
9 S. Deering, B. Haberman, B. Zill. "IP Version 6 Scoped Address
|
||||
Architecture", draft-ietf-ipngwg-scoping-arch-01.txt, March 2000.
|
||||
9 S. Deering et al., "IP Version 6 Scoped Address Architecture",
|
||||
draft-ietf-ipngwg-scoping-arch-02.txt, March 2001.
|
||||
|
||||
10 Y. Rekhter et al., "Address Allocation for Private Internets",
|
||||
RFC 1918, February 1996.
|
||||
|
||||
|
||||
|
||||
Draves Standards Track - Expires September 2001 16
|
||||
draft-ietf-ipngwg-default-addr-select-03 March 3, 2001
|
||||
|
||||
|
||||
|
||||
11 F. Baker, Editor. "Requirements for IP Version 4 Routers", RFC
|
||||
11 F. Baker, Editor, "Requirements for IP Version 4 Routers", RFC
|
||||
1812, June 1995.
|
||||
|
||||
12 B. Carpenter, K. Moore. "Connection of IPv6 Domains via IPv4
|
||||
Clouds", draft-ietf-ngtrans-6to4-07.txt, September 2000.
|
||||
12 B. Carpenter, K. Moore, "Connection of IPv6 Domains via IPv4
|
||||
Clouds", RFC 3056, February 2001.
|
||||
|
||||
Acknowledgments
|
||||
|
||||
The author would like to acknowledge the contributions of the IPng
|
||||
Working Group, particularly Steve Deering, Jun-ichiro itojun Hagino,
|
||||
M.T. Hollinger, Ken Powell, Markku Savela, Dave Thaler, and Mauro
|
||||
Tortonesi. Please let the author know if you contributed to the
|
||||
development of this draft and are not mentioned here.
|
||||
Working Group, particularly Marc Blanchet, Brian Carpenter, Matt
|
||||
Crawford, Steve Deering, Jun-ichiro itojun Hagino, Tony Hain, M.T.
|
||||
Hollinger, Erik Nordmark, Ken Powell, Markku Savela, Dave Thaler,
|
||||
and Mauro Tortonesi. Please let the author know if you contributed
|
||||
to the development of this draft and are not mentioned here.
|
||||
|
||||
Author's Address
|
||||
|
||||
|
|
@ -954,12 +964,28 @@ Author's Address
|
|||
|
||||
Revision History
|
||||
|
||||
Changes from draft-ietf-ipngwg-default-addr-select-03
|
||||
|
||||
Reversed the treatment of temporary addresses, so that unless an
|
||||
application specifies otherwise public addresses are preferred over
|
||||
temporary addresses.
|
||||
|
||||
Added text clarifying our expectation that applications should
|
||||
iterate through the list of possible destination addresses until
|
||||
finding a working address.
|
||||
|
||||
Removed references to getipnodebyname().
|
||||
|
||||
Changes from draft-ietf-ipngwg-default-addr-select-02
|
||||
|
||||
Changed scope treatment of IPv4-compatible and 6to4 addresses, so
|
||||
they are always considered to be global. Removed mention of IPX
|
||||
addresses.
|
||||
|
||||
|
||||
Draves Standards Track - Expires December 2001 17
|
||||
draft-ietf-ipngwg-default-addr-select-04 May 14, 2001
|
||||
|
||||
|
||||
Changed home address rules to favor addresses that are
|
||||
simultaneously home and care-of addresses, over addresses that are
|
||||
just home addresses or just care-of addresses.
|
||||
|
|
@ -979,13 +1005,6 @@ Changes from draft-ietf-ipngwg-default-addr-select-01
|
|||
of source addresses and the source address selection rule that
|
||||
prefers source addresses of appropriate scope.
|
||||
|
||||
|
||||
|
||||
|
||||
Draves Standards Track - Expires September 2001 17
|
||||
draft-ietf-ipngwg-default-addr-select-03 March 3, 2001
|
||||
|
||||
|
||||
Simplified the default policy table. Reordered the source address
|
||||
selection rules to reduce the influence of policy labels. Added more
|
||||
destination address selection rules.
|
||||
|
|
@ -1020,7 +1039,11 @@ Changes from draft-ietf-ipngwg-default-addr-select-00
|
|||
|
||||
Added a rule to source address selection to handle anonymous/public
|
||||
addresses.
|
||||
|
||||
|
||||
Draves Standards Track - Expires December 2001 18
|
||||
draft-ietf-ipngwg-default-addr-select-04 May 14, 2001
|
||||
|
||||
|
||||
Added a rule to source address selection to handle home/care-of
|
||||
addresses.
|
||||
|
||||
|
|
@ -1039,11 +1062,7 @@ Changes from draft-draves-ipngwg-simple-srcaddr-01
|
|||
|
||||
Added mechanism to allow the specification of administrative policy
|
||||
that can override the default behavior.
|
||||
|
||||
Draves Standards Track - Expires September 2001 18
|
||||
draft-ietf-ipngwg-default-addr-select-03 March 3, 2001
|
||||
|
||||
|
||||
|
||||
Added section on routing interactions and TBD section on mobility
|
||||
interactions.
|
||||
|
||||
|
|
@ -1077,29 +1096,10 @@ Changes from draft-draves-ipngwg-simple-srcaddr-00
|
|||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
Draves Standards Track - Expires September 2001 19
|
||||
draft-ietf-ipngwg-default-addr-select-03 March 3, 2001
|
||||
Draves Standards Track - Expires December 2001 19
|
||||
draft-ietf-ipngwg-default-addr-select-04 May 14, 2001
|
||||
|
||||
|
||||
Full Copyright Statement
|
||||
|
|
@ -1156,4 +1156,4 @@ draft-ietf-ipngwg-default-addr-select-03 March 3, 2001
|
|||
|
||||
|
||||
|
||||
Draves Standards Track - Expires September 2001 20
|
||||
Draves Standards Track - Expires December 2001 20
|
||||
|
|
@ -1,228 +0,0 @@
|
|||
INTERNET-DRAFT Stuart Kwan
|
||||
James Gilroy
|
||||
Levon Esibov
|
||||
Microsoft Corp.
|
||||
March 2001
|
||||
<draft-skwan-utf8-dns-05.txt> Expires September 2001
|
||||
|
||||
|
||||
Using the UTF-8 Character Set in the Domain Name System
|
||||
|
||||
|
||||
Status of this Memo
|
||||
|
||||
This document is an Internet-Draft and is in full conformance
|
||||
with all provisions of Section 10 of RFC2026.
|
||||
|
||||
Internet-Drafts are working documents of the Internet Engineering
|
||||
Task Force (IETF), its areas, and its working groups. Note that
|
||||
other groups may also distribute working documents as
|
||||
Internet-Drafts.
|
||||
|
||||
Internet-Drafts are draft documents valid for a maximum of six
|
||||
months and may be updated, replaced, or obsoleted by other
|
||||
documents at any time. It is inappropriate to use Internet-
|
||||
Drafts as reference material or to cite them other than as
|
||||
"work in progress."
|
||||
|
||||
The list of current Internet-Drafts can be accessed at
|
||||
http://www.ietf.org/ietf/1id-abstracts.txt
|
||||
|
||||
The list of Internet-Draft Shadow Directories can be accessed at
|
||||
http://www.ietf.org/shadow.html.
|
||||
|
||||
|
||||
Abstract
|
||||
|
||||
The Domain Name System standard specifies that names are represented
|
||||
using the ASCII character encoding. This document expands that
|
||||
specification to allow the use of the UTF-8 character encoding, a
|
||||
superset of ASCII and a translation of the UCS-2 character encoding.
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
Expires September 2001 [Page 1]
|
||||
|
||||
|
||||
INTERNET-DRAFT UTF-8 DNS March 2001
|
||||
|
||||
1. Introduction
|
||||
|
||||
The Domain Name System standard [RFC1035] specifies that names are
|
||||
represented using the ASCII character encoding. This document expands
|
||||
that specification to allow the use of the UTF-8 character encoding
|
||||
[RFC2044], a superset of ASCII and a translation of the UCS-2
|
||||
character encoding.
|
||||
|
||||
Interpreting names as ASCII-only limits the utility of DNS in an
|
||||
international setting. The UTF-8 character set includes characters
|
||||
from most of the world's written languages, allowing a far greater
|
||||
range of possible names and allowing names to use characters that are
|
||||
relevant to a particular locality. UTF-8 is the recommended character
|
||||
set for protocols that are evolving beyond ASCII [RFC2130].
|
||||
|
||||
This document defines the technology for a richer character set in
|
||||
DNS. This document specifically does not define policy for the
|
||||
characters allowed in a name when used in a particular application.
|
||||
For example, some protocols place restrictions on the characters
|
||||
allowed in a name. In addition, names that are intended to be
|
||||
globally visible [RFC1958] should contain ASCII-only characters
|
||||
per [RFC1123].
|
||||
|
||||
|
||||
2. Protocol Description
|
||||
|
||||
A UTF-8-aware DNS server is a DNS server that can load and store DNS
|
||||
names that contain UTF-8 characters. Names are encoded in logical
|
||||
order as opposed to visual order (see [UNICODE 2.0]).
|
||||
|
||||
Uniform downcasing permits UTF-8-aware DNS implementations to
|
||||
interoperate with non-UTF-8-aware DNS implementations. Any binary
|
||||
string can be used in a DNS name [RFC2181], but names must be
|
||||
compared with case-insensitivity [RFC1035]. A non-UTF-8-aware DNS
|
||||
implementation is unable to perform a case-insensitive comparison
|
||||
on a name containing UTF-8 characters. However, if UTF-8 names are
|
||||
downcased before transmission, then binary comparisons will provide
|
||||
the desired result on non-UTF-8-aware servers without violating the
|
||||
case-insensitivity requirement.
|
||||
|
||||
The DNS protocol standard states that original case should be
|
||||
preserved when possible as data is entered into the system. This
|
||||
requirement is modified as follows: a UTF-8-aware DNS server must
|
||||
downcase all names containing UTF-8 characters in both record names
|
||||
and record data before transmitting those names in any message.
|
||||
A UTF-8-aware DNS client/resolver must downcase all names containing
|
||||
UTF-8 characters before transmitting those names in any message.
|
||||
|
||||
For consistency, UTF-8-aware DNS servers must compare names that
|
||||
contain UTF-8 characters byte-for-byte, as opposed to using Unicode
|
||||
equivalency rules.
|
||||
|
||||
Applications should take care when allowing uppercase UTF-8 characters
|
||||
to be passed to the resolver, and DNS servers should take care when
|
||||
allowing uppercase UTF-8 characters to be entered in zone data.
|
||||
Downcasing in UTF-8 is locale-sensitive and the result may vary
|
||||
according to the locale of the code execution. The desired result will
|
||||
always be obtained if the application and server only accept lowercase
|
||||
characters.
|
||||
|
||||
Names encoded in UTF-8 must not exceed the size limits clarified in
|
||||
[RFC2181]. Character count is insufficient to determine size, since
|
||||
some UTF-8 characters exceed one octet in length.
|
||||
|
||||
|
||||
3. Interoperability Considerations
|
||||
|
||||
The UTF-8 character encoding is ideal for use with existing protocol
|
||||
implementations that expect US-ASCII characters. The representation
|
||||
of US-ASCII characters in UTF-8 is byte-for-byte identical to the
|
||||
US-ASCII representation. Non-UTF-8-aware DNS clients always encode
|
||||
names in ASCII format and those names will always be correctly
|
||||
interpreted by a UTF-8-aware DNS server.
|
||||
|
||||
DNS server authors may wish to provide a configuration switch on the
|
||||
DNS server to allow/disallow the use of UTF-8 characters on a
|
||||
per-server or per-zone basis.
|
||||
|
||||
A non-UTF-8-aware DNS server may accept a zone transfer of a zone
|
||||
containing UTF-8 names, but it may not be able to write back those
|
||||
names to a zone file or reload those names from a zone file.
|
||||
Administrators should exercise caution when transferring a zone
|
||||
containing UTF-8 names to a non-UTF-8-aware DNS server.
|
||||
|
||||
|
||||
4. Security Considerations
|
||||
|
||||
The choice of character encoding for names does not impact the
|
||||
security of the DNS protocol.
|
||||
|
||||
|
||||
5. Acknowledgements
|
||||
|
||||
The authors of this document would like to thank the following people
|
||||
for their contribution to this specification: John McConnell,
|
||||
Cliff Van Dyke and Bjorn Rettig.
|
||||
|
||||
|
||||
6. References
|
||||
|
||||
[RFC1035] P.V. Mockapetris, "Domain Names - Implementation and
|
||||
Specification," RFC 1035, ISI, Nov 1987.
|
||||
|
||||
[RFC2044] F. Yergeau, "UTF-8, a transformation format of Unicode
|
||||
and ISO 10646," RFC 2044, Alis Technologies, Oct 1996.
|
||||
|
||||
[RFC1958] B. Carpenter, "Architectural Principles of the
|
||||
Internet," RFC 1958, IAB, June 1996.
|
||||
|
||||
[RFC1123] R. Braden, "Requirements for Internet Hosts -
|
||||
Application and Support," STD 3, RFC 1123, January 1989.
|
||||
|
||||
[RFC2130] C. Weider et al., "The Report of the IAB Character
|
||||
Set Workshop held 29 February - 1 March, 1996",
|
||||
RFC 2130, Apr 1997.
|
||||
|
||||
[RFC2181] R. Elz and R. Bush, "Clarifications to the DNS
|
||||
Specification," RFC 2181, University of Melbourne and
|
||||
RGnet Inc, July 1997.
|
||||
|
||||
[UNICODE 2.0] The Unicode Consortium, "The Unicode Standard, Version
|
||||
2.0," Addison-Wesley, 1996. ISBN 0-201-48345-9.
|
||||
|
||||
|
||||
7. Authors' Addresses
|
||||
|
||||
Stuart Kwan James Gilroy
|
||||
Microsoft Corporation Microsoft Corporation
|
||||
One Microsoft Way One Microsoft Way
|
||||
Redmond, WA 98052 Redmond, WA 98052
|
||||
USA USA
|
||||
<skwan@microsoft.com> <jamesg@microsoft.com>
|
||||
|
||||
Levon Esibov
|
||||
Microsoft Corporation
|
||||
One Microsoft Way
|
||||
Redmond, WA 98052
|
||||
USA
|
||||
<levone@microsoft.com>
|
421
doc/draft/draft-skwan-utf8-dns-06.txt
Normal file
|
|
@ -0,0 +1,421 @@
|
|||
INTERNET-DRAFT Stuart Kwan
|
||||
James Gilroy
|
||||
Levon Esibov
|
||||
Microsoft Corp.
|
||||
May 2001
|
||||
<draft-skwan-utf8-dns-06.txt> Expires November 2001
|
||||
|
||||
|
||||
Using the UTF-8 Character Set in the Domain Name System
|
||||
|
||||
Status of this Memo
|
||||
|
||||
This document is an Internet-Draft and is in full conformance
|
||||
with all provisions of Section 10 of RFC2026.
|
||||
|
||||
Internet-Drafts are working documents of the Internet Engineering
|
||||
Task Force (IETF), its areas, and its working groups. Note that
|
||||
other groups may also distribute working documents as
|
||||
Internet-Drafts.
|
||||
|
||||
Internet-Drafts are draft documents valid for a maximum of six
|
||||
months and may be updated, replaced, or obsoleted by other
|
||||
documents at any time. It is inappropriate to use Internet-
|
||||
Drafts as reference material or to cite them other than as
|
||||
"work in progress."
|
||||
|
||||
The list of current Internet-Drafts can be accessed at
|
||||
http://www.ietf.org/ietf/1id-abstracts.txt
|
||||
|
||||
The list of Internet-Draft Shadow Directories can be accessed at
|
||||
http://www.ietf.org/shadow.html.
|
||||
|
||||
|
||||
Abstract
|
||||
|
||||
The Domain Names standard specifies that hostnames are represented
|
||||
using the ASCII character encoding. This document expands that
|
||||
specification to allow the use of the UTF-8 character encoding, a
|
||||
superset of ASCII and a translation of the UCS-2 character encoding.
|
||||
|
||||
|
||||
1. Introduction
|
||||
|
||||
The Domain Names standard [RFC1123] specifies that hostnames are
|
||||
represented using the ASCII character encoding. This document expands
|
||||
that specification to allow the use of the UTF-8 character encoding
|
||||
[RFC2044], a superset of ASCII and a translation of the UCS-2
|
||||
character encoding.
|
||||
|
||||
Interpreting names as ASCII-only limits the utility of DNS in an
|
||||
international setting. The UTF-8 character set includes characters
|
||||
from most of the world's written languages, allowing a far greater
|
||||
range of possible names and allowing names to use characters that are
|
||||
relevant to a particular locality. UTF-8 is the recommended character
|
||||
set for protocols that are evolving beyond ASCII [RFC2130].
|
||||
|
||||
Expires November 2001 [Page 1]
|
||||
|
||||
INTERNET-DRAFT UTF-8 DNS May 2001
|
||||
|
||||
|
||||
This document defines the technology for a richer character set in
|
||||
DNS. This document specifically does not define policy for the
|
||||
characters allowed in a name when used in a particular application.
|
||||
For example, some protocols place restrictions on the characters
|
||||
allowed in a name.
|
||||
|
||||
|
||||
2. Protocol Description
|
||||
|
||||
2.1 Components and roles
|
||||
|
||||
Before describing the protocol itself, the authors wish to clarify
|
||||
which components are involved in processing hostnames and to
|
||||
describe how these components use hostnames. The following
|
||||
list contains this information.
|
||||
|
||||
User.
|
||||
A user could be a human or an application. Its role is to specify (also
|
||||
known as "write") and retrieve (also known as "read") the hostname to
|
||||
and from an application. The examples of such operations include
|
||||
typing the hostname, writing it on a touch sensitive screen, reading
|
||||
the name from the monitor, listening to a voicemail, etc.
|
||||
|
||||
Application.
|
||||
The application's role is to
|
||||
- process the hostname specified by a user or by another local or
|
||||
remote application.
|
||||
- return to the user (for example, display on a monitor screen) the
|
||||
hostname returned by the DNS resolver.
|
||||
- call DNS name resolution APIs to request the resolver to perform
|
||||
the name resolution.
|
||||
|
||||
Resolver.
|
||||
The resolver's role is to
|
||||
- process the name resolution requests from an application and submit
|
||||
the appropriate DNS query to the DNS servers.
|
||||
- process the response from a DNS server and pass the response to the
|
||||
application.
|
||||
|
||||
DNS server.
|
||||
The role of the DNS server is to store and maintain the DNS data,
|
||||
process the updates to its database, update the replica copies of the
|
||||
databases and perform the DNS name resolution through responding to
|
||||
the DNS queries.
|
||||
|
||||
|
||||
2.2 Protocol details
|
||||
|
||||
This section describes the modifications (if any) to each of these
|
||||
components and interfaces between the communicating components.
|
||||
|
||||
|
||||
2.2.1 Users
|
||||
|
||||
No modifications to the users are proposed in this document. At the
|
||||
same time, support of this protocol by other components specified later
|
||||
in this section may enable users to start using characters in hostnames
|
||||
from a wider set than the one specified in [RFC1123].
|
||||
|
||||
|
||||
2.2.2 Interface between users and applications
|
||||
|
||||
A user may use any character set or multiple character sets supported by
|
||||
the particular application. Specification of the allowed character
|
||||
sets supported by an application is outside the scope of this
|
||||
document. The decision on which character sets can be used to allow
|
||||
a user to input and retrieve hostnames is left to the implementers
|
||||
of the particular application, unless a protocol underlying that
|
||||
application specifies the supported character set. Thus this protocol
|
||||
does not affect the interface between users and applications.
|
||||
|
||||
|
||||
2.2.3 Applications
|
||||
|
||||
The format in which applications store hostnames is outside the
|
||||
scope of this protocol.
|
||||
|
||||
|
||||
2.2.4 Interface between applications and resolvers
|
||||
|
||||
This protocol does not specify the APIs that applications should use
|
||||
to request the resolver to perform the DNS name resolution of the
|
||||
internationalized hostnames. Instead it only specifies the format of
|
||||
the hostnames specified in the input and output of such APIs.
|
||||
|
||||
The applications supporting non-ASCII characters in hostnames MUST
|
||||
pass to the resolvers a hostname in ISO/IEC 10646 encoding. If the
|
||||
response returned by the resolver to the application contains the
|
||||
hostname, then the application should expect the hostname to be
|
||||
encoded using ISO/IEC 10646.
|
||||
|
||||
|
||||
2.2.5 Resolvers
|
||||
|
||||
Before sending the hostname in the query packet, the resolver MUST
|
||||
prepare each name part as specified in [NAMEPREP]. After name
|
||||
preparation, the resolver MUST encode the hostname in UTF-8
|
||||
as specified in [RFC2044].
|
||||
Names encoded in UTF-8 must not exceed the size limits clarified in
|
||||
[RFC2181]. Character count is insufficient to determine size, since
|
||||
some UTF-8 characters exceed one octet in length.
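The difference between character count and octet count can be illustrated with a short Python sketch. The function name utf8_name_fits is hypothetical and not part of this protocol; the 63-octet label limit and 255-octet name limit are the ones restated in [RFC2181].

```python
MAX_LABEL_OCTETS = 63   # per-label limit restated in RFC 2181
MAX_NAME_OCTETS = 255   # whole-name limit restated in RFC 2181


def utf8_name_fits(name: str) -> bool:
    """Return True if each label and the whole name fit the DNS size limits.

    Hypothetical helper: sizes are measured on the UTF-8 octets,
    never on the number of characters.
    """
    labels = [label.encode("utf-8") for label in name.rstrip(".").split(".")]
    if any(len(label) == 0 or len(label) > MAX_LABEL_OCTETS for label in labels):
        return False
    # On the wire each label carries one length octet, and the name ends
    # with a single zero octet for the root label.
    wire_length = sum(1 + len(label) for label in labels) + 1
    return wire_length <= MAX_NAME_OCTETS


label = "\u00e9" * 32                    # 32 characters, each two octets in UTF-8
print(len(label))                        # character count
print(len(label.encode("utf-8")))        # octet count
print(utf8_name_fits(label + ".example"))
```

A label of 32 two-octet characters looks short by character count but occupies 64 octets, one more than a label may hold.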
|
||||
|
||||
When the resolver receives a response to the query from a DNS server, it
|
||||
MUST convert all of the hostnames from UTF-8 encoded format to the
|
||||
ISO/IEC 10646 encoding before passing these hostnames back to the
|
||||
application.
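The resolver steps in this section can be sketched in Python as follows. This is only an illustration under stated assumptions: approx_nameprep is a hypothetical stand-in that applies case folding and NFKC normalization, which is not the full [NAMEPREP] profile, and a Python str is used to stand for a hostname in ISO/IEC 10646 encoding. The function names are not defined by this protocol.

```python
import unicodedata


def approx_nameprep(label: str) -> str:
    # Assumption: a rough stand-in for [NAMEPREP], not the real profile.
    # It only case-folds the label and applies NFKC normalization.
    return unicodedata.normalize("NFKC", label.casefold())


def to_utf8_name(hostname: str) -> bytes:
    """Prepare each name part, then encode the whole name in UTF-8."""
    prepared = ".".join(approx_nameprep(part) for part in hostname.split("."))
    return prepared.encode("utf-8")


def from_utf8_name(octets: bytes) -> str:
    """Convert a UTF-8 name from a response back for the application."""
    return octets.decode("utf-8")


query_name = to_utf8_name("Caf\u00c9.Example")
print(query_name)
print(from_utf8_name(query_name))
```

The name sent in the query is the prepared, UTF-8 encoded form; the application sees the decoded form of whatever the server returns.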
|
||||
|
||||
|
||||
2.2.6 DNS servers
|
||||
|
||||
DNS servers authoritative for records containing hostnames with
|
||||
characters not allowed by [RFC1123] MUST allow use of the
|
||||
nameprepped UTF-8 format to store and transmit those parts of the
|
||||
hostnames.
|
||||
|
||||
According to existing standards, any binary string can be used in a
|
||||
DNS name [RFC2181], but names must be compared with case-insensitivity
|
||||
[RFC1035]. At the same time, the DNS protocol standard states that original
|
||||
case SHOULD be preserved when possible as data is entered into the DNS
|
||||
database. This requirement is modified as follows: a DNS server
|
||||
authoritative for the internationalized hostnames MUST nameprep and
|
||||
perform UTF-8 conversion on all names containing internationalized
|
||||
characters in both record names and record data before storing these
|
||||
hostnames and transmitting those names in any message. This new
|
||||
requirement guarantees case-insensitive comparison of the
|
||||
internationalized hostnames even by those DNS servers that do not
|
||||
support this protocol.
|
||||
|
||||
DNS servers must compare names that contain UTF-8 characters
|
||||
byte-for-byte, as opposed to using Unicode equivalency rules.
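A short Python sketch shows what byte-for-byte comparison implies. The two strings below render identically and are canonically equivalent under Unicode rules, yet their UTF-8 octets differ, so a server following this rule treats them as different names. The helper same_dns_name is hypothetical.

```python
import unicodedata

composed = "caf\u00e9"       # 'e with acute' as a single code point
decomposed = "cafe\u0301"    # 'e' followed by a combining acute accent


def same_dns_name(a: bytes, b: bytes) -> bool:
    # Byte-for-byte comparison: no Unicode equivalence is applied.
    return a == b


print(same_dns_name(composed.encode("utf-8"), decomposed.encode("utf-8")))
# Unicode equivalence would consider the two strings the same after
# normalization, which is exactly what the server does not do.
print(unicodedata.normalize("NFC", decomposed) == composed)
```

This is why names need to be prepared consistently before they are stored or sent: the comparison itself will not reconcile different representations.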
|
||||
|
||||
|
||||
3. Interoperability Considerations
|
||||
|
||||
If a user continues using ASCII-only characters in hostnames, then
|
||||
there is no need to upgrade any applications and/or resolvers.
|
||||
|
||||
As pointed out in the previous section, there is no need to upgrade DNS
|
||||
servers, except possibly those that are authoritative for the zones
|
||||
containing internationalized hostnames.
|
||||
|
||||
The following interoperability issues should be taken into account:
|
||||
|
||||
- A legacy application may not be able to process hostnames
|
||||
containing non-ASCII characters returned by DNS resolvers. The effect
|
||||
of a failure to process a name containing octets with the high bit set
|
||||
needs to be separately investigated.
|
||||
- If other protocols decide to use the nameprep-UTF-8 encoding to
|
||||
represent internationalized hostnames in their wire packets, then a
|
||||
legacy application supporting such a protocol that receives a UTF-8
|
||||
encoded hostname from another application (for example, a mail
|
||||
server or client) may fail to process that hostname. The effect of a
|
||||
failure to process a name containing octets with the high bit set
|
||||
needs to be separately investigated.
|
||||
|
||||
Thus hostnames that are intended to be globally usable [RFC1958] on
|
||||
legacy applications should still contain ASCII-only characters per
|
||||
[RFC1123].
|
||||
|
||||
- If an updated application runs on a legacy resolver that rejects
|
||||
names containing any character not allowed by [RFC1123], then that
|
||||
resolver will require an upgrade to enable name resolution of
|
||||
internationalized hostnames.
|
||||
|
||||
- As specified above, DNS servers authoritative for the DNS records
|
||||
containing the internationalized hostnames must be able to save and
|
||||
load the hostnames containing nameprepped-UTF-8-converted characters.
|
||||
If the DNS server doesn't satisfy this requirement, but needs to host
|
||||
such resource records, then it needs to be upgraded.
|
||||
|
||||
- Any DNS server involved in a name resolution process of the DNS
|
||||
records containing an internationalized hostname must not reject name
|
||||
resolution only because the hostname contains characters not allowed
|
||||
by [RFC1123]. This requirement does not mean that every DNS server in
|
||||
the name resolution path between the client and authoritative server
|
||||
must be able to store and load the DNS records containing the
|
||||
internationalized hostnames, but only means that the DNS server
|
||||
performing recursive resolution needs to be able to query for and
|
||||
cache such records, and that the DNS servers authoritative for the DNS
|
||||
names higher in the DNS name hierarchy than the internationalized
|
||||
names in query, need to be able to respond to such queries.
|
||||
The overwhelming majority of the DNS servers currently deployed on the
|
||||
Internet already satisfy this requirement. The authors are not aware of
|
||||
any implementation of the DNS server widely deployed on the Internet
|
||||
that doesn't satisfy this requirement.
|
||||
|
||||
Although most of the DNS servers may be capable of accepting a zone
|
||||
transfer of a zone containing UTF-8 encoded hostnames, some of them
|
||||
may not be able to store those names in a zone file or load those
|
||||
names from a zone file. Administrators should exercise caution when
|
||||
transferring a zone containing UTF-8 encoded hostnames to such DNS
|
||||
servers.
|
||||
|
||||
|
||||
|
||||
4. Security Considerations
|
||||
|
||||
Support for internationalized hostnames introduces the possibility of a
|
||||
new type of spoofing attack based on an attacker's
|
||||
knowledge of misbehaving applications or resolvers that modify the
|
||||
internationalized hostname that needs to be resolved. For example, if
|
||||
there is an application that modifies any octet with the high bit set
|
||||
in some predictable manner (for example, by simply clearing that bit),
|
||||
then an attacker may register a DNS record mapping the derivative
|
||||
(i.e. modified by the misbehaving application or resolver) name to the
|
||||
data desired by the attacker. In this scenario any user using such
|
||||
misbehaving application may receive as a result of name resolution the
|
||||
data (for example an IP address in A resource record) specified by the
|
||||
attacker without noticing that they are subjected to an attack even if
|
||||
DNSSEC is used to verify the authenticity of the response.
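The attack scenario above can be made concrete with a small Python sketch. It assumes that the misbehaving component clears the high-order bit of every octet, which is one reading of the example in the text; the function clear_high_bit and the name used are hypothetical.

```python
def clear_high_bit(data: bytes) -> bytes:
    # Models a misbehaving application or resolver (assumption): every
    # octet is masked down to its low seven bits.
    return bytes(octet & 0x7F for octet in data)


intended = "caf\u00e9.example".encode("utf-8")   # b'caf\xc3\xa9.example'
derived = clear_high_bit(intended)

print(intended)
print(derived)   # a different, pure-ASCII name that an attacker could register
```

Because the derived name is a valid name in its own right, a correctly signed response for it still verifies; the signature says nothing about whether it was the name the user intended.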
|
||||
|
||||
Because this protocol depends on the procedures described in
|
||||
[NAMEPREP] and [RFC2044], the security issues identified in these
|
||||
documents are also applicable to this protocol.
|
||||
|
||||
|
||||
5. Acknowledgements
|
||||
|
||||
The authors of this document would like to thank the following people
|
||||
for their contribution to this specification: John McConnell,
|
||||
Cliff Van Dyke and Bjorn Rettig.
|
||||
|
||||
|
||||
6. References
|
||||
|
||||
[RFC1035] P.V. Mockapetris, "Domain Names - Implementation and
|
||||
Specification," RFC 1035, ISI, Nov 1987.
|
||||
|
||||
[RFC2044] F. Yergeau, "UTF-8, a transformation format of Unicode
|
||||
and ISO 10646," RFC 2044, Alis Technologies, Oct 1996.
|
||||
|
||||
[RFC1958] B. Carpenter, "Architectural Principles of the
|
||||
Internet," RFC 1958, IAB, June 1996.
|
||||
|
||||
[RFC1123] R. Braden, "Requirements for Internet Hosts -
|
||||
Application and Support," STD 3, RFC 1123, January 1989.
|
||||
|
||||
[RFC2130] C. Weider et al., "The Report of the IAB Character
|
||||
Set Workshop held 29 February - 1 March, 1996",
|
||||
RFC 2130, Apr 1997.
|
||||
|
||||
[RFC2181] R. Elz and R. Bush, "Clarifications to the DNS
|
||||
Specification," RFC 2181, University of Melbourne and
|
||||
RGnet Inc, July 1997.
|
||||
|
||||
[UNICODE 2.0] The Unicode Consortium, "The Unicode Standard, Version
|
||||
2.0," Addison-Wesley, 1996. ISBN 0-201-48345-9.
|
||||
|
||||
[NAMEPREP] Paul Hoffman and Marc Blanchet, "Preparation of
|
||||
Internationalized Host Names",
|
||||
draft-ietf-idn-nameprep-*.txt.
|
||||
|
||||
|
||||
7. Authors' Addresses
|
||||
|
||||
Stuart Kwan James Gilroy
|
||||
Microsoft Corporation Microsoft Corporation
|
||||
One Microsoft Way One Microsoft Way
|
||||
Redmond, WA 98052 Redmond, WA 98052
|
||||
USA USA
|
||||
skwan@microsoft.com jamesg@microsoft.com
|
||||
|
||||
Levon Esibov
|
||||
Microsoft Corporation
|
||||
One Microsoft Way
|
||||
Redmond, WA 98052
|
||||
USA
|
||||
levone@microsoft.com
|
||||
|
||||
|
||||
8. Intellectual Property Statement
|
||||
|
||||
The IETF takes no position regarding the validity or scope of any
|
||||
intellectual property or other rights that might be claimed to pertain
|
||||
to the implementation or use of the technology described in this
|
||||
document or the extent to which any license under such rights might or
|
||||
might not be available; neither does it represent that it has made any
|
||||
effort to identify any such rights. Information on the IETF's
|
||||
procedures with respect to rights in standards-track and standards-
|
||||
related documentation can be found in BCP-11. Copies of claims of
|
||||
rights made available for publication and any assurances of licenses to
|
||||
be made available, or the result of an attempt made to obtain a general
|
||||
license or permission for the use of such proprietary rights by
|
||||
implementors or users of this specification can be obtained from the
|
||||
IETF Secretariat.
|
||||
|
||||
The IETF invites any interested party to bring to its attention any
|
||||
copyrights, patents or patent applications, or other proprietary rights
|
||||
which may cover technology that may be required to practice this
|
||||
standard. Please address the information to the IETF Executive
|
||||
Director.
|
||||
|
||||
|
||||
9. Full Copyright Statement
|
||||
|
||||
Copyright (C) The Internet Society (2001). All Rights Reserved.
|
||||
This document and translations of it may be copied and furnished to
|
||||
others, and derivative works that comment on or otherwise explain it or
|
||||
assist in its implementation may be prepared, copied, published and
|
||||
distributed, in whole or in part, without restriction of any kind,
|
||||
provided that the above copyright notice and this paragraph are included
|
||||
on all such copies and derivative works. However, this document itself
|
||||
may not be modified in any way, such as by removing the copyright notice
|
||||
or references to the Internet Society or other Internet organizations,
|
||||
except as needed for the purpose of developing Internet standards in
|
||||
which case the procedures for copyrights defined in the Internet
|
||||
Standards process must be followed, or as required to translate it into
|
||||
languages other than English. The limited permissions granted above are
|
||||
perpetual and will not be revoked by the Internet Society or its
|
||||
successors or assigns. This document and the information contained
|
||||
herein is provided on an "AS IS" basis and THE INTERNET SOCIETY AND THE
|
||||
INTERNET ENGINEERING TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR
|
||||
IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
|
||||
INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
|
||||
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
|