<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
     "dtd/xml/4.1.2/docbookx.dtd"[
<!ENTITY % afnic_custom SYSTEM "../../lib/afnic-docbook.inc">
%afnic_custom;
]>
<!-- $Id: idn.db,v 1.6 2003-10-20 12:57:21 bortzmeyer Exp $ -->
<article>
  <articleinfo>
    <title>IDN (Internationalized Domain Names)</title>
    <author>
      <surname>Bortzmeyer</surname>
      <firstname>Stephane</firstname>
      <affiliation>
	<address><email>bortzmeyer@nic.fr</email></address>
      </affiliation>
    </author>
    <pubdate>$Date: 2003-10-20 12:57:21 $</pubdate>
  </articleinfo>
<section><title>Background</title>
<para><ulink url="http://www.i-d-n.net/">IDN</ulink> allows you to use the characters of all human scripts
      (including non-alphabetic signs such as ideograms) in your
      domain names. The DNS protocol has allowed non-ASCII characters for a
      very long time, but DNS usage restricted the set of acceptable
      characters to a subset of US-ASCII.</para>
<para>To allow more characters, thus enabling you to have domain names
      properly written in your language, the IETF decided not to
      change the DNS protocol but rather to ask applications to
      transform IDN into US-ASCII.</para>
<para>The repertoire of characters used by IDNA is <ulink
      url="http://www.unicode.org/">Unicode</ulink><footnote><para>There is a <ulink url="http://www.nic.fr/formation/supports/formation-unicode/">set of slides in French</ulink> about Unicode.</para>
      </footnote>.</para>
<para>The current IETF standard is specified in four RFCs:
    <itemizedlist>
      <listitem><para><rfc num="3490"/>, "Internationalizing Domain
      Names in Applications (IDNA)" sets the base protocol.</para><para>As its
      name says, all the work has to be done by the applications. On
      the wire and in the zone file, you will find only US-ASCII.</para>
      </listitem>
      <listitem><para><rfc num="3454"/>, "Preparation of
      Internationalized Strings ("stringprep")" and <rfc num="3491"/>,
      "Nameprep: A Stringprep Profile for Internationalized Domain
      Names (IDN)", describe the steps to take when receiving a domain
      name. IDNA applications will have to canonicalize the names to
      bring them to a common ("canonical") form before testing for
      uniqueness.</para>
	<para>For instance, in German, "maße" and "masse" will be the
      same name, after nameprep canonicalization. In French, "CAFÉ" and
      "café" will be the same (but not "CAFE", which is not a proper spelling).</para>
      </listitem>
      <listitem><para><rfc num="3492"/>, "Punycode: A Bootstring encoding of Unicode
       for Internationalized Domain Names in Applications (IDNA)"
       specifies the encoding used. Unicode names are transformed into
       US-ASCII names (ACE: ASCII Compatible Encoding) which start with the common prefix "xn--". For
       instance, <computeroutput>stéphane.org</computeroutput> becomes
       <computeroutput>xn--stphane-cya.org</computeroutput>.</para>
      </listitem>
    </itemizedlist>
You can try these transformations online at <ulink
	url="http://tac.eureg.org/idn.cgi">EUREG</ulink>, <ulink
	url="http://oss.software.ibm.com/cgi-bin/icu/idnademo">IBM</ulink>
	or <ulink url="http://josefsson.org/idn.php/">Josefsson</ulink>.
  </para>
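<para>As a minimal sketch of the nameprep and Punycode steps listed above, the following uses the <literal>idna</literal> codec of Python's standard library, which implements the ToASCII operation of <rfc num="3490"/>. The choice of Python and the helper name <literal>to_ace</literal> are assumptions made for this illustration only.</para>
<programlisting><![CDATA[
```python
def to_ace(name):
    """Convert a Unicode domain name to its ACE form, label by label.

    The built-in "idna" codec applies nameprep and then Punycode to every
    label that is not already plain US-ASCII.
    """
    return name.encode("idna").decode("ascii")


# The example from the text: the non-ASCII label gets the "xn--" prefix.
print(to_ace("stéphane.org"))              # xn--stphane-cya.org

# Nameprep folds case, so both spellings yield the same ACE string.
print(to_ace("CAFÉ") == to_ace("café"))    # True

# Nameprep maps the German sharp s to "ss".
print(to_ace("maße") == to_ace("masse"))   # True

# Accents are kept: "CAFE" and "café" remain two different names.
print(to_ace("CAFE") == to_ace("café"))    # False
```
]]></programlisting>
<para>Comparing the ACE strings is enough to test two names for uniqueness, since nameprep has already been applied to both.</para>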
  </section>
<section><title>Policy</title>
<para>If you run a registry and want to register IDN, you will have to
      address some policy issues. For instance:<itemizedlist>
	<listitem><para>Do you accept that two variants (names that
	    are more or less the same according to the rules of a
	    given language, but different according to <rfc
	      num="3490"/>) are registered by different registrants?</para>
	</listitem>
	<listitem><para>Do you accept all the characters of Unicode or
	just a subset which fits your local language(s)?</para>
	</listitem>
      </itemizedlist>
These policy issues are discussed on the <ulink
	url="http://www.imc.org/idn-reg-policy/index.html">idn-reg-policy
	mailing list</ulink>.
    </para>
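<para>The second question, restricting registrations to a subset of Unicode, could be enforced with a check such as the sketch below. The <literal>ALLOWED</literal> set and the function name <literal>label_is_acceptable</literal> are hypothetical: they stand for whatever repertoire a registry chooses, and are not taken from any actual policy. The sketch relies on <literal>encodings.idna.nameprep</literal> from Python's standard library.</para>
<programlisting><![CDATA[
```python
from encodings.idna import nameprep

# Hypothetical policy: ASCII letters, digits, hyphen and some French letters.
ALLOWED = set("abcdefghijklmnopqrstuvwxyz0123456789-àâçéèêëîïôùûüÿ")


def label_is_acceptable(label):
    """Return True if the nameprepped label uses only characters in ALLOWED."""
    try:
        prepared = nameprep(label)
    except UnicodeError:
        # nameprep rejects code points that RFC 3491 prohibits.
        return False
    # An empty label is never acceptable.
    return bool(prepared) and all(ch in ALLOWED for ch in prepared)


print(label_is_acceptable("CAFÉ"))     # True: nameprep lowercases it first
print(label_is_acceptable("日本語"))    # False: outside the chosen subset
```
]]></programlisting>
<para>Checking the nameprepped form rather than the raw input means the policy is applied to the name as it will actually be registered.</para>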
  </section>
  <section>
    <title>Tools</title>
    <para>You can register IDN with no tools at all if you just store
    the ACE strings. But if you want to perform nameprep and Punycode
    yourself, or if you want to implement bundles (the set of all
    names that are simple variants of the registered name), you will
    need some code, either your own or an existing library:
    <itemizedlist>
      <listitem><para><ulink
      url="http://www.josefsson.org/libidn/">GNU libidn</ulink>, a
      free software implementation of IDN (stringprep, nameprep, Punycode and IDNA). For these steps there is nothing to write, just use it.</para>
      </listitem>
    </itemizedlist></para>
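<para>Bundles, on the other hand, depend on each registry's language rules, so this is the part you are most likely to write yourself. The sketch below shows one possible approach in Python. The equivalence classes in <literal>GROUPS</literal> and the function names <literal>bundle</literal> and <literal>ace_bundle</literal> are hypothetical and only serve the illustration.</para>
<programlisting><![CDATA[
```python
from itertools import product

# Hypothetical equivalence classes; real ones depend on the registry's
# language rules.
GROUPS = ["eéèêë", "aàâ", "cç"]
VARIANTS = {ch: group for group in GROUPS for ch in group}


def bundle(label):
    """Return every name obtained by swapping characters for their variants."""
    # For each character, the candidates are its whole class, or itself.
    options = [VARIANTS.get(ch, ch) for ch in label]
    return {"".join(chars) for chars in product(*options)}


def ace_bundle(label):
    """Return the bundle as sorted ACE strings, ready to be stored."""
    return sorted(name.encode("idna").decode("ascii") for name in bundle(label))


names = ace_bundle("café")
print(len(names))                                # 30 (2 * 3 * 1 * 5 variants)
print("cafe" in names, "xn--caf-dma" in names)   # True True
```
]]></programlisting>
<para>The size of a bundle grows multiplicatively with the length of the name, which is one reason a registry may prefer small equivalence classes.</para>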
  </section>
</article>
