From: tedu
Date: Wed, 31 May 2017 08:30:22 +0000 (+0000)
Subject: add a tiny, to be improved, man page for utf8 encoding.
X-Git-Url: http://artulab.com/gitweb/?a=commitdiff_plain;h=0595762f273748026a530800ea7317a8749b8f94;p=openbsd

add a tiny, to be improved, man page for utf8 encoding.

ok stsp
---

diff --git a/share/man/man7/Makefile b/share/man/man7/Makefile
index f295d97fa4d..ef07e278691 100644
--- a/share/man/man7/Makefile
+++ b/share/man/man7/Makefile
@@ -1,4 +1,4 @@
-# $OpenBSD: Makefile,v 1.26 2017/05/29 12:13:50 tedu Exp $
+# $OpenBSD: Makefile,v 1.27 2017/05/31 08:30:22 tedu Exp $
 # $NetBSD: Makefile,v 1.6 1994/12/22 10:50:05 cgd Exp $
 
 # missing: term.7
@@ -7,6 +7,6 @@ MAN= airport.7 ascii.7 eqn.7 environ.7 glob.7 hier.7 hostname.7 intro.7 \
 	library-specs.7 \
 	man.7 mandoc_char.7 mdoc.7 mirroring-ports.7 \
 	operator.7 packages.7 packages-specs.7 pkgpath.7 ports.7 roff.7 \
-	script.7 securelevel.7 tbl.7
+	script.7 securelevel.7 tbl.7 utf8.7
 
 .include
diff --git a/share/man/man7/utf8.7 b/share/man/man7/utf8.7
new file mode 100644
index 00000000000..b0dacfb61c7
--- /dev/null
+++ b/share/man/man7/utf8.7
@@ -0,0 +1,60 @@
+.\" $OpenBSD: utf8.7,v 1.1 2017/05/31 08:30:22 tedu Exp $
+.\"
+.\" Copyright (c) 2017 Ted Unangst
+.\" All rights reserved.
+.\"
+.\" Redistribution and use in source and binary forms, with or without
+.\" modification, are permitted provided that the following conditions
+.\" are met:
+.\" 1. Redistributions of source code must retain the above copyright
+.\" notice, this list of conditions and the following disclaimer.
+.\" 2. Redistributions in binary form must reproduce the above copyright
+.\" notice, this list of conditions and the following disclaimer in the
+.\" documentation and/or other materials provided with the distribution.
+.\"
+.\" THIS SOFTWARE IS PROVIDED BY THE DEVELOPERS ``AS IS'' AND ANY EXPRESS OR
+.\" IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
+.\" OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
+.\" IN NO EVENT SHALL THE DEVELOPERS BE LIABLE FOR ANY DIRECT, INDIRECT,
+.\" INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
+.\" NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+.\" DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+.\" THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+.\" (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
+.\" THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+.\"
+.Dd $Mdocdate: May 31 2017 $
+.Dt UTF8 7
+.Os
+.Sh NAME
+.Nm utf8
+.Nd UTF-8 text encoding
+.Sh DESCRIPTION
+UTF-8 is a multibyte encoding for Unicode text.
+It is the preferred format for non-ASCII text.
+.Pp
+The first byte of a sequence indicates the length in its high bits.
+Continuation bytes all have the same format, with the top two bits set and
+unset, respectively.
+.Pp
+Ranges:
+.Bl -tag -width Ds
+.It 0x0 - 0x7f
+One byte.
+0.......
+.It 0x80 - 0x7ff
+Two bytes.
+110..... 10......
+.It 0x800 - 0xffff
+Three bytes.
+1110.... 10...... 10......
+.It 0x10000 - 0x10ffff
+Four bytes.
+11110... 10...... 10...... 10......
+.El
+.Sh CAVEATS
+Beware of overlong encodings.
+.Sh STANDARDS
+Unicode.
+.Sh SEE ALSO
+.Xr ascii 7
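
The byte layouts listed under Ranges in the new page, and the overlong-encoding caveat, can be illustrated with a short sketch. This is not part of the commit. The names utf8_encode, utf8_decode_one and MIN_FOR_LEN are hypothetical, and the surrogate check is an extra rule from the Unicode standard that the page does not mention. The four-byte range is taken to begin at 0x10000, as Unicode defines it.

```python
# Illustrative sketch only; all names here are hypothetical.

# Smallest code point that legitimately needs a sequence of each length.
# A decoded value below this minimum means the sequence is overlong.
MIN_FOR_LEN = {1: 0x0, 2: 0x80, 3: 0x800, 4: 0x10000}


def utf8_encode(cp):
    """Encode one Unicode code point as UTF-8 bytes."""
    if cp < 0 or cp > 0x10ffff or 0xd800 <= cp <= 0xdfff:
        raise ValueError("not a Unicode scalar value")
    if cp <= 0x7f:        # 0.......
        return bytes([cp])
    if cp <= 0x7ff:       # 110..... 10......
        return bytes([0xc0 | (cp >> 6),
                      0x80 | (cp & 0x3f)])
    if cp <= 0xffff:      # 1110.... 10...... 10......
        return bytes([0xe0 | (cp >> 12),
                      0x80 | ((cp >> 6) & 0x3f),
                      0x80 | (cp & 0x3f)])
    # 0x10000 - 0x10ffff: 11110... 10...... 10...... 10......
    return bytes([0xf0 | (cp >> 18),
                  0x80 | ((cp >> 12) & 0x3f),
                  0x80 | ((cp >> 6) & 0x3f),
                  0x80 | (cp & 0x3f)])


def utf8_decode_one(buf):
    """Decode the first sequence in buf; return (code point, length)."""
    if not buf:
        raise ValueError("empty input")
    b0 = buf[0]
    # The high bits of the first byte give the sequence length.
    if b0 < 0x80:
        n, cp = 1, b0
    elif (b0 & 0xe0) == 0xc0:
        n, cp = 2, b0 & 0x1f
    elif (b0 & 0xf0) == 0xe0:
        n, cp = 3, b0 & 0x0f
    elif (b0 & 0xf8) == 0xf0:
        n, cp = 4, b0 & 0x07
    else:
        raise ValueError("invalid first byte")
    if len(buf) < n:
        raise ValueError("truncated sequence")
    # Every continuation byte must look like 10...... and adds six bits.
    for b in buf[1:n]:
        if (b & 0xc0) != 0x80:
            raise ValueError("invalid continuation byte")
        cp = (cp << 6) | (b & 0x3f)
    if cp < MIN_FOR_LEN[n]:
        raise ValueError("overlong encoding")
    if cp > 0x10ffff or 0xd800 <= cp <= 0xdfff:
        raise ValueError("not a Unicode scalar value")
    return cp, n
```

For instance, the two bytes 0xc0 0xaf carry the payload 0x2f ('/') in a two-byte layout. The sketch rejects them as overlong instead of returning 0x2f, which is the kind of input the CAVEATS section warns about.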