module UTF8 : sig .. end
The module for UTF-8 encoded Unicode strings.

type t = string
exception Malformed_code

val validate : t -> unit
validate s succeeds if s is valid UTF-8; otherwise it raises Malformed_code.
Other functions assume that their string arguments are valid UTF-8, so it is prudent
to validate strings from untrusted sources first.

val get : t -> int -> UChar.uchar
get s n returns the n-th Unicode character of s.
The call takes O(n) time.

val init : int -> (int -> UChar.uchar) -> t
init len f returns a new string which contains len Unicode characters.
The i-th Unicode character is initialized by f i.

val length : t -> int
length s returns the number of Unicode characters contained in s.

type index = int
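
The following is a minimal usage sketch for validate, length, get and init. It assumes that this is the UTF8 module of the Camomile library and that UTF8 and UChar become visible through `open CamomileLibrary`; the module path is an assumption and may differ between library versions. It also assumes that UChar.chr and UChar.code convert between integers and UChar.uchar. The helper is_valid is hypothetical and not part of the module.

```ocaml
(* Assumption: UTF8 and UChar are reachable through this module path. *)
open CamomileLibrary

(* Hypothetical helper: turn the exception raised by validate into a bool. *)
let is_valid (s : string) : bool =
  match UTF8.validate s with
  | () -> true
  | exception UTF8.Malformed_code -> false

(* "h", U+00E9, "llo": six bytes, five Unicode characters. *)
let s = "h\xc3\xa9llo"

(* init builds a string from a function that produces each character. *)
let abc = UTF8.init 3 (fun i -> UChar.chr (Char.code 'a' + i))

let () =
  assert (is_valid s);
  assert (not (is_valid "\xff"));      (* 0xFF never occurs in UTF-8 *)
  assert (String.length s = 6);        (* bytes *)
  assert (UTF8.length s = 5);          (* Unicode characters *)
  (* get counts characters, not bytes: character 1 is U+00E9. *)
  assert (UChar.code (UTF8.get s 1) = 0xE9);
  assert (abc = "abc")
```

Validating once at the boundary where untrusted input enters the program keeps the later calls within the assumption stated above, namely that their arguments are valid UTF-8.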
Values of type index are positions in the string; the first character is at position 0.

val nth : t -> int -> index
nth s n returns the position of the n-th Unicode character.
The call takes O(n) time.

val last : t -> index

val look : t -> index -> UChar.uchar
look s i
returns the Unicode character at location i in the string s.

val substring : t -> int -> int -> t
substring s i len returns the substring made of the Unicode locations i to i + len - 1 inclusive.
The string is always copied.

val out_of_range : t -> index -> bool
out_of_range s i
tests whether i lies outside of s: the result is false if i is a position inside of s, and true otherwise.

val compare_index : t -> index -> index -> int
compare_index s i1 i2 returns
a value < 0 if i1 is located before i2,
0 if i1 and i2 point to the same location,
a value > 0 if i1 is located after i2.

val next : t -> index -> index
next s i
returns the position of the head of the Unicode character
located immediately after i.
If i is inside of s, the function always succeeds.
If i is inside of s and there is no Unicode character after i,
a position outside of s is returned.
If i is not inside of s, the behaviour is unspecified.

val prev : t -> index -> index
prev s i
returns the position of the head of the Unicode character
located immediately before i.
If i is inside of s, the function always succeeds.
If i is inside of s and there is no Unicode character before i,
a position outside of s is returned.
If i is not inside of s, the behaviour is unspecified.

val move : t -> index -> int -> index
move s i n
returns the position of the n-th Unicode character after i if n >= 0,
or of the n-th Unicode character before i if n < 0.
If there is no such character, the result is unspecified.

val iter : (UChar.uchar -> unit) -> t -> unit
iter f s
applies f to all Unicode characters in s.
The order of application is the same as the order
of the Unicode characters in s.

val compare : t -> t -> int
compare s1 s2 returns
a positive integer if s1 > s2,
0 if s1 = s2,
a negative integer if s1 < s2.

module Buf : sig .. end
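
The position-based functions above (nth, look, next, out_of_range, move, compare_index) can be combined into a traversal that avoids the O(n) cost of calling get for every character. The sketch below makes the same assumptions as any use of this module from the Camomile library: UTF8 and UChar are reached through `open CamomileLibrary` (the path may differ between versions), UChar.code returns the code point as an int, and out_of_range returns true once the position has left the string. The function names code_points and code_points_iter are hypothetical.

```ocaml
(* Assumption: UTF8 and UChar are reachable through this module path. *)
open CamomileLibrary

(* Walk s position by position and collect the code points in order. *)
let code_points (s : UTF8.t) : int list =
  let rec go acc i =
    (* Assumed: out_of_range is true when i no longer points inside s. *)
    if UTF8.out_of_range s i then List.rev acc
    else go (UChar.code (UTF8.look s i) :: acc) (UTF8.next s i)
  in
  go [] (UTF8.nth s 0)

(* The same traversal expressed with iter. *)
let code_points_iter (s : UTF8.t) : int list =
  let acc = ref [] in
  UTF8.iter (fun u -> acc := UChar.code u :: !acc) s;
  List.rev !acc

let () =
  (* "a", U+00E9, U+20AC: characters encoded in 1, 2 and 3 bytes. *)
  let s = "a\xc3\xa9\xe2\x82\xac" in
  assert (code_points s = [0x61; 0xE9; 0x20AC]);
  assert (code_points_iter s = code_points s);
  (* move advances by characters from a given position. *)
  let first = UTF8.nth s 0 in
  let third = UTF8.move s first 2 in
  assert (UChar.code (UTF8.look s third) = 0x20AC);
  assert (UTF8.compare_index s first third < 0);
  (* substring takes character counts, not positions. *)
  assert (UTF8.substring s 1 2 = "\xc3\xa9\xe2\x82\xac")
```

Each step of the explicit walk costs one call to next, so the whole traversal is linear in the length of the string, whereas calling get s n inside a loop over n would repeat the O(n) scan on every iteration.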