Module iconv
The iconv
module provides a way to convert a string with
one encoding to a string with another encoding, for example from ASCII
to UTF-8. It is based on the POSIX iconv routines.
An exact list of the available encodings may depend on environment.
Typically the list includes ASCII, BIG5, KOI8R, LATIN8, MS-GREEK, SJIS,
and about 100 others. For a complete list, type iconv --list
on a
terminal.
Below is a list of all iconv
functions.
Name | Use |
---|---|
iconv.new() | Create an iconv instance |
iconv.converter() | Perform conversion on a string |
-
iconv.
new
(to, from)¶ Construct a new iconv instance.
Parameters: Return: a new iconv instance – in effect, a callable function
Rtype: userdata
If either parameter is not a valid name, there will be an error message.
Example:
tarantool> converter = require('iconv').new('UTF8', 'ASCII') --- ...
-
iconv.
converter
(input-string)¶ Convert.
param string input-string: the string to be converted (the “from” string) return: the string that results from the conversion (the “to” string) If anything in input-string cannot be converted, there will be an error message and the result string will be unchanged.
Example:
We know that the Unicode code point for “Д” (CYRILLIC CAPITAL LETTER DE) is hexadecimal 0414 according to the character database of Unicode. Therefore that is what it will look like in UTF-16. We know that Tarantool typically uses the UTF-8 character set. So make a from-UTF-8-to-UTF-16 converter, use string.hex(‘Д’) to show what Д’s encoding looks like in the UTF-8 source, and use string.hex(‘Д’-after-conversion) to show what it looks like in the UTF-16 target. Since the result is 0414, we see that iconv conversion works. (Different iconv implementations might use different names, for example UTF-16BE instead of UTF16BE.)
tarantool> string.hex('Д') --- - d094 ... tarantool> converter = require('iconv').new('UTF16BE', 'UTF8') --- ... tarantool> utf16_string = converter('Д') --- ... tarantool> string.hex(utf16_string) --- - '0414' ...