chsvlib
chsv helper source code

u8towc_s()

errno_t Chusov::String::u8towc_s(int *restrict pStatus,
                                 wchar_t *restrict pUcs,
                                 const char *restrict pUtf,
                                 rsize_t cbUtf) noexcept

Securely converts one multibyte UTF-8 code to a wide character, which is assumed to hold the code in the UCS format.

Parameters
[out] pStatus  receives the status of the conversion. If the conversion succeeds, pStatus receives the number of bytes occupied by the UTF-8 character pointed to by pUtf. If pUtf points to an invalid UTF-8 code, or (if pUcs is not NULL) to a code which, when converted to UCS, does not fit into a variable of the wchar_t type, pStatus receives -1 and the function returns EILSEQ. If the call violates a runtime constraint, the value of pStatus is not changed.
[out] pUcs  is an optional pointer to a buffer that receives the wide character corresponding to the code specified by pUtf.
[in] pUtf  is a pointer to a UTF-8 code to be converted to a wide character.
[in] cbUtf  specifies the maximum number of bytes to read from pUtf.
Returns
The function returns zero if successful, or an appropriate non-zero errno code if a runtime constraint is violated or pUtf does not point to a valid multibyte character.
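
The contract described by the parameter list and return value can be illustrated with a small stand-in decoder. This is only a sketch under stated assumptions, not the chsvlib implementation: the name sketch_u8towc is hypothetical, std::size_t replaces rsize_t, EINVAL is an assumed code for a runtime-constraint violation (the documentation only says "an appropriate non-zero errno code"), and treating a NULL pStatus or a sequence longer than cbUtf as errors is likewise an assumption.

```cpp
#include <cerrno>
#include <cstddef>
#include <cstdint>
#include <limits>

// Hypothetical stand-in, NOT chsvlib's implementation. It decodes one UTF-8
// sequence and reports the outcome the way the parameter list describes.
int sketch_u8towc(int* pStatus, wchar_t* pUcs, const char* pUtf, std::size_t cbUtf)
{
    // Assumed handling of a constraint violation: EINVAL, *pStatus untouched.
    if (pStatus == nullptr || pUtf == nullptr)
        return EINVAL;

    const unsigned char* p = reinterpret_cast<const unsigned char*>(pUtf);
    std::uint32_t cp = 0;   // decoded code point
    std::uint32_t min = 0;  // smallest code point allowed for this length
    std::size_t len = 0;    // sequence length implied by the lead byte

    if (cbUtf > 0) {
        const unsigned char b0 = p[0];
        if (b0 < 0x80)                { cp = b0;        len = 1; min = 0x0; }
        else if ((b0 & 0xE0) == 0xC0) { cp = b0 & 0x1F; len = 2; min = 0x80; }
        else if ((b0 & 0xF0) == 0xE0) { cp = b0 & 0x0F; len = 3; min = 0x800; }
        else if ((b0 & 0xF8) == 0xF0) { cp = b0 & 0x07; len = 4; min = 0x10000; }
    }

    // The whole sequence must lie within the cbUtf bytes the caller allows.
    bool ok = len != 0 && len <= cbUtf;
    for (std::size_t i = 1; ok && i < len; ++i) {
        ok = (p[i] & 0xC0) == 0x80;          // continuation bytes are 10xxxxxx
        cp = (cp << 6) | (p[i] & 0x3Fu);
    }
    // Reject overlong forms, surrogates, and values above U+10FFFF.
    ok = ok && cp >= min && cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF);
    // The wchar_t range only matters when the caller wants the character.
    if (ok && pUcs != nullptr)
        ok = cp <= static_cast<std::uint32_t>(std::numeric_limits<wchar_t>::max());

    if (!ok) {
        *pStatus = -1;
        return EILSEQ;
    }
    if (pUcs != nullptr)
        *pUcs = static_cast<wchar_t>(cp);
    *pStatus = static_cast<int>(len);
    return 0;
}
```

A caller would typically test the return value first and read the byte count from the status variable only on success, for example to advance through a buffer one character at a time.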

The function converts a UTF-8 encoded character to the corresponding UCS-encoded form held by a wide character, independently of the current locale. The result of the conversion depends on the size of the wchar_t type. For instance, on Windows sizeof(wchar_t) equals 2, which is not enough to cover all possible Unicode code points, so a conversion to UCS-2 takes place. On the other hand, some Linux compilers define the size of the wchar_t type as 4, in which case the function performs a UCS-4 based conversion.
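
The dependence on the width of wchar_t can be sketched as a range check. The helper name fits_in_wchar_t is hypothetical and is not part of chsvlib; it only illustrates why a code point outside the Basic Multilingual Plane is representable with a 4-byte wchar_t but not with a 2-byte one.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper (not part of chsvlib): true if the code point can be
// stored in a single wchar_t on this platform without losing data.
constexpr bool fits_in_wchar_t(std::uint32_t codePoint)
{
    return codePoint <=
           static_cast<std::uint32_t>(std::numeric_limits<wchar_t>::max());
}
```

With a 2-byte wchar_t the check fails for U+1F600, while with a 4-byte wchar_t it succeeds.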

If pUcs is not NULL and converting the UTF-8 code at pUtf to wchar_t would result in a loss of data, the function returns EILSEQ and sets pStatus to -1.

It is a secure variant of the u8towc function. Unlike the other encoding conversion functions of chsvlib, u8towc_s has no standardized analogue; it is introduced to verify that pUtf is not NULL and that cbUtf is not greater than RSIZE_MAX. Additionally, unlike u8towc, the function verifies that the code point belongs to the set of valid codes defined for UTF-8 by the Unicode 11.0 standard.
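
The two kinds of verification mentioned above can be sketched separately. Everything named here is hypothetical: kSketchRsizeMax stands in for RSIZE_MAX (which many toolchains do not provide) and follows the C11 Annex K recommendation of SIZE_MAX >> 1; EINVAL is an assumed error code; and the scalar-value test reflects the Unicode definition of well-formed UTF-8, which may or may not match the exact checks chsvlib performs.

```cpp
#include <cerrno>
#include <cstddef>
#include <cstdint>

// Assumption: stand-in for RSIZE_MAX, following the Annex K recommendation.
constexpr std::size_t kSketchRsizeMax = SIZE_MAX >> 1;

// Hypothetical pre-check mirroring the documented runtime constraints.
// Returns 0 when the arguments are acceptable, EINVAL (assumed) otherwise.
int sketch_check_constraints(const char* pUtf, std::size_t cbUtf)
{
    if (pUtf == nullptr)
        return EINVAL;              // pUtf must not be NULL
    if (cbUtf > kSketchRsizeMax)
        return EINVAL;              // cbUtf must not exceed RSIZE_MAX
    return 0;
}

// Well-formed UTF-8 encodes only Unicode scalar values:
// U+0000..U+D7FF and U+E000..U+10FFFF (surrogates are excluded).
constexpr bool is_unicode_scalar_value(std::uint32_t cp)
{
    return cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF);
}
```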

See also
wctou8_s;
wcstou8s_s;
u8towc.