chsvlib
chsv helper source code

wcstou8s_s()

errno_t Chusov::String::wcstou8s_s(
    std::size_t *restrict pcbUtf,
    char *restrict pUtf,
    rsize_t cbUtfMax,
    const wchar_t *restrict pUcs,
    rsize_t cbMaxConvert
) noexcept

Converts the specified sequence of UCS-encoded wide characters into the corresponding sequence of UTF-8 multibyte characters. If the output buffer is given by a non-NULL pointer, the function stores the converted UTF-8 multibyte string in the buffer, up to the specified number of bytes, and terminates it with a null character.

Parameters
[out] pcbUtf is a mandatory pointer to an output variable that receives the result of the conversion operation. If the pointer is NULL, there is a runtime-constraint violation. If the pointer is not NULL and either another runtime-constraint is violated or an encoding error occurs, *pcbUtf is set to (size_t)-1. Otherwise, the function stores there the number of bytes of the UTF-8 multibyte string produced by the conversion, not counting the terminating null character (if any).
[out] pUtf is an optional pointer to an output buffer receiving at most cbUtfMax bytes (see remarks) of the converted UTF-8 string. If a runtime-constraint violation occurs, the pUtf pointer is not NULL, and cbUtfMax is greater than 0 and less than or equal to RSIZE_MAX, pUtf[0] is set to the null character.
[in] cbUtfMax is the maximum number of bytes to be written to the pUtf buffer. If the pUtf pointer is NULL, the value of cbUtfMax must be zero. If pUtf is not NULL, cbUtfMax must be nonzero and must not be greater than RSIZE_MAX. If the wide string pointed to by pUcs is not null-terminated within the number of wide characters that corresponds to cbMaxConvert bytes of the UTF-8 string, cbUtfMax must be greater than cbMaxConvert. If any of these conditions is not met, there is a runtime-constraint violation (see below).
[in] pUcs is a mandatory pointer to a UCS-2 (or UCS-4, if objects of the wchar_t type are large enough to hold UCS-4 characters) wide string to be converted to its UTF-8 multibyte equivalent. No wide characters that follow the first null wide character (which is converted into a null multibyte character) are examined or converted. If the pUtf pointer to the output buffer is NULL, the pUcs string must be zero-terminated, because all of its wide characters are converted and the value of cbMaxConvert is ignored. If pUtf is not NULL, the maximum number of wide characters to be converted is chosen so that the converted result is not larger than cbMaxConvert bytes. If the pUcs string is zero-terminated within the bound that corresponds to cbMaxConvert bytes of the multibyte string, the null wide character is converted and written to the output buffer, and the rest of the wide string is ignored. If cbMaxConvert is greater than the maximum number of bytes to write to the output buffer, the input UCS string must be zero-terminated within the bound given by the number of characters that corresponds to cbUtfMax bytes of the UTF-8 string. Otherwise, there is a runtime-constraint violation.
[in] cbMaxConvert is the maximum number of bytes to write to the output buffer. No more than that number of bytes of the buffer will be modified. If the pUtf pointer is NULL, the parameter is ignored by the function.
Returns
The function returns zero if no runtime-constraint violation and no encoding error occurred. Otherwise, a nonzero error code is returned.
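
When choosing a value for cbUtfMax, a conservative upper bound on the output size can help. The sketch below is not part of chsvlib; the names max_utf8_bytes_per_wchar and utf8_buffer_bound are hypothetical. It relies only on the general UTF-8 property that a code point below U+10000 needs at most 3 bytes and any code point needs at most 4, and it assumes a 2-byte wchar_t carries UCS-2 values only.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>

// Hypothetical helper, not part of chsvlib.
// Worst-case UTF-8 bytes produced by one wide character:
// 3 for a 2-byte (UCS-2) wchar_t, 4 for a 4-byte (UCS-4) wchar_t.
constexpr std::size_t max_utf8_bytes_per_wchar = sizeof(wchar_t) >= 4 ? 4 : 3;

// Upper bound, in bytes, on the UTF-8 output for cwc wide characters,
// including one byte for the terminating null character.
// Returns 0 if the bound does not fit in std::size_t.
inline std::size_t utf8_buffer_bound(std::size_t cwc)
{
    constexpr std::size_t max = std::numeric_limits<std::size_t>::max();
    if (cwc > (max - 1) / max_utf8_bytes_per_wchar)
        return 0; // the multiplication below would overflow
    return cwc * max_utf8_bytes_per_wchar + 1;
}
```

A buffer of utf8_buffer_bound(n) bytes is large enough for any string of n wide characters plus the terminator. A caller who wants the exact size can instead pass a NULL pUtf with cbUtfMax equal to zero and read the byte count from *pcbUtf, as described for the parameters above.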

The function converts the specified UCS-2 (or UCS-4, if a wchar_t object can hold UCS-4 values) wide string to the corresponding UTF-8 multibyte string. The conversion is performed independently of the current locale. Its result depends on the size of the wchar_t type. For instance, on Windows sizeof(wchar_t) equals 2, which is not enough to cover all possible Unicode code points; in this case the input is treated as UCS-2. On the other hand, some Linux compilers define the size of the wchar_t type as 4, which results in a UCS-4 based conversion.
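
To illustrate what a locale-independent conversion of a single code point involves, here is a minimal sketch of the standard UTF-8 bit layout. It is not the chsvlib implementation; encode_utf8 and wchar_holds_ucs4 are hypothetical names, and the sketch rejects only surrogates and values above U+10FFFF, which may be narrower or wider than the checks the library performs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// True when wchar_t is wide enough to carry any code point (UCS-4 case).
constexpr bool wchar_holds_ucs4 = sizeof(wchar_t) >= 4;

// Hypothetical helper, not part of chsvlib.
// Encodes one code point as UTF-8 into out (room for 4 bytes required).
// Returns the number of bytes written, or 0 for a surrogate (U+D800..U+DFFF)
// or a value above U+10FFFF.
inline std::size_t encode_utf8(std::uint32_t cp, unsigned char* out)
{
    assert(out != nullptr);
    if (cp < 0x80) {                       // 0xxxxxxx
        out[0] = static_cast<unsigned char>(cp);
        return 1;
    }
    if (cp < 0x800) {                      // 110xxxxx 10xxxxxx
        out[0] = static_cast<unsigned char>(0xC0 | (cp >> 6));
        out[1] = static_cast<unsigned char>(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {                    // 1110xxxx 10xxxxxx 10xxxxxx
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;                      // surrogates are not characters
        out[0] = static_cast<unsigned char>(0xE0 | (cp >> 12));
        out[1] = static_cast<unsigned char>(0x80 | ((cp >> 6) & 0x3F));
        out[2] = static_cast<unsigned char>(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {                  // 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        out[0] = static_cast<unsigned char>(0xF0 | (cp >> 18));
        out[1] = static_cast<unsigned char>(0x80 | ((cp >> 12) & 0x3F));
        out[2] = static_cast<unsigned char>(0x80 | ((cp >> 6) & 0x3F));
        out[3] = static_cast<unsigned char>(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;                              // beyond the Unicode code space
}
```

With a 2-byte wchar_t only the first three branches can be reached, which is why the UCS-2 case never produces 4-byte sequences.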

It is a secure variant of the wcstou8s function. The relation of wcstou8s_s to wcstou8s is similar to that of the analogous wcstombs_s function to its non-secure wcstombs counterpart. The wcstombs_s function is defined by the C11 standard (Annex K) as well as by the ISO/IEC TR 24731-1 extension to the C99 standard.

The function verifies adherence to the following runtime-constraints.

  1. Neither pcbUtf nor pUcs shall be a null pointer.
  2. If pUtf is not a null pointer, then neither cbMaxConvert nor cbUtfMax shall be greater than RSIZE_MAX.
  3. If pUtf is a null pointer, then cbUtfMax shall equal zero.
  4. If pUtf is not a null pointer, then cbUtfMax shall not equal zero.
  5. If pUtf is not a null pointer and cbMaxConvert is not less than cbUtfMax, then the conversion shall have been stopped because a terminating null wide character was reached or because an encoding error occurred.
  6. If there is a runtime-constraint violation, then the function does the following. If pcbUtf is not a NULL pointer, then *pcbUtf is set to (size_t)(-1). If pUtf is not a NULL pointer and cbUtfMax is greater than zero and not greater than RSIZE_MAX, then pUtf[0] is set to the null character.

It follows from these runtime-constraints that if pUtf is not NULL, *pcbUtf receives the actual number of bytes written to pUtf, not counting the terminating null character.
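
The argument checks in the list of runtime-constraints can be sketched as a standalone predicate. This is an illustration under stated assumptions, not the library's code: precheck_violation and kRsizeMax are hypothetical names, kRsizeMax merely stands in for RSIZE_MAX (which many C++ toolchains do not define), and constraint 5 is left out because it can only be evaluated while the conversion runs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumption: stand-in for RSIZE_MAX; the real value is implementation-defined.
constexpr std::size_t kRsizeMax = SIZE_MAX >> 1;

// Hypothetical helper, not part of chsvlib.
// Checks constraints 1-4 on the arguments. On a violation it applies the
// side effects described in item 6 and returns true.
inline bool precheck_violation(std::size_t* pcbUtf, char* pUtf,
                               std::size_t cbUtfMax, const wchar_t* pUcs,
                               std::size_t cbMaxConvert)
{
    bool bad = (pcbUtf == nullptr) || (pUcs == nullptr);   // constraint 1
    if (pUtf != nullptr) {
        bad = bad || cbUtfMax > kRsizeMax
                  || cbMaxConvert > kRsizeMax              // constraint 2
                  || cbUtfMax == 0;                        // constraint 4
    } else {
        bad = bad || cbUtfMax != 0;                        // constraint 3
    }
    if (bad) {
        if (pcbUtf != nullptr)
            *pcbUtf = static_cast<std::size_t>(-1);
        // pUtf[0] is cleared only when the buffer size itself is usable.
        if (pUtf != nullptr && cbUtfMax > 0 && cbUtfMax <= kRsizeMax)
            pUtf[0] = '\0';
    }
    return bad;
}
```

Note that a non-NULL pUtf with cbUtfMax equal to zero is a violation that leaves the buffer untouched, since there is no byte that may safely be written.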

Also, unlike the non-secure wcstou8s counterpart, the function verifies that all of the converted code points are valid as specified by the Unicode 11.0 standard.

Warning
The implementation adheres to the requirements defined by the C11 standard and the ISO/IEC TR 24731-1 extension of C99 for the analogous wcstombs_s function, but differs from the Microsoft wcstombs_s, which does not conform to the standard interface definition.

The differences are the following:

  1. In the case of a successful conversion the standard requires that the output value of *pcbUtf, i.e. the number of successfully converted bytes, not include the null terminator, whereas the Microsoft definition includes it.
  2. In case of an error the standard function does not necessarily set the errno code, while the Microsoft definition always does.
  3. The standard requires that neither cbUtfMax nor cbMaxConvert be greater than RSIZE_MAX, whereas Microsoft does not define this requirement.
  4. The standard considers equality of pcbUtf to NULL a runtime-constraint violation whereas Microsoft does not assert it.
See also
wcstou8s;
u8stowcs_s;
wctou8_s.