chsvlib
chsv helper source code

u8stoucps_s()

errno_t Chusov::String::u8stoucps_s(std::size_t *restrict pcchConverted,
                                    ucp_t *restrict pUcp,
                                    rsize_t cchUcpMax,
                                    const char *restrict pszUtf,
                                    rsize_t cchUtf) noexcept

Converts the specified sequence of UTF-8 characters into a sequence of the corresponding Unicode code points. If the output buffer is given by a non-NULL pointer, the function stores at most the specified number of code points in the buffer and terminates the output string with a null code point.

Parameters
[out] pcchConverted is a mandatory pointer to an output variable that receives the result of the conversion operation. If the pointer is NULL, there is a runtime-constraint violation. If the pointer is not NULL and either another runtime constraint is violated or an encoding error occurs, *pcchConverted is set to (size_t)(-1). Otherwise, the function stores there the number of multibyte UTF-8 characters of the pszUtf string that were successfully converted to Unicode code points, not counting the terminating null character (if any).
[out] pUcp is an optional pointer to an output buffer receiving at most cchUcpMax code points (see remarks) of the resulting string. If a runtime-constraint violation occurs, the pUcp pointer is not NULL, and cchUcpMax is greater than 0 and less than or equal to RSIZE_MAX, pUcp[0] is set to the null code point.
[in] cchUcpMax is the maximum number of code points to be written to the pUcp buffer. If pUcp is NULL, cchUcpMax must be zero. If pUcp is not NULL, cchUcpMax must be neither zero nor greater than RSIZE_MAX. If the first cchUtf characters of the UTF-8 string do not contain a null character, cchUcpMax must be greater than cchUtf. If any of these conditions is not met, there is a runtime-constraint violation (see below).
[in] pszUtf is a mandatory pointer to the UTF-8 string to be converted to a string of code points. No UTF-8 characters that follow a null character (which is converted into a null code point and, if pUcp is not NULL, stored) are examined or converted. If pUcp is NULL, the UTF-8 string must be zero-terminated, because all of its multibyte characters are converted and the value of cchUtf is ignored. If pUcp is not NULL, the maximum number of multibyte characters to be converted is specified by cchUtf. If the UTF-8 string is zero-terminated within the first cchUtf characters, the null character is converted and written to the output buffer, and the rest of the multibyte string is ignored. If cchUtf is not less than the maximum number of characters that can be written to the output buffer, the input UTF-8 string must be zero-terminated within the first cchUcpMax characters. Otherwise, there is a runtime-constraint violation.
[in] cchUtf is the maximum number of characters to be written to the buffer pointed to by the pUcp parameter. No more than that number of wide characters of the buffer will be modified. If pUcp is NULL, the value of cchUtf is ignored by the function.
Returns
The function returns zero if no runtime-constraint violation and no encoding error occurred. Otherwise, a corresponding non-zero error code is returned.
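The parameter rules above suggest a two-pass calling pattern: query the length with a NULL pUcp, then convert into a buffer one element larger than the reported count. The following is only a sketch of that pattern. The `convert` argument is a stand-in for Chusov::String::u8stoucps_s, `char32_t` is assumed in place of ucp_t, and the helper name `to_code_points_sketch` is hypothetical.

```cpp
#include <cstddef>
#include <vector>

// `convert` is a stand-in for Chusov::String::u8stoucps_s (assumption);
// char32_t is assumed in place of ucp_t. Returns false on any error.
template <class Convert>
bool to_code_points_sketch(Convert convert, const char* pszUtf,
                           std::vector<char32_t>& out)
{
    std::size_t cch = 0;

    // Pass 1: no output buffer, so cchUcpMax must be zero and cchUtf is ignored.
    // pszUtf must be zero-terminated in this mode.
    if (convert(&cch, nullptr, 0, pszUtf, 0) != 0)
        return false;

    // The reported count excludes the terminator: reserve one extra element.
    out.assign(cch + 1, U'\0');

    // Pass 2: cchUtf (cch) is less than cchUcpMax (cch + 1), so the buffer
    // always has room for the terminating null code point.
    if (convert(&cch, out.data(), out.size(), pszUtf, cch) != 0)
        return false;

    out.resize(cch);  // drop the terminator from the vector
    return true;
}
```

With the real library, the call would presumably look like `to_code_points_sketch(Chusov::String::u8stoucps_s, text, result)`, provided ucp_t is compatible with the element type chosen here.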

The function implements the conversion of the specified UTF-8 multibyte string to a string of corresponding Unicode 11.0 code points. The function is built in a portable way to perform the conversion of UTF-8 codes independently of the current locale.

It is a secure variant of the u8stoucps function. The relation of u8stoucps_s to u8stoucps is similar to that of the analogous mbstowcs_s function to its non-secure mbstowcs counterpart. The mbstowcs_s function is defined by the C11 standard (Annex K) as well as by ISO/IEC TR 24731-1, an extension to the C99 standard.

The function verifies adherence to the following runtime-constraints.

  1. Neither pcchConverted nor pszUtf shall be a null pointer.
  2. If pUcp is not a null pointer, then neither cchUtf nor cchUcpMax shall be greater than RSIZE_MAX.
  3. If pUcp is a null pointer, then cchUcpMax shall equal zero.
  4. If pUcp is not a null pointer, then cchUcpMax shall not equal zero.
  5. If pUcp is not a null pointer and cchUtf is not less than cchUcpMax, then a null character shall occur within the first cchUcpMax multibyte characters of the array pointed to by pszUtf.

If there is a runtime-constraint violation, the function does the following. If pcchConverted is not a NULL pointer, *pcchConverted is set to (size_t)(-1). If pUcp is not a NULL pointer and cchUcpMax is greater than zero and less than RSIZE_MAX, pUcp[0] is set to the null wide character.
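
The checks listed above can be sketched as a single validation routine. This is an illustration, not the library's code: `ucp_sketch_t`, `kRsizeMaxSketch`, and both function names are hypothetical, and the value chosen for RSIZE_MAX is an assumption. The scan for constraint 5 counts every byte that is not a UTF-8 continuation byte as the start of a character.

```cpp
#include <cstddef>
#include <cstdint>

// Stand-ins (assumptions): the real ucp_t, rsize_t and RSIZE_MAX are provided
// by chsvlib or the C library and may be defined differently.
using ucp_sketch_t = char32_t;
constexpr std::size_t kRsizeMaxSketch = SIZE_MAX >> 1;

// True if a null character is among the first cchMax multibyte characters.
// A byte that is not a continuation byte (10xxxxxx) starts a new character.
bool null_within_sketch(const char* psz, std::size_t cchMax)
{
    std::size_t seen = 0;  // characters started so far
    for (auto p = reinterpret_cast<const unsigned char*>(psz); ; ++p) {
        if ((*p & 0xC0) == 0x80) continue;   // continuation byte
        if (seen == cchMax) return false;    // already examined cchMax characters
        ++seen;
        if (*p == 0) return true;
    }
}

// Returns true if constraints 1 to 5 hold. Otherwise applies the failure
// handling (count set to (size_t)(-1), pUcp[0] cleared) and returns false.
bool check_constraints_sketch(std::size_t* pcchConverted, ucp_sketch_t* pUcp,
                              std::size_t cchUcpMax, const char* pszUtf,
                              std::size_t cchUtf)
{
    bool ok = pcchConverted != nullptr && pszUtf != nullptr;          // 1
    if (pUcp != nullptr) {
        ok = ok && cchUtf <= kRsizeMaxSketch
                && cchUcpMax <= kRsizeMaxSketch;                      // 2
        ok = ok && cchUcpMax != 0;                                    // 4
        if (ok && cchUtf >= cchUcpMax)
            ok = null_within_sketch(pszUtf, cchUcpMax);               // 5
    } else {
        ok = ok && cchUcpMax == 0;                                    // 3
    }
    if (!ok) {
        if (pcchConverted != nullptr)
            *pcchConverted = static_cast<std::size_t>(-1);
        if (pUcp != nullptr && cchUcpMax > 0 && cchUcpMax < kRsizeMaxSketch)
            pUcp[0] = 0;
    }
    return ok;
}
```

For example, converting "abc" into a buffer of three code points with cchUtf not less than 3 fails constraint 5, because the terminator would be the fourth character.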

From these runtime constraints it follows that, if pUcp is not NULL, *pcchConverted receives the actual number of wide characters written to pUcp, not counting the terminating null wide character.

Also, unlike the non-secure u8stoucps counterpart, the function verifies that all of the converted UTF-8 codes are valid and are represented by valid code points in accordance with the Unicode 11.0 standard.
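
To make the kind of validation mentioned above concrete, here is a minimal sketch of a checked decoder for a single UTF-8 sequence. It is not taken from chsvlib: the function name is hypothetical, `char32_t` is assumed in place of ucp_t, and the checks cover only well-formedness (lead and continuation bytes, overlong forms, surrogates, the U+10FFFF limit). Whether the library also rejects code points unassigned in Unicode 11.0 is not stated here, and the sketch does not attempt it.

```cpp
#include <cstddef>

// Decodes one UTF-8 sequence from s, reading at most avail bytes.
// Returns the number of bytes consumed, or 0 on an encoding error.
std::size_t decode_utf8_sketch(const unsigned char* s, std::size_t avail,
                               char32_t& cp)
{
    if (avail == 0) return 0;
    const unsigned char b0 = s[0];
    std::size_t len = 0;
    char32_t min = 0;  // smallest code point allowed for this length

    if (b0 < 0x80) { cp = b0; return 1; }                 // ASCII
    else if ((b0 & 0xE0) == 0xC0) { len = 2; min = 0x80;    cp = b0 & 0x1F; }
    else if ((b0 & 0xF0) == 0xE0) { len = 3; min = 0x800;   cp = b0 & 0x0F; }
    else if ((b0 & 0xF8) == 0xF0) { len = 4; min = 0x10000; cp = b0 & 0x07; }
    else return 0;  // stray continuation byte or invalid lead byte

    if (avail < len) return 0;                            // truncated sequence
    for (std::size_t i = 1; i < len; ++i) {
        if ((s[i] & 0xC0) != 0x80) return 0;              // not a continuation byte
        cp = (cp << 6) | (s[i] & 0x3F);
    }
    if (cp < min) return 0;                               // overlong encoding
    if (cp >= 0xD800 && cp <= 0xDFFF) return 0;           // UTF-16 surrogate
    if (cp > 0x10FFFF) return 0;                          // beyond Unicode range
    return len;
}
```

A converter built on such a routine would call it in a loop, stop at the first null code point, and report an encoding error as soon as it returns 0.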

Warning
The implementation adheres to the requirements defined by the C11 standard and by the ISO/IEC TR 24731-1 extension of C99 for the analogous mbstowcs_s function, but differs from the Microsoft mbstowcs_s, which does not conform to the standard interface definition.

The differences are the following:

  1. On successful conversion, the standard requires that the output value of *pcchConverted, i.e. the number of successfully converted characters, not include the null terminator, whereas the Microsoft definition includes it.
  2. On error, the standard function does not necessarily set errno, while the Microsoft definition always does.
  3. The standard requires that neither cchUcpMax nor cchUtf be greater than RSIZE_MAX, whereas Microsoft does not define this requirement.
  4. The standard treats a NULL pcchConverted as a runtime-constraint violation, whereas Microsoft does not.
See also
u8stoucps;
ucpstou8s_s;
u8toucp.