chsvlib
chsv helper source code

ucptou8_s()

errno_t Chusov::String::ucptou8_s(int *restrict pStatus, char *restrict pUtf, rsize_t cbUtf, ucp_t ucp) noexcept

Securely converts a given Unicode 11.0 code point to its UTF-8 representation.

Parameters
[out] pStatus is a mandatory pointer to an output buffer that receives a number describing the result of the conversion. The buffer is not modified if any of the runtime-constraints given below is violated. Otherwise, if the pUtf pointer is NULL, the function writes 0 to the buffer pointed to by pStatus to indicate that the UTF-8 encoding is state-independent. Otherwise, the function writes either the number of bytes written to the pUtf buffer or -1, depending on whether the code point ucp respectively does or does not correspond to a valid UTF-8 code. In no case is the signed integer pointed to by pStatus set to a value greater than the UTF8_LEN_MAX macro.
[out] pUtf is a buffer where the multibyte UTF-8 equivalent of ucp can be written. At most UTF8_LEN_MAX bytes, and never more than cbUtf bytes, are stored into the buffer. If pUtf is NULL, the function only returns 0 (to indicate that the UTF-8 conversion is stateless) and ignores other parameters except for pStatus.
[in] cbUtf is the maximum number of bytes that can be written to the buffer pointed to by the pUtf parameter. If pUtf is NULL, cbUtf must be zero. If pUtf is not NULL, cbUtf must be neither greater than RSIZE_MAX nor less than the number of bytes needed to represent ucp. If any of these conditions is not met, the function signals a runtime-constraint violation and interrupts the normal execution of code (see below). A value of UTF8_LEN_MAX is enough to represent any UTF-8 character.
[in] ucp is the Unicode 11.0 code point to encode into UTF-8 format. If pUtf is not NULL, the UTF-8 code is stored into the pUtf buffer.
Returns
The function returns zero if successful, or an appropriate non-zero errno code if there is a runtime-constraint violation or ucp does not correspond to a valid multibyte character.

The function implements the conversion of a Unicode 11.0 code point into the corresponding UTF-8 multibyte character. The function is built in a portable way to perform the conversion to the UTF-8 format independently of the current locale.
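The locale-independent conversion described above can be illustrated with a minimal sketch. It is not the chsvlib implementation: the names Utf8Sketch, encode_utf8_sketch and self_check_sketch are hypothetical, and the rejection of surrogates and of values above U+10FFFF follows the general definition of UTF-8 rather than anything confirmed about this library.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical result type: the encoded bytes and their count.
struct Utf8Sketch {
    std::array<unsigned char, 4> bytes{};
    std::size_t len = 0;  // 0 means "cp cannot be encoded"
};

// Hypothetical encoder; uses only integer arithmetic, so no locale is involved.
inline Utf8Sketch encode_utf8_sketch(std::uint32_t cp) noexcept
{
    const auto byte = [](std::uint32_t v) { return static_cast<unsigned char>(v); };
    // Continuation byte: 10xxxxxx carrying the low six bits of v.
    const auto tail = [&byte](std::uint32_t v) { return byte(0x80u | (v & 0x3Fu)); };

    Utf8Sketch r;
    if (cp < 0x80u) {                        // 0xxxxxxx
        r.bytes[0] = byte(cp);
        r.len = 1;
    } else if (cp < 0x800u) {                // 110xxxxx 10xxxxxx
        r.bytes[0] = byte(0xC0u | (cp >> 6));
        r.bytes[1] = tail(cp);
        r.len = 2;
    } else if (cp < 0x10000u) {              // 1110xxxx 10xxxxxx 10xxxxxx
        if (cp >= 0xD800u && cp <= 0xDFFFu)
            return r;                        // surrogates are not encodable
        r.bytes[0] = byte(0xE0u | (cp >> 12));
        r.bytes[1] = tail(cp >> 6);
        r.bytes[2] = tail(cp);
        r.len = 3;
    } else if (cp <= 0x10FFFFu) {            // 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        r.bytes[0] = byte(0xF0u | (cp >> 18));
        r.bytes[1] = tail(cp >> 12);
        r.bytes[2] = tail(cp >> 6);
        r.bytes[3] = tail(cp);
        r.len = 4;
    }
    return r;                                // len stays 0 above U+10FFFF
}

// Usage: U+20AC (EURO SIGN) encodes to the three bytes E2 82 AC.
inline void self_check_sketch()
{
    const Utf8Sketch euro = encode_utf8_sketch(0x20ACu);
    assert(euro.len == 3);
    assert(euro.bytes[0] == 0xE2 && euro.bytes[1] == 0x82 && euro.bytes[2] == 0xAC);
}
```

Returning the bytes by value keeps the sketch free of buffer-size concerns, which the real function handles through cbUtf and its runtime-constraints.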

It is a secure variant of the ucptou8 function. The relation of ucptou8_s to ucptou8 is similar to that of the analogous wctomb_s function to its non-secure counterpart wctomb. The wctomb_s function is defined by the C11 standard (Annex K) as well as by ISO/IEC TR 24731-1, an extension to the C99 standard.

The function verifies that the following runtime-constraints are not violated by a call:

  1. Let n denote the number of bytes needed for UTF-8 representation of ucp.
  2. If pUtf is not NULL, then cbUtf must not be less than n, and cbUtf must not be greater than RSIZE_MAX. If pUtf is NULL, then cbUtf shall equal zero.
  3. If there is a runtime-constraint violation, the function does not modify the number pointed to by pStatus, and, if pUtf is not NULL, no more than cbUtf bytes of pUtf will be accessed.
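The checks listed above can be sketched as a standalone predicate. This is an illustration under stated assumptions, not chsvlib code: check_constraints_sketch, utf8_length_sketch and kRsizeMaxSketch are hypothetical names, the stand-in value for RSIZE_MAX is a guess (standard C++ does not define that macro), and the specific errno values are chosen for the example only. The sketch also only reports a violation, whereas the documented function additionally interrupts normal execution.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <cstdint>

// Assumption: stand-in for RSIZE_MAX; the real value is implementation-defined.
constexpr std::size_t kRsizeMaxSketch = SIZE_MAX >> 1;

// n from item 1: bytes needed for cp. The caller is assumed to have already
// verified that cp is a valid code point.
constexpr std::size_t utf8_length_sketch(std::uint32_t cp) noexcept
{
    return cp < 0x80u ? 1 : cp < 0x800u ? 2 : cp < 0x10000u ? 3 : 4;
}

// Hypothetical check for item 2. Returns 0 if the constraints hold, otherwise
// a non-zero errno value (EINVAL and ERANGE are assumptions of this sketch).
// The buffer is never dereferenced here, so no byte of pUtf is accessed.
inline int check_constraints_sketch(const char* pUtf, std::size_t cbUtf,
                                    std::uint32_t cp) noexcept
{
    if (pUtf == nullptr)
        return cbUtf == 0 ? 0 : EINVAL;      // NULL buffer requires cbUtf == 0
    if (cbUtf > kRsizeMaxSketch)
        return EINVAL;                       // size exceeds the RSIZE_MAX stand-in
    if (cbUtf < utf8_length_sketch(cp))
        return ERANGE;                       // buffer too small for n bytes
    return 0;
}
```

A caller following item 3 would leave the integer behind pStatus untouched whenever this predicate returns a non-zero value.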

Also, unlike the non-secure ucptou8 counterpart, the function verifies that the code point specified by ucp is valid and can be represented by a valid code in the UTF-8 code space according to Unicode 11.0.

Warning
The implementation adheres to the requirements defined by the C11 standard and the ISO/IEC TR 24731-1 extension of C99 for the analogous wctomb_s function, but differs from the Microsoft wctomb_s, which does not conform to the standard interface definition.

The differences are the following:

  1. In case of an error, the standard function does not necessarily set the errno code, while the Microsoft definition always does.
  2. The standard requires that cbUtf must not be greater than RSIZE_MAX, whereas the Microsoft implementation defines the limit as INT_MAX.
See also
ucptou8;
ucpstou8s_s;
u8toucp_s.