chsvlib
chsv helper source code

◆ u8len()

int Chusov::String::u8len(const char* pUtf, std::size_t cbUtf) noexcept

Determines the number of bytes occupied by a UTF-8 character.

Parameters
[in] pUtf is a pointer to a multibyte character in UTF-8 format. The pointer can be NULL, in which case the function reports that the encoding is state-independent (this mirrors the behavior required of the analogous mblen function of the C standard).
[in] cbUtf is the maximum number of bytes the function may inspect while reading the character.
Returns
If pUtf is NULL, the function returns 0, because the UTF-8 encoding is state-independent. If pUtf is not NULL, the function returns 0 if pUtf points to a null multibyte character; otherwise it returns the number of bytes occupied by the multibyte UTF-8 character, or -1 if the cbUtf bytes of the input buffer do not provide valid and sufficient data to determine the byte size of the character. In the last case the function also sets errno to EILSEQ.
Warning
A successful call does not guarantee that the multibyte code in pUtf is valid. To verify the code, check that cbUtf is not less than the returned value, then call u8check with the actual byte length.
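The follow-up check described in this warning can be sketched as below. This is only an illustration under stated assumptions: sketch_char_is_complete is a hypothetical helper that is not part of chsvlib, its len argument stands for the value a caller received from u8len, and its loop is a simplified stand-in for u8check, whose real signature and rules are not shown here. It tests only the 10xxxxxx pattern of the trailing bytes and does not detect overlong forms or surrogates.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper, NOT part of chsvlib.
// len   : the value previously returned by u8len for the character at pUtf.
// cbUtf : the number of bytes actually available in the buffer.
bool sketch_char_is_complete(const char* pUtf, std::size_t cbUtf, int len) noexcept
{
    if (pUtf == nullptr || len <= 0)
        return false;                       // null character or error: nothing to verify
    if (cbUtf < static_cast<std::size_t>(len))
        return false;                       // the buffer ends before the character does
    // Simplified stand-in for u8check: every trailing byte must be 10xxxxxx.
    for (int i = 1; i < len; ++i) {
        const unsigned char b = static_cast<unsigned char>(pUtf[i]);
        if ((b & 0xC0) != 0x80)
            return false;
    }
    return true;
}

// Usage: the three-byte encoding of U+20AC with full and truncated buffers.
void sketch_char_is_complete_usage()
{
    const char euro[] = "\xE2\x82\xAC";
    assert(sketch_char_is_complete(euro, 3, 3));
    assert(!sketch_char_is_complete(euro, 2, 3));   // reported length exceeds cbUtf
}
```

Passing the length in as a parameter keeps the sketch independent of how the length was obtained; in real code the value would come from u8len and the loop would be replaced by a call to u8check.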

If pUtf is not a null pointer, the u8len function determines the number of bytes expected in the pUtf buffer to encode a UTF-8 character.
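One way such a length could be derived is from the lead byte alone, which would also explain why a successful result may exceed cbUtf. The sketch below is not the chsvlib implementation: u8len_sketch and kSketchMaxLen are hypothetical names, the 4-byte limit is the RFC 3629 bound and is only assumed (the actual value of UTF8_MAX_LEN is not stated here), and treating cbUtf == 0 as an error is likewise an assumption based on the documented return contract.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>

// Hypothetical stand-in for illustration only; NOT the chsvlib implementation.
// Assumes the RFC 3629 limit of 4 bytes per character; the library's
// UTF8_MAX_LEN may differ.
constexpr int kSketchMaxLen = 4;

int u8len_sketch(const char* pUtf, std::size_t cbUtf) noexcept
{
    if (pUtf == nullptr)
        return 0;                                   // UTF-8 is state-independent
    if (cbUtf == 0) {
        errno = EILSEQ;                             // no bytes to inspect (assumed behavior)
        return -1;
    }
    const unsigned char lead = static_cast<unsigned char>(*pUtf);
    if (lead == 0x00)          return 0;            // null multibyte character
    if (lead < 0x80)           return 1;            // 0xxxxxxx
    if ((lead & 0xE0) == 0xC0) return 2;            // 110xxxxx
    if ((lead & 0xF0) == 0xE0) return 3;            // 1110xxxx
    if ((lead & 0xF8) == 0xF0) return kSketchMaxLen; // 11110xxx
    errno = EILSEQ;                                 // continuation byte or 0xF8..0xFF as lead
    return -1;
}

// Usage: the result depends only on the lead byte, so it can exceed cbUtf.
void u8len_sketch_usage()
{
    const char euro[] = "\xE2\x82\xAC";             // U+20AC, three bytes
    assert(u8len_sketch(euro, 3) == 3);
    assert(u8len_sketch(euro, 1) == 3);             // caller must still compare with cbUtf
    assert(u8len_sketch("", 1) == 0);
    assert(u8len_sketch(nullptr, 0) == 0);
}
```

Because the sketch inspects only the lead byte, it accepts lead bytes such as 0xC0 or 0xF5 that can never begin a valid character; rejecting those is left to a separate validation step.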

A call to the function is equivalent to

u8towc((wchar_t *)0, (const char *)0, 0);
u8towc((wchar_t *)0, pUtf, cbUtf);

The function is portable and analyzes the input UTF-8 character independently of the current C locale.

The interface mirrors that of the mblen function specified by the C standard, except that the maximum possible byte length of a UTF-8 character is UTF8_MAX_LEN rather than MB_CUR_MAX or MB_LEN_MAX.

See also
u8check;
u8toucp;
ucptou8;
u8stoucps.