◆ u8blen() [1/3]

constexpr std::size_t Chusov::String::u8blen ( const char(&pSymbol)[N] )

constexpr noexcept

Returns the byte size of the UTF-8 character stored in the specified array of bytes.

Template Parameters

do_full_checks	A boolean flag which specifies whether the function should perform all checks for validity of the input UTF-8 code as specified by section 3.9 of the Unicode 11.0 standard. If the flag is `false`, the function only performs the basic checks necessary for the operation, i.e. whether `cbSymbol` is greater than or equal to 1, so that the first byte of the UTF-8 code, which carries the length-encoding prefix, can be read. The default value is `true`.
throw_on_failure	A boolean flag which specifies whether the function should throw an exception when the multi-byte value in the array specifies an invalid or incomplete UTF-8 character. If the flag is `false`, the function is marked `noexcept` and returns std::size_t(-1) on failure. The default value is `true`.
N	The size of the array `pSymbol`, deduced from the argument.

Parameters

pSymbol is an array of char elements containing the UTF-8 code to read. Its size is not required to be equal to the actual byte size of the UTF-8 code; it may be greater.
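To illustrate how an array reference lets the size N be deduced while the array itself may be longer than the character it starts with, here is a minimal sketch. The name sketch_lead_length is hypothetical and this is not the library's implementation: it only inspects the lead byte, which roughly corresponds to the basic checks described for do_full_checks set to `false`.

```cpp
#include <cstddef>

// Hypothetical stand-in, NOT Chusov::String::u8blen. It derives the byte
// length from the length-encoding prefix of the first byte only.
template <std::size_t N>
constexpr std::size_t sketch_lead_length(const char (&pSymbol)[N]) noexcept
{
    static_assert(N >= 1, "at least one byte is needed to read the lead byte");
    const unsigned char lead = static_cast<unsigned char>(pSymbol[0]);
    return lead < 0x80           ? 1                        // 0xxxxxxx: ASCII
         : (lead & 0xE0) == 0xC0 ? 2                        // 110xxxxx
         : (lead & 0xF0) == 0xE0 ? 3                        // 1110xxxx
         : (lead & 0xF8) == 0xF0 ? 4                        // 11110xxx
         : static_cast<std::size_t>(-1);                    // not a lead byte
}

// The array may be larger than the character it starts with:
// "\xE2\x82\xAC!" holds the 3-byte euro sign, '!' and the terminating '\0',
// so N is deduced as 5 while the reported length is 3.
static_assert(sketch_lead_length("\xE2\x82\xAC!") == 3, "euro sign occupies 3 bytes");
static_assert(sketch_lead_length("A") == 1, "ASCII occupies 1 byte");
```

Because the sketch never looks past the first byte, it cannot detect truncated or malformed trailing bytes; that is the trade-off the do_full_checks flag controls.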

Returns: On success, the function returns the number of bytes occupied by the UTF-8 character in pSymbol. On failure, if throw_on_failure is false, the function returns std::size_t(-1).

The function is constexpr when compiled by a C++14 compiler.

Exceptions

Chusov::Exceptions::InvalidCharSequenceException The multi-byte character given by the parameters is not a valid or complete UTF-8 code. The exception is only thrown when throw_on_failure is true.
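The two failure-reporting modes selected by throw_on_failure can be sketched as follows. This is an illustrative assumption, not the library's code: sketch_u8blen, sketch_fail and sketch_failure are hypothetical names, std::invalid_argument stands in for Chusov::Exceptions::InvalidCharSequenceException, and the checks are limited to the lead byte, the array size and the continuation-byte pattern rather than the full set from section 3.9.

```cpp
#include <cstddef>
#include <stdexcept>
#include <type_traits>

// Sentinel documented for the non-throwing mode.
constexpr std::size_t sketch_failure = static_cast<std::size_t>(-1);

// Failure reporting is selected at compile time by tag dispatch.
inline std::size_t sketch_fail(std::true_type)
{
    // Assumption: std::invalid_argument replaces the library's exception type.
    throw std::invalid_argument("invalid or incomplete UTF-8 character");
}
constexpr std::size_t sketch_fail(std::false_type) noexcept { return sketch_failure; }

template <bool throw_on_failure = true, std::size_t N>
constexpr std::size_t sketch_u8blen(const char (&pSymbol)[N]) noexcept(!throw_on_failure)
{
    using mode = std::integral_constant<bool, throw_on_failure>;
    const unsigned char lead = static_cast<unsigned char>(pSymbol[0]);
    const std::size_t len = lead < 0x80           ? 1
                          : (lead & 0xE0) == 0xC0 ? 2
                          : (lead & 0xF0) == 0xE0 ? 3
                          : (lead & 0xF8) == 0xF0 ? 4
                          : sketch_failure;
    // Invalid lead byte, or the array is too short for the announced length.
    if (len == sketch_failure || len > N)
        return sketch_fail(mode{});
    // Every trailing byte must look like 10xxxxxx.
    for (std::size_t i = 1; i < len; ++i)
        if ((static_cast<unsigned char>(pSymbol[i]) & 0xC0) != 0x80)
            return sketch_fail(mode{});
    return len;
}

// Non-throwing mode: noexcept, failure is reported through the sentinel.
constexpr char sketch_truncated[] = {'\xE2', '\x82'};  // 3-byte lead, 2 bytes present
static_assert(noexcept(sketch_u8blen<false>(sketch_truncated)), "noexcept when not throwing");
static_assert(sketch_u8blen<false>(sketch_truncated) == sketch_failure, "incomplete character");
static_assert(sketch_u8blen("\xE2\x82\xAC") == 3, "valid input works in both modes");

// Throwing mode (the default): the same input raises an exception at run time.
inline bool sketch_throws_on_truncated()
{
    try { sketch_u8blen(sketch_truncated); }
    catch (const std::invalid_argument&) { return true; }
    return false;
}
```

A caller using the non-throwing mode has to compare the result against std::size_t(-1) before using it as a length, since the sentinel is otherwise indistinguishable from a very large size.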

Note: Success of a call to the function only guarantees the validity of the UTF-8 character when do_full_checks is true. For performance reasons it might be preferable to avoid repeated full per-byte validation of UTF-8 characters and to validate only once. To do that, one may call the function with do_full_checks set to false and later validate the value using the u8_check function, or just call the read_u8_char_data function to perform all the necessary checks and obtain both the Unicode code point and the byte length of its UTF-8 representation at once.
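The split between a cheap length query and a single later validation can be sketched like this. Both functions are hypothetical stand-ins whose names and signatures are assumptions: sketch_len_unchecked plays the role of u8blen with do_full_checks set to false, and sketch_u8_check plays the role of u8_check, following the well-formed byte ranges of section 3.9 of the Unicode standard.

```cpp
#include <cstddef>

// Cheap step: length from the lead byte only, no validation of the rest.
constexpr std::size_t sketch_len_unchecked(char first) noexcept
{
    const unsigned char lead = static_cast<unsigned char>(first);
    return lead < 0x80           ? 1
         : (lead & 0xE0) == 0xC0 ? 2
         : (lead & 0xF0) == 0xE0 ? 3
         : (lead & 0xF8) == 0xF0 ? 4
         : static_cast<std::size_t>(-1);
}

// Full step: checks that the first cb bytes at p begin with one
// well-formed UTF-8 character.
constexpr bool sketch_u8_check(const char* p, std::size_t cb) noexcept
{
    if (cb == 0) return false;
    const unsigned char b0 = static_cast<unsigned char>(p[0]);
    if (b0 < 0x80) return true;                 // ASCII
    if (b0 < 0xC2 || b0 > 0xF4) return false;   // continuation, overlong or out-of-range lead
    const std::size_t len = b0 < 0xE0 ? 2 : b0 < 0xF0 ? 3 : 4;
    if (cb < len) return false;                 // incomplete character
    // The allowed range of the second byte depends on the lead byte.
    unsigned char lo = 0x80, hi = 0xBF;
    if (b0 == 0xE0) lo = 0xA0;                  // rejects overlong 3-byte forms
    else if (b0 == 0xED) hi = 0x9F;             // rejects surrogates U+D800..U+DFFF
    else if (b0 == 0xF0) lo = 0x90;             // rejects overlong 4-byte forms
    else if (b0 == 0xF4) hi = 0x8F;             // rejects values above U+10FFFF
    const unsigned char b1 = static_cast<unsigned char>(p[1]);
    if (b1 < lo || b1 > hi) return false;
    for (std::size_t i = 2; i < len; ++i)
        if ((static_cast<unsigned char>(p[i]) & 0xC0) != 0x80) return false;
    return true;
}

// An overlong encoding of U+0000 passes the cheap step but fails validation.
constexpr char sketch_overlong[] = {'\xC0', '\x80'};
static_assert(sketch_len_unchecked(sketch_overlong[0]) == 2, "cheap step reports 2 bytes");
static_assert(!sketch_u8_check(sketch_overlong, 2), "full validation rejects it");

// A well-formed character passes both steps.
constexpr char sketch_euro[] = {'\xE2', '\x82', '\xAC'};
static_assert(sketch_u8_check(sketch_euro, sketch_len_unchecked(sketch_euro[0])), "euro sign is valid");
```

The overlong example shows why a successful length query without full checks says nothing about validity: the lead byte alone announces a plausible length even though the sequence is ill-formed.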