Returns the byte length of the UTF-8 character addressed by an iterator.
- Template Parameters
-
do_full_checks | Specifies whether the function performs all validity checks on the input UTF-8 code, as specified in section 3.9 of the Unicode 11.0 standard. If the flag is false, the function performs only the basic checks necessary for its operation, i.e. that cbSymbol is greater than or equal to 1, so that the first byte of the UTF-8 code, which carries the length-encoding prefix, can be read. The default value is true. |
throw_on_failure | Specifies whether the function throws an exception when the multi-byte value addressed by itSymbol is an invalid or incomplete UTF-8 character. If the flag is false, the function is marked noexcept and in case of failure returns. The default value is true. |
InputIterator | A deducible parameter: the type of itSymbol. |
- Parameters
-
itSymbol | An iterator referencing the multi-byte character whose size is to be obtained. The iterator must meet the InputIterator requirements and its elements must be of the char type. |
cbSymbol | The maximum number of bytes, i.e. elements addressed by itSymbol, to inspect. The default value is UTF8_MAX_LEN. |
- Returns
- On success the function returns the number of bytes occupied by the UTF-8 character addressed by itSymbol. On failure, if throw_on_failure is false, the function returns.
The function is constexpr when compiled by a C++14 compiler.
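To illustrate what a length-only lookup involves, here is a minimal sketch of deriving the byte length from the length-encoding prefix of the first byte, roughly what the basic checks (do_full_checks set to false) amount to. The name sketch_u8_size_basic, the pointer-based signature, and the use of 0 as a failure value are assumptions made for this example. They are not the documented function or its actual failure value.

```cpp
#include <cstddef>

// Hypothetical helper for illustration only, not the library's function.
// Returns the byte length implied by the lead byte, or 0 on failure
// (0 is an assumed failure value chosen for this sketch).
constexpr std::size_t sketch_u8_size_basic(const char* it, std::size_t cb) noexcept
{
    if (cb < 1) return 0;                       // no byte available to read
    const unsigned char lead = static_cast<unsigned char>(*it);
    if (lead < 0x80)           return 1;        // 0xxxxxxx: single byte
    if ((lead & 0xE0) == 0xC0) return 2;        // 110xxxxx: two bytes
    if ((lead & 0xF0) == 0xE0) return 3;        // 1110xxxx: three bytes
    if ((lead & 0xF8) == 0xF0) return 4;        // 11110xxx: four bytes
    return 0;                                   // continuation byte or invalid prefix
}
```

Because the sketch uses only constructs permitted in C++14 constant expressions, it can also be evaluated at compile time, for example inside a static_assert.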
- Exceptions
-
- Note
- A successful call to the function guarantees the validity of the UTF-8 character only when do_full_checks is true. For performance reasons it may be preferable to avoid repeated full per-byte validation of UTF-8 characters and to validate only once. To do that, call the function with do_full_checks set to false and later validate the value using the u8_check function, or just call the read_u8_char_data function to perform all the necessary checks and obtain the Unicode code point and the byte length of its UTF-8 representation at once.
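The difference between the two checking modes can be sketched as follows. This is an illustrative stand-in, not the library's implementation: the names sketch_in_range and sketch_u8_size are hypothetical, a runtime bool replaces the do_full_checks template parameter for brevity, a plain pointer replaces the generic iterator, and 0 is an assumed failure value. The full checks follow the well-formed byte ranges of section 3.9 of the Unicode standard.

```cpp
#include <cstddef>

// Hypothetical helpers for illustration; 0 is an assumed failure value.
constexpr bool sketch_in_range(char c, unsigned char lo, unsigned char hi) noexcept
{
    return static_cast<unsigned char>(c) >= lo && static_cast<unsigned char>(c) <= hi;
}

constexpr std::size_t sketch_u8_size(const char* it, std::size_t cb, bool full) noexcept
{
    if (cb < 1) return 0;                        // basic check: one byte must be readable
    const unsigned char b0 = static_cast<unsigned char>(it[0]);
    std::size_t len = 0;
    if (b0 < 0x80)             len = 1;
    else if ((b0 & 0xE0) == 0xC0) len = 2;
    else if ((b0 & 0xF0) == 0xE0) len = 3;
    else if ((b0 & 0xF8) == 0xF0) len = 4;
    if (len == 0 || !full) return len;           // length-only mode stops here

    if (len > cb) return 0;                      // incomplete sequence
    if (len == 1) return 1;
    if (b0 < 0xC2 || b0 > 0xF4) return 0;        // overlong (C0, C1) or beyond U+10FFFF

    // The permitted range of the second byte depends on the lead byte.
    unsigned char lo = 0x80, hi = 0xBF;
    if (b0 == 0xE0)      lo = 0xA0;              // excludes overlong 3-byte forms
    else if (b0 == 0xED) hi = 0x9F;              // excludes surrogates
    else if (b0 == 0xF0) lo = 0x90;              // excludes overlong 4-byte forms
    else if (b0 == 0xF4) hi = 0x8F;              // excludes values above U+10FFFF
    if (!sketch_in_range(it[1], lo, hi)) return 0;

    for (std::size_t i = 2; i < len; ++i)        // remaining bytes are plain continuations
        if (!sketch_in_range(it[i], 0x80, 0xBF)) return 0;
    return len;
}
```

With this sketch, the overlong sequence "\xC0\x80" yields 2 in length-only mode but is rejected when full checks are enabled, which is why a length obtained without full checks still needs a later validation pass.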