chsvlib
chsv helper source code

u8_to_ucp() [2/2]

constexpr ucp_t Chusov::String::u8_to_ucp(InputIterator itSymbol,
                                          std::size_t cbSymbol = UTF8_MAX_LEN)
constexpr noexcept

Converts a UTF-8 code referenced by an iterator to a Unicode 11.0 code point.

Template Parameters
do_full_checks: specifies whether the function should perform all validity checks on the input UTF-8 code, as specified by section 3.9 of the Unicode 11.0 standard. If the flag is false, the function performs only the basic checks necessary for the operation, and the validity of the returned code point is not guaranteed.
throw_on_failure: a boolean flag which specifies whether the function should throw an exception when the multi-byte value addressed by itSymbol is an invalid or incomplete UTF-8 character. If the flag is false, the function is marked noexcept and returns INVALID_UNICODE_CODE_POINT on failure. The default value is true.
InputIterator: a deduced parameter, the type of itSymbol.
Parameters
itSymbol: an iterator referencing the multi-byte character to convert to a Unicode code point. The iterator must meet the InputIterator requirements and its elements must be of the char type.
cbSymbol: the maximum number of bytes, i.e. elements addressed by itSymbol, to inspect. The default value is UTF8_MAX_LEN.
Returns
On success, the function returns the Unicode 11.0 code point corresponding to the multi-byte UTF-8 character addressed by itSymbol. If the conversion fails and throw_on_failure is false, the function returns INVALID_UNICODE_CODE_POINT.

The function is constexpr when compiled by a C++14 compiler.

Exceptions
Chusov::Exceptions::InvalidCharSequenceException: the multi-byte character given by the parameters is not a valid or complete UTF-8 code. The exception is thrown only when throw_on_failure is true.
Note
A successful call guarantees the validity of the UTF-8 character only when do_full_checks is true.
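
The difference between the basic checks and the full checks can be illustrated with a minimal standalone sketch. It is not the chsvlib implementation: the names decode_utf8_sketch, code_point, invalid_code_point and utf8_max_len are hypothetical stand-ins, and the sketch assumes that "basic checks" cover only the lead byte, the byte budget and the continuation-byte markers, while "full checks" additionally reject overlong forms, surrogates and values above U+10FFFF. Failure is always reported through a sentinel value, similar in spirit to the throw_on_failure = false mode.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins for illustration only; not chsvlib's definitions.
using code_point = std::uint32_t;
constexpr code_point invalid_code_point = 0xFFFFFFFFu;
constexpr std::size_t utf8_max_len = 4;

// Decodes one UTF-8 sequence starting at `it`, reading at most `cb` bytes.
// full_checks == false: only the lead byte, the byte budget and the
// continuation-byte markers are verified.
// full_checks == true: overlong forms, surrogates and values above
// U+10FFFF are rejected as well.
template <bool full_checks, class InputIterator>
code_point decode_utf8_sketch(InputIterator it,
                              std::size_t cb = utf8_max_len) noexcept
{
    if (cb == 0) return invalid_code_point;

    const unsigned char lead = static_cast<unsigned char>(*it);
    std::size_t len = 0;
    code_point cp = 0;

    // Derive the sequence length and the payload bits from the lead byte.
    if (lead < 0x80)                { len = 1; cp = lead; }
    else if ((lead & 0xE0) == 0xC0) { len = 2; cp = lead & 0x1Fu; }
    else if ((lead & 0xF0) == 0xE0) { len = 3; cp = lead & 0x0Fu; }
    else if ((lead & 0xF8) == 0xF0) { len = 4; cp = lead & 0x07u; }
    else return invalid_code_point;  // stray continuation or invalid lead byte

    if (len > cb) return invalid_code_point;  // incomplete within the budget

    // Each continuation byte must look like 10xxxxxx and adds 6 payload bits.
    for (std::size_t i = 1; i < len; ++i) {
        const unsigned char b = static_cast<unsigned char>(*++it);
        if ((b & 0xC0) != 0x80) return invalid_code_point;
        cp = (cp << 6) | (b & 0x3Fu);
    }

    if (full_checks) {
        // Smallest code point that legitimately needs `len` bytes.
        constexpr code_point min_for_len[] = {0, 0, 0x80, 0x800, 0x10000};
        if (cp < min_for_len[len]) return invalid_code_point;         // overlong
        if (cp >= 0xD800 && cp <= 0xDFFF) return invalid_code_point;  // surrogate
        if (cp > 0x10FFFF) return invalid_code_point;                 // out of range
    }
    return cp;
}
```

In this sketch, the overlong sequence "\xC0\x80" decodes to 0 when full_checks is false but is rejected when it is true, which mirrors why a successful return only implies a valid UTF-8 character under the full checks.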
See also
u8_to_ucp_advance: performs the same conversion but also returns a copy of the iterator advanced to the new position.
read_u8_char_data: obtains the code point and the byte size of the UTF-8 character, and performs all of the validity checks at once.
ucp_to_u8: converts a given Unicode code point to the sequence of bytes of the corresponding UTF-8 code.