◆ u8_to_ucp() [2/2]

constexpr ucp_t Chusov::String::u8_to_ucp(InputIterator itSymbol, std::size_t cbSymbol = `UTF8_MAX_LEN`)

constexpr noexcept

Converts a UTF-8 code referenced by an iterator to a Unicode 11.0 code point.

Template Parameters

do_full_checks	specifies whether the function should perform all validity checks on the input UTF-8 code, as specified by section 3.9 of the Unicode 11.0 standard. If the flag is `false`, the function performs only the basic checks necessary for the operation, and the validity of the returned code point is not guaranteed.
throw_on_failure	A boolean flag which specifies whether the function should throw an exception when the multi-byte value addressed by `itSymbol` specifies an invalid or incomplete UTF-8 character. If the flag is `false`, the function is marked `noexcept` and in case of failure returns INVALID_UNICODE_CODE_POINT. The default value is `true`.
InputIterator	is a deduced parameter: the type of `itSymbol`.
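
The `throw_on_failure` flag described above changes both the failure behavior and the exception specification. One common way to express such a contract in C++ is a conditional `noexcept` combined with tag dispatch. The following is only a minimal sketch of that pattern, not the library's implementation: the namespace `sketch_flag`, the helper `ascii_to_ucp` (which accepts single-byte codes only), the alias `ucp_t`, the sentinel `kInvalid` and the use of `std::invalid_argument` are all hypothetical stand-ins.

```cpp
#include <stdexcept>
#include <type_traits>

namespace sketch_flag {

using ucp_t = char32_t;                 // assumption: stand-in for the library's ucp_t
constexpr ucp_t kInvalid = 0xFFFFFFFF;  // assumption: stand-in for INVALID_UNICODE_CODE_POINT

// Failure path when throw_on_failure is false: report through the return value.
constexpr ucp_t fail(std::false_type) noexcept { return kInvalid; }

// Failure path when throw_on_failure is true: report through an exception.
// std::invalid_argument is a placeholder for the library's exception type.
inline ucp_t fail(std::true_type)
{
    throw std::invalid_argument("invalid or incomplete UTF-8 code");
}

// Hypothetical helper: converts a single-byte (ASCII) UTF-8 code only.
// The function is noexcept exactly when throw_on_failure is false.
template <bool throw_on_failure = true>
constexpr ucp_t ascii_to_ucp(char ch) noexcept(!throw_on_failure)
{
    return static_cast<unsigned char>(ch) < 0x80
        ? static_cast<ucp_t>(static_cast<unsigned char>(ch))
        : fail(std::integral_constant<bool, throw_on_failure>{});
}

} // namespace sketch_flag
```

With this shape, `noexcept(sketch_flag::ascii_to_ucp<false>('A'))` is `true`, while the same expression with `<true>` is `false`. Both specializations remain usable in constant expressions for valid input.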

Parameters

itSymbol	is an iterator referencing the multi-byte character to convert to a Unicode code point. The iterator must meet the [InputIterator] requirements and its elements must be of the `char` type.
cbSymbol	is the maximum number of bytes, i.e. elements addressed by `itSymbol`, to inspect. The default value is UTF8_MAX_LEN.

Returns: On success, the function returns the Unicode 11.0 code point corresponding to the multi-byte UTF-8 character addressed by itSymbol. If the conversion fails and throw_on_failure is false, the function returns INVALID_UNICODE_CODE_POINT.
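
To illustrate how an iterator, a byte budget and a sentinel return value can fit together, here is a minimal sketch of a bounded decoder that performs only basic structural checks. It is an illustration under stated assumptions, not the library's code: `sketch_decode::decode_basic`, `ucp_t`, `kInvalid` and `kMaxLen` (assumed to be 4, standing in for UTF8_MAX_LEN) are hypothetical names.

```cpp
#include <cstddef>

namespace sketch_decode {

using ucp_t = char32_t;                 // assumption: stand-in for the library's ucp_t
constexpr ucp_t kInvalid = 0xFFFFFFFF;  // assumption: stand-in for INVALID_UNICODE_CODE_POINT
constexpr std::size_t kMaxLen = 4;      // assumption: stand-in for UTF8_MAX_LEN

// Hypothetical decoder: reads at most `cb` bytes starting at `it`.
// Only the lead-byte pattern and the continuation-byte markers are checked.
template <class InputIterator>
constexpr ucp_t decode_basic(InputIterator it, std::size_t cb = kMaxLen) noexcept
{
    if (cb == 0) return kInvalid;
    const unsigned char lead = static_cast<unsigned char>(*it);
    std::size_t len = 0;
    ucp_t cp = 0;
    if (lead < 0x80)                { len = 1; cp = lead; }
    else if ((lead & 0xE0) == 0xC0) { len = 2; cp = lead & 0x1F; }
    else if ((lead & 0xF0) == 0xE0) { len = 3; cp = lead & 0x0F; }
    else if ((lead & 0xF8) == 0xF0) { len = 4; cp = lead & 0x07; }
    else return kInvalid;           // stray continuation byte or invalid lead byte
    if (len > cb) return kInvalid;  // character is incomplete within the byte budget
    for (std::size_t i = 1; i < len; ++i) {
        const unsigned char c = static_cast<unsigned char>(*++it);
        if ((c & 0xC0) != 0x80) return kInvalid;  // not a continuation byte
        cp = (cp << 6) | (c & 0x3F);
    }
    return cp;
}

} // namespace sketch_decode
```

For example, `sketch_decode::decode_basic("\xE2\x82\xAC")` yields U+20AC, while the same call with a budget of 2 bytes yields the sentinel because the three-byte character does not fit.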

The function is constexpr when compiled by a C++14 compiler.

Exceptions

Chusov::Exceptions::InvalidCharSequenceException	The multi-byte character given by the parameters is not a valid or complete UTF-8 code. The exception is only thrown when throw_on_failure is true.

Note: A successful call guarantees the validity of the UTF-8 character only when do_full_checks is true.
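
As an illustration of what goes beyond basic structural decoding, well-formed UTF-8 must use the shortest possible encoding, must not encode surrogate code points (U+D800 to U+DFFF), and must not exceed U+10FFFF. The sketch below expresses these three rules as a check on an already decoded value. The names `sketch_checks::passes_full_checks` and `ucp_t` are hypothetical, and this is not a description of how the library performs its checks.

```cpp
#include <cstddef>

namespace sketch_checks {

using ucp_t = char32_t;  // assumption: stand-in for the library's ucp_t

// Hypothetical helper: returns true if `cp`, decoded from `len` bytes,
// satisfies the shortest-form, surrogate and range rules.
constexpr bool passes_full_checks(ucp_t cp, std::size_t len) noexcept
{
    // Minimum number of bytes needed to encode cp in UTF-8.
    const std::size_t min_len = cp < 0x80 ? 1u : cp < 0x800 ? 2u : cp < 0x10000 ? 3u : 4u;
    if (len != min_len) return false;                // overlong (non-shortest) form
    if (cp >= 0xD800 && cp <= 0xDFFF) return false;  // surrogate code point
    return cp <= 0x10FFFF;                           // beyond the Unicode code space
}

} // namespace sketch_checks
```

For instance, the byte pair C0 80 is structurally a two-byte sequence that decodes to 0, but `passes_full_checks(0, 2)` is `false` because U+0000 requires only one byte.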

See also

u8_to_ucp_advance	Performs the same conversion but also returns a copy of the iterator in the new position.
read_u8_char_data	Obtains the code point and the byte size of the UTF-8 character, and performs all of the validity checks at once.
ucp_to_u8	Converts a given Unicode code point to a sequence of bytes of the corresponding UTF-8 code.