chsvlib
chsv helper source code
chsvstring.h File Reference

Functions and elements used to manage strings, convert them to different encoding schemes and to manage associated memory. More...

#include "chsvprintf.h"
#include "chsvmem.h"
#include <memory>

Classes

struct  AutoCString< CharType >
 A class template that specifies a std::unique_ptr type for managing multibyte and wide C strings allocated by one of the allocation functions of the chsvmem.h header, such as AllocateString, AllocateAndFormatString or AllocateAndFormatStringByTags. An instance of the class template deallocates the corresponding memory using the FreeString function or its derivatives. More...
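
As a rough illustration of the ownership pattern described for AutoCString, the sketch below pairs std::unique_ptr with a custom deleter. Every name prefixed with Sketch is hypothetical, and std::malloc/std::free merely stand in for the chsvlib allocator; the actual definition of AutoCString may differ.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <memory>

// Hypothetical stand-in for AllocateString: room for cch characters
// plus the terminating zero. The real function may use another allocator.
template <class CharType>
CharType* SketchAllocateString(std::size_t cch) noexcept
{
    return static_cast<CharType*>(std::malloc((cch + 1) * sizeof(CharType)));
}

// Hypothetical stand-in for FreeString.
template <class CharType>
void SketchFreeString(CharType* pszString) noexcept
{
    std::free(pszString);
}

// Deleter that routes deallocation to the matching free function.
template <class CharType>
struct SketchStringDeleter
{
    void operator()(CharType* pszString) const noexcept
    {
        SketchFreeString(pszString);
    }
};

// Assumed shape of the managed pointer: unique_ptr with a custom deleter.
template <class CharType>
using SketchAutoCString =
    std::unique_ptr<CharType[], SketchStringDeleter<CharType>>;

// Copies a C string into an owning pointer; the result is empty on failure.
inline SketchAutoCString<char> SketchCopy(const char* pszSrc)
{
    const std::size_t cch = std::strlen(pszSrc);
    SketchAutoCString<char> result(SketchAllocateString<char>(cch));
    if (result)
        std::memcpy(result.get(), pszSrc, cch + 1); // includes the terminator
    return result;
}
```

With this arrangement the buffer returned by SketchCopy is released automatically when the owning object goes out of scope, so no explicit call to the free function is needed.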
 

Namespaces

 Chusov
 Basic chsvlib namespace.
 
 Chusov::String
 A namespace of string manipulation functions and classes.
 

Macros

#define UTF8_MAX_LEN   4
 The maximum number of bytes occupied by a single UTF-8 encoded code point.
 
#define INVALID_UNICODE_CODE_POINT   ((RESOLVE_NAMESPACE(Chusov::String) ucp_t) -1)
 A special invalid Unicode 11.0 code point.
 

Typedefs

typedef unsigned int ucp_t
 The type of a Unicode code point. It is an unsigned integral type capable of storing at least 4 bytes.
 

Functions

std::size_t strLen (const char *pszStr) noexcept
 Returns a size, in bytes, of a zero terminated string excluding the terminating zero. Implements the behaviour of the standard strlen function of the string.h header. More...
 
std::size_t wcsLen (const wchar_t *pszStr) noexcept
 Returns a size, in wide characters, of a zero terminated wide string excluding the terminating zero. Implements the behaviour of the standard wcslen function of the wchar.h header. More...
 
std::size_t strnLen_s (const char *pszStr, std::size_t cchStr) noexcept
 Returns a size, in bytes, of a string excluding the terminating zero, if any. The function implements the strnlen_s function of the ISO/IEC TR 24731-1 standard. More...
 
std::size_t wcsnLen_s (const wchar_t *pszStr, std::size_t cchStr) noexcept
 Returns a size, in wide characters, of a string excluding the terminating zero, if any. The function implements the wcsnlen_s function of the ISO/IEC TR 24731-1 standard. More...
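
The bounded length functions above are described as implementing strnlen_s and wcsnlen_s. Under those semantics a null pointer yields 0, and a string with no terminator within the first cchStr characters yields cchStr. A minimal stand-in, assuming nothing about the chsvlib implementation (SketchNLen_s is a hypothetical name):

```cpp
#include <cstddef>

// Counts characters up to the terminating zero, but never reads
// more than cchStr characters. Works for char and wchar_t alike.
template <class CHAR_TYPE>
std::size_t SketchNLen_s(const CHAR_TYPE* pszStr, std::size_t cchStr) noexcept
{
    if (pszStr == nullptr)
        return 0; // strnlen_s reports 0 for a null pointer

    std::size_t cch = 0;
    while (cch < cchStr && pszStr[cch] != CHAR_TYPE())
        ++cch;
    return cch; // equals cchStr when no terminator was found in range
}
```

A result equal to cchStr is therefore ambiguous: the string is either exactly that long or unterminated within the inspected range, and callers typically treat it as "too long".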
 
char * strCpy (char *restrict pDest, const char *restrict pszSrc) noexcept
 Copies a zero-terminated string from a source buffer to a destination buffer. Implements the behaviour of the standard strcpy function of the string.h header. More...
 
wchar_t * wcsCpy (wchar_t *restrict pDest, const wchar_t *restrict pszSrc) noexcept
 Copies a zero-terminated wide string from a source buffer to a destination buffer. Implements the behaviour of the standard wcscpy function of the wchar.h header. More...
 
errno_t strCpy_s (char *restrict pDest, rsize_t cchDest, const char *restrict pszSrc) noexcept
 Copies a zero-terminated source string, including a terminating null character, to a destination buffer of the specified size. More...
 
errno_t wcsCpy_s (wchar_t *restrict pDest, rsize_t cchDest, const wchar_t *restrict pszSrc) noexcept
 Copies a zero-terminated source wide string, including a terminating null wide character, to a destination buffer of the specified size. More...
 
template<class CHAR_TYPE >
std::size_t tcsLen (const CHAR_TYPE *pszStr) noexcept
 Resolves to strLen, to wcsLen or to ucpsLen depending on the template parameter. More...
 
template<class CHAR_TYPE >
std::size_t tcsnLen_s (const CHAR_TYPE *pszStr, std::size_t cchStr) noexcept
 Resolves to strnLen_s, to wcsnLen_s or to ucpsnLen_s depending on the template parameter. More...
 
template<class CHAR_TYPE >
CHAR_TYPE * tcsCpy (CHAR_TYPE *restrict pszDest, const CHAR_TYPE *restrict pszSrc) noexcept
 Resolves to strCpy, to wcsCpy or to ucpsCpy depending on the template parameter. More...
 
template<class CHAR_TYPE >
errno_t tcsCpy_s (CHAR_TYPE *restrict pszDest, rsize_t cchDest, const CHAR_TYPE *restrict pszSrc) noexcept
 Resolves to strCpy_s, to wcsCpy_s or to ucpsCpy_s depending on the template parameter. More...
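
The tcs* templates above select a concrete function from the character type. One common way to get that effect is overload resolution, sketched here for the length case only. The sketch namespace and its helpers are hypothetical, the standard strlen and wcslen stand in for strLen and wcsLen, and the ucp_t branch is omitted; the library may well dispatch differently.

```cpp
#include <cstddef>
#include <cstring>
#include <cwchar>

namespace sketch
{
    // One overload per supported character type.
    inline std::size_t len(const char* psz) noexcept
    {
        return std::strlen(psz);
    }

    inline std::size_t len(const wchar_t* psz) noexcept
    {
        return std::wcslen(psz);
    }

    // The generic entry point forwards to whichever overload matches.
    template <class CHAR_TYPE>
    std::size_t tcsLen(const CHAR_TYPE* pszStr) noexcept
    {
        return len(pszStr);
    }
}
```

Generic code can then call sketch::tcsLen(psz) without knowing whether psz points to narrow or wide characters.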
 
template<class CHAR_TYPE >
CHAR_TYPE * CstrLiteral (char *pszByteStr, wchar_t *pszWideStr)
 Returns an argument of CHAR_TYPE type. More...
 
template<class CHAR_TYPE >
const CHAR_TYPE * CstrLiteral (const char *pszByteStr, const wchar_t *pszWideStr)
 Returns an argument of const CHAR_TYPE type. More...
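
CstrLiteral returns whichever of its two arguments matches CHAR_TYPE, which lets a template pick between a narrow and a wide spelling of the same literal. The following is a sketch of that idea using explicit specializations; SketchCstrLiteral, SKETCH_LITERAL and SketchGreeting are hypothetical names, not part of chsvlib.

```cpp
// Primary template is only declared; each supported character
// type gets its own specialization.
template <class CHAR_TYPE>
const CHAR_TYPE* SketchCstrLiteral(const char* pszByteStr,
                                   const wchar_t* pszWideStr);

template <>
inline const char* SketchCstrLiteral<char>(const char* pszByteStr,
                                           const wchar_t*)
{
    return pszByteStr;
}

template <>
inline const wchar_t* SketchCstrLiteral<wchar_t>(const char*,
                                                 const wchar_t* pszWideStr)
{
    return pszWideStr;
}

// Hypothetical convenience macro: spells the literal once and
// passes both the narrow and the wide form.
#define SKETCH_LITERAL(CHAR_TYPE, text) \
    SketchCstrLiteral<CHAR_TYPE>(text, L##text)

// Example of a character-type-agnostic function built on top of it.
template <class CHAR_TYPE>
const CHAR_TYPE* SketchGreeting()
{
    return SKETCH_LITERAL(CHAR_TYPE, "hello");
}
```

SketchGreeting<char>() yields the narrow literal and SketchGreeting<wchar_t>() the wide one, with the text written only once.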
 
int GetUniqueStringA (char *lpStr, std::size_t cchStr) noexcept
 Generates a UUID as a C string and writes it into the output buffer. More...
 
int GetUniqueStringW (wchar_t *lpStr, std::size_t cchStr) noexcept
 Generates a UUID as a wide C string and writes it into the output buffer. More...
 
template<class CHAR_TYPE >
bool GetUniqueString (CHAR_TYPE *lpStr, std::size_t cchStr) noexcept
 Generates a UUID as a C string and writes it into the output buffer. More...
 
std::size_t ByteLengthMBS (const char *pszString, std::size_t cchInput) noexcept
 Returns a number of bytes occupied by the cchInput multibyte symbols of a string. More...
 
std::size_t CharacterLengthMBS (const char *pszString, std::size_t cbInput) noexcept
 Returns a number of multibyte symbols that are specified by the cbInput bytes of the pszString encoded as defined by the current locale. More...
 
std::size_t ConvertWideToMBS (char *restrict pStringMB, std::size_t cbStringMB, const wchar_t *restrict pStringW, std::size_t *pcchStringW) noexcept
 Converts a wide string to a multibyte equivalent and returns a number of bytes occupied by the converted multibyte string using the current locale. More...
 
std::size_t ConvertMBSToWide (wchar_t *restrict pStringW, std::size_t cchStringW, const char *restrict pStringMB, std::size_t cbStringMB) noexcept
 Converts a multibyte string to its wide equivalent and returns a number of successfully converted characters using the current locale. More...
 
std::size_t ByteLengthUTF8 (const char *pUtf, std::size_t cchUtf) noexcept
 Returns a number of bytes occupied by the specified number of multibyte symbols of the string specified in the UTF-8 format. More...
 
std::size_t CharacterLengthUTF8 (const char *pszUtf, std::size_t cbUtf) noexcept
 Returns a number of multibyte symbols that are composed from the specified number of bytes of a string encoded in the UTF-8 format. More...
 
std::size_t ConvertUCSToUTF8 (char *restrict pUtf, std::size_t cbUtf, const wchar_t *restrict pUcs, std::size_t *pcchUcs) noexcept
 Converts a wide UCS-2/UCS-4 string to its multibyte UTF-8 equivalent and returns a number of bytes occupied by the converted multibyte string independently of the current C locale. More...
 
std::size_t ConvertUTF8ToUCS (wchar_t *restrict pStringW, std::size_t cchStringW, const char *restrict pStringMB, std::size_t cbStringMB) noexcept
 Converts a multibyte UTF-8 string into its wide UCS-2 (or UCS-4) equivalent and returns a number of successfully converted characters independently of the current C locale. More...
 
constexpr ucp_t operator""_ucp (char ch) noexcept
 A Unicode code point literal notation. More...
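
For readers unfamiliar with character user-defined literals, here is a sketch of how an operator with the signature above can be written and used. The suffix _sketch_ucp and the alias sketch_ucp_t are hypothetical, and the conversion through unsigned char is an assumption about how characters outside the 7-bit range might be treated.

```cpp
// Mirrors the ucp_t typedef documented above (unsigned int).
using sketch_ucp_t = unsigned int;

// Character literal operator: 'A'_sketch_ucp becomes a code point value.
constexpr sketch_ucp_t operator""_sketch_ucp(char ch) noexcept
{
    // Going through unsigned char avoids sign extension on platforms
    // where plain char is signed (an assumption of this sketch).
    return static_cast<sketch_ucp_t>(static_cast<unsigned char>(ch));
}

// The operator is constexpr, so the result is usable at compile time.
static_assert('A'_sketch_ucp == 0x41u, "ASCII 'A' maps to U+0041");
```

Such a literal is convenient when comparing elements of a code point string against ASCII constants without writing explicit casts.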
 
std::size_t ucpsLen (const ucp_t *pszStr) noexcept
 Returns a size, in code points, of a zero terminated string of Unicode code points excluding the terminating zero. Implements the behaviour of the standard strlen function of the string.h header for the type ucp_t. More...
 
std::size_t ucpsnLen_s (const ucp_t *pszStr, std::size_t cchStr) noexcept
 Returns a size, in code points, of a string excluding the terminating zero, if any. The function behaves identically to the strnlen_s function of the ISO/IEC TR 24731-1 standard but is implemented for the type ucp_t. More...
 
ucp_t * ucpsCpy (ucp_t *restrict pDest, const ucp_t *restrict pszSrc) noexcept
 Copies a zero-terminated string of Unicode code points from a source buffer to a destination buffer. Behaves identically to the standard strcpy function but is implemented for the type ucp_t. More...
 
errno_t ucpsCpy_s (ucp_t *restrict pDest, rsize_t cchDest, const ucp_t *restrict pszSrc) noexcept
 Copies a zero-terminated source string of Unicode code points, including a terminating null code, to a destination buffer of a specified size. More...
 
int ucpsCmp (const ucp_t *pszStr1, const ucp_t *pszStr2) noexcept
 Lexicographically compares two strings of Unicode code points. More...
 
int ucpsnCmp (const ucp_t *pszStr1, const ucp_t *pszStr2, std::size_t count) noexcept
 Lexicographically compares two strings of Unicode code points with an explicitly defined maximum number of characters to read and compare. More...
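
Lexicographic comparison of code point strings can be pictured as follows. This is a sketch that assumes the strcmp/strncmp convention of returning a negative, zero or positive value; the Sketch names are hypothetical and the exact return values of ucpsCmp and ucpsnCmp are not stated in this summary.

```cpp
#include <cstddef>

// Mirrors the ucp_t typedef documented above (unsigned int).
using sketch_ucp_t = unsigned int;

// Compares at most count code points, stopping early at a terminating zero.
inline int SketchUcpsnCmp(const sketch_ucp_t* pszStr1,
                          const sketch_ucp_t* pszStr2,
                          std::size_t count) noexcept
{
    for (std::size_t i = 0; i < count; ++i)
    {
        if (pszStr1[i] != pszStr2[i])
            return pszStr1[i] < pszStr2[i] ? -1 : 1; // first difference decides
        if (pszStr1[i] == 0)
            return 0; // both strings ended at the same position
    }
    return 0; // no difference within the first count code points
}

// Unbounded variant: compares until a difference or the terminating zero.
inline int SketchUcpsCmp(const sketch_ucp_t* pszStr1,
                         const sketch_ucp_t* pszStr2) noexcept
{
    return SketchUcpsnCmp(pszStr1, pszStr2, static_cast<std::size_t>(-1));
}
```

Because ucp_t is unsigned, the comparison orders strings by code point value, which is not the same as a locale-aware collation order.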
 
long ucpstol (const ucp_t *restrict str, ucp_t **restrict str_end, int base) noexcept
 Converts a number within a string to the type "long". More...
 
long long ucpstoll (const ucp_t *restrict str, ucp_t **restrict str_end, int base) noexcept
 Converts a number within a string to the type "long long". More...
 
unsigned long ucpstoul (const ucp_t *restrict str, ucp_t **restrict str_end, int base) noexcept
 Converts a number within a string to the type "unsigned long". More...
 
unsigned long long ucpstoull (const ucp_t *restrict str, ucp_t **restrict str_end, int base) noexcept
 Converts a number within a string to the type "unsigned long long". More...
 
float ucpstof (const ucp_t *restrict str, ucp_t **restrict str_end) noexcept
 Converts a floating point number within a string to the type "float". More...
 
double ucpstod (const ucp_t *restrict str, ucp_t **restrict str_end) noexcept
 Converts a floating point number within a string to the type "double". More...
 
long double ucpstold (const ucp_t *restrict str, ucp_t **restrict str_end) noexcept
 Converts a floating point number within a string to the type "long double". More...
 
int u8len (const char *pUtf, std::size_t cbUtf) noexcept
 Determines a number of bytes contained in a UTF-8 character. More...
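
In UTF-8 the length of an encoded character follows from its lead byte, which the sketch below uses to illustrate the kind of check u8len performs. SketchU8Len is a hypothetical name, it inspects the lead byte only, and returning 0 for an invalid lead byte or a too-short buffer is an assumption; the error convention of the real function is not given in this summary.

```cpp
#include <cstddef>

// Returns the length, in bytes, announced by the lead byte at pUtf,
// or 0 if the lead byte is invalid or fewer than that many bytes remain.
inline int SketchU8Len(const char* pUtf, std::size_t cbUtf) noexcept
{
    if (pUtf == nullptr || cbUtf == 0)
        return 0;

    const unsigned char lead = static_cast<unsigned char>(*pUtf);
    int cb = 0;
    if (lead < 0x80)
        cb = 1;                             // 0xxxxxxx: ASCII
    else if (lead >= 0xC2 && lead <= 0xDF)
        cb = 2;                             // 110xxxxx (0xC0, 0xC1 are overlong)
    else if (lead >= 0xE0 && lead <= 0xEF)
        cb = 3;                             // 1110xxxx
    else if (lead >= 0xF0 && lead <= 0xF4)
        cb = 4;                             // 11110xxx, capped at U+10FFFF
    else
        return 0;                           // continuation byte or invalid lead

    return static_cast<std::size_t>(cb) <= cbUtf ? cb : 0;
}
```

The largest value this can return is 4, which matches the UTF8_MAX_LEN macro above. A full validity check would also examine the continuation bytes, which is the role the summary assigns to u8check.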
 
bool u8check (const char *pUtf8, std::size_t cbUtf8) noexcept
 Verifies validity of a given UTF-8 code according to the Unicode 11.0 standard. More...
 
bool u8check_cp (ucp_t ucp) noexcept
 Verifies that the given Unicode code point is representable in UTF-8 format as specified by the Unicode 11.0 standard. More...
 
int ucptou8 (char *pUtf, ucp_t ucp) noexcept
 Determines the number of bytes needed to represent a Unicode 11.0 code point as a multibyte UTF-8 code and optionally performs the conversion and stores the resulting UTF-8 code in the specified buffer. More...
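
The "count, and optionally convert" behaviour described for ucptou8 can be sketched as below. SketchUcpToU8 and sketch_ucp_t are hypothetical names; treating a null pUtf as "count only" and returning -1 for surrogates and values above U+10FFFF are assumptions made for the example, not documented behaviour.

```cpp
// Mirrors the ucp_t typedef documented above (unsigned int).
using sketch_ucp_t = unsigned int;

// Returns the number of UTF-8 bytes needed for ucp and, when pUtf is
// non-null, writes them there (the buffer must hold at least 4 bytes).
inline int SketchUcpToU8(char* pUtf, sketch_ucp_t ucp) noexcept
{
    if (ucp > 0x10FFFFu || (ucp >= 0xD800u && ucp <= 0xDFFFu))
        return -1; // not encodable in UTF-8 (assumed error value)

    if (ucp < 0x80u)
    {
        if (pUtf)
            pUtf[0] = static_cast<char>(ucp);
        return 1;
    }
    if (ucp < 0x800u)
    {
        if (pUtf)
        {
            pUtf[0] = static_cast<char>(0xC0u | (ucp >> 6));
            pUtf[1] = static_cast<char>(0x80u | (ucp & 0x3Fu));
        }
        return 2;
    }
    if (ucp < 0x10000u)
    {
        if (pUtf)
        {
            pUtf[0] = static_cast<char>(0xE0u | (ucp >> 12));
            pUtf[1] = static_cast<char>(0x80u | ((ucp >> 6) & 0x3Fu));
            pUtf[2] = static_cast<char>(0x80u | (ucp & 0x3Fu));
        }
        return 3;
    }
    if (pUtf)
    {
        pUtf[0] = static_cast<char>(0xF0u | (ucp >> 18));
        pUtf[1] = static_cast<char>(0x80u | ((ucp >> 12) & 0x3Fu));
        pUtf[2] = static_cast<char>(0x80u | ((ucp >> 6) & 0x3Fu));
        pUtf[3] = static_cast<char>(0x80u | (ucp & 0x3Fu));
    }
    return 4;
}
```

Calling the function once with a null buffer to obtain the size and a second time with a buffer of that size is a common two-pass pattern for interfaces of this shape.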
 
int u8toucp (ucp_t *restrict pUcp, const char *restrict pUtf, std::size_t cbUtf) noexcept
 Inspects at most the given number of bytes of a buffer to determine how many bytes must be read from it in order to successfully perform a conversion of one UTF-8 code to a single Unicode code point and optionally writes the resulting code point into a provided buffer. More...
 
std::size_t u8stoucps (ucp_t *restrict pUcp, const char *restrict pUtf, std::size_t cchChars) noexcept
 Converts a sequence of UTF-8 characters into a sequence of corresponding Unicode 11.0 code points and writes at most the specified number of codes of the resulting sequence to a buffer. More...
 
std::size_t ucpstou8s (char *restrict pUtf, const ucp_t *restrict pUcp, std::size_t cbUtf) noexcept
 Converts a sequence of Unicode 11.0 code points to a sequence of UTF-8 multibyte characters, and writes these characters into a buffer. More...
 
errno_t ucptou8_s (int *restrict pStatus, char *restrict pUtf, rsize_t cbUtf, ucp_t ucp) noexcept
 Securely converts a given Unicode 11.0 code point to its UTF-8 representation. More...
 
errno_t u8toucp_s (int *restrict pStatus, ucp_t *restrict pUcp, const char *restrict pUtf, rsize_t cbUtf) noexcept
 Securely converts one multibyte UTF-8 code to a Unicode 11.0 code point. More...
 
errno_t u8stoucps_s (std::size_t *restrict pcchConverted, ucp_t *restrict pUcp, rsize_t cchUcpMax, const char *restrict pszUtf, rsize_t cchUtf) noexcept
 Converts the specified sequence of UTF-8 characters into a sequence of corresponding Unicode code points and, if the specified output buffer is given by a non-NULL pointer, stores at most the specified number of code points into the buffer, terminating the output string with the zero code. More...
 
errno_t ucpstou8s_s (std::size_t *restrict pcbUtf, char *restrict pUtf, rsize_t cbUtfMax, const ucp_t *restrict pUcp, rsize_t cbMaxConvert) noexcept
 Converts the specified sequence of Unicode 11.0 code points into a sequence of corresponding UTF-8 multibyte characters, and, if the specified output buffer is given by a non-NULL pointer, stores the specified number of bytes, constituting the converted UTF-8 multibyte string, to the output buffer terminating it with a null character. More...
 
int u8towc (wchar_t *restrict pUcs, const char *restrict pUtf, std::size_t cbUtf) noexcept
 Inspects at most the given number of bytes of a buffer to determine how many bytes must be read from it in order to successfully perform a conversion of one UTF-8 code to a single wide character in UCS format and optionally writes the resulting wide character into a provided buffer. More...
 
int wctou8 (char *pUtf, wchar_t chUcs) noexcept
 Determines the number of bytes needed to represent a wide character, expected to be in UCS format, in the UTF-8 format. Optionally, the function stores the representation in a buffer provided by the caller. More...
 
std::size_t u8stowcs (wchar_t *restrict pUcs, const char *restrict pUtf, std::size_t cchChars) noexcept
 Converts the specified sequence of UTF-8 characters into a sequence of corresponding wide characters in the UCS format and writes at most the specified number of wide characters to a buffer. More...
 
std::size_t wcstou8s (char *restrict pUtf, const wchar_t *restrict pUcs, std::size_t cbUtf) noexcept
 Converts a sequence of wide characters, assumed to be in the UCS format, to a sequence of UTF-8 multibyte characters, and writes these characters into a buffer. More...
 
errno_t wctou8_s (int *restrict pStatus, char *restrict pUtf, rsize_t cbUtf, wchar_t chUcs) noexcept
 Securely converts a given wide character, assumed to be encoded in the UCS format, to its UTF-8 representation. More...
 
errno_t u8towc_s (int *restrict pStatus, wchar_t *restrict pUcs, const char *restrict pUtf, rsize_t cbUtf) noexcept
 Securely converts one multibyte UTF-8 code to a wide character assumed to be specified in the UCS format. More...
 
errno_t u8stowcs_s (std::size_t *restrict pcchConverted, wchar_t *restrict pUcs, rsize_t cchUcsMax, const char *restrict pszUtf, rsize_t cchUtf) noexcept
 Converts the specified sequence of UTF-8 characters into a sequence of corresponding UCS-2 (or UCS-4) wide characters and, if the specified output buffer is given by a non-NULL pointer, stores at most the specified number of wide characters into the specified array terminating the output string by a null wide character. More...
 
errno_t wcstou8s_s (std::size_t *restrict pcbUtf, char *restrict pUtf, rsize_t cbUtfMax, const wchar_t *restrict pUcs, rsize_t cbMaxConvert) noexcept
 Converts the specified sequence of UCS-encoded wide characters into a sequence of corresponding UTF-8 multibyte characters, and, if the specified output buffer is given by a non-NULL pointer, stores the specified number of bytes, constituting the converted UTF-8 multibyte string, to the output buffer terminating it with a null character. More...
 
ucp_t ucp_tolower (ucp_t ch) noexcept
 Returns a lowercase form of a given Unicode character. More...
 
ucp_t ucp_toupper (ucp_t ch) noexcept
 Returns an uppercase form of a given Unicode character. More...
 
int ucp_isascii (ucp_t ch) noexcept
 Returns a flag indicating whether the given character belongs to the ANSI ASCII code space. More...
 
int ucp_isblank_ascii (ucp_t ch) noexcept
 Checks whether a given character is a space or a horizontal tab. More...
 
int ucp_isspace_ascii (ucp_t ch) noexcept
 Returns a flag that indicates whether a given character is a whitespace character in the ASCII code space. More...
 
int ucp_iscntrl_ascii (ucp_t ch) noexcept
 Returns a flag that indicates whether a given character is a control character in the ASCII code space. More...
 
int ucp_isalpha_ascii (ucp_t ch) noexcept
 Returns a flag that indicates whether a given character is an alphabetic character in the ASCII code space. More...
 
int ucp_islower_ascii (ucp_t ch) noexcept
 Returns a flag that indicates whether a given character is a lowercase alphabetic character in the ASCII code space. More...
 
int ucp_isupper_ascii (ucp_t ch) noexcept
 Returns a flag that indicates whether a given character is an uppercase alphabetic character in the ASCII code space. More...
 
int ucp_isdigit_ascii (ucp_t ch) noexcept
 Returns a flag that indicates whether a given character is a digit character in the ASCII code space. More...
 
int ucp_isxdigit_ascii (ucp_t ch) noexcept
 Returns a flag that indicates whether a given character is a hexadecimal digit in the ASCII code space. More...
 
int ucp_ispunct_ascii (ucp_t ch) noexcept
 Returns a flag that indicates whether a given character is a punctuation character in the ASCII code space. More...
 
int ucp_isalnum_ascii (ucp_t ch) noexcept
 Returns a flag that indicates whether a given character is an alphanumeric character in the ASCII code space. More...
 
int ucp_isgraph_ascii (ucp_t ch) noexcept
 Returns a flag that indicates whether a given character belongs to the ASCII code space and has a nonempty graphical representation, i.e. any printable character except for whitespace characters. More...
 
int ucp_isprint_ascii (ucp_t ch) noexcept
 Returns a flag that indicates whether a given character is an ASCII code that can be displayed on a graphical output device, i.e. space or any alphanumeric or punctuation character. More...
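
The ASCII-only predicates above reduce to simple range checks on the code point value. The following sketch shows a few of them under the definitions given in the summaries (blank is space or horizontal tab, graph is printable without space, print is graph plus space). The Sketch names are hypothetical and the real functions may return any nonzero value for "true".

```cpp
// Mirrors the ucp_t typedef documented above (unsigned int).
using sketch_ucp_t = unsigned int;

// True for code points in the 7-bit ASCII range U+0000..U+007F.
inline int SketchIsAscii(sketch_ucp_t ch) noexcept
{
    return ch < 0x80u;
}

// Space (U+0020) or horizontal tab (U+0009).
inline int SketchIsBlankAscii(sketch_ucp_t ch) noexcept
{
    return ch == 0x20u || ch == 0x09u;
}

// Decimal digits '0'..'9'.
inline int SketchIsDigitAscii(sketch_ucp_t ch) noexcept
{
    return ch >= 0x30u && ch <= 0x39u;
}

// Printable characters except space: U+0021..U+007E.
inline int SketchIsGraphAscii(sketch_ucp_t ch) noexcept
{
    return ch >= 0x21u && ch <= 0x7Eu;
}

// Space plus all graphical characters: U+0020..U+007E.
inline int SketchIsPrintAscii(sketch_ucp_t ch) noexcept
{
    return ch >= 0x20u && ch <= 0x7Eu;
}
```

Unlike the locale-dependent functions of ctype.h, checks of this kind give the same answer regardless of the current C locale and reject every code point outside the ASCII range.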
 
int ucp_isspace (ucp_t ch) noexcept
 Returns a flag that indicates whether a given character is a whitespace character in the Unicode code space. More...
 
int ucp_iscntrl (ucp_t ch) noexcept
 Returns a flag that indicates whether a given character is a control code in the Unicode code space. More...
 
int ucp_isalpha (ucp_t ch) noexcept
 Returns a flag that indicates whether a given character is an alphabetic character in the Unicode code space. More...
 
int ucp_islower (ucp_t ch) noexcept
 Returns a flag that indicates whether a given character is a lowercase alphabetic character in the Unicode code space. More...
 
int ucp_isupper (ucp_t ch) noexcept
 Returns a flag that indicates whether a given character is an uppercase alphabetic character in the Unicode code space. More...
 
int ucp_isdigit (ucp_t ch) noexcept
 Returns a flag that indicates whether a given character is a digit character in the Unicode code space. More...
 
int ucp_ispunct (ucp_t ch) noexcept
 Returns a flag that indicates whether a given character is a punctuation character in the Unicode code space. More...
 
int ucp_isalnum (ucp_t ch) noexcept
 Returns a flag that indicates whether a given character is an alphanumeric character in the Unicode code space. More...
 
int ucp_isgraph (ucp_t ch) noexcept
 Returns a flag that indicates whether a given Unicode character has a nonempty graphical representation, i.e. any printable character except for whitespace characters. More...
 
int ucp_isprint (ucp_t ch) noexcept
 Returns a flag that indicates whether a given Unicode character can be displayed on an output device. More...
 

Memory allocation functions

Portable allocation functions

typedef AutoCString< char > AutoCStringA
 An alias for the Chusov::String::AutoCString template instantiated for multibyte strings.
 
typedef AutoCString< wchar_t > AutoCStringW
 An alias for the Chusov::String::AutoCString template instantiated for wide strings.
 
char * AllocateStringA (std::size_t cch) noexcept
 Allocates an uninitialized string of the specified size, in bytes, plus a room for a terminating zero. More...
 
wchar_t * AllocateStringW (std::size_t cch) noexcept
 Allocates an uninitialized string of the specified size, in wide symbols, plus a room for the terminating zero. More...
 
void FreeStringA (char *lpszString) noexcept
 The function frees memory occupied by the string buffer the parameter points to. The memory must be allocated by one of the Chusov::Memory or Chusov::String allocation functions. More...
 
void FreeStringW (wchar_t *lpszString) noexcept
 The function frees memory occupied by the string buffer the parameter points to. The memory must be allocated by one of the Chusov::Memory or Chusov::String allocation functions. More...
 
char * AllocateAndCopyStringA (const char *pszRight) noexcept
 Allocates and returns a copy of a string. More...
 
wchar_t * AllocateAndCopyStringW (const wchar_t *pszRight) noexcept
 Allocates and returns a copy of a wide string. More...
 
wchar_t * AllocateWideFromMBS (const char *pszRight, std::size_t cbRight) noexcept
 Allocates and returns a wide copy of a multibyte string. More...
 
char * AllocateMBSFromWide (const wchar_t *pszRight, std::size_t cchRight) noexcept
 Allocates and returns a multibyte version of a wide string. More...
 
wchar_t * AllocateUCSFromUTF8 (const char *pszRight, std::size_t cbRight) noexcept
 Allocates and returns a wide UCS-2 (UCS-4) copy of a multibyte UTF-8 string. More...
 
char * AllocateUTF8FromUCS (const wchar_t *pszRight, std::size_t cchRight) noexcept
 Allocates and returns a multibyte UTF-8 version of a wide string given in the UCS-2 (UCS-4) format. More...
 
char * AllocateUniqueStringA () noexcept
 Allocates and returns a unique string which is a representation of a UUID generated internally. More...
 
wchar_t * AllocateUniqueStringW () noexcept
 Allocates and returns a unique string which is a representation of a UUID generated internally. More...
 
char * AllocateTempPathA () noexcept
 Allocates and returns an absolute path for a directory used for storing temporary files. More...
 
wchar_t * AllocateTempPathW () noexcept
 Allocates and returns an absolute path for a directory used for storing temporary files. More...
 
char * AllocateUniqueTempFileNameA (const char *lpszExtension) noexcept
 Allocates and returns an absolute, unique file name that is based on a UUID and is to be located in the system temporary directory. More...
 
wchar_t * AllocateUniqueTempFileNameW (const wchar_t *lpszExtension) noexcept
 Allocates and returns an absolute, unique file name that is based on a UUID and is to be located in the system temporary directory. More...
 
void SecureFreeStringA (volatile char *pszString) noexcept
 Performs secure deallocation of a string buffer randomizing its contents. It is designed to be a more secure version of FreeStringA. More...
 
void SecureFreeStringW (volatile wchar_t *pszString) noexcept
 Performs secure deallocation of a string buffer randomizing its contents. It is designed to be a more secure version of FreeStringW. More...
 
template<class CHAR_TYPE >
CHAR_TYPE * AllocateString (std::size_t cch) noexcept
 Allocates an uninitialized string of the specified size plus a room for the terminating zero. More...
 
template<class CHAR_TYPE >
void FreeString (CHAR_TYPE *lpszString) noexcept
 The function frees memory occupied by the string buffer the parameter points to. The memory must be allocated by one of the Chusov::Memory or Chusov::String allocation functions. More...
 
template<class RETURN_CHAR_TYPE , class SOURCE_CHAR_TYPE >
RETURN_CHAR_TYPE * AllocateAndCopyString (const SOURCE_CHAR_TYPE *pszRight) noexcept
 Allocates and returns a copy of a string. More...
 
template<class CHAR_TYPE >
CHAR_TYPE * AllocateUniqueString () noexcept
 Allocates and returns a unique string which is a representation of a UUID generated internally. More...
 
template<class CHAR_TYPE >
CHAR_TYPE * AllocateTempPath () noexcept
 Allocates and returns an absolute path for a directory used for storing temporary files. More...
 
template<class CHAR_TYPE >
CHAR_TYPE * AllocateUniqueTempFileName (const CHAR_TYPE *lpszExtension) noexcept
 Allocates and returns an absolute, unique file name that is based on a UUID and is to be located in the system temporary directory. More...
 
template<class CHAR_TYPE >
void SecureFreeString (volatile CHAR_TYPE *pszString) noexcept
 Performs secure deallocation of a string buffer randomizing its contents. It is designed to be a more secure version of FreeString. More...
 

Detailed Description

Functions and elements used to manage strings, convert them to different encoding schemes and to manage associated memory.