What basic datatypes do I need to use?
If you are developing for multiple platforms or compilers, pay attention to how you define a datatype that can hold the code units of your preferred Unicode encoding form portably. For UTF-8 the choice is trivial, because every compiler supports an 8-bit character datatype. For UTF-16 or UTF-32, the current best practice is to define your own typedefs for a 16-bit or 32-bit code unit datatype and map them, in a header file, to a compiler-specific integer type. However, the C and C++ language standards have added datatypes of guaranteed width, both 16 and 32 bits (such as uint16_t and uint32_t), and a way to declare that a particular datatype contains characters of the corresponding Unicode encoding form (char16_t and char32_t). Where vendors support this scheme, you can use it effectively. Non-standard implementations that support UTF-16 as a wchar_t are widely used, because they make life easier for people
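As a minimal sketch of the typedef approach described above, the following C fragment maps project-specific code unit names onto the fixed-width integer types from <stdint.h>. The names ex_utf8_t, ex_utf16_t, ex_utf32_t and ex_utf16_encode are hypothetical, chosen only for illustration. The sketch assumes a C11 compiler (for _Static_assert) on which uint16_t and uint32_t are available.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical project-wide names for code units. In a real project these
   would sit in one header, so the mapping can be changed per compiler. */
typedef unsigned char ex_utf8_t;   /* UTF-8 code unit  */
typedef uint16_t      ex_utf16_t;  /* UTF-16 code unit */
typedef uint32_t      ex_utf32_t;  /* UTF-32 code unit */

/* Compile-time checks: catch a wrong mapping if the typedefs above are
   ever pointed at a different compiler-specific type. */
_Static_assert(sizeof(ex_utf16_t) * CHAR_BIT == 16, "UTF-16 code unit must be 16 bits");
_Static_assert(sizeof(ex_utf32_t) * CHAR_BIT == 32, "UTF-32 code unit must be 32 bits");

/* Encode one code point as UTF-16 code units.
   Returns the number of code units written (1 or 2), or 0 if cp is a
   surrogate code point or lies beyond U+10FFFF. */
static size_t ex_utf16_encode(ex_utf32_t cp, ex_utf16_t out[2])
{
    assert(out != NULL);
    if (cp >= 0xD800 && cp <= 0xDFFF) {
        return 0;                      /* surrogates are not encodable */
    }
    if (cp > 0x10FFFF) {
        return 0;                      /* outside the Unicode code space */
    }
    if (cp < 0x10000) {
        out[0] = (ex_utf16_t)cp;       /* BMP: a single code unit */
        return 1;
    }
    cp -= 0x10000;                     /* supplementary: surrogate pair */
    out[0] = (ex_utf16_t)(0xD800 + (cp >> 10));
    out[1] = (ex_utf16_t)(0xDC00 + (cp & 0x3FF));
    return 2;
}
```

Code written against ex_utf16_t rather than a raw integer type or wchar_t only needs the typedef changed when a compiler requires a different mapping. For example, calling ex_utf16_encode(0x1F600, buf) stores the surrogate pair 0xD83D, 0xDE00 in buf and returns 2.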