Visual C++ character sets, Unicode, _MBCS
Q: I have this simple function call:
MessageBox(NULL, "Test message", "Title", MB_OK);
The compiler raises the following error and I don't understand why.
error C2664: 'MessageBoxW' : cannot convert parameter 2 from 'const char [13]' to 'LPCWSTR'
Types pointed to are unrelated; conversion requires reinterpret_cast, C-style cast or function-style cast
A: In short, that happens because the project is built for Unicode.
The Microsoft run-time library provides Microsoft-specific generic-text mappings for many data types, routines and other objects; these mappings are defined in TCHAR.h.
There are three supported character sets:
- ASCII (single-byte character set, SBCS)
- MBCS (multi-byte character set)
- Unicode
Which character set is used is controlled by two preprocessor symbols:
- _UNICODE: if defined, Unicode is the character set used
- _MBCS: if defined, MBCS is used
- If neither of the above (they are mutually exclusive) is defined, ASCII is the character set used
The Windows API provides two versions of each function that takes strings: one for Unicode and one for ASCII/MBCS.
Q: How do I select the character set?
A: Go to Project Properties > Configuration Properties > General and change the value of the Character Set property. The three available options are:
- Not Set (neither _UNICODE nor _MBCS is defined)
- Use Multi-byte Character Set (_MBCS is defined)
- Use Unicode Character Set (_UNICODE is defined)
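To confirm which option a given build actually picked up, the two symbols can be tested directly. The following is only a minimal sketch: the helper name active_character_set is made up for illustration and is not part of any Microsoft header.

```cpp
#include <string>

// Hypothetical helper (not part of any Microsoft header): reports which
// character set the preprocessor symbols select for this translation unit.
inline std::string active_character_set()
{
#if defined(_UNICODE)
    return "Unicode";   // "Use Unicode Character Set"
#elif defined(_MBCS)
    return "MBCS";      // "Use Multi-byte Character Set"
#else
    return "Not Set";   // neither symbol is defined: single-byte (ASCII)
#endif
}
```

Calling this function from a debug build is a quick way to check that a change to the Character Set property really reached the compiler.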
Q: How exactly do the generic-text mapping directives affect the data types and functions that I'm using?
A: Generic-text C run-time routines, such as _itot, and Windows API functions that take strings, such as MessageBox, aren't functions at all; they are macros.
The C run-time library provides a function for each character set and a macro that resolves to one of these functions depending on the character set in use. For instance, the macro _itot resolves to:
- _itoa, when _UNICODE is not defined
- _itow, when _UNICODE is defined
Similarly, TCHAR resolves to:
- char, when _UNICODE is not defined
- wchar_t, when _UNICODE is defined
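The mechanism behind these mappings can be imitated in a few lines. The sketch below does not use TCHAR.h; demo_tchar, demo_tstring, demo_to_tstring and format_number are hypothetical names, and the standard std::to_string / std::to_wstring stand in for _itoa / _itow so that the example builds with any compiler.

```cpp
#include <string>

// Hypothetical stand-ins for TCHAR and _itot, selected the same way
// TCHAR.h selects the real ones: by testing _UNICODE.
#ifdef _UNICODE
typedef wchar_t demo_tchar;                     // plays the role of TCHAR
#define demo_to_tstring(v) std::to_wstring(v)   // wide variant (like _itow)
#else
typedef char demo_tchar;                        // plays the role of TCHAR
#define demo_to_tstring(v) std::to_string(v)    // narrow variant (like _itoa)
#endif

// A string type that follows the selected character type.
typedef std::basic_string<demo_tchar> demo_tstring;

// Source code written once; the macro picks the matching function.
inline demo_tstring format_number(int value)
{
    return demo_to_tstring(value);
}
```

The body of format_number never mentions char or wchar_t, which is the point of the generic-text mappings: only the macro definitions change between builds.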
You can read more about the mappings in MSDN.
On the other hand, the Windows API comes in two versions: for Unicode and for ASCII/Multi-byte. If you read the MSDN page for MessageBox, it says:
The MessageBox function creates, displays, and operates a message box. The message box contains an application-defined message and title, plus any combination of predefined icons and push buttons.
int MessageBox(
    HWND hWnd,
    LPCTSTR lpText,
    LPCTSTR lpCaption,
    UINT uType);
Actually, MessageBox is a macro, and LPCTSTR is a type name whose definition likewise depends on the character set. You can see how MessageBox is defined in WinUser.h:
#ifdef UNICODE
#define MessageBox MessageBoxW
#else
#define MessageBox MessageBoxA
#endif // !UNICODE
There are actually two versions of the function: MessageBoxA for ASCII and MBCS, and MessageBoxW for Unicode. The Windows headers test UNICODE, while the C run-time headers test _UNICODE; the Use Unicode Character Set option defines both, so they should always be set together. When UNICODE is defined, MessageBox resolves to MessageBoxW and LPCTSTR to LPCWSTR (i.e. const wchar_t*); otherwise MessageBox resolves to MessageBoxA and LPCTSTR to LPCSTR (i.e. const char*).
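The same A/W pattern can be reproduced without the Windows headers, which makes it easier to see where error C2664 comes from. In this sketch DemoBoxA, DemoBoxW, DemoBox and DEMO_LPCTSTR are hypothetical stand-ins for MessageBoxA, MessageBoxW, MessageBox and LPCTSTR; the functions only return the length of the text instead of showing a dialog.

```cpp
#include <cstddef>
#include <cstring>
#include <cwchar>

// Hypothetical stand-ins for MessageBoxA / MessageBoxW.
inline std::size_t DemoBoxA(const char* text)    { return std::strlen(text); }
inline std::size_t DemoBoxW(const wchar_t* text) { return std::wcslen(text); }

// Same selection scheme as in WinUser.h: one name, two targets.
#ifdef UNICODE
#define DemoBox DemoBoxW
typedef const wchar_t* DEMO_LPCTSTR;   // plays the role of LPCTSTR
#else
#define DemoBox DemoBoxA
typedef const char* DEMO_LPCTSTR;      // plays the role of LPCTSTR
#endif

// With UNICODE defined, DemoBox("Test message") expands to
// DemoBoxW("Test message"), which does not compile: a const char array
// cannot be converted to const wchar_t*. That is the situation the
// compiler reports for MessageBoxW in the question.
```

Calling the A or W variant by its explicit name with a matching literal always compiles, regardless of the character set; only the generic name depends on the build setting.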
Q: How do I write my program so that it builds for any of these character sets without modifying the code when the character set changes?
A: In a single-byte or multi-byte character set, string and character literals are not prefixed by anything ("string", 'c'). Unicode string and character literals, however, require the prefix L, such as L"string" and L'c'.
You can use the Microsoft-specific macros _T() or _TEXT(). The preprocessor removes these macros when _UNICODE is not defined, and replaces them with the L prefix when _UNICODE is defined.
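A macro of this kind can be written with the token-pasting operator. The sketch below is an approximation for illustration, not a copy of TCHAR.h; DEMO_T, DEMO_T_IMPL, demo_char and greeting are hypothetical names.

```cpp
#include <string>

// Hypothetical equivalent of _T(): pastes an L prefix onto the literal
// when _UNICODE is defined, and leaves the literal untouched otherwise.
#ifdef _UNICODE
#define DEMO_T_IMPL(x) L##x
typedef wchar_t demo_char;
#else
#define DEMO_T_IMPL(x) x
typedef char demo_char;
#endif
#define DEMO_T(x) DEMO_T_IMPL(x)

// DEMO_T("Test message") becomes "Test message" or L"Test message",
// so the literal always matches the selected character type.
inline std::basic_string<demo_char> greeting()
{
    return DEMO_T("Test message");
}
```

The macro works for character literals as well: DEMO_T('c') becomes 'c' or L'c'.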
Q: How do I fix the line of code mentioned above?
A: It should be clear now:
MessageBox(NULL, _T("Test message"), _T("Title"), MB_OK);