2.4 Extended Conversion Tasks

The following are some common tasks you need to perform when using an extended converter:

2.4.1 Converting Bytes to Unicode with an Extended Converter

  1. If you have not already done so, call NWUXLoadByteUnicodeConverter. Specify the relevant code page for the codepage parameter.

  2. Supply a buffer of sufficient length to hold the Unicode output string. If the length is important to your application and is not known, follow the steps in Determining Output String Length with an Extended Converter.

  3. Call NWUXByteToUnicode or NWUXLenByteToUnicode unless the string is a path. For path strings, call NWUXByteToUnicodePath or NWUXLenByteToUnicodePath. Use the following parameter specifications:

    • For the byteUniHandle parameter, pass the handle returned from the appropriate call to NWUXLoadByteUnicodeConverter. Note that it is possible to have called the function more than once to make conversions for separate code pages.
    • For the unicodeOutput parameter, pass a pointer to the Unicode output string buffer.
    • For the outputBufferLen parameter, pass the maximum length of the output buffer, including the NULL terminator.
    • For the byteInput parameter, pass a pointer to the input byte buffer.
    • For the inLength parameter of the *Len* functions, pass the length of the input string in bytes (which might not include a NULL terminator).
    • For the actualLength parameter, pass the address of the variable that receives the output length. Note that the returned length does not include the NULL terminator.
  4. If you called NWUXByteToUnicode to determine the required length of the output buffer, call NWUXByteToUnicode a second time. This time, use the actualLength value returned by the previous call to size the output buffer.

  5. Use the returned Unicode string as needed.

  6. When you no longer need the converter, free it by calling NWUXUnloadConverter and passing the relevant converter handle.
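
The steps above can be sketched in C as follows. This is a minimal illustration, not the SDK's own sample: the type names, the prototype shapes, and the stub bodies are assumptions inferred from the parameter lists in step 3, and they exist only so the sketch builds without the SDK headers. The converter is released with NWUXUnloadConverter, the unload call named in Determining Output String Length with an Extended Converter. The helper bytes_to_unicode is hypothetical.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Stand-ins so the sketch builds without the SDK. Type names and prototype
   shapes are assumptions inferred from the parameter lists in the steps. */
typedef unsigned short unicode;
typedef struct { unsigned codepage; } CONVERT, *pCONVERT;

/* Stub: allocates a converter for the code page; returns 0 on success. */
static int NWUXLoadByteUnicodeConverter(unsigned codepage, pCONVERT *byteUniHandle)
{
    *byteUniHandle = malloc(sizeof(CONVERT));
    if (*byteUniHandle == NULL) return -1;
    (*byteUniHandle)->codepage = codepage;
    return 0;
}

/* Stub: widens each input byte to one 16-bit unit (adequate for ASCII). */
static int NWUXByteToUnicode(pCONVERT byteUniHandle, unicode *unicodeOutput,
                             size_t outputBufferLen,
                             const unsigned char *byteInput,
                             size_t *actualLength)
{
    size_t n = strlen((const char *)byteInput);
    (void)byteUniHandle;
    if (outputBufferLen < n + 1) return -1;   /* no room for the terminator */
    for (size_t i = 0; i < n; i++) unicodeOutput[i] = byteInput[i];
    unicodeOutput[n] = 0;
    if (actualLength != NULL) *actualLength = n;
    return 0;
}

/* Stub: releases the converter. */
static int NWUXUnloadConverter(pCONVERT converterHandle)
{
    free(converterHandle);
    return 0;
}

/* Hypothetical helper covering steps 1 through 6 with a caller-supplied
   buffer. Returns the number of Unicode characters written (terminator
   excluded), or -1 on failure. */
long bytes_to_unicode(unsigned codepage, const char *byteInput,
                      unicode *unicodeOutput, size_t outputBufferLen)
{
    pCONVERT conv;
    size_t actualLength = 0;
    int rc;

    if (NWUXLoadByteUnicodeConverter(codepage, &conv) != 0) return -1;
    rc = NWUXByteToUnicode(conv, unicodeOutput, outputBufferLen,
                           (const unsigned char *)byteInput, &actualLength);
    NWUXUnloadConverter(conv);   /* free the converter in every case */
    return rc == 0 ? (long)actualLength : -1;
}
```

Loading and unloading the converter around a single call keeps the sketch short. An application that converts many strings for the same code page would more likely load the converter once and reuse the handle.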

For related information, see Converting Unicode to Bytes with an Extended Converter.

2.4.2 Converting Unicode to Bytes with an Extended Converter

  1. If you have not already done so, call NWUXLoadByteUnicodeConverter. Specify the relevant code page for the codepage parameter.

  2. Supply a buffer of sufficient length to hold the byte output string. If you don’t know that length, see Determining Output String Length with an Extended Converter.

  3. Call NWUXUnicodeToByte with the following parameter specifications:

    • For the byteUniHandle parameter, pass the handle returned from the appropriate call to NWUXLoadByteUnicodeConverter. Note that it is possible to have called the function more than once to make conversions for separate code pages.
    • For the byteOutput parameter, pass a pointer to the byte output string buffer.
    • For the outputBufferLen parameter, pass the maximum length of the output buffer, including the NULL terminator.
    • For the unicodeInput parameter, pass a pointer to the input Unicode buffer.
    • For the outLength parameter, pass the address of the variable that receives the output length. Note that the returned length does not include the NULL terminator.
  4. If you called NWUXUnicodeToByte to determine the required length of the output buffer, call NWUXUnicodeToByte a second time. This time, use the outLength value returned by the previous call to size the output buffer.

  5. Use the returned byte string as needed.

  6. When you no longer need the converter, free it by calling NWUXUnloadConverter and passing the relevant converter handle.

For related information, see Converting Bytes to Unicode with an Extended Converter.

2.4.3 Converting Path Strings with an Extended Converter

  1. If you have not done so already, call NWUXLoadByteUnicodeConverter.

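  2. Convert as explained in Converting Bytes to Unicode with an Extended Converter or Converting Unicode to Bytes with an Extended Converter, but call the path-specific function in place of the standard one. For byte-to-Unicode conversion, the path-specific functions are NWUXByteToUnicodePath and NWUXLenByteToUnicodePath.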

2.4.4 Determining Output String Length with an Extended Converter

NOTE: Although the example in this task converts from byte to Unicode, the same general procedure can be used to determine the output string length for any extended conversion.

  1. If you have not done so already, call the appropriate function to load the required converter:

    • NWUXLoadByteUnicodeConverter
    • NWUXLoadCaseConverter
  2. With the converter loaded, call the appropriate conversion function once to determine the required length of the output buffer. Pass NULL for the output buffer parameter and 0 for the buffer length.

    NWUXByteToUnicode(converter, NULL, 0, inbuf, &actualLength);

    When the function returns, the actualLength parameter points to the output string length.

    NOTE: The length pointed to is only the number of characters in the string. It does not include a NULL terminator.

  3. Add one to the returned actualLength for the NULL terminator. If the output is to be a Unicode string, multiply the result by sizeof(unicode) to get the required number of bytes. Then allocate memory.

    bufsize = actualLength + 1;

    outbuf = (punicode)malloc(bufsize * sizeof(unicode));

  4. Do the real conversion.

    NWUXByteToUnicode(converter, outbuf, bufsize, inbuf, NULL);

  5. When the converter is no longer needed, free the output buffer and call NWUXUnloadConverter as explained in Unloading Converters.
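
Put together, the two-pass pattern in steps 2 through 4 might look like the sketch below. The stub NWUXByteToUnicode and the opaque pCONVERT type are stand-ins (assumptions that mirror the call shape shown in the steps) so the example builds without the SDK. The helper name convert_with_exact_buffer is hypothetical.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in types; the real definitions come from the SDK headers. */
typedef unsigned short unicode;
typedef unicode *punicode;
typedef struct CONVERT *pCONVERT;

/* Stub mirroring the call shape used in the steps. With a NULL output
   buffer it only reports the length; otherwise it widens each byte. */
static int NWUXByteToUnicode(pCONVERT converter, punicode outbuf,
                             size_t bufsize, const unsigned char *inbuf,
                             size_t *actualLength)
{
    size_t n = strlen((const char *)inbuf);
    (void)converter;
    if (outbuf != NULL) {
        if (bufsize < n + 1) return -1;       /* buffer too small */
        for (size_t i = 0; i < n; i++) outbuf[i] = inbuf[i];
        outbuf[n] = 0;
    }
    if (actualLength != NULL) *actualLength = n;
    return 0;
}

/* Hypothetical helper: returns a malloc'd, NULL-terminated Unicode string
   sized exactly for the input, or NULL on failure. The caller frees it. */
punicode convert_with_exact_buffer(pCONVERT converter,
                                   const unsigned char *inbuf,
                                   size_t *outChars)
{
    size_t actualLength = 0;
    size_t bufsize;
    punicode outbuf;

    /* Pass 1: NULL output buffer, so only the length is returned. */
    if (NWUXByteToUnicode(converter, NULL, 0, inbuf, &actualLength) != 0)
        return NULL;

    /* Add one character for the terminator, then size in bytes. */
    bufsize = actualLength + 1;
    outbuf = (punicode)malloc(bufsize * sizeof(unicode));
    if (outbuf == NULL) return NULL;

    /* Pass 2: the real conversion into the exactly sized buffer. */
    if (NWUXByteToUnicode(converter, outbuf, bufsize, inbuf, NULL) != 0) {
        free(outbuf);
        return NULL;
    }
    if (outChars != NULL) *outChars = actualLength;
    return outbuf;
}
```

Checking the malloc result and freeing the buffer on a failed second pass are additions for safety; the steps above leave error handling to the application.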


2.4.5 Handling Unmappable Characters with an Extended Converter

By default, the extended Unicode API uses the Default Conversion Behavior. However, the developer can change that process in several ways:

    • Change the NoMap action
    • Change the substitution character
    • Change the handler function
    • Change scan/parse functions

  1. To make a change and later restore the pre-change behavior:

    • Before making the change, call the NWUXGet.. version of the relevant function (identified in steps 2 through 5 below), and save the returned values.
    • Call the NWUXSet.. version to make the change.
    • After performing conversions with the changed setting, restore pre-change settings by calling the NWUXSet.. version again and passing the values returned from the call to NWUXGet...
  2. To change the NoMap action, call NWUXSetNoMapAction, and set the noMapByteAction or noMapUniAction to

    • NWU_RETURN_ERROR
    • NWU_SUBSTITUTE or
    • NWU_CALL_HANDLER

    Set either of these parameters to NWU_UNCHANGED_ACTION if no change is needed.

  3. To change the substitution character, call NWUXSetSubByte or NWUXSetSubUni and set the substituteByte or substituteUni parameter to the new substitution character.

  4. To change the handler function, call NWUXSetByteFunctions or NWUXSetUniFunctions and set the noMapByteFunc or noMapUniFunc parameter to point to the new function.

  5. To return all settings to the system defaults, call NWUXResetConverter and pass the handle of the converter.
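
The save, change, and restore sequence in step 1 can be sketched for the NoMap action as shown below. Everything here is illustrative: the constant values are placeholders, the NWUXGetNoMapAction and NWUXSetNoMapAction prototypes are assumed from the parameter names in step 2, the stubs only record the settings, and SavedNoMap and the two helper functions are hypothetical.

```c
#include <stddef.h>

/* Placeholder values: the real constants come from the SDK header. */
enum {
    NWU_UNCHANGED_ACTION = -1,
    NWU_RETURN_ERROR     = 0,
    NWU_SUBSTITUTE       = 1,
    NWU_CALL_HANDLER     = 2
};

/* Stand-in converter holding only the two NoMap settings. */
typedef struct { int noMapByteAction; int noMapUniAction; } CONVERT, *pCONVERT;

/* Stub with an assumed prototype: reports the current actions. */
static int NWUXGetNoMapAction(pCONVERT h, int *noMapByteAction,
                              int *noMapUniAction)
{
    if (h == NULL) return -1;
    *noMapByteAction = h->noMapByteAction;
    *noMapUniAction  = h->noMapUniAction;
    return 0;
}

/* Stub with an assumed prototype: applies each action unless it is
   NWU_UNCHANGED_ACTION. */
static int NWUXSetNoMapAction(pCONVERT h, int noMapByteAction,
                              int noMapUniAction)
{
    if (h == NULL) return -1;
    if (noMapByteAction != NWU_UNCHANGED_ACTION)
        h->noMapByteAction = noMapByteAction;
    if (noMapUniAction != NWU_UNCHANGED_ACTION)
        h->noMapUniAction = noMapUniAction;
    return 0;
}

/* Hypothetical holder for the values returned by the Get call. */
typedef struct { int byteAction; int uniAction; } SavedNoMap;

/* Saves the current actions, then changes only the Unicode-side action. */
int set_uni_action_saving_old(pCONVERT conv, int newUniAction,
                              SavedNoMap *saved)
{
    if (NWUXGetNoMapAction(conv, &saved->byteAction, &saved->uniAction) != 0)
        return -1;
    return NWUXSetNoMapAction(conv, NWU_UNCHANGED_ACTION, newUniAction);
}

/* Restores the actions captured by set_uni_action_saving_old. */
int restore_nomap_actions(pCONVERT conv, const SavedNoMap *saved)
{
    return NWUXSetNoMapAction(conv, saved->byteAction, saved->uniAction);
}
```

An application would call set_uni_action_saving_old before a batch of conversions and restore_nomap_actions afterward, so that other code sharing the converter handle sees the settings it expects.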


2.4.6 Setting Substitution Characters with an Extended Converter

Substitution characters can be used as one option to replace unmappable characters in either Unicode-to-byte or byte-to-Unicode conversions. The other options are to return an error or to call a handler function.

  1. If you have not already done so, call NWUXLoadByteUnicodeConverter to initialize the converter and obtain a converter handle.

  2. If you need to save and restore the original substitution character, call the appropriate function to obtain current substitution character information:

    • For Unicode-to-byte conversions, call NWUXGetSubByte.
    • For byte-to-Unicode conversions, call NWUXGetSubUni.
    • Save the returned character for later restoration.
  3. Call the appropriate function to set a new substitution character:

    • For Unicode-to-byte conversions, call NWUXSetSubByte.
    • For byte-to-Unicode conversions, call NWUXSetSubUni.
    • For either function, pass the converter handle and the new substitution character.
  4. Convert the strings as needed.

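  5. When the most recently set substitution characters are no longer needed, reset as appropriate. To restore the saved characters, call NWUXSetSubByte or NWUXSetSubUni with the values saved in step 2. To return all settings to the system defaults, call NWUXResetConverter.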

  6. When the converter is no longer needed, call NWUXUnloadConverter, passing the appropriate converter handle.
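
As an illustration of steps 2 through 5 for a Unicode-to-byte conversion, the sketch below saves the current substitution byte, sets a new one, converts, and restores the saved value. The prototypes are assumptions based on the parameter names used in this section, and the stubs are stand-ins: the stub converter treats every character at or above 0x80 as unmappable and always substitutes, as if the NoMap action were NWU_SUBSTITUTE. The helper convert_with_substitute is hypothetical.

```c
#include <stddef.h>

/* Stand-in types; the real definitions come from the SDK headers. */
typedef unsigned short unicode;
typedef struct { unsigned char substituteByte; } CONVERT, *pCONVERT;

/* Stubs with assumed prototypes: read and write the substitution byte. */
static int NWUXGetSubByte(pCONVERT h, unsigned char *substituteByte)
{
    *substituteByte = h->substituteByte;
    return 0;
}

static int NWUXSetSubByte(pCONVERT h, unsigned char substituteByte)
{
    h->substituteByte = substituteByte;
    return 0;
}

/* Stub: copies characters below 0x80 and replaces every other character
   with the current substitution byte. */
static int NWUXUnicodeToByte(pCONVERT h, unsigned char *byteOutput,
                             size_t outputBufferLen,
                             const unicode *unicodeInput, size_t *outLength)
{
    size_t n = 0;
    while (unicodeInput[n] != 0) n++;
    if (outputBufferLen < n + 1) return -1;   /* no room for the terminator */
    for (size_t i = 0; i < n; i++)
        byteOutput[i] = unicodeInput[i] < 0x80
                            ? (unsigned char)unicodeInput[i]
                            : h->substituteByte;
    byteOutput[n] = 0;
    if (outLength != NULL) *outLength = n;
    return 0;
}

/* Hypothetical helper: converts with a temporary substitution byte and
   restores the previous one before returning. Returns 0 on success. */
int convert_with_substitute(pCONVERT conv, unsigned char newSub,
                            const unicode *unicodeInput,
                            unsigned char *byteOutput, size_t outputBufferLen)
{
    unsigned char savedSub;
    int rc;

    if (NWUXGetSubByte(conv, &savedSub) != 0) return -1;   /* step 2 */
    if (NWUXSetSubByte(conv, newSub) != 0) return -1;      /* step 3 */
    rc = NWUXUnicodeToByte(conv, byteOutput, outputBufferLen,
                           unicodeInput, NULL);            /* step 4 */
    NWUXSetSubByte(conv, savedSub);                        /* step 5 */
    return rc;
}
```

With the stub converter, the input { 'a', 0x20AC, 'b', 0 } and a substitution byte of '#' would produce the byte string "a#b", and the converter's original substitution byte would be back in place afterward.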


2.4.7 Setting Scan/Parse Functions with an Extended Converter

  1. If you have not already done so, call NWUXLoadByteUnicodeConverter. Specify the relevant code page for the codepage parameter.

  2. If you want to return to the current settings after a change, call the Get version of the relevant function and save the return before calling the Set function to make the change.

  3. To enable or disable scan/parse functions, call NWUXSetScanAction and set either the scanByteAction or the scanUniAction parameter to

    • NWU_ENABLED or
    • NWU_DISABLED

    Pass NWU_UNCHANGED to either parameter that requires no change.

    The default is disabled for Unicode-to-byte conversion and enabled for byte-to-Unicode conversion.

  4. To set a scan function, a parse function, or both to something other than the default system functions, call NWUXSetByteFunctions or NWUXSetUniFunctions and pass a pointer to each new function. Pass NWU_UNCHANGED_FUNCTION for any function pointer that requires no change.
