Visual Basic stores all strings as double-wide (16-bit) characters. This is no big deal if you are using standard ASCII characters (7 bits), as the upper 9 bits are always zero. But when you need to use ANSI characters (8 bits), the Unicode conversion that VB does in the background creates a problem. For example, the string (shown as hex):
31 81 32 82 33 83 34 84 35 85 36 86 37 87
gets stored in memory as:
31 00 81 00 32 00 1A 20 33 00 92 01 34 00 1E 20
35 00 26 20 36 00 20 20 37 00 21 20
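The stored values in this dump can be checked from the Immediate window with a small sketch like the one below. It is only an illustration: the names StoredValueHex and ShowAnsiMapping are made up for this example, and it assumes the system ANSI code page is Windows-1252, which is what the values in the dump correspond to.

```vb
' Hypothetical helper (not from the original post): returns the 16-bit value
' that VB stores for one ANSI byte, as four hex digits.
' Assumes the system ANSI code page is Windows-1252.
Public Function StoredValueHex(ByVal lAnsiByte As Long) As String
    Dim lCode As Long
    ' Chr$ converts the ANSI byte to a Unicode character;
    ' AscW reads back the 16-bit value, masked to keep it positive.
    lCode = AscW(Chr$(lAnsiByte)) And &HFFFF&
    StoredValueHex = Right$("000" & Hex$(lCode), 4)
End Function

' Prints the stored value for each of the bytes &H81 to &H87.
Public Sub ShowAnsiMapping()
    Dim lByte As Long
    For lByte = &H81 To &H87
        Debug.Print Hex$(lByte) & " -> " & StoredValueHex(lByte)
    Next lByte
End Sub
```

On a Windows-1252 system this should list 0081, 201A, 0192, 201E, 2026, 2020 and 2021, which are the byte pairs from the dump read high byte first.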
The character &H82 is changed to &H201A (stored low byte first, as 1A 20), and several others are changed in the same way. To convert one of these strings to a byte array, I have been using the StrToByte routine, and to convert it back to a string, I have been using the ByteToStr routine.
Looping through the StrToByte routine 10,000 times took an average of 71.7 ms with a spread of 16 ms. Looking for a more efficient way to do these conversions, I investigated the "RtlUnicodeStringToAnsiString" function in "ntdll.dll" and wrapped it in the UniToAnsi routine.
Looping through the UniToAnsi routine 10,000 times took an average of 37.4 ms with a spread of 16 ms. The advantage of this routine is that it returns not only the byte array but also the corrected string (through its sUnicode argument). But there is a downside. If you pass an already corrected string through this routine again, it changes the corrected characters to &H3F ("?"). For example, the corrected string:
31 81 32 82 33 83 34 84 35 85 36 86 37 87
gets converted to:
31 81 32 3F 33 3F 34 3F 35 3F 36 3F 37 3F
Even though the UniToAnsi routine is almost twice as fast as the StrToByte routine (37.4 ms versus 71.7 ms), for me it was not worth the risk of doing a double conversion.
J.A. Coutts
Code:
Public Function StrToByte(strInput As String) As Byte()
    Dim lPntr As Long
    Dim bTmp() As Byte
    Dim bArray() As Byte
    If Len(strInput) = 0 Then Exit Function
    ReDim bTmp(LenB(strInput) - 1) 'Memory length
    ReDim bArray(Len(strInput) - 1) 'String length
    'Copy the raw 16-bit characters into a byte array
    CopyMemory bTmp(0), ByVal StrPtr(strInput), LenB(strInput)
    'Examine every second byte
    For lPntr = 0 To UBound(bArray)
        If bTmp(lPntr * 2 + 1) > 0 Then
            'High byte set: let Asc map the character back to its ANSI byte
            bArray(lPntr) = Asc(Mid$(strInput, lPntr + 1, 1))
        Else
            'High byte zero: use the low byte as is
            bArray(lPntr) = bTmp(lPntr * 2)
        End If
    Next lPntr
    StrToByte = bArray
End Function
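The post does not show how the timings were taken, so the following is only a sketch of how such a figure could be measured. It assumes GetTickCount as the timer and sits in the same module as StrToByte; the name TimeStrToByte is made up for this example. GetTickCount typically advances in steps of roughly 10 to 16 ms, which would be consistent with the 16 ms spread quoted, although that is a guess about the original measurements.

```vb
' Goes in the declarations section of the module that contains StrToByte.
Private Declare Function GetTickCount Lib "kernel32" () As Long

' Hypothetical helper (not from the original post): returns the elapsed
' milliseconds for lLoops calls to StrToByte.
' Ignores wrap-around of the tick counter.
Public Function TimeStrToByte(ByVal sTest As String, ByVal lLoops As Long) As Long
    Dim lStart As Long
    Dim lCount As Long
    Dim bResult() As Byte
    lStart = GetTickCount
    For lCount = 1 To lLoops
        bResult = StrToByte(sTest)
    Next lCount
    TimeStrToByte = GetTickCount - lStart
End Function
```

Calling TimeStrToByte with a sample string and 10,000 loops several times, then averaging the results, gives a figure comparable to the ones quoted in the post.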
Code:
Public Function ByteToStr(bArray() As Byte) As String
    Dim lPntr As Long
    Dim bTmp() As Byte
    'Two bytes per character; the odd (high) bytes stay zero
    ReDim bTmp(UBound(bArray) * 2 + 1)
    For lPntr = 0 To UBound(bArray)
        bTmp(lPntr * 2) = bArray(lPntr)
    Next lPntr
    'Assigning a byte array to a string copies the bytes unchanged
    ByteToStr = bTmp
End Function
Code:
Option Explicit
Private Declare Function UnicodeToAnsi Lib "ntdll.dll" Alias "RtlUnicodeStringToAnsiString" (ByRef DestinationString As ANSI_STRING, ByVal SourceString As Long, Optional ByVal AllocateDestinationString As Byte) As Long
Private Declare Function AnsiToUnicode Lib "ntdll.dll" Alias "RtlAnsiStringToUnicodeString" (ByVal DestinationString As Long, ByRef SourceString As ANSI_STRING, Optional ByVal AllocateDestinationString As Byte) As Long
Private Declare Sub CopyMemory Lib "kernel32" Alias "RtlMoveMemory" (Destination As Any, Source As Any, ByVal Length As Long)
Private Type UNICODE_STRING
    Len As Integer
    MaxLen As Integer
    Buffer As String
End Type
Private Type ANSI_STRING
    Len As Integer
    MaxLen As Integer
    Buffer As Long
End Type
Private Function UniToAnsi(sUnicode As String) As Byte()
    Dim UniString As UNICODE_STRING
    Dim AnsiString As ANSI_STRING
    Dim Buffer() As Byte
    If Len(sUnicode) = 0 Then Exit Function
    UniString.Buffer = sUnicode
    UniString.Len = LenB(UniString.Buffer)
    UniString.MaxLen = UniString.Len + 2
    'One byte per character, plus one for the terminating null
    AnsiString.Len = Len(UniString.Buffer)
    AnsiString.MaxLen = AnsiString.Len + 1
    ReDim Buffer(AnsiString.Len) As Byte
    AnsiString.Buffer = VarPtr(Buffer(0))
    'A return value of 0 is STATUS_SUCCESS
    If UnicodeToAnsi(AnsiString, VarPtr(UniString)) = 0 Then
        UniToAnsi = Buffer
        'Drop the terminating null
        ReDim Preserve UniToAnsi(UBound(Buffer) - 1)
        'Hand the corrected string back to the caller
        sUnicode = ByteToStr(UniToAnsi)
    End If
End Function
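One way to reduce the double-conversion risk described in the post would be to test a string before passing it to UniToAnsi. The sketch below is not part of the original code, and the name LooksCorrected is made up for this example. It rests on an assumption: characters in the range &H80 to &H9F rarely occur in text that has not yet been converted, so finding one suggests the string has already been corrected. A corrected string that contains no bytes in that range is not detected, but the sample output in the post suggests such strings are not damaged by a second pass.

```vb
' Hypothetical guard (not from the original post): returns True if the string
' contains a character in the range &H80 to &H9F, which suggests that it has
' already been converted to one byte value per character.
Public Function LooksCorrected(ByVal sText As String) As Boolean
    Dim lPos As Long
    Dim lCode As Long
    For lPos = 1 To Len(sText)
        ' AscW returns a signed Integer; mask it to a positive Long.
        lCode = AscW(Mid$(sText, lPos, 1)) And &HFFFF&
        If lCode >= &H80& And lCode <= &H9F& Then
            LooksCorrected = True
            Exit Function
        End If
    Next lPos
End Function
```

A caller could then skip the conversion when the check succeeds, for example with `If Not LooksCorrected(sData) Then bData = UniToAnsi(sData)`, where sData and bData are placeholder names.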