stristr, case-insensitive string search
Download source code and demo - 27 Kb
Introduction
I can't count how many times I have grumbled about the lack of a case-insensitive implementation of strstr. A nice benefit of the implementation presented here is that not only can the function be used as is, it can also be added to an MFC CString-derived class, for instance, or P/Invoked from C# code, and so on. In the past, any time I had to perform a case-insensitive lookup, I would resort to one of two kinds of horrible workarounds (and that's a euphemism), only because no such function or method existed.
All of this is OVER. What's odd is that not only does the C run-time not bring such a function to the game, there is no equivalent in the MFC CString class, in the STL basic_string, or in the C# String class either. Simply put, amazing!
x86 assembly source code
Here we go with the code for this implementation of stristr:
#pragma warning(disable : 4035)
// stristr /////////////////////////////////////////////////////////
//
// performs a case-insensitive lookup of a string within another
// (see C run-time strstr)
//
// str1 : buffer
// str2 : string to search for in the buffer
//
// example char* s = stristr("Make my day","DAY");
//
// S.Rodriguez, Jan 11, 2004
//
char* stristr(char* str1, char* str2)
{
    __asm
    {
        mov ah, 'A'
        mov dh, 'Z'
        mov esi, str1
        mov ecx, str2
        mov dl, [ecx]
        test dl,dl ; empty str2?
        jz short str2empty_label
    outerloop_label:
        mov ebx, esi ; save esi
        inc ebx
    innerloop_label:
        mov al, [esi]
        inc esi
        test al,al
        je short str2notfound_label ; end of str1 reached, not found!
        cmp dl,ah ; 'A'
        jb short skip1
        cmp dl,dh ; 'Z'
        ja short skip1
        add dl,'a' - 'A' ; make lowercase the current character in str2
    skip1:
        cmp al,ah ; 'A'
        jb short skip2
        cmp al,dh ; 'Z'
        ja short skip2
        add al,'a' - 'A' ; make lowercase the current character in str1
    skip2:
        cmp al,dl
        je short onecharfound_label
        mov esi, ebx ; restore esi value, +1
        mov ecx, str2 ; restore ecx value as well
        mov dl, [ecx]
        jmp short outerloop_label ; search from start of str2 again
    onecharfound_label:
        inc ecx
        mov dl,[ecx]
        test dl,dl
        jnz short innerloop_label
        jmp short str2found_label ; found!
    str2empty_label:
        mov eax, esi // empty str2 ==> return str1
        jmp short ret_label
    str2found_label:
        dec ebx
        mov eax, ebx // str2 found ==> return occurrence within str1
        jmp short ret_label
    str2notfound_label:
        xor eax, eax // str2 not found ==> return NULL
        jmp short ret_label
    ret_label:
    }
    // no return statement: the result is left in eax (hence warning 4035 is disabled)
}
#pragma warning(default : 4035)
Using it in your own source code
The demo code provided in the zip package contains a benchmark suite that finds occurrences of a given string in an HTML file in a couple of ways. The benchmark's only purpose is to find out whether the additional code branches in stristr carry a measurable cost. I also provide a reference C-based case-insensitive search. Since it's not written in assembly, it may not compile into code that keeps as much of its state in registers rather than in memory. That is where a CPU with a large L2 cache would come into play.
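For readers who cannot build the inline-assembly version (it relies on the 32-bit MSVC `__asm` syntax), here is a minimal portable sketch of what a C-based reference could look like. It is not the code shipped in the zip package: the names `stristr_ref` and `count_occurrences` are hypothetical, and `tolower` is assumed to run in the default "C" locale, where it folds only 'A' to 'Z' like the assembly above.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical portable reference, NOT the author's bundled version.
 * Returns a pointer to the first case-insensitive occurrence of
 * needle in haystack, or NULL if there is none. */
const char *stristr_ref(const char *haystack, const char *needle)
{
    if (*needle == '\0')
        return haystack;            /* empty needle ==> return haystack, like strstr */

    for (; *haystack != '\0'; ++haystack) {
        const char *h = haystack;
        const char *n = needle;

        /* Compare lowercased characters until a mismatch or the end of needle.
         * The cast avoids undefined behavior for negative char values. */
        while (*n != '\0' &&
               tolower((unsigned char)*h) == tolower((unsigned char)*n)) {
            ++h;
            ++n;
        }
        if (*n == '\0')
            return haystack;        /* whole needle matched at this position */
    }
    return NULL;                    /* not found */
}

/* Hypothetical helper in the spirit of the benchmark: counts the
 * (possibly overlapping) occurrences of needle in text. */
size_t count_occurrences(const char *text, const char *needle)
{
    size_t count = 0;
    const char *p = text;

    if (*needle == '\0')
        return 0;                   /* avoid counting an empty needle forever */

    while ((p = stristr_ref(p, needle)) != NULL) {
        ++count;
        ++p;                        /* resume one character after the match start */
    }
    return count;
}
```

A call such as `stristr_ref("Make my day", "DAY")` returns a pointer to the final `day`, matching the example in the header comment of the assembly version. Whether the compiler keeps `h` and `n` in registers depends on the optimization settings, so timings against the assembly version should be taken from an optimized build.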
Enjoy!
Stephane Rodriguez - February 6, 2004.