Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
menu search
person
Welcome To Ask or Share your Answers For Others

Categories

I've tried to compile this program on an x64 computer:

#include <cstring>

int main(int argc, char* argv[])
{
  return ::std::strcmp(argv[0],
    "really really really really really really really really really"
    "really really really really really really really really really"
    "really really really really really really really really really"
    "really really really really really really really really really"
    "really really really really really really really really really"
    "really really really really really really really really really"
    "really really really really really really really really really"
    "really really really really really really really really really"
    "really really really really really really really long string"
  );
}

I compiled it like this:

g++ -std=c++11 -msse2 -O3 -g a.cpp -o a

But the resulting disassembly is like this:

   0x0000000000400480 <+0>:     mov    (%rsi),%rsi
   0x0000000000400483 <+3>:     mov    $0x400628,%edi
   0x0000000000400488 <+8>:     mov    $0x22d,%ecx
   0x000000000040048d <+13>:    repz cmpsb %es:(%rdi),%ds:(%rsi)
   0x000000000040048f <+15>:    seta   %al
   0x0000000000400492 <+18>:    setb   %dl
   0x0000000000400495 <+21>:    sub    %edx,%eax
   0x0000000000400497 <+23>:    movsbl %al,%eax
   0x000000000040049a <+26>:    retq 

Why is no SIMD used? I suppose it could be to compare, say, 16 chars at once. Should I write my own SIMD strcmp, or is it a nonsensical idea for some reason?

question from:https://stackoverflow.com/questions/26586060/why-is-strcmp-not-simd-optimized

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
239 views
Welcome To Ask or Share your Answers For Others

1 Answer

In a SSE2 implementation, how should the compiler make sure that no memory accesses happen over the end of the string? It has to know the length first and this requires scanning the string for the terminating zero byte.

If you scan for the length of the string you have already accomplished most of the work of a strcmp function. Therefore there is no benefit to use SSE2.

However, Intel added instructions for string handling in the SSE4.2 instruction set. These handle the terminating zero byte problem. For a nice write-up on them read this blog-post:

http://www.strchr.com/strcmp_and_strlen_using_sse_4.2


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
...