Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
menu search
person
Welcome To Ask or Share your Answers For Others

Categories

If I have an AVX register with 4 doubles in them and I want to store the reverse of this in another register, is it possible to do this with a single intrinsic command?

For example: If I had 4 floats in a SSE register, I could use:

_mm_shuffle_ps(A,A,_MM_SHUFFLE(0,1,2,3));

Can I do this using, maybe _mm256_permute2f128_pd()? I don't think you can address each individual double using the above intrinsic.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
820 views
Welcome To Ask or Share your Answers For Others

1 Answer

You actually need 2 permutes to do this:

  • _mm256_permute2f128_pd() only permutes in 128-bit chunks.
  • _mm256_permute_pd() does not permute across 128-bit boundaries.

So you need to use both:

inline __m256d reverse(__m256d x){
    x = _mm256_permute2f128_pd(x,x,1);
    x = _mm256_permute_pd(x,5);
    return x;
}

Test:

int main(){
    __m256d x = _mm256_set_pd(13,12,11,10);

    cout << x.m256d_f64[0] << "  " << x.m256d_f64[1] << "  " << x.m256d_f64[2] << "  " << x.m256d_f64[3] << endl;
    x = reverse(x);
    cout << x.m256d_f64[0] << "  " << x.m256d_f64[1] << "  " << x.m256d_f64[2] << "  " << x.m256d_f64[3] << endl;
}

Output:

10  11  12  13
13  12  11  10

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
...