Visual Studio Code - A way to ensure std::vector is always aligned for optimal SIMD execution?

user19179144
January 7, 2024
336 views
3 votes
3 Answers

I want to have X std::vectors of equal size that can be processed together in a single for loop running linearly from start to finish. For example:

for (int i = 0; i < vector_length; i++)
    vector1[i] = vector2[i] + vector3[i] * vector4[i];

I want all this to take full advantage of SIMD instructions. For this to happen, the compiler should be able to assume that each of the vectors is aligned optimally for __m256 use. If the compiler can't assume this, it may generate all sorts of non-optimal loops.

How do I ensure this optimal alignment of std::vectors and optimal code generation for such aligned data?

It can be assumed that all vectors hold elements of the same type, which can be added/multiplied together using standard SIMD instructions.

I’m using C++17.

MORE INFORMATION AS REQUESTED BY THE PEOPLE HERE:

32 bytes of alignment is good for my use.

I want to get this running on Intel Macs and PCs (Xcode + Visual Studio), and later on ARM Macs (Xcode again) once I get one of those computers.

Tags: c++ c++17 optimization simd stdvector

Answers

Chosen as BEST ANSWER
- user19179144
- January 7, 2024 at 5:10 pm
- 0 votes
As a couple of people pointed out, there's a related question that shows how to make sure the memory owned by the std::vector is properly aligned in the first place:

Modern approach to making std::vector allocate aligned memory

That combined with __attribute__((aligned(ALIGNMENT_IN_BYTES))) added to the method parameters (pointers) seems to do the trick. Example:
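```
#include <cstdint>

// Assumed value: 32 bytes, as stated in the question (enough for __m256).
#define ALIGNMENT_IN_BYTES 32

void Process(__attribute__((aligned(ALIGNMENT_IN_BYTES))) const uint8_t* p_source1,
             __attribute__((aligned(ALIGNMENT_IN_BYTES))) const uint8_t* p_source2,
             __attribute__((aligned(ALIGNMENT_IN_BYTES))) uint8_t*       p_destination,
             const int      count)
{
    // Element-wise sum; the compiler is free to vectorize this loop.
    for (int i = 0; i < count; i++)
        p_destination[i] = p_source1[i] + p_source2[i];
}
```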
That seems to compile nicely (checked on Godbolt), so the compiler apparently assumes it can simply use large registers to process the data with SIMD instructions.
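For the Visual Studio side, note that MSVC does not accept the `__attribute__` syntax. Another way to state the same alignment assumption on GCC and Clang is `__builtin_assume_aligned`. Below is a minimal sketch of that variant, not a definitive implementation: `AssumeAligned` and `kAlignment` are names made up for this example, and on compilers without the builtin the helper only checks the claim with an assert in debug builds. (C++20 adds `std::assume_aligned`, but that is not available in C++17.)

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical constant: 32 bytes, as stated in the question.
constexpr std::size_t kAlignment = 32;

// Hypothetical helper: promises GCC/Clang that p is kAlignment-aligned.
// On other compilers it returns p unchanged; the assert still catches
// a broken promise in debug builds.
template <typename T>
inline T* AssumeAligned(T* p)
{
    assert(reinterpret_cast<std::uintptr_t>(p) % kAlignment == 0);
#if defined(__GNUC__) || defined(__clang__)
    return static_cast<T*>(__builtin_assume_aligned(p, kAlignment));
#else
    return p;
#endif
}

void Process(const std::uint8_t* p_source1,
             const std::uint8_t* p_source2,
             std::uint8_t*       p_destination,
             const int           count)
{
    // Re-bind the pointers once, carrying the alignment promise.
    const std::uint8_t* s1 = AssumeAligned(p_source1);
    const std::uint8_t* s2 = AssumeAligned(p_source2);
    std::uint8_t*       d  = AssumeAligned(p_destination);

    for (int i = 0; i < count; i++)
        d[i] = static_cast<std::uint8_t>(s1[i] + s2[i]);
}
```

Passing a pointer that is not actually aligned is undefined behaviour once the promise is made, so this only makes sense together with an allocation scheme that really guarantees the alignment.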

Thank you everyone!


- JVApen
- January 7, 2024 at 3:31 pm
- 0 votes
The only way to control the allocation of std::vector is by replacing the allocator. Boost has an implementation that ensures alignment: https://www.boost.org/doc/libs/1_84_0/doc/html/align/reference.html#align.reference.classes
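If pulling in Boost is not an option, a hand-written allocator is fairly short in C++17, because the aligned overloads of `operator new` and `operator delete` (taking `std::align_val_t`) are part of the standard library. The following is a minimal sketch under that assumption; `AlignedAllocator` and `AlignedVector` are hypothetical names for this example, not something from Boost or the standard library.

```cpp
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical minimal allocator: every allocation starts on an
// Alignment-byte boundary.
template <typename T, std::size_t Alignment>
struct AlignedAllocator
{
    static_assert((Alignment & (Alignment - 1)) == 0,
                  "Alignment must be a power of two");
    static_assert(Alignment >= alignof(T),
                  "Alignment must not be weaker than alignof(T)");

    using value_type = T;

    AlignedAllocator() noexcept = default;

    template <typename U>
    AlignedAllocator(const AlignedAllocator<U, Alignment>&) noexcept {}

    // Explicit rebind is required because Alignment is a non-type
    // template parameter, which the default rebind cannot handle.
    template <typename U>
    struct rebind { using other = AlignedAllocator<U, Alignment>; };

    T* allocate(std::size_t n)
    {
        if (n > static_cast<std::size_t>(-1) / sizeof(T))
            throw std::bad_array_new_length();
        // C++17 aligned allocation function.
        return static_cast<T*>(
            ::operator new(n * sizeof(T), std::align_val_t(Alignment)));
    }

    void deallocate(T* p, std::size_t) noexcept
    {
        // Must be released with the matching aligned overload.
        ::operator delete(p, std::align_val_t(Alignment));
    }
};

// The allocator is stateless, so all instances compare equal.
template <typename T, typename U, std::size_t A>
bool operator==(const AlignedAllocator<T, A>&,
                const AlignedAllocator<U, A>&) noexcept { return true; }

template <typename T, typename U, std::size_t A>
bool operator!=(const AlignedAllocator<T, A>&,
                const AlignedAllocator<U, A>&) noexcept { return false; }

// Convenience alias with the 32-byte alignment asked for in the question.
template <typename T>
using AlignedVector = std::vector<T, AlignedAllocator<T, 32>>;
```

With this, `AlignedVector<float> v(n);` gives a `v.data()` on a 32-byte boundary, including after reallocation. Keep in mind that it is a different type from `std::vector<float>`, so the two cannot be assigned to each other directly.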


- ABaumstumpf
- January 7, 2024 at 5:29 pm
- 0 votes
Is the size of the data known beforehand, or are you using any buffers? Because then you could just use a normal array with alignas.
And for using SIMD instructions, you could use valarray. That and vector both internally use malloc, which in turn is guaranteed to respect the type's alignment.

So std::vector<__m256i> mySIMDVector; is aligned.
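One caveat on the reasoning: malloc on its own only promises alignment suitable for fundamental types (`alignof(std::max_align_t)`). What makes a vector of an over-aligned element type work in C++17 is that `std::allocator` goes through the aligned form of `operator new` for such types. Here is a small sketch of both suggestions, using a hypothetical `Float8` struct as a portable stand-in for `__m256`, plus a made-up `IsAligned32` helper to check the result:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical over-aligned element type: 8 floats on a 32-byte boundary.
struct alignas(32) Float8
{
    float lane[8];
};

static_assert(alignof(Float8) == 32, "Float8 should be 32-byte aligned");
static_assert(sizeof(Float8) == 32, "Float8 should have no padding");

// Size known beforehand: alignas applied directly to a plain array.
constexpr std::size_t kCount = 1024;
alignas(32) float g_buffer[kCount];

// Hypothetical helper: true if p sits on a 32-byte boundary.
inline bool IsAligned32(const void* p)
{
    return reinterpret_cast<std::uintptr_t>(p) % 32 == 0;
}

// Size known only at run time: in C++17, std::allocator<Float8> honours
// alignof(Float8), so the vector's storage is 32-byte aligned.
inline std::vector<Float8> MakeBlocks(std::size_t block_count)
{
    return std::vector<Float8>(block_count);
}
```

The trade-off is that the vector is then indexed in blocks of 8 floats rather than in single floats, which changes how the loop from the question has to be written.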
