We have run into a problem after updating to 2015.1.148.
With the combination of /arch:SSE3 and /O2, a vectorized loop that fills an array no longer updates the returned data on the second call.
Lowering the optimization level to /O1 or changing /arch to SSE4.1 or higher solves the problem.
In those working configurations, the optrpt also shows the loop under <Remainder> as vectorized.
We worked around it by using /QaxAVX /arch:SSE4.1, which means we dropped support for older processors.
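
For illustration only, here is a minimal sketch of the kind of pattern described above: a simple fill loop that is called twice on the same buffer, followed by a check that the second call really overwrote the data. This is not the original code. The language (C), the names fill_array, count_mismatches and check_second_call, and the size N are all assumptions, and the sketch is not confirmed to trigger the problem.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical size: not a multiple of 4, so a 4-wide float
   vectorization of the loop leaves a remainder loop. */
#define N 37

/* Fill out[0..n-1] with seed + i; a simple candidate for auto-vectorization. */
void fill_array(float *out, int n, float seed)
{
    assert(out != NULL);
    for (int i = 0; i < n; ++i)
        out[i] = seed + (float)i;
}

/* Count the elements that differ from what fill_array(data, n, seed)
   should have written. */
int count_mismatches(const float *data, int n, float seed)
{
    int bad = 0;
    for (int i = 0; i < n; ++i)
        if (data[i] != seed + (float)i)
            ++bad;
    return bad;
}

/* Fill the same buffer twice with different seeds and return the number
   of stale elements after the second call (0 means the data was updated). */
int check_second_call(void)
{
    float data[N];
    fill_array(data, N, 1.0f);    /* first call */
    fill_array(data, N, 100.0f);  /* second call must overwrite everything */
    return count_mismatches(data, N, 100.0f);
}
```

Calling check_second_call() from main and building once with /O2 /arch:SSE3 and once with /O2 /arch:SSE4.1 would show whether the two builds differ. A non-zero result means the second call left stale data behind.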