Currently SSE3 and SSE4.1 code is mixed together with generic
code in a single file. This introduces the risk that the
compiler accidentally emits SSE4.1 instructions in an SSE3 code
path or, even worse, in a generic code path.
This commit splits the SSE3 and SSE4.1 code into separate files
and compiles them with the matching target options.
Change-Id: I846e190e92f1258cd412d1b2d79b539e204e04b3
The current implementation can select the SSE support level at
compile time only.
This commit adds runtime detection of the SSE level supported by
the CPU and automatically switches the implementation if the CPU
does not support the required SSE level.
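A minimal sketch of what such runtime detection can look like,
assuming GCC or Clang on x86 and their __builtin_cpu_supports()
builtin. The enum and function names are hypothetical and are not
taken from this commit; on other compilers or architectures the
sketch simply reports no SSE support.

```c
#include <stdio.h>

/* Hypothetical support levels, ordered from least to most capable. */
enum simd_level {
	SIMD_NONE = 0,
	SIMD_SSE3 = 1,
	SIMD_SSE4_1 = 2,
};

/* Return the highest SSE level the running CPU reports (assumed helper). */
static enum simd_level detect_simd_level(void)
{
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
	/* Must be called before __builtin_cpu_supports() outside of main(). */
	__builtin_cpu_init();
	if (__builtin_cpu_supports("sse4.1"))
		return SIMD_SSE4_1;
	if (__builtin_cpu_supports("sse3"))
		return SIMD_SSE3;
#endif
	/* Unknown compiler or non-x86 target: fall back to generic code. */
	return SIMD_NONE;
}

/* Human-readable name for logging the selected level. */
static const char *simd_level_name(enum simd_level level)
{
	switch (level) {
	case SIMD_SSE4_1:
		return "SSE4.1";
	case SIMD_SSE3:
		return "SSE3";
	default:
		return "generic";
	}
}
```

A caller would typically run detect_simd_level() once at startup,
log simd_level_name() of the result, and pick the implementation
accordingly.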
Change-Id: Iba74f8a6e4e921ff31e4bd9f0c7c881fe547423a
The non-SSE and SSE implementations of the convert and convolve
functions have different parameter lists. This makes it
difficult to use function pointers to select the right
function depending on the SSE level and CPU.
This commit unifies the parameter lists in preparation for the
planned runtime CPU detection support.
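To illustrate why a common parameter list matters, here is a small
sketch of function-pointer dispatch. All names and the signature are
hypothetical, not the ones used in this code base, and the
"optimized" variant is plain C standing in for an SSE routine.

```c
#include <stddef.h>

/* One shared signature lets every variant sit behind the same pointer. */
typedef void (*convert_fn)(float *out, const short *in, int len);

/* Generic variant: convert one sample per iteration. */
static void convert_generic(float *out, const short *in, int len)
{
	for (int i = 0; i < len; i++)
		out[i] = (float)in[i];
}

/* Stand-in for an SSE variant: four samples per iteration plus a tail. */
static void convert_unrolled(float *out, const short *in, int len)
{
	int i = 0;

	for (; i + 4 <= len; i += 4) {
		out[i + 0] = (float)in[i + 0];
		out[i + 1] = (float)in[i + 1];
		out[i + 2] = (float)in[i + 2];
		out[i + 3] = (float)in[i + 3];
	}
	/* Handle the remaining 0..3 samples. */
	for (; i < len; i++)
		out[i] = (float)in[i];
}

/* Pick an implementation once; callers only ever see convert_fn. */
static convert_fn select_convert(int have_simd)
{
	return have_simd ? convert_unrolled : convert_generic;
}
```

Because both variants share the same signature, the selection can be
made once at startup and the call sites do not need to know which
implementation is active.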
Change-Id: Ice063b89791537c4b591751f12f5ef5c413a2d27
An errant shuffle register value used in complex-complex convolution
causes distorted correlation peak-to-average values for certain TSC
values. The effect varies between TSC sequences: detection is
most noticeably degraded on TSC 1, while TSC 7 is not affected.
Signed-off-by: Thomas Tsou <tom@tsou.cc>
Move x86-specific files into their own directory as this
area is about to get crowded with the addition of ARM
support.
Signed-off-by: Thomas Tsou <tom@tsou.cc>