Libcperciva import #306

Merged
merged 3 commits into from
Feb 24, 2021

gperciva
Member

No description provided.

Old compilers (clang before 2016, gcc before 2019) didn't implement the
_mm_loadu_si64() function.

As for the compiler flags: in some cases, all we need is
    -maes -DBROKEN_MM_LOADU_SI64
but sometimes we might need
    -maes -Wno-cast-align -DBROKEN_MM_LOADU_SI64
or
    -maes -Wno-cast-align -Wno-cast-qual -DBROKEN_MM_LOADU_SI64
or even
    -maes -Wno-cast-align -Wno-cast-qual -Wno-missing-prototypes -DBROKEN_MM_LOADU_SI64

However, the more tests we add to cpusupport.sh, the longer it takes.
I figured that since this is only relevant to old compilers, we might as
well give it the full set of flags, so that newer compilers don't spend
time checking it.

Abstracting this function will be useful in the following commit.

As it happens, both intrinsics have the same "Operation" description in
the Intel Intrinsics Guide [1] (provided that we interpret the "MAX" bit
as 127, which I think is fair).

_mm_loadu_si64:
    dst[63:0] := MEM[mem_addr+63:mem_addr]
    dst[MAX:64] := 0

_mm_load_sd:
    dst[63:0] := MEM[mem_addr+63:mem_addr]
    dst[127:64] := 0

[1] https://software.intel.com/sites/landingpage/IntrinsicsGuide/#techs=SSE,SSE2&expand=3340,3340,3421,3421,3340,3421,3340&cats=Load
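
To make the comparison above concrete, here is a minimal sketch (not code from this PR) that models the shared "Operation" pseudocode in plain C and checks that `_mm_load_sd` produces the same 128 bits. The helper names `model_load_low64`, `load_sd_low64`, and `loads_agree` are hypothetical, and the sketch assumes an x86 target with SSE2 enabled.

```c
#include <emmintrin.h>
#include <stdint.h>
#include <string.h>

/*
 * Plain-C model of the "Operation" pseudocode (hypothetical helper):
 * dst[63:0] comes from memory, dst[127:64] is zero.
 */
static void
model_load_low64(const void * mem, uint8_t out[16])
{

	memset(out, 0, 16);
	memcpy(out, mem, 8);
}

/* The same load done with _mm_load_sd, viewed as an integer vector. */
static void
load_sd_low64(const void * mem, uint8_t out[16])
{
	double d;
	__m128i v;

	/* Copy through a double so _mm_load_sd gets a typed, aligned pointer. */
	memcpy(&d, mem, sizeof(d));
	v = _mm_castpd_si128(_mm_load_sd(&d));
	_mm_storeu_si128((__m128i *)out, v);
}

/* Return 1 if the model and _mm_load_sd agree on all 128 bits. */
static int
loads_agree(const void * mem)
{
	uint8_t a[16];
	uint8_t b[16];

	model_load_low64(mem, a);
	load_sd_low64(mem, b);
	return (memcmp(a, b, 16) == 0);
}
```

Calling `loads_agree()` on any 8-byte buffer should return 1 if the two descriptions really match; this only checks the resulting bits, not which instruction the compiler emits.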

The actual assembly instructions vary, as do the "description" fields --
for _mm_loadu_si64, there's no textual definition of the upper 64 bits.

Interestingly, it looks like gcc7 and gcc8 both compile our load_64()
function into movq (which is _mm_loadu_si64), rather than movsd (which
is _mm_load_sd).
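
As a rough illustration of how the pieces discussed in this thread could fit together, the sketch below wraps the 64-bit load in one function and selects the fallback with the `BROKEN_MM_LOADU_SI64` macro. It is an assumption-laden sketch, not the actual `load_64()` from the repository: the names `load_64_sketch` and `to_bytes` are hypothetical, and an x86 target with SSE2 is assumed.

```c
#include <emmintrin.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical wrapper: load 8 (possibly unaligned) bytes into the low
 * half of an __m128i.  BROKEN_MM_LOADU_SI64 is the macro named in the
 * compiler flags above; everything else here is illustrative.
 */
static inline __m128i
load_64_sketch(const void * mem)
{
#ifdef BROKEN_MM_LOADU_SI64
	double d;

	/* Old compiler: fall back to _mm_load_sd (same "Operation"). */
	memcpy(&d, mem, sizeof(d));
	return (_mm_castpd_si128(_mm_load_sd(&d)));
#else
	/* Newer compiler: use the intrinsic directly. */
	return (_mm_loadu_si64(mem));
#endif
}

/* Hypothetical helper for inspection: dump all 128 bits as bytes. */
static inline void
to_bytes(__m128i v, uint8_t out[16])
{

	_mm_storeu_si128((__m128i *)out, v);
}
```

Callers would use `load_64_sketch()` everywhere and never mention either intrinsic directly, so the workaround stays in one place. Building with `-DBROKEN_MM_LOADU_SI64` exercises the fallback branch.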
cperciva merged commit 1376f0a into master on Feb 24, 2021
gperciva deleted the libcperciva-import branch on February 24, 2021, 01:31

2 participants