Commit Graph

2485 Commits

Author SHA1 Message Date
gnzlbg
45f1e63e15 remove unnecessary fixme 2017-12-13 10:19:09 -05:00
gnzlbg
878fd5b5d9 [avx] document intrinsics that don't correspond to an instruction 2017-12-13 10:19:09 -05:00
gnzlbg
baab3ad7f1 move __m256i to the v256 module 2017-12-13 10:19:09 -05:00
gnzlbg
ae6cff53c7 rework impl of __m64 and __m128i 2017-12-13 10:19:09 -05:00
gnzlbg
dd9a3f92ff move __m128i to the v128 module 2017-12-13 10:19:09 -05:00
gnzlbg
5fb068f74c move __m64 to the v64 module 2017-12-13 10:19:09 -05:00
gnzlbg
8c13c1e4a3 [ssse3] _mm_alignr_pi8 (#235) 2017-12-12 12:57:22 -06:00
Luca Barbato
baace2fc3f Initial PowerPC support
Rely mainly on parsing auxv since the cpuinfo information is incomplete.
2017-12-12 11:54:49 +01:00
Luca Barbato
f775bf3931 Extract the cpu capabilities from the auxiliary vector
Check for neon/asimd and pmull for arm and aarch64.
2017-12-12 11:54:49 +01:00
Luca Barbato
f49009e22c Unbreak detect_features for arm and aarch64 2017-12-12 11:54:49 +01:00
gwenn
4950bfed1a sse2: _mm_stream_* (#228)
* sse2: _mm_stream_si128,si32,pd,si64

* sse2: _mm_stream_* tests

* Disable assert_instr for _mm_stream_si64
2017-12-10 09:11:03 -06:00
gwenn
c2e4bb2e4c sse: __m64 related intrinsics (#230)
* sse: add missing aliases

_m_pextrw, _m_pinsrw, _m_pmovmskb, _m_pshufw

* sse: _mm_maskmove_si64, _m_maskmovq

* sse: _mm_mulhi_pu16, _m_pmulhuw

* sse: _mm_avg_pu8, _m_pavgb

* sse: _mm_avg_pu16, _m_pavgw

* sse: _mm_sad_pu8, _m_psadbw

* sse: _mm_cvtpi32_ps

* sse: _mm_cvtpi32x2_ps
2017-12-10 09:04:02 -06:00
gwenn
cbd52b05c1 Sse (#225)
* sse: _mm_cvt_pi2ps

* sse: _mm_extract_pi16

* sse: _mm_insert_pi16

* sse: _mm_movemask_pi8

* sse: _mm_shuffle_pi16

* sse: fix _mm_insert_pi16 and _mm_extract_pi16

* sse: add tests
2017-12-09 11:20:44 -06:00
gwenn
81630ea994 avx: _mm256_stream_si256, _mm256_stream_pd, _mm256_stream_ps (#227) 2017-12-09 11:20:30 -06:00
gwenn
0f53193641 sse2: _mm_movepi64_pi64, _mm_movpi64_epi64, _mm_cvtpd_pi32, _mm_cvttpd_pi32 2017-12-09 08:23:41 -05:00
gwenn
fcf106e685 ssse3 (#224)
* ssse3: _mm_abs_pi8 failing

Intrinsic has incorrect return type!
<8 x i8> (<8 x i8>)* @llvm.x86.ssse3.pabs.b

* Introduce a x86_mmx type

And make it compatible with i8x8 and u8x8.
Alex suggested to change the i8x8 declaration as:
```
struct i8x8(i64);
```
But I don't see how to make it compatible with the
existing code/macros.

* ssse3: _mm_abs_pi16, _mm_abs_pi32, _mm_shuffle_pi8

* ssse3: _mm_abs_pi16, _mm_abs_pi32, _mm_shuffle_pi8 tests

* Replace x86_mmx by __m64

* ssse3: _mm_sign_pi8, _mm_sign_pi16, _mm_sign_pi32

* ssse3: _mm_mulhrs_pi16

* ssse3: _mm_maddubs_pi16

* ssse3: _mm_hsub_pi16, _mm_hsub_pi32, _mm_hsubs_pi16

* ssse3: _mm_hadd_pi16, _mm_hadd_pi32, _mm_hadds_pi16

* Move some ssse3 intrinsics from i586 to i686
2017-12-03 11:53:36 -06:00
gnzlbg
6461312210 [ci] test i686-apple-darwin (#221)
* [ci] test i686-apple-darwin

* fix overflow on i686-apple-darwin
2017-11-28 17:09:38 -07:00
gnzlbg
8a92a566c9 [sse] _mm_stream_{ps,pi} (#219) 2017-11-28 07:48:26 -08:00
gnzlbg
288a30a93e add mmx module, mmx run-time detection, intrinsics (#220)
* [sse] _mm_cvtps_pi32, _mm_cvt_ps2pi

* [mmx] run-time detection support

* [x86] add mmx module

* [x86] make __m64 public

* [sse] add _mm_cvtps_pi{8,16}, _mm_cvttps_pi32, _mm_cvtt_ps2pi

* move new intrinsics from i586 to i686 module

* mmx requires i686
2017-11-28 07:45:41 -08:00
gnzlbg
ef847ac83b [sse] add _mm_{min, max}_{pi16, pu8} (#218)
* [sse] add _mm_{min, max}_{pi16, pu8}

* format docs
2017-11-27 14:54:28 -08:00
gnzlbg
b8a4b397ad update docs (#217)
* update docs

* cargo clean deletes previous docs

* remove stdsimd from coresimd examples

* use stdsimd instead of coresimd in core docs

* add stdsimd as a dev-dependency of coresimd
2017-11-27 10:47:23 -08:00
Tony Sifkarovski
40a0b1cc92 [avx2] add shuffle, insert/extract i128, permute* (#210)
* [x86][avx2] add _mm256_shuffle{hi,lo}_epi16
* [x86][avx2] add _mm256_{insert,extract}i128_si256
* [x86][avx2] add remaining permute intrinsics
2017-11-26 17:40:26 +01:00
gnzlbg
426621f021 Add FXSAVE/FXRSTOR, update Intel SDE, fix xsave tests (#205)
* [x86] add run-time detection for fxsr
* [x86] add i386 fxsr intrinsics: FXSAVE,FXRSTOR
* [x86_64] add x86_64 fxsr intrinsics: FXSAVE64/FXRSTOR64
* [x86-runtime]: document xsave detection further
* [x86] disable xsaves and xsaves64 tests
2017-11-22 15:25:15 +01:00
gnzlbg
20529701d8 Fix clippy and rust-fmt. 2017-11-22 13:42:58 +01:00
Alex Crichton
922345c005 Use workspaces and fix tests
* Enable a Cargo workspace for the repo
* Disable tests for proc-macro crates
* Move back to mounting source directory read-only
* Refactor test invocation to only test one crate with `--all`
2017-11-22 13:42:58 +01:00
gnzlbg
86fa377cea Only coresimd depends on stdsimd-test. 2017-11-22 13:42:58 +01:00
gnzlbg
cb9888f802 [ci] flag the documentation build bot 2017-11-22 13:42:58 +01:00
gnzlbg
b940d3311a fix doc script 2017-11-22 13:42:58 +01:00
gnzlbg
6a0a55f01a c_void -> *mut u8 2017-11-22 13:42:58 +01:00
gnzlbg
14d0903309 refactor no_std components into the coresimd crate 2017-11-22 13:42:58 +01:00
Adam Niederer
dc9f076480 Add AVX2 gathers (#202)
* Add _mm_[mask_]gatheri32_epi32
* Add _mm[256][_mask]_i32gather_{epi64, pd}
* Add _mm[256][_mask]_gather_ps
* Add _mm[256][_mask]_i64gather_{epi32, epi64, ps, pd}
2017-11-22 09:26:12 +01:00
gnzlbg
2faf11ab44 [readme] point always to latests docs (#206) 2017-11-21 15:05:46 -06:00
gnzlbg
0129d3be76 [nvptx] enable nvptx only when all other targets are disabled (#208)
Closes #207.
2017-11-21 15:05:05 -06:00
Alex Crichton
8356754fe7 Fix hygiene in various macros (#204) 2017-11-21 12:54:06 -06:00
gnzlbg
bd629147a1 Upgrade to cupid 0.0.5 and cleanup duplicated code in x86 run-time (#203)
* [ci] upgrade to cupid 0.0.5

* [runtime x86] cleanup duplicated code
2017-11-21 08:46:36 -06:00
Jonathan Goodman
f236ef8f6b Fix comments for _mm_cvtepu8_epi{32, 64} (#200) 2017-11-20 16:55:48 -06:00
Alex Crichton
738312d17c Unconditionally flag as #![no_std] (#196)
This is more idiomatic for no-std-compatible crates where imports are
unconditionally rewritten to `core` and then only when necessary `std` is pulled
in explicitly.
2017-11-19 19:53:02 +01:00
gnzlbg
0d11a78a0e refactor the x86 module (#195)
* refactor the x86 module

* document the i686 check

* document strict and intel_sde feature

* document nvptx module
2017-11-19 19:51:53 +01:00
gnzlbg
ff1b88d721 [clippy] fix missing doc on pub item 2017-11-17 17:41:23 +01:00
gnzlbg
ceef91aaba [arm] runtime-detection support 2017-11-17 17:41:23 +01:00
gnzlbg
fe7da57403 [ci] add intel_sde feature 2017-11-17 17:41:23 +01:00
gnzlbg
9e7242ecad [cpuid] Improve docs, implement __get_cpuid_max
Closes #174.
2017-11-17 17:41:23 +01:00
gnzlbg
00cf3c05eb [x86] cleanup run-time; add SSE4a, AVX-512, and xsave 2017-11-17 17:41:23 +01:00
gnzlbg
9ac630245d [x86] implement cpuid intrinsics 2017-11-17 17:41:23 +01:00
gnzlbg
2fc4c25972 [x86] implement __read/write eflags 2017-11-17 17:41:23 +01:00
gnzlbg
05dd98c643 [x86] implement xsave intrinsics 2017-11-17 17:41:23 +01:00
gnzlbg
7c593d9857 [stdsimd-test] testing conditional on more than one feature 2017-11-17 17:41:23 +01:00
gnzlbg
2136214934 add support for no_std 2017-11-17 16:52:05 +01:00
gnzlbg
fda2ead377 add nvptx architecture 2017-11-17 16:52:05 +01:00
André Oliveira
33e26c0b4a Formatting 2017-11-17 16:42:35 +01:00