2485 Commits

Author SHA1 Message Date
André Oliveira
093029a6c3 Make test intrinsics use __m128i 2017-11-17 16:42:35 +01:00
André Oliveira
9d8c2639c1 Change _mm_mpsadbw_epu8 to work with unsigned integers 2017-11-17 16:42:35 +01:00
André Oliveira
7b0e7c6f52 Add _mm_mpsadbw_epu8 2017-11-17 16:42:35 +01:00
André Oliveira
b85d4f799c Add _mm_minpos_epu16 2017-11-17 16:42:35 +01:00
André Oliveira
b67e9dfe5d Add _mm_test_all_zeros, _mm_test_all_ones and _mm_test_mix_ones_zeros 2017-11-17 16:42:35 +01:00
André Oliveira
93c76381b7 Add documentation for testz, testc and testnzc 2017-11-17 16:42:35 +01:00
André Oliveira
4ce80f138b Add _mm_testz_si128, _mm_testc_si128 and _mm_testnzc_si128
This should work for any 128 bit sized vector, but it only accepts i64x2 for now
2017-11-17 16:42:35 +01:00
André Oliveira
38f6087b9a Add _mm_mul_epi32 and _mm_mullo_epi32 2017-11-17 16:42:35 +01:00
André Oliveira
613cacb317 Add remaining _mm_cvtep* intrinsics 2017-11-17 16:42:35 +01:00
André Oliveira
ac11d6941d Add _mm_cvtepu8_epi{16, 32, 64} 2017-11-17 16:42:35 +01:00
André Oliveira
48027e994b Add _mm_cvtepi32_epi64 and fix typo 2017-11-17 16:42:35 +01:00
Tony Sifkarovski
60c2608cce [avx2] add _mm_256_cvtepu{8,16,32}_epi{16,32,64} (#192) 2017-11-17 09:22:18 +01:00
crypto-universe
1842e36d00 [x86][sse4.1] Add phminposuw & pmul* instructions
pmulld is implemented via multiplication.
2017-11-16 07:12:14 -05:00
gnzlbg
955fd849ff implement missing std::ops 2017-11-13 06:42:49 -05:00
gnzlbg
6ed424a848 syn API breaking change (#189) 2017-11-11 23:35:00 +01:00
crypto-universe
bdaea04f2b [x86][sse4.1] Add pmin* instructions (#186) 2017-11-08 23:05:27 -06:00
Caio
545a2a8e2a Add _mm_unpackhi_pd and _mm_unpacklo_pd (#184)
* Add _mm_unpackhi_pd and _mm_unpacklo_pd
2017-11-08 11:22:21 +01:00
gnzlbg
20324666f5 [ci] fix formatting and clippy (#182) 2017-11-07 09:00:55 -06:00
Malo Jaffré
664395e25e Fix a confusing typo in a cast name. (#179) 2017-11-06 12:45:31 -06:00
André Oliveira
a05fb1b292 Add the necessary SIMD types for sign extend intrinsics 2017-11-06 07:17:27 -05:00
André Oliveira
bab1c7b16a Avoid using simd_cast directly 2017-11-06 07:17:27 -05:00
André Oliveira
866596cd53 Add _mm_cvtepi16_epi32 and _mm_cvtepi16_epi64 (commented) 2017-11-06 07:17:27 -05:00
André Oliveira
fa240f2477 Add commented implementation of _mm_cvtepi8_epi64 2017-11-06 07:17:27 -05:00
André Oliveira
37396f3471 Add _mm_cvtepi8_epi32
- This might be wrong since the cast and the shuffle needed to be inverted
2017-11-06 07:17:27 -05:00
André Oliveira
f9caf376b2 Add _mm_cvtepi8_epi16 2017-11-06 07:17:27 -05:00
André Oliveira
d6c990967b Add _mm_packus_epi32 and _mm_cmpeq_epi64 intrinsics 2017-11-06 07:17:27 -05:00
Adam Niederer
a6d9d0c100 Fix mm256_abs_epi* return types (#173)
From the Intel intrinsics manual (emphasis mine):

> Compute the absolute value of packed 16-bit integers in a, and store the
> *unsigned* results in dst.
2017-11-05 20:56:07 -06:00
gwenn
6d4ea09a21 Avx (#172)
* avx: _mm256_load_pd, _mm256_store_pd, _mm256_load_ps, _mm256_store_ps

* avx: _mm256_load_si256, _mm256_store_si256
2017-11-05 20:55:32 -06:00
Malo Jaffré
74870635e5 Add SSE2 trivial aliases and conversions. (#165)
`_mm_cvtsd_f64`, `_mm_cvtsd_si64x` and `_mm_cvttsd_si64x`.
See #40.
2017-11-02 14:10:50 -04:00
gnzlbg
542aac988a [ci] enable clippy (#62)
* [ci] enable clippy

* [clippy] fix clippy issues
2017-11-02 13:43:12 -04:00
gwenn
96111d548e Avx (#163)
* avx: _mm256_testnzc_si256

* avx: _mm256_shuffle_ps

8 levels of macro expansion take too long to compile.

* avx: remove useless 0 in tests

* avx: _mm256_shuffle_ps

Macro expansion can be reduced to four levels

* avx: _mm256_blend_ps

Copy/paste from avx2::_mm256_blend_epi32
2017-11-01 08:47:40 -05:00
Alex Crichton
5cb3986530 Bump to 0.0.3 2017-10-30 15:53:07 -07:00
gnzlbg
d6aefaabea [aarch64] refactor AArch64 intrinsics into its own architecture module (#162) 2017-10-29 11:37:43 -05:00
gnzlbg
7f35e50563 [runtime-detection-x86] detect avx and avx2 only if osxsave is true (#154) 2017-10-28 16:34:09 -04:00
Mrowqa
0c9ac36595 x86: implemented roundings for SSE4.1 (#158)
* x86: implemented roundings for SSE4.1

* x86: sse41 roundings - added docs and fixed assert__* tests
2017-10-28 16:32:14 -04:00
gnzlbg
46c6e9beb6 [fmt] use cargo fmt --all (#161) 2017-10-28 16:29:52 -04:00
gnzlbg
69d2ad85f3 [ci] check formatting (#64)
* [ci] check formatting

* [rustfmt] reformat the whole library
2017-10-27 11:55:29 -04:00
Mrowqa
5869eca3e9 x86: implemented _mm{,256}_maskstore_epi{32,64} (#155)
* x86: implemented maskloads for avx2

* x86: added docs and tests for avx2 maskloads

* x86: refactor - changed `a` to `mem_addr` in avx2 mask loads for consistency

* x86: implemented _mm{,256}_maskstore_epi{32,64}
2017-10-27 11:40:48 -04:00
Henry de Valence
1c67fc00e7 avx2: add _mm256_shuffle_epi32 reusing _mm_shuffle_epi32 code (#156) 2017-10-27 11:10:11 -04:00
gnzlbg
ad48780fca [arm] vadd, vaddd, vaddq, vaddl 2017-10-26 10:18:00 -04:00
Mrowqa
ae0688c7fa x86: fixed testing equality of floating point numbers (#150)
* x86: fixed testing equality of floating point numbers

* x86: removed unused macro branch

* x86: marked assert_approx_eq as used only in tests
2017-10-25 09:57:35 -04:00
gwenn
ea51cbcf25 avx: fix *si256 methods (#145)
* avx: fix *si256 methods

* avx: _mm256_setr_m128

* avx: _mm256_setr_m128d

* avx: _mm256_setr_m128i

* avx: _mm256_loadu2_m128

* avx: _mm256_loadu2_m128d

* avx: _mm256_loadu2_m128i

* avx: _mm256_storeu2_m128

* sse2: _mm_storeu_pd

* avx: _mm256_storeu2_m128d

* sse2: _mm_undefined_si128

* avx: _mm256_storeu2_m128i

* Try to fix i586 build
2017-10-25 01:26:19 -04:00
Henry de Valence
0f33ca5518 avx2: add _mm256_unpack{hi,lo}_epi{8,16,32,64} (#147) 2017-10-24 20:12:23 -04:00
gnzlbg
3e1e52f413 update readme and crates.io badges, categories, etc. (#141)
* [readme] badges

* [crates.io] add badges, categories, etc.
2017-10-23 08:37:41 -05:00
Steven Fackler
6f134c3dfa Make vector constructors const functions (#137) 2017-10-23 08:35:43 -05:00
Thomas Schilling
8b6f5d183e Add some SSE _mm_cvt* instructions (#136)
* Add single output _mm_cvt[t]ss_* variants

The *_pi variants are currently blocked by
https://github.com/rust-lang-nursery/stdsimd/issues/74

* Add _mm_cvtsi*_ss

The _mm_cvtpi*_ps intrinsics are blocked by
https://github.com/rust-lang-nursery/stdsimd/issues/74

* Fix Linux builds

Also the si64 variants are only available on x86_64
2017-10-23 08:35:28 -05:00
Steven Fackler
76d9b89ab2 Implement _mm256_permute4x64_epi64 (#144) 2017-10-23 08:35:03 -05:00
gnzlbg
1f44e3166e Deny all warnings and fix errors (#135)
* [travis-ci] deny warnings

* fix all warnings
2017-10-22 12:30:26 -05:00
gnzlbg
8fa5e7bcf5 [travis-ci] allow testing on all branches (#134) 2017-10-22 07:43:48 -05:00
jneem
192c4ac4fd avx2: signed extensions (#132)
_mm256_cvtepi8_epi16
_mm256_cvtepi8_epi32
_mm256_cvtepi8_epi64
_mm256_cvtepi16_epi32
_mm256_cvtepi16_epi64
_mm256_cvtepi32_epi64
2017-10-21 15:00:13 -05:00