André Oliveira
093029a6c3
Make test intrinsics use __m128i
2017-11-17 16:42:35 +01:00
André Oliveira
9d8c2639c1
Change _mm_mpsadbw_epu8 to work with unsigned integers
2017-11-17 16:42:35 +01:00
André Oliveira
7b0e7c6f52
Add _mm_mpsadbw_epu8
2017-11-17 16:42:35 +01:00
André Oliveira
b85d4f799c
Add _mm_minpos_epu16
2017-11-17 16:42:35 +01:00
André Oliveira
b67e9dfe5d
Add _mm_test_all_zeros, _mm_test_all_ones and _mm_test_mix_ones_zeros
2017-11-17 16:42:35 +01:00
André Oliveira
93c76381b7
Add documentation for testz, testc and testnzc
2017-11-17 16:42:35 +01:00
André Oliveira
4ce80f138b
Add _mm_testz_si128, _mm_testc_si128 and _mm_testnzc_si128
...
This should work for any 128-bit vector, but it only accepts i64x2 for now
2017-11-17 16:42:35 +01:00
André Oliveira
38f6087b9a
Add _mm_mul_epi32 and _mm_mullo_epi32
2017-11-17 16:42:35 +01:00
André Oliveira
613cacb317
Add remaining _mm_cvtep* intrinsics
2017-11-17 16:42:35 +01:00
André Oliveira
ac11d6941d
Add _mm_cvtepu8_epi{16, 32, 64}
2017-11-17 16:42:35 +01:00
André Oliveira
48027e994b
Add _mm_cvtepi32_epi64 and fix typo
2017-11-17 16:42:35 +01:00
Tony Sifkarovski
60c2608cce
[avx2] add _mm_256_cvtepu{8,16,32}_epi{16,32,64} ( #192 )
2017-11-17 09:22:18 +01:00
crypto-universe
1842e36d00
[x86][sse4.1] Add phminposuw & pmul* instructions
...
pmulld is implemented via multiplication.
2017-11-16 07:12:14 -05:00
gnzlbg
955fd849ff
implement missing std::ops
2017-11-13 06:42:49 -05:00
gnzlbg
6ed424a848
syn API breaking change ( #189 )
2017-11-11 23:35:00 +01:00
crypto-universe
bdaea04f2b
[x86][sse4.1] Add pmin* instructions ( #186 )
2017-11-08 23:05:27 -06:00
Caio
545a2a8e2a
Add _mm_unpackhi_pd and _mm_unpacklo_pd ( #184 )
...
* Add _mm_unpackhi_pd and _mm_unpacklo_pd
2017-11-08 11:22:21 +01:00
gnzlbg
20324666f5
[ci] fix formatting and clippy ( #182 )
2017-11-07 09:00:55 -06:00
Malo Jaffré
664395e25e
Fix a confusing typo in a cast name. ( #179 )
2017-11-06 12:45:31 -06:00
André Oliveira
a05fb1b292
Add the necessary SIMD types for sign extend intrinsics
2017-11-06 07:17:27 -05:00
André Oliveira
bab1c7b16a
Avoid using simd_cast directly
2017-11-06 07:17:27 -05:00
André Oliveira
866596cd53
Add _mm_cvtepi16_epi32 and _mm_cvtepi16_epi64 (commented)
2017-11-06 07:17:27 -05:00
André Oliveira
fa240f2477
Add commented implementation of _mm_cvtepi8_epi64
2017-11-06 07:17:27 -05:00
André Oliveira
37396f3471
Add _mm_cvtepi8_epi32
...
- This might be wrong since the cast and the shuffle needed to be inverted
2017-11-06 07:17:27 -05:00
André Oliveira
f9caf376b2
Add _mm_cvtepi8_epi16
2017-11-06 07:17:27 -05:00
André Oliveira
d6c990967b
Add _mm_packus_epi32 and _mm_cmpeq_epi64 intrinsics
2017-11-06 07:17:27 -05:00
Adam Niederer
a6d9d0c100
Fix mm256_round_epi* return types ( #173 )
...
From the Intel intrinsics manual (emphasis mine):
> Compute the absolute value of packed 16-bit integers in a, and store the
> *unsigned* results in dst.
2017-11-05 20:56:07 -06:00
gwenn
6d4ea09a21
Avx ( #172 )
...
* avx: _mm256_load_pd, _mm256_store_pd, _mm256_load_ps, _mm256_store_ps
* avx: _mm256_load_si256, _mm256_store_si256
2017-11-05 20:55:32 -06:00
Malo Jaffré
74870635e5
Add SSE2 trivial aliases and conversions. ( #165 )
...
`_mm_cvtsd_f64`, `_mm_cvtsd_si64x` and `_mm_cvttsd_si64x`.
See #40 .
2017-11-02 14:10:50 -04:00
gnzlbg
542aac988a
[ci] enable clippy ( #62 )
...
* [ci] enable clippy
* [clippy] fix clippy issues
2017-11-02 13:43:12 -04:00
gwenn
96111d548e
Avx ( #163 )
...
* avx: _mm256_testnzc_si256
* avx: _mm256_shuffle_ps
8 levels of macro expansion take too long to compile.
* avx: remove useless 0 in tests
* avx: _mm256_shuffle_ps
Macro expansion can be reduced to four levels
* avx: _mm256_blend_ps
Copy/paste from avx2::_mm256_blend_epi32
2017-11-01 08:47:40 -05:00
Alex Crichton
5cb3986530
Bump to 0.0.3
2017-10-30 15:53:07 -07:00
gnzlbg
d6aefaabea
[aarch64] refactor AArch64 intrinsics into its own architecture module ( #162 )
2017-10-29 11:37:43 -05:00
gnzlbg
7f35e50563
[runtime-detection-x86] detect avx and avx2 only if osxsave is true ( #154 )
2017-10-28 16:34:09 -04:00
Mrowqa
0c9ac36595
x86: implemented roundings for SSE4.1 ( #158 )
...
* x86: implemented roundings for SSE4.1
* x86: sse41 roundings - added docs and fixed assert__* tests
2017-10-28 16:32:14 -04:00
gnzlbg
46c6e9beb6
[fmt] use cargo fmt --all ( #161 )
2017-10-28 16:29:52 -04:00
gnzlbg
69d2ad85f3
[ci] check formatting ( #64 )
...
* [ci] check formatting
* [rustfmt] reformat the whole library
2017-10-27 11:55:29 -04:00
Mrowqa
5869eca3e9
x86: implemented _mm{,256}_maskstore_epi{32,64} ( #155 )
...
* x86: implemented maskloads for avx2
* x86: added docs and tests for avx2 maskloads
* x86: refactor - changed `a` to `mem_addr` in avx2 mask loads for consistency
* x86: implemented _mm{,256}_maskstore_epi{32,64}
2017-10-27 11:40:48 -04:00
Henry de Valence
1c67fc00e7
avx2: add _mm256_shuffle_epi32 reusing _mm_shuffle_epi32 code ( #156 )
2017-10-27 11:10:11 -04:00
gnzlbg
ad48780fca
[arm] vadd, vaddd, vaddq, vaddl
2017-10-26 10:18:00 -04:00
Mrowqa
ae0688c7fa
x86: fixed testing equality of floating point numbers ( #150 )
...
* x86: fixed testing equality of floating point numbers
* x86: removed unused macro branch
* x86: marked assert_approx_eq as used only in tests
2017-10-25 09:57:35 -04:00
gwenn
ea51cbcf25
avx: fix *si256 methods ( #145 )
...
* avx: fix *si256 methods
* avx: _mm256_setr_m128
* avx: _mm256_setr_m128d
* avx: _mm256_setr_m128i
* avx: _mm256_loadu2_m128
* avx: _mm256_loadu2_m128d
* avx: _mm256_loadu2_m128i
* avx: _mm256_storeu2_m128
* sse2: _mm_storeu_pd
* avx: _mm256_storeu2_m128d
* sse2: _mm_undefined_si128
* avx: _mm256_storeu2_m128i
* Try to fix i586 build
2017-10-25 01:26:19 -04:00
Henry de Valence
0f33ca5518
avx2: add _mm256_unpack{hi,lo}_epi{8,16,32,64} ( #147 )
2017-10-24 20:12:23 -04:00
gnzlbg
3e1e52f413
update readme and crates.io badges, categories, etc. ( #141 )
...
* [readme] badges
* [crates.io] add badges, categories, etc.
2017-10-23 08:37:41 -05:00
Steven Fackler
6f134c3dfa
Make vector constructors const functions ( #137 )
2017-10-23 08:35:43 -05:00
Thomas Schilling
8b6f5d183e
Add some SSE _mm_cvt* instructions ( #136 )
...
* Add single output _mm_cvt[t]ss_* variants
The *_pi variants are currently blocked by
https://github.com/rust-lang-nursery/stdsimd/issues/74
* Add _mm_cvtsi*_ss
The _mm_cvtpi*_ps intrinsics are blocked by
https://github.com/rust-lang-nursery/stdsimd/issues/74
* Fix Linux builds
Also the si64 variants are only available on x86_64
2017-10-23 08:35:28 -05:00
Steven Fackler
76d9b89ab2
Implement _mm256_permute4x64_epi64 ( #144 )
2017-10-23 08:35:03 -05:00
gnzlbg
1f44e3166e
Deny all warnings and fix errors ( #135 )
...
* [travis-ci] deny warnings
* fix all warnings
2017-10-22 12:30:26 -05:00
gnzlbg
8fa5e7bcf5
[travis-ci] allow testing on all branches ( #134 )
2017-10-22 07:43:48 -05:00
jneem
192c4ac4fd
avx2: signed extensions ( #132 )
...
_mm256_cvtepi8_epi16
_mm256_cvtepi8_epi32
_mm256_cvtepi8_epi64
_mm256_cvtepi16_epi32
_mm256_cvtepi16_epi64
_mm256_cvtepi32_epi64
2017-10-21 15:00:13 -05:00