rust-lang/rust - rust - Gitea: Git with a cup of tea

Author	SHA1	Message	Date
Alex Crichton	30b1145ef7	Migrate the `i586::avx2` module to vendor types (#287 )	2018-01-19 10:32:16 -06:00
Alex Crichton	1ad6d5fa88	Migrate the x86_64 folder to vendor types (#284 )	2018-01-19 10:30:25 -06:00
messense	8deae9ce66	Update links in Cargo.toml to rust-lang-nursery/stdsimd (#288 )	2018-01-18 20:23:50 -06:00
Alex Crichton	c5afde07d2	Migrate the `i586::avx` module to vendor types (#286 ) Closes #285	2018-01-18 11:21:03 -06:00
Alex Crichton	5c8867c7c3	Update `target_feature` syntax (#283 ) This commit updates to the latest nightly's syntax where `#[target_feature = "+foo"]` is now deprecated in favor of `#[target_feature(enable = "foo")]`. Additionally `#[target_feature]` can only be applied to `unsafe` functions for now. Along the way this removes a few examples that were just left around and also disables the `fxsr` modules as that target feature will need to land in upstream rust-lang/rust first as it's currently unknown to the compiler.	2018-01-17 09:45:02 -06:00
Josef Ippisch	8deead27f2	Implement addition aliases (#281 ) - `_m_paddb` for `_mm_add_pi8` - `_m_paddw` for `_mm_add_pi16` - `_m_paddd` for `_mm_add_pi32` - `_m_paddsb` for `_mm_adds_pi8` - `_m_paddsw` for `_mm_adds_pi16` - `_m_paddusb` for `_mm_adds_pu8` - `_m_paddusw` for `_mm_adds_pu16`	2018-01-13 12:08:53 -06:00
Josef Ippisch	50cf00372d	MMX subtraction instructions (#280 ) * Implement `_m_psubb` * Implement `_m_psubw` * Implement `_m_psubd` * Implement `_m_psubsb` * Implement `_m_psubsw` * Implement `_m_psubusb` * Implement `_m_psubusw` * Have the subtraction intrinsic naming consistent with the addition ones E.g. use `_mm_sub_pi8` instead of `_m_psubb` * Implement all subtraction aliases for the `_mm_*` variants - `_m_psubb` for `_mm_sub_pi8` - `_m_psubw` for `_mm_sub_pi16` - `_m_psubd` for `_mm_sub_pi32` - `_m_psubsb` for `_mm_subs_pi8` - `_m_psubsw` for `_mm_subs_pi16` - `_m_psubusb` for `_mm_subs_pu8` - `_m_psubusw` for `_mm_subs_pu16`	2018-01-12 17:10:51 -06:00
Alex Crichton	e77ebf194a	Migrate the i686 module to vendor types (#279 ) * Migrate `i686::sse` to vendor types * Migrate `i686::sse2` to vendor types * Migrate i686::sse41 to vendor types * Migrate i686::sse42 to vendor types	2018-01-12 14:08:20 -06:00
Alex Crichton	48a7490711	Make rustc's job a little easier in sse42 (#277 ) Move all the casts from `__m128i` to `i8x16` outside the macro invocations so rustc only has to resolve a few function calls, not thousands!	2018-01-12 11:37:06 -06:00
Alex Crichton	feb8c2b152	Migrate i586::ssse3 to vendor types (#275 )	2018-01-11 23:18:35 -06:00
Alex Crichton	fde52cb334	Migrate i586::sse41 to vendor types (#276 )	2018-01-11 23:18:15 -06:00
Alex Crichton	3148881fa2	Move travis workaround earlier Try to get it used on OSX as well	2018-01-11 08:24:11 -08:00
Alex Crichton	5467c0a008	Migrate i586::sse3 to vendor types (#274 )	2018-01-11 10:13:26 -06:00
Alex Crichton	6d8d2f81e9	Migrate a bunch of i586::sse2 to native types (#273 )	2018-01-10 12:42:26 -06:00
Alex Crichton	baf9d0e7e0	Migrate the `i686::sse` module to vendor types (#269 ) This migrates the entire `i686::sse` module (and touches a few others) to the vendor types.	2018-01-09 13:38:09 -06:00
Jef	248f5441bb	Make `splat` a const fn	2018-01-09 18:38:47 +01:00
Alex Crichton	fd2cc3bc05	Migrate `_mm_add_ss` to `__m128` (#265 ) This commit starts the migration towards Intel's types one intrinsic at a time, starting with `_mm_add_ss`. This is mostly just to get a feel for what the tests will start to look like.	2018-01-09 09:49:08 -06:00
gnzlbg	58664a6f54	More run-time detection improvements (#242 ) * [core/runtime] use getauxval on non-x86 platforms * test coresimd::auxv against auxv crate * add test files from auxv crate * [arm] use simd_test macro * formatting * missing docs * improve docs * reading /proc/self/auxv succeeds only if reading all fields succeeds * remove cc-crate build dependency * getauxval succeeds only if hwcap/hwcap2 are non-zero * fix formatting * move getauxval to stdsimd * delete getauxval-wrapper.c * remove auxv crate dev-dependency from coresimd	2018-01-09 09:23:45 -06:00
Alex Crichton	94fe929a03	Update to a released syn/quote version	2018-01-08 10:10:52 -08:00
Josef Ippisch	705c34b4eb	Implement all addition MMX intrinsics (#266 ) * Implement `_mm_add_pi16` * Implement `_mm_add_pi8` * Implement `_mm_add_pi32` * Implement `_mm_adds_pi16` * Implement `_mm_adds_pi8` * Implement `_mm_adds_pu8` * Implement `_mm_adds_pu16`	2018-01-06 12:36:05 -06:00
Jake Goulding	4667c63113	Add RDTSC and RDTSCP intrinsics (#264 )	2018-01-05 13:30:26 -06:00
gnzlbg	4bb1ea5a05	Completes SSE and adds some MMX intrinsics (#247 ) * Completes SSE and adds some MMX intrinsics MMX: - `_mm_cmpgt_pi{8,16,32}` - `_mm_unpack{hi,lo}_pi{8,16,32}` SSE (is now complete): - `_mm_cvtp{i,u}{8,16}_ps` - add test for `_m_pmulhuw` * fmt and clippy * add an exception for intrinsics using cvtpi2ps	2018-01-04 10:15:23 -06:00
Alex Crichton	4f1f2bd550	Add an exception for vzeroall/vzeroupper on Windows These apparently blow the 20 instruction limit with all the loads/stores.	2018-01-03 16:02:35 -08:00
Alex Crichton	3441968ffa	Turn down debug level on release mode Apparently helps fix errors about codeview registers on MSVC!	2018-01-03 15:59:31 -08:00
Alex Crichton	edbfae36c0	Lower the instruction limit to 20 (#262 ) Right now it's 30 which is a bit high, most of the intrinsics requiring all these instructions ended up needing to be fixed anyway.	2018-01-03 17:21:01 -06:00
Alex Crichton	07ebce51b8	Assert intrinsic implementations are inlined properly (#261 ) * assert_instr check for failed inlining * Fix `call` instructions showing up in some intrinsics The ABI of types like `u8x8` as they're defined isn't actually the underlying type we need for LLVM, but only `__m64` currently satisfies that. Apparently this (and the casts involved) caused some extraneous instructions for a number of intrinsics. They've all moved over to the `__m64` type now to ensure that they're what the underlying interface is. * Allow PIC-relative `call` instructions on x86 These should be harmless when evaluating whether we failed inlining	2018-01-03 16:37:45 -06:00
gwenn	acc8d3de10	Use llvm builtins where possible (#260 ) * Fix sse::_mm_cvtsi32_ss and sse::_mm_cvtsi64_ss By using LLVM builtins, the expected instruction is correctly generated on all platforms. * Use LLVM builtins for storeu* Just to make sure that the wrong instruction is not related to Rust code.	2018-01-03 15:18:34 -06:00
gwenn	983b72d189	Last missing avx and avx2 intrinsics (#258 ) * avx: _mm256_cvtss_f32, avx2: _mm256_cvtsd_f64, _mm256_cvtsi256_si32 * avx2: _mm256_slli_si256, _mm256_srli_si256 And aliases: _mm256_bslli_epi128 _mm256_bsrli_epi128	2018-01-02 14:33:02 -06:00
Alex Crichton	ec373ba107	Update to `syn` master	2018-01-02 12:32:27 -08:00
Alex Crichton	59ed27cc95	Fix stdsimd-verify for syn master	2017-12-31 09:52:16 -08:00
Alex Crichton	3403b6f06a	Fix compile with `syn` master	2017-12-31 09:19:44 -08:00
gwenn	802a379a4a	sse2: remove duplicates and move intrinsics to x86_64 file (#256 ) * sse2: remove duplicates from i686 file _mm_cvtsi64x_si128 _mm_cvtsi64_si128 _mm_cvtsi128_si64 _mm_cvtsi128_si64x * sse2: move _mm_cvtsi64_sd and _mm_cvtsi64x_sd to x86_64 file	2017-12-31 00:58:14 -06:00
Adam Niederer	9141a063c9	Add bswap (#257 )	2017-12-31 00:57:04 -06:00
gwenn	5ca8c0aa93	sse: _mm_cvtpi16_ps, _mm_cvtpu16_ps, _mm_cvtpi8_ps, _mm_cvtpu8_ps (#255 ) * sse: _mm_cvtpi16_ps, _mm_cvtpu16_ps, _mm_cvtpi8_ps, _mm_cvtpu8_ps And mmx: _mm_cmpgt_pi8 _mm_cmpgt_pi16 _mm_unpackhi_pi16 _mm_unpacklo_pi8 _mm_unpacklo_pi16 * Fix: literal out of range	2017-12-30 11:19:44 -06:00
gwenn	17edf649af	Fix some assert_instr (#254 ) * Fix some assert_instr Missing assert_instr: - _mm_cvtsi32_si128 - _mm_cvtsi128_si32 - _mm_loadl_epi64 - _mm_storel_epi64 - _mm_move_epi64 - _mm_cvtsd_f64 - _mm_setzero_pd - _mm_load1_pd - _mm_load_pd1 - _mm_loaddup_pd Wrong instruction used: - _mm_hsub_pi16 * Try to fix CI build by disabling some asserts * Exclude some assert_instr on (x86_64, linux)	2017-12-30 11:19:00 -06:00
Alex Crichton	be461b1377	Verify Intel intrinsics against upstream definitions (#251 ) This commit adds a new crate for testing that the intrinsics listed in this crate do indeed match the upstream definition of each intrinsic. A pre-downloaded XML description of all Intel intrinsics is checked in which is then parsed in the `stdsimd-verify` crate to verify that everything we write down is matched against the upstream definitions. Currently the checks are pretty loose to get this compiling but a few intrinsics were fixed as a result of this. For example: * `_mm256_extract_epi8` - AVX2 intrinsic erroneously listed under AVX * `_mm256_extract_epi16` - AVX2 intrinsic erroneously listed under AVX * `_mm256_extract_epi32` - AVX2 intrinsic erroneously listed under AVX * `_mm256_extract_epi64` - AVX2 intrinsic erroneously listed under AVX * `_mm_tzcnt_32` - erroneously had `u32` in the name * `_mm_tzcnt_64` - erroneously had `u64` in the name * `_mm_cvtsi64_si128` - erroneously available on 32-bit platforms * `_mm_cvtsi64x_si128` - erroneously available on 32-bit platforms * `_mm_cvtsi128_si64` - erroneously available on 32-bit platforms * `_mm_cvtsi128_si64x` - erroneously available on 32-bit platforms * `_mm_extract_epi64` - erroneously available on 32-bit platforms * `_mm_insert_epi64` - erroneously available on 32-bit platforms * `_mm256_extract_epi16` - erroneously returned i32 instead of i16 * `_mm256_extract_epi8` - erroneously returned i32 instead of i8 * `_mm_shuffle_ps` - the mask argument was erroneously i32 instead of u32 * `_popcnt32` - the signedness of the argument and return were flipped * `_popcnt64` - the signedness of the argument was flipped and the argument was too large bit-wise * `_mm_tzcnt_32` - the return value's sign was flipped * `_mm_tzcnt_64` - the return value's sign was flipped * A good number of intrinsics used `imm8: i8` or `imm8: u8` instead of `imm8: i32` which Intel was using. 
(we were also internally inconsistent) * A number of intrinsics working with `__m64` were instead working with i64/u64, so they're now corrected to operate with the vector types instead. Currently the verifications performed are: * Each name in Rust is defined in the XML document * The arguments/return values all agree. * The CPUID features listed in the XML document are all enabled in Rust as well. The type matching right now is pretty loose and has a lot of questionable changes. Future commits will touch these up to be more strict and require closer adherence with Intel's own types. Otherwise types like `i32x8` (or any integers with 256 bits) all match up to `__m256i` right now, although this may want to change in the future. Finally we're also not testing the instruction listed in the XML right now. There's a huge number of discrepancies between the instruction listed in the XML and the instruction listed in `assert_instr`, and those'll need to be taken care of in a future commit. Closes #240	2017-12-29 11:52:27 -06:00
gwenn	44a168a0b8	sse2: implements last remaining intrinsics (#244 ) * sse2: __m64 related intrinsics _mm_add_si64 _mm_mul_su32 _mm_sub_si64 _mm_cvtpi32_pd _mm_set_epi64 _mm_set1_epi64 _mm_setr_epi64 * sse2: _mm_load_sd, _mm_loadh_pd, _mm_loadl_pd * sse2: _mm_store_sd, _mm_storeh_pd, _mm_storel_pd * sse2: _mm_shuffle_pd, _mm_move_sd * sse2: _mm_cast* _mm_castpd_ps _mm_castpd_si128 _mm_castps_pd _mm_castps_si128 _mm_castsi128_pd _mm_castsi128_ps * sse2: add some tests * Try to fix AppVeyor build * sse2: add more tests * sse2: fix assert_instr for _mm_shuffle_pd * Try to fix Travis build * sse2: try to fix AppVeyor build * sse2: try to fix AppVeyor build	2017-12-28 10:22:08 -06:00
Jonathan Goodman	3857c3e88a	fix sse4a _mm_stream_{ss, sd} tests and docs	2017-12-27 22:32:49 +01:00
Alex Crichton	9aa4e30859	Update to `syn` master	2017-12-27 07:56:38 -08:00
gnzlbg	42ec76a3ff	[sse4a] implement non-immediate-mode intrinsics (#249 )	2017-12-22 10:14:41 -06:00
gnzlbg	1db6841813	[fmt] --force rustfmt-nightly	2017-12-22 00:24:23 +01:00
gnzlbg	52cc1abe2c	[fmt] remove fn_call_width option (was removed upstream)	2017-12-22 00:24:23 +01:00
gnzlbg	5850282a1c	use repr(align) to ensure proper alignment in tests	2017-12-22 00:24:23 +01:00
gnzlbg	4fb9420acb	Fix rustfmt (#239 ) * [fmt] manually fix some formatting * [fmt] reformat with rustfmt-nightly * [clippy] fix clippy issues	2017-12-14 19:57:53 +01:00
gnzlbg	5ce0c13009	[ci] powerpc/powerpc64/powerpc64le (#237 ) * [ci] add powerpc/powerpc64 build bots * unbreak stdsimd builds for targets without run-time	2017-12-14 10:44:20 -06:00
Tony Sifkarovski	645008ef32	Add `unchecked` methods, fix _mm_extract_epi* return types (#223 ) * Adds extract_unchecked + replace_unchecked + len (#222 ) * [x86] Fixes the return types + uses extract_unchecked for: * _mm_extract_epi8 * _mm_extract_epi16 * _mm256_extract_epi8 * _mm256_extract_epi16 * Minor changes to the other extract_epi* intrinsics for style consistency These should now zero-extend the extracted int and behave appropriately. An old typo makes these a bit confusing, See this llvm issue.	2017-12-13 19:17:33 +01:00
gnzlbg	6e678ee678	fix clippy warnings	2017-12-13 10:19:09 -05:00
gnzlbg	84e2c7f8e4	fix __m64 imports	2017-12-13 10:19:09 -05:00
gnzlbg	9a81140e00	use i64s for the repr of __m{128,256}i and update casts	2017-12-13 10:19:09 -05:00
gnzlbg	1b987bd270	remove unnecessary mem::uninitialized	2017-12-13 10:19:09 -05:00
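
Commit 5c8867c7c3 above describes the move from `#[target_feature = "+foo"]` to `#[target_feature(enable = "foo")]` on `unsafe` functions. The following is a minimal sketch of what that newer attribute form can look like in user code, not code taken from the repository. The function names `sum_avx2`, `sum_fallback`, and `sum` are hypothetical, and the run-time check uses the standard library's `is_x86_feature_detected!` macro, which was stabilized after the commits listed here.

```rust
// Hypothetical example: a slice sum with an AVX2-enabled variant.
// On non-x86 targets only the fallback is compiled.

#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
#[target_feature(enable = "avx2")]
unsafe fn sum_avx2(xs: &[i32]) -> i32 {
    // Compiled with AVX2 enabled for this function only; the compiler
    // is free to vectorize the loop but is not required to.
    xs.iter().fold(0i32, |acc, &x| acc.wrapping_add(x))
}

fn sum_fallback(xs: &[i32]) -> i32 {
    // Portable version with the same wrapping semantics.
    xs.iter().fold(0i32, |acc, &x| acc.wrapping_add(x))
}

pub fn sum(xs: &[i32]) -> i32 {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if is_x86_feature_detected!("avx2") {
            // SAFETY: the CPU was just checked for AVX2 support.
            return unsafe { sum_avx2(xs) };
        }
    }
    sum_fallback(xs)
}
```

Calling `sum(&[1, 2, 3, 4])` takes the AVX2 path only when the running CPU reports the feature. The caller of a `#[target_feature]` function is responsible for that check, which is why the call sits inside an `unsafe` block.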

2485 Commits