rust-lang/rust - rust - Gitea: Git with a cup of tea

Author	SHA1	Message	Date
gwenn	17edf649af	Fix some assert_instr (#254) * Fix some assert_instr Missing assert_instr: - _mm_cvtsi32_si128 - _mm_cvtsi128_si32 - _mm_loadl_epi64 - _mm_storel_epi64 - _mm_move_epi64 - _mm_cvtsd_f64 - _mm_setzero_pd - _mm_load1_pd - _mm_load_pd1 - _mm_loaddup_pd Wrong instruction used: - _mm_hsub_pi16 * Try to fix CI build by disabling some asserts * Exclude some assert_instr on (x86_64, linux)	2017-12-30 11:19:00 -06:00
Alex Crichton	be461b1377	Verify Intel intrinsics against upstream definitions (#251) This commit adds a new crate for testing that the intrinsics listed in this crate do indeed match the upstream definition of each intrinsic. A pre-downloaded XML description of all Intel intrinsics is checked in which is then parsed in the `stdsimd-verify` crate to verify that everything we write down is matched against the upstream definitions. Currently the checks are pretty loose to get this compiling but a few intrinsics were fixed as a result of this. For example: * `_mm256_extract_epi8` - AVX2 intrinsic erroneously listed under AVX * `_mm256_extract_epi16` - AVX2 intrinsic erroneously listed under AVX * `_mm256_extract_epi32` - AVX2 intrinsic erroneously listed under AVX * `_mm256_extract_epi64` - AVX2 intrinsic erroneously listed under AVX * `_mm_tzcnt_32` - erroneously had `u32` in the name * `_mm_tzcnt_64` - erroneously had `u64` in the name * `_mm_cvtsi64_si128` - erroneously available on 32-bit platforms * `_mm_cvtsi64x_si128` - erroneously available on 32-bit platforms * `_mm_cvtsi128_si64` - erroneously available on 32-bit platforms * `_mm_cvtsi128_si64x` - erroneously available on 32-bit platforms * `_mm_extract_epi64` - erroneously available on 32-bit platforms * `_mm_insert_epi64` - erroneously available on 32-bit platforms * `_mm256_extract_epi16` - erroneously returned i32 instead of i16 * `_mm256_extract_epi8` - erroneously returned i32 instead of i8 * `_mm_shuffle_ps` - the mask argument was erroneously i32 instead of u32 * `_popcnt32` - the signedness of the argument and return were flipped * `_popcnt64` - the signedness of the argument was flipped and the argument was too large bit-wise * `_mm_tzcnt_32` - the return value's sign was flipped * `_mm_tzcnt_64` - the return value's sign was flipped * A good number of intrinsics used `imm8: i8` or `imm8: u8` instead of `imm8: i32` which Intel was using. (we were also internally inconsistent) * A number of intrinsics working with `__m64` were instead working with i64/u64, so they're now corrected to operate with the vector types instead. Currently the verifications performed are: * Each name in Rust is defined in the XML document * The arguments/return values all agree. * The CPUID features listed in the XML document are all enabled in Rust as well. The type matching right now is pretty loose and has a lot of questionable changes. Future commits will touch these up to be more strict and require closer adherence with Intel's own types. Otherwise types like `i32x8` (or any integers with 256 bits) all match up to `__m256i` right now, although this may want to change in the future. Finally we're also not testing the instruction listed in the XML right now. There's a huge number of discrepancies between the instruction listed in the XML and the instruction listed in `assert_instr`, and those'll need to be taken care of in a future commit. Closes #240	2017-12-29 11:52:27 -06:00
gwenn	44a168a0b8	sse2: implements last remaining intrinsics (#244) * sse2: __m64 related intrinsics _mm_add_si64 _mm_mul_su32 _mm_sub_si64 _mm_cvtpi32_pd _mm_set_epi64 _mm_set1_epi64 _mm_setr_epi64 * sse2: _mm_load_sd, _mm_loadh_pd, _mm_loadl_pd * sse2: _mm_store_sd, _mm_storeh_pd, _mm_storel_pd * sse2: _mm_shuffle_pd, _mm_move_sd * sse2: _mm_cast* _mm_castpd_ps _mm_castpd_si128 _mm_castps_pd _mm_castps_si128 _mm_castsi128_pd _mm_castsi128_ps * sse2: add some tests * Try to fix AppVeyor build * sse2: add more tests * sse2: fix assert_instr for _mm_shuffle_pd * Try to fix Travis build * sse2: try to fix AppVeyor build * sse2: try to fix AppVeyor build	2017-12-28 10:22:08 -06:00
Jonathan Goodman	3857c3e88a	fix sse4a _mm_stream_{ss, sd} tests and docs	2017-12-27 22:32:49 +01:00
Alex Crichton	9aa4e30859	Update to `syn` master	2017-12-27 07:56:38 -08:00
gnzlbg	42ec76a3ff	[sse4a] implement non-immediate-mode intrinsics (#249)	2017-12-22 10:14:41 -06:00
gnzlbg	1db6841813	[fmt] --force rustfmt-nightly	2017-12-22 00:24:23 +01:00
gnzlbg	52cc1abe2c	[fmt] remove fn_call_width option (was removed upstream)	2017-12-22 00:24:23 +01:00
gnzlbg	5850282a1c	use repr(align) to ensure proper alignment in tests	2017-12-22 00:24:23 +01:00
gnzlbg	4fb9420acb	Fix rustfmt (#239) * [fmt] manually fix some formatting * [fmt] reformat with rustfmt-nightly * [clippy] fix clippy issues	2017-12-14 19:57:53 +01:00
gnzlbg	5ce0c13009	[ci] powerpc/powerpc64/powerpc64le (#237) * [ci] add powerpc/powerpc64 build bots * unbreak stdsimd builds for targets without run-time	2017-12-14 10:44:20 -06:00
Tony Sifkarovski	645008ef32	Add `unchecked` methods, fix _mm_extract_epi* return types (#223) * Adds extract_unchecked + replace_unchecked + len (#222) * [x86] Fixes the return types + uses extract_unchecked for: * _mm_extract_epi8 * _mm_extract_epi16 * _mm256_extract_epi8 * _mm256_extract_epi16 * Minor changes to the other extract_epi* intrinsics for style consistency These should now zero-extend the extracted int and behave appropriately. An old typo makes these a bit confusing; see this LLVM issue.	2017-12-13 19:17:33 +01:00
gnzlbg	6e678ee678	fix clippy warnings	2017-12-13 10:19:09 -05:00
gnzlbg	84e2c7f8e4	fix __m64 imports	2017-12-13 10:19:09 -05:00
gnzlbg	9a81140e00	use i64s for the repr of __m{128,256}i and update casts	2017-12-13 10:19:09 -05:00
gnzlbg	1b987bd270	remove unnecessary mem::uninitialized	2017-12-13 10:19:09 -05:00
gnzlbg	45f1e63e15	remove unnecessary fixme	2017-12-13 10:19:09 -05:00
gnzlbg	878fd5b5d9	[avx] document intrinsics that don't correspond to an instruction	2017-12-13 10:19:09 -05:00
gnzlbg	baab3ad7f1	move __m256i to the v256 module	2017-12-13 10:19:09 -05:00
gnzlbg	ae6cff53c7	rework impl of __m64 and __m128i	2017-12-13 10:19:09 -05:00
gnzlbg	dd9a3f92ff	move __m128i to the v128 module	2017-12-13 10:19:09 -05:00
gnzlbg	5fb068f74c	move __m64 to the v64 module	2017-12-13 10:19:09 -05:00
gnzlbg	8c13c1e4a3	[ssse3] _mm_alignr_pi8 (#235)	2017-12-12 12:57:22 -06:00
Luca Barbato	baace2fc3f	Initial PowerPC support Rely mainly on parsing auxv since the cpuinfo information is incomplete.	2017-12-12 11:54:49 +01:00
Luca Barbato	f775bf3931	Extract the cpu capabilities from the auxiliary vector Check for neon/asimd and pmull for arm and aarch64.	2017-12-12 11:54:49 +01:00
Luca Barbato	f49009e22c	Unbreak detect_features for arm and aarch64	2017-12-12 11:54:49 +01:00
gwenn	4950bfed1a	sse2: _mm_stream_* (#228) * sse2: _mm_stream_si128,si32,pd,si64 * sse2: _mm_stream_* tests * Disable assert_instr for _mm_stream_si64	2017-12-10 09:11:03 -06:00
gwenn	c2e4bb2e4c	sse: __m64 related intrinsics (#230) * sse: add missing aliases _m_pextrw, _m_pinsrw, _m_pmovmskb, _m_pshufw * sse: _mm_maskmove_si64, _m_maskmovq * sse: _mm_mulhi_pu16, _m_pmulhuw * sse: _mm_avg_pu8, _m_pavgb * sse: _mm_avg_pu16, _m_pavgw * sse: _mm_sad_pu8, _m_psadbw * sse: _mm_cvtpi32_ps * sse: _mm_cvtpi32x2_ps	2017-12-10 09:04:02 -06:00
gwenn	cbd52b05c1	Sse (#225) * sse: _mm_cvt_pi2ps * sse: _mm_extract_pi16 * sse: _mm_insert_pi16 * sse: _mm_movemask_pi8 * sse: _mm_shuffle_pi16 * sse: fix _mm_insert_pi16 and _mm_extract_pi16 * sse: add tests	2017-12-09 11:20:44 -06:00
gwenn	81630ea994	avx: _mm256_stream_si256, _mm256_stream_pd, _mm256_stream_ps (#227)	2017-12-09 11:20:30 -06:00
gwenn	0f53193641	sse2: _mm_movepi64_pi64, _mm_movpi64_epi64, _mm_cvtpd_pi32, _mm_cvttpd_pi32	2017-12-09 08:23:41 -05:00
gwenn	fcf106e685	ssse3 (#224) * ssse3: _mm_abs_pi8 failing Intrinsic has incorrect return type! <8 x i8> (<8 x i8>)* @llvm.x86.ssse3.pabs.b * Introduce an x86_mmx type And make it compatible with i8x8 and u8x8. Alex suggested to change the i8x8 declaration as: `struct i8x8(i64);` But I don't see how to make it compatible with the existing code/macros. * ssse3: _mm_abs_pi16, _mm_abs_pi32, _mm_shuffle_pi8 * ssse3: _mm_abs_pi16, _mm_abs_pi32, _mm_shuffle_pi8 tests * Replace x86_mmx by __m64 * ssse3: _mm_sign_pi8, _mm_sign_pi16, _mm_sign_pi32 * ssse3: _mm_mulhrs_pi16 * ssse3: _mm_maddubs_pi16 * ssse3: _mm_hsub_pi16, _mm_hsub_pi32, _mm_hsubs_pi16 * ssse3: _mm_hadd_pi16, _mm_hadd_pi32, _mm_hadds_pi16 * Move some ssse3 intrinsics from i586 to i686	2017-12-03 11:53:36 -06:00
gnzlbg	6461312210	[ci] test i686-apple-darwin (#221) * [ci] test i686-apple-darwin * fix overflow on i686-apple-darwin	2017-11-28 17:09:38 -07:00
gnzlbg	8a92a566c9	[sse] _mm_stream_{ps,pi} (#219)	2017-11-28 07:48:26 -08:00
gnzlbg	288a30a93e	add mmx module, mmx run-time detection, intrinsics (#220) * [sse] _mm_cvtps_pi32, _mm_cvt_ps2pi * [mmx] run-time detection support * [x86] add mmx module * [x86] make __m64 public * [sse] add _mm_cvtps_pi{8,16}, _mm_cvttps_pi32, _mm_cvtt_ps2pi * move new intrinsics from i586 to i686 module * mmx requires i686	2017-11-28 07:45:41 -08:00
gnzlbg	ef847ac83b	[sse] add _mm_{min, max}_{pi16, pu8} (#218) * [sse] add _mm_{min, max}_{pi16, pu8} * format docs	2017-11-27 14:54:28 -08:00
gnzlbg	b8a4b397ad	update docs (#217) * update docs * cargo clean deletes previous docs * remove stdsimd from coresimd examples * use stdsimd instead of coresimd in core docs * add stdsimd as a dev-dependency of coresimd	2017-11-27 10:47:23 -08:00
Tony Sifkarovski	40a0b1cc92	[avx2] add shuffle, insert/extract i128, permute* (#210) * [x86][avx2] add _mm256_shuffle{hi,lo}_epi16 * [x86][avx2] add _mm256_{insert,extract}i128_si256 * [x86][avx2] add remaining permute intrinsics	2017-11-26 17:40:26 +01:00
gnzlbg	426621f021	Add FXSAVE/FXRSTOR, update Intel SDE, fix xsave tests (#205) * [x86] add run-time detection for fxsr * [x86] add i386 fxsr intrinsics: FXSAVE,FXRSTOR * [x86_64] add x86_64 fxsr intrinsics: FXSAVE64/FXRSTOR64 * [x86-runtime]: document xsave detection further * [x86] disable xsaves and xsaves64 tests	2017-11-22 15:25:15 +01:00
gnzlbg	20529701d8	Fix clippy and rust-fmt.	2017-11-22 13:42:58 +01:00
Alex Crichton	922345c005	Use workspaces and fix tests * Enable a Cargo workspace for the repo * Disable tests for proc-macro crates * Move back to mounting source directory read-only * Refactor test invocation to only test one crate with `--all`	2017-11-22 13:42:58 +01:00
gnzlbg	86fa377cea	Only coresimd depends on stdsimd-test.	2017-11-22 13:42:58 +01:00
gnzlbg	cb9888f802	[ci] flag the documentation build bot	2017-11-22 13:42:58 +01:00
gnzlbg	b940d3311a	fix doc script	2017-11-22 13:42:58 +01:00
gnzlbg	6a0a55f01a	c_void -> *mut u8	2017-11-22 13:42:58 +01:00
gnzlbg	14d0903309	refactor no_std components into the coresimd crate	2017-11-22 13:42:58 +01:00
Adam Niederer	dc9f076480	Add AVX2 gathers (#202) * Add _mm_[mask_]gatheri32_epi32 * Add _mm[256][_mask]_i32gather_{epi64, pd} * Add _mm[256][_mask]_gather_ps * Add _mm[256][_mask]_i64gather_{epi32, epi64, ps, pd}	2017-11-22 09:26:12 +01:00
gnzlbg	2faf11ab44	[readme] point always to latest docs (#206)	2017-11-21 15:05:46 -06:00
gnzlbg	0129d3be76	[nvptx] enable nvptx only when all other targets are disabled (#208) Closes #207.	2017-11-21 15:05:05 -06:00
Alex Crichton	8356754fe7	Fix hygiene in various macros (#204)	2017-11-21 12:54:06 -06:00

2551 Commits
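
Commit 645008ef32 above notes that the `_mm_extract_epi8`/`_mm_extract_epi16` family should zero-extend the extracted integer. The difference from sign extension can be illustrated in plain Rust, without any intrinsics. This is only a sketch: the two helper functions and the `[i8; 16]` lane array are hypothetical stand-ins for a packed 128-bit vector, not the crate's actual API.

```rust
/// Hypothetical helper (not part of stdsimd): read 8-bit lane `idx`
/// and widen it to `i32` by zero-extension.
fn extract_lane_zero_extended(lanes: &[i8; 16], idx: usize) -> i32 {
    // Reinterpret the byte as unsigned first, so the upper 24 bits are 0.
    i32::from(lanes[idx & 15] as u8)
}

/// Hypothetical helper: the same read, widened by sign-extension instead.
fn extract_lane_sign_extended(lanes: &[i8; 16], idx: usize) -> i32 {
    // Widening a signed byte directly copies its sign bit upward.
    i32::from(lanes[idx & 15])
}

fn main() {
    let mut lanes = [0i8; 16];
    lanes[1] = -1; // bit pattern 0xFF

    // Zero-extension keeps only the low 8 bits of the lane.
    assert_eq!(extract_lane_zero_extended(&lanes, 1), 0xFF);
    // Sign-extension turns the same bit pattern into -1.
    assert_eq!(extract_lane_sign_extended(&lanes, 1), -1);

    println!(
        "zero-extended: {}, sign-extended: {}",
        extract_lane_zero_extended(&lanes, 1),
        extract_lane_sign_extended(&lanes, 1)
    );
}
```

The two results only differ for lanes whose top bit is set, which is why a wrong return type can go unnoticed in tests that use small positive values.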