* Attempt to fix tests on master
* Make all doctests use items from the real `std` rather than this
crate, it's just easier
* Handle debuginfo weirdness by flagging functions as `no_mangle` that
we're looking for instructions within.
* Handle double undescores in symbol names
This commit:
* renames `coresimd` to `core_arch` and `stdsimd` to `std_detect`
* `std_detect` does no longer depend on `core_arch` - it is a freestanding
`no_std` library that only depends on `core` - it is renamed to `std_detect`
* moves the top-level coresimd and stdsimd directories into the appropriate
crates/... directories - this simplifies creating crate.io releases of these crates
* moves the top-level `coresimd` and `stdsimd` sub-directories into their
corresponding crates in `crates/{core_arch, std_detect}`.
We historically have run single-threaded verbose tests because we were
faulting all over the place due to bugs in rustc itself, primarily
around calling conventions and passing values around. Those bugs have
all since been fixed so we should be clear to run multithreaded tests
quietly on CI nowadays!
Closes#621
* Update representation of `v128`
* Rename everything with new naming convention of underscores and no
modules/impls
* Remove no longer necessary `wasm_simd128` feature
* Remove `#[target_feature]` attributes (use `#[cfg]` instead)
* Update `assert_instr` tests
* Update some implementations as LLVM has evolved
* Allow some more esoteric syntax in `#[assert_instr]`
* Adjust the safety of APIs where appropriate
* Remove macros in favor of hand-coded implementations
* Comment out the tests for now as there's no known runtime for these
yet
* fix _mm_castsi128_pd and _mm_castpd_si128 impls
The _mm_castX_Y SSE intrinsics are "reinterpreting" casts; LLVM's
simd_cast is a "converting" cast. Replace simd_cast with mem::transmute.
Fixes#55249
* Temporarily pin CI
* Fix i686 segfaults
* Fix wasm CI
Output of `wasm2wat` has changed!
* Fix AppVeyor with an older nightly
* Add wasm32 simd128 intrinsics
* test wasm32 simd128 instructions
* Run wasm tests like all other tests
* use modules instead of types to access wasm simd128 interpretations
* generate docs for wasm32-unknown-unknown
* fix typo
* Enable #[assert_instr] on wasm32
* Shell out to Node's `execSync` to execute `wasm2wat` over our wasm file
* Parse the wasm file line-by-line, looking for various function markers and
such
* Use the `elem` section to build a function pointer table, allowing us to map
exactly from function pointer to a function
* Avoid losing debug info (the names section) in release mode by stripping
`--strip-debug` from `rust-lld`.
* remove exclude list from Cargo.toml
* fix assert_instr for non-wasm targets
* re-format assert-instr changes
* add crate that uses assert_instr
* Fix instructions having extra quotes
* Add assert_instr for wasm memory intrinsics
* Remove hacks for git wasm-bindgen
* add wasm_simd128 feature
* make wasm32 build correctly
* run simd128 tests on ci
* remove wasm-assert-instr-tests
* Update to proc_macro2 0.4 and related
* Update to proc_macro2 0.4 and related
* Update to proc_macro2 0.4 and related
* Add proc_macro_gen feature
* Update to the new rustfmt cli
* A few proc-macro2 stylistic updates
* Disable RUST_BACKTRACE by default
* Allow rustfmt failure for now
* Disable proc-macro2 nightly feature in verify-x86
Currently this causes bugs on nightly due to upstream rustc bugs, this should be
temporary
* Attempt to thwart mergefunc
* Use static relocation model on i686
* add some powerpc/powerpc64 altivec/vsx intrinsics
* temporarily make IntoBits/FromBits inline(always)
* include powerpc64 module; use inline(always) from/into_bits only on powerpc
* fix build after stabilization of cfg_target_feature and target_feature
* fix doc tests
* fix spurious unused_attributes warning
* fix more unused attribute warnings
* More unnecessary target features
* Remove no longer needed trait imports
* Remove fixed upstream workarounds
* Fix parsing the #[assert_instr] macro
Following upstream proc_macro changes
* Fix form and parsing of #[simd_test]
* Don't use Cargo features for testing modes
Instead use RUSTFLAGS with `--cfg`. This'll help us be compatible with the
latest Cargo where a tweak to workspaces and features made the previous
invocations we had invalid.
* Don't thread RUSTFLAGS through docker
* Re-gate on x86 verification
Closes#411
* Prepare portable packed SIMD vector types for RFCs
This commit cleans up the implementation of the Portable Packed Vector Types
(PPTV), adds some new features, and makes some breaking changes.
The implementation is moved to `coresimd/src/ppvt` (they are
still exposed via `coresimd::simd`).
As before, the vector types of a certain width are implemented in the `v{width}`
submodules. The `macros.rs` file has been rewritten as an `api` module that
exposes the macros to implement each API.
It should now hopefully be really clear where each API is implemented, and which types
implement these APIs. It should also now be really clear which APIs are tested and how.
- boolean vectors of the form `b{element_size}x{number_of_lanes}`.
- reductions: arithmetic, bitwise, min/max, and boolean - only the facade,
and a naive working implementation. These need to be implemented
as `llvm.experimental.vector.reduction.{...}` but this needs rustc support first.
- FromBits trait analogous to `{f32,f64}::from_bits` that perform "safe" transmutes.
Instead of writing `From::from`/`x.into()` (see below for breaking changes) now you write
`FromBits::from_bits`/`x.into_bits()`.
- portable vector types implement `Default` and `Hash`
- tests for all portable vector types and all portable operations (~2000 new tests).
- (hopefully) comprehensive implementation of bitwise transmutes and lane-wise
casts (before `From` and the `.as_...` methods where implemented "when they were needed".
- documentation for PPTV (not great yet, but better than nothing)
- conversions/transmutes from/to x86 architecture specific vector types
- `store/load` API has been replaced with `{store,load}_{aligned,unaligned}`
- `eq,ne,lt,le,gt,ge` APIs now return boolean vectors
- The `.as_{...}` methods have been removed. Lane-wise casts are now performed by `From`.
- `From` now perform casts (see above). It used to perform bitwise transmutes.
- `simd` vectors' `replace` method's result is now `#[must_use]`.
* enable backtrace and nocapture
* unalign load/store fail test by 1 byte
* update arm and aarch64 neon modules
* fix arm example
* fmt
* clippy and read example that rustfmt swallowed
* reductions should take self
* rename add/mul -> sum/product; delete other arith reductions
* clean up fmt::LowerHex impl
* revert incorret doc change
* make Hash equivalent to [T; lanes()]
* use travis_wait to increase timeout limit to 20 minutes
* remove travis_wait; did not help
* implement reductions on top of the llvm.experimental.vector.reduction intrinsics
* implement cmp for boolean vectors
* add missing eq impl file
* implement default
* rename llvm intrinsics
* fix aarch64 example error
* replace #[inline(always)] with #[inline]
* remove cargo clean from run.sh
* workaround broken product in aarch64
* make boolean vector constructors const fn
* fix more reductions on aarch64
* fix min/max reductions on aarch64
* remove whitespace
* remove all boolean vector types except for b8xN
* use a sum reduction fallback on aarch64
* disable llvm add reduction for aarch64
* rename the llvm intrinsics to use llvm names
* remove old macros.rs file
With RFC 2325 looking close to being accepted, I took a crack at
reorganizing this repository to being more amenable for inclusion in
libstd/libcore. My current plan is to add stdsimd as a submodule in
rust-lang/rust and then use `#[path]` to include the modules directly
into libstd/libcore.
Before this commit, however, the source code of coresimd/stdsimd
themselves were not quite ready for this. Imports wouldn't compile for
one reason or another, and the organization was also different than the
RFC itself!
In addition to moving a lot of files around, this commit has the
following major changes:
* The `cfg_feature_enabled!` macro is now renamed to
`is_target_feature_detected!`
* The `vendor` module is now called `arch`.
* Under the `arch` module is a suite of modules like `x86`, `x86_64`,
etc. One per `cfg!(target_arch)`.
* The `is_target_feature_detected!` macro was removed from coresimd.
Unfortunately libcore has no ability to export unstable macros, so for
now all feature detection is canonicalized in stdsimd.
The `coresimd` and `stdsimd` crates have been updated to the planned
organization in RFC 2325 as well. The runtime bits saw the largest
amount of refactoring, seeing a good deal of simplification without the
core/std split.
This commit adds a new crate for testing that the intrinsics listed in this
crate do indeed match the upstream definition of each intrinsic. A
pre-downloaded XML description of all Intel intrinsics is checked in which is
then parsed in the `stdsimd-verify` crate to verify that everything we write
down is matched against the upstream definitions.
Currently the checks are pretty loose to get this compiling but a few intrinsics
were fixed as a result of this. For example:
* `_mm256_extract_epi8` - AVX2 intrinsic erroneously listed under AVX
* `_mm256_extract_epi16` - AVX2 intrinsic erroneously listed under AVX
* `_mm256_extract_epi32` - AVX2 intrinsic erroneously listed under AVX
* `_mm256_extract_epi64` - AVX2 intrinsic erroneously listed under AVX
* `_mm_tzcnt_32` - erroneously had `u32` in the name
* `_mm_tzcnt_64` - erroneously had `u64` in the name
* `_mm_cvtsi64_si128` - erroneously available on 32-bit platforms
* `_mm_cvtsi64x_si128` - erroneously available on 32-bit platforms
* `_mm_cvtsi128_si64` - erroneously available on 32-bit platforms
* `_mm_cvtsi128_si64x` - erroneously available on 32-bit platforms
* `_mm_extract_epi64` - erroneously available on 32-bit platforms
* `_mm_insert_epi64` - erroneously available on 32-bit platforms
* `_mm256_extract_epi16` - erroneously returned i32 instead of i16
* `_mm256_extract_epi8` - erroneously returned i32 instead of i8
* `_mm_shuffle_ps` - the mask argument was erroneously i32 instead of u32
* `_popcnt32` - the signededness of the argument and return were flipped
* `_popcnt64` - the signededness of the argument was flipped and the argument
was too large bit-wise
* `_mm_tzcnt_32` - the return value's sign was flipped
* `_mm_tzcnt_64` - the return value's sign was flipped
* A good number of intrinsics used `imm8: i8` or `imm8: u8` instead of `imm8:
i32` which Intel was using. (we were also internally inconsistent)
* A number of intrinsics working with `__m64` were instead working with i64/u64,
so they're now corrected to operate with the vector types instead.
Currently the verifications performed are:
* Each name in Rust is defined in the XML document
* The arguments/return values all agree.
* The CPUID features listed in the XML document are all enabled in Rust as well.
The type matching right now is pretty loose and has a lot of questionable
changes. Future commits will touch these up to be more strict and require closer
adherence with Intel's own types. Otherwise types like `i32x8` (or any integers
with 256 bits) all match up to `__m256i` right now, althoguh this may want to
change in the future.
Finally we're also not testing the instruction listed in the XML right now.
There's a huge number of discrepancies between the instruction listed in the XML
and the instruction listed in `assert_instr`, and those'll need to be taken care
of in a future commit.
Closes#240
* Enable a Cargo workspace for the repo
* Disable tests for proc-macro crates
* Move back to mounting source directory read-only
* Refactor test invocation to only test one crate with `--all`
This commit adds CI for a few more targets:
* i686-unknown-linux-gnu
* arm-unknown-linux-gnueabihf
* armv7-unknown-linux-gnueabihf
* aarch64-unknown-linux-gnu
The CI here is structured around using a Docker container to set up a test
environment and then QEMU is used to actually execute code from these platforms.
QEMU's emulation actually makes it so we can continue to just use `cargo test`,
as processes can be spawned from QEMU like `objdump` and files can be read (for
libbacktrace). Ends up being a relatively seamless experience!
Note that a number of intrinsics were disabled on i686 because they were failing
tests, and otherwise a few ARM touch-ups were made to get tests passing.