Wasm intrinsics look neat as a higher-level, fixed-size SIMD abstraction. I wonder how well compilers can do when using them for AOT targets with libraries like simd-everywhere.
I'd like to be a little more sure that I'm not totally messing things up before doing that, but yes, eventually, that would be a nice outcome.
I've also only really tested wazero. I can't know for sure that this is a straight improvement for other runtimes and architectures.
For instance, the code delays using wasm_i8x16_bitmask as much as possible, because on AArch64 it can be slower than not using SIMD at all, whereas it's plenty fast on x86-64.
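As a rough illustration of what "delaying the bitmask" can look like (a sketch, not the code being discussed): the loop below uses the cheaper wasm_v128_any_true test to decide whether a 16-byte block contains a zero byte, and only calls wasm_i8x16_bitmask once, after a match is known to exist, to locate it. The function name find_zero is hypothetical, and the SIMD path is only compiled when the compiler defines __wasm_simd128__ (e.g. clang targeting wasm32 with -msimd128); elsewhere it falls back to a plain byte loop.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__wasm_simd128__)
#include <wasm_simd128.h>
#endif

// Hypothetical helper: index of the first zero byte in p[0..n), or n if none.
size_t find_zero(const unsigned char *p, size_t n) {
    size_t i = 0;
#if defined(__wasm_simd128__)
    const v128_t zero = wasm_i8x16_splat(0);
    for (; i + 16 <= n; i += 16) {
        // Lanes equal to zero become 0xFF, all others 0x00.
        v128_t eq = wasm_i8x16_eq(wasm_v128_load(p + i), zero);
        // Cheap "is there any match at all?" test; no bitmask yet.
        if (wasm_v128_any_true(eq)) {
            // Only now pay for the bitmask, once, to locate the match.
            return i + (size_t)__builtin_ctz(wasm_i8x16_bitmask(eq));
        }
    }
#endif
    // Scalar tail (and the whole search when SIMD is unavailable).
    for (; i < n; i++) {
        if (p[i] == 0) return i;
    }
    return n;
}
```

Whether this ordering is a win depends on the runtime and the CPU, so it is worth measuring on each target rather than assuming.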
Would you consider writing some blog posts or other resources about WASM? I was experimenting recently with WIT, and ran into a mountain of issues. There's a lot of jargon that could do with some untangling.
It took me a lot longer than it should have to put together this basic module, and even then there's this shared library I had to download to build it, and I couldn't figure out why this requires a libc:
I'm not that great at long-form writing, to be honest; it's always a bit of a chore, and I'm never happy with the result.
To answer your question, it needs a libc because you're including stdlib.h and exporting an allocator (even if you're not otherwise using it). You need a libc for malloc.
Exporting an allocator is generally a good idea if you need to send anything beyond numbers across the API (e.g. you need an allocator if you want to send strings as pointers).
I've never used WIT, so I have no idea if this is a requirement for WIT.
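To illustrate the allocator point, here is a minimal sketch of a module that exports an allocator so the host can pass a string as a pointer plus a length. All names (guest_alloc, guest_dealloc, count_spaces, and the export names) are hypothetical, and this is plain pointer passing, not anything WIT-specific. The export_name attribute is only applied when compiling for a Wasm target; elsewhere the macro expands to nothing.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#if defined(__wasm__)
#define WASM_EXPORT(name) __attribute__((export_name(name)))
#else
#define WASM_EXPORT(name)
#endif

// The host calls this to reserve len bytes inside the module's linear memory.
// This is the part that pulls in malloc, and therefore a libc.
WASM_EXPORT("alloc")
void *guest_alloc(size_t len) {
    return malloc(len);
}

// The host calls this to release a buffer obtained from guest_alloc.
WASM_EXPORT("dealloc")
void guest_dealloc(void *ptr) {
    free(ptr);
}

// Example API taking a string as (pointer, length) rather than a number.
WASM_EXPORT("count_spaces")
uint32_t count_spaces(const char *ptr, size_t len) {
    uint32_t count = 0;
    for (size_t i = 0; i < len; i++) {
        if (ptr[i] == ' ') count++;
    }
    return count;
}
```

The host side would call alloc, copy the string bytes into the returned offset of the module's memory, call count_spaces with that offset and the length, and finally call dealloc.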
It's generally used for techniques that apply SIMD principles within general-purpose registers and instructions.
Assume you've loaded a 64-bit register (a uint64_t) with 8 bytes (unsigned char) of data. Can you answer the question “are any of these 8 bytes zero (the NUL terminator)?”
If you find a cheap way to do it, you can make strlen go faster by consuming 8 bytes at a time.
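One well-known answer to that question is the "has zero byte" bit trick, sketched below. The names has_zero_byte and swar_strnlen are hypothetical. To stay within defined behavior, the sketch takes an explicit buffer size and loads words with memcpy; a real strlen has no size to rely on and would need aligned loads and more care.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Nonzero if any of the 8 bytes in x is zero.
// Subtracting 1 from each byte sets its high bit when the byte was zero
// (or above 0x80); "& ~x" discards bytes whose high bit was already set.
int has_zero_byte(uint64_t x) {
    const uint64_t ones  = 0x0101010101010101ULL;
    const uint64_t highs = 0x8080808080808080ULL;
    return ((x - ones) & ~x & highs) != 0;
}

// Length of the string in s, looking at no more than n bytes.
size_t swar_strnlen(const char *s, size_t n) {
    size_t i = 0;
    // Consume 8 bytes at a time while no byte in the word is zero.
    for (; i + 8 <= n; i += 8) {
        uint64_t w;
        memcpy(&w, s + i, sizeof w);  // unaligned-safe load
        if (has_zero_byte(w)) break;
    }
    // Finish byte by byte; this also pinpoints the zero inside the word.
    for (; i < n; i++) {
        if (s[i] == '\0') return i;
    }
    return n;
}
```

Scanning the final word byte by byte keeps the sketch independent of endianness; a tuned version would locate the zero byte with a count-trailing-zeros (or leading-zeros) instruction instead.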
It is still a bit early, but I'm majorly bullish on WASM for multiple use cases:
1. Client side browser polyglot "applets" (Java applets were ahead of their time IMO)
2. Server side polyglot "servlets" (Node.js, embedded runtimes, etc.)
3. Language interop/FFI (Lang A -> WASM -> Lang B, like wasm2c)
Why is #3 so interesting? The hardest thing in language conversion is the library calls. WASI standardizes that, so all the proprietary libs will eventually compile down to WASI as a sort of POSIX/libc-like layer. In addition, WASM standardizes the calling convention. The resulting new source code may not look like much, but it will solve the FFI calling convention/marshalling/library issues nicely.
I’m not sure how it solves the FFI problem. Lowest common denominator calling conventions don’t make it any easier to bridge languages than it already is.
C calling conventions are already the standard for FFI in native code, and that means dropping down to what can be expressed in C if you want to cross that boundary.
string.h is missing strstr(); there's an algorithm of similar complexity you might consider: http://0x80.pl/notesen/2016-11-28-simd-strfind.html
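For readers who have not seen this family of algorithms, here is a scalar sketch of a first-and-last-byte filter, the kind of candidate test that SIMD substring searches typically run on many positions at once. That it matches the linked note in detail is an assumption, and the name find_sub is hypothetical. It takes explicit lengths instead of relying on NUL terminators.

```c
#include <stddef.h>
#include <string.h>

// Returns a pointer to the first occurrence of needle[0..k) in hay[0..n),
// or NULL if there is none. An empty needle matches at the start.
const char *find_sub(const char *hay, size_t n, const char *needle, size_t k) {
    if (k == 0) return hay;
    if (k > n) return NULL;
    const char first = needle[0];
    const char last = needle[k - 1];
    for (size_t i = 0; i + k <= n; i++) {
        // Cheap filter: most positions fail one of these two comparisons.
        // A SIMD version would test 16 positions with two vector compares.
        if (hay[i] == first && hay[i + k - 1] == last) {
            // Full verification only for surviving candidates.
            if (memcmp(hay + i, needle, k) == 0) return hay + i;
        }
    }
    return NULL;
}
```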
If there's interest, the set of implemented functions can definitely be extended.
One of the nice things about Go is how much that's a solved issue out of the box, compared to almost everything else; certainly compared to C.
Pinging them in an issue: https://github.com/WebAssembly/wasi-libc/issues/580
https://github.com/cedws/wasm-wit-test
Et voilà:
It's not a panacea, though; it introduces other issues.