karolherbst

@karolherbst@chaos.social

Linux Graphics Developer
Freedesktop Code of Conduct Enforcement team member

Mostly working on Rusticl and Nouveau
Implemented OpenCL in Rust for fun

🏳️‍🌈🏳️‍⚧️ are welcomed.

Nazis, $hitcoin cultists, Right-Libertarians, Longterminists, Tankies, techbros and other fascists not welcomed. This is a shithead free zone.

Private account, please direct all business inquiries to: https://twitter.com/karolherbst

#mesa
#nouveau
#nvk
#opencl
#rust #rustlang
#rusticl
#coc

This profile is from a federated server and may be incomplete.

karolherbst, to Random (English)

I need a -O-1 flag in clang

karolherbst OP

@VegaHarmonia also an idea. But the issue I was seeing was that an addition of two 8-bit values apparently gets both operands cast to 32 bits, and the addition is then done with 32 bits instead of 8 🙃

karolherbst OP

@sanity @VegaHarmonia I'm not compiling against a CPU target

karolherbst OP

@oblomov @VegaHarmonia all optimizations are disabled, so my expectation is that LLVM just leaves the original code as it is.

karolherbst OP

@a1ba @VegaHarmonia sure, I still want LLVM to not do such things if I use -O0

karolherbst OP

@a1ba @VegaHarmonia I'm compiling towards SPIR-V, so any additional instructions are just a waste of space and time in compilers consuming that.

karolherbst OP

@a1ba @VegaHarmonia anyway, it's just two 8-bit inputs to a function getting added and then stored through a pointer to an 8-bit value.

void add(global char* out, char a, char b) {
    out[0] = a + b;
}
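
For context, the widening described in this thread matches C's integer promotion rules, which OpenCL C follows for scalar types: operands narrower than int are promoted to int before arithmetic. That would make the casts part of what the front end emits for `a + b` rather than the result of an optimization pass, though this is an interpretation and not something the thread confirms. A minimal host-side C sketch (the helper names are made up for illustration, and `global` is dropped so it builds as plain C):

```c
#include <assert.h>
#include <stddef.h>

/* Size of the type of `a + b` for two 8-bit operands. Both operands are
 * promoted to int before the addition, so this is sizeof(int). */
size_t promoted_sum_size(void) {
    char a = 1, b = 2;
    return sizeof(a + b);
}

/* Same shape as add() from the post, with the implicit conversions
 * written out. `global` is omitted so this builds as host C. */
void add_explicit(char *out, char a, char b) {
    int wide = (int)a + (int)b; /* widening required by the language rules */
    out[0] = (char)wide;        /* narrowing again on the store */
}

/* Checks both observations on the host. */
void promotion_demo(void) {
    char result = 0;
    add_explicit(&result, 3, 4);
    assert(result == 7);
    assert(promoted_sum_size() == sizeof(int));
}
```

Calling `promotion_demo()` from a host program exercises both helpers. A compiler is free to narrow the addition back to 8 bits once optimizations run, which would be consistent with -O1 making the extra casts disappear.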

karolherbst OP

@a1ba @VegaHarmonia yeah.. compiling with -O1 would solve the issue. Sadly, that could generate LLVM IR which the SPIRV-LLVM-Translator won't be able to consume, and we use the translator at the moment because the SPIR-V target isn't ready yet.

I'm sure this issue will be solved over time with the proper SPIR-V target, but yeah...

I was considering running some LLVM optimization passes, but most of them cause one issue or another.

thephd, to Random (English)

... Oh. Uh. How badly would everyone hate me if I wrote a paper that allowed

T foo[2];
and
T[2] foo;

to mean the same thing as far as type declarations go in C?

karolherbst

@thephd it depends on what "T[2] foo, bar;" would mean
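
For reference, a small sketch of how today's C reads the existing spelling with two declarators: the `[2]` binds to the declarator, so only `foo` becomes an array. How `T[2] foo, bar;` would behave under the proposal is not stated in this thread. The helper names are illustrative only:

```c
#include <assert.h>
#include <stddef.h>

/* Current C: `[2]` is part of the declarator `foo`, not of the type,
 * so `bar` is a plain int. */
int foo[2], bar;

size_t size_of_foo(void) { return sizeof foo; } /* two ints */
size_t size_of_bar(void) { return sizeof bar; } /* one int */

/* Confirms that only foo is an array. */
void declarator_demo(void) {
    assert(size_of_foo() == 2 * sizeof(int));
    assert(size_of_bar() == sizeof(int));
}
```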

karolherbst

@thephd in that case you get an approval from me!

18+ fclc, to Random (English)

I’m starting to believe that the easiest way for me to get a job at amd is to do a rocm based ai startup and then just get bought out

18+ karolherbst

@fclc why do you think they created those startups in the first place :P

karolherbst, to Random (English)

I'm currently looking into the best way to support SVM/USM in #rusticl, and one thing I'm wondering about is whether there are any drawbacks to doing:

mmap(some_chosen_address, ram_size, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED_NOREPLACE, -1, 0);

to reserve a lot of virtual memory, and then suballocating SVM allocations out of this region with "PROT_READ | PROT_WRITE" and "MAP_FIXED"?

Alternatively, I was considering allocating smaller heaps on demand.
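
A rough sketch of the reserve-then-suballocate idea described above, assuming Linux and glibc. The helper names (`svm_page_size`, `svm_reserve`, `svm_commit`) are hypothetical and not Rusticl code. Unlike the call in the post, it passes NULL as the address so the kernel picks the range, and it adds MAP_NORESERVE, which is an extra assumption:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Page size of the running system. */
size_t svm_page_size(void) { return (size_t)sysconf(_SC_PAGESIZE); }

/* Reserve address space only: PROT_NONE pages cannot be accessed, and
 * MAP_NORESERVE asks the kernel not to reserve swap for them.
 * Returns NULL on failure. */
void *svm_reserve(size_t size) {
    void *p = mmap(NULL, size, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Make `len` bytes at `base + offset` readable and writable by mapping
 * fresh anonymous pages over that part of the reservation.
 * Returns NULL on failure. */
void *svm_commit(void *base, size_t offset, size_t len) {
    /* MAP_FIXED requires a page-aligned address. */
    assert(offset % svm_page_size() == 0);
    void *p = mmap((char *)base + offset, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}
```

A caller would reserve once, for example `svm_reserve((size_t)1 << 30)`, and then hand out page-aligned offsets through `svm_commit`. MAP_FIXED is only safe here because the target range lies inside a reservation the caller already owns.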

karolherbst OP

@jhwgh1968 nah, FIXED_NOREPLACE just means the mmap call won't replace existing mappings.

karolherbst OP

@jhwgh1968 mhhh.. I can't really sub-allocate, because after munmap the original mapping is just gone as well...

though maybe I can make it work without reserving such a huge range, but I'm kinda worried about fragmentation, but... maybe that's fine...
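
One possible way around the munmap problem, sketched under the assumption of Linux mmap semantics: instead of unmapping a freed sub-range, map PROT_NONE pages over it with MAP_FIXED. The old pages are dropped, but the addresses stay occupied by the process, so no unrelated mapping can land there. `svm_decommit` and `svm_page_size` are hypothetical names, and whether this fits Rusticl's needs is not something the thread settles:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Page size of the running system. */
size_t svm_page_size(void) { return (size_t)sysconf(_SC_PAGESIZE); }

/* Return a sub-range to the "reserved" state instead of munmap()ing it.
 * The replacement mapping is inaccessible and backed by no memory yet.
 * Returns 0 on success, -1 on failure. */
int svm_decommit(void *addr, size_t len) {
    void *p = mmap(addr, len, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE | MAP_FIXED,
                   -1, 0);
    return p == MAP_FAILED ? -1 : 0;
}
```

The cost of this approach is one extra mmap call per free and the loss of the old page contents, which is usually what a free is expected to do anyway.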

karolherbst OP

@lina @jhwgh1968 yeah, it should.

Which just brings me back to the original question: whether reserving that amount of VM space is a good idea and whether there are other options.

I also wonder how well that would work with things like libasan, which just allocates 20TB of virtual memory here.

Anyway, it seems like Intel's CL stack just hopes nothing conflicts and maybe I just do the same, because it should be fine (tm).

karolherbst, to Random (English)

so.. which distribution is not going to patch out this secret chromium extension?

karolherbst OP

@jhwgh1968 apparently Chromium has a private extension users can't disable, which grants *.google.com domains special privileges and access to private APIs.

Apparently the same extension is also available on Edge.

See https://fedi.simonwillison.net/@simon/112757810519145581

karolherbst, to Random (English)

soooo.. in OpenCL C, builtins like get_global_offset have to return defined values on out-of-bounds accesses, but apparently Intel's driver also gets this wrong. Now I'm considering whether I should keep it broken as well, or actually fix it as defined in the spec...

@bashbaug any thoughts on this? Maybe we should add OpenCL CTS tests for this and annoy everybody else with this?
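
As a reference for what such a test could check: per the OpenCL C specification's description of the work-item functions, get_global_offset returns 0 and get_global_size returns 1 when dimindx is outside the range 0 to get_work_dim() - 1. The following is a host-side C model of that rule, not driver or CTS code; the struct and function names are invented for illustration:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one kernel launch. */
struct nd_range {
    unsigned work_dim;       /* number of dimensions in use, 1 to 3 */
    size_t global_offset[3];
    size_t global_size[3];
};

/* Models get_global_offset(dimindx): 0 for out-of-range dimindx. */
size_t model_get_global_offset(const struct nd_range *r, unsigned dimindx) {
    return dimindx < r->work_dim ? r->global_offset[dimindx] : 0;
}

/* Models get_global_size(dimindx): 1 for out-of-range dimindx. */
size_t model_get_global_size(const struct nd_range *r, unsigned dimindx) {
    return dimindx < r->work_dim ? r->global_size[dimindx] : 1;
}
```

A test built on this idea would launch a kernel with fewer than three dimensions, query an index at or beyond get_work_dim(), and compare the result with these defaults.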

karolherbst OP

@bashbaug oh, I wasn't aware that you are also using SPIR-V internally. Is that an old change or something more recent?
