support for processing strings/byte arrays within the gpu #25
Comments
The main thing stopping string handling is support for bytes/ints/chars, right? That is blocked right now until we figure out better type inference. I'm not sure about regexes, but slicing will probably never be supported in the Rust subset because it is practically impossible on GPUs. GPUs have different sections of memory: "global", "local", and "private". "Global" is the most expensive to allocate and to move data into and out of. "Private" is the cheapest but consists only of registers. Registers are like slots, but they can only hold primitive types. And I don't think slices are really necessary. Once you have a slice you want to do one of two things.
The first is already possible (if you index directly into the data you are trying to take a slice of), and once we support for loops inside of the "kernel"/for loop body, the second will be possible too. Sorting could be implemented by hand as a parallel bubble sort once we have support for if statements, the modulo operator, and variables. To add that support we need to work on modifying this traversing code and ensure that type safety is not broken. Sorting like this could also maybe be implemented at some point:
let mut x = vec![0.0; 1000];
// ...
// ...store random numbers in x...
// ...
gpu_do!(load(x));
gpu_do!(launch());
x.sort();
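The "parallel bubble sort" mentioned above is usually called odd-even transposition sort. Below is a minimal CPU-side sketch in plain Rust, not Emu code; the function names `odd_even_phase` and `odd_even_sort` are made up for illustration. It only needs the features listed in the comment: if statements, the modulo operator, and variables.

```rust
/// One phase of odd-even transposition sort ("parallel bubble sort").
/// Every pair (i, i + 1) with i % 2 == phase % 2 is independent of the
/// other pairs, so on a GPU each pair could go to its own work-item.
/// Here the pairs are simply visited in a sequential loop.
fn odd_even_phase(data: &mut [f32], phase: usize) {
    let mut i = phase % 2; // even phases start at 0, odd phases at 1
    while i + 1 < data.len() {
        if data[i] > data[i + 1] {
            data.swap(i, i + 1);
        }
        i += 2;
    }
}

/// Runs n alternating phases, which is enough to sort n elements.
fn odd_even_sort(data: &mut [f32]) {
    for phase in 0..data.len() {
        odd_even_phase(data, phase);
    }
}
```

Calling `odd_even_sort(&mut x)` on a `Vec<f32>` sorts it in ascending order. On a GPU, each phase would presumably be one kernel launch, with a synchronization point between phases.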
You can read the linked comment above for the details on type inference. Basically, the challenge is this: OpenACC, which does what Emu does but for C/C++, deals with code like this.
int z = x + y;
And they know the type because it is written out in the declaration. But we have Rust code like this.
let z = x + y;
And somehow, we need to figure out the type of z ourselves.
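To make the problem concrete, here is a minimal sketch of bottom-up type inference for an expression such as `x + y`, assuming the types of `x` and `y` are already known. The `Ty` and `Expr` types and the `infer` function are hypothetical and have nothing to do with Emu's actual internals.

```rust
use std::collections::HashMap;

/// Hypothetical set of primitive types a kernel might support.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Ty {
    F32,
    I32,
}

/// Hypothetical expression tree: variables and additions only.
enum Expr {
    Var(String),
    Add(Box<Expr>, Box<Expr>),
}

/// Infers the type of `expr` from the known variable types in `env`.
/// Returns None for an unknown variable or mismatched operand types.
fn infer(expr: &Expr, env: &HashMap<String, Ty>) -> Option<Ty> {
    match expr {
        Expr::Var(name) => env.get(name).copied(),
        Expr::Add(lhs, rhs) => {
            let l = infer(lhs, env)?;
            let r = infer(rhs, env)?;
            if l == r {
                Some(l)
            } else {
                None
            }
        }
    }
}
```

With `x` and `y` both recorded as `Ty::I32`, inferring the right-hand side of `let z = x + y;` yields `Some(Ty::I32)`, which would then be stored as the type of `z`. A real implementation would also have to handle literals, casts, and function calls.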
Wait, actually, sorting shouldn't be built in. It should be defined in a separate crate for GPU-accelerated sorting.
let mut x = vec![0.0; 1000];
// ...
// ...store random numbers in x...
// ...
gpu_do!(load(x));
x = sorting::sort(x);
Regex'ing and slicing also won't be built in. All of these should be implemented manually. However, for them to be implementable, the things above (variables, if/else, type inference, etc.) still need to be supported.
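As an illustration of doing slicing "manually", the sketch below replaces a slice with an offset and a length and indexes directly into the original data. It is plain Rust rather than Emu code, and `sum_window` is a made-up name.

```rust
/// Sums `len` elements of `data` starting at `offset` by indexing
/// directly into `data`, so no sub-slice value is ever created.
/// Panics if `offset + len` exceeds `data.len()`.
fn sum_window(data: &[f32], offset: usize, len: usize) -> f32 {
    let mut total = 0.0;
    for i in 0..len {
        total += data[offset + i];
    }
    total
}
```

For example, `sum_window(&data, 1, 3)` covers the same elements as `data[1..4]`. The for loop here is the kind of in-kernel loop the earlier comment says is not supported yet.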
I did read a bit more into the "CUDA C PROGRAMMING GUIDE PG-02829-001_v10.1 | August 2019". In theory, the emu vectors could contain any of these types:
The "if" conditional is supported within CUDA kernels. Although it is outside the scope of Emu, it could also be interesting to see support for GPUDirect RDMA within Emu.
Actually, my intent was not to mutate the input request vector itself. I would pass along a second response vector with a different structure but a similar element type, something like the 8-bit unsigned integer "u8", also known as a byte, which is what you would find in a typical memory location or file. If all goes well, the response reference passed in maps directly to the intended response file, which could be local or remote.
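The request/response pattern described here can be sketched on the CPU as follows. This is plain Rust, not Emu code; `map_bytes` and `ascii_upper` are hypothetical names, and ASCII upper-casing is only a stand-in for whatever per-byte work a kernel would do.

```rust
/// Applies `kernel` to each request byte and writes the result into a
/// separate response vector, leaving the request vector untouched.
fn map_bytes(request: &[u8], kernel: fn(u8) -> u8) -> Vec<u8> {
    let mut response = vec![0u8; request.len()];
    for i in 0..request.len() {
        // Each index is independent, so this loop is the part a GPU
        // could spread across work-items.
        response[i] = kernel(request[i]);
    }
    response
}

/// Example per-byte kernel: ASCII upper-casing.
fn ascii_upper(b: u8) -> u8 {
    b.to_ascii_uppercase()
}
```

Calling `map_bytes(&request, ascii_upper)` returns a new `Vec<u8>` of the same length, which could then be written to a local or remote file by the host code.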
Yes. While
I also haven't added if statements because that would require adding
You can create a separate vector and mutate that instead. Emu lets you do that. The only big complication is adding the
Is there any way to do value clamping without supporting if or bool?
Not at the moment. I had plans for a rewrite system that would replace expressions with appropriate builtin functions in OpenCL (so an if statement would be replaced with a clamp). Nothing has materialized yet, unfortunately.
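For reference, clamping can be expressed without any if statement or bool by composing max and min, which is the form such a rewrite could target. The sketch below is plain Rust with a made-up function name. OpenCL C does provide `clamp`, `min`, and `max` builtins, but whether Emu exposes them is not confirmed by this thread.

```rust
/// Clamps `x` into the range [lo, hi] using only max and min,
/// with no if statement or bool in the source. Assumes lo <= hi.
fn clamp_no_branch(x: f32, lo: f32, hi: f32) -> f32 {
    x.max(lo).min(hi)
}
```

For example, `clamp_no_branch(5.0, 0.0, 1.0)` gives `1.0`, and `clamp_no_branch(-3.0, 0.0, 1.0)` gives `0.0`.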
Are there any plans to provide string or byte array handling capability from within emu kernels?
I believe it would be feasible if there were more support for integer types within Emu.
I understand that both CUDA and OpenCL provide integer support within kernels.
Thank you for listening.