feat: refactor InfiniCore cpu runtime to InfiniRT #8
2e80f6b to 866fc8d
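@voltjia Jiacheng, could you check whether the overall approach for hooking InfiniCore up to the InfiniRT cpu runtime is correct after these changes? I will polish the details afterwards. Thanks!
I took a look and it is basically fine. One principle for this refactor, though: reuse the CUDA Runtime API interfaces wherever possible. In other words, for some of these interfaces we need to check whether they exist in the CUDA Toolkit; for example, I couldn't seem to find
OK, I will also survey the CUDA Runtime API interfaces and list them out.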
InfiniCore CPU Runtime API migration decision table
CUDA API source:
866fc8d to 7fd37b5
| void Memcpy(void* dst, const void* src, std::size_t count, MemcpyKind kind); | ||
| int Memcpy(void* dst, const void* src, std::size_t count, MemcpyKind kind); | ||
|
|
||
| int GetMemInfo(Device device, std::size_t* free_bytes, |
The corresponding CUDA API is cudaMemGetInfo, so this should be renamed to MemGetInfo. The original API also has no device parameter, so there should not be one here either.
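For reference, CUDA declares this call as `cudaError_t cudaMemGetInfo(size_t* free, size_t* total)`. A minimal sketch of what an aligned CPU-side `MemGetInfo` might look like is below. It is not the implementation from this PR: the `int` error values are placeholders, namespaces are omitted, and the `sysconf` queries assume a Linux/glibc host.

```cpp
#include <cstddef>
#include <unistd.h>  // sysconf; _SC_PHYS_PAGES and _SC_AVPHYS_PAGES assume Linux/glibc

// Hypothetical CPU-side counterpart of
//   cudaError_t cudaMemGetInfo(size_t* free, size_t* total);
// Same parameter list (no device argument); returns 0 on success.
int MemGetInfo(std::size_t* free_bytes, std::size_t* total_bytes) {
  if (free_bytes == nullptr || total_bytes == nullptr) {
    return 1;  // Placeholder "invalid value" code (assumption).
  }

  const long page_size = sysconf(_SC_PAGESIZE);
  const long total_pages = sysconf(_SC_PHYS_PAGES);
  const long free_pages = sysconf(_SC_AVPHYS_PAGES);
  if (page_size <= 0 || total_pages < 0 || free_pages < 0) {
    return 2;  // Placeholder "query failed" code (assumption).
  }

  // Host memory stands in for device memory on the CPU backend.
  *total_bytes = static_cast<std::size_t>(total_pages) *
                 static_cast<std::size_t>(page_size);
  *free_bytes = static_cast<std::size_t>(free_pages) *
                static_cast<std::size_t>(page_size);
  return 0;
}
```

A caller would query the current device implicitly, as in CUDA: select it first, then call `MemGetInfo(&free_bytes, &total_bytes)`.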
|
|
||
| int StreamSynchronize(void* stream); | ||
|
|
||
| int StreamWaitEvent(void* stream, void* event); |
Ours seems to be missing the flags parameter. Add it for now; it just does not need to be used.
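CUDA's declaration is `cudaStreamWaitEvent(cudaStream_t stream, cudaEvent_t event, unsigned int flags = 0)`. A sketch of accepting `flags` without acting on it could look like the following. It assumes a synchronous CPU runtime in which all work recorded before the event has already finished, so the wait itself is a no-op; the error value is a placeholder.

```cpp
// Hypothetical CPU-side counterpart of cudaStreamWaitEvent.
// `flags` is accepted only to keep the parameter list aligned with CUDA.
int StreamWaitEvent(void* stream, void* event, unsigned int flags = 0) {
  (void)stream;  // Assumption: the CPU runtime executes work synchronously.
  (void)flags;   // Present for signature parity; intentionally unused.

  if (event == nullptr) {
    return 1;  // Placeholder "invalid handle" code (assumption).
  }
  // Nothing is pending in a synchronous runtime, so there is nothing to wait on.
  return 0;
}
```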
|
|
||
| int EventCreate(void** event); | ||
|
|
||
| int EventCreateWithFlags(void** event, uint32_t flags); |
|
|
||
| int EventRecord(void* event, void* stream); | ||
|
|
||
| int EventQuery(void* event, int* status); |
This parameter list does not seem to match CUDA's. Why is that?
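For comparison, CUDA's `cudaEventQuery(cudaEvent_t event)` takes only the event and reports status through its return value: `cudaSuccess` once the captured work has completed, `cudaErrorNotReady` otherwise. A sketch of a signature shaped the same way is below; the `CpuEvent` type and the numeric codes are hypothetical and only there to make the example self-contained.

```cpp
#include <atomic>

// Placeholder status codes (assumptions, not InfiniRT's or CUDA's values).
enum : int { kSuccess = 0, kInvalidValue = 1, kNotReady = 2 };

// Hypothetical CPU event: a flag set once the recorded work has finished.
struct CpuEvent {
  std::atomic<bool> completed{false};
};

// Status travels in the return value, so no `int* status` out-parameter.
int EventQuery(void* event) {
  if (event == nullptr) {
    return kInvalidValue;
  }
  auto* cpu_event = static_cast<CpuEvent*>(event);
  return cpu_event->completed.load(std::memory_order_acquire) ? kSuccess
                                                              : kNotReady;
}
```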
| int MallocHost(void** ptr, std::size_t size); | ||
|
|
||
| int FreeHost(void* ptr); | ||
|
|
||
| int MemcpyAsync(void* dst, const void* src, std::size_t count, MemcpyKind kind, | ||
| void* stream); | ||
|
|
||
| int MallocAsync(void** ptr, std::size_t size, void* stream); | ||
|
|
||
| int FreeAsync(void* ptr, void* stream); | ||
|
|
||
| int MemsetAsync(void* ptr, int value, std::size_t count, void* stream); |
Move these up, right after the plain Memcpy group above.
| int GetDevice(Device* device); | ||
|
|
||
| void GetDeviceCount(int* count, Device::Type type); | ||
| int GetDeviceCount(int* count, Device::Type type); |
This function has an extra type parameter compared with CUDA's; it needs to be removed.
| int SetDevice(Device device); | ||
|
|
||
| void GetDevice(Device* device); | ||
| int GetDevice(Device* device); |
Change the parameters of these two to int as well. From here on, the interfaces at this layer should be fully aligned with CUDA's, i.e. identical.
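In CUDA these three calls are `cudaGetDeviceCount(int* count)`, `cudaSetDevice(int device)` and `cudaGetDevice(int* device)`. A sketch of CPU-side versions with the same parameter lists is below. It assumes the host counts as exactly one device and that the current device is tracked per thread, as CUDA does; the error values are placeholders.

```cpp
namespace {
constexpr int kCpuDeviceCount = 1;     // Assumption: the host is a single device.
thread_local int g_current_device = 0; // Current device is per host thread.
}  // namespace

// No Device::Type parameter: same shape as cudaGetDeviceCount(int*).
int GetDeviceCount(int* count) {
  if (count == nullptr) {
    return 1;  // Placeholder "invalid value" code (assumption).
  }
  *count = kCpuDeviceCount;
  return 0;
}

// Plain int ordinal instead of a Device object, as in cudaSetDevice(int).
int SetDevice(int device) {
  if (device < 0 || device >= kCpuDeviceCount) {
    return 2;  // Placeholder "invalid device" code (assumption).
  }
  g_current_device = device;
  return 0;
}

// Writes the calling thread's current device ordinal, as in cudaGetDevice(int*).
int GetDevice(int* device) {
  if (device == nullptr) {
    return 1;  // Placeholder "invalid value" code (assumption).
  }
  *device = g_current_device;
  return 0;
}
```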
As we just discussed, let's put the CPU work on hold for now and prioritize the NVIDIA GPU side.
After discussion, the API at this layer needs to be fully aligned with the CUDA Runtime API from now on. By "fully aligned" we mean taking CUDA's  so that in InfiniRT it should be:
Adjust the CPU runtime implementation in InfiniCore to reuse the existing InfiniRT CPU runtime interfaces. The corresponding InfiniCore changes are in InfiniTensor/InfiniCore#1342.
Single-operator test screenshots:

