[PHI][CINN] Fix cum kernel for big tensor #72562
PR Category
Operator Mechanism
PR Types
Bug fixes
Description
Fix the invalid launch configuration and out-of-bounds memory access issues of the Cum kernel on large tensors.
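A common root cause of both problems is 32-bit index arithmetic combined with an unchecked grid size. Below is a minimal sketch of the large-tensor-safe launch pattern (hypothetical kernel and names, not the code changed in this PR): 64-bit indices, a grid-stride loop, and a capped grid dimension.

```cpp
// Hypothetical sketch, not the PR's actual kernel: 64-bit indexing plus a
// grid-stride loop keeps both the launch configuration and the memory
// accesses valid when numel no longer fits in a 32-bit index.
#include <algorithm>
#include <cstdint>
#include <cuda_runtime.h>

__global__ void ScaleKernel(const float* x, float* y, int64_t numel) {
  int64_t stride = static_cast<int64_t>(gridDim.x) * blockDim.x;
  for (int64_t i = static_cast<int64_t>(blockIdx.x) * blockDim.x + threadIdx.x;
       i < numel; i += stride) {
    y[i] = x[i] * 2.0f;  // placeholder element-wise work
  }
}

void LaunchScale(const float* x, float* y, int64_t numel, cudaStream_t stream) {
  constexpr int kBlock = 256;
  // Cap the grid so the launch configuration stays legal; the grid-stride
  // loop inside the kernel covers the remaining elements.
  int64_t blocks = std::min<int64_t>((numel + kBlock - 1) / kBlock, 65535);
  ScaleKernel<<<static_cast<int>(blocks), kBlock, 0, stream>>>(x, y, numel);
}
```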
Cum is a composite operator built from the following basic operators; the changes made in this PR are listed below:
(because the shared-memory (shm) path was removed)
Note: MatrixTranspose could have reused the existing TilingSwapDim1And2, but after reviewing that operator I found that its correctness cannot be guaranteed either, so it was refactored in place for now; a sketch of the kind of dim swap involved follows.
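For illustration only, here is a minimal dim1/dim2 swap with 64-bit-safe indexing, in the spirit of what a cum kernel needs when the scanned axis is not the innermost one. This is not the PR's MatrixTranspose nor the existing TilingSwapDim1And2.

```cpp
// Hypothetical sketch: transpose (d0, d1, d2) -> (d0, d2, d1) with all
// offsets computed in int64_t so the kernel stays correct when
// d0 * d1 * d2 overflows a 32-bit index.
#include <cstdint>

__global__ void SwapDim1And2(const float* in, float* out,
                             int64_t d0, int64_t d1, int64_t d2) {
  int64_t total = d0 * d1 * d2;
  int64_t stride = static_cast<int64_t>(gridDim.x) * blockDim.x;
  for (int64_t idx = static_cast<int64_t>(blockIdx.x) * blockDim.x + threadIdx.x;
       idx < total; idx += stride) {
    int64_t b = idx / (d1 * d2);   // batch index (dim0)
    int64_t rem = idx % (d1 * d2);
    int64_t i = rem / d2;          // dim1 index
    int64_t j = rem % d2;          // dim2 index
    // Output layout is (d0, d2, d1): element (b, i, j) lands at (b, j, i).
    out[b * d1 * d2 + j * d1 + i] = in[idx];
  }
}
```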
The configs in PaddleAPITest were tested: all of them run, and the accuracy comparison against numpy.cumsum passes. Some large shapes show relatively large accuracy errors, which is related to the reduce algorithm and will be addressed in a follow-up.
Pcard-85711