[Feat]: add NPU fused operators (RMSNorm, RoPE, SwiGLU, SDPA) by ys2025-AI · Pull Request #194 · modelscope/twinkle

ys2025-AI · 2026-05-18T12:12:25Z

PR type

Bug Fix
New Feature
Document Updates
More Models or Datasets Support

PR information

Extends Twinkle's NPU support from a basic MoE GMM patch to a full fused-operator suite (RMSNorm, RoPE, SwiGLU, SDPA) for Ascend hardware.
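For readers unfamiliar with two of the operators named above, the following is a minimal pure-Python sketch of the textbook definitions of RMSNorm and SwiGLU. It is not the PR's Ascend implementation. The function names, the `eps` default, and the list-based vectors are assumptions made for illustration only. A fused kernel would compute the same result in a single device call rather than several element-wise passes.

```python
import math


def rms_norm(x, weight, eps=1e-6):
    """Reference RMSNorm over one vector: x / sqrt(mean(x^2) + eps) * weight."""
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    return [v * inv_rms * w for v, w in zip(x, weight)]


def silu(v):
    """SiLU activation: v * sigmoid(v)."""
    return v / (1.0 + math.exp(-v))


def swiglu(gate, up):
    """Reference SwiGLU gating: silu(gate) * up, element-wise."""
    return [silu(g) * u for g, u in zip(gate, up)]


if __name__ == "__main__":
    # Hypothetical inputs, chosen only to exercise the functions.
    print(rms_norm([3.0, 4.0], [1.0, 1.0]))
    print(swiglu([0.0, 1.0], [5.0, 2.0]))
```

A reference like this can double as a numerical baseline when checking that a fused operator leaves loss and gradient norm essentially unchanged.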

Experiment results

Atlas 900 A2 (8× NPU) | Qwen3-30B-A3B-Instruct-2507 | LoRA r=8, batch=16, 188 steps | Dataset GSM8K_ZH

Metric	Baseline	This PR	Delta (time reduction)
Total	544 s	503 s	+7.5%
Training (step 10–180)	465 s	404 s	+13.1%
Loss / GradNorm	—	—	<< 0.01
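The percentages in the table appear to be relative reductions in wall-clock time against the baseline. A short check under that assumption (the helper name `time_reduction_pct` is hypothetical) reproduces both figures:

```python
def time_reduction_pct(baseline_s, new_s):
    """Relative reduction in wall-clock time, as a percentage of the baseline."""
    return (baseline_s - new_s) / baseline_s * 100.0


total = time_reduction_pct(544, 503)     # whole run
training = time_reduction_pct(465, 404)  # steps 10-180 only

print(f"{total:.1f}% {training:.1f}%")  # prints "7.5% 13.1%"
```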

gemini-code-assist

Code Review

This pull request introduces comprehensive NPU hardware acceleration support for Ascend devices by implementing fused operators (RMSNorm, RoPE, SwiGLU, and SDPA) and monkey-patching logic for specific model families like Qwen. It also refactors the NPU patching mechanism to be applied automatically when an NPU device is detected. Review feedback focuses on improving error handling by logging tracebacks for broad exception catches and restoring type hints and assertions that were removed during the refactoring of the MoE grouped matrix multiplication functions.
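The pattern described in the review (apply patches automatically when an NPU is detected, and log a traceback when a broad `except` swallows a failure) can be sketched as follows. This is an illustration using only the standard library, not the PR's code: `apply_patches_if_npu`, the `is_npu_available` callable, and the patch names are all hypothetical stand-ins.

```python
import logging

logger = logging.getLogger("npu_patch_sketch")


def apply_patches_if_npu(is_npu_available, patches):
    """Apply each patch only when an NPU is detected.

    `is_npu_available` is a zero-argument callable (hypothetical stand-in
    for the real device check); `patches` maps a name to a zero-argument
    callable that performs one monkey patch.
    Returns the names of the patches that were applied successfully.
    """
    if not is_npu_available():
        return []
    applied = []
    for name, patch in patches.items():
        try:
            patch()
        except Exception:
            # logger.exception logs at ERROR level and appends the traceback,
            # so a broad catch does not hide the root cause.
            logger.exception("NPU patch %r failed; keeping the default op", name)
        else:
            applied.append(name)
    return applied
```

Catching per patch rather than around the whole loop means one failing operator does not prevent the others from being installed.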

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

ys2025-AI

Changed to reuse Torch.is_npu_available() directly in kernel/__init__.py.

ys2025-AI

The pre-commit checks now pass.

ys2025-AI added 2 commits May 18, 2026 19:54

Update __init__.py

b5aedc7

Update monkey_patch_npu.py

85c677a

gemini-code-assist Bot reviewed May 18, 2026

Comment thread src/twinkle/kernel/__init__.py Outdated

Comment thread src/twinkle/kernel/__init__.py Outdated

Comment thread src/twinkle/kernel/monkey_patch_npu.py Outdated

Comment thread src/twinkle/kernel/monkey_patch_npu.py Outdated

ys2025-AI and others added 3 commits May 18, 2026 20:25

Update src/twinkle/kernel/__init__.py

1ef0e38

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

Update src/twinkle/kernel/__init__.py

924d17c

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

Merge branch 'modelscope:main' into main

8472a43

tastelikefeet reviewed May 19, 2026

Comment thread src/twinkle/kernel/__init__.py Outdated

tastelikefeet approved these changes May 19, 2026

ys2025-AI and others added 7 commits May 19, 2026 13:00

Update __init__.py

3ae08df

Update npu.py

ccb4030

Update __init__.py

6d8d37e

Apply suggestions from code review

418cf00

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

Update npu.py

6ba8d34

Update __init__.py

b7adf8e

Update __init__.py

ae5f9c2

ys2025-AI commented May 19, 2026

Update __init__.py

0f91455

ys2025-AI force-pushed the main branch from 387ffc0 to f6fc407 Compare May 19, 2026 09:28

feat: Add NPU fused operators support (RMSNorm, RoPE, SwiGLU, SDPA)

0a5351b

ys2025-AI force-pushed the main branch from f6fc407 to 0a5351b Compare May 19, 2026 09:50

ys2025-AI commented May 19, 2026

tastelikefeet merged commit d82ebb6 into modelscope:main May 19, 2026
0 of 3 checks passed

tastelikefeet merged 14 commits into modelscope:main from ys2025-AI:main

2 participants
