Commit e238abc

Merge pull request #6 from MooreThreads/xd/agents.md
Add AGENTS.md
2 parents ad14d4a + e299f0c commit e238abc

1 file changed: +139 -0 lines changed

AGENTS.md

Lines changed: 139 additions & 0 deletions
@@ -0,0 +1,139 @@
# AGENTS.md

## Project Overview

**pymtml** is a set of Python bindings for the Moore Threads Management Library (MTML), a C-based API for monitoring and managing Moore Threads GPU devices. It provides:

1. **Native MTML bindings** - Direct Python wrappers for the C functions in `libmtml.so`
2. **NVML compatibility layer** - Drop-in replacement for NVIDIA's pynvml library

Moore Threads GPUs use **MUSA** (Meta-computing Unified System Architecture) as their compute platform, analogous to NVIDIA's CUDA.

### Key Files

- `pymtml.py` - Main library with all MTML bindings and NVML wrapper functions
- `mtml_2.2.0.h` - C header file defining the MTML API (reference for adding new bindings)
- `test_pymtml.py` - Tests for native MTML APIs
- `test_pynvml.py` - Tests for NVML-compatible wrapper APIs
- `test_sglang_compat.py` - Tests for sglang framework compatibility

### NVML Compatibility

Projects using pynvml can switch to pymtml with a single import change:

```python
# Replace: import pynvml
import pymtml as pynvml

# All pynvml.nvml* functions work the same
pynvml.nvmlInit()
device = pynvml.nvmlDeviceGetHandleByIndex(0)
name = pynvml.nvmlDeviceGetName(device)
pynvml.nvmlShutdown()
```
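
For code that has to run on both NVIDIA and Moore Threads hosts, the import swap can be made conditional. The following is a minimal sketch, not something pymtml ships: `load_first_available` is a hypothetical helper, and it assumes that a successful import is enough to pick the backend (a module can still import cleanly and fail later at init time if the matching driver is absent).

```python
import importlib


def load_first_available(module_names):
    """Return the first module in module_names that imports successfully."""
    for name in module_names:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue  # Try the next candidate.
    raise ImportError(f"none of {list(module_names)} could be imported")


# Hypothetical usage: prefer pymtml, fall back to NVIDIA's pynvml.
# pynvml = load_first_available(["pymtml", "pynvml"])
```

The order of the candidate list decides which backend wins when both packages are installed, so it should reflect the hardware the code is expected to find.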

## Build and Test Commands

```bash
# Format code (isort + black)
make format

# Lint code (flake8)
make lint

# Run tests
make test                      # pytest
python test_pymtml.py          # Native MTML API tests
python test_pynvml.py          # NVML wrapper tests
python test_sglang_compat.py   # sglang compatibility tests

# Build wheel package
make build

# Clean build artifacts
make clean

# Publish to PyPI
make publish

# Run all (format, lint, test, build)
make all
```

## Code Style Guidelines

- **Formatter**: isort + black (not yapf)
- **Linter**: flake8 with max-line-length=120
- **Naming conventions**:
  - Native MTML functions: `mtmlXxx()` (e.g., `mtmlDeviceGetName`)
  - NVML wrapper functions: `nvmlXxx()` (e.g., `nvmlDeviceGetName`)
  - Constants: `MTML_XXX` or `NVML_XXX`
- **ctypes patterns**: Use `c_uint`, `c_char`, `byref()`, `POINTER()` for C bindings
- **Error handling**: Raise `MTMLError` (aliased as `NVMLError`) for all failures
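
The ctypes and error-handling bullets can be tried out without GPU hardware. In the sketch below, a Python callback built with `CFUNCTYPE` stands in for a C function of the form `int get_count(unsigned int *out)`. Every name here (`DemoError`, `DEMO_SUCCESS`, `_fake_get_count`, `demo_get_count`) is invented for illustration; none of them is part of pymtml or MTML.

```python
from ctypes import CFUNCTYPE, POINTER, byref, c_int, c_uint

DEMO_SUCCESS = 0  # Hypothetical return code, not an MTML constant.


class DemoError(Exception):
    """Stand-in for MTMLError: carries the C return code."""

    def __init__(self, code):
        super().__init__(f"call failed with code {code}")
        self.code = code


# Prototype of the pretend C function: int get_count(unsigned int *out).
_GET_COUNT_PROTO = CFUNCTYPE(c_int, POINTER(c_uint))


@_GET_COUNT_PROTO
def _fake_get_count(out):
    out[0] = 4  # Write through the out-parameter, as the C side would.
    return DEMO_SUCCESS


def demo_get_count():
    c_count = c_uint()                     # Storage owned by Python.
    ret = _fake_get_count(byref(c_count))  # Pass its address to the callee.
    if ret != DEMO_SUCCESS:
        raise DemoError(ret)               # Never return a value on failure.
    return c_count.value                   # Convert back to a Python int.
```

The wrapper allocates the result on the Python side, passes it by reference, checks the return code before touching the result, and only then unwraps `.value`.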

## Adding New MTML Bindings

1. Find the function signature in `mtml_2.2.0.h`
2. Define any new structs in `pymtml.py` using `_PrintableStructure`
3. Implement the wrapper function following existing patterns:

```python
def mtmlDeviceGetSomething(device):
    global libHandle
    c_result = c_uint()
    fn = _mtmlGetFunctionPointer("mtmlDeviceGetSomething")
    ret = fn(device, byref(c_result))
    _mtmlCheckReturn(ret)
    return c_result.value
```

4. Add corresponding NVML wrapper if applicable
5. Add test cases to `test_pymtml.py` and/or `test_pynvml.py`
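
Step 2 relies on `_PrintableStructure`, which is defined in `pymtml.py` and not reproduced here. As a rough illustration of what a struct binding involves, the sketch below uses a plain `ctypes.Structure` instead. The struct name, field names, and the 32-byte buffer size are assumptions made for the example; real layouts must be copied from `mtml_2.2.0.h`.

```python
from ctypes import Structure, c_char, c_uint


class DemoPciInfo(Structure):
    """Hypothetical struct; field order and sizes must match the C header."""

    _fields_ = [
        ("busId", c_char * 32),  # Fixed-size char buffer, read back as bytes.
        ("domain", c_uint),
        ("bus", c_uint),
    ]

    def __str__(self):
        # Readable dump of every field, useful when debugging a new binding.
        parts = [f"{name}: {getattr(self, name)!r}" for name, _ in self._fields_]
        return f"{type(self).__name__}({', '.join(parts)})"
```

A wrapper for a function that fills such a struct would follow the same shape as `mtmlDeviceGetSomething` above, creating the struct instance in place of `c_uint()` and returning the instance itself.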

## Testing Instructions

- **Always run tests before committing**: `python test_pymtml.py && python test_pynvml.py`
- **Tests require Moore Threads GPU hardware** with driver and libmtml.so installed
- **Test init/shutdown cycles**: The library supports multiple init/shutdown cycles
- **Check for segfaults**: Library shutdown must not cause crashes
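
Because the tests need real hardware, it can help to skip them cleanly on machines without the library. The sketch below is one possible approach, not how the repository's tests are written: `mtml_available` is a hypothetical helper that assumes the system loader can find the library under the name `mtml`, and the test class is an illustrative init/shutdown cycle check.

```python
import ctypes.util
import unittest


def mtml_available():
    """Best-effort check: True if the loader can locate a library named 'mtml'."""
    return ctypes.util.find_library("mtml") is not None


@unittest.skipUnless(mtml_available(), "libmtml.so not found")
class LifecycleTest(unittest.TestCase):
    def test_repeated_init_shutdown(self):
        # Imported here so the module still loads where pymtml is missing.
        import pymtml

        for _ in range(3):
            pymtml.mtmlLibraryInit()
            pymtml.mtmlLibraryShutDown()
```

`find_library` only consults the usual loader search locations, so a library installed in a non-standard directory may be reported as missing even though pymtml could load it.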

## Security Considerations

- This library loads `libmtml.so` dynamically via ctypes
- No network operations or external data fetching
- GPU operations require appropriate system permissions
- Treat device handles with care: do not use them after shutdown

## Common Patterns

### Library lifecycle
```python
mtmlLibraryInit()      # or nvmlInit()
# ... use library ...
mtmlLibraryShutDown()  # or nvmlShutdown()
```
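
If the code between init and shutdown raises, the plain pattern above skips the shutdown call. One way to guard against that is a small context manager. This is a sketch under the assumption that init and shutdown take no arguments; `library_session` is a hypothetical helper and is not provided by pymtml.

```python
from contextlib import contextmanager


@contextmanager
def library_session(init, shutdown):
    """Call init() on entry and shutdown() on exit, even if the body raises."""
    init()
    try:
        yield
    finally:
        shutdown()


# Hypothetical usage with pymtml:
# with library_session(pymtml.mtmlLibraryInit, pymtml.mtmlLibraryShutDown):
#     ...  # query devices here; handles must not escape this block
```

Keeping all handle use inside the `with` block also makes it harder to touch a device handle after shutdown.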

### Device iteration
```python
count = mtmlLibraryCountDevice()
for i in range(count):
    device = mtmlLibraryInitDeviceByIndex(i)
    # ... query device ...
```

### Sub-component access (GPU, Memory, VPU)
```python
device = mtmlLibraryInitDeviceByIndex(0)
gpu = mtmlDeviceInitGpu(device)
memory = mtmlDeviceInitMemory(device)
# ... use gpu/memory ...
mtmlDeviceFreeGpu(gpu)
mtmlDeviceFreeMemory(memory)
```
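
The sub-component pattern leaks handles if the code between init and free raises. A possible way to make the frees unconditional is `contextlib.ExitStack`, sketched below with the init and free functions passed in as parameters so the example runs without hardware. `query_components` is a hypothetical helper, not a pymtml function. `ExitStack` releases in reverse order of acquisition, which differs from the order shown above; whether MTML cares about that order is not stated in this document.

```python
from contextlib import ExitStack


def query_components(device, init_gpu, free_gpu, init_memory, free_memory, use):
    """Acquire GPU and memory sub-handles, run use(), and always free both."""
    with ExitStack() as stack:
        gpu = init_gpu(device)
        stack.callback(free_gpu, gpu)        # Registered right after acquire.
        memory = init_memory(device)
        stack.callback(free_memory, memory)  # Freed first on exit (LIFO).
        return use(gpu, memory)


# Hypothetical usage with pymtml:
# query_components(device, mtmlDeviceInitGpu, mtmlDeviceFreeGpu,
#                  mtmlDeviceInitMemory, mtmlDeviceFreeMemory, my_query)
```

Registering each free immediately after its matching init means a failure in `init_memory` still releases the GPU handle.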

## Known Issues

- `nvmlDeviceGetCudaComputeCapability()` returns `(0, 0)` unless torch_musa is available
- Use `patch_torch_c_for_musa()` to patch `torch._C` with functions from `torch_musa._MUSAC`
- The PCI `busId` field may come back empty from the driver; the library fills it in from `sbdf` when needed
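
Callers that branch on compute capability may want to treat the `(0, 0)` placeholder as "unknown" instead of as a real version. The helper below is a hypothetical sketch of that idea and is not part of pymtml; it takes the returned tuple as input, so it makes no assumption about how the tuple was obtained.

```python
def normalize_compute_capability(capability):
    """Map the (0, 0) placeholder to None so callers can branch on it."""
    major, minor = capability
    if (major, minor) == (0, 0):
        return None  # Capability unknown, e.g. torch_musa is unavailable.
    return (major, minor)
```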