Dev1/winskuo/gh/aihub remove code #2
Closed
winskuo-quic wants to merge 21 commits into main from
Conversation
Summary: Attempt 3 to check numel and nbytes overflow. This time we defer checking dynamically sized inputs until their size is realized.

Reviewed By: lucylq

Differential Revision: D98148157
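For illustration, here is a minimal sketch of the kind of checked computation this commit describes. The helper name and structure are hypothetical, not the actual ExecuTorch code; the point is the checked multiply, with dynamic dimensions deferred until they are bound.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical sketch of a numel overflow check; the real ExecuTorch
// implementation differs. Returns false on overflow. Per the commit message,
// dimensions that are still dynamic (unrealized) would be skipped by the
// caller and re-checked once their concrete size is known.
bool checked_numel(const std::vector<int64_t>& sizes, int64_t* out_numel) {
  int64_t numel = 1;
  for (int64_t s : sizes) {
    // GCC/Clang builtin: multiplies and reports wraparound in one step.
    if (__builtin_mul_overflow(numel, s, &numel)) {
      return false; // numel would exceed int64_t; reject the tensor
    }
  }
  *out_numel = numel;
  return true;
}
```

The nbytes check is the same pattern with one more checked multiply by the element size.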
…inear (pytorch#19117)

## Summary
The bilinear grid_sampler_2d portable kernel computes interpolation weights via subtractions like `(ix_se - ix)` where both operands are close integer-valued coordinates in pixel space. In fp16 (10 bits of mantissa) that is classic catastrophic cancellation: the result has only a handful of significant bits. The downstream weighted-sum accumulation then loses further precision.

Measured on a unit test exercising interior grid points with fp16 inputs, the kernel drifts by ~0.1 absolute from an fp32 reference. That is visible as incorrect depth / flow output near non-integer sample points, which is most of them.

## Fix
An `AccType<CTYPE>` trait mapping `Half` and `BFloat16` to `float`, leaving every other dtype unchanged. It is used for intermediate coordinate and weight computation and for `out_val` accumulation. Loads cast `CTYPE -> ACC`; the final store casts `ACC -> CTYPE` once. Only internal math is promoted; memory layout, public API, and tensor dtypes are unchanged.

```cpp
template <typename CTYPE>
using AccType = std::conditional_t<
    std::is_same_v<CTYPE, executorch::aten::Half> ||
        std::is_same_v<CTYPE, executorch::aten::BFloat16>,
    float,
    CTYPE>;
```

## Effects
- **fp32 / Int / any non-half dtype**: `AccType<T>` is `T`, so the generated code is byte-identical. No behavior change.
- **Half / BFloat16**: `max_abs` vs an fp32 reference drops from **~0.1 to 0** on the shapes I tested (N=1..2, C=7..64, H/W up to 96, both `align_corners` values).
- **Perf**: a handful of fp16↔fp32 conversions per output element. Not measurable at op level; well within the portable kernel's scalar cost envelope.

## Scope
Only touches the bilinear interpolation path. The nearest-mode path doesn't do weighted-sum accumulation and doesn't have the cancellation issue; it is left alone in this change.

## Test plan
- [x] Builds clean for Android arm64 and host (Apple Clang 21).
- [x] Verified numerically via a standalone harness that runs the kernel with matched fp32 / fp16 inputs and compares against an fp32-then-downcast reference. All shapes pass within a single fp16 ULP (or are bit-exact). fp32 tests remain bit-identical to the pre-change kernel.
- [x] Existing `kernels/test/op_grid_sampler_2d_test.cpp` unit tests continue to pass (both the fp32 shapes that were previously tested and the fp16 path I'm specifically fixing).

Happy to add an fp16-specific test case to `op_grid_sampler_2d_test.cpp` if useful for CI coverage here; just let me know the preferred approach.

cc @larryliu0820 @manuelcandales
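To make the promotion pattern concrete, here is a self-contained sketch of how such a trait is applied in a weighted-sum kernel. The names (`lerp_sample`, the `Half` stand-in, the 1-D shape) are illustrative only, not the actual grid_sampler_2d code, and the simplified trait omits `BFloat16`.

```cpp
#include <cstdint>
#include <type_traits>

// Stand-in for executorch::aten::Half so this sketch compiles standalone.
struct Half {
  float v = 0.f;
  Half() = default;
  explicit Half(float f) : v(f) {}
  operator float() const { return v; }
};

// Simplified trait (the real one also maps BFloat16 to float).
template <typename CTYPE>
using AccType = std::conditional_t<std::is_same_v<CTYPE, Half>, float, CTYPE>;

// Illustrative 1-D linear sample: weights and the accumulator live in ACC,
// loads cast CTYPE -> ACC, and a single final store narrows ACC -> CTYPE.
// Assumes 0 <= x < n.
template <typename CTYPE>
CTYPE lerp_sample(const CTYPE* in, int64_t n, float x) {
  using ACC = AccType<CTYPE>;
  const int64_t i0 = static_cast<int64_t>(x);
  const int64_t i1 = (i0 + 1 < n) ? i0 + 1 : i0;
  // The subtraction happens in ACC (float for Half), so the fp16
  // catastrophic-cancellation case never arises.
  const ACC w1 = static_cast<ACC>(x) - static_cast<ACC>(i0);
  const ACC w0 = ACC(1) - w1;
  ACC out_val = ACC(0);
  out_val += static_cast<ACC>(in[i0]) * w0; // load casts CTYPE -> ACC
  out_val += static_cast<ACC>(in[i1]) * w1;
  return static_cast<CTYPE>(out_val); // one narrowing store
}
```

For fp32 every cast is a no-op, which is consistent with the byte-identical claim in the Effects section above.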
Differential Revision: D101728720
Pull Request resolved: pytorch#19052
### Summary
Fixes pytorch#18924

Extends `aten.bitwise_not` support in the MLX delegate to handle integer tensors, not just boolean tensors. Previously the handler only dispatched to `LogicalNotNode` for `bool` and raised `NotImplementedError` for all other dtypes. This adds a dedicated `BitwiseInvertNode` backed by `mlx::core::bitwise_invert`, and updates the handler to dispatch based on dtype:

- `bool` → `LogicalNotNode` (unchanged)
- `int32`, `int64` → `BitwiseInvertNode`

#### Changes:
- `serialization/schema.fbs`: add the `BitwiseInvertNode` table and append it to the `OpNode` union
- `runtime/MLXInterpreter.h`: add `exec_bitwise_invert()` and a dispatch case
- `ops.py`: update `_bitwise_not_handler` to dispatch to `BitwiseInvertNode` for integers
- `test/test_ops.py`: add a `bitwise_not_int` test for `int32` and `int64`

### Test plan
All tests were run on a machine with an Apple M1 Pro CPU, macOS 26.4.1.

- `python3 -m py_compile backends/mlx/ops.py backends/mlx/test/test_ops.py`
- `python3 backends/mlx/serialization/generate.py`
- `python3 -m executorch.backends.mlx.test.run_all_tests bitwise_not_int`

### Test output
```
============================================================
TEST SUMMARY
============================================================
Passed: 6
Failed: 0
============================================================
```

cc @metascroy

---------

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Scott Roy <161522778+metascroy@users.noreply.github.com>
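For reference, a minimal sketch of what the runtime side of this change could look like. `exec_bitwise_invert` and `mlx::core::bitwise_invert` are named in the description above, but the standalone free-function shape here is hypothetical scaffolding, not the actual MLXInterpreter code:

```cpp
#include <mlx/mlx.h>

namespace mx = mlx::core;

// Hypothetical executor for a BitwiseInvertNode: elementwise bitwise NOT
// of an integer input. Shape and dtype pass through unchanged.
mx::array exec_bitwise_invert(const mx::array& input) {
  return mx::bitwise_invert(input);
}
```

On int32 input, bitwise NOT of `1` is `-2` under two's complement, which is the kind of value the new `bitwise_not_int` test would exercise.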
Differential Revision: D102070375
Pull Request resolved: pytorch#19074
pytorch#19075 failed to cherry-pick to main
Add Closeable interface so Module can be used with try-with-resources. close() delegates to destroy(). Also make destroy() idempotent by checking mHybridData.isValid() before calling resetNative(), satisfying the Closeable contract. This commit was authored with the help of Claude.
- TrainingModule: implement Closeable; replace Log.e + silent empty returns with IllegalStateException throws. Add a checkNotDestroyed() guard on all public methods.
- SGD: throw IllegalStateException instead of a bare RuntimeException when the optimizer is destroyed.
- AsrModule: throw ExecutorchRuntimeException instead of a bare RuntimeException on transcription failure.
- ExecuTorchRuntime.validateFilePath: throw IllegalArgumentException instead of a bare RuntimeException, with a descriptive message.
- JNI constructors: wrap the ExecuTorchJni and ExecuTorchLlmJni constructor bodies in try-catch so C++ exceptions become ExecutorchRuntimeException instead of generic RuntimeException.

This commit was authored with the help of Claude.

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Differential Revision: D102425537
Pull Request resolved: pytorch#19128
…_nhwc operator + tests

Differential Revision: D96507563
Pull Request resolved: pytorch#18479
…ch#19028 (pytorch#19133)

## Summary
Reverts the following Android PRs:
- pytorch#19099: Android: consistent error types across all modules
- pytorch#19124: Android: Module implements Closeable
- pytorch#19092: Android: improve error diagnostics for LlmModule and exceptions
- pytorch#19028: Ignored Module tests: provide required input tensor

Authored with Claude.
Differential Revision: D102488314
Pull Request resolved: pytorch#19134
Differential Revision: D102385104 Pull Request resolved: pytorch#19122
Differential Revision: D102493794 Pull Request resolved: pytorch#19136
…rch#19137)

Differential Revision: D102493838
Pull Request resolved: pytorch#19137
…ualcomm/README.md
Summary
[PLEASE REMOVE] See CONTRIBUTING.md's Pull Requests for ExecuTorch PR guidelines.
[PLEASE REMOVE] If this PR closes an issue, please add a `Fixes #<issue-id>` line.
[PLEASE REMOVE] If this PR introduces a fix or feature that should be in the upcoming release notes, please add a "Release notes: " label. For a list of available release notes labels, check out CONTRIBUTING.md's Pull Requests.
Test plan
[PLEASE REMOVE] How did you test this PR? Please write down any manual commands you used and note down tests that you have written if applicable.