Stars
DeepEP: an efficient expert-parallel communication library
Sample code for my CUDA programming book
[ARCHIVED] Cooperative primitives for CUDA C++. See https://github.com/NVIDIA/cccl
Learn CUDA Programming, published by Packt
Examples demonstrating available options to program multiple GPUs in a single node or a cluster
Source code that accompanies The CUDA Handbook.
PopSift is an implementation of the SIFT algorithm in CUDA.
CUDA-L2: Surpassing cuBLAS Performance for Matrix Multiplication through Reinforcement Learning
Alex Krizhevsky's original code from Google Code
FLAME GPU 2 is a GPU accelerated agent based modelling framework for CUDA C++ and Python
Parallel simulated annealing on the GPU using CUDA (applied to the floorplanning problem)




