
[Do Not Merge] model : LFM2.5-Audio-1.5B#18641

Draft
tdakhran wants to merge 49 commits into ggml-org:master from
tdakhran:tarek/feat/os-lfm2.5-audio-1.5b-upstream

Conversation

@tdakhran
Contributor

@tdakhran tdakhran commented Jan 6, 2026

Liquid AI released LFM2.5-Audio-1.5B.

LFM2.5-Audio-1.5B is Liquid AI's updated end-to-end audio foundation model. Key improvements include a custom, LFM-based audio detokenizer, llama.cpp-compatible GGUFs for CPU inference, and better ASR and TTS performance.

This PR is intended to provide a functional implementation in llama.cpp until the necessary infrastructure lands upstream.
The plan is to split it up and merge it upstream in smaller chunks, while keeping and tracking the functional implementation here. It will be rebased from time to time.

GGUFs, precompiled runners, and instructions live at https://huggingface.co/LiquidAI/LFM2.5-Audio-1.5B-GGUF.

Merge plan:

Demo of capabilities (watch with audio on)

demo.mp4

Thank you, @ngxson for the help!

@github-actions github-actions bot added the `model`, `examples`, `python`, and `server` labels Jan 6, 2026
@tdakhran tdakhran force-pushed the tarek/feat/os-lfm2.5-audio-1.5b-upstream branch from c275436 to e1a8fd1 on January 6, 2026 14:46
@tdakhran
Contributor Author

tdakhran commented Jan 6, 2026

@ngxson @CISC is there a way to disable CI for this PR? There is no need to build it for each commit.

@CISC
Member

CISC commented Jan 6, 2026

@ngxson @CISC is there a way to disable CI for this PR? There is no need to build it for each commit.

Only way I know is to have a merge conflict.

@ggerganov
Member

If the string [no ci] is present anywhere in the commit message, it won't execute the CI

@CISC
Member

CISC commented Jan 6, 2026

If the string [no ci] is present anywhere in the commit message, it won't execute the CI

Or that. We just have to remember to remove them all from the merge message. :)
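For illustration, the skip logic described above amounts to a simple substring check on the commit message. A hypothetical sketch (`ci_should_run` is an invented name; the real check lives in the CI workflow configuration):

```python
# Hypothetical sketch of the CI gate discussed above: the workflow skips
# whenever the literal string "[no ci]" appears anywhere in the commit
# message. This is not llama.cpp's actual workflow code.
def ci_should_run(commit_message: str) -> bool:
    return "[no ci]" not in commit_message

print(ci_should_run("audio: rework istft cache [no ci]"))  # False
print(ci_should_run("audio: rework istft cache"))          # True
```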

Change is decoupled from ggml-org#18641.

[LFM2.5-Audio-1.5B](https://huggingface.co/LiquidAI/LFM2.5-Audio-1.5B)
needs a streaming ISTFT for generating output audio.

* add a streaming ISTFT class (`mtmd_audio_streaming_istft`) with overlap-add for audio reconstruction
* replace the global audio cache with per-instance caches; the model requires
  two independent caches: one for preprocessing (audio input) and one for ISTFT
  (audio output)
* unify the templated FFT/IFFT implementation to support both forward and inverse transforms
… tarek/feat/os-lfm2.5-audio-1.5b-upstream

[no ci]
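The overlap-add reconstruction described in this commit can be sketched in a few lines. This is a hypothetical NumPy sketch, not the PR's actual C++ `mtmd_audio_streaming_istft`; the window choice and frame sizes are assumptions:

```python
import numpy as np

class StreamingISTFT:
    """Streaming inverse STFT via overlap-add: each spectral frame is
    inverse-transformed, windowed, and summed with the tail carried over
    from the previous frame; `hop` finished samples are emitted per call.
    Hypothetical sketch, not the PR's C++ implementation."""

    def __init__(self, n_fft: int = 512, hop: int = 128):
        self.n_fft = n_fft
        self.hop = hop
        self.window = np.hanning(n_fft)        # synthesis window (assumed)
        self.overlap = np.zeros(n_fft - hop)   # tail carried between frames

    def push_frame(self, spectrum: np.ndarray) -> np.ndarray:
        # one spectral frame (n_fft // 2 + 1 bins) -> n_fft time samples
        frame = np.fft.irfft(spectrum, n=self.n_fft) * self.window
        frame[: self.n_fft - self.hop] += self.overlap   # overlap-add
        out = frame[: self.hop].copy()                   # finished samples
        self.overlap = frame[self.hop :]                 # carry the rest
        return out

istft = StreamingISTFT()
chunk = istft.push_frame(np.zeros(257, dtype=complex))   # one silent frame
print(chunk.shape)  # (128,)
```

Keeping the overlap buffer per instance is what makes the two independent caches (input preprocessing vs. output ISTFT) possible.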
ngxson pushed a commit that referenced this pull request Jan 6, 2026
@elfarolab

@tdakhran

Hello Tarek,

I am trying to build your WIP PR.
I know it is a draft and should be considered work in progress.

With the last commit, 'Read n_layer from gguf', using LTO, the build fails at the very end:

```
FAILED: bin/llama-liquid-audio-cli
: && /usr/bin/c++ -O3 -DNDEBUG  tools/liquid-audio/CMakeFiles/llama-liquid-audio-cli.dir/cli.cpp.o -o bin/llama-liquid-audio-cli  tools/liquid-audio/libliquid-audio.a  common/libcommon.a  /usr/lib/aarch64-linux-gnu/libcurl.so  tools/mtmd/libmtmd.a  src/libllama.a  ggml/src/libggml.a  ggml/src/libggml-cpu.a  /usr/lib/gcc/aarch64-linux-gnu/11/libgomp.a  /usr/lib/aarch64-linux-gnu/libpthread.a  ggml/src/ggml-blas/libggml-blas.a  /usr/lib/aarch64-linux-gnu/libopenblas.so.0  ggml/src/ggml-cuda/libggml-cuda.a  ggml/src/libggml-base.a  -lm  /usr/local/cuda-12.6/targets/aarch64-linux/lib/libcudart_static.a  /usr/local/cuda-12.6/targets/aarch64-linux/lib/libcublas_static.a  /usr/local/cuda-12.6/targets/aarch64-linux/lib/libcublasLt_static.a  /usr/local/cuda-12.6/targets/aarch64-linux/lib/libculibos.a  /usr/local/cuda-12.6/targets/aarch64-linux/lib/stubs/libcuda.so  -ldl  /usr/lib/aarch64-linux-gnu/librt.a && :
/usr/bin/ld: tools/mtmd/libmtmd.a(mtmd-helper.cpp.o):(.bss+0x28): multiple definition of `ma_atomic_global_lock'; tools/liquid-audio/CMakeFiles/llama-liquid-audio-cli.dir/cli.cpp.o:(.bss+0x0): first defined here
lto-wrapper: warning: using serial compilation of 17 LTRANS jobs
collect2: error: ld returned 1 exit status
[474/474] : && /usr/bin/c++ -O3 -DNDEBUG  tools/liquid-audio/CMakeFiles/llama-liquid-audio-server.dir/server.cpp.o -o bin/llama-liquid-audio-server  tools/liquid-audio/libliquid-audio.a  vendor/cpp-httplib/libcpp-httplib.a  common/libcommon.a  /usr/lib/aarch64-linux-gnu/libcurl.so  tools/mtmd/libmtmd.a  src/libllama.a  ggml/src/libggml.a  ggml/src/libggml-cpu.a  /usr/lib/gcc/aarch64-linux-gnu/11/libgomp.a  /usr/lib/aarch64-linux-gnu/libpthread.a  ggml/src/ggml-blas/libggml-blas.a  /usr/lib/aarch64-linux-gnu/libopenblas.so.0  ggml/src/ggml-cuda/libggml-cuda.a  ggml/src/libggml-base.a  -lm  /usr/local/cuda-12.6/targets/aarch64-linux/lib/libcudart_static.a  /usr/local/cuda-12.6/targets/aarch64-linux/lib/libcublas_static.a  /usr/local/cuda-12.6/targets/aarch64-linux/lib/libcublasLt_static.a  /usr/local/cuda-12.6/targets/aarch64-linux/lib/libculibos.a  /usr/local/cuda-12.6/targets/aarch64-linux/lib/stubs/libcuda.so  -ldl  /usr/lib/aarch64-linux-gnu/librt.a  /usr/lib/aarch64-linux-gnu/libssl.so  /usr/lib/aarch64-linux-gnu/libcrypto.so && :
lto-wrapper: warning: using serial compilation of 17 LTRANS jobs
ninja: build stopped: subcommand failed.
```

llama-server and llama-liquid-audio-server are successfully built; the CLI fails.

If there is anything I can do to help with testing, let me know.
I am also building a system with this model on a Jetson Orin.

Thank you so much.

@tdakhran
Contributor Author

tdakhran commented Jan 7, 2026

@elfarolab , the mentioned commit didn't change anything related to compilation or LTO; could it be that there are stale object files somewhere?

I tested that a clean build in the ubuntu:24.04 Docker image works:

```shell
root@1641914992f4:/tmp/build# cmake /mnt -DLLAMA_CURL=OFF
root@1641914992f4:/tmp/build# make -j20 llama-liquid-audio-cli llama-liquid-audio-server
...
[ 98%] Built target liquid-audio
[100%] Built target llama-liquid-audio-cli
[100%] Built target llama-liquid-audio-server
```

UPD: it's related to miniaudio.

The CLI defines the miniaudio implementation here https://github.com/ggml-org/llama.cpp/pull/18641/changes#diff-73f13371b37801825dc2cdbfacadf9af40aef9dca4770d9dacbbe3534c7a7dacR13 , and another implementation is already defined in mtmd's audio code.

Try commenting out that line.
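The failure mode is the usual single-header-library pitfall: exactly one translation unit may define the implementation macro, or the linker sees duplicate symbols like `ma_atomic_global_lock`. A small, hypothetical scan for offenders (macro name per miniaudio's convention; file names invented to mirror this thread):

```python
import pathlib
import tempfile

def implementation_units(root: pathlib.Path,
                         macro: str = "MINIAUDIO_IMPLEMENTATION") -> list:
    """Return source files under `root` that define the implementation
    macro; a single-header library must have exactly one such file."""
    return sorted(
        p for p in root.rglob("*.cpp")
        if f"#define {macro}" in p.read_text(errors="ignore")
    )

# Reproduce the bad state from this thread: both the CLI and mtmd's audio
# code defined the implementation, duplicating symbols at link time.
with tempfile.TemporaryDirectory() as d:
    root = pathlib.Path(d)
    (root / "cli.cpp").write_text("#define MINIAUDIO_IMPLEMENTATION\n")
    (root / "audio.cpp").write_text("#define MINIAUDIO_IMPLEMENTATION\n")
    offenders = implementation_units(root)
    print(len(offenders))  # 2 -> one definition too many
```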

@elfarolab

Before building, I delete the build destination directory every time.
I am building with these options:

```
CMAKE_BUILD_TYPE=Release
CMAKE_INSTALL_PREFIX=$LLAMACPP_PREFIX_DIR
GGML_CUDA=ON
GGML_CUDA_FA=ON
GGML_CUDA_GRAPHS=ON
GGML_CUDA_FORCE_CUBLAS=ON
GGML_BLAS=ON
GGML_BLAS_VENDOR=OpenBLAS
BLAS_LIBRARIES="$OPENBLAS_LIB"
GGML_CUDA_USE_MMQ=ON
GGML_CUDA_FA_ALL_QUANTS=ON
GGML_AVX=OFF
GGML_AVX2=OFF
GGML_AVX512=OFF
GGML_SSE42=OFF
GGML_F16C=OFF
GGML_FMA=OFF
GGML_ACCELERATE=OFF
GGML_METAL=OFF
GGML_OPENCL=OFF
GGML_SYCL=OFF
GGML_HEXAGON=OFF
GGML_HIP=OFF
GGML_WEBGPU=OFF
GGML_VULKAN=OFF
GGML_LTO=ON
BUILD_SHARED_LIBS=OFF
GGML_STATIC=ON
CMAKE_CUDA_ARCHITECTURES=87
GGML_CUDA_F16=ON
GGML_CUDA_BF16=ON
BLA_STATIC=ON
LLAMA_BUILD_EXAMPLES=ON
LLAMA_BUILD_TESTS=OFF
LLAMA_OPENSSL=ON
LLAMA_CURL=ON
GGML_CUDA_JETSON_DEVICE=ON
GGML_CUDA_ENABLE_UNIFIED_MEMORY=ON
LLAMA_TOOLS_INSTALL=ON
GGML_BACKEND_DL=OFF
GGML_CPU_ALL_VARIANTS=OFF
```

I always build llama.cpp the same way with the options above and never get failures.
Also, this is not the first time I've built a PR.
I could try building without ninja.

@tdakhran
Contributor Author

tdakhran commented Jan 7, 2026

@elfarolab , it should work now; there were two implementations of miniaudio.

@elfarolab

@elfarolab , it should work now; there were two implementations of miniaudio.

rebuilding

@tdakhran tdakhran force-pushed the tarek/feat/os-lfm2.5-audio-1.5b-upstream branch from 4f1cc0c to 4bee388 on February 17, 2026 14:06
@tdakhran tdakhran force-pushed the tarek/feat/os-lfm2.5-audio-1.5b-upstream branch from 4bee388 to 39ff210 on February 17, 2026 14:06
[LFM2.5-Audio-1.5B](https://huggingface.co/LiquidAI/LFM2.5-Audio-1.5B) introduced a lightweight audio tokenizer.

The tokenizer is based on the LFM2 architecture and acts as an "embedding" model with
different input `n_embd` and output `n_embd_out` widths.

To be used in ggml-org#18641.

To convert, use:

```shell
python3 convert_hf_to_gguf.py /path/to/LFM2.5-Audio-1.5B/audio_detokenizer
```
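A shape-only sketch of the asymmetry mentioned above, assuming invented dimensions (only the differing input/output widths are the point; this is not the converted model's real graph):

```python
import numpy as np

# Hypothetical dims: the backbone works at width n_embd, while the output
# projection maps to a different width n_embd_out (values invented).
n_embd, n_embd_out, n_frames = 1024, 80, 4
rng = np.random.default_rng(0)

hidden = rng.standard_normal((n_frames, n_embd))   # LFM2-style backbone output
w_out = rng.standard_normal((n_embd, n_embd_out))  # output projection
features = hidden @ w_out                          # (n_frames, n_embd_out)
print(features.shape)  # (4, 80)
```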
tdakhran added a commit to tdakhran/llama.cpp that referenced this pull request Feb 18, 2026
CISC added a commit that referenced this pull request Feb 19, 2026
* model : Add tokenizer from LFM2.5-Audio-1.5B

[LFM2.5-Audio-1.5B](https://huggingface.co/LiquidAI/LFM2.5-Audio-1.5B) introduced a lightweight audio tokenizer.

The tokenizer is based on the LFM2 architecture and acts as an "embedding" model with
different input `n_embd` and output `n_embd_out` widths.

To be used in #18641.

To convert, use:

```shell
python3 convert_hf_to_gguf.py /path/to/LFM2.5-Audio-1.5B/audio_detokenizer
```

* Update convert_hf_to_gguf.py

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

* Formatting

* Rework check for attention layers

* Add LFM2 SWA model support

* Address PR feedback

* Set vocab to none

* Move helper function definitions to cpp file

---------

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
liparetejas pushed a commit to liparetejas/llama.cpp that referenced this pull request Feb 23, 2026
bartowski1182 pushed a commit to bartowski1182/llama.cpp that referenced this pull request Mar 2, 2026
@IceWreck

IceWreck commented Mar 2, 2026

@tdakhran I see all 4 have been merged. Does this mean LFM2.5 Audio works in llama.cpp?

ArberSephirotheca pushed a commit to ArberSephirotheca/llama.cpp that referenced this pull request Mar 3, 2026
@tdakhran
Contributor Author

tdakhran commented Mar 3, 2026

@tdakhran I see all 4 have been merged. Does this mean LFM2.5 Audio works in llama.cpp?

It's yes and no. The ASR part was merged a while ago; for speech output, changes to the mtmd API and llama-server are still required.

@zcattacz

zcattacz commented Mar 7, 2026

I tried to build from your branch at the last commit 006639c, but ended up with the following error:

```shell
cmake -B build -DGGML_BLAS=ON -DGGML_BLAS_VENDOR=OpenBLAS
cmake --build build --config Release
```

```
....
[ 71%] Linking CXX shared library ../../bin/libmtmd.so
[ 71%] Built target mtmd
[ 72%] Building C object tests/CMakeFiles/test-mtmd-c-api.dir/test-mtmd-c-api.c.o
In file included from /dev/shm/llama/llama.cpp-tarek-feat-os-lfm2.5-audio-1.5b-upstream/tests/test-mtmd-c-api.c:4:
/dev/shm/llama/llama.cpp-tarek-feat-os-lfm2.5-audio-1.5b-upstream/tools/mtmd/./mtmd.h:266:10: error: unknown type name ‘mtmd_output_modality’
  266 | MTMD_API mtmd_output_modality mtmd_get_output_modality(mtmd_context * ctx);
      |          ^~~~~~~~~~~~~~~~~~~~
/dev/shm/llama/llama.cpp-tarek-feat-os-lfm2.5-audio-1.5b-upstream/tools/mtmd/./mtmd.h:278:68: error: unknown type name ‘mtmd_output_modality’
  278 | MTMD_API void mtmd_set_output_modalities(mtmd_context * ctx, const mtmd_output_modality * ptr, size_t len);
      |                                                                    ^~~~~~~~~~~~~~~~~~~~
gmake[2]: *** [tests/CMakeFiles/test-mtmd-c-api.dir/build.make:76: tests/CMakeFiles/test-mtmd-c-api.dir/test-mtmd-c-api.c.o] Error 1
gmake[1]: *** [CMakeFiles/Makefile2:3096: tests/CMakeFiles/test-mtmd-c-api.dir/all] Error 2
```

I changed `mtmd_output_modality` in mtmd.h to:

```c
typedef enum mtmd_output_modality {
    MTMD_OUTPUT_MODALITY_TEXT,
    MTMD_OUTPUT_MODALITY_AUDIO,
    MTMD_OUTPUT_MODALITY_END,
} mtmd_output_modality;
```

I grabbed the tarball from the webpage. Hmm, I also had to add -DLLAMA_BUILD_TESTS=OFF to finish:

```
[ 72%] Linking CXX executable ../bin/test-mtmd-c-api
/usr/bin/ld: ../bin/libmtmd.so.0.0.0: undefined reference to `common_init_result::~common_init_result()'
/usr/bin/ld: ../bin/libmtmd.so.0.0.0: undefined reference to `common_init_result::model()'
/usr/bin/ld: ../bin/libmtmd.so.0.0.0: undefined reference to `common_init_result::context()'
/usr/bin/ld: ../bin/libmtmd.so.0.0.0: undefined reference to `LLAMA_BUILD_NUMBER'
/usr/bin/ld: ../bin/libmtmd.so.0.0.0: undefined reference to `common_init_from_params(common_params&)'
/usr/bin/ld: ../bin/libmtmd.so.0.0.0: undefined reference to `LLAMA_COMMIT'
collect2: error: ld returned 1 exit status
gmake[2]: *** [tests/CMakeFiles/test-mtmd-c-api.dir/build.make:123: bin/test-mtmd-c-api] Error 1
gmake[1]: *** [CMakeFiles/Makefile2:3096: tests/CMakeFiles/test-mtmd-c-api.dir/all] Error 2
gmake: *** [Makefile:146: all] Error 2
```

@tdakhran
Contributor Author

tdakhran commented Mar 7, 2026

@zcattacz , this is a draft PR; not all targets are guaranteed to build.

This works:

```shell
cmake --build build --target llama-server --target llama-liquid-audio-server --target llama-liquid-audio-cli
```

TimPietruskyRunPod added a commit to runpod-labs/a2go-llamacpp that referenced this pull request Apr 3, 2026
Remove PR ggml-org#12794 (OuteTTS 1.0) and PR ggml-org#18039 (Eagle-3 speculative
decoding) from the cherry-pick list. Neither is used by any model
in the registry. Only PR ggml-org#18641 (LFM2.5 audio) remains.
@ngxson
Contributor

ngxson commented Apr 15, 2026

FYI @tdakhran , I had some discussions recently with the NVIDIA team to bring their chatterbox to llama.cpp. I summarized the design choices in #18641

I'll try to take over this PR when I have time (and implement it as the reference for the new audio generation API in mtmd). Feel free to continue the discussion in the mentioned issue. Thanks!

