Gguf chunking #397

@3inary

Describe the feature

After a laborious journey through Apple notarization, I discovered that the only way to package and ship macOS builds that include larger LLMs is GGUF chunking.

Notarization fails for files larger than ≈ 4 GB.
Loading chunked GGUFs already works in LLMUnity/llama.cpp.
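
To illustrate the first point, here is a minimal sketch of a pre-notarization check that lists files above a size threshold in a build folder. The function name `find_oversized_files` and the exact threshold are assumptions for illustration; the ≈ 4 GB figure is only what I observed, not a documented limit.

```python
import os
from pathlib import Path

# Assumption: failures were observed at roughly 4 GB; the exact cutoff is not documented here.
APPROX_LIMIT_BYTES = 4 * 1024**3


def find_oversized_files(root, limit_bytes=APPROX_LIMIT_BYTES):
    """Return (path, size) pairs for files under `root` larger than `limit_bytes`, largest first."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            if path.is_symlink():
                continue  # skip symlinks so targets are not counted twice
            size = path.stat().st_size
            if size > limit_bytes:
                hits.append((path, size))
    hits.sort(key=lambda item: item[1], reverse=True)
    return hits
```

Running this on the build output before submitting it (the path is up to you) shows which model files would need to be chunked first.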

If you could issue a warning or document this, it might save other macOS developers a lot of headaches.
Ideally, chunking via llama-gguf-split (part of the llama.cpp tools) would be integrated into LLMUnity and offered via the LLM/Build Manager.
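
As a rough sketch of what such an integration would need to run, the snippet below wraps `llama-gguf-split` from Python. It assumes the tool is on `PATH` and accepts `--split --split-max-size <N>G <input> <output-prefix>`, and that shards are written as `<prefix>-00001-of-0000N.gguf`, which matches my understanding of the llama.cpp tool. The helper names and the `3G` default are hypothetical, chosen only to stay under the ≈ 4 GB limit.

```python
import shutil
import subprocess
from pathlib import Path


def build_split_command(model_path, out_prefix, max_size="3G", tool="llama-gguf-split"):
    """Build the argument list for splitting a GGUF file into shards of at most `max_size`."""
    # Assumption: flag names follow llama.cpp's gguf-split tool; verify with `--help`.
    return [tool, "--split", "--split-max-size", max_size, str(model_path), str(out_prefix)]


def split_gguf(model_path, out_prefix, max_size="3G", tool="llama-gguf-split"):
    """Run the splitter and return the shard paths found next to `out_prefix`."""
    if shutil.which(tool) is None:
        raise FileNotFoundError(f"{tool} not found on PATH; build or install the llama.cpp tools first")
    subprocess.run(build_split_command(model_path, out_prefix, max_size, tool), check=True)
    prefix = Path(out_prefix)
    # Assumption: shards are named <prefix>-00001-of-0000N.gguf.
    return sorted(prefix.parent.glob(prefix.name + "-*-of-*.gguf"))
```

For example, `split_gguf("model.gguf", "StreamingAssets/model")` would be expected to leave several shards next to the prefix (both paths are placeholders). As far as I can tell, llama.cpp is then pointed at the first shard and picks up the rest itself.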

Labels: enhancement (New feature or request)