Tether has unveiled an AI training framework that aims to democratize access to AI model development by enabling fine-tuning on consumer-grade hardware, such as smartphones and non-Nvidia GPUs. The system, part of the QVAC platform, combines Microsoft's BitNet architecture with LoRA (low-rank adaptation) to sharply reduce the memory and compute demands typically associated with training large AI models.
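Tether has not published implementation details, but the core idea behind LoRA is standard: freeze the base weights and train only two small low-rank factors per layer. The sketch below (plain NumPy, with hypothetical dimensions chosen for illustration) shows why this slashes the number of trainable parameters.

```python
import numpy as np

# Illustrative LoRA sketch (not Tether's code): instead of updating a full
# weight matrix W (d_out x d_in), train two small factors B (d_out x r)
# and A (r x d_in), so the adapted layer computes W @ x + (alpha/r) * B @ A @ x.
d_in, d_out, r = 512, 512, 8  # hypothetical layer size and rank
rng = np.random.default_rng(0)

W = rng.standard_normal((d_out, d_in)) * 0.02  # frozen base weights
A = rng.standard_normal((r, d_in)) * 0.01      # trainable, small random init
B = np.zeros((d_out, r))                       # trainable, zero init

def forward(x, alpha=16):
    # Base path plus the scaled low-rank update.
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.standard_normal(d_in)
# Because B starts at zero, the adapted model initially matches the base model.
assert np.allclose(forward(x), W @ x)

full_params = W.size
lora_params = A.size + B.size
print(f"trainable params: {lora_params} vs {full_params} for full fine-tuning "
      f"({lora_params / full_params:.1%})")
```

At rank 8 this trains roughly 3% of the layer's parameters, which is the kind of saving that makes fine-tuning feasible on phone-class memory budgets.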
Tether engineers demonstrated fine-tuning of models with up to 1 billion parameters on mobile devices in under two hours, and the framework supports models as large as 13 billion parameters. It cuts VRAM requirements by up to 77.8% compared with conventional 16-bit models, improving performance on memory-constrained hardware.
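For intuition on where savings of this magnitude come from, consider weight storage alone. BitNet's b1.58 variant uses ternary weights (-1, 0, +1), about log2(3) ≈ 1.58 bits each, versus 16 bits in fp16. The back-of-envelope below is illustrative only; Tether's 77.8% figure covers total training VRAM (activations, gradients, optimizer state), not just weights.

```python
import math

# Illustrative weight-storage comparison for a 1B-parameter model.
# Not Tether's accounting: their 77.8% figure is for total training VRAM.
params = 1_000_000_000

fp16_gb = params * 16 / 8 / 1e9             # 16 bits per weight
ternary_bits = math.log2(3)                 # ~1.58 bits per ternary weight
bitnet_gb = params * ternary_bits / 8 / 1e9

print(f"fp16 weights:   {fp16_gb:.2f} GB")
print(f"ternary weights: {bitnet_gb:.2f} GB "
      f"({1 - bitnet_gb / fp16_gb:.1%} smaller)")
```

Weights alone shrink by roughly 90%, so an overall training-memory reduction in the high-70s is plausible once the components that cannot be quantized this aggressively are included.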
The framework also accelerates inference for BitNet models on mobile GPUs, outpacing traditional CPU execution. Tether highlighted potential applications including on-device training and federated learning, both of which reduce dependence on centralized cloud infrastructure.