Open-source platform to train, download, and run LLMs on device.
Hello, everyone!
Meet Kolosal AI, an open-source platform that lets you run LLMs right on your own device—whether you’re using a CPU or GPU. With Kolosal, your privacy stays protected and energy consumption remains low, making it more eco-friendly compared to large-scale AI systems like ChatGPT or Gemini.
Kolosal is designed to be fast, lightweight, and sustainable. It’s only 20 MB in size (that’s just 0.1–0.2% of the size of similar platforms like Ollama or LMStudio), yet it can run LLMs just as quickly or even faster than its competitors. Soon (once we fix a few bugs), Kolosal will also be able to run on other devices, including smartphones and single-board computers like the Raspberry Pi or Jetson.
We’re a team of passionate students committed to optimizing on-device LLM training and deployment.
Try running it locally with Kolosal. Check it out at https://kolosal.ai/.
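Most on-device LLM apps (Ollama, LM Studio, llama.cpp's server) expose an OpenAI-compatible HTTP API on localhost; whether Kolosal does the same is an assumption here, not something the listing confirms. Under that assumption, a minimal sketch of what a local chat request would look like (the endpoint and model name are hypothetical):

```python
import json

# Hypothetical local endpoint; many on-device LLM servers listen on
# localhost with an OpenAI-compatible /v1/chat/completions route.
# Whether Kolosal uses this port and path is an assumption.
ENDPOINT = "http://localhost:8080/v1/chat/completions"

def build_chat_request(model: str, prompt: str, temperature: float = 0.7) -> str:
    """Build the JSON body for an OpenAI-compatible chat completion call."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
        "stream": False,
    }
    return json.dumps(payload)

body = build_chat_request("deepseek-r1-8b", "Why run LLMs on device?")
# Send with any HTTP client, e.g. urllib.request or requests,
# POSTing `body` to ENDPOINT with Content-Type: application/json.
```

Because everything stays on localhost, the request never leaves your machine, which is the privacy point made above.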
@rifky_bujana_bisri Impressive how this tiny 20 MB model can handle LLMs! It’s a perfect example of how technology can pack so much power into such a compact and efficient system.
@rifky_bujana_bisri suggesting that Kolosal is just 20MB is pretty misleading, since the models are often still in the 10s of GBs on disk.
That it's Windows-only is also a somewhat surprising choice?
@chrismessina I guess I should have added a disclaimer that 20 MB is the size of the application itself, sorry about that. Still, Ollama or LM Studio take 2–4 GB of disk space without any model installed. As this is our first MVP, built in a matter of weeks, it still lacks some features and OS support. We're working on Linux and Mac support (we've been using an OS-agnostic framework, so the transition shouldn't be painful); we chose Windows for now because it's the most accessible for development, since my own device runs Windows.
@kay_arkain Hi! Thank you for your interest in Kolosal. In terms of performance, it depends on the model: the bigger the model, the better the output quality, but the more hardware it requires. A model like DeepSeek R1 600B, for instance, might need multiple GPUs to run, while a model like DeepSeek R1 8B runs easily on an RTX 3080. Either way, your data stays on your computer, and you can even run it fully offline without any internet!
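To make the hardware point above concrete, here is a back-of-envelope sketch (my own arithmetic, not Kolosal's sizing logic): a model's weight memory is roughly parameter count times bytes per parameter, ignoring the KV cache and activation overhead.

```python
def weight_memory_gb(params_billion: float, bits_per_param: int = 4) -> float:
    """Rough weight-only memory footprint in GB: params × bits / 8.

    Ignores KV cache and activations, so real usage is somewhat higher.
    """
    bytes_total = params_billion * 1e9 * bits_per_param / 8
    return bytes_total / 1e9

# An 8B model at 4-bit quantization needs ~4 GB just for weights,
# which fits an RTX 3080 (10 GB of VRAM); a ~600B model at 4-bit
# needs ~300 GB, hence multiple GPUs.
print(weight_memory_gb(8, 4))    # → 4.0
print(weight_memory_gb(600, 4))  # → 300.0
```

The quantization bit width (4-bit here) is an assumption; at 16-bit the same models would need roughly four times as much memory.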
Kolosal AI