An LPU Inference Engine, where LPU stands for Language Processing Unit™, is a new type of end-to-end processing unit system that claims the fastest LLM inference, at roughly 500 tokens/second.
Wow, love it. We rely heavily on LLMs, and the slowness of our agents is a constant annoyance; a 14x speed-up would be a real game changer. Can't wait to see LPUs in action and at scale.
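For context on where that 14x comes from, here is a back-of-the-envelope sketch. The ~35 tokens/second baseline is my assumption (it is roughly what a 14x gap against 500 tokens/second implies), not a figure from the thread:

```python
# Rough check of the claimed speed-up.
# Assumption: a baseline of ~35 tokens/second (not from the thread);
# 500 tokens/second is the claimed LPU throughput.

LPU_TOKENS_PER_SEC = 500       # claimed LPU throughput
BASELINE_TOKENS_PER_SEC = 35   # assumed baseline

speedup = LPU_TOKENS_PER_SEC / BASELINE_TOKENS_PER_SEC
print(f"Speed-up: ~{speedup:.0f}x")  # ~14x

# What that means for a single agent step emitting 1,000 tokens:
tokens = 1_000
print(f"Baseline: {tokens / BASELINE_TOKENS_PER_SEC:.1f} s")  # ~28.6 s
print(f"LPU:      {tokens / LPU_TOKENS_PER_SEC:.1f} s")       # ~2.0 s
```

For a multi-step agent, that per-step saving compounds, which is why generation speed dominates perceived agent latency.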
Keep going!