NVIDIA
p/nvidia
The official handle for NVIDIA.
Chris Messina
NVLM 1.0 — Open frontier-class multimodal LLMs
Featured
A family of frontier-class multimodal large language models (LLMs) that achieve state-of-the-art results on vision-language tasks, rivaling the leading proprietary models (e.g., GPT-4o) and open-access models (e.g., Llama 3-V 405B and InternVL 2).
Replies
Chris Messina
Top Hunter
Hunter
This is a big deal for the open source LLM ecosystem: Nvidia’s release of NVLM 1.0 marks a pivotal moment in AI development. By open-sourcing a model that rivals proprietary giants, Nvidia isn’t just sharing code—it’s challenging the very structure of the AI industry.
Owen Far
@chrismessina this is definitely a smart move from Nvidia. Unlike other companies, their profit comes from selling hardware (including dedicated AI hardware). Apple eventually did the same with their OS, and they can find ways to profit from this later on anyway. Nice to see this!
Johan Steneros
The future of AI will be Open Source. It makes so much sense that NVIDIA is pushing for this.
Jonathan Viet Pham
Congrats to the NVLM team on the launch of 1.0! Does NVLM offer any unique advantages for specific industries or applications compared to other leading models?
Yash Chudasama
The future is open source; more multimodal LLMs like this should be produced.
Grace Phillips
This model family sounds promising, especially with claims of rivaling top proprietary and open-access models. @chrismessina But how does it perform on edge cases, particularly with noisy or ambiguous inputs in vision-language tasks? Does the model degrade gracefully, or does it struggle in those scenarios?
hxj tsu
I had the same issue, got no response from anyone, and couldn't find any troubleshooting for it through a search engine. The solution here worked for me; thanks to the community and its members.
Kyrylo Silin
How is the model's performance on specific vision-language tasks compared to proprietary models?
Olena Variacheva
I recommend NVLM 1.0! It is an open series of multimodal language models that demonstrates outstanding results in visualization and language-related tasks.
Charlie Greene
Congratulations on the launch! It’s exciting to see such innovation in vision-language tasks, and I can’t wait to see how they compete with the leading models. Great work!
Zishan Iqbal
@chrismessina Congratulations on producing these cutting-edge multimodal LLMs! Can these models be optimized for certain industries or applications?
Sage Wang
What stands out most is the model’s ability to outperform its LLM backbone on text-only tasks after multimodal training.
Glenn Max
Just checked out the new model, and it looks really impressive! Excited to see how it compares to others.
henry
Congrats on launching this groundbreaking family of multimodal LLMs. @chrismessina Achieving state-of-the-art results in vision-language tasks and competing with both proprietary and open-access models is no small feat. The ability to rival models like GPT-4o and Llama 3-V 405B is truly impressive. Wishing you all the success.
John William
Amazing to see such progress in multimodal LLMs. I had an idea that could make it even better: what about adding modular components for different tasks, like vision-heavy or language-dominant workloads? Allowing users to customize the model for specific use cases could increase its versatility and adoption.