A family of frontier-class multimodal large language models (LLMs) that achieve state-of-the-art results on vision-language tasks, rivaling the leading proprietary models (e.g., GPT-4o) and open-access models (e.g., Llama 3-V 405B and InternVL 2).
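For readers who want to try the released weights, here is a minimal, hedged sketch of loading an NVLM 1.0 checkpoint with Hugging Face transformers. The repo id and loading options are assumptions for illustration, not details taken from this page; the official model card's instructions should take precedence.

```python
# Hedged sketch: loading an NVLM 1.0 checkpoint via Hugging Face transformers.
# The checkpoint name "nvidia/NVLM-D-72B" and the options below are assumptions.
import torch
from transformers import AutoModel, AutoTokenizer

model_id = "nvidia/NVLM-D-72B"  # assumed checkpoint name

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModel.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half precision keeps memory manageable for a 72B model
    trust_remote_code=True,      # the repo ships its own multimodal modeling code
    device_map="auto",           # shard layers across available GPUs
).eval()

# Text and image inference then follows the chat/generation interface documented
# in the model card, which is omitted here.
```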
This is a big deal for the open source LLM ecosystem:
Nvidia’s release of NVLM 1.0 marks a pivotal moment in AI development. By open-sourcing a model that rivals proprietary giants, Nvidia isn’t just sharing code—it’s challenging the very structure of the AI industry.
@chrismessina this is definitely a smart move from Nvidia. Unlike other companies, their profit comes from selling hardware (especially dedicated AI hardware). Apple eventually did the same with its OS, and Nvidia can always find other ways to profit from this later on. Nice to see this!
Congrats to the NVLM team on the launch of 1.0! Does NVLM offer any unique advantages for specific industries or applications compared to other leading models?
This model family sounds promising, especially with claims of rivaling top proprietary and open-access models. @chrismessina But how does it perform on edge cases, particularly with noisy or ambiguous inputs in vision-language tasks? Does the model degrade gracefully, or does it struggle in those scenarios?
I recommend NVLM 1.0! It is an open family of multimodal language models that demonstrates outstanding results on vision-language tasks.
Congratulations on the launch! It’s exciting to see such innovation in vision-language tasks, and I can’t wait to see how they compete with the leading models. Great work!
Congrats on launching this groundbreaking family of multimodal LLMs. @chrismessina Achieving state-of-the-art results on vision-language tasks and competing with both proprietary and open-access models is no small feat. The ability to rival models like GPT-4o and Llama 3-V 405B is truly impressive. Wishing you every success.
Amazing to see such progress in multimodal LLMs. I had an idea that could make it even better: what about adding modular components for different tasks, like vision-heavy or language-dominant workloads? Allowing users to customize the model for specific use cases could increase its versatility and adoption.