Kern AI refinery
Treating training data as if it was source code
Johannes Hötter
refinery is the data-centric sibling of your favorite programming IDE. It provides an easy-to-use interface for weak supervision and data management, neural search, and monitoring to ensure that the quality of your training data is as good as possible.
Replies
Sushant Sharma
great work!!
Johannes Hötter
@sushant_sharma3 Thank you very much Sushant! :)
Rahul Nair
Wow, I love this idea, and very useful. Congrats on your launch!
Johannes Hötter
@rahul_nair93 Thanks Rahul, we're only getting started, but thanks for the kind words :)
Ravi Sojitra
Super duper! I love open-source solutions. The online playground is awesome, too. Congrats. :)
Johannes Hötter
@ravisojitra Thanks a lot! Yes, we're big believers in OS too, and we think it makes it so much easier for our users to give refinery a try on their local machine. And if not, that's what the online playground is for :)
Jonathan Reimer
At crowd.dev we've been loving using refinery for managing our test data! Congrats to the whole team 👏
Jens Wittmeyer
@jonathan_reimer1 Thanks Jonathan, always a pleasure to have your support 😊
Cat Yung
I think this is so cool! It will be a great tool for NLP dev!
Jens Wittmeyer
@catyung Thanks a lot, Cat. If there are any questions please do reach out to us. 😊
Riddhi Dagli
@johannes_hoetter Congratulations on the launch! This is an amazing product!! All the best :D
Johannes Hötter
@riddhi_dagli thank you so much for the kind words! :)
Bela W.
Hey there, Bela here. After our pre-seed investment last year, I have worked closely with the team over the last 9 months, and important milestones have been reached in that time. It was a pleasure supporting the team in building such an amazing product. I’m very happy to have been able to join the journey from such an early stage and am excited that the product can now reach a wider audience this way. With the kern.ai refinery, engineers will be able to work with data the way they work with code. kern.ai does all the heavy lifting for you so you can concentrate on building the data-centric AI use cases the world deserves :) I look forward to watching the continued journey of Kern.ai, now from the outside. Much love to the whole team.
Congrats on your launch!
Daniel Van den Berghe
Really cool, congrats on the launch!
Johannes Hötter
@daniel_berghe Thank you so much Daniel!
Farhan Aslam
Good luck.
Shrishty Pandey
Congrats on the amazing product launch @johannes_hoetter !
Leonard Püttmann
Hello, people of Product Hunt! I'm Leo, a Data Scientist at Kern. The Kern Refinery is a wonderful tool and I am convinced that it will have a great impact on the world of data, because it is so incredibly fun to use and delivers stunning results in a short time. If you are new to the world of Data Science and Python, you can check out our YouTube channel, where you'll find amazing videos about how to get started or about cool projects made with Kern Refinery. For me personally, working in Data Science is amazing because it allows you to see the world from a different perspective. I am convinced that Data Science and Artificial Intelligence can make our world a better place, and I am happy to be a small part of this change. When I am not lifting heavy data as a Data Scientist, I like to lift weights in the gym and cook delicious food with friends, and I especially enjoy drinking high-quality tea. Feel free to reach out to me if you have any questions about our product, or if you want to talk about data or tea. :-)
Johannes Hötter
Hey ProductHunt community, I'm Johannes, a data engineer + co-maker of refinery. We've built refinery with the belief that working on training data should, at some point, feel like you're programming. Why? Because we believe that otherwise, engineers and scientists with great ideas (and businesses with core processes, too) are limited in what they can build. When we think of natural language interfaces of the future, we're sure they simply can't be built with today's tools. And this is what we aim to change.
What does that mean?
- We believe that developers should be able to debug and document training data
- Building training data must be easy and quick, such that you can build prototypes with ease
- On the other hand, if you see that your use case is working, building training data is no one-time job, so you should be able to improve the data in a structured manner
We believe in open-source and communities, so we've published the source code on GitHub, and we have a community on Discord. Also, we have an online playground. Check it out :)
We're getting there step-by-step, with the goal to, at some point, not only treat training data as if it was source code but to essentially make complex NLP problems easier.
Community, we're so excited to share this with you today. Feel free to leave a comment below or on GitHub, join our Discord, or reach out via Twitter. Cheers! 🙏🏻
Johannes Hötter
@fares_aktouf Thanks so much! It's been a hell of a ride so far :)
Jens Wittmeyer
Hello everyone 👋 I'm Jens, the CTO of kern.ai. Johannes and I met about two jobs back in a totally different environment: SAP Data Migration. Fun times indeed. However, as life goes, we went our different paths after a few years of pushing data from left to right: he to study and start his first company, me to work as a lecturer and create one or two games you might find online. Back then we wouldn't, or better, couldn't even imagine where we are now.
So enough about the preamble, let's talk about the product 🙂 With the app, you have many possible applications. One option: you can optimize your AI labeling workflow. Let me give you an example: I always wanted to have a personalized newsfeed of different sources matched to my specific taste. But who has the time to scroll through thousands of articles every day? So let's slap some AI on that problem, right? Now some of you might know: AI works best with a lot of training data. But again... who has the time to label a bunch of old articles? I didn't, so no dice, I guess.
Enter refinery. Not only did it help me get a better overview of my data points (e.g. by using embedding-based similarity), but it also helped me extrapolate the given information through a combination of heuristics, active learning, and weak supervision. I know, I know, a lot of technical terms to throw around, but it's an application for you, the data scientist. To keep it simple: instead of manually labeling the 2,500 articles I scraped from different websites, I achieved good results after the first 50 manual labels or so. Even better after some data exploration, but that's going a bit too far for now.
If you have further questions, please don't hesitate to reach out 😀 And please let us know what your favorite features are (or what you'd love to see added) in the comments 👇 Bonus points for the first to find the easter egg I've hidden in the application. Without spoiling too much: in our team, the current high score on medium difficulty is 81 😉
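For readers new to the terms in Jens's comment, here is a rough sketch of the idea behind combining heuristics into weak labels. It is not refinery's implementation: the label names, the three heuristics, and the `weak_label` helper are all made up for illustration, and a plain majority vote stands in for whatever weighting a real weak supervision setup would use.

```python
from collections import Counter
from typing import Callable, List, Optional

# Hypothetical labels for a personalized news feed (not taken from refinery).
INTERESTING = "interesting"
BORING = "boring"

# A heuristic returns a label, or None when it has no opinion (it abstains).
Heuristic = Callable[[str], Optional[str]]


def mentions_ai(headline: str) -> Optional[str]:
    words = headline.lower().split()
    return INTERESTING if "ai" in words or "nlp" in words else None


def looks_like_gossip(headline: str) -> Optional[str]:
    return BORING if "celebrity" in headline.lower() else None


def is_question(headline: str) -> Optional[str]:
    return BORING if headline.strip().endswith("?") else None


HEURISTICS: List[Heuristic] = [mentions_ai, looks_like_gossip, is_question]


def weak_label(headline: str, heuristics: List[Heuristic]) -> Optional[str]:
    """Combine noisy heuristic votes by simple majority (assumed strategy)."""
    votes = [h(headline) for h in heuristics]
    votes = [v for v in votes if v is not None]
    if not votes:
        return None  # no heuristic fired, so the record stays unlabeled
    ranked = Counter(votes).most_common(2)
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return None  # tie between labels: better left for manual review
    return ranked[0][0]


if __name__ == "__main__":
    for text in ["New AI model beats benchmark", "Is AI overhyped?"]:
        print(text, "->", weak_label(text, HEURISTICS))
```

Records that end up with `None` are the natural candidates for the small set of manual labels Jens mentions.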
Simon Degraf
Hi there, my name is Simon and I am a dev here at kern. In my bachelor's studies in the business information systems field, I learned about the importance of high-quality data for the future of successful companies. Now we are doing our best to enable our community to make great use of their data, and I love being a part of this journey. I also like to lift a lot of differently shaped weights and eat a lot of delicious foods. For questions about the journey we are creating, the weights I lift, or the food I eat, reach out to me!
Felix Kirsch
Hello everyone 😊 I am Felix, a developer at kern.ai. Since day one as a dev, I have loved working with open source. I want to build applications the way I want, without being forced to follow a prescribed path. OS enables me to do this, and my personal goal with refinery is to allow data scientists to do so as well. Customisability is a key feature of refinery. For that, we introduced IDE-based interfaces into the application. You can write labeling functions and active learning algorithms the way you want. But if you prefer to follow a template, refinery provides you with one too. The best part is that we want to extend this approach to further features of the application; next up is embedding creation. I am very curious about your thoughts on refinery 😄
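To give a feel for what "writing a labeling function" can look like, here is a small sketch in plain Python. The record layout (a dict with `headline` and `label` keys), the function name, and the `evaluate` helper are assumptions made for this example and are not refinery's actual interface.

```python
from typing import Callable, List, Optional, Tuple


def lf_contains_discount(record: dict) -> Optional[str]:
    """Hypothetical labeling function: flag promotional headlines."""
    text = record["headline"].lower()
    if "discount" in text or "% off" in text:
        return "promo"
    return None  # abstain when the rule does not apply


def evaluate(
    lf: Callable[[dict], Optional[str]], labeled_records: List[dict]
) -> Tuple[float, float]:
    """Return (coverage, precision) of a labeling function on labeled data."""
    fired = [
        (lf(r), r["label"]) for r in labeled_records if lf(r) is not None
    ]
    coverage = len(fired) / len(labeled_records)
    correct = sum(predicted == actual for predicted, actual in fired)
    precision = correct / len(fired) if fired else 0.0
    return coverage, precision


# A handful of made-up, manually labeled records to check the rule against.
sample = [
    {"headline": "50% off all laptops today", "label": "promo"},
    {"headline": "Discount codes for students", "label": "promo"},
    {"headline": "Why discount rates matter for startups", "label": "news"},
    {"headline": "New NLP benchmark released", "label": "news"},
]

coverage, precision = evaluate(lf_contains_discount, sample)
print(f"coverage={coverage:.2f} precision={precision:.2f}")
```

Checking a rule against a few hand-labeled records like this is one way to see where it misfires (here, the "discount rates" headline) before relying on it.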
Lina Lumburovska
Hey people! My name is Lina and I am a developer at kern.ai with a primary focus on the front-end. I mainly work with Angular, TypeScript, and Tailwind, and with those skills I love turning even the craziest Figma mockup into reality and placing every element perfectly. At Kern I work on the software’s design and general front-end issues, and I constantly try to improve the code quality. Besides that, I love to broaden my skill set by learning Python and deepening my understanding of programming. And guess what? Kern constantly gives me the opportunity to learn more and expand my knowledge, so recently I had the chance to write my first lines of Python. refinery gives users an IDE for data-centric NLP, where they can label their data, write their own labeling functions and active learning algorithms, and run zero-shot classification. refinery is heading in a really good direction, and if I were you I would “stay tuned” for the upcoming features and possibilities. Looking forward to your feedback!
Gaurav Goyal
Awesome. Can definitely help in a lot of NLP problems that we are solving. @johannes_hoetter @anton_pullem1 @jens_wittmeyer @felix_kirsch @leonard_puttmann @linalumburovska
Leonard Püttmann
@gauravgoyal_gg Thanks Gaurav, we do hope so!
Victor Cucu
Nice product, well done!
Johannes Hötter
@victorcucu7 Thanks Victor!!
Cyril Dubson
Amazing! Congrats on the launch!
Leonard Püttmann
@cyril_dubson Thank you Cyril! :-)