@blakehunsicker yes, it can be fine-tuned at a rate of ~5000 tokens/second, which should be sufficient for small-to-medium-size datasets. Fine-tuning instructions are here: https://github.com/kingoflolz/me...
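For anyone who would rather not use the TPU recipe in the linked repo, here is a minimal fine-tuning sketch using the Hugging Face transformers port of GPT-J instead. This is an illustrative alternative, not the repo's method; the `train.txt` dataset path and the training hyperparameters are placeholders.

```python
# Minimal causal-LM fine-tuning sketch for GPT-J via Hugging Face transformers.
# NOTE: this is not the mesh-transformer-jax TPU recipe linked above, and the
# dataset file "train.txt" is a placeholder (one document per line).
from transformers import (AutoTokenizer, GPTJForCausalLM, Trainer,
                          TrainingArguments, DataCollatorForLanguageModeling)
from datasets import load_dataset

model_name = "EleutherAI/gpt-j-6B"            # checkpoint published on the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token     # GPT-J has no pad token by default
model = GPTJForCausalLM.from_pretrained(model_name)

# Load and tokenize a plain-text corpus (placeholder path).
raw = load_dataset("text", data_files={"train": "train.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = raw.map(tokenize, batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="gptj-finetuned",
        per_device_train_batch_size=1,
        gradient_accumulation_steps=8,   # a 6B-parameter model needs a large-memory GPU/TPU
        num_train_epochs=1,
        fp16=True,
    ),
    train_dataset=tokenized["train"],
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```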
Yep! Doors have been OPENED 🤯 An open-source cousin of GPT-3 is here 😇
- Performs on par with 6.7B GPT-3
- Performs better and decodes faster than GPT-Neo
- repo + colab + free web demo
Got to know about it through a Towards Data Science article: https://towardsdatascience.com/c...
More details in @arankomatsuzaki's article: https://arankomatsuzaki.wordpres...
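If you want to try it outside the web demo or colab, here is a quick generation sketch assuming the Hugging Face transformers port and the EleutherAI/gpt-j-6B checkpoint; in float16 it needs roughly 16 GB of GPU memory.

```python
# Quick local-generation sketch with the Hugging Face port of GPT-J-6B.
import torch
from transformers import AutoTokenizer, GPTJForCausalLM

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-j-6B")
model = GPTJForCausalLM.from_pretrained(
    "EleutherAI/gpt-j-6B", torch_dtype=torch.float16
).to("cuda")

prompt = "GPT-J is an open-source language model that"
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")

# Sample a continuation; adjust temperature/top_p to taste.
output_ids = model.generate(
    **inputs, do_sample=True, temperature=0.9, top_p=0.95, max_new_tokens=60
)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```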
@pallpakk some results were definitely weird, but overall it works great! Negative sentiment, foul language, etc. are context-specific outputs, so if an input is itself negative/abusive, the output is bound to reinforce the same sentiment.
*GPT-J is just as good as GPT-3.* It is more efficient, but with more quirks. In our JPRED scores, it did better with simple TCS tasks, but lost with the more complex tasks.
By removing the Jordan Algorithm: Our next proposed change to a probability model is removing the Jordan Algorithm. The Jordan Algorithm is a special procedure used for simple TCS tasks that allows for fast analysis of different sequence pairs, as well as being able to easily analyze simple n-gram (aka word) models. It is more efficient, but with more quirks. In our JPRED scores, it did better with simple TCS tasks, but lost with the more complex tasks.
...