Hey @contextjunkie!
I'm not familiar with SPARQL - how is it different from SQL? Do people need to know it to use data.world?
Also - saw dwSQL, whose syntax already looks very familiar/identical to SQL. What are the differences? What's each used for?
@mscccc @contextjunkie
Hi Mike,
Where SQL is the language of choice for tabular and relational data, SPARQL is better suited to pattern matching across linked data (RDF, Semantic Web, etc). The languages look somewhat similar, but serve distinct purposes. We believe that linked data is an important part of the future of open data. We've put together an awesome SPARQL tutorial for those who want to learn more: https://docs.data.world/tutorial...
We created dwSQL to make the powerful querying and joining capabilities of linked data accessible to anyone who knows SQL. Our implementation is quite full featured. We support the vast majority of SELECT style queries: including joins, aggregation, sorting, limits, etc. You can learn more here: https://docs.data.world/tutorial...
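For readers who want to see the distinction concretely, here is a minimal sketch in Python. It is not dwSQL, SPARQL, or any data.world API: the table names, triples, and the `match` helper are all hypothetical, invented for illustration. The first half answers a question with a SQL join over tables (using the standard-library `sqlite3` module); the second half answers the same question by matching triple patterns with `?variables`, which is the general idea behind a SPARQL query over linked data.

```python
import sqlite3

# --- Tabular view: a SQL join over two (hypothetical) tables ---
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE residents (person TEXT, city TEXT)")
conn.execute("CREATE TABLE cities (city TEXT, state TEXT)")
conn.executemany("INSERT INTO residents VALUES (?, ?)",
                 [("ana", "Austin"), ("ben", "Boston")])
conn.executemany("INSERT INTO cities VALUES (?, ?)",
                 [("Austin", "Texas"), ("Boston", "Massachusetts")])
sql_rows = conn.execute(
    "SELECT r.person FROM residents r "
    "JOIN cities c ON r.city = c.city "
    "WHERE c.state = 'Texas'"
).fetchall()

# --- Linked-data view: the same facts as (subject, predicate, object) triples ---
triples = [
    ("ana", "lives_in", "Austin"),
    ("ben", "lives_in", "Boston"),
    ("Austin", "located_in", "Texas"),
    ("Boston", "located_in", "Massachusetts"),
]

def match(patterns, triples, binding=None):
    """Yield variable bindings that satisfy every triple pattern (a naive join).

    Terms starting with '?' are variables; everything else must match exactly.
    This is a toy stand-in for SPARQL-style pattern matching, not a real engine.
    """
    binding = binding or {}
    if not patterns:
        yield binding
        return
    first, rest = patterns[0], patterns[1:]
    for triple in triples:
        candidate = dict(binding)
        for term, value in zip(first, triple):
            if term.startswith("?"):
                # Bind the variable, or reject if it is already bound differently.
                if candidate.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            yield from match(rest, triples, candidate)

# "Which ?person lives in a ?city that is located in Texas?"
patterns = [("?person", "lives_in", "?city"), ("?city", "located_in", "Texas")]
results = list(match(patterns, triples))

print(sql_rows)
print(results)
```

In the SQL half the join has to be spelled out against a fixed schema; in the triple half the shared `?city` variable does the joining, which is why pattern matching extends naturally when new kinds of facts are linked in.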
This is exciting: now that TensorFlow makes it easier to build neural networks, finding a large enough amount of data is the real pain point when implementing AI as a small company.
Do you have, or intend to add, any mechanism that could help add labels to data?
Hey @oelmekki, excellent question! The rise of ML and deep learning frameworks like TensorFlow is super exciting. One of the cool things about linked data is how robustly datasets can be joined and extended to do things like add labels. Definitely stay tuned for more on that topic!
data.world is also a collaborative place where the definition of "dataset" goes beyond just the data itself. We love to see people sharing their own analysis, questions, projects, and even labels along with data and using dataset discussions to compare notes. If lots of folks do this for the same dataset, the dataset becomes an even more valuable resource for everyone. I've spoken to data scientists who were frustrated by the fact that labels aren't as widely shared as unlabeled data, so I'm excited to see that start to happen!
@oelmekki Thanks Olivier, there is no doubt that data.world will eventually become the foundation for ML/AI projects - it has been a part of our vision since the beginning. We hope you decide to become a part of the largest public works project in history on data - that is how we are going to bridge to the "Star Trek" future.
@arlogilbert Thanks so much, Arlo - we appreciate your kind words. We are launching a lot of functionality each and every week, so it will keep getting better and better quickly!
Love the concept! Data collaboration is still in the dark ages (email, sftp, csv, etc.). Do you have (or are you planning to have) an API to grab data sets? We're working on a web-based data prep/ETL tool and love the idea of helping others connect to and use open data (or pipe into data.world).
@wanderslth We are absolutely working to make our data available externally. We will likely have a number of different APIs in the coming months. We've already open sourced a JDBC driver (https://github.com/datadotworld/...) which allows users to connect and run both SPARQL and SQL queries against their datasets. And if you have thoughts on what this should look like, please reach out directly to help@data.world. Thanks :)
@shadr Great to see; thanks! Launching our own initial public API next week, so this is quite relevant. Wishing you guys much success in tackling this market!
Hey guys - I am loving this product. One question though - how do you plan on supporting the platform for the future? Are there particular ideas you have for monetization?
@plurnt Excellent question. data.world is 100% free during preview (beta) release. Most users will never pay us for anything. When we move into general release (no hard date on that yet), the small subset of users who require private datasets, or an abnormally large amount of storage or processing, will be asked to pay an extremely reasonable fee.
@be3d Great - I'm glad you have plans. Want to make sure such a good service stays financially viable. I look forward to seeing what features you guys can build in before release (I would like some more settings to customize what notifications I receive).
data.world has built amazing technology on open data, and it is very easy to use compared to their competitors. I have used it personally, and I have also asked the guys at a popular machine language platform to play with data.world; they were very impressed with what the team has done so far and how easy it is to integrate. The number of datasets they have is growing day by day, and it's very easy to integrate with private data.
Hey Hunters! Joe from data.world here.
As a special treat for you, we've compiled the most comprehensive PH dataset ever released (https://data.world/producthunt/p...). 2 years of posts, votes, taglines, and more! Dig in - we can't wait to see what insights you find!
And a bit about the platform...
We're all about helping data people solve problems faster, so we've built a collaboration platform to address a glaring, urgent need...
With hundreds of killer visualization and analysis tools out there, why are we stuck in the stone age when it comes to the most frustrating and time-consuming parts of any data project: finding, understanding, preparing, and sharing data?
data.world tackles this issue by helping you discover, explore, contribute, and share/publish: better, faster, easier, and all in one place.
Discover:
Browse thousands of open datasets contributed by organizations and data people from all over the world.
Explore:
See the data's "story" alongside the data itself. Preview the data before you dive in. Query within and across datasets, and create exploratory visualizations with just a few clicks.
Contribute:
Join the discussion with an international community of data people. Post hunches, share analysis techniques and insights, and find new collaborators.
Share / Publish:
Upload from your computer or pull down from the cloud. Automatically enhance your data, make it instantly queryable and joinable to other datasets. Showcase your work and build out your data work portfolio.
Please don't hesitate to share any questions or feedback right here. We'll be online 😎
Thanks, Hunters, and welcome to the social network for data people!
-Joe Boutros
https://data.world/jboutros
This couldn't be more useful for guys like me who are just getting started with bringing data and data-driven decisions into an organisation. Thank you!
Thanks for the love @seffa121! We definitely think the world will benefit from being able to more easily find, understand, and collaborate on the world's data, which is currently very fragmented and siloed, and which rarely captures the context and improvements that others have already contributed.
Hey Hunters, the intelligence of dog breeds dataset has been pretty popular - like Major the bulldog. I'm curious what other dog datasets we might be able to combine this with. Any ideas? Ask to become a contributor, I'd love to work on this with others. https://data.world/len/intellige...
FB LIVE in 30! @contextjunkie will take us on a wild ride through our top 5 features, and answer your questions. I'm trying to get him to wear a black turtleneck but no promises. See you soon!
https://www.facebook.com/datadot...
Awesome product, this is like a GitHub for data!
Curious how you've thought about the opportunity to accelerate learning for the next generation of people just starting to get into the data world.
@andrewaward This question couldn't be more astute. Our ongoing success will rely heavily on the next generation of data people. To that end, we are active in forming relationships at universities via speaking engagements, hosted class projects, and capstone sponsorships.
GitHub provided a great platform for coders to create a portfolio of their work; we are that portfolio and repository for data folks. This is especially important as people start their careers in data.
We are also building partnerships with organizations like Data Society and Coding4TX, which focus on an even younger, emerging crowd of data people.
My favorite data.world feature is the New Knowledge visualizations; I love how I can easily play around with the sample queries and look at the data in a variety of graph formats, then copy the embed code to paste the viz I created in my dataset summary or discussion threads. Here's a quick gif showing how to copy/paste your viz to spice up your dataset: https://data.world/gswider/embed...
The post-truth world in which we seem to have found ourselves is pretty scary. So I'm happy we're doing what we're doing to enable evidence-based solutions to the world's problems.
@kevando_ thanks for the kind words! There are a lot of other great platforms approaching these types of 'first mile' data problems, Datazar among them. What became super clear to us last year when we started data.world is that the time is right for data science to undergo the same transformation that open source software did with the rise of GitHub.
This is one of my favorite data set collections on the web. Here I can find real-life data for my educational projects. Thank you, team, for making such a wonderful place to discover data from.
This is huge! I second the effort to partner with research universities and other groups tackling tough problems. Using data.world could make a big difference, and the infrastructure to utilize complex data sets is already there (I would have loved to have this on many of the projects I was working on). Thank you!
@renastake Thank you so much, Rena! We have visited many universities to help them with data.world and it feels really good to help in that way as a Public Benefit Corporation and B Corporation.
I've been doing some "lazy Sunday analysis" (read: only easy stuff) on the Product Hunt data over at data.world (https://data.world/producthunt/p...).
Here's an interesting tidbit: the most common names across all Product Hunt users, in rank order, are Michael, Alex, Chris, David, Matt, Daniel, Nick, Andrew, Adam, Kevin, Ben, Ryan, Sam, James, Jason, John, Eric, Mark, Josh, and Max (after combining Michael + Mike). Granted, some of these names are unisex, but most are distinctly "male." It's interesting that not a single "female" name occurs in the top 20. This suggests the site skews heavily male, and/or perhaps the men share common names more so than women do.
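For anyone who wants to try the same kind of ranking, here is a rough sketch of one way to do it in Python. The sample names and the `ALIASES` table are hypothetical, invented for this example; the real Product Hunt dataset and its column names are not reproduced here, and this is not necessarily how the ranking above was produced.

```python
from collections import Counter

# Hypothetical sample of first names (NOT the real Product Hunt data).
first_names = ["Michael", "Mike", "Alex", "Chris", "Alex", "Mike", "David", "michael "]

# Assumed nickname mapping, mirroring the "Michael + Mike" merge described above.
ALIASES = {"mike": "michael"}

def canonical(name):
    """Normalize whitespace and case, then fold known nicknames together."""
    key = name.strip().lower()
    return ALIASES.get(key, key).capitalize()

counts = Counter(canonical(n) for n in first_names)

# Names in rank order, most common first.
ranked = counts.most_common()
print(ranked)
```

Extending `ALIASES` (for example with other common nicknames) would change the ranking, so any merge rules are worth stating alongside the result.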