Docsumo
Automatically Extract Data from Unstructured Documents
Kevin William David
Extract Tables by Docsumo — Free tool to extract tables from PDF and Images
Docsumo is Document AI software that helps enterprises capture and analyse unstructured documents. Use prebuilt APIs to convert invoices, bank statements, tax returns, ACORD forms & ID cards into JSON/CSV. Get accurate analytics for automated decisions.
Replies
Theron McCollough
Are you planning on expanding the document types? Or building them from user use cases? Thanks! Great product.
Rushabh Sheth
Hi @theron, yes, we already have an API library per use case and are expanding it as we come across more document types. Shall we connect over a call? Would love to learn from your experience at SVB - https://calendly.com/rushabhshet...
Bailey Kursar
This will be so helpful when analysing data stuck in research reports... congrats @rushabh_sheth4 and team! 👏
Max Prilutskiy
Sounds nice! 💯
Rushabh Sheth
Thanks @prilutskiy!
Rushabh Sheth
Hi Hunters & Makers!

After 2 years of helping enterprises automate document processing, we are excited to present Docsumo to you! We want to make it super easy for businesses to extract & analyze data from documents (PDF/images/scans). Hence, we decided to build these tools and give them away to the community for free.

💡 What is Docsumo?
Docsumo automates document processing to help enterprises make accurate & fast decisions from unstructured documents.

🎁 What can it do?
✅ Capture tables & key-value pairs from PDF/scanned images.
✅ Review & edit extracted data using our human-in-the-loop tool.
✅ Train on your document type with as little as 20 samples.
✅ Categorize data, validate with API/rules & get calculated attributes.
✅ Out of the box API endpoints for standard doc types.

🤔 How is Docsumo different?
Docsumo is able to extract complex tables and key-value pairs from any kind of document. It performs better than AWS Textract & Google Doc AI. You can customize the output & train on your document type. Additionally, you get categorized & normalized data along with analytics in the same API call. It comes with prebuilt models for 100+ document types including IDs, driver licenses, passports, vehicle registrations, insurance cards, invoices, bank statements, bill of lading, financial statements, ACORD forms, rent rolls, etc.

💰 How much does it cost?
The tool is free to use if you need to process up to 20 documents/day. You don’t need to make any payments or even register your credit card. Please reach out to us at hello@docsumo.com if you are processing more than 2000 documents per month.

Looking forward to seeing you try out the free tool and automate your processes. If you would like to speak with us for your use case, please schedule a call at https://calendly.com/docsumo/demo

On behalf of Docsumo Team,
Rushabh Sheth, Co-founder & CEO
Aakash N S
Looks great, bookmarked! I’ve been looking for a tool like this for a while.
Rushabh Sheth
Thanks @aakash_n_s. Do let us know once you try it.
Aman Gour
Great work on automating a tedious task that every function in an organization suffers from. Good luck @rushabh_sheth4 and team!
Rushabh Sheth
@aman_gour thank you!
Otto Hanson
Really excited to learn more. We are using Textract now for extracting data from contracts, and tabular data just isn’t working. Would this work on random contracts with unpredictable types of tables (e.g. sometimes the tabular data is pricing info, sometimes it’s SLA info, etc.)?
Avinash Agrawal
Neat! A couple of feature requests:
1. Reverse of mail-merge (multiple docs/PDFs of the same format -> Excel with one row per doc/PDF)
2. I couldn’t figure out a way to sign up/log in
3. Nor a way to delete a document I uploaded
4. Hygiene: while you have the Privacy Policy in the footer, it would be great to include a line to comfort the user at the point where she uploads her first doc
All the best 👍
Rushabh Sheth
@avinash_agrawal thanks a lot for the feedback.
1. Merging data from multiple docs is on our roadmap. Currently most customers directly feed the JSON output to a database.
2. Shall we connect to understand your use case? We will configure & share the trial account for you.
3. We will add this to our roadmap and enable it on the frontend. Right now all documents are automatically deleted within 30 days.
4. Great suggestion, we will discuss what message we can show on the page.
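Until merging lands, the "reverse mail-merge" idea from this exchange (many documents of the same format, one spreadsheet row per document) can be approximated on the client side. The snippet below is only a rough sketch in Python: the `extracted_docs` structure, its `file` and `fields` keys, and the `docs_to_csv` helper are hypothetical placeholders for illustration and do not reflect Docsumo's actual JSON output.

```python
import csv
import io

# Hypothetical per-document extraction results. The keys "file" and "fields"
# are illustrative assumptions, not Docsumo's actual output schema.
extracted_docs = [
    {"file": "invoice_001.pdf",
     "fields": {"invoice_number": "INV-001", "total": "120.50"}},
    {"file": "invoice_002.pdf",
     "fields": {"invoice_number": "INV-002", "total": "89.00",
                "due_date": "2021-03-01"}},
]


def docs_to_csv(docs):
    """Flatten per-document results into CSV text, one row per document."""
    # Build the column list as the union of all field names,
    # keeping the order in which they are first seen.
    columns = ["file"]
    for doc in docs:
        for name in doc["fields"]:
            if name not in columns:
                columns.append(name)

    buffer = io.StringIO()
    # restval="" leaves a cell empty when a document lacks that field.
    writer = csv.DictWriter(buffer, fieldnames=columns, restval="",
                            lineterminator="\n")
    writer.writeheader()
    for doc in docs:
        writer.writerow({"file": doc["file"], **doc["fields"]})
    return buffer.getvalue()


csv_text = docs_to_csv(extracted_docs)
print(csv_text)
```

The resulting CSV opens directly in Excel. Taking the union of field names means a document with a missing field still gets a row, with that cell left blank.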
Sebastian Hofer
This is really great! Thank you!
Mayssa
Amazing tool! Great job Rushabh
Rushabh Sheth
@mayssa_bench Thank you so much!