Put your web data extraction on autopilot with AI technology that retrieves clean, structured data accessible via API. No coding, selectors or any manual input needed.
Would there ever be a way to migrate robots from other services? We use Dexi, but our biggest long-term fear is not being able to move our robots if anything happens to them.
@bradungar Hey Brad, we are working towards eliminating manually configured robots and crawlers. The algorithms we develop browse pages as humans do and try to identify data points of interest. At the moment we provide APIs to extract data from e-commerce pages, blogs, job pages and real estate listings.
Hello hunters! At scale, websites change and crawlers break daily. Maintaining them is time-consuming and painful. Crawlify is on a mission to help companies of all sizes automate their data extraction at scale.
To achieve this, the platform fuses machine learning with natural language processing to retrieve clean, structured data without the need for manual rules or site-specific training. You can try it for yourself on our website.
The current APIs are:
1. The Product API [released] - it automatically extracts product data from any e-commerce page.
2. The Real Estate API [closed beta] - it automatically extracts data from real estate listings, along with related data points such as crime & safety statistics, foreclosure/auction listings, and urban planning and construction permits.
3. The Job Page API [closed beta] - it extracts clean job listing data from any job board.
4. The Article API [released] - it extracts article data from pages without an RSS feed.
@barnea_florin Thanks for the feedback. The Real Estate API is currently in beta; we plan to launch it to the public next month. Please subscribe on our website to get notified when we do.
Looks really good, will try out the free version first!
Left a few thoughts here too as I was looking through your landing page, in case it's helpful - https://app.usebubbles.com/45da6...
@conor_clarke1 I can't access the link, can you please share it again? We are aware that our landing page needs some improvements :D so we are curious about and thankful for your thoughts. Looking forward to reading them.
@francisca_aguayo Hey Francisca, thanks for sharing your feedback with us. Can you point us to that specific page? It will tremendously help us in improving our accuracy.
Tried it out on several e-commerce websites from different countries. The results are accurate so far. It would be really handy for affiliate websites if you could provide an embed widget for any product to display details like price and title.
@virgil_deckow We are already working on that. In the next iteration of the product, clients will be able to extract more data points, including product attributes.
Did a few smoke tests using the Product API. Satisfied with the results so far. What load and concurrency can the endpoint reasonably handle? We have a use case in which we need to get pricing data across multiple e-commerce websites in a very short time frame.
@damien_tress We use Google Cloud Run to handle requests using a serverless architecture. We scale up and down automatically depending on load, so we can handle any enterprise use case. We can set up a custom trial account so you can test our API at the scale you need.
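For anyone with a similar burst-pricing use case: even when the service autoscales, it usually helps to cap concurrency on the client side. Below is a minimal sketch, assuming Python and asyncio. `fetch_product` is a hypothetical placeholder that only simulates latency; it is not Crawlify's actual API, and no real endpoint, URL or response format is implied.

```python
import asyncio


async def fetch_product(url: str) -> dict:
    """Hypothetical stand-in for one Product API call (assumption, not the real API)."""
    await asyncio.sleep(0.01)  # simulate network latency
    return {"url": url}


async def fetch_all(urls, limit=5, fetch=fetch_product):
    """Fetch every URL with at most `limit` calls in flight at once."""
    semaphore = asyncio.Semaphore(limit)

    async def bounded(url):
        # Wait for a free slot before starting the call.
        async with semaphore:
            return await fetch(url)

    # gather() returns results in the same order as the input URLs.
    return await asyncio.gather(*(bounded(u) for u in urls))


if __name__ == "__main__":
    # Placeholder URLs for illustration only.
    urls = [f"https://shop.example/item/{i}" for i in range(20)]
    results = asyncio.run(fetch_all(urls, limit=5))
    print(len(results))
```

Raising `limit` increases throughput until the provider's rate limits or your own network become the bottleneck, so it is worth tuning against a trial account rather than guessing.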
Found this on Reddit. I am working with a nonprofit to monitor bulk sales of used books on several websites. I like the results I see so far. I've tried several other providers but they were a bit cost prohibitive. Do you think you can offer some free credits to power this use case?
@hans_o_conner At the moment we extract: Title, Description, Salary, Location, and recruiter contact info if available. Extracting data from an entire website is on the roadmap for the next iteration.
I'm currently working a lot with product data to track some information of interest. Have you thought about adding an integration with Google Sheets? It would update the data directly in the spreadsheet.
Is this project (Crawlify.II) still live or has it been discontinued? Please can someone confirm, as I had initial conversations not long after this PH launch, but now that I want to use them, no one's answering anymore. Thanks