Vidas Bacevičius
Scrape, Train, Predict: The Lifecycle of Data for AI Applications
#1 about 2 minutes
Understanding the fundamentals of web scraping
Web scraping is the automated collection of data from websites using a scraper program and proxy servers to handle the request-response cycle.
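As a rough illustration of the request-response cycle described above, the following is a minimal sketch using only Python's standard library. The proxy address, the `example-scraper/0.1` user agent, and the helper names `make_proxy_opener` and `fetch` are assumptions made for this example and are not taken from the talk.

```python
import urllib.request


def make_proxy_opener(proxy_url: str) -> urllib.request.OpenerDirector:
    """Build an opener that routes HTTP and HTTPS requests through one proxy."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    opener = urllib.request.build_opener(handler)
    # Hypothetical user agent; a real scraper would choose its own.
    opener.addheaders = [("User-Agent", "example-scraper/0.1")]
    return opener


def fetch(opener: urllib.request.OpenerDirector, url: str, timeout: float = 10.0) -> str:
    """Send one request through the opener and return the decoded response body."""
    with opener.open(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


# Assumed local proxy address, for illustration only.
opener = make_proxy_opener("http://127.0.0.1:8080")
```

Calling `fetch(opener, "https://example.com/")` would then send the request via the proxy and return the page HTML, provided a proxy is actually listening at that address.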
#2 about 2 minutes
Exploring business use cases for scraped data
Scraped data can be used to analyze past trends like SEO rankings and competitor pricing or to predict future trends like market demand.
#3 about 4 minutes
Training AI models with custom scraped data
Public datasets like Common Crawl have limitations, so custom web scraping provides fresher, more relevant, and multimodal data for training superior AI models.
#4 about 3 minutes
Powering real-time AI with retrieval-augmented generation
Retrieval-augmented generation (RAG) uses live web scraping to bring the most current external knowledge directly into an LLM's response generation process.
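The retrieval step of RAG can be sketched in a few lines. The example below assumes the pages have already been scraped into plain-text snippets, ranks them by simple word overlap with the question, and builds a prompt from the best matches. Production systems typically use embedding-based search instead; the word-overlap scoring, the sample snippets, and the function names are all hypothetical stand-ins, and no LLM is called here.

```python
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(query: str, document: str) -> int:
    """Count how many query tokens also appear in the document."""
    query_counts = Counter(tokenize(query))
    doc_counts = Counter(tokenize(document))
    return sum(min(query_counts[t], doc_counts[t]) for t in query_counts)


def retrieve(query: str, documents: list[str], top_k: int = 2) -> list[str]:
    """Return the top_k documents with the highest overlap score."""
    ranked = sorted(documents, key=lambda doc: score(query, doc), reverse=True)
    return ranked[:top_k]


def build_prompt(query: str, documents: list[str], top_k: int = 2) -> str:
    """Place the retrieved snippets in front of the question for the LLM."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query, documents, top_k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


# Hypothetical snippets standing in for freshly scraped pages.
pages = [
    "The Acme X1 laptop costs 999 euros as of today.",
    "Acme was founded in 1990.",
    "Weather in Vienna is sunny.",
]
question = "How much does the Acme X1 laptop cost?"
prompt = build_prompt(question, pages, top_k=2)
```

The resulting `prompt` string is what would be sent to the model, so the answer can draw on data scraped moments earlier rather than on the model's training snapshot.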
#5 about 7 minutes
Overcoming blocking techniques and messy HTML
Web scrapers face major challenges from anti-bot measures like fingerprinting and CAPTCHAs, as well as from inconsistent and messy HTML structures.
#6 about 5 minutes
Using AI classification models to improve scraping
AI classification models trained on labeled HTML data can automatically validate responses to detect blocks and adaptively parse messy content without hardcoded selectors.
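To make the idea of validating responses with a classifier concrete, here is a toy sketch: a small Naive Bayes text classifier trained on a handful of labeled HTML snippets to tell blocked pages from valid ones. Naive Bayes is swapped in here purely for illustration, since the talk summary does not say which model is used. The training snippets, the labels `blocked` and `ok`, and the class name are invented for this example.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(html: str) -> list[str]:
    """Split raw HTML into lowercase alphabetic tokens (tag names included)."""
    return re.findall(r"[a-z]+", html.lower())


class NaiveBayesHtmlClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self) -> None:
        self.token_counts: dict[str, Counter] = defaultdict(Counter)
        self.doc_counts: Counter = Counter()
        self.vocabulary: set[str] = set()

    def fit(self, samples: list[tuple[str, str]]) -> "NaiveBayesHtmlClassifier":
        for html, label in samples:
            tokens = tokenize(html)
            self.token_counts[label].update(tokens)
            self.doc_counts[label] += 1
            self.vocabulary.update(tokens)
        return self

    def predict(self, html: str) -> str:
        tokens = tokenize(html)
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = "", -math.inf
        for label, n_docs in self.doc_counts.items():
            counts = self.token_counts[label]
            denominator = sum(counts.values()) + len(self.vocabulary)
            # Log prior plus smoothed log likelihood of every token.
            log_score = math.log(n_docs / total_docs)
            for token in tokens:
                log_score += math.log((counts[token] + 1) / denominator)
            if log_score > best_score:
                best_label, best_score = label, log_score
        return best_label


# Hypothetical labeled responses; a real training set would be far larger.
TRAINING = [
    ("<html><title>Access denied</title><body>Please verify you are human captcha</body></html>", "blocked"),
    ("<html><body>Unusual traffic detected. Complete the captcha to continue</body></html>", "blocked"),
    ("<html><body><h1>Blue running shoes</h1><span class='price'>49.99</span> add to cart</body></html>", "ok"),
    ("<html><body><h1>Wireless headphones</h1><span class='price'>89.00</span> in stock add to cart</body></html>", "ok"),
]

classifier = NaiveBayesHtmlClassifier().fit(TRAINING)
```

A scraper could call `classifier.predict(response_html)` after each request and retry through a different proxy whenever the label comes back as `blocked`, instead of relying on hardcoded checks for specific block pages.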
#7 about 3 minutes
Demonstration of an AI copilot for automated scraping
An AI-powered tool can take a natural language prompt and a list of URLs to automatically generate parsing instructions and extract structured data.
#8 about 1 minute
The symbiotic relationship between AI and web scraping
Web scraping provides the fresh, high-quality data that AI models need to function, while AI makes the scraping process itself smarter and more resilient.
Matching moments
01:57 MIN
Presenting live web scraping demos at a developer conference
Tech with Tim at WeAreDevelopers World Congress 2024
03:50 MIN
Solving scaling challenges in web data collection
Tech with Tim at WeAreDevelopers World Congress 2024
03:28 MIN
Navigating the complexities of modern web scraping
How to scrape modern websites to feed AI agents
05:22 MIN
The fundamental challenge of web scraping as a Turing test
Cracking the Code: Decoding Anti-Bot Systems!
03:32 MIN
Automating browser workflows with AI-powered tools
WeAreDevelopers LIVE: Scammer Payback with Python, Grok Goes Unhinged, The Future of Chromium and mo
01:10 MIN
Defining web scraping and its primary use cases
Data is Key: Scraping Metadata from Websites
02:04 MIN
Building a RAG chatbot with scraped data and Pinecone
How to scrape modern websites to feed AI agents
13:52 MIN
Answering questions on scraping legality, VPNs, and rate limits
Cracking the Code: Decoding Anti-Bot Systems!
Related Videos
How to scrape modern websites to feed AI agents
Jan Curn
Data is Key: Scraping Metadata from Websites
Lars Kölker
The AI-Ready Stack: Rethinking the Engineering Org of the Future
Jan Oberhauser, Mirko Novakovic, Alex Laubscher & Keno Dreßel
From clicks to cribs - How to find your dream home with web scraping
Alexander Lichter
Unlocking Value from Data: The Key to Smarter Business Decisions
Taqi Jaffri, Kapil Gupta & Farooq Sheikh and Tomislav Tipurić
Building Real-Time AI/ML Agents with Distributed Data using Apache Cassandra and Astra DB
Dieter Flick
HR ROBO SAPIENS: Decoding AI Agents and Workflow Automation for Modern Recruitment
José Kadlec
Bringing the power of AI to your application.
Krzysztof Cieślak