deep-scraper
A high-performance web scraping tool using Docker and Crawlee (Playwright) for complex websites like YouTube and Twitter.
Overview:
A high-performance engineering tool for deep web scraping, capable of penetrating protections on complex websites like YouTube and X/Twitter. This skill provides 'interception-level' raw data, optimized for Large Language Model (LLM) processing.

Key Features:
* High-performance web scraping
* Penetration of complex website protections
* Raw data optimized for LLM processing
* Containerized Docker + Crawlee (Playwright) environment
* Strict ID validation for YouTube tasks

How It Works:
This skill uses a Docker container to run a Crawlee (Playwright) environment, which scrapes data from the target website. The results are printed to stdout as a JSON string.

Use Cases:
* Extracting data from YouTube videos
* Scraping Twitter data
* Optimizing data for LLM processing
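
Example (sketch):
The flow described under How It Works (a Docker container runs the scraper and prints a JSON string to stdout), together with the YouTube ID validation listed under Key Features, could be driven from a host script roughly as shown here. This is a minimal sketch, not the skill's actual interface. The image name `deep-scraper:latest`, the convention of passing the target URL as the container's only argument, and the 11-character ID pattern are all assumptions made for illustration.

```python
import json
import re
import subprocess

# Assumption: a YouTube video ID is 11 characters from A-Z, a-z, 0-9, "_" and "-".
# The skill's real validation rule is not documented here and may differ.
YOUTUBE_ID_RE = re.compile(r"[A-Za-z0-9_-]{11}")

# Hypothetical image name; substitute whatever image the skill actually builds.
IMAGE = "deep-scraper:latest"


def is_valid_youtube_id(video_id: str) -> bool:
    """Return True only if the whole string matches the assumed ID pattern."""
    return YOUTUBE_ID_RE.fullmatch(video_id) is not None


def build_command(url: str) -> list[str]:
    """Build the docker invocation; --rm removes the container once it exits."""
    # Assumption: the container takes the target URL as its single argument.
    return ["docker", "run", "--rm", IMAGE, url]


def parse_result(stdout: str) -> object:
    """Decode the JSON string the container printed to stdout."""
    return json.loads(stdout)


def scrape_youtube(video_id: str) -> object:
    """Validate the ID first, then run the container and parse its output."""
    if not is_valid_youtube_id(video_id):
        raise ValueError(f"invalid YouTube video ID: {video_id!r}")
    url = f"https://www.youtube.com/watch?v={video_id}"
    completed = subprocess.run(
        build_command(url),
        capture_output=True,  # collect stdout instead of printing it
        text=True,            # decode bytes to str
        check=True,           # raise CalledProcessError on a non-zero exit
    )
    return parse_result(completed.stdout)
```

Rejecting a malformed ID before starting the container avoids paying the startup cost for a request that cannot succeed. Keeping `parse_result` separate makes it easy to handle the JSON output without Docker being available.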