Found 32,424 repositories (showing 30)
unclecode
🚀🤖 Crawl4AI: Open-source LLM Friendly Web Crawler & Scraper. Don't be shy, join here: https://discord.gg/jP8KfhDhyN
NaiboWang
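A visual no-code/code-free web crawler/spider. 易采集: a visual tool for browser automation testing, data collection, and crawling that lets you design and run crawler tasks graphically without writing code. Also known as ServiceWrapper, an intelligent service-wrapping system for web applications.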
binux
A Powerful Spider(Web Crawler) System in Python.
shengqiangzhang
Some very interesting Python crawler examples that are friendly to beginners, mainly crawling sites such as Taobao, Tmall, WeChat, WeChat Read, Douban, and QQ.
crawlab-team
Distributed web crawler admin platform for managing spiders, regardless of language or framework.
code4craft
A scalable web crawler framework for Java.
BruceDone
A collection of awesome web crawlers and spiders in different languages
bda-research
Web Crawler/Spider for NodeJS + server-side jQuery ;-)
hakluke
Simple, fast web crawler designed for easy, quick discovery of endpoints and assets within a web application
yasserg
Open Source Web Crawler for Java
hiddendevj
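A collection of legal cases in China involving web crawlers. This project compiles news, reference materials, and laws and regulations related to lawsuits and violations involving crawler developers in mainland China. It aims to help people working in the crawler industry in mainland China understand the relevant laws and avoid crossing data-compliance red lines.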
bitmagnet-io
A self-hosted BitTorrent indexer, DHT crawler, content classifier and torrent search engine with web UI, GraphQL API and Servarr stack integration.
internetarchive
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
apache
Apache Nutch is an extensible and scalable web crawler
CrawlScript
WebCollector is an open source web crawler framework based on Java. It provides some simple interfaces for crawling the Web, so you can set up a multi-threaded web crawler in less than 5 minutes.
Qianlitp
A powerful browser crawler for web vulnerability scanners
oxylabs
Structured data gathering from any website using AI-powered scraper, crawler, and browser automation. Scraping and crawling with natural language prompts. Equip your LLM agents with fresh data. AI Studio python SDK for intelligent web data gathering.
xtuhcy
Easy-to-use lightweight web crawler
spider-rs
Web crawler and scraper for Rust
fhamborg
news-please - an integrated web crawler and information extractor for news that just works
sjdirect
Cross Platform C# web crawler framework built for speed and flexibility. Please star this project! +1.
PuerkitoBio
Polite, slim and concurrent web crawler.
YoongiKim
Google, Naver multiprocess image web crawler (Selenium)
ArchiveTeam
The archivist's web crawler: WARC output, dashboard for all crawls, dynamic ignore patterns
amirgamil
A Unix-style personal search engine and web crawler for your digital footprint.
felipecsl
Lightweight Ruby web crawler/scraper with an elegant DSL which extracts structured data from pages.
kkyon
Fast Python dataflow programming framework for data pipeline work (web crawling, machine learning, quantitative trading, etc.)
webrecorder
Run a high-fidelity browser-based web archiving crawler in a single Docker container
apache
A scalable, mature and versatile web crawler based on Apache Storm
fredwu
A high performance web crawler / scraper in Elixir.