Search Results

Found 32,424 repositories (showing 30)

crawl4ai

unclecode

💚100

🚀🤖 Crawl4AI: Open-source LLM Friendly Web Crawler & Scraper. Don't be shy, join here: https://discord.gg/jP8KfhDhyN

63.7k

6.5k

Apache-2.0

Python

Updated 15 minutes ago

EasySpider

NaiboWang

💚100

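A visual no-code/code-free web crawler/spider. EasySpider (易采集) is a visual tool for browser automation testing, data collection, and crawling that lets you design and run crawler tasks graphically without writing code. Also known as ServiceWrapper, an intelligent service-wrapping system for web applications.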

43.8k

5.3k

AGPL-3.0

JavaScript

Updated 1 hour ago

batch-processing, batch-script, code-free, +17

pyspider

binux

💚95

A Powerful Spider(Web Crawler) System in Python.

16.9k

3.6k

Apache-2.0

Python

Updated 4 hours ago

crawler, python

examples-of-web-crawlers

shengqiangzhang

💚100

Some interesting examples of Python crawlers that are friendly to beginners, mainly crawling sites such as Taobao, Tmall, WeChat, WeChat Read, Douban, and QQ.

14.6k

3.8k

MIT

HTML

Updated 4 hours ago

agent-pool, crawler, example, +12

crawlab

crawlab-team

💚97

Distributed web crawler admin platform for managing spiders regardless of language or framework.

12.2k

1.9k

BSD-3-Clause

Updated 16 hours ago

crawlab, crawler, crawling-tasks, +10

webmagic

code4craft

💚97

A scalable web crawler framework for Java.

11.7k

4.1k

Apache-2.0

Java

Updated 6 hours ago

crawler, framework, java, +1

awesome-crawler

BruceDone

💛85

A collection of awesome web crawlers and spiders in different languages

7.2k

747

MIT

Updated 3 hours ago

awesome, crawler, node-crawler, +4

node-crawler

bda-research

💛86

Web Crawler/Spider for NodeJS + server-side jQuery ;-)

6.8k

874

MIT

TypeScript

Updated 1 hour ago

cheerio, crawler, extract-data, +4

hakrawler

hakluke

💛75

Simple, fast web crawler designed for easy, quick discovery of endpoints and assets within a web application

5.0k

540

GPL-3.0

Updated 15 hours ago

bugbounty, crawling, hacking, +4

crawler4j

yasserg

💚90

Open Source Web Crawler for Java

4.6k

1.9k

Apache-2.0

Java

Updated 1 day ago

Crawler_Illegal_Cases_In_China

hiddendevj

💛73

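A collection of illegal web crawler cases in China. This project compiles news, materials, and laws and regulations concerning lawsuits and violations involving crawler developers in mainland China. It aims to help crawler practitioners working in mainland China understand the relevant laws and avoid crossing data-compliance red lines.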

4.6k

315

HTML

Updated 21 hours ago

china, crawler, law

bitmagnet

bitmagnet-io

💛76

A self-hosted BitTorrent indexer, DHT crawler, content classifier and torrent search engine with web UI, GraphQL API and Servarr stack integration.

4.0k

225

MIT

Updated 7 hours ago

bittorrent, dht, prowlarr, +7

heritrix3

internetarchive

💛81

Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.

3.2k

782

NOASSERTION

Java

Updated 3 hours ago

heritrix, java, warc, +1

nutch

apache

💛81

Apache Nutch is an extensible and scalable web crawler

3.1k

1.3k

Apache-2.0

Java

Updated 4 days ago

apache, crawling, hadoop, +3

WebCollector

CrawlScript

💛87

WebCollector is an open source web crawler framework based on Java. It provides some simple interfaces for crawling the Web; you can set up a multi-threaded web crawler in less than 5 minutes.

3.1k

1.4k

GPL-3.0

Java

Updated 6 days ago

crawlergo

Qianlitp

💛78

A powerful browser crawler for web vulnerability scanners

3.0k

499

GPL-3.0

Updated 11 hours ago

arsenal, blackhat, chrome-devtools, +8

oxylabs-ai-studio-py

Structured data gathering from any website using an AI-powered scraper, crawler, and browser automation. Scraping and crawling with natural language prompts. Equip your LLM agents with fresh data. AI Studio Python SDK for intelligent web data gathering.

2.7k

MIT

Python

Updated 22 hours ago

ai-crawler, ai-scraper, ai-scraping, +9

gecco

xtuhcy

💛81

Easy-to-use lightweight web crawler

2.5k

876

MIT

Java

Updated 4 days ago

crawler, dynamic, fastjson, +3

spider

spider-rs

🧡69

Web crawler and scraper for Rust

2.4k

195

MIT

Rust

Updated 42 minutes ago

ai-agent, automation, crawler, +6

news-please

fhamborg

💛77

news-please - an integrated web crawler and information extractor for news that just works

2.4k

450

Apache-2.0

Python

Updated 4 days ago

cc-news, ccnews, commoncrawl, +17

abot

sjdirect

💛78

Cross Platform C# web crawler framework built for speed and flexibility. Please star this project! +1.

2.3k

553

Apache-2.0

Updated 2 days ago

abot, abot-nuget, c-sharp, +17

gocrawl

PuerkitoBio

🧡64

Polite, slim and concurrent web crawler.

2.1k

194

BSD-3-Clause

Updated 1 week ago

crawler, robots-txt

AutoCrawler

YoongiKim

💛76

Google, Naver multiprocess image web crawler (Selenium)

1.7k

425

Apache-2.0

Python

Updated 1 day ago

bigdata, chromedriver, crawler, +8

grab-site

ArchiveTeam

🧡63

The archivist's web crawler: WARC output, dashboard for all crawls, dynamic ignore patterns

1.6k

154

NOASSERTION

Python

Updated 10 hours ago

archiving, crawl, crawler, +2

apollo

amirgamil

🧡52

A Unix-style personal search engine and web crawler for your digital footprint.

1.4k

MIT

Updated 1 month ago

personal-search, poseidon, search, +2

wombat

felipecsl

💛73

Lightweight Ruby web crawler/scraper with an elegant DSL which extracts structured data from pages.

1.4k

129

MIT

Ruby

Updated 16 hours ago

crawler, dsl, ruby, +1

botflow

kkyon

🧡62

Fast Python dataflow programming framework for data pipeline work (web crawling, machine learning, quantitative trading, etc.)

1.2k

103

NOASSERTION

Python

Updated 1 week ago

browsertrix-crawler

webrecorder

🧡57

Run a high-fidelity browser-based web archiving crawler in a single Docker container

1.0k

136

AGPL-3.0

TypeScript

Updated 5 hours ago

crawler, crawling, wacz, +4

stormcrawler

apache

🧡69

A scalable, mature and versatile web crawler based on Apache Storm

975

273

Apache-2.0

Java

Updated 12 hours ago

apache-storm, crawler, distributed, +3

crawler

fredwu

🧡67

A high performance web crawler / scraper in Elixir.

958

Elixir

Updated 3 days ago

crawler, elixir, files, +4
