Search Results

Found 128,813 repositories (showing 30)

firecrawl

💚95

🔥 The Web Data API for AI - Power AI agents with clean web data

107.3k

6.9k

AGPL-3.0

TypeScript

Updated 1 minute ago

ai, ai-agents, ai-crawler, +16

crawl4ai

unclecode

💚100

🚀🤖 Crawl4AI: Open-source LLM Friendly Web Crawler & Scraper. Don't be shy, join here: https://discord.gg/jP8KfhDhyN

63.8k

6.5k

Apache-2.0

Python

Updated 3 minutes ago

scrapy

💚100

Scrapy, a fast high-level web crawling & scraping framework for Python.

61.3k

11.5k

BSD-3-Clause

Python

Updated 5 minutes ago

crawler, crawling, framework, +5

MediaCrawler

NanmiCoder

💚100

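Crawlers for Xiaohongshu notes and comments, Douyin videos and comments, Kuaishou videos and comments, Bilibili videos and comments, Weibo posts and comments, Baidu Tieba posts and comment replies, and Zhihu Q&A articles and comments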

47.7k

10.2k

NOASSERTION

Python

Updated 1 minute ago

EasySpider

NaiboWang

💚100

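A visual no-code/code-free web crawler/spider. EasySpider: a visual browser automation testing, data collection, and crawler tool that lets you design and run crawler tasks graphically without writing code. Also known as ServiceWrapper, an intelligent service wrapping system for web applications.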

43.8k

5.3k

AGPL-3.0

JavaScript

Updated 40 minutes ago

batch-processing, batch-script, code-free, +17

Scrapling

D4Vinci

💚100

🕷️ An adaptive Web Scraping framework that handles everything from a single request to a full-scale crawl!

36.0k

3.1k

BSD-3-Clause

Python

Updated just now

ai, ai-scraping, automation, +17

lux

iawia002

💚95

👾 Fast and simple video download library and CLI tool written in Go

31.0k

3.2k

MIT

Updated 2 hours ago

bilibili, crawler, download, +10

colly

gocolly

💚100

Elegant Scraper and Crawler Framework for Golang

25.2k

1.8k

Apache-2.0

Updated 1 hour ago

crawler, crawling, framework, +5

Scrapegraph-ai

ScrapeGraphAI

💚95

Python scraper based on AI

23.3k

2.0k

MIT

Python

Updated 3 minutes ago

ai-crawler, ai-scraping, ai-search, +17

proxy_pool

jhao104

💚100

Python ProxyPool for web spider

23.3k

5.4k

MIT

Python

Updated 2 hours ago

crawler, http, proxy, +2

crawlee

Crawlee: a web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Puppeteer, Playwright, Cheerio, JSDOM, and raw HTTP. Both headful and headless mode. With proxy rotation.

22.7k

1.3k

Apache-2.0

TypeScript

Updated 50 minutes ago

apify, automation, crawler, +14

gpt-crawler

BuilderIO

💚100

Crawl a site to generate knowledge files to create your own custom GPT from a URL

22.2k

2.4k

ISC

TypeScript

Updated 1 hour ago

Douyin_TikTok_Download_API

Evil0ctal

💚100

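🚀 Douyin_TikTok_Download_API is an out-of-the-box, high-performance asynchronous data scraping tool for Douyin, Kuaishou, TikTok, and Bilibili. It supports API calls and online batch parsing and downloading.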

17.1k

2.5k

Apache-2.0

Python

Updated 21 minutes ago

api, async, crawler, +17

pyspider

binux

💚95

A Powerful Spider(Web Crawler) System in Python.

16.9k

3.6k

Apache-2.0

Python

Updated 1 hour ago

crawler, python

katana

projectdiscovery

💚96

A next-generation crawling and spidering framework.

16.5k

1.1k

MIT

Updated 2 minutes ago

cli, crawler, gocrawler, +4

maxun

getmaxun

💚98

🔥 The open-source no-code platform for web scraping, crawling, search and AI data extraction • Turn websites into structured APIs in minutes 🔥

15.3k

1.3k

AGPL-3.0

TypeScript

Updated 1 hour ago

agents, api, automation, +15

newspaper

codelucas

💚100

newspaper3k is a news, full-text, and article metadata extraction in Python 3. Advanced docs:

15.0k

2.1k

MIT

Python

Updated 18 hours ago

crawler, crawling, news, +3

examples-of-web-crawlers

shengqiangzhang

💚100

Some interesting, beginner-friendly examples of Python crawlers, mainly scraping Taobao, Tmall, WeChat, WeChat Read, Douban, QQ, and other sites.

14.6k

3.8k

MIT

HTML

Updated 4 hours ago

agent-pool, crawler, example, +12

Photon

s0md3v

💚98

Incredibly fast crawler designed for OSINT.

12.8k

1.7k

GPL-3.0

Python

Updated 10 hours ago

crawler, information-gathering, osint, +2

crawlab

crawlab-team

💚97

Distributed web crawler admin platform for managing spiders, regardless of language or framework.

12.2k

1.9k

BSD-3-Clause

Updated 2 hours ago

crawlab, crawler, crawling-tasks, +10

webmagic

code4craft

💚97

A scalable web crawler framework for Java.

11.7k

4.1k

Apache-2.0

Java

Updated 2 hours ago

crawler, framework, java, +1

spider-flow

ssssssss-team

💚96

A next-generation crawler platform: define crawler workflows graphically and build crawlers without writing code.

11.3k

2.2k

MIT

Java

Updated 1 day ago

crawler, jsoup, spider, +6

Python

injetlee

💚91

Python scripts: simulated Zhihu login, crawlers, Excel operations, WeChat official accounts, remote power-on

10.6k

4.3k

Python

Updated 1 hour ago

crawler, excel, python, +1

avbook

guyueyingmu

💚90

AV film management system; crawlers for avmoo, javbus, and javlibrary; online AV film library; AV magnet link database. Japanese Adult Video Library, Adult Video Magnet Links - Japanese Adult Video Database

9.9k

2.0k

PHP

Updated 8 hours ago

adult, adult-video, avmoo, +10

crawlee-python

apify

💛81

Crawlee: a web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Parsel, BeautifulSoup, Playwright, and raw HTTP. Both headful and headless mode. With proxy rotation.

8.8k

705

Apache-2.0

Python

Updated 1 hour ago

apify, automation, beautifulsoup, +14

wiseflow

TeamWiseFlow

💚92

A team of "cloud workhorses" that earns money for you online 24/7

8.2k

1.4k

NOASSERTION

TypeScript

Updated 6 minutes ago

crawler, digital-employee, makemoney, +4

awesome-web-scraping

lorien

💛82

List of libraries, tools and APIs for web scraping and data processing.

7.8k

885

NOASSERTION

Makefile

Updated 1 hour ago

captcha-bypass, captcha-recaptcha, crawler, +11

pholcus

andeya

💛88

Pholcus is a distributed high-concurrency crawler software written in pure golang

7.6k

1.7k

Apache-2.0

Updated 1 hour ago

crowler, spider

awesome-crawler

BruceDone

💛85

A collection of awesome web crawlers and spiders in different languages

7.2k

747

MIT

Updated 6 hours ago

awesome, crawler, node-crawler, +4

autoscraper

alirezamika

💛84

A Smart, Automatic, Fast and Lightweight Web Scraper for Python

7.1k

717

MIT

Python

Updated 23 hours ago

ai, artificial-intelligence, automation, +9
