Search Results

Found 228,277 repositories (showing 30)

firecrawl

💚95

🔥 The Web Data API for AI - Power AI agents with clean web data

107.5k

6.9k

AGPL-3.0

TypeScript

Updated just now

ai, ai-agents, ai-crawler, +16

crawl4ai

unclecode

💚100

🚀🤖 Crawl4AI: Open-source LLM Friendly Web Crawler & Scraper. Don't be shy, join here: https://discord.gg/jP8KfhDhyN

63.8k

6.5k

Apache-2.0

Python

Updated 47 minutes ago

huginn

💚100

Create agents that monitor and act on your behalf. Your agents are standing by!

49.1k

4.2k

MIT

Ruby

Updated 26 minutes ago

agent, automation, feed, +9

EasySpider

NaiboWang

💚100

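A visual no-code/code-free web crawler/spider. EasySpider (易采集): a visual application for browser automation testing, data collection, and crawling that lets you design and run crawler tasks graphically without writing code. Alias: ServiceWrapper, an intelligent service-wrapping system for web applications.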

43.8k

5.3k

AGPL-3.0

JavaScript

Updated 5 hours ago

batch-processing, batch-script, code-free, +17

lux

iawia002

💚95

👾 Fast and simple video download library and CLI tool written in Go

31.0k

3.2k

MIT

Updated 12 hours ago

bilibili, crawler, download, +10

cheerio

cheeriojs

💚100

The fast, flexible, and elegant library for parsing and manipulating HTML and XML.

30.3k

1.7k

MIT

TypeScript

Updated 7 hours ago

cheerio, dom, hacktoberfest, +7

Jobs_Applier_AI_Agent_AIHawk

feder-cr

💚95

AIHawk aims to ease the job hunt by automating the job application process. Using artificial intelligence, it lets users apply for multiple jobs in a tailored way.

29.6k

4.5k

AGPL-3.0

Python

Updated 31 minutes ago

agent, application-resume, artificial-intelligence, +17

colly

gocolly

💚100

Elegant Scraper and Crawler Framework for Golang

25.2k

1.8k

Apache-2.0

Updated 3 hours ago

crawler, crawling, framework, +5

Scrapegraph-ai

ScrapeGraphAI

💚95

Python scraper based on AI

23.3k

2.0k

MIT

Python

Updated 30 minutes ago

ai-crawler, ai-scraping, ai-search, +17

crawlee

Crawlee: A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Puppeteer, Playwright, Cheerio, JSDOM, and raw HTTP. Both headful and headless mode. With proxy rotation.

22.7k

1.3k

Apache-2.0

TypeScript

Updated 14 minutes ago

apify, automation, crawler, +14

Douyin_TikTok_Download_API

Evil0ctal

💚100

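🚀 "Douyin_TikTok_Download_API" is an out-of-the-box, high-performance asynchronous data scraping tool for Douyin, Kuaishou, TikTok, and Bilibili, supporting API calls and online batch parsing and downloading.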

17.1k

2.5k

Apache-2.0

Python

Updated 3 hours ago

api, async, crawler, +17

maxun

getmaxun

💚98

🔥 The open-source no-code platform for web scraping, crawling, search and AI data extraction • Turn websites into structured APIs in minutes 🔥

15.3k

1.3k

AGPL-3.0

TypeScript

Updated 44 minutes ago

agents, api, automation, +15

newspaper

codelucas

💚100

newspaper3k is a news, full-text, and article metadata extraction in Python 3. Advanced docs:

15.0k

2.1k

MIT

Python

Updated 1 day ago

crawler, crawling, news, +3

nsfw_data_scraper

alex000kim

💚98

Collection of scripts to aggregate image data for the purposes of training an NSFW Image Classifier

12.6k

2.9k

MIT

Shell

Updated 2 hours ago

content-moderation, deep-learning, machine-learning, +3

chinese-xinhua

pwxcoo

💚97

📙 Chinese Xinhua Dictionary database. Includes xiehouyu (two-part allegorical sayings), idioms, words, and Chinese characters.

11.5k

2.7k

MIT

Python

Updated 20 hours ago

chinese, chinese-characters, chinese-language, +9

avbook

guyueyingmu

💚90

AV movie management system; crawler for avmoo, javbus, and javlibrary; online AV video library; AV magnet link database. Japanese Adult Video Library, Adult Video Magnet Links - Japanese Adult Video Database

9.9k

2.0k

PHP

Updated 18 hours ago

adult, adult-video, avmoo, +10

Goutte

FriendsOfPHP

💛84

Goutte, a simple PHP Web Scraper

9.2k

990

MIT

PHP

Updated 1 day ago

crawlee-python

apify

💛81

Crawlee—A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Parsel, BeautifulSoup, Playwright, and raw HTTP. Both headful and headless mode. With proxy rotation.

8.8k

705

Apache-2.0

Python

Updated 55 minutes ago

apify, automation, beautifulsoup, +14

awesome-crawler

BruceDone

💛85

A collection of awesome web crawlers and spiders in different languages

7.2k

747

MIT

Updated 15 hours ago

awesome, crawler, node-crawler, +4

autoscraper

alirezamika

💛84

A Smart, Automatic, Fast and Lightweight Web Scraper for Python

7.1k

717

MIT

Python

Updated 5 hours ago

ai, artificial-intelligence, automation, +9

rod

go-rod

💛76

A Chrome DevTools Protocol driver for web automation and scraping.

6.9k

456

MIT

Updated 5 hours ago

automation, cdp, chrome-devtools, +14

llm-scraper

mishushakov

💛75

Turn any webpage into structured data using LLMs

6.3k

378

MIT

TypeScript

Updated 7 hours ago

ai, artificial-intelligence, browser, +10

ferret

MontFerret

💛74

Declarative web scraping

6.0k

320

Apache-2.0

Updated just now

cdp, chrome, cli, +12

x-ray

matthewmueller

💛79

The next web scraper. See through the <html> noise.

5.9k

342

MIT

JavaScript

Updated 1 day ago

headless-chrome-crawler

yujiosaka

💛75

Distributed crawler powered by Headless Chrome

5.7k

405

MIT

JavaScript

Updated 6 hours ago

chrome, chromium, crawler, +7

snscrape

JustAnotherArchivist

💛83

A social networking service scraper in Python

5.3k

776

GPL-3.0

Python

Updated 1 day ago

python, scraper, social-media, +1

tiktok-scraper

drawrowfly

💛79

TikTok Scraper. Download video posts, collect user/trend/hashtag/music feed metadata, sign URL and etc.

5.0k

891

TypeScript

Updated 5 hours ago

browser-fingerprinting

niespodd

🧡68

Analysis of Bot Protection systems with available countermeasures 🚿. How to defeat anti-bot system 👻 and get around browser fingerprinting scripts 🕵️‍♂️ when scraping the web?

5.0k

273

JavaScript

Updated 1 day ago

abck, arkose, automation, +15

Scraperr

jaypyles

💛72

Self-hosted webscraper.

4.9k

235

MIT

TypeScript

Updated 2 hours ago

docker, helm, kubernetes, +10

node-ytdl-core

fent

💛78

YouTube video downloader in javascript.

4.7k

857

MIT

JavaScript

Updated 19 hours ago

node, scraper, video-downloader, +2
