Found 55 repositories (showing 30)
ondergetekende
A toolkit to build pythonic web scraper libraries
ImYourBoyRoy
AI-first web scraping engine with stealth bypass, MCP server, and multimodal output (Markdown, JSON, PDF) for agents and automation.
ProphetLamb
Modern toolkit for implementing web scrapers as an ASP.NET service.
devalexanderdaza
A comprehensive TypeScript toolkit for building robust web scrapers with Crawlee, featuring maximum configurability, plugin system, and CLI generator.
This repo contains four Python projects from the Code Alpha Internship: a Hangman game, stock portfolio tracker, automation toolkit (file mover, email extractor, web scraper), and a rule-based chatbot. Each script showcases basic Python skills and real-world functionality.
leimei7
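A collection of small web crawler projects demonstrating scraping techniques and implementations for different websites. Each project has its own function and purpose; the function, usage, and dependencies of each project are described in detail below.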
CodeForgeX2012
No description available
TilottamaShinde
web-scraper-toolkit is a Python-based project designed to extract information from websites using the BeautifulSoup, Requests, Pandas, and Selenium libraries.
lazymac2x
Universal Web Scraper API - extract metadata, links, headlines, images, tables. REST + MCP.
nvllmnd
Web Crawler and Scraper for Metanoia Cybersecurity Toolkit
DalyanParker
A Data Analysis Toolkit & Web Scraper for Optimized Medical Queries
fuahyo
MCP toolkit for automated web scraper maintenance, selector generation, and seamless integration with scraping workflows
adelelawady
Scraperly is your all-in-one Python toolkit for creating engaging AI-powered videos. It seamlessly combines web scraping, AI content processing, and video generation to turn your text into professional-looking videos with minimal effort.
prasadp1307
Natural language processing (NLP) is the branch of computer science, and more specifically of artificial intelligence (AI), concerned with giving computers the ability to understand text and spoken words in much the same way human beings can. What is NLTK used for? The Natural Language Toolkit (NLTK) is a platform for building Python programs that work with human language data, with applications in statistical natural language processing. It contains text processing libraries for tokenization, parsing, classification, stemming, tagging and semantic reasoning. What is web scraping? Web scraping is the process of using bots to extract content and data from a website. Unlike screen scraping, which only copies the pixels displayed onscreen, web scraping extracts the underlying HTML code and, with it, data stored in a database. The scraper can then replicate entire website content elsewhere.
This thesis presents three main contributions to its research questions. The first contribution is scraping data from Wikipedia news articles using web-scraper, followed by extraction of entities and relations from news events using a constituency parser from a natural language processing toolkit. The second contribution is an annotation interface. The third contribution takes the annotated arguments and entities from the annotation interface and passes them through a relation extraction framework based on Long Short-Term Memory (LSTM) and Multi-Task Learning (MTL). The model is evaluated on datasets and its performance is measured with evaluation metrics, demonstrating the model’s effectiveness with supervised learning and MTL, which significantly improved extraction accuracy. Keywords: information extraction, relation extraction, entity extraction, n-ary relation extraction, constituency parser, annotation interface, Long Short-Term Memory (LSTM), Multi-Task Learning (MTL).
maivyly52-gif
Amazon web scraper toolkit
ahmed202020803
A comprehensive toolkit for scraping and analyzing web data with various tools and utilities
operezol
No description available
detih2
Production-ready web scraping tools: BeautifulSoup for static pages, Playwright for JS-rendered content, export to CSV/Excel/JSON.
timdunn22
HN Who Is Hiring scraper. Extracts structured job data (company, role, location, remote, tech stack) to CSV/JSON. Python + BeautifulSoup.
MuneerAhmad7
No description available
v0id-lab
Production-ready web scraping framework. Async, proxy rotation, anti-detection. Scrape any website.
merchbotai
No description available
Chris-Dyson
Production-ready Python web scraping toolkit with rate limiting, CSS selectors, and CSV/JSON export
naevis960
🕷️ A powerful and flexible web scraping toolkit built with Python. Supports multiple sites, anti-detection, and data export.
syunend-create
Universal web scraper using Playwright and BeautifulSoup4 with CSV export support
Hi-im-Connect
Production-grade Python web scraper toolkit — static + JS-heavy sites, AI-powered data extraction
arcbjorn
No description available
maivyly52-gif
Amazon web scraper automation toolkit