Search Results

Found 302 repositories (showing 30)

flight-scrappper

bertolo1988

🧡50

Web scraper made with nodejs and selenium-webdriver that gathers flight data and stores it in a mongodb database.

MIT

JavaScript

Updated 3 weeks ago

nse-stock-scraper

A web scraper built with the Scrapy framework, MongoDB, and AfricasTalking that gets stock prices for companies listed on the Nairobi Stock Exchange. The project stores each ticker name and price and sends SMS notifications via AfricasTalking once properly set up.

MIT

Python

Updated 1 month ago

africastalking, github-actions, mongodb, +3

SourceGrade

joelseq

❤️35

A web scraper for www.gradesource.com built with MongoDB, Express.js, Node.js, React and Redux

MIT

JavaScript

Updated 3 years ago

express, mongodb, nodejs, +2

BotanicTool

MateiIonutEduard

❤️40

Web scraper tool that fetches a list of products from a specific website and builds a MongoDB dataset.

MIT

Updated 3 months ago

technews-flutter

codewithkd77

🧡65

TechBuzz is a Flutter news app with a Node.js & MongoDB backend. It features a web admin dashboard, AI news scraper, and options to bookmark & share articles. 🚀 Built with BLoC architecture for state management. Tech Stack: Flutter, Node.js, MongoDB, Web Scraping.

Dart

Updated 17 hours ago

bloc, flutter, flutter-apps, +3

Mongo_Scraper

CaptainEFFF

❤️20

# All the News That's Fit to Scrape

### Overview

In this assignment, you'll create a web app that lets users view and leave comments on the latest news. But you're not going to actually write any articles; instead, you'll flex your Mongoose and Cheerio muscles to scrape news from another site.

### Before You Begin

1. Create a GitHub repo for this assignment and clone it to your computer. Any name will do -- just make sure it's related to this project in some fashion.

2. Run `npm init`. When that's finished, install and save these npm packages:

   1. express
   2. express-handlebars
   3. mongoose
   4. cheerio
   5. axios

3. **NOTE**: If you want to earn complete credit for your work, you must use all five of these packages in your assignment.

4. In order to deploy your project to Heroku, you must set up an mLab provision. mLab is a remote MongoDB database that Heroku supports natively. Follow these steps to get it running:

5. Create a Heroku app in your project directory.

6. Run this command in your Terminal/Bash window:

   * `heroku addons:create mongolab`
   * This command will add the free mLab provision to your project.

7. When you go to connect your mongo database to mongoose, do so the following way:

   ```js
   // If deployed, use the deployed database. Otherwise use the local mongoHeadlines database
   var MONGODB_URI = process.env.MONGODB_URI || "mongodb://localhost/mongoHeadlines";

   mongoose.connect(MONGODB_URI);
   ```

   * This code should connect mongoose to your remote mongolab database if deployed, but otherwise will connect to the local mongoHeadlines database on your computer.

8. [Watch this demo of a possible submission](https://youtu.be/4ltZr3VPmno). See the deployed demo application [here](http://nyt-mongo-scraper.herokuapp.com/).

9. Your site doesn't need to match the demo's style, but feel free to attempt something similar if you'd like. Otherwise, just be creative!

### Commits

Having an active and healthy commit history on GitHub is important for your future job search. It is also extremely important for making sure your work is saved in your repository. If something breaks, committing often ensures you are able to go back to a working version of your code.

* Committing often is a signal to employers that you are actively working on your code and learning.
* We use the mantra “commit early and often.” This means that when you write code that works, add it and commit it!
* Numerous commits allow you to see how your app is progressing and give you a point to revert to if anything goes wrong.
* Be clear and descriptive in your commit messaging.
* When writing a commit message, avoid vague messages like "fixed." Be descriptive so that you and anyone else looking at your repository knows what happened with each commit.
* We would like you to have well over 200 commits by graduation, so commit early and often!

### Submission on BCS

* **This assignment must be deployed.**
* Please submit both the deployed Heroku link to your homework AND the link to the GitHub repository!

## Instructions

* Create an app that accomplishes the following:

  1. Whenever a user visits your site, the app should scrape stories from a news outlet of your choice and display them for the user. Each scraped article should be saved to your application database. At a minimum, the app should scrape and display the following information for each article:

     * Headline - the title of the article
     * Summary - a short summary of the article
     * URL - the url to the original article
     * Feel free to add more content to your database (photos, bylines, and so on).

  2. Users should also be able to leave comments on the articles displayed and revisit them later. The comments should be saved to the database as well and associated with their articles. Users should also be able to delete comments left on articles. All stored comments should be visible to every user.

* Beyond these requirements, be creative and have fun with this!

### Tips

* Go back to Saturday's activities if you need a refresher on how to partner one model with another.
* Whenever you scrape a site for stories, make sure an article isn't already represented in your database before saving it; do not save any duplicate entries.
* Don't just clear out your database and populate it with scraped articles whenever a user accesses your site.
* If your app deletes stories every time someone visits, your users won't be able to see any comments except the ones that they post.

### Helpful Links

* [MongoDB Documentation](https://docs.mongodb.com/manual/)
* [Mongoose Documentation](http://mongoosejs.com/docs/api.html)
* [Cheerio Documentation](https://github.com/cheeriojs/cheerio)

### Reminder: Submission on BCS

* Please submit both the deployed Heroku link to your homework AND the link to the GitHub repository!

---

### Minimum Requirements

* **This assignment must be deployed.**

Attempt to complete the homework assignment as described in the instructions. If unable to complete certain portions, please pseudocode these portions to describe what remains to be completed. Hosting on Heroku and adding a README.md are required for this homework. In addition, add this homework to your portfolio; more information can be found below.

---

### Hosting on Heroku

Now that we have a backend to our applications, we use Heroku for hosting. Please note that while **Heroku is free**, it will request credit card information if you have more than 5 applications at a time or are adding a database.

Please see [Heroku’s Account Verification Information](https://devcenter.heroku.com/articles/account-verification) for more details.

---

### Create a README.md

Add a `README.md` to your repository describing the project. Here are some resources to help you along the way:

* [About READMEs](https://help.github.com/articles/about-readmes/)
* [Mastering Markdown](https://guides.github.com/features/mastering-markdown/)

---

### Add To Your Portfolio

After completing the homework, please add the piece to your portfolio. Make sure to add a link to your updated portfolio in the comments section of your homework so the TAs can easily ensure you completed this step when they are grading the assignment. To receive an 'A' on any assignment, you must link to it from your portfolio.

---

### One Last Thing

If you have any questions about this project or the material we have covered, please post them in the community channels in Slack so that your fellow developers can help you! If you're still having trouble, you can come to office hours for assistance from your instructor and TAs.

That goes threefold for this unit: MongoDB and Mongoose compose a challenging data management system. If there's anything you find confusing about these technologies, don't hesitate to speak with someone from the Boot Camp team.

**Good Luck!**
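
The tip above about checking for an existing article before saving can be sketched as follows. This is a minimal illustration rather than the assignment's reference solution: the helper name `buildUpsertOps`, the sample articles, and the choice of `url` as the uniqueness key are all assumptions. It builds MongoDB `bulkWrite` operations that use `upsert` with `$setOnInsert`, so an article is written only when no document with the same `url` exists.

```js
// Hypothetical helper (not part of the assignment): converts scraped articles
// into bulkWrite operations that insert an article only if its url is new.
function buildUpsertOps(articles) {
  const seen = new Set();
  const ops = [];
  for (const { headline, summary, url } of articles) {
    // Skip entries without a url and repeats within the same scrape.
    if (!url || seen.has(url)) continue;
    seen.add(url);
    ops.push({
      updateOne: {
        filter: { url },
        // $setOnInsert only applies when the upsert creates a new document,
        // so an existing article is left untouched. The url field of a new
        // document comes from the equality condition in the filter.
        update: { $setOnInsert: { headline, summary } },
        upsert: true,
      },
    });
  }
  return ops;
}

// Sample data, invented for illustration.
const scraped = [
  { headline: "A", summary: "first", url: "https://example.com/a" },
  { headline: "A again", summary: "dup", url: "https://example.com/a" },
  { headline: "B", summary: "second", url: "https://example.com/b" },
];
const ops = buildUpsertOps(scraped);
console.log(ops.length);
```

With Mongoose, the resulting array could be passed to `Model.bulkWrite`, for example `Article.bulkWrite(ops)`, where `Article` is an assumed model name. A unique index on `url` would additionally guard against duplicates from concurrent requests.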

JavaScript

Updated 6 months ago

spider

mrbadri

❤️35

web Scraper, mongodb

JavaScript

Updated 1 year ago

WebScraper-MongoDB

Youssefbaghr

❤️35

A Python-based web scraping tool that extracts content from websites using BeautifulSoup and stores the collected data in MongoDB.

Python

Updated 10 months ago

smart-image-scraper

hwasiti

❤️40

Deep learning-based image dataset cleaning of Flickr. Scraped metadata saved in MongoDB. Web app designed & deployed: https://bit.ly/smart_image_scraper

Apache-2.0

Python

Updated 1 year ago

deep-learning, exif-data, flickr-api, +2

Darkweb_Scraper

AdilKhan000

❤️35

This project is a Python-based scraper designed to access and scrape posts from the Dread dark web forum. The script leverages `Selenium`, `Requests`, `MongoDB`, and proxies configured through `Tor` and `Privoxy` to safely navigate and access `.onion` sites.

Python

Updated 2 months ago

Pointy_Goblins

etorres-revature

❤️40

Short-term rental advertisement aggregator using: Express.js; Node.js; React.js; web scrapers; and MongoDB.

GPL-3.0

JavaScript

Updated 4 years ago

context-api, full-stack, hooks, +5

mongooseScraper

tastaub

❤️35

# All the News That's Fit to Scrape

### Overview

In this assignment, you'll create a web app that lets users view and leave comments on the latest news. But you're not going to actually write any articles; instead, you'll flex your Mongoose and Cheerio muscles to scrape news from another site.

### Before You Begin

1. Create a GitHub repo for this assignment and clone it to your computer. Any name will do -- just make sure it's related to this project in some fashion.

2. Run `npm init`. When that's finished, install and save these npm packages:

   1. express
   2. express-handlebars
   3. mongoose
   4. body-parser
   5. cheerio
   6. request

3. **NOTE**: If you want to earn complete credit for your work, you must use all six of these packages in your assignment.

4. In order to deploy your project to Heroku, you must set up an mLab provision. mLab is a remote MongoDB database that Heroku supports natively. Follow these steps to get it running:

5. Create a Heroku app in your project directory.

6. Run this command in your Terminal/Bash window:

   * `heroku addons:create mongolab`
   * This command will add the free mLab provision to your project.

7. When you go to connect your mongo database to mongoose, do so the following way:

   ```js
   // If deployed, use the deployed database. Otherwise use the local mongoHeadlines database
   var MONGODB_URI = process.env.MONGODB_URI || "mongodb://localhost/mongoHeadlines";

   // Set mongoose to leverage built in JavaScript ES6 Promises
   // Connect to the Mongo DB
   mongoose.Promise = Promise;
   mongoose.connect(MONGODB_URI);
   ```

   * This code should connect mongoose to your remote mongolab database if deployed, but otherwise will connect to the local mongoHeadlines database on your computer.

8. [Watch this demo of a possible submission](mongo-homework-demo.mov). See the deployed demo application [here](http://nyt-mongo-scraper.herokuapp.com/).

9. Your site doesn't need to match the demo's style, but feel free to attempt something similar if you'd like. Otherwise, just be creative!

### Submission on BCS

* Please submit both the deployed Heroku link to your homework AND the link to the GitHub repository!

## Instructions

* Create an app that accomplishes the following:

  1. Whenever a user visits your site, the app should scrape stories from a news outlet of your choice and display them for the user. Each scraped article should be saved to your application database. At a minimum, the app should scrape and display the following information for each article:

     * Headline - the title of the article
     * Summary - a short summary of the article
     * URL - the url to the original article
     * Feel free to add more content to your database (photos, bylines, and so on).

  2. Users should also be able to leave comments on the articles displayed and revisit them later. The comments should be saved to the database as well and associated with their articles. Users should also be able to delete comments left on articles. All stored comments should be visible to every user.

* Beyond these requirements, be creative and have fun with this!

### Tips

* Go back to Saturday's activities if you need a refresher on how to partner one model with another.
* Whenever you scrape a site for stories, make sure an article isn't already represented in your database before saving it; we don't want duplicates.
* Don't just clear out your database and populate it with scraped articles whenever a user accesses your site.
* If your app deletes stories every time someone visits, your users won't be able to see any comments except the ones that they post.

### Helpful Links

* [MongoDB Documentation](https://docs.mongodb.com/manual/)
* [Mongoose Documentation](http://mongoosejs.com/docs/api.html)
* [Cheerio Documentation](https://github.com/cheeriojs/cheerio)

### Reminder: Submission on BCS

* Please submit both the deployed Heroku link to your homework AND the link to the GitHub repository!

---

### Minimum Requirements

Attempt to complete the homework assignment as described in the instructions. If unable to complete certain portions, please pseudocode these portions to describe what remains to be completed. Hosting on Heroku and adding a README.md are required for this homework. In addition, add this homework to your portfolio; more information can be found below.

---

### Hosting on Heroku

Now that we have a backend to our applications, we use Heroku for hosting. Please note that while **Heroku is free**, it will request credit card information if you have more than 5 applications at a time or are adding a database.

Please see [Heroku’s Account Verification Information](https://devcenter.heroku.com/articles/account-verification) for more details.

---

### Create a README.md

Add a `README.md` to your repository describing the project. Here are some resources to help you along the way:

* [About READMEs](https://help.github.com/articles/about-readmes/)
* [Mastering Markdown](https://guides.github.com/features/mastering-markdown/)

---

### Add To Your Portfolio

After completing the homework, please add the piece to your portfolio. Make sure to add a link to your updated portfolio in the comments section of your homework so the TAs can easily ensure you completed this step when they are grading the assignment. To receive an 'A' on any assignment, you must link to it from your portfolio.

---

### One Last Thing

If you have any questions about this project or the material we have covered, please post them in the community channels in Slack so that your fellow developers can help you! If you're still having trouble, you can come to office hours for assistance from your instructor and TAs.

That goes threefold for this week: MongoDB and Mongoose compose a challenging data management system. If there's anything you find confusing about these technologies, don't hesitate to speak with someone from the Boot Camp team.

**Good Luck!**

HTML

Updated 7 months ago

scraper-springboot-angular-mongodb

ujjavaldesai07

❤️35

Web Scraper built using Spring Boot, Angular, and MongoDB. This tool scrapes the data from the website periodically in the background using a Multithreading environment and shows the data in table view with filters and sorting options in the frontend.

Java

Updated 3 years ago

angular, angular-material, junit, +5

Facebook_scraping

Skanderba8

❤️20

A Python-based Facebook scraper using Selenium and BeautifulSoup to extract posts, images, comments, reactions, and dates from public Facebook pages. The app stores data in MongoDB and provides a simple Flask web interface for users to start scraping by entering the URL and date range.

Python

Updated 3 months ago

scrape-hw

jdrenteria

❤️20

JavaScript

Updated 7 months ago

mongodb-web-scraper

tomtom828

❤️35

A Node.js & MongoDB webapp that web-scrapes news data and allows users to comment about it.

HTML

Updated 4 years ago

mongodb, nodejs, webscraper

CheerioMongo-scraper

BFGriffith

❤️25

A NodeJS web-scraper application that uses Cheerio + MongoDB and allows users to comment on articles.

MIT

JavaScript

Updated 6 years ago

cheerio, mongo, node

gfinance

PTAug-zz

❤️35

A web scraper for Google Finance that stores the data in a MongoDB database.

Python

Updated 5 years ago

indeed-scraper

furquan-lp

❤️25

Web Scraper for Indeed.com (70k+ jobs scraped so far and stored on MongoDB)

GPL-3.0

Python

Updated 2 years ago

poker_tournament_scraper

dchrostowski

❤️20

A containerized web scraper with MongoDB that pulls tournament data from various poker sites.

JavaScript

Updated 11 months ago

Twitter-scraper

SRIDHAR3131

❤️35

Twitter data scraper web application! With Streamlit, MongoDB, and the snscrape Python library, you can easily collect, visualize, and analyze Twitter data.

Python

Updated 5 months ago

mongodb, python, streamlit, +1

10man-discord-bot

trunderman

❤️35

Discord chatbot that functions as a web scraper. The bot is intended to pull statistics from a player's most recently played CS:GO match, send them to a database, and then show a leaderboard of the players with the best scores in that Discord server. Created using Express, Discord.js, and MongoDB.

JavaScript

Updated 6 years ago

Node.js-MongoDB-Web-Scraper-Dynamic-Website-Content

sbreese

❤️35

Demo code showing how to use Node.js, Mongoose & MongoDB to scrape web content, store it in a database, and display it on a web page.

JavaScript

Updated 2 years ago

simplest-xpath-web-scraper

asirihewage

❤️35

Simplest web scraper created using Python3 and MongoDB

Python

Updated 3 years ago

data, data-mining, python3, +3

web-scraping-mongodb

rmglennon

❤️40

News web scraper that stores links, notes, and favorites in MongoDB

MIT

JavaScript

Updated 4 years ago

ixitbot

k0d3d

❤️35

Scrape and crawl data from the web using NodeJs. Pages are queued using Redis, scrapers have flexible schema definitions and results are stored in MongoDB.

JavaScript

Updated 3 years ago

scraped

greggypc

❤️20

Scrape the freshest headlines from NPR News. Save, view and leave comments on your favorite headlines. This web scraper app features a Node/Express/MongoDB backend.

JavaScript

Updated 3 years ago

Medium-Scraper

zachha

❤️25

** Medium's site has been updated to make it harder to scrape articles; a potential fix is in progress. ** Medium Scraper is a MongoDB-powered web scraper that scrapes medium.com for articles in the user's preferred category.

MIT

JavaScript

Updated 3 years ago

article, cheerio, handlebars, +3

Image-Scrapping

pooja30123

❤️35

Image Scraper with MongoDB is a Flask-based web app that lets you scrape images from the internet using a keyword and store them both locally and in MongoDB Atlas. Simple UI, real-time scraping, and cloud storage — all in one project!

Jupyter Notebook

Updated 7 months ago

Patent-Information-Scraper-with-Python

Alejandro-Lopez83

❤️35

Patent Information Scraper: A Python-based web scraping tool that automatically extracts patent summaries from OEPM and Patentscope websites. Built with Selenium and BeautifulSoup4, it processes URLs and stores data in MongoDB, featuring robust error handling and progress tracking.

Python

Updated 1 year ago
