Found 2 repositories (showing 2)
mhkc
Test data for bootstrapping a demo instance of Bonsai
ssomani7
• Implemented an end-to-end data pipeline that collects a live stream of tweets from Twitter. After creating a Twitter developer account and registering an application in it, the pipeline uses that application's API keys and access tokens to stream tweets on a topic of interest in real time.
• Four Java Maven modules on the backend: 1) an idempotent Kafka producer that moves data from the Twitter API into a Kafka topic; 2) an idempotent Kafka consumer that reads data from Kafka and stores it in Elasticsearch hosted on the bonsai.io cloud; 3) a custom Java class that filters tweets by follower count and other features; 4) performance improvements using batching with bulk request handling, exception handling for bad data, multithreading, and logging.
• Stored data locally in PostgreSQL with schema enforcement using Avro. Tested the REST proxy using the Insomnia client.
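A minimal sketch of what "idempotent" could mean on each side of such a pipeline, assuming standard Kafka client configuration keys. The class name, method names, and the `topic_partition_offset` document ID scheme are hypothetical illustrations, not code taken from the repository. The producer settings are built as plain `java.util.Properties`, so the example compiles without the Kafka client on the classpath.

```java
import java.util.Properties;

// Hypothetical helper; all names here are illustrative assumptions.
final class IdempotentPipelineSketch {

    // Builds producer settings that turn on Kafka's idempotent producer.
    // Idempotence requires acks=all, retries > 0, and at most 5 in-flight
    // requests per connection.
    static Properties producerProps(String bootstrapServers) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.setProperty("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.setProperty("enable.idempotence", "true");
        props.setProperty("acks", "all");
        props.setProperty("retries", Integer.toString(Integer.MAX_VALUE));
        props.setProperty("max.in.flight.requests.per.connection", "5");
        return props;
    }

    // Derives a deterministic document ID from a record's coordinates.
    // Indexing the same record twice under the same ID overwrites the
    // document instead of duplicating it, which makes the consumer's
    // writes safe to repeat after a redelivery.
    static String documentId(String topic, int partition, long offset) {
        return topic + "_" + partition + "_" + offset;
    }

    public static void main(String[] args) {
        Properties props = producerProps("localhost:9092"); // assumed local broker
        System.out.println("enable.idempotence = " + props.getProperty("enable.idempotence"));
        System.out.println("document id = " + documentId("twitter_tweets", 0, 42L));
    }
}
```

The properties object could be passed to a `KafkaProducer<String, String>` constructor once the Kafka client dependency is added. An ID taken from the tweet payload itself would serve the same purpose as the coordinate-based ID if the payload carries a unique identifier.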