Found 4 repositories (showing 4)
tzy0301
Using Databricks and Apache Spark (PySpark) to perform distributed processing on the Wikipedia corpus (54MB, 400,000+ entries), combined with the MAGPIE idiom corpus for NLP and semantic structure analysis.
Practicing NLP through Wikipedia.
ong-zijian
A practice project for Beautiful Soup, MechanicalSoup, and Selenium, which I learned online. To see some of them applied, here is an NLP Wikipedia project I built: https://github.com/ong-zijian/NLP_Wikipedia_QA
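The kind of scraping task these libraries handle can be sketched with the standard library alone. The snippet below uses `html.parser` as a stand-in for Beautiful Soup's `find_all('a')`; the HTML string is hard-coded for illustration, not fetched from Wikipedia, so this is only a minimal sketch of the idea.

```python
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    """Collect href values from <a> tags -- the sort of task
    Beautiful Soup's find_all('a') does in one call."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href":
                    self.links.append(value)

# Hard-coded page standing in for a fetched Wikipedia article.
page = '<html><body><a href="/wiki/NLP">NLP</a><a href="/wiki/Python">Python</a></body></html>'
parser = LinkExtractor()
parser.feed(page)
print(parser.links)  # ['/wiki/NLP', '/wiki/Python']
```

Beautiful Soup wraps this pattern (and much more, such as malformed-markup recovery) behind a friendlier API; MechanicalSoup adds form submission, and Selenium drives a real browser for JavaScript-rendered pages.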
A Python library and CLI tool designed to demonstrate core software engineering practices and NLP logic encapsulation. This project pipelines data from the Wikipedia API, processes it using TextBlob, and extracts noun phrases.
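The described pipeline (fetch text from the Wikipedia API, then extract noun phrases) can be sketched in plain Python. TextBlob itself is not used here; a naive capitalized-run regex stands in for its noun-phrase extractor, and the summary text is hard-coded rather than fetched, so this shows only the shape of the pipeline, not the project's actual code.

```python
import re

def extract_noun_phrase_candidates(text):
    """Very naive stand-in for TextBlob's noun_phrases property:
    grab runs of two or more capitalized words."""
    return re.findall(r"(?:[A-Z][a-z]+(?:\s+[A-Z][a-z]+)+)", text)

# Hard-coded summary standing in for a Wikipedia API response.
summary = ("Natural Language Processing is studied at Stanford University "
           "and applied across many industries.")

print(extract_noun_phrase_candidates(summary))
# ['Natural Language Processing', 'Stanford University']
```

In the real project, TextBlob's linguistically informed extractor would replace the regex heuristic, which misses lowercase noun phrases entirely.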