Uses Databricks and Apache Spark (PySpark) to perform distributed processing on the Wikipedia corpus (54 MB, 400,000+ entries), combined with the MAGPIE idiom corpus for NLP and semantic-structure analysis.
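A minimal PySpark sketch of the pipeline described above, assuming the Wikipedia dump and the MAGPIE corpus are both available as JSON Lines files. The file paths and the `text`, `title`, and `idiom` field names are illustrative assumptions, not the repository's actual schema.

```python
from pyspark.sql import SparkSession, functions as F

# On Databricks a SparkSession named `spark` already exists; this line
# makes the sketch runnable standalone as well.
spark = SparkSession.builder.appName("wiki-magpie-sketch").getOrCreate()

# Wikipedia corpus: assumed JSON Lines with "title" and "text" fields.
wiki = spark.read.json("/dbfs/data/wikipedia_corpus.jsonl")   # hypothetical path

# MAGPIE idiom corpus: assumed JSON Lines with an "idiom" field.
magpie = spark.read.json("/dbfs/data/magpie_corpus.jsonl")    # hypothetical path

# The set of distinct idioms is tiny next to Wikipedia, so it fits in a
# small one-column DataFrame that Spark can broadcast to every executor.
idiom_df = magpie.select("idiom").distinct()

# Cross-join each Wikipedia entry with the broadcast idiom list and keep
# rows whose article text contains the idiom; the filter runs in parallel
# across the cluster, which is the point of using Spark here.
matches = (
    wiki.crossJoin(F.broadcast(idiom_df))
        .where(F.col("text").contains(F.col("idiom")))
        .groupBy("idiom")
        .count()
        .orderBy(F.desc("count"))
)

matches.show(20, truncate=False)
```

Broadcasting the idiom list keeps the expensive side of the join local to the large Wikipedia DataFrame; at 54 MB the corpus would also fit on a single machine, but the same code scales unchanged to a full dump.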
Stars: 0 · Forks: 0 · Watchers: 0 · Open issues: 0
Commits: 12