Stars: 21
Forks: 14
Watchers: 21
Open Issues: 5
9f9fe04 Merge pull request #7 from AlpineNow/jt-update-spark-csv-version
c6ac565 DEV-14701 Updating the Spark CSV Version to 1.3, and marking that as provided, along with the spark-avro jar.
12061ba DEV-13397 Handle cases where training/validation datasets end up being empty by throwing proper exceptions.
c064fcc DEV-12802 An addition of Gradient Boosting. Additionally, a massive refactoring of the previous Random Forest code. Removed sbt in favor of pom.xml. GBT and RF share the decision tree codebase.
c0d18f7 Changing groupby to reducebykey in the distributed node splits.
c4088a2 Removing some horrible inefficiencies from the equal frequency discretizer.
2bf8672 Improving the performance of the node Id RDD through constant re-caching.
f63ff0f Adding a cache to remember the heaviest child node when predicting.
685d8b2 Fixing string split problems when there are trailing empty columns in rows.
1410f28 Changing the default delimiter to be tab for transformer runners.