Clean, process, and format the WikiHow dataset (https://github.com/mahnazkoupaee/WikiHow-Dataset) to work with the pointer-generator network (PyTorch implementation: https://github.com/atulkum/pointer_summarizer).
18 commits
4a58a61 Added titles.txt for reference to all article titles after calling process.py
47f2355 Added modified version of abisee's make_datafiles.py to tokenize and format WikiHow articles to feed into the pointer-generator network
8a12cab Added optional script to split the titles file into train, test and validation files; the pre-split files provided can be used instead for consistent comparisons
72c2da1 Script to generate article files with hashed titles from the cleaned WikiHow CSV dataset
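
The commit messages above mention two steps: naming article files by a hash of the title, and splitting the list of titles into train, test and validation sets. The following is a minimal sketch of how those two steps might look, not the repository's actual code. The function names hash_title and split_titles, the use of SHA-1, the 10% split fractions, and the fixed seed are all assumptions made for illustration.

```python
import hashlib
import random


def hash_title(title: str) -> str:
    """Return a hex SHA-1 digest of the title, usable as a filename stem.

    SHA-1 is an assumption here; the repository may use a different hash.
    """
    return hashlib.sha1(title.encode("utf-8")).hexdigest()


def split_titles(titles, val_frac=0.1, test_frac=0.1, seed=0):
    """Shuffle titles deterministically and split into (train, val, test).

    The fractions and the seed are hypothetical defaults.
    """
    shuffled = list(titles)
    # A seeded Random instance keeps the split reproducible across runs.
    random.Random(seed).shuffle(shuffled)
    n_val = int(len(shuffled) * val_frac)
    n_test = int(len(shuffled) * test_frac)
    val = shuffled[:n_val]
    test = shuffled[n_val:n_val + n_test]
    train = shuffled[n_val + n_test:]
    return train, val, test
```

A fixed seed serves the same purpose as the pre-split files mentioned in the commit log: every run produces the same partition, so results stay comparable.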