ImageCat is an Apache OODT RADIX application that uses Apache Solr, Apache Tika, and Apache OODT to ingest tens of millions of files in place (images, but it could be extended to other file types), and to extract metadata and OCR information from those files using Tika and Tesseract OCR.
Stars: 95
Forks: 40
Watchers: 95
Open Issues: 0
Add clean image names script; and add-captions script. (74c11dd)
Automatically generate the roxyimages file so a user doesn't have to via the entrypoint. (58db632)
Fix for #44 this closes #44 pysolr not in container, and also add OODT_HOME/bin to PATH and also make sure env vars are set in bash profile. (834291e)
fix query used to only find records without a sha1sum and add it. change slice index to 0 always since we are keeping a stack. (7e96202)
add newline at end of compse. Add script for computing and add sha1sum. (80dcb0a)
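
The most recent commits mention computing a sha1sum for records that lack one. As a rough illustration of that step only, here is a minimal Python sketch of hashing a file in fixed-size chunks. The function name `file_sha1` and the `chunk_size` parameter are hypothetical and are not taken from the repository's script, which may work differently.

```python
import hashlib


def file_sha1(path, chunk_size=1 << 20):
    """Return the hex SHA-1 digest of the file at `path`.

    Hypothetical helper for illustration; not the repository's own code.
    """
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so a large image is never loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Reading in chunks keeps memory use flat regardless of file size, which matters when the same routine runs over millions of files.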