Add option to process a single directory #29
Comments
Off the top of my head... two ways to go:
With the first approach, it will be tricky for the user to specify sub-directories. When specifying a sub-directory, is the path relative to the assignment folder, or relative to the caller's location? The latter is easier for the user, who can use tab completion. The former is easier for the program, since it does not have to do any tricky path resolution. Going back to the numbered list above, I think the second approach would be easier for the user, and not so difficult to implement. We could store a list of the files (or directories) processed in a file next to the config file.
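For what it's worth, resolving a path relative to the caller's location may be less tricky than it sounds. Here is a minimal sketch, assuming Python 3 and `pathlib`; the function name `resolve_submission_dir` and its parameters are hypothetical, not something that exists in the project.

```python
from pathlib import Path


def resolve_submission_dir(user_path, assignment_dir, cwd=None):
    """Resolve user_path relative to the caller's location and return it
    relative to assignment_dir. (Hypothetical helper, for illustration.)"""
    base = Path(cwd) if cwd is not None else Path.cwd()
    # An absolute user_path replaces base; a relative one is joined to it.
    candidate = (base / user_path).resolve()
    root = Path(assignment_dir).resolve()
    try:
        # relative_to raises ValueError when candidate is not under root.
        return candidate.relative_to(root)
    except ValueError:
        raise ValueError(f"{candidate} is not inside {root}")
```

The user gets tab completion from wherever they are, and the program still ends up with a path relative to the assignment folder.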
If we go with option 2, what format would we use for the file tracking the directories that have been processed? JSON?
And we probably need a way to unmark a directory from the processed list.
A submission must be reprocessed if any file to be processed was changed or added since the last time the submission was processed. For example, suppose we are to process f1 and f2, and Alice has submitted only f1. When Alice's submission is processed, f1 is processed. Later, Alice turns in f2. On the next run, we must reprocess Alice's whole submission: both f1 and f2. Similarly, if Alice turns in a new version of f1, we must reprocess Alice's entire submission.
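The rule can be sketched as a comparison of two mappings from file name to checksum. The function name `needs_reprocessing` and the dictionary shape are assumptions; how the checksums get computed is left open here.

```python
def needs_reprocessing(current, recorded):
    """Return True if any file in `current` is new or has a different
    checksum than in `recorded`. Both map file name -> checksum for one
    submission. (Hypothetical helper, for illustration.)"""
    for name, checksum in current.items():
        if recorded.get(name) != checksum:
            return True
    return False


# Alice's case: f1 was processed earlier, then f2 is turned in.
recorded = {"f1": "aaa"}
print(needs_reprocessing({"f1": "aaa"}, recorded))                # unchanged
print(needs_reprocessing({"f1": "aaa", "f2": "bbb"}, recorded))   # f2 added
print(needs_reprocessing({"f1": "ccc"}, recorded))                # f1 changed
```

When this returns True, the whole submission is reprocessed, not just the files that differ.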
For each file processed, save its path and hash to a "cache file" (better names welcome). Then, when asked to process files again, check this cache file to see which files have been modified. Below is a rough sketch of how to implement this in Python.

```python
# source: http://stackoverflow.com/questions/1912567/python-library-to-detect-if-a-file-has-changed-
import hashlib  # instead of md5
import pickle

# Load the saved (path, checksum) pairs; start empty if there is no cache yet.
try:
    with open("db", "rb") as f:
        l = pickle.load(f)
except IOError:
    l = []
db = dict(l)

path = "/etc/hosts"
# hexdigest() converts the hash to text
with open(path, "rb") as f:
    checksum = hashlib.md5(f.read()).hexdigest()

if db.get(path, None) != checksum:
    print("file changed")
    db[path] = checksum

# Save the updated pairs for the next run.
with open("db", "wb") as f:
    pickle.dump(list(db.items()), f)
```
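The same idea could be extended from one file to a whole submission directory. The sketch below is a variation, not the Stack Overflow code: it swaps in SHA-256 and a JSON cache instead of MD5 and pickle, and the names `hash_tree` and `changed_files` are hypothetical. It assumes the cache file lives outside the submission directory (for example, next to the config file), so the cache itself is not hashed.

```python
import hashlib
import json
from pathlib import Path


def hash_tree(directory):
    """Map each file's path (relative to directory) to its SHA-256 hex digest."""
    root = Path(directory)
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(directory, cache_path):
    """Return the files that are new or modified since the cache was last
    written, then overwrite the cache with the current hashes."""
    cache_path = Path(cache_path)
    old = json.loads(cache_path.read_text()) if cache_path.exists() else {}
    new = hash_tree(directory)
    changed = [name for name, digest in new.items() if old.get(name) != digest]
    cache_path.write_text(json.dumps(new, indent=2))
    return changed
```

A non-empty result would mean the submission has to be reprocessed. Deleted files are not reported by this sketch.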
There should be an option to process a single directory, rather than all subdirectories of the main directory.
The most common case will be to process all subdirectories, but sometimes a student's repository will be pulled late or have to be reprocessed after an error, and so will need to be processed individually, without reprocessing all the other directories.
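As a rough illustration of the requested option, here is a sketch using `argparse`. The flag name `--directory`, the positional `assignment_dir`, and the helper `directories_to_process` are all assumptions, not the project's actual interface.

```python
import argparse
from pathlib import Path


def build_parser():
    parser = argparse.ArgumentParser(description="Process student submissions.")
    parser.add_argument("assignment_dir", type=Path,
                        help="main directory containing one sub-directory per student")
    # Hypothetical flag: restrict processing to a single directory.
    parser.add_argument("-d", "--directory", type=Path, default=None,
                        help="process only this directory")
    return parser


def directories_to_process(args):
    """Return the single requested directory (as given), or every
    sub-directory of the main directory when no directory was requested."""
    if args.directory is not None:
        return [args.directory]
    return sorted(p for p in args.assignment_dir.iterdir() if p.is_dir())
```

The default keeps the common case (process everything) unchanged, and a late or failed repository can be handled by passing its directory explicitly.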