Add option to process a single directory #29

kwurst · 2014-06-13T13:42:37Z

There should be an option to process a single directory, rather than all subdirectories of the main directory.
The most common case will be to process all subdirectories, but sometimes a student's repository will be pulled late or have to be reprocessed after an error, and so will need to be processed individually, without reprocessing all the other directories.

StoneyJackson · 2014-06-13T14:31:12Z

Off the top of my head... two ways to go:

1. Allow sub-directories to be specified on the command line: convert path/to/config.json subdir1 subdir2
2. Track which directories have been processed with the converter, and only process sub-directories that have not yet been processed. Provide a command-line switch to re-process all: convert -A path/to/config.json

With the first approach, specifying sub-directories will be tricky for the user. Is a sub-directory path relative to the assignment folder, or relative to the caller's location? The latter is easier for the user, since s/he can use tab completion. The former is easier for the program, since it does not have to do any tricky path resolution.
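
For what it's worth, the caller-relative resolution may be less tricky than it sounds. Here is a minimal sketch, assuming Python 3 and hypothetical names (`resolve_subdir`, `assignment_dir`, `cwd`) that are not part of the converter: it interprets the argument relative to the caller's location and rejects anything that does not end up inside the assignment folder.

```python
from pathlib import Path


def resolve_subdir(arg, assignment_dir, cwd=None):
    """Resolve a command-line argument to a directory inside assignment_dir.

    Hypothetical helper: the argument is interpreted relative to the
    caller's location (cwd), which is what shell tab completion produces.
    """
    base = Path(cwd) if cwd is not None else Path.cwd()
    # An absolute argument replaces base; resolve() normalizes ".." and symlinks.
    candidate = (base / arg).resolve()
    root = Path(assignment_dir).resolve()
    # Reject anything that does not live under the assignment folder.
    if root not in candidate.parents:
        raise ValueError(f"{arg!r} is not inside {root}")
    return candidate
```

The trade-off is that the program has to know the assignment folder before it can validate the arguments, so the config file would need to be read first.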

Going back to the numbered list above, I think the second approach would be easier for the user, and not so difficult to implement. We could store a list of the files (or directories) processed in a file next to the config file.

kwurst · 2014-06-13T18:28:35Z

If we go with option 2, what format would we use for the file tracking the directories that have been processed? JSON?
And I think it only needs to be a list of directories processed. A directory should not be processed if it is missing any of the required files.
We could have a flag that would force processing of only the files that exist, for students who have not turned in all the required files, so we could produce a PDF of the existing files so that there is something to grade...
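
A rough sketch of that rule, assuming Python 3; the function name `files_to_process` and the `force` parameter are hypothetical and only illustrate the idea:

```python
import os


def files_to_process(directory, required_files, force=False):
    """Return the required files to process in directory, or [] to skip it.

    Hypothetical helper: a directory is skipped when any required file is
    missing, unless force is set, in which case the files that do exist
    are returned so there is something to grade.
    """
    present = [name for name in required_files
               if os.path.isfile(os.path.join(directory, name))]
    complete = len(present) == len(required_files)
    if complete or force:
        return present
    return []
```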

kwurst · 2014-06-13T18:29:36Z

And we probably need a way to unmark a directory from the processed list.
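
If the tracking file is JSON, marking and unmarking could look something like the sketch below. It assumes Python 3 and a file holding a plain JSON list of directory names; the function names and the `tracking_path` parameter are made up for illustration.

```python
import json
import os


def load_processed(tracking_path):
    """Return the set of processed directories (empty if no file exists yet)."""
    if not os.path.exists(tracking_path):
        return set()
    with open(tracking_path) as f:
        return set(json.load(f))


def save_processed(tracking_path, processed):
    # Sort so the file is stable and easy to read or diff.
    with open(tracking_path, "w") as f:
        json.dump(sorted(processed), f, indent=2)


def mark(tracking_path, directory):
    """Record directory as processed."""
    processed = load_processed(tracking_path)
    processed.add(directory)
    save_processed(tracking_path, processed)


def unmark(tracking_path, directory):
    """Remove directory from the processed list so it is picked up again."""
    processed = load_processed(tracking_path)
    processed.discard(directory)  # no error if it was never marked
    save_processed(tracking_path, processed)
```

Because the file is a plain list, a directory could also be unmarked by deleting its line in a text editor.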

StoneyJackson · 2014-06-13T20:13:27Z

A submission must be reprocessed if any file to be processed was changed or added since the last time the submission was processed.

For example, suppose Alice submits one file f1. Now suppose we are to process f1 and f2. When Alice's submission is processed, f1 is processed. Later, Alice turns in f2. Rerunning, we now must reprocess Alice's submission: both f1 and f2. Similarly if Alice turns in a new version of f1, we must reprocess Alice's entire submission.
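
The rule can be sketched as a small check, assuming Python 3 and that each submission is summarized as a dict mapping file name to some content fingerprint (a hypothetical representation; how the fingerprint is computed is left open here):

```python
def needs_reprocessing(previous, current):
    """Return True if any file in current was added or changed.

    previous and current are hypothetical dicts mapping a file name to a
    content fingerprint, taken at the last run and now respectively.
    """
    for name, fingerprint in current.items():
        # A missing entry means the file was added; a different value means it changed.
        if previous.get(name) != fingerprint:
            return True
    return False
```

With Alice's example, `needs_reprocessing({"f1": "a"}, {"f1": "a", "f2": "b"})` is true because f2 was added, and the whole submission is reprocessed. Note that this sketch ignores files that were removed, since the rule above only mentions changed or added files.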

StoneyJackson · 2014-07-13T01:45:50Z

For each file processed, save its path and hash to a "cache file" (better names welcome). Then when asked to process files again, check this cache file to see which files have been modified.

Below is a rough sketch of how to implement this in Python.

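    # source: http://stackoverflow.com/questions/1912567/python-library-to-detect-if-a-file-has-changed-
    import hashlib  # replaces the deprecated md5 module
    import pickle

    # Load the previously saved (path, checksum) pairs, if any.
    try:
        with open("db", "rb") as f:
            entries = pickle.load(f)
    except IOError:
        entries = []
    db = dict(entries)
    path = "/etc/hosts"
    # Read the file as bytes; hexdigest() converts the hash to text.
    with open(path, "rb") as f:
        checksum = hashlib.md5(f.read()).hexdigest()
    if db.get(path) != checksum:
        print("file changed")
        db[path] = checksum
    # Save the updated pairs for the next run.
    with open("db", "wb") as f:
        pickle.dump(list(db.items()), f)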
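
The sketch above checks a single hard-coded path. Extending it to every file in a submission might look like the following, assuming Python 3; `file_digest` and `changed_files` are hypothetical names, SHA-256 is swapped in for MD5 as a personal preference, and the cache is a plain dict that the caller loads and saves however it likes.

```python
import hashlib


def file_digest(path, chunk_size=65536):
    """Hash a file in chunks so large files are not read into memory at once."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def changed_files(paths, cache):
    """Return the paths whose digest differs from cache, updating cache in place.

    cache is a dict mapping path to the digest recorded at the last run.
    """
    changed = []
    for path in paths:
        digest = file_digest(path)
        if cache.get(path) != digest:
            changed.append(path)
            cache[path] = digest
    return changed
```

A non-empty result for any file in a submission would mean the whole submission gets reprocessed.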

kwurst added the enhancement label Jun 13, 2014

kwurst mentioned this issue Jun 13, 2014

Feature/issue28 #33

Merged
