/listFinishedTasks unclear how to use or what the benefit is #101

Open
lionel-rowe opened this issue Feb 21, 2024 · 0 comments

Per JavaScript/ocrsdk.js:

// Note: if your application queues several files and waits for them
// it's recommended that you use listFinishedTasks instead (which is described
// at https://ocrsdk.com/documentation/apireference/listFinishedTasks/).

It looks like the only sample that actually implements this recommendation is Java/Abbyy.Ocrsdk.client/srcProcessManyFiles.java:

private static void waitAndDownloadResults(Map<String, String> taskIds) throws Exception {
    // Call listFinishedTasks while there are any not completed tasks from taskIds
    // Please note: API call 'listFinishedTasks' returns maximum 100 tasks
    // So, to get all our tasks we need to delete tasks on server. Avoid running
    // parallel programs that are performing recognition with the same Application ID

Given that at most 100 results are returned, sorted by date ascending (?!), with no way of paginating, each completed task has to be deleted manually once it has been downloaded. This seems very flaky: if the program crashes before completion and leaves several tasks un-downloaded, those tasks will presumably stay on the list of returned tasks indefinitely. Once the number of such zombie tasks reaches 100, calls to /listFinishedTasks will never return any relevant tasks, so the program will keep polling the endpoint until it is manually terminated.
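To make the failure mode concrete, here is a small simulation. It does not call the real service: `makeFakeServer`, `finish`, `deleteTask` and `visibleCount` are hypothetical in-memory stand-ins, and the oldest-first ordering and the lack of paging are my assumptions about how the endpoint behaves, based on the sample comment above.

```javascript
// Hypothetical in-memory stand-in for the server; NOT the real OCR SDK API.
const PAGE_LIMIT = 100; // limit taken from the comment in the Java sample

function makeFakeServer() {
  const finished = []; // finished task IDs, oldest first (assumed ordering)
  return {
    finish(taskId) {
      finished.push(taskId);
    },
    deleteTask(taskId) {
      const i = finished.indexOf(taskId);
      if (i !== -1) finished.splice(i, 1);
    },
    // Returns at most PAGE_LIMIT finished tasks, oldest first, with no paging.
    listFinishedTasks() {
      return finished.slice(0, PAGE_LIMIT);
    },
  };
}

// Counts how many of our own tasks show up in a single listing.
function visibleCount(server, ourTaskIds) {
  const listed = new Set(server.listFinishedTasks());
  return ourTaskIds.filter((id) => listed.has(id)).length;
}

const server = makeFakeServer();

// 100 finished tasks left behind by earlier runs that crashed before cleanup.
for (let i = 0; i < 100; i++) server.finish(`zombie-${i}`);

// Three tasks from the current run finish afterwards.
const ours = ['a', 'b', 'c'];
ours.forEach((id) => server.finish(id));

// The listing is full of zombies, so none of our tasks are ever returned.
const starved = visibleCount(server, ours);

// Only deleting stale tasks frees up slots in the listing.
for (let i = 0; i < 3; i++) server.deleteTask(`zombie-${i}`);
const afterCleanup = visibleCount(server, ours);

console.log({ starved, afterCleanup });
```

Under those assumptions, a client that only looks for its own task IDs in the listing would never see them until something else deletes the stale entries.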

Also, is there any real drawback to just polling /getTaskStatus for each ongoing task? It's not clear what the benefit of polling /listFinishedTasks is, given the increased complexity and flakiness. Presumably calls to /getTaskStatus are cheap, as they don't send or return much data or do any processing, just check a status.
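For comparison, this is roughly what I mean by polling each task individually. It is only a sketch: `getTaskStatus` here is an injected, hypothetical function returning a status string, the `'Completed'` and `'InProgress'` values are assumed, and `makeFakeStatus` is a fake used purely to exercise the loop. The attempt cap means it cannot poll forever.

```javascript
// Polls every pending task until it completes or the attempt budget runs out.
// `getTaskStatus` is a hypothetical injected function: (taskId) => Promise<string>.
async function pollEachTask(taskIds, getTaskStatus, { intervalMs = 2000, maxAttempts = 30 } = {}) {
  const pending = new Set(taskIds);
  const completed = [];
  for (let attempt = 0; attempt < maxAttempts && pending.size > 0; attempt++) {
    // Wait between rounds, but not before the first one.
    if (attempt > 0) await new Promise((resolve) => setTimeout(resolve, intervalMs));
    for (const id of [...pending]) {
      const status = await getTaskStatus(id);
      if (status === 'Completed') { // assumed status value
        pending.delete(id);
        completed.push(id);
      }
    }
  }
  // Tasks still pending after maxAttempts rounds are reported, not retried forever.
  return { completed, timedOut: [...pending] };
}

// Fake status source for demonstration: task `id` reports 'Completed'
// from its readyAfter[id]-th status check onwards.
function makeFakeStatus(readyAfter) {
  const calls = new Map();
  return async (id) => {
    const n = (calls.get(id) || 0) + 1;
    calls.set(id, n);
    return n >= readyAfter[id] ? 'Completed' : 'InProgress';
  };
}

pollEachTask(['x', 'y'], makeFakeStatus({ x: 1, y: 2 }), { intervalMs: 10 })
  .then((result) => console.log(result));
```

Each task is tracked by its own ID, so leftover tasks from other runs cannot crowd out the ones this run cares about, and nothing needs to be deleted for the loop to make progress.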
