-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add get_query_history #62
Conversation
746fbb1
to
56d072b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks great.
Two small suggestions: One is using list comprehensions for generating the job ids, and the second (which may invalidate the first one):
I think it might make sense for the get_query_history
method to return a list of Job Objects, where each job is something like:
@dataclass
class Job:
job_id: str
phase: str
owner_id: str
creation_time: datetime
run_id: Optional[str] = None
quote: Optional[datetime] = None
start_time: Optional[datetime] = None
end_time: Optional[datetime] = None
execution_duration: Optional[int] = None
destruction: Optional[datetime] = None
lang: Optional[str] = None
query_text: Optional[str] = None
error_message: Optional[str] = None
def __str__(self) -> str:
"""A nice representation for the user, don't have to include all fields, just some core ones like: job_id, creation_time, phase, query_text"""
...
def __repr__(self) -> str:
return self.__str__()
Otherwise just the job id's might not be very useful to them on their own.
src/lsst/rsp/catalog.py
Outdated
for job in joblist: | ||
for k, v in job.items(): | ||
if k == "@id": | ||
jobs.append(v) | ||
break |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps slightly more efficient to do a list comprehension here with a direct lookup for each job (untested):
for job in joblist: | |
for k, v in job.items(): | |
if k == "@id": | |
jobs.append(v) | |
break | |
[job["@id"] for job in joblist if "@id" in job] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm. Getting all the information out of the jobs up front would certainly make the extension easier (since right now there's a separate step to get the query text), but on the other hand, it would mean we'd need an API call per job, rather than a single one to get all jobs.
EDIT is that true? I should look closely at what the XML actually is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that's the problem: a single entry is:
<uws:jobref id="idw8x0xiyp7hcs24">
<uws:phase>COMPLETED</uws:phase>
<uws:runId>dp02_dc2_catalogs.Object - data-dev</uws:runId>
<uws:ownerId>adam</uws:ownerId>
<uws:creationTime>2025-01-21T19:41:32.860Z</uws:creationTime>
</uws:jobref>
So I could get ... well, creation time and phase are somewhat useful but not to my immediate use-case. Owner ID is useless because it's the same as the owner of the token we used to make the request, and runID isn't all that interesting either.
I want to avoid doing an API call per jobref ID here, and I go to some lengths to cache it when I need to do that in the back-end service in the extension.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess one option is that some of those fields that are only accessible after an extra call could be properties that will call out to the individual call when accessed by the user?
Also note that we may end up actually showing the query text in the job list. This is being discussed within the IVOA, I brought it up in that meeting and most people were happy with it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also runId
may end up being useful, I think Firefly folks have started using it and they note the table that was accessed & the environment. But also my understanding is that the user will be able to modify this, so it may be something useful for them.
Overall I think happy with you to go with whatever you think works best for now and we can revisit later after getting some user feedback?
No description provided.