Random Pagination ISSUE-143 #147
base: revamp
Conversation
```diff
@@ -102,7 +102,7 @@ async def jobs(
         "endDate": end_date.__str__(),
         "results": results_df.to_dict("records"),
         "total": jobsCount,
-        "offset": offset + size,
+        "offset": offset + size if size else 0,
```
In some APIs (and you added some more with this PR), you're protecting against having `offset` set to a value without `size`. That's not the case here, and this expression isn't considering the value of `offset`: if `size` is set, we return `offset + size` (regardless of `offset`), or else 0.

Is that what you meant? Should you be raising the `HTTPException` here if `offset and not size`?
```python
>>> offset = 10
>>> size = None
>>> offset + size if size else 0
0
>>> offset + (size if size else 0)
10
```
If `not offset and not size`, the size is set to 10,000, and on the subsequent fetch the offset increases to 10,000, which causes Elasticsearch to throw an error. So it should be `(offset + size) if size != 10000 else 0`.
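A quick REPL check of that proposed expression (a sketch only; 10,000 here is Elasticsearch's default `index.max_result_window`, past which a plain from/size query is rejected):

```python
>>> offset, size = 0, 10000        # first fetch with the default page size
>>> (offset + size) if size != 10000 else 0
0
>>> offset, size = 0, 25           # a caller-specified size passes through
>>> (offset + size) if size != 10000 else 0
25
```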
Which is interesting, because if there are more than 10,000 matches, Opensearch ought to allow getting the rest of them. But more practically, in terms of your UI and API, a "page" of 10,000 is ridiculous (that's basically the problem you're solving with `revamp`), so I'm not sure the border case is interesting.

I think maybe what bothers me the most about the 10000 is that it's an arbitrary and unexplained "magic number" that shows up repeatedly as a literal in the code. Perhaps it should instead be a constant (like `MAX_PAGE`) that can be commented somewhere.

I also wonder about the `!= MAX_PAGE` logic, since someone could try `?size=10001` and trigger the same issue. Or even `?size=999` would result (even with 2 hits) in returning a "next offset" of 10001. So maybe it should be `if size + offset < MAX_PAGE`? But that sort of logic bothers me from a usability angle -- it's not clear to the caller why their pagination is being arbitrarily "reset". An alternate solution would return the "true" offset, but have your "normalization" logic explicitly reject an offset >= 10000.
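For illustration, a minimal sketch (the helper name `next_offset` is mine, not from the PR) of why `!= MAX_PAGE` alone is porous:

```python
MAX_PAGE = 10000

def next_offset(offset: int, size: int) -> int:
    # The proposed guard only special-cases size == MAX_PAGE
    return (offset + size) if size != MAX_PAGE else 0

print(next_offset(0, 10001))   # 10001: slips past the check, exceeds the window
print(next_offset(9500, 999))  # 10499: accumulated paging exceeds it too
```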
Encapsulating the logic to test/normalize the incoming `offset` and `size` values would be good, too, for consistency (and clearer documentation). Maybe something like
```python
from typing import Optional

from fastapi import HTTPException  # assuming FastAPI's HTTPException

MAX_PAGE = 10000  # Elasticsearch's default index.max_result_window

def normalize_pagination(offset: Optional[int], size: Optional[int]) -> tuple[int, int]:
    if offset and not size:
        raise HTTPException(400, f"offset {offset} specified without size")
    elif not offset and not size:
        size = MAX_PAGE
        offset = 0
    elif not offset:
        offset = 0
    elif offset >= MAX_PAGE:
        raise HTTPException(400, f"offset {offset} is too big (>= {MAX_PAGE})")
    return offset, size
```
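A hypothetical call site (a sketch only, reusing the names from the diff above; the real handler wiring may differ):

```python
offset, size = normalize_pagination(offset, size)
# ... run the Opensearch query with the normalized values ...
response = {
    "results": results_df.to_dict("records"),
    "total": jobsCount,
    "offset": offset + size,  # next offset; the inputs were validated above
}
```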
My concerns match Dave's.
Random Pagination not working

Type of change

Description

Pagination on clicking `first` and `last` buttons not working.

Related Tickets & Documents

Checklist before requesting a review

Testing