Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

REVAI-3855: Update python sdk to support Super API #105

Merged
merged 33 commits into from
Dec 21, 2023

Conversation

kirillatrev
Copy link
Collaborator

REVAI-3855: Update python sdk to support Super API
https://revinc.atlassian.net/browse/REVAI-3855

@kirillatrev kirillatrev marked this pull request as ready for review December 20, 2023 20:44
@kirillatrev kirillatrev requested a review from a team as a code owner December 20, 2023 20:44
@kirillatrev
Copy link
Collaborator Author

Tests are coming shortly


response = self._make_http_request(
"GET",
urljoin(self.base_url, 'jobs/{}/transcript/summary'.format(id_)),
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any way to extract 'jobs/{}/transcript/summary' to be reusable?

Comment on lines +32 to +34
speakers_count=None,
summarization: Summarization = None,
translation: Translation = None):
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
speakers_count=None,
summarization: Summarization = None,
translation: Translation = None):
speakers_count=None,
summarization: Summarization=None,
translation: Translation=None):

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as bellow:

I follow PEP8, which is commonly accepted standard for python formatting, PyCharm supports PEP 8 and suggests this formatting here. I Used autoformat everywhere where reasonable.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is less matter what is the autoformatting when all the remaining code will be formatted differently.

at the same time, as per PEP8

Don’t use spaces around the = sign when used to indicate a keyword argument, or when used to indicate a default value for an unannotated function parameter:
# Correct:
def complex(real, imag=0.0):
    return magic(r=real, i=imag)
# Wrong:
def complex(real, imag = 0.0):
    return magic(r = real, i = imag)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussed on Zoom. Agreed that formatting of arguments with type annotation is different according to the standard.

Comment on lines 9 to 14
def __init__(
self,
prompt: str = None,
model: NlpModel = None,
formattingType: SummarizationFormattingOptions = None
):
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this formatting correct?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used auto formatter as suggested with linter, no complaints.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I follow PEP8, which is commonly accepted standard for python formatting, PyCharm supports PEP 8 and suggests this formatting here. I Used autoformat everywhere where reasonable.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would split the unit tests up into simpler test cases. They are pretty hefty, each testing three or four response methods along with the request method. I would maybe split each test case into

  1. request_local_file - assert request with proper params
  2. request_source_url - assert request with proper params
  3. get_transcript_summary_text
  4. get_transcript_summary_json
  5. get_transcript_summary_streamed_json
  6. get_translation...
    ...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This used to be a real integration test before, but as of now I agree that it makes more sense to split those tests. Thanks!

Copy link

@k-weng k-weng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good so far - had some comments regarding simplifying the unit tests and the need for the audio files and transcript file since they can just be mocked in code.

In addition, it would nice to add an example to https://github.com/revdotcom/revai-python-sdk/tree/develop/examples and document some usage in the README - like we have for Human Transcription - https://github.com/revdotcom/revai-python-sdk?tab=readme-ov-file#human-transcription

@k-weng
Copy link

k-weng commented Dec 20, 2023

Can you also bump version to 2.19.0 and add an entry to https://github.com/revdotcom/revai-python-sdk/blob/develop/HISTORY.rst

@k-weng k-weng requested a review from dmtrrk December 21, 2023 18:16
@@ -66,7 +70,9 @@ def submit_job_url(
notification_config=None,
skip_postprocessing=False,
remove_atmospherics=False,
speakers_count=None):
speakers_count=None,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it looks like we are missing to pass these parameters to _create_job_options_payload

remove_atmospherics=False,
speakers_count=None,

@amikofalvy do you know whether it is intentional?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dmtrrk We just haven't supported them in SDKs yet - there's an open PR for it -#104


response = self._make_http_request(
"GET",
urljoin(self.base_url, 'jobs/{0}/captions/translation/{2}{1}'.format(id_, query, language)),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why don't you pass params as format(id_, language, query)?
it is a bit difficult to read the code with the unnecessary {2}{1}

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea was to avoid adding changes to format arguments order, but at this point it is better to change it, Thanks!

Comment on lines +32 to +34
speakers_count=None,
summarization: Summarization = None,
translation: Translation = None):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is less matter what is the autoformatting when all the remaining code will be formatted differently.

at the same time, as per PEP8

Don’t use spaces around the = sign when used to indicate a keyword argument, or when used to indicate a default value for an unannotated function parameter:
# Correct:
def complex(real, imag=0.0):
    return magic(r=real, i=imag)
# Wrong:
def complex(real, imag = 0.0):
    return magic(r = real, i = imag)

from .summarization_job_status import SummarizationJobStatus
from ..nlp_model import NlpModel

"""Summarization request options."""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be inside the class



class SummarizationOptions:
def __init__(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

based on other files, it looks like the trailing ): should be on the same line with the last argument.

nit, but this is a public code. also, there should be a comment

return dict_result


"""Summarization options."""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure what kind of syntax is this. I believe this should be inside the class

@@ -0,0 +1,20 @@
from typing import List, Dict

"""Transcript summary model."""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should be either before imports, or inside the class

Imports are always put at the top of the file, just after any module comments and docstrings, and before module globals and constants.

so the file might look like this:

"""Module description here"""

import xyz

class MyClass:
    """This is my class"""
    pass

class MyClass2:
    """This is my class 2"""
    pass

model: NlpModel = None,
status: SummarizationJobStatus = None,
failure: str = None

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove line

def __init__(
self,
language: str = None,
model: NlpModel = None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure whether you discussed it already., I see that some code is written in py2.x style without the type annotations, other - with..

while strict typing is good, wondering how can we be more consistent in the publicly available code

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed to encourage incremental improvements going forward.

"POST",
JOBS_URL,
json={
'metadata': "python sdk SuperApi test",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure SuperApi should be here

@kirillatrev
Copy link
Collaborator Author

Examples to follow in another PR once this update is published

@kirillatrev kirillatrev requested review from dmtrrk and k-weng December 21, 2023 19:50
Copy link

@k-weng k-weng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - I'll let @dmtrrk provide the final approval

@kirillatrev kirillatrev merged commit cfac469 into develop Dec 21, 2023
6 checks passed
@kirillatrev kirillatrev deleted the feature/REVAI-3855 branch December 21, 2023 20:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants