Change JSON encoding behavior #1070

Merged: 10 commits, Dec 3, 2024

Conversation

diptanu (Collaborator) commented Nov 28, 2024

The primary use case for JSON encoding is to support calling graphs with JSON data from other languages and to retrieve output from graphs as JSON. We were using jsonpickle, which adds a lot of unnecessary information to the JSON-encoded result and is mostly confusing to deserialize in other languages.
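
To make the motivation concrete, here is a small illustration of why Python-tagged JSON is awkward for other runtimes. The payloads are hand-written for illustration and not captured from a real run; the assumption is that jsonpickle tags objects with keys such as "py/object", and the Point name is hypothetical.

```python
import json

# Hand-written payloads for illustration only (assumption: jsonpickle tags
# objects with keys such as "py/object"; "Point" is a hypothetical class name).
tagged_payload = '{"py/object": "__main__.Point", "x": 1, "y": 2}'
plain_payload = json.dumps({"x": 1, "y": 2})

tagged = json.loads(tagged_payload)
plain = json.loads(plain_payload)

# A non-Python consumer has to know about the Python-specific keys and skip them.
python_only_keys = [key for key in tagged if key.startswith("py/")]

print(python_only_keys)  # ['py/object']
print(sorted(plain))     # ['x', 'y']
```

With plain JSON the consumer sees only the data fields, which is the behavior this PR moves to.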

This PR:

  1. Encodes inputs and outputs as plain JSON. Any object that can't be natively serialized with Python's json module won't work. We support native Python types and dictionaries as inputs and outputs when JSON encoding is used.
  2. Introduces separate input and output encoders, since in some cases we might want the input to be cloudpickle-encoded for its flexibility in serializing complex Python types, while the output needs to be JSON-encoded so that it can be read back from TypeScript, Java, or Rust runtimes.
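
A minimal sketch of the two points above, not the SDK's actual implementation. The ENCODERS registry, run, and unique_tags are hypothetical names, and the stdlib pickle module stands in for cloudpickle so the example runs without extra dependencies.

```python
import json
import pickle  # stdlib stand-in for cloudpickle in this sketch
from typing import Any, Callable, Dict, Tuple

Codec = Tuple[Callable[[Any], bytes], Callable[[bytes], Any]]

# Hypothetical registry mapping an encoder name to (encode, decode).
ENCODERS: Dict[str, Codec] = {
    "json": (
        lambda value: json.dumps(value).encode("utf-8"),
        lambda raw: json.loads(raw.decode("utf-8")),
    ),
    "pickle": (pickle.dumps, pickle.loads),
}


def run(fn: Callable[[Any], Any], raw_input: bytes,
        input_encoder: str, output_encoder: str) -> bytes:
    """Decode the input with one encoder and encode the result with another."""
    _, decode = ENCODERS[input_encoder]
    encode, _ = ENCODERS[output_encoder]
    return encode(fn(decode(raw_input)))


def unique_tags(tags: set) -> dict:
    return {"unique_tags": sorted(tags)}


# A set is not JSON-serializable, so plain JSON rejects it as an input.
try:
    json.dumps({"a", "b"})
    json_rejected_set = False
except TypeError:
    json_rejected_set = True

# Pickled input in, JSON output out.
raw = pickle.dumps({"b", "a"})
out = run(unique_tags, raw, input_encoder="pickle", output_encoder="json")

print(json_rejected_set)  # True
print(out)                # b'{"unique_tags": ["a", "b"]}'
```

The output bytes are ordinary JSON, so a TypeScript, Java, or Rust client can parse them without knowing anything about Python.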

Test

Tested the simple function with json encoder graph.


(ve) diptanuc@Diptanus-MacBook-Pro-2 python-sdk % curl -X 'POST' \
  'http://localhost:8900/namespaces/default/compute_graphs/test_simple_function_with_json_encoding/invoke_object' \
  -H 'accept: */*' \
  -H 'Content-Type: application/json' \
  -d '"foo"'
data: {"id":"859fd5ec72107565"}

(ve) diptanuc@Diptanus-MacBook-Pro-2 python-sdk % curl -X 'GET' \
  'http://localhost:8900/namespaces/default/compute_graphs/test_simple_function_with_json_encoding/invocations/859fd5ec72107565/fn/simple_function_with_json_encoder/output/c894d638be414f01' \
  -H 'accept: */*'
"foob"%
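
For reference, the same two calls can be sketched in Python with only the standard library. The URLs, headers, and IDs are taken from the transcript above; the helper names are hypothetical, and the requests are only built here, not sent, since sending them needs a server on localhost:8900.

```python
import json
import urllib.request

BASE = "http://localhost:8900"
GRAPH_URL = f"{BASE}/namespaces/default/compute_graphs/test_simple_function_with_json_encoding"


def build_invoke_request(payload) -> urllib.request.Request:
    """POST the JSON-encoded payload to the graph's invoke_object endpoint."""
    return urllib.request.Request(
        f"{GRAPH_URL}/invoke_object",
        data=json.dumps(payload).encode("utf-8"),
        headers={"accept": "*/*", "Content-Type": "application/json"},
        method="POST",
    )


def build_output_request(invocation_id: str, fn_name: str, output_id: str) -> urllib.request.Request:
    """GET one output of one function for a given invocation."""
    url = f"{GRAPH_URL}/invocations/{invocation_id}/fn/{fn_name}/output/{output_id}"
    return urllib.request.Request(url, headers={"accept": "*/*"}, method="GET")


def parse_event_line(line: str) -> dict:
    """Parse a 'data: {...}' line like the one printed by the invoke call."""
    prefix = "data: "
    if not line.startswith(prefix):
        raise ValueError(f"unexpected line: {line!r}")
    return json.loads(line[len(prefix):])


invoke = build_invoke_request("foo")
invocation_id = parse_event_line('data: {"id":"859fd5ec72107565"}')["id"]
fetch = build_output_request(invocation_id, "simple_function_with_json_encoder", "c894d638be414f01")

print(invoke.get_method(), invoke.full_url)
print(invoke.data)  # b'"foo"'
print(fetch.get_method(), fetch.full_url)
# urllib.request.urlopen(invoke) would actually send the request.
```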


diptanu (Collaborator, Author) commented Nov 28, 2024

Tests need to be fixed. Something broke after I removed a bunch of encoding-related calls that were spread all over the code. We shouldn't be serializing and deserializing data outside of the object encoders.

python-sdk/indexify/executor/agent.py (outdated)
@@ -954,6 +954,23 @@ files = [
{file = "sniffio-1.3.1.tar.gz", hash = "sha256:f4324edc670a0f49750a81b895f35c3adb843cca46f0530f79fc1babb23789dc"},
Contributor

We can run poetry remove jsonpickle.

diptanu force-pushed the change-json-encoding-behavior branch from c7246ba to b0ad30c on December 2, 2024 23:35
miguelhrocha (Collaborator) left a comment

I'm going to approve to unblock myself, but I can tackle the feedback retroactively

seriousben (Member) left a comment

Left some questions and low priority proposed changes.

Text("registration Error: ", style="red bold")
+ Text(f"failed to register: {e}", style="red")
)
logging.error(f"failed to register: {e}")
Member

Suggested change
- logging.error(f"failed to register: {e}")
+ logging.error("failed to register:", exception=str(e))

"failed to download input",
url=url,
reducer_url=reducer_url,
error=response.text,
Member

Suggested change
- error=response.text,
+ error=response.text,
+ exception=str(e),

logger.error(
"failed to download reducer output",
url=reducer_url,
error=response.text,
Member

Suggested change
- error=response.text,
+ error=response.text,
+ exception=str(e),

task_id=completed_task.task.id,
retries=completed_task.reporting_retries,
error=type(e).__name__,
message=str(e),
Member

Suggested change
- message=str(e),
+ exception=str(e),

task_id=completed_task.task.id,
retries=completed_task.reporting_retries,
status_code=response.status_code,
response_text=response.text,
Member

Suggested change
- response_text=response.text,
+ response_text=response.text,
+ exception=str(e),
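
The suggestions above all converge on logging the exception under one consistent exception= field. A minimal sketch of that pattern using only the standard library follows. The suggested calls pass keyword fields straight to the logger, which the stdlib logger does not accept, so this sketch assumes a small hypothetical format_event helper to render them. The report_failure name and the example URL are also illustrative.

```python
import logging

logger = logging.getLogger("executor_sketch")


def format_event(event: str, **fields) -> str:
    """Render an event plus key=value fields on one line (hypothetical helper)."""
    rendered = " ".join(f"{key}={value!r}" for key, value in fields.items())
    return f"{event} {rendered}".strip()


def report_failure(url: str, response_text: str, e: Exception) -> str:
    # Always use the same field name, exception=, for the stringified error.
    line = format_event(
        "failed to download input",
        url=url,
        error=response_text,
        exception=str(e),
    )
    logger.error(line)
    return line


try:
    raise ConnectionError("connection reset")
except ConnectionError as err:
    line = report_failure("http://localhost:8900/example", "", err)

print(line)
```

Keeping the field name identical across call sites makes the log lines easier to filter on.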


  @staticmethod
  def serialize_list(data: List[Any]) -> str:
-     return jsonpickle.encode(data)
+     return json.dumps(data)
Member

Should we wrap an exception into a value error here?
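
One way the question above could be answered, as a sketch only: json.dumps raises TypeError for values it cannot serialize, and the wrapper below converts that into a ValueError while keeping the original error as the cause. The error message wording is an assumption.

```python
import json
from typing import Any, List


def serialize_list(data: List[Any]) -> str:
    """Serialize a list to JSON, reporting unsupported values as ValueError."""
    try:
        return json.dumps(data)
    except TypeError as e:
        # Keep the original TypeError as __cause__ for debugging.
        raise ValueError(f"failed to serialize list as JSON: {e}") from e


print(serialize_list([1, "a", {"k": None}]))  # [1, "a", {"k": null}]

try:
    serialize_list([{1, 2}])  # a set is not JSON-serializable
except ValueError as err:
    print(type(err.__cause__).__name__)  # TypeError
```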


  @staticmethod
- def deserialize_list(data: str) -> List[Any]:
-     return jsonpickle.decode(data)
+ def deserialize_list(data: str, t: Type) -> List[Any]:
Member

Should we wrap an exception into a value error here?
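
A sketch for the deserializing side. Note that json.JSONDecodeError is already a subclass of ValueError, so wrapping mainly adds context to the message. The diff does not show how t is used, so this sketch assumes it is the expected element type and validates against it. That validation is an assumption, not the PR's behavior.

```python
import json
from typing import Any, List, Type


def deserialize_list(data: str, t: Type) -> List[Any]:
    """Parse a JSON list and check every item is an instance of t (assumed meaning of t)."""
    try:
        items = json.loads(data)
    except json.JSONDecodeError as e:
        raise ValueError(f"failed to deserialize JSON list: {e}") from e
    if not isinstance(items, list):
        raise ValueError(f"expected a JSON list, got {type(items).__name__}")
    for item in items:
        if not isinstance(item, t):
            raise ValueError(
                f"expected items of type {t.__name__}, got {type(item).__name__}"
            )
    return items


print(deserialize_list("[1, 2, 3]", int))  # [1, 2, 3]

for bad in ["not json", '{"a": 1}', '[1, "x"]']:
    try:
        deserialize_list(bad, int)
    except ValueError as err:
        print("rejected:", err)
```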

val: int


@indexify_function(accumulate=JsonSum, input_encoder="json")
Member

Can we support accumulate=int?

Collaborator Author

@seriousben I think that will work too, because an int is serializable with both JSON and cloudpickle.
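
A quick sketch of why an int accumulator should round-trip with either encoder. This does not exercise the SDK: fold is a hypothetical helper that encodes the running state between steps, stdlib pickle stands in for cloudpickle, and starting from int() (which is 0) is an assumption about how accumulate=int would be initialized.

```python
import json
import pickle  # stdlib stand-in for cloudpickle in this sketch
from typing import Any, Callable, Iterable


def fold(values: Iterable[int],
         encode: Callable[[Any], Any],
         decode: Callable[[Any], Any],
         init: Callable[[], int] = int) -> int:
    """Sum values while storing the accumulator in encoded form between steps."""
    state = encode(init())  # int() == 0 (assumed initial accumulator)
    for value in values:
        state = encode(decode(state) + value)
    return decode(state)


json_total = fold([1, 2, 3], json.dumps, json.loads)
pickle_total = fold([1, 2, 3], pickle.dumps, pickle.loads)

print(json_total, pickle_total)  # 6 6
```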

miguelhrocha merged commit 5776e0a into main on Dec 3, 2024
5 checks passed
miguelhrocha deleted the change-json-encoding-behavior branch on December 3, 2024 15:56
4 participants