changes in worker and tools related to sdk filestorage #844

Merged: 12 commits, Nov 22, 2024
4 changes: 4 additions & 0 deletions tools/classifier/sample.env
@@ -6,6 +6,10 @@ TOOL_DATA_DIR=../data_dir
X2TEXT_HOST=http://unstract-x2text-service
X2TEXT_PORT=3004

# File System Configuration for Workflow Execution
# Directory path for execution data storage
# (e.g., bucket/execution/org_id/workflow_id/execution_id)
EXECUTION_DATA_DIR=<execution_dir_path_with_bucket>
# Storage provider for Workflow Execution (e.g., minio, S3)
WORKFLOW_EXECUTION_FS_PROVIDER="minio"
WORKFLOW_EXECUTION_FS_CREDENTIAL={"endpoint_url":"http://localhost:9000","key":"","secret":""}
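A minimal sketch of how a tool might read these three variables at startup. The diff does not show the consuming code, so `FsConfig` and `load_fs_config` are hypothetical names, and lowercasing the provider is an assumption made here only because the tool samples use `"minio"` while the worker sample uses `"MINIO"`.

```python
import json
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class FsConfig:
    """Hypothetical container for the file-storage settings in sample.env."""
    provider: str
    credential: dict
    execution_dir: Optional[str]


def load_fs_config(env: Mapping[str, str] = os.environ) -> FsConfig:
    """Read and validate the workflow-execution storage settings.

    Assumption: the provider name is compared case-insensitively, so it is
    normalised to lowercase here. The real SDK may behave differently.
    """
    provider = env.get("WORKFLOW_EXECUTION_FS_PROVIDER", "").strip().lower()
    if not provider:
        raise ValueError("WORKFLOW_EXECUTION_FS_PROVIDER is not set")

    # The credential is stored as a JSON object serialised into one variable.
    raw = env.get("WORKFLOW_EXECUTION_FS_CREDENTIAL", "{}")
    try:
        credential = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError("WORKFLOW_EXECUTION_FS_CREDENTIAL is not valid JSON") from exc
    if not isinstance(credential, dict):
        raise ValueError("WORKFLOW_EXECUTION_FS_CREDENTIAL must be a JSON object")

    return FsConfig(provider, credential, env.get("EXECUTION_DATA_DIR"))
```

Failing fast on malformed JSON keeps a bad credential string from surfacing later as an opaque storage error.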
2 changes: 1 addition & 1 deletion tools/classifier/src/config/properties.json
@@ -2,7 +2,7 @@
"schemaVersion": "0.0.1",
"displayName": "File Classifier",
"functionName": "classify",
-"toolVersion": "0.0.38",
+"toolVersion": "0.0.39",
"description": "Classifies a file into a bin based on its contents",
"input": {
"description": "File to be classified"
4 changes: 4 additions & 0 deletions tools/structure/sample.env
@@ -10,6 +10,10 @@ PROMPT_PORT=3003
X2TEXT_HOST=http://unstract-x2text-service
X2TEXT_PORT=3004

# File System Configuration for Workflow Execution
# Directory path for execution data storage
# (e.g., bucket/execution/org_id/workflow_id/execution_id)
EXECUTION_DATA_DIR=<execution_dir_path_with_bucket>
# Storage provider for Workflow Execution (e.g., minio, S3)
WORKFLOW_EXECUTION_FS_PROVIDER="minio"
WORKFLOW_EXECUTION_FS_CREDENTIAL={"endpoint_url":"http://localhost:9000","key":"","secret":""}
2 changes: 1 addition & 1 deletion tools/structure/src/config/properties.json
@@ -2,7 +2,7 @@
"schemaVersion": "0.0.1",
"displayName": "Structure Tool",
"functionName": "structure_tool",
-"toolVersion": "0.0.46",
+"toolVersion": "0.0.48",
"description": "This is a template tool which can answer set of input prompts designed in the Prompt Studio",
"input": {
"description": "File that needs to be indexed and parsed for answers"
7 changes: 6 additions & 1 deletion tools/structure/src/main.py
@@ -226,7 +226,12 @@ def run(
transform_dict,
)

-highlight_data = transform_dict(epilogue, tool_data_dir)
+if hasattr(self, "workflow_filestorage"):
+    highlight_data = transform_dict(
+        epilogue, tool_data_dir, self.workflow_filestorage
+    )
+else:
+    highlight_data = transform_dict(epilogue, tool_data_dir)
metadata[SettingsKeys.HIGHLIGHT_DATA] = highlight_data
metadata[SettingsKeys.CONFIDENCE_DATA] = get_confidence_data(
epilogue, tool_data_dir
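The `hasattr` guard in this hunk can be exercised in isolation. The sketch below is not the tool's actual code: `transform_dict` is a hypothetical stand-in (the real helper's body and signature are not shown in this diff), and `LegacyTool` and `StorageTool` are invented classes that mimic a tool instance without and with a `workflow_filestorage` attribute.

```python
from typing import Any, Optional


def transform_dict(epilogue: dict, tool_data_dir: str, fs: Optional[Any] = None) -> dict:
    """Hypothetical stand-in that only reports which call form was used."""
    return {"source": "filestorage" if fs is not None else "local", "dir": tool_data_dir}


class LegacyTool:
    """Mimics a tool instance that has no workflow_filestorage attribute."""

    def highlight(self, epilogue: dict, tool_data_dir: str) -> dict:
        # Same branching as the hunk: pass the storage handle only if present.
        if hasattr(self, "workflow_filestorage"):
            return transform_dict(epilogue, tool_data_dir, self.workflow_filestorage)
        return transform_dict(epilogue, tool_data_dir)


class StorageTool(LegacyTool):
    """Mimics a tool instance that exposes workflow_filestorage."""

    def __init__(self) -> None:
        self.workflow_filestorage = object()  # placeholder for a storage handle


legacy_result = LegacyTool().highlight({}, "/data")
storage_result = StorageTool().highlight({}, "/data")
```

In this sketch the first call takes the two-argument path and the second takes the three-argument path, so one `run` method can serve both kinds of instance.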
4 changes: 4 additions & 0 deletions tools/text_extractor/sample.env
@@ -7,6 +7,10 @@ TOOL_DATA_DIR=
X2TEXT_HOST=
X2TEXT_PORT=

# File System Configuration for Workflow Execution
# Directory path for execution data storage
# (e.g., bucket/execution/org_id/workflow_id/execution_id)
EXECUTION_DATA_DIR=<execution_dir_path_with_bucket>
# Storage provider for Workflow Execution (e.g., minio, S3)
WORKFLOW_EXECUTION_FS_PROVIDER="minio"
WORKFLOW_EXECUTION_FS_CREDENTIAL={"endpoint_url":"http://localhost:9000","key":"","secret":""}
2 changes: 1 addition & 1 deletion tools/text_extractor/src/config/properties.json
@@ -2,7 +2,7 @@
"schemaVersion": "0.0.1",
"displayName": "Text Extractor",
"functionName": "text_extractor",
-"toolVersion": "0.0.36",
+"toolVersion": "0.0.37",
"description": "The Text Extractor is a powerful tool designed to convert documents to its text form or Extract texts from documents",
"input": {
"description": "Document"
148 changes: 74 additions & 74 deletions worker/pdm.lock

Large diffs are not rendered by default.

14 changes: 14 additions & 0 deletions worker/sample.env
@@ -20,3 +20,17 @@ REMOVE_CONTAINER_ON_EXIT=True
CONTAINER_CLIENT_PATH=unstract.worker.clients.docker

EXECUTION_RUN_DATA_FOLDER_PREFIX="/app/workflow_data"

# Feature Flags
FLIPT_SERVICE_AVAILABLE=False
EVALUATION_SERVER_IP=unstract-flipt
EVALUATION_SERVER_PORT=9005
PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python

# File System Configuration for Workflow and API Execution
# Directory Prefixes for storing execution files
WORKFLOW_EXECUTION_DIR_PREFIX="/unstract/execution"
# Storage Provider for Workflow Execution
# Valid options: MINIO, S3, etc..
WORKFLOW_EXECUTION_FS_PROVIDER="MINIO"
WORKFLOW_EXECUTION_FS_CREDENTIAL='{"endpoint_url": "", "key": "", "secret": ""}'
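The tool samples describe the execution directory as `bucket/execution/org_id/workflow_id/execution_id`, and the worker sample supplies a prefix of `/unstract/execution`. How the worker actually joins these is not shown in this diff. The following is only a sketch under the assumption that the prefix is followed by the three IDs in that order, and `build_execution_dir` is a hypothetical helper.

```python
import posixpath


def build_execution_dir(prefix: str, org_id: str, workflow_id: str, execution_id: str) -> str:
    """Join the configured prefix with the per-execution IDs (assumed layout).

    Object-store keys use forward slashes on every platform, hence posixpath.
    """
    parts = [p.strip("/") for p in (org_id, workflow_id, execution_id)]
    if not all(parts):
        raise ValueError("org_id, workflow_id and execution_id must be non-empty")
    # Strip a trailing slash so the result never contains a doubled separator.
    return posixpath.join(prefix.rstrip("/"), *parts)
```

Stripping slashes from each ID matters because `posixpath.join` discards everything before a component that starts with `/`.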