Multi-rank MLServer

Requirements

pip install mlserver grpcio grpcio-health-checking grpcio-tools

1. Start xft grpc server

Follow the xft grpc server instructions to start a multi-rank xft grpc server: launch it with mpirun using the scripts in grpc_launcher.
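
Before moving on, it can help to confirm that something is actually listening on the address the grpc server was started on. The following is a minimal sketch using only the Python standard library; the helper name `is_port_open` is hypothetical, and it performs a plain TCP connection test, not a gRPC health check, so a `True` result only shows that the port accepts connections.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper: it only tests TCP reachability and does not speak gRPC.
    """
    try:
        # int() lets the port be passed as a string, as it is stored in model-settings.json.
        with socket.create_connection((host, int(port)), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

With the example values used in this README, the call would be `is_port_open("localhost", "50051")`.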

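2. Configure model settings

Edit the following parameters in model-settings.json so that they point to your tokenizer directory and to the address of the xft grpc server started in step 1.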

"token_path": "/data/llama-2-7b-chat-hf",
"xft_grpc_server_ip": "localhost",
"xft_grpc_server_port": "50051"

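If the three values need to be changed from a script instead of by hand, something like the sketch below could work. This README does not show where in model-settings.json the keys are nested, so the hypothetical helpers `update_existing_keys` and `patch_settings` make no assumption about the layout: they walk the whole JSON document and overwrite only keys that are already present.

```python
import json
from pathlib import Path


def update_existing_keys(node, updates):
    """Recursively overwrite keys that already exist; return how many were changed."""
    changed = 0
    if isinstance(node, dict):
        for key, value in node.items():
            if key in updates:
                # Replacing the value of an existing key is safe while iterating.
                node[key] = updates[key]
                changed += 1
            else:
                changed += update_existing_keys(value, updates)
    elif isinstance(node, list):
        for item in node:
            changed += update_existing_keys(item, updates)
    return changed


def patch_settings(path, updates):
    """Load a JSON settings file, patch existing keys, and write it back in place."""
    settings_path = Path(path)
    settings = json.loads(settings_path.read_text())
    changed = update_existing_keys(settings, updates)
    settings_path.write_text(json.dumps(settings, indent=4) + "\n")
    return changed
```

For example, `patch_settings("model-settings.json", {"xft_grpc_server_ip": "localhost", "xft_grpc_server_port": "50051"})` returns the number of keys it replaced; a return value of 0 means none of the keys were found and the file should be edited manually.
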
3. Start MLServer

cd mlserver/multi-ranks
mlserver start .
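
Once MLServer is running, its readiness can be checked over HTTP. The sketch below assumes MLServer is serving its V2 inference protocol REST API on the default HTTP port 8080 (this README does not state the port, so adjust `base_url` if your settings differ); the function name `mlserver_ready` is hypothetical.

```python
import urllib.error
import urllib.request


def mlserver_ready(base_url="http://localhost:8080", timeout=2.0):
    """Return True if GET <base_url>/v2/health/ready answers with HTTP 200.

    Assumes the default MLServer HTTP port; pass another base_url if it was changed.
    """
    url = base_url.rstrip("/") + "/v2/health/ready"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx response all count as not ready.
        return False
```

Calling `mlserver_ready()` in a short retry loop after `mlserver start .` is one way to wait for the server before sending requests.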