To run Hydro in cluster mode (i.e., on multiple nodes and with autoscaling enabled), you will need an AWS account. Hydro depends on Kubernetes. This tutorial will walk you through setting up the required dependencies to run the Kubernetes CLI (`kubectl`) and `kops` (a tool to create & manage Kubernetes clusters on public cloud infrastructure).
We assume you are running inside an EC2 Linux VM on AWS with Python 3 installed (preferably Python 3.6 or later -- we have not tested with earlier versions).
- Run the following commands to clone the various Git repositories for the Hydro project and configure the `HYDRO_HOME` environment variable:

```shell
# you can change ~/hydro-project to whatever you like
export HYDRO_HOME=~/hydro-project
mkdir $HYDRO_HOME
cd $HYDRO_HOME
git clone --recurse-submodules https://github.com/hydro-project/anna.git
git clone --recurse-submodules https://github.com/hydro-project/anna-cache.git
git clone --recurse-submodules https://github.com/hydro-project/cluster.git
git clone --recurse-submodules https://github.com/hydro-project/droplet.git
cd cluster
```
- Install the `kubectl` binary using the Kubernetes documentation, found here. Don't worry about setting up your kubectl configuration yet.
- Install the `kops` binary -- documentation found here.
- Install a variety of Python dependencies¹:

```shell
pip3 install awscli boto3 kubernetes
```
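As a rough sketch of what the binary installs look like on a Linux x86_64 machine (the kops version below is illustrative, and the kubectl route is one common option from the Kubernetes docs -- follow the linked documentation for current releases):

```shell
# install kubectl: download the current "stable" release binary
curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
chmod +x kubectl
sudo mv kubectl /usr/local/bin/

# install kops: replace v1.18.2 with whichever release you want
curl -LO https://github.com/kubernetes/kops/releases/download/v1.18.2/kops-linux-amd64
chmod +x kops-linux-amd64
sudo mv kops-linux-amd64 /usr/local/bin/kops
```

Both tools are single static binaries, so "installing" them is just putting them somewhere on your `$PATH`.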
- `kops` requires an IAM group and user with permissions to access EC2, Route53, etc. You can find the commands to create these permissions here. Make sure that you capture the Access Key ID and Secret Access Key for the `kops` IAM user, set them as environment variables (`AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`), and pass them into `aws configure`, as described in the above link.
- `kops` also requires an S3 bucket for state storage. More information about configuring this bucket can be found here.
- Finally, in order to access the cluster, you will need a domain name² to point to. Currently, we have only tested our setup scripts with domain names registered in Route53. `kops` supports a variety of DNS settings, which you can find more information about here. If you run into any challenges while using other DNS settings, please open an issue, and we'd be happy to help debug.
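For reference, the IAM and bucket setup from the linked kops documentation boils down to commands along these lines (the group name, user name, and bucket name here are placeholders -- adapt them to your setup, and see the linked docs for the full policy list):

```shell
# create a dedicated IAM group and user for kops
aws iam create-group --group-name kops
aws iam attach-group-policy --policy-arn arn:aws:iam::aws:policy/AmazonEC2FullAccess --group-name kops
aws iam attach-group-policy --policy-arn arn:aws:iam::aws:policy/AmazonRoute53FullAccess --group-name kops
aws iam attach-group-policy --policy-arn arn:aws:iam::aws:policy/AmazonS3FullAccess --group-name kops
aws iam create-user --user-name kops
aws iam add-user-to-group --user-name kops --group-name kops
aws iam create-access-key --user-name kops   # capture AccessKeyId / SecretAccessKey from the output

# create a versioned S3 bucket for the kops state store
aws s3api create-bucket --bucket hydro-kops-state-store --region us-east-1
aws s3api put-bucket-versioning --bucket hydro-kops-state-store \
    --versioning-configuration Status=Enabled
```

Note that bucket names are globally unique, so you will likely need a different name than the one shown here.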
- Our cluster creation scripts depend on three environment variables: `HYDRO_HOME`, `HYDRO_CLUSTER_NAME`, and `KOPS_STATE_STORE`. Set `HYDRO_HOME` as described in Step 0. Set the `HYDRO_CLUSTER_NAME` variable to the name of the Route53 domain that you're using (see Footnote 2 if you are not using a Route53 domain -- you will need to modify the cluster creation scripts). Set the `KOPS_STATE_STORE` variable to the S3 URL of the S3 bucket you created in Step 2 (e.g., `s3://hydro-kops-state-store`).
- As described in Footnote 1, make sure that your `$PATH` variable includes the path to the `aws` CLI tool. You can check whether it's on your path by running `which aws` -- if you see a valid path, then you're set.
- As described in Step 2, make sure you have run `aws configure` and set your region (by default, we use `us-east-1`) and the access key parameters for the `kops` user created in Step 2.
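Putting the three variables together, your shell setup might look like this (the cluster name and bucket name are placeholders -- substitute your own Route53 domain and the bucket from Step 2):

```shell
export HYDRO_HOME=~/hydro-project                      # from Step 0
export HYDRO_CLUSTER_NAME=hydro.example.com            # your Route53 domain (placeholder)
export KOPS_STATE_STORE=s3://hydro-kops-state-store    # your Step 2 bucket
```

Adding these lines to your `~/.bashrc` (or equivalent) keeps them set across shell sessions.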
You're now ready to create your first cluster. To start off, we'll create a tiny cluster, with one memory-tier node and one routing node. From the `$HYDRO_HOME/cluster/` directory, run `python3 -m hydro.cluster.create_cluster -m 1 -r 1 -f 1 -s 1`. This will take about 10-15 minutes to run. Once it's finished, you will see the URLs of two AWS ELBs, which can be used to interact with the Anna KVS and Droplet, respectively.
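Once the script finishes, you can sanity-check the new cluster with standard kubectl commands (a sketch only -- the exact node and pod names depend on what the scripts deployed):

```shell
kubectl get nodes          # should list the memory-tier and routing nodes
kubectl get pods -o wide   # shows the running pods and the nodes they landed on
```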
¹ By default, the AWS CLI tool installs in `~/.local/bin` on Ubuntu. You will have to add this directory to your `$PATH`.
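For example, assuming the Ubuntu default install location, you can append it for the current shell (add the same line to your `~/.bashrc` to make it permanent):

```shell
# put the AWS CLI install directory on the PATH
export PATH="$PATH:$HOME/.local/bin"
```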
² You can also run in local mode, where you set the `HYDRO_CLUSTER_NAME` environment variable to `{clustername}.k8s.local`. This setting doesn't require a domain name -- however, this mode limits cluster size because it only runs in mesh networking mode (which only allows up to 64 nodes, from what we can tell), and it requires modifying our existing cluster creation scripts. We don't have documentation written up for this, as it's not a use case we intend to support, but you can either open an issue or send us an email if you're interested in it.