Not able to connect to the Aerospike cluster using k8s ingress #31

Open
mmazek opened this issue Jan 22, 2021 · 10 comments

mmazek commented Jan 22, 2021

Hi, I'm using Aerospike Community installed using the Helm Chart from this repo. I'm deploying to AWS EKS and I'm not able to make it work.
The relevant part of the values.yaml:

  hostNetwork:
    enabled: true
    useExternalIP: true
  loadBalancerServices:
    enabled: true

This way I end up with the following errors printed by the aerospike-init container:
WARN Error getting loadbalancer IP, port and node port: json: cannot unmarshal string into Go struct field ServiceRequest.status of type main.Stat. Retrying.
After some time the init script finished with Exit status 0 and the pod runs. It's not advertising the external IPs though, so it's not possible to connect from outside of the cluster.
Can you please check this out? Installing Aerospike directly on EC2 is not an option in my case.

Helm Chart version: 5.3.0
values.yaml:

aerospike:
  dbReplicas: 3
  rbac:
    create: true
    serviceAccountName: aerospike
  hostNetwork:
    enabled: true
    useExternalIP: true
  antiAffinity: "on"
  loadBalancerServices:
    enabled: true
  affinity:
     nodeAffinity:
       requiredDuringSchedulingIgnoredDuringExecution:
         nodeSelectorTerms:
         - matchExpressions:
           - key: topology.kubernetes.io/zone
             operator: In
             values:
             - us-west-2a
  resources:
    requests:
      cpu: 8
      memory: 10Gi
    limits:
      cpu: 10
      memory: 12Gi
  persistenceStorage:
    - mountPath: /opt/aerospike/data-pv
      enabled: true
      name: datadir-pv
      storageClass: ebs-gp3
      accessMode: ReadWriteOnce
      size: 200Gi
      volumeMode: Filesystem

  enableAerospikePrometheusExporter: true

spkesan commented Jan 27, 2021

Hi @mmazek

In the case of AWS ELBs, the load balancer ingress points are typically DNS-based (unlike GCP or OpenStack load balancers, which usually have IP addresses as ingress points). The init container expected an IP address rather than a hostname when querying the service resource, and hence failed. This has been addressed in the latest Helm chart release. Can you try with the latest release?

You can simply run helm repo update and then install the chart again.

https://artifacthub.io/packages/helm/aerospike/aerospike
https://artifacthub.io/packages/helm/aerospike/aerospike-enterprise

If you don't want to pull the latest charts, you can instead select the latest init container image, which should solve this issue.

helm install <release-name> aerospike/aerospike --set initImage.tag=latest ....... <rest of the command> .....
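
To illustrate the hostname-versus-IP point above, here is a minimal Python sketch of reading a load balancer endpoint from a Service object. It is not the init container's actual code; the function name and sample payloads are hypothetical. It relies only on the documented shape of `status.loadBalancer.ingress`, where each entry may carry an `ip`, a `hostname`, or both.

```python
import json
from typing import Optional


def ingress_endpoint(service: dict) -> Optional[str]:
    """Return the first usable ingress endpoint of a Service, or None.

    Accepts either an IP (typical on GCP) or a hostname (typical for AWS ELBs).
    """
    ingress = service.get("status", {}).get("loadBalancer", {}).get("ingress") or []
    for entry in ingress:
        endpoint = entry.get("ip") or entry.get("hostname")
        if endpoint:
            return endpoint
    return None  # load balancer not provisioned yet


# Hypothetical, trimmed-down Service payloads for illustration only.
aws_service = json.loads(
    '{"status": {"loadBalancer": {"ingress": '
    '[{"hostname": "LB0.us-west-2.elb.amazonaws.com"}]}}}'
)
gcp_service = json.loads(
    '{"status": {"loadBalancer": {"ingress": [{"ip": "203.0.113.10"}]}}}'
)
pending_service = json.loads('{"status": {"loadBalancer": {}}}')

if __name__ == "__main__":
    for svc in (aws_service, gcp_service, pending_service):
        print(ingress_endpoint(svc))
```

A parser that only looks at `ip` would return nothing for the AWS-style payload, which matches the failure described above.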

spkesan self-assigned this Jan 27, 2021

mmazek commented Jan 29, 2021

I updated my installation to the newest Helm Chart version and I'm still ending up with the same issue. The log from aerospike-init:

2021-01-29T14:28:45.566Z INFO Initializing variables.
2021-01-29T14:28:45.566Z INFO Initializing config volume.
2021-01-29T14:28:45.591Z INFO Finding peers.
2021-01-29T14:28:45.591Z INFO Self: aerospike-0.aerospike.aerospike.svc.cluster.local
2021-01-29T14:28:45.592Z INFO Found self during DNS lookup. Peers list updated.
2021-01-29T14:28:45.592Z DEBUG aerospike-0.aerospike.aerospike.svc.cluster.local
2021-01-29T14:28:45.592Z INFO Preparing Aerospike configuration file.
2021-01-29T14:28:45.623Z WARN Error getting loadbalancer IP, port and node port: json: cannot unmarshal string into Go struct field ServiceResponse.status of type main.Stat. Retrying.
2021-01-29T14:28:55.628Z WARN Error getting loadbalancer IP, port and node port: json: cannot unmarshal string into Go struct field ServiceResponse.status of type main.Stat. Retrying.
2021-01-29T14:29:05.633Z WARN Error getting loadbalancer IP, port and node port: json: cannot unmarshal string into Go struct field ServiceResponse.status of type main.Stat. Retrying.
2021-01-29T14:29:15.637Z WARN Error getting loadbalancer IP, port and node port: json: cannot unmarshal string into Go struct field ServiceResponse.status of type main.Stat. Retrying.
2021-01-29T14:29:25.642Z WARN Error getting loadbalancer IP, port and node port: json: cannot unmarshal string into Go struct field ServiceResponse.status of type main.Stat. Retrying.
...


mmazek commented Jan 29, 2021

After it finishes, we end up with an aerospike.conf file that sets alternate-access-address to [EC2 node's public IP], while it should clearly be the load balancer's IP or DNS name. Of course it doesn't work, because those nodes are not open to the world.


spkesan commented Jan 29, 2021

@mmazek
Right, it is supposed to fetch the load balancer ingress endpoint and add it to the config. But it's unable to parse the output when querying the Kubernetes service resource, so it skips adding the load balancer IP to alternate-access-address.
Since you also have hostNetwork.enabled=true, it fetches the EC2 node's IP and adds that to alternate-access-address instead.

Can you share the output of the following command? (Please redact things like public IPs etc. or add a dummy string)

kubectl get -n aerospike service/loadbalancer-aerospike-aerospike-0 -o json


spkesan commented Jan 29, 2021

@mmazek Oh! I just noticed that your pod name is aerospike-0. Ideally it should be <release-name>-aerospike-0. Can you try with a <release-name> other than aerospike, something like aerospike-test? My guess is that this unexpected pod name is causing the init container to query for the wrong service which is invalid. Let's see if that fixes the issue.
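
One possible reading of the unmarshal error, offered as an assumption rather than something confirmed in this thread: when a request fails (for example, because the named Service does not exist), the Kubernetes API returns a Status object whose `status` field is a plain string such as "Failure", whereas a Service has an object under `status`. A parser that expects the object form would then complain about a string. The Python sketch below uses hypothetical payloads and a hypothetical `classify` helper to show the difference.

```python
import json

# Hypothetical error body, following the general shape of a Kubernetes Status object.
not_found = json.loads(
    '{"kind": "Status", "apiVersion": "v1", "status": "Failure", '
    '"reason": "NotFound", "code": 404}'
)

# Hypothetical, trimmed-down Service body: here "status" is an object.
service = json.loads(
    '{"kind": "Service", "status": {"loadBalancer": {"ingress": '
    '[{"hostname": "LB0.us-west-2.elb.amazonaws.com"}]}}}'
)


def classify(body: dict) -> str:
    """Tell an API error (string status) apart from a Service (object status)."""
    status = body.get("status")
    if body.get("kind") == "Status" or isinstance(status, str):
        return "error: {} ({})".format(body.get("reason", "unknown"), body.get("code", "?"))
    return "service"


if __name__ == "__main__":
    print(classify(not_found))
    print(classify(service))
```

Checking `kind` before decoding the rest of the body would surface the underlying "not found" reason instead of a type mismatch.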


spkesan commented Jan 29, 2021

I think that could be it. The init container uses the pod name to identify which service to query.
The Helm chart trims the statefulset name (and hence the pod name) if the chart name and release name are the same. This causes the init container to look up an invalid service, and it fails.
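
The naming mismatch can be sketched as follows. This is a guess at the mechanics, not the chart's or init container's actual code: `fullname` mimics the common Helm "fullname" helper (reuse the release name when it already contains the chart name), the service name pattern is taken from the names that appear in this thread, and the rule that the init container prefixes the pod name with `loadbalancer-` is an assumption. All function names are hypothetical.

```python
CHART = "aerospike"


def fullname(release: str, chart: str = CHART) -> str:
    """Approximate the common Helm 'fullname' helper (assumed to apply here)."""
    name = release if chart in release else f"{release}-{chart}"
    name = name[:63]
    return name[:-1] if name.endswith("-") else name


def pod_name(release: str, ordinal: int) -> str:
    # StatefulSet pods are named <statefulset>-<ordinal>.
    return f"{fullname(release)}-{ordinal}"


def actual_service(release: str, ordinal: int, chart: str = CHART) -> str:
    # Pattern observed in this thread: loadbalancer-<release>-<chart>-<ordinal>.
    return f"loadbalancer-{release}-{chart}-{ordinal}"


def looked_up_service(release: str, ordinal: int) -> str:
    # Assumption: the init container derives the service name from the pod name.
    return f"loadbalancer-{pod_name(release, ordinal)}"


if __name__ == "__main__":
    for release in ("aerospike", "aerospike-blah", "as"):
        match = looked_up_service(release, 0) == actual_service(release, 0)
        print(release, pod_name(release, 0), "match" if match else "mismatch")
```

Under these assumptions, only a release name that does not contain the chart name yields a pod name from which the existing service name can be derived.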


mmazek commented Jan 31, 2021

Hi, your recommendation doesn't work unfortunately... I changed the release name to aerospike-blah and ended up with the following k8s objects:

❯ k get all
NAME                   READY   STATUS     RESTARTS   AGE
pod/aerospike-blah-0   2/2     Running    0          10m
pod/aerospike-blah-1   0/2     Init:0/1   0          2m
pod/aerospike-blah-2   2/2     Running    0          8m11s

NAME                                              TYPE           CLUSTER-IP   EXTERNAL-IP                       PORT(S)          AGE
service/loadbalancer-aerospike-blah-aerospike-0   LoadBalancer   IP0          LB0.us-west-2.elb.amazonaws.com   3000:31490/TCP   8m40s
service/loadbalancer-aerospike-blah-aerospike-1   LoadBalancer   IP1          LB1.us-west-2.elb.amazonaws.com   3000:30886/TCP   8m39s
service/loadbalancer-aerospike-blah-aerospike-2   LoadBalancer   IP2          LB2.us-west-2.elb.amazonaws.com   3000:31933/TCP   8m39s

NAME                              READY   AGE
statefulset.apps/aerospike-blah   2/3     10m

and I see in the init container's logs the following info:

2021-01-31T22:10:30.151Z WARN Unable to locate self during DNS lookup. Retrying.
2021-01-31T22:10:31.152Z WARN Unable to locate self during DNS lookup. Retrying.
2021-01-31T22:10:32.154Z INFO Found self during DNS lookup. Peers list updated.
2021-01-31T22:10:32.154Z DEBUG aerospike-blah-0.aerospike-blah.aerospike.svc.cluster.local
2021-01-31T22:10:32.154Z DEBUG aerospike-blah-1.aerospike-blah.aerospike.svc.cluster.local
2021-01-31T22:10:32.154Z DEBUG aerospike-blah-2.aerospike-blah.aerospike.svc.cluster.local
2021-01-31T22:10:32.154Z INFO Preparing Aerospike configuration file.
2021-01-31T22:10:32.167Z WARN Error getting loadbalancer IP, port and node port: json: cannot unmarshal string into Go struct field ServiceResponse.status of type main.Stat. Retrying.
2021-01-31T22:10:42.173Z WARN Error getting loadbalancer IP, port and node port: json: cannot unmarshal string into Go struct field ServiceResponse.status of type main.Stat. Retrying.
2021-01-31T22:10:52.179Z WARN Error getting loadbalancer IP, port and node port: json: cannot unmarshal string into Go struct field ServiceResponse.status of type main.Stat. Retrying.
2021-01-31T22:11:02.185Z WARN Error getting loadbalancer IP, port and node port: json: cannot unmarshal string into Go struct field ServiceResponse.status of type main.Stat. Retrying.
2021-01-31T22:11:12.192Z WARN Error getting loadbalancer IP, port and node port: json: cannot unmarshal string into Go struct field ServiceResponse.status of type main.Stat. Retrying.
2021-01-31T22:11:22.199Z WARN Error getting loadbalancer IP, port and node port: json: cannot unmarshal string into Go struct field ServiceResponse.status of type main.Stat. Retrying.
2021-01-31T22:11:32.206Z WARN Error getting loadbalancer IP, port and node port: json: cannot unmarshal string into Go struct field ServiceResponse.status of type main.Stat. Retrying.
2021-01-31T22:11:42.213Z WARN Error getting loadbalancer IP, port and node port: json: cannot unmarshal string into Go struct field ServiceResponse.status of type main.Stat. Retrying.
2021-01-31T22:11:52.219Z WARN Error getting loadbalancer IP, port and node port: json: cannot unmarshal string into Go struct field ServiceResponse.status of type main.Stat. Retrying.
2021-01-31T22:12:02.226Z WARN Error getting loadbalancer IP, port and node port: json: cannot unmarshal string into Go struct field ServiceResponse.status of type main.Stat. Retrying.
2021-01-31T22:12:12.232Z WARN Error getting loadbalancer IP, port and node port: json: cannot unmarshal string into Go struct field ServiceResponse.status of type main.Stat. Retrying.
2021-01-31T22:12:22.239Z WARN Error getting loadbalancer IP, port and node port: json: cannot unmarshal string into Go struct field ServiceResponse.status of type main.Stat. Retrying.
2021-01-31T22:12:32.245Z WARN Error getting loadbalancer IP, port and node port: json: cannot unmarshal string into Go struct field ServiceResponse.status of type main.Stat. Retrying.
2021-01-31T22:12:42.253Z WARN Error getting loadbalancer IP, port and node port: json: cannot unmarshal string into Go struct field ServiceResponse.status of type main.Stat. Retrying.
2021-01-31T22:12:52.258Z WARN Error getting loadbalancer IP, port and node port: json: cannot unmarshal string into Go struct field ServiceResponse.status of type main.Stat. Retrying.
2021-01-31T22:13:02.266Z WARN Error getting loadbalancer IP, port and node port: json: cannot unmarshal string into Go struct field ServiceResponse.status of type main.Stat. Retrying.
2021-01-31T22:13:12.272Z WARN Error getting loadbalancer IP, port and node port: json: cannot unmarshal string into Go struct field ServiceResponse.status of type main.Stat. Retrying.
2021-01-31T22:13:22.278Z WARN Error getting loadbalancer IP, port and node port: json: cannot unmarshal string into Go struct field ServiceResponse.status of type main.Stat. Retrying.
2021-01-31T22:13:32.284Z WARN Error getting loadbalancer IP, port and node port: json: cannot unmarshal string into Go struct field ServiceResponse.status of type main.Stat. Retrying.
2021-01-31T22:13:42.290Z WARN Error getting loadbalancer IP, port and node port: json: cannot unmarshal string into Go struct field ServiceResponse.status of type main.Stat. Retrying.
2021-01-31T22:13:52.295Z WARN Error getting loadbalancer IP, port and node port: json: cannot unmarshal string into Go struct field ServiceResponse.status of type main.Stat. Retrying.
2021-01-31T22:14:02.302Z WARN Error getting loadbalancer IP, port and node port: json: cannot unmarshal string into Go struct field ServiceResponse.status of type main.Stat. Retrying.
2021-01-31T22:14:12.309Z WARN Error getting loadbalancer IP, port and node port: json: cannot unmarshal string into Go struct field ServiceResponse.status of type main.Stat. Retrying.
2021-01-31T22:14:22.324Z WARN Error getting loadbalancer IP, port and node port: json: cannot unmarshal string into Go struct field ServiceResponse.status of type main.Stat. Retrying.
2021-01-31T22:14:32.330Z WARN Error getting loadbalancer IP, port and node port: json: cannot unmarshal string into Go struct field ServiceResponse.status of type main.Stat. Retrying.
2021-01-31T22:14:42.337Z WARN Error getting loadbalancer IP, port and node port: json: cannot unmarshal string into Go struct field ServiceResponse.status of type main.Stat. Retrying.
2021-01-31T22:14:52.343Z WARN Error getting loadbalancer IP, port and node port: json: cannot unmarshal string into Go struct field ServiceResponse.status of type main.Stat. Retrying.
2021-01-31T22:15:02.350Z WARN Error getting loadbalancer IP, port and node port: json: cannot unmarshal string into Go struct field ServiceResponse.status of type main.Stat. Retrying.
2021-01-31T22:15:12.356Z WARN Error getting loadbalancer IP, port and node port: json: cannot unmarshal string into Go struct field ServiceResponse.status of type main.Stat. Retrying.
2021-01-31T22:15:22.362Z WARN Error getting loadbalancer IP, port and node port: json: cannot unmarshal string into Go struct field ServiceResponse.status of type main.Stat. Retrying.
2021-01-31T22:15:32.363Z INFO Init container successfully executed.

Of course I don't see the LB's info in the conf at the end.


mmazek commented Jan 31, 2021

OK, I got it now: the release name can't contain the chart name, so I went with a release name of as. The load balancer DNS name is now added to the config, but when I run aql from my PC, it initially connects via the LB, then the Aerospike cluster sends back the pods' IPs and the connection doesn't work.


spkesan commented Feb 1, 2021

What's the AQL command that you are using to connect?

Use the --services-alternate command-line option with AQL and ASADM:

aql -h <LoadBalancerIP> -p <LoadBalancerPort> --services-alternate
asadm -h <LoadBalancerIP> -p <LoadBalancerPort> --services-alternate


mmazek commented Feb 1, 2021

OK, I managed to make this work, but one issue remains: the Service creates public load balancers by default (I'm running AWS EKS here), and that's not an option for us. For now I'll simply hack your Helm chart and add service.beta.kubernetes.io/aws-load-balancer-internal: "true" to the services' annotations, but I can also open a feature request in this repo to make annotations configurable.
