Replicated shared snapshotter doesn't come up on node restart


#1

Whenever the primary node is restarted, the replicated-shared-fs-snapshotter-* pod doesn't come up and goes into Init:CrashLoopBackOff. We have to manually force delete the pod to fix it. Because of this, the other pods that depend on the shared filesystem also go into Init:CrashLoopBackOff, due to a race condition as mentioned in this.
We are running on a DigitalOcean machine with 4 vCPUs and 8 GB of RAM. The cluster has one primary and two worker nodes. The issue is also observed with a single primary node.
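
For anyone hitting the same thing, the manual force delete can be scripted roughly like this. It is only a sketch of the workaround, not a fix: it assumes kubectl is on the PATH and already pointed at the cluster, and that the pods live in the "default" namespace (an assumption, change it to match your install). The helper names are made up for illustration.

```python
import subprocess

# Status and name prefix taken from the symptoms described in this post.
STUCK_STATUS = "Init:CrashLoopBackOff"
POD_PREFIX = "replicated-shared-fs-snapshotter-"


def find_stuck_pods(get_pods_output, prefix=POD_PREFIX, status=STUCK_STATUS):
    """Return names of pods matching the prefix whose STATUS column equals status.

    Expects the default table printed by `kubectl get pods`
    (columns: NAME READY STATUS RESTARTS AGE), header line included.
    """
    stuck = []
    for line in get_pods_output.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) >= 3 and fields[0].startswith(prefix) and fields[2] == status:
            stuck.append(fields[0])
    return stuck


def force_delete(pod, namespace="default"):
    """Force delete one pod so its controller recreates it."""
    subprocess.run(
        ["kubectl", "delete", "pod", pod, "-n", namespace,
         "--grace-period=0", "--force"],
        check=True,
    )


def main(namespace="default"):
    """List pods in the namespace and force delete the stuck snapshotter pods."""
    output = subprocess.run(
        ["kubectl", "get", "pods", "-n", namespace],
        capture_output=True, text=True, check=True,
    ).stdout
    for pod in find_stuck_pods(output):
        force_delete(pod, namespace)
```

Calling main() after the node has come back up deletes only the snapshotter pods stuck in Init:CrashLoopBackOff. The dependent pods are left alone, on the assumption that they recover once the shared filesystem is available again.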

We also have a single primary node running on AWS, where it works fine. How can we solve this? A screenshot is attached for reference.


#2

Hello, we are tracking this and we will fix it in the next release, which is due next month.