Replicated shared snapshotter doesn't come up on node restart

MANI_M · August 21, 2020, 5:04pm

Whenever the primary node is restarted, the replicated-shared-fs-snapshotter-* pod doesn’t come up and goes into Init:CrashLoopBackOff; we need to manually force delete the pod to fix it. Because of this, the other pods that depend on the shared filesystem also go into Init:CrashLoopBackOff, due to a race condition as mentioned in this.
We are running on DigitalOcean machines with 4 vCPUs and 8 GB of RAM. The cluster has one primary and two worker nodes. The issue is also observed on a cluster with only a single primary node.

We also have a single primary node running on AWS, and there it works fine. How can we solve this? Screenshot attached for reference.

salahalsaleh · August 21, 2020, 5:58pm

Hello, we are tracking this and we will fix it in the next release, which is due next month.
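
Until a fixed release is available, the manual workaround from the original post (force deleting the stuck pod) could be scripted. The following is only a minimal sketch, not an official fix. It assumes `kubectl` is on the PATH and already configured for the cluster. The pod name prefix and the Init:CrashLoopBackOff status come from the post; the `default` namespace and the function names are assumptions made for illustration and should be adjusted to your install.

```python
import subprocess

# Prefix taken from the post; the namespace is an assumption.
POD_PREFIX = "replicated-shared-fs-snapshotter-"
NAMESPACE = "default"


def find_stuck_pods(get_pods_output, prefix=POD_PREFIX):
    """Return names of pods with the prefix whose STATUS is Init:CrashLoopBackOff.

    Expects the plain-text table printed by `kubectl get pods --no-headers`
    (columns: NAME READY STATUS RESTARTS AGE).
    """
    stuck = []
    for line in get_pods_output.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        name, status = fields[0], fields[2]
        if name.startswith(prefix) and status == "Init:CrashLoopBackOff":
            stuck.append(name)
    return stuck


def force_delete_command(pod, namespace=NAMESPACE):
    """Build the kubectl command that force deletes a single pod."""
    return ["kubectl", "delete", "pod", pod, "-n", namespace,
            "--grace-period=0", "--force"]


def main():
    # List pods in the namespace, then force delete the stuck snapshotter pods.
    out = subprocess.run(
        ["kubectl", "get", "pods", "-n", NAMESPACE, "--no-headers"],
        check=True, capture_output=True, text=True,
    ).stdout
    for pod in find_stuck_pods(out):
        subprocess.run(force_delete_command(pod, NAMESPACE), check=True)
```

Calling `main()` on a machine with cluster access after the primary node has rebooted would delete only the snapshotter pods that are stuck; the pod's controller is then expected to recreate them. Pods that depend on the shared filesystem are left untouched and may still need a restart of their own.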
