Here’s the scenario:
When our SP2013 farm was created, there were 4 servers – 2 WFEs, 1 App, 1 SQL. All servers are VMs.
Distributed Cache instances were created on the 2 WFEs and the App server. Things rocked along smoothly for a while. Then one of the WFE servers began acting hosed. We had a consultant check out the farm and try to revive the dying server, but he deemed it unrevivable. So a new VM was created, configured, and added to the farm to replace the dying WFE server.
Fast forward a few months: we keep seeing Distributed Cache service errors in the Health Analyzer. After much poking, prodding, and PowerShell learning, I determined that the dead server, which cannot be revived, is still listed as a cache host. The Get-CacheHost command shows 4 servers: the dead one has a status of UNKNOWN, 2 servers have a status of DOWN, and one has a status of UP.
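For anyone following along, this is roughly how the host list above can be pulled. It is a minimal sketch, assuming an elevated SharePoint 2013 Management Shell on a server that is part of the cache cluster; the second query is an extra cross-check of what the farm itself thinks is provisioned, not something from the original troubleshooting.

```powershell
# Connect this PowerShell session to the cache cluster this server belongs to.
Use-CacheCluster

# List every registered cache host with its service status (UP, DOWN, UNKNOWN).
Get-CacheHost

# Cross-check (assumption: run on a farm server): which servers the farm lists
# as running a Distributed Cache service instance.
Get-SPServiceInstance |
    Where-Object { $_.TypeName -eq "Distributed Cache" } |
    Select-Object Server, Status
```

If the dead server shows up in the first list but not the second, the AppFabric cluster configuration and the farm configuration have drifted apart.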
I’ve tried to remove the dead cache host with the Unregister-CacheHost command, following this blog post: http://alstechtips.blogspot.com/2014/07/sharepoint-2013-how-to-remove-cache.html
However, this results in the error: ErrorCode<UnspecifiedErrorCode>:SubStatus<ES0001>:No such host is known
Running Get-CacheHost after the Unregister-CacheHost command still shows 4 servers with the statuses above.
Suggestions?
Have you tried exporting your cluster settings, creating a new cluster, and re-importing those settings?
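A minimal sketch of the export step, assuming an elevated SharePoint 2013 Management Shell on a server that is still a working cache host. The file path is a hypothetical example. Changing the cluster configuration directly with the AppFabric cmdlets is generally discouraged on a SharePoint farm, so treat this as a way to take a backup and inspect the host list, not as a tested fix.

```powershell
# Connect this session to the cache cluster this server belongs to.
Use-CacheCluster

# Save the current cluster configuration to an XML file (hypothetical path).
Export-CacheClusterConfig -File "C:\Temp\CacheClusterConfig.xml"

# Review the exported file and look for the dead server's entry in the hosts section.
Get-Content "C:\Temp\CacheClusterConfig.xml"

# Re-importing uses Import-CacheClusterConfig, typically with the cluster stopped.
# Left commented out here because it changes the live cluster:
# Stop-CacheCluster
# Import-CacheClusterConfig -File "C:\Temp\CacheClusterConfig.xml"
# Start-CacheCluster
```

Keeping an untouched copy of the exported file gives you something to roll back to if the re-import does not behave as expected.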