In the many years I’ve worked with SharePoint, I’ve never seen anything like this.
I’m onsite with a client performing an audit of their SharePoint 2010 environments. I cleaned up the health analyzer issues, as well as the event and ULS logs. I also moved all the services that were erroneously running on the web front ends to the app server (I’ve advised them to add a second app server). The only real change was that I had to rebuild search, since it had been broken for months.
Once the environments were healthy and stable, I installed SharePoint Admin Tools on the dev, QA, and prod environments. Running the diagnostics tool on the dev and QA farms went just fine. When I ran the tool on the production environment (at the end of the day), the app pool for my main web application went down on both front ends. No other app pools/web apps went down. I immediately restarted the app pools, and the web app (with multiple site collections) came back up. It’s been up since then.
The problem is that random web parts on some publishing pages decided to break. Two list view web parts on the front page started throwing an error. I was able to fix that by opening the web part properties, changing the name, applying it, then changing it back.
On other pages, web parts are changing types completely. For example, there was a page with a content editor on it that was working fine, but when a user tried to edit the web part properties, it showed up as a list view web part instead. On another page, a list view web part turned into a form content editor, and on a third page, a list view web part turned into a data view web part. Pages that aren’t publishing pages don’t seem to encounter this issue.
These are all easily fixed by deleting the web part and replacing it, which isn’t ideal. More importantly, I need to figure out the root cause. I’m pretty sure the SharePoint Admin Tools wrote something to the config database, or maybe one of the content databases for the main web app.
Has anyone else run into this issue before?
The publishing pages are using the same master page as the rest of the site collection, and they’ve used that master page for at least 6 years. I opened each page in maintenance mode, and I see some closed web parts on them that could be removed, but the affected web parts are listed incorrectly.
I may clean up the closed web parts, but so far, still no dice. Someone on Facebook recommended looking for broken timer jobs, but I did that as part of my initial audit/cleanup, and there aren’t any issues there.
I think that I’m going to go through the dev and QA environments today and see if the problem exists there too. If so, it will be harder to do a root cause analysis, but at least I’ll know it wasn’t anything I did.
Update: The problem does exist in QA. That narrows it down to an issue with the content, since the farms are all pretty different. It could very well be the master page causing this.
These being publishing pages, are they using custom master pages or page layouts?
If so, how were they deployed?
Is it possible that there is an issue with any of these?