2015-10-12

Patching SharePoint with Cumulative Updates: do it or don’t do it? Always a dilemma. Cumulative Updates include security fixes as well as fixes for other reported problems. It is also important to mention that Cumulative Updates do not contain new functionality; new SharePoint functionality comes with Service Packs.

Microsoft’s new policy is to release SharePoint Cumulative Updates once a month. From a security point of view it is highly recommended to install these updates, even more so if your SharePoint farm is exposed to the internet. But then again, it is a time-consuming, risky job that involves downtime of your services.

Moreover, there is no rollback procedure recommended by Microsoft; for them, rollback is to cross your fingers and hope for the best. If something happens, you look at the logs and try to fix the issues manually, which increases the downtime of your services even more. Pretty scary stuff, right?

In theory it all looks like a piece of cake: install the patches on the server, start the SharePoint Products Configuration Wizard, click Next a couple of times, then sit back and watch the progress bar. But this is just theory; reality is a bit different.

Things get a bit complicated when you have multiple servers in the farm. According to Microsoft, the recommended procedure is to install the patches on the application servers, then on the web servers, and then to start the upgrade in the same order. For detailed information please see https://technet.microsoft.com/en-us/library/ff806338.aspx#usinginplace. There is also a great blog post by Samuel Betts, in which he explains in detail the process of patching multiple farm servers (http://blogs.msdn.com/b/sambetts/archive/2013/08/22/sharepoint-farm-patching-explained.aspx).

That is not all; there is also the problem of patching duration. Installing patches can take hours, depending on how busy the server is. Just imagine: you have two front-end and two application servers to update, and each one takes three hours to install! Even if you combine installations, it will be over 5 hours just for installation! And that is only half of the job.

As I said before, a rollback procedure, according to Microsoft, does not exist. You cannot uninstall patches once they are installed. So if your installation package reports a failure and cannot be installed all the way, you are stuck with the updates that were installed before the failure. You get a half-baked cookie! To make matters worse, you will not be able to finish the SharePoint Products Configuration, because your SharePoint farm servers will not be at the same patch level.

As you can see, there is a lot to take into consideration when implementing patches on SharePoint servers. However, thanks to fellow bloggers you can learn a trick or two and decrease the risks and installation time, and therefore application downtime.

In our company there is a SharePoint farm with three front-end servers and one application server, so when the time comes to apply SharePoint updates we need to be prepared and informed. We usually update the SharePoint servers during weekends, after announcing to our colleagues that the service will be unavailable.

Step 1. Backup

Before you start to do anything, it is smart to make a backup. First I do a farm backup; this is my last resort if everything else fails. You can schedule the farm backup to run on Friday night, so when you start patching on Saturday morning you don’t have to wait for the farm backup to complete. Of course, there is a risk that someone changed something during the night, but let’s be honest, the chances of that are minimal. Also, there is a good chance that you will never have to do a farm restore, but you never know. Better safe than sorry.

Although a farm backup is a good start, you cannot do a rollback from it alone.

There is a very nice blog post by Chris Mullendore about taking snapshots of SharePoint servers (http://blogs.msdn.com/b/mossbiz/archive/2013/02/22/sharepoint-vs-snapshots-part-2.aspx), from which I learned that there is a safe way to take snapshots without worrying much about consistency.

So what I do is turn off all SharePoint servers (the key is to turn off all of them), take snapshots of all servers, and also back up all SharePoint databases. Of course, you can back up the databases earlier, but I prefer this way because all crucial components are backed up at the same point in time. In his blog, Chris mentions the potential risk of losing connectivity to Active Directory because the machines have changed their account passwords, but that is a small risk. If you want to be sure, just check the PasswordLastSet property of your SharePoint servers using this command:

(get-adcomputer <servername> -Properties passwordlastset).passwordlastset

In most cases the machine account password is changed every 30 days, so if the password was set recently and is not due to change during the patching window, you are in the clear.
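To turn that check into a number, here is a minimal sketch that estimates how many days remain before a machine password is due to change. Get-DaysUntilPasswordChange is a hypothetical helper name, the server names in the usage comment are placeholders, and the 30-day maximum age is an assumption you should verify against your own domain policy.

```powershell
# Sketch: days left before a machine account password is due to change.
# Get-DaysUntilPasswordChange is a hypothetical helper, not a built-in cmdlet.
# The 30-day default is an assumption; check your domain policy.
function Get-DaysUntilPasswordChange {
    param(
        [Parameter(Mandatory)] [datetime] $PasswordLastSet,
        [int] $MaxAgeDays = 30,
        [datetime] $Now = (Get-Date)
    )
    $ageDays = ($Now - $PasswordLastSet).TotalDays
    # Round down so the estimate is never optimistic.
    [math]::Floor($MaxAgeDays - $ageDays)
}

# Usage (needs the ActiveDirectory module; server names are placeholders):
# 'SPAPP01', 'SPWEB01' | ForEach-Object {
#     $set = (Get-ADComputer $_ -Properties PasswordLastSet).PasswordLastSet
#     '{0}: {1} days left' -f $_, (Get-DaysUntilPasswordChange -PasswordLastSet $set)
# }
```

A small or negative result means the password may change around the time you take the snapshots, so it is worth planning for that before you rely on them.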

Shutting down the SharePoint servers of course means downtime of services. Depending on the amount of data on the servers and in the databases, taking server snapshots and database backups can take a while. In our case it takes about an hour to an hour and a half to complete this task. But this is a small price to pay.

This is a good time to point out that creating a backup and rollback plan is a very important part of the patching procedure. It is better to lose an hour or two creating backups than to lose your job because you have no rollback plan when something goes wrong. Excuses like ‘What can go wrong?’ or ‘I have done this a hundred times and never had a problem.’ are never acceptable. We are professionals and it is our job to do everything in our power to complete the task as efficiently and painlessly as possible.

Step 2. Update installation

So far we have created a farm backup (just in case), snapshots of all SharePoint servers and backups of the SharePoint databases. The next step is to install the patches on the servers.

Russ Maxwell wrote a great blog post on how to reduce the time needed for patch installation on SharePoint servers (see http://blogs.msdn.com/b/russmax/archive/2013/04/01/why-sharepoint-2013-cumulative-update-takes-5-hours-to-install.aspx). Better yet, he wrote a script which automates the installation process; it shuts down all necessary services, installs the update and starts the services again. The script also pauses the search service application. It dramatically decreases patching time. All you need to do is save the Patchit.ps1 file in the same folder as the patch installer and run it. Of course you need to start the SharePoint 2013 Management Shell with administrative rights and start the script from there, but you already knew that, right?

I copy the installation files along with the Patchit.ps1 script to all SharePoint servers, and then start patching. First I install the updates on the application server, and then on the web servers. After that I usually check the event log to see if there are any errors and whether all updates were installed successfully. Only when I have verified that everything is installed as it should be do I proceed to the next step. If something goes wrong and cannot be fixed, the rollback procedure can be activated (the rollback procedure will be explained later).
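The event log check can be scripted as well. The following is a minimal sketch, assuming that errors from the installation window land in the Application log; New-ErrorEventFilter is a hypothetical helper name and the four-hour window is only an example.

```powershell
# Sketch: list Error-level events written since patching started.
# New-ErrorEventFilter is a hypothetical helper; the log name and the
# time window are assumptions, adjust them to your environment.
function New-ErrorEventFilter {
    param(
        [Parameter(Mandatory)] [datetime] $Since,
        [string] $LogName = 'Application'
    )
    # Level 2 is "Error" in the Windows event log.
    @{ LogName = $LogName; Level = 2; StartTime = $Since }
}

# Only runs where Get-WinEvent is available (Windows).
if (Get-Command Get-WinEvent -ErrorAction SilentlyContinue) {
    $filter = New-ErrorEventFilter -Since (Get-Date).AddHours(-4)
    Get-WinEvent -FilterHashtable $filter -ErrorAction SilentlyContinue |
        Select-Object TimeCreated, ProviderName, Id, Message
}
```

An empty result only means no Error-level events were logged in that window; it does not replace checking that the installer itself reported success.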

Step 3. SharePoint upgrade

Let us recap what we have done so far. We have created backups and taken snapshots of our SharePoint servers. After that, using the Patchit.ps1 script, we installed the Cumulative Update on the servers, first on the application server and then on the web front-end servers.

The next step is to run the SharePoint upgrade. This can be done in two ways. One way is to start the SharePoint 2013 Products Configuration Wizard, which is located in the Microsoft SharePoint 2013 Products folder. The other way is to run PSConfig, which is my preferred way, and I use it whenever I can.

According to the TechNet article https://technet.microsoft.com/en-us/library/ff806338.aspx, the proper upgrade order is to upgrade specific services first, upgrade content databases next, and then proceed with the server upgrade. In our environment we don’t have specific services, so I go straight to upgrading content databases. This is an optional step, but it helps ensure that all content databases are upgraded first. It has the advantage of enabling some parallelism to reduce the outage time. If it is not performed, all remaining non-upgraded content databases will be upgraded serially when you run the SharePoint upgrade on the farm servers.

To upgrade the content databases I wrote a small script which finds all content databases in the farm and upgrades them. Just run the UpgradeContentDatabase.ps1 script and PowerShell will do the rest. You will just have to confirm the upgrade for each content database.

After upgrading the content databases we have to upgrade the SharePoint servers. Microsoft recommends upgrading the Central Administration server first, and the web servers after that, using the PSConfig.exe command:
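The script itself is not reproduced here, so the following is only a sketch of what such a script could look like, not the actual UpgradeContentDatabase.ps1. It assumes the SharePoint snap-in is loaded (as it is in the SharePoint 2013 Management Shell); Select-DatabaseNeedingUpgrade is a hypothetical helper name.

```powershell
# Sketch only: NOT the author's UpgradeContentDatabase.ps1.
# Select-DatabaseNeedingUpgrade is a hypothetical helper that passes
# through objects whose NeedsUpgrade property is true.
function Select-DatabaseNeedingUpgrade {
    param([Parameter(ValueFromPipeline)] $Database)
    process {
        if ($Database.NeedsUpgrade) { $Database }
    }
}

# Run in the SharePoint 2013 Management Shell; skipped elsewhere.
if (Get-Command Get-SPContentDatabase -ErrorAction SilentlyContinue) {
    Get-SPContentDatabase | Select-DatabaseNeedingUpgrade | ForEach-Object {
        Write-Host "Upgrading $($_.Name)"
        # Prompts for confirmation per database.
        Upgrade-SPContentDatabase -Identity $_
    }
}
```

Adding -Confirm:$false to Upgrade-SPContentDatabase would suppress the per-database prompt; the sketch keeps the prompt so that each upgrade is acknowledged by hand.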

PSConfig.exe -cmd upgrade -inplace b2b -force -cmd applicationcontent -install -cmd installfeatures

And that’s it: we have successfully applied the Cumulative Update to the SharePoint farm. After that, open the Central Administration site and, in the Upgrade and Migration section, go to Check upgrade status and review the upgrade. If there are no errors you are good to go. I also like to go to Central Administration > Upgrade and Migration > Review database status and check that the status of all databases is No action required.
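The same check can be approximated from PowerShell. This is a minimal sketch, assuming the NeedsUpgrade property reported by Get-SPDatabase reflects what Central Administration shows; Get-UpgradeSummary is a hypothetical helper name.

```powershell
# Sketch: summarise which databases still report NeedsUpgrade.
# Get-UpgradeSummary is a hypothetical helper, not a SharePoint cmdlet.
function Get-UpgradeSummary {
    param([object[]] $Database = @())
    $pending = @($Database | Where-Object { $_.NeedsUpgrade })
    [pscustomobject]@{
        Total   = $Database.Count
        Pending = @($pending | ForEach-Object { $_.Name })
    }
}

# Run in the SharePoint 2013 Management Shell; skipped elsewhere.
if (Get-Command Get-SPDatabase -ErrorAction SilentlyContinue) {
    "Farm build: $((Get-SPFarm).BuildVersion)"
    $summary = Get-UpgradeSummary -Database @(Get-SPDatabase)
    if ($summary.Pending.Count -eq 0) {
        "All $($summary.Total) databases: no action required"
    } else {
        "Still need upgrade: $($summary.Pending -join ', ')"
    }
}
```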

Troubleshooting and rollback

Troubleshooting is a very broad topic; I could write a whole new blog post about it. My recommendation is a great blog post by Samuel Betts, http://blogs.msdn.com/b/sambetts/archive/2015/08/19/patching-sharepoint-2013-farms-continued.aspx, in which he explains the most common issues and gives some initial pointers on where to start troubleshooting if they do happen.

If, for some reason, you get a correlation ID error after the upgrade, the best way to find out what happened is to merge the logs and look for the actual error message:

  • Open SharePoint PowerShell as Administrator
  • Use Merge-SPLogFile -Path "C:\Error.log" -Correlation "ID here"
  • The cmdlet grabs the ULS logs from all the servers in the SharePoint farm
  • It populates the C:\Error.log file with all instances of the correlation ID
  • Open the C:\Error.log file with ULS Viewer to see the error associated with the correlation ID
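On a busy farm the merge can take a while, so it may help to narrow it to a time window. A minimal sketch, assuming the error happened within the last two hours; the correlation ID below is a placeholder that you replace with the one from your error message.

```powershell
# Sketch: merge ULS entries for one correlation ID, limited to a time window.
# The all-zero GUID is a placeholder; the two-hour window is an assumption.
$correlationId = '00000000-0000-0000-0000-000000000000'
$outFile       = 'C:\Error.log'

# Run in the SharePoint 2013 Management Shell; skipped elsewhere.
if (Get-Command Merge-SPLogFile -ErrorAction SilentlyContinue) {
    Merge-SPLogFile -Path $outFile -Correlation $correlationId `
        -StartTime (Get-Date).AddHours(-2) -Overwrite
}
```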

Sometimes the problem can be IIS. For instance, you can receive an error that the SharePoint service is unavailable due to a stopped IIS application pool.
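A quick way to spot a stopped application pool is to list pool states on each web server. This is a minimal sketch, assuming the WebAdministration module that ships with IIS is available; Select-StoppedAppPool is a hypothetical helper name.

```powershell
# Sketch: list IIS application pools that are currently stopped.
# Select-StoppedAppPool is a hypothetical helper, not an IIS cmdlet.
function Select-StoppedAppPool {
    param([Parameter(ValueFromPipeline)] $AppPool)
    process {
        if ("$($AppPool.State)" -eq 'Stopped') { $AppPool }
    }
}

# Only runs where the IIS WebAdministration module is installed.
if (Get-Module -ListAvailable -Name WebAdministration) {
    Import-Module WebAdministration
    Get-ChildItem IIS:\AppPools | Select-StoppedAppPool | Select-Object Name, State
    # To start a pool by hand: Start-WebAppPool -Name '<pool name>'
}
```

The sketch only lists stopped pools rather than starting them automatically, because a pool that stopped during patching usually deserves a look at why it stopped.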

Best way for reviewing IIS logs is:

Another thing I found useful is this: PSConfig tells you that it can’t proceed because one or more of the servers are missing a required patch. It shows the server name and the patch name, and it says “Missing/Required”. The thing it says is missing, though, is the exact patch you just finished installing. Frustrating, yes?

To resolve this, try running the command Get-SPProduct -Local on the server that is reported as missing the required patch. Doing so forces a timer job to run that does some version fix-up.

If you cannot fix the issues, or the downtime of SharePoint services is getting too long, you can start the rollback procedure using the server snapshots and database backups. I have pointed this out several times, but I will do it again: according to Microsoft this is not a recommended action. When we use snapshots with SharePoint, we are not just snapshotting a server, a database, or a set of files; we are attempting to capture the state of the SharePoint farm at a point in time, and we need to capture that state in such a way that it is in perfect sync across all of the components. Shutting the machines down reduces the number of things that could be altering that state at any given moment. That increases the likelihood that the capture is complete and that, if we need to revert to that prior state, there are fewer things to accommodate, fix, or otherwise deal with that could make the farm completely unstable (and unsupportable).

So, the rollback should look something like this:

  • Power down all SharePoint servers
  • Restore all snapshots
  • Restore all SharePoint SQL databases from backup
  • Power on SharePoint servers

If all other procedures fail, the last thing we can do is restore the SharePoint farm. There is a TechNet article (https://technet.microsoft.com/en-us/library/ee428314.aspx) which covers this procedure very well. Or you can use the PowerShell Get-SPBackupHistory and Restore-SPFarm cmdlets:

(Get-SPBackupHistory -Directory <backup directory> -ShowBackup)[0].SelfId | Restore-SPFarm -Directory <backup directory> -RestoreMethod overwrite

This command gets all of the farm backup operations that have been run for the backup directory, finds the most recent backup, and then passes its backup GUID to the Restore-SPFarm cmdlet. The Restore-SPFarm cmdlet then performs an overwrite restore from that backup package.

Finally, I think it is important to mention that you can also use these steps when installing a service pack on your SharePoint farm.

This concludes my long (and, I hope, not boring) blog post on this important subject. Of course, there are lots of other topics I have not covered, like reducing downtime, using the SharePoint 2013 Products Configuration Wizard, etc. I focused on creating one document where you can find the whole story, so that you don’t have to go through a bunch of TechNet documentation and blogs, like I did, and try to assemble the pieces into the whole picture.

PS. My friend Neven (to whom I owe a beer for his help) pointed me to three good articles by Phil Childs in which he explains how to resolve issues found by the Health Analyzer. These are really great articles. In the first, the author explains how to resolve issues with missing features, and he was nice enough to provide a script for resolving them. For more information please check http://get-spscripts.com/2011/06/removing-features-from-content-database.html

The second article by the same author explains the issue of missing setup files which are referenced in SharePoint databases but are not installed on the farm. http://get-spscripts.com/2011/06/diagnosing-missingsetupfile-issues-from.html

The third one is about the missing Web Part error. http://get-spscripts.com/2011/08/diagnose-missingwebpart-and.html

Before you start the upgrade, check the Health Analyzer and resolve the issues found in the Configuration category. Otherwise you are going to have a headache while trying to run the SharePoint configuration.

About the author 

Krsto Savic