Monthly Archives: August 2017

Configuration Manager Update not progressing

Technical Preview 1708 was released yesterday and I fired up the lab virtual machines to give it a try. ***Several Hours Later*** I am still trying to install. 1st I had a few SQL issues to sort out but mainly they were let the database start before opening the console. Next the download would not start. It turns out the sms_site_component_manager was crashing.  Luckily a quick search found that is was a known issue

OK, now that the component manager is running the download completes and I can start the install. Everything looks normal and I head off to bed to let it cook overnight. Imagine my surprise to find that this morning the status has not changed. The next step in the update would be to run the prerequisite change and it has no status. Off to the logs and I find that the CMUpdate.log has not been updated in a while. And one of the last updates is “CONFIGURATION_MANAGER_UPDATE service is signalled to stop…”  So I check the windows services and sure enough the CONFIGURATION_MANAGER_UPDATE service is not running. After starting that service up now everything is progressing again. Adding this to my growing list of upgrade checks and making a pot of coffee while I watch the logs for this upgrade.

Everybody out of the pool, the Application Pool

Hi my name is MrBoDean and I need to confess that I am not running a supported version of SCCM. Yes, I am migrating to Current Branch but the majority of my system are still in SCCM 2012 R2 without SP1. The reason why is quite boring and repeated and many companies I am sure, but takes a while to tell. So for the past year it feels like duct tape and bailing wire are all that is keeping the 2012 environment up while we try to upgrade between a string of crisis. It is a shame when the Premier Support engineers are on a 1st name basis with you.

So Monday night I was called at 2AM because OSD builds were failing. It happens and most times a quick review of the log files points you in the right direction. Not so much this time around. The builds were failing to even start, everyone was failing with the error

I have 4 Management Points they can not all be down. They are are up and responding when I test them with

Ok back the the smsts.log for the client that is failing.  It starts up up fine and even does its initial communication for MPLocation and gets a response

It picks the 1st MP in the list and sends a Client Identity Request. That fails quickly with a timeout error.

While it does retry it only submits the request to one MP. But the retry fails with the same error and the build fails before it even starts. Nothing stands out initially and being a little groggy, I go for the old stand by of turning it off and back on again. MP1 was the once getting the timeouts so it gets the reboot. After the reboot we try and again and get the same error. At this point a couple of hours have passed and the overnight OSD builds are canceled and I grab a quick nap and start again 1st thing in the morning.  Well that was the plan until the day crew starts trying to do OSD build and everything everywhere is failing. So I open a critical case with Microsoft. While waiting for the engineer to call I keep looking at logs try to identify what is going on. I RDP into MP1 to check the iis configuration and notice that the system is slow to launch applications. I take a peek at task manager and see that RPC requests were consuming 75% on the available memory. To reset those connections and to get the system responsive quickly, down it went for another reboot. Once it came back up, I took a chance and tried to start a OSD build. This time it worked. So the good news goes out to the field techs. Now I just need to figure out what happened to explain why. Management always needs to know why and what are you doing to not let it happen again. About this time the Microsoft engineer calls and we lower the case to a normal severity. I capture some logs for him and to his credit he quickly finds that MP1 was experiencing returning a 503.2 iis status when the overnight builds were failing. To reduce the risk of this occurring again we set the connection pool limit to 2000 for the application pool “CCM Server Framework Pool” on the management points. I get the task of monitoring to make sure the issue does not return and we agree to touch base the next day. Well I am curious about what lead to this and how long it has been going on. Going back the the past couple of days I see a clear spike in the 503 errors Monday evening starting with a few thousand and ramping up to over 300,000 by Tuesday morning. While I recommend using log parser to analyze the iis logs if you are just looking for a count of a single status code you can get it with powershell. This will give you the count of the 503 status with a subcode of 2. (Just be sure to update the log file name to the date you are checking.

While I still have not found out why, at least I know what was causing the timeout error. While that knowledge, I finally get some sleep. Surprised that no one called to wake me up because the issue was occurring again, I manage to get into the office early and start looking at the logs again and see another large spike in the 503 errors. I do a quick test to be sure OSD is working and it is. A quick email to the Microsoft engineer and some more log captures leads to an interesting conversation.

We check to make sure that the clients are using all the management points with this sql query

And we see that the clients are using all the management points but MP1 and MP4 have about twice as many clients as the other two management points. next we check the number of web connections bot of these servers have with netstat: in a command prompt

*Just in case you try and run this command in powershell, you will find that the powershell parser will evaluate the quotes and cause the find command to fail. To run the command in powershell escape the quotes.

This showed the MP1 and MP4 were maintaining around 2000 connections each. With a app pool connection limit of 2000 this means that any delay on processing requests  can quickly lead to the application pool being exceeded and lots of 503 errors will result.  So this time the connection pool limit was set to 5000. But a word of caution before you do this in your environment. When a request is waiting in the queue, by default it must complete within 5 minutes or it is kicked out and the request will have to be retried. Be sure that your servers have the CPU and Memory resources to handle additional load that this may cause.

In SCCM 2012 R2 pre SP1 there is no preferred management point. The preferred management points where added in SP1 and improved in Current Branch to be preferred by Boundary Groups. In 2012 your 1st Management point is the preferred MP until the client location process rotates the MP or the client is unable to communicate with a MP for 40 minutes. In this case MP1 is the initial MP for all OSD builds because it is always 1st in a alphabetical sorted list. MP4 is the default MP for the script used for manual client installs. If my migration to Current Branch was done I would be able to assign management point to boundary groups and better balance out the load.  But until then I am tweaking the connection limit on the application pool to keeps things working. Hopefully you are not in the same boat but if you are maybe this can help.


TP 1707 Run Scripts Thoughts

I just finished running kicking the tires on the Configuration Manager Technical Preview 1707 Run Scripts feature. This is a post on my personal thoughts about the functionality and as time moves on and more improvements are made all these thoughts may (and hopefully should) become irrelevant.

This has great potential but also poses several challenges.

  • Currently the results of the script execution are hard to report on for most users. For example, Someone important wants you to run a script and report with systems reported XYZ. To do that you have to query SQL and then parse the results. But if you are not the SCCM Admin you may not have any SQL rights to the database. So building custom reports is in your future until there are some canned ones.
  • Releasing this is a large enterprise is almost a no go until the ability to apply scopes and RBAC goodness is available. I can just imagine explaining this to compliance officers and auditors. Oh the meetings we will have.
  • There seems to be plans to support revisions of scripts but for the moment you have to be perfect or start over. As my wife often reminds me, I am good but not perfect. I need to correct mistakes. As admins we get to update all the other objects if needed, we will need to update scripts.
  • A kill switch. One bad script deployed to everything could suck up resources or do worse on every system. Good review and use of the approval workflow should help but you know someone will do it.

Configuration Manager TP 1707 – Run Scripts

I want to talk a bit about the new Run Script feature that was added in 1706. In Technical Preview 1707 it gained the option to add parameters to a script. This has the potential to huge benefit many users of Config Manager and is a great example of SAAS quickly delivering functionality.

Creating a script if very straight forward for this example it is just a Query of WMI for Win32_ComputerSystem

After the script is created, You must approve it. (There is a hierarchy setting to allow\stop authors to approve their own scripts. This should only be allowed in a test environment. ) After the script has been approved it can be run. To run a script go to a collection with the systems you would like to target. You can run the script against the collection as a whole or individual systems in the collection. (You must show the collection membership to target individual systems. The Run Script  option is not available via the default device view.)

Next select the script to run

To view the results of the script execution you will need to use Script Status in the Monitoring view.

 Any output from the script is stored in Script Output. For a good peak at what is going on behind the scenes check out this great write up from the 1706 TP by Tom Degreef

Now for the new stuff. Parameters!! Create a new script using the same simple wmi query with a parameter.

If you click next you will be able to set the default value for the parameter.

BUG… errr feature alert… If you click next or back without editing the parameter value the edit button is no longer present.

Not to worry you will be able to edit the parameter at run time.

When you run a script with a parameter you get a new dialog that allows you to edit the parameter values.

If you were not able or choose to not set the value when creating the script, click on the parameter name and click edit.  Be sure the parameter name is highlighted or the edit button will not do anything.  I spent a bit thinking  how silly to no be able to edit a parameter more then once. Rechecking my steps proved that was not the case.

Set the parameter value and let the script run.

Hopefully this will get you started with running scripts with parameters