Problem description
The Symantec Management Platform (SMP) and related infrastructure systems, namely the Site Servers, can fail to operate normally when the servers have had no access to the Internet for an extended period of time [1][4].
On the SMP this makes the console extremely slow [2] when loading a new session or opening a new web application (with load delays of up to 30 to 40 seconds). On the Site Servers this prevents the CTDataLoader service, which feeds task execution information back to the SMP, from starting properly [3].
Root cause analysis
The root cause lies in a combination of elements, some of which we stated previously:
- Assemblies in the Global Assembly Cache (GAC) need to be signed. This is a security feature that Microsoft requires to ensure traceability, as assemblies in the GAC can be used by any .NET process.
- The server has had no Internet access for a given amount of time (3 to 4 weeks).
- The process that runs (ctdataloader.exe on the Site Server or w3wp.exe on the SMP) needs to access .NET libraries (DLLs, also known as assemblies in .NET parlance) that reside in the GAC.
Let's unwind this stack of events to explain the issue. Because the .NET process needs to load assemblies from the GAC into memory, the validity of the signing certificate has to be verified. This requires walking the certificate hierarchy to ensure that none of the signing parties has been invalidated (i.e. that neither the CA, the intermediate CA nor the certificate itself has been revoked). But because the server has had no Internet access for some time, the cached CRL (Certificate Revocation List) information is no longer trusted, and the system needs to download the CRL information from the Internet again. Because the look-up fails, the system retries the CRL download at 10-second intervals, a number of times.
These repeated, failing retries cause the delays and start-up failures described above.
Resolution
The resolution currently available from Microsoft is to modify the .NET Framework machine.config to specify that certificate revocation checks should not be attempted.
So far the solution provided involves modifying the file manually; however, when you have 30 or 100 Site Servers this is not a viable option.
This download contains a program (compiled for 32-bit and 64-bit Windows), along with its source code, that automates the modification of the machine.config files on targeted Site Servers (via a Task, for example).
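To make the kind of change involved more concrete, here is a minimal Python sketch of such a machine.config edit. It is not the code of the attached tool. It assumes the setting in question is the .NET Framework `<generatePublisherEvidence enabled="false"/>` element under `<configuration>/<runtime>`, which this article does not name explicitly, and the function name `disable_publisher_evidence` is purely illustrative. The sketch works on the XML text only, so locating each machine.config file and backing it up first is left to the caller.

```python
import xml.etree.ElementTree as ET


def disable_publisher_evidence(config_xml: str) -> str:
    """Return the config text with publisher evidence generation disabled.

    Assumption: the relevant setting is
    <configuration><runtime><generatePublisherEvidence enabled="false"/>.
    Note: this simple parser drops XML comments from the file.
    """
    root = ET.fromstring(config_xml)

    # Find the <runtime> section, or create it if the file has none.
    runtime = root.find("runtime")
    if runtime is None:
        runtime = ET.SubElement(root, "runtime")

    # Find or create the setting, then force it to "false".
    setting = runtime.find("generatePublisherEvidence")
    if setting is None:
        setting = ET.SubElement(runtime, "generatePublisherEvidence")
    setting.set("enabled", "false")

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    sample = "<configuration><runtime /></configuration>"
    print(disable_publisher_evidence(sample))
```

Running the function twice on the same text leaves it unchanged, which matters when the fix is pushed repeatedly to the same server by a scheduled Task.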
Download information
Attached to this download is a zip file containing the following 3 files, listed with their SHA-256 hashes and descriptions:
autofixCRL.exe | BCE406E039BB1B7F645CBF5A765B4D4826301BF702817ECF386D83E46D88C121 | .Net v2.0 exe |
autofixCRL-x64.exe | 9E9D633245E3DE61B06BB903482BCB37A410FDE7E453791F12798C5E28A827AD | .Net v2.0 64-bit exe |
autofixCRL.cs | 3E0A03FF7DD2144314FE4FC28625BEB7352B224B898E4D9509CFAF52ED706A60 | Source code of the tool |
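The hashes above can be checked before the tool is deployed. The following is a minimal Python sketch, assuming the three files have been extracted into the current directory. The helper names `sha256_of` and `verify` are illustrative and not part of the download.

```python
import hashlib

# SHA-256 hashes as published in the table above.
EXPECTED = {
    "autofixCRL.exe": "BCE406E039BB1B7F645CBF5A765B4D4826301BF702817ECF386D83E46D88C121",
    "autofixCRL-x64.exe": "9E9D633245E3DE61B06BB903482BCB37A410FDE7E453791F12798C5E28A827AD",
    "autofixCRL.cs": "3E0A03FF7DD2144314FE4FC28625BEB7352B224B898E4D9509CFAF52ED706A60",
}


def sha256_of(path, chunk_size=65536):
    """Return the upper-case hex SHA-256 digest of the file at path."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest().upper()


def verify(path, expected_hex):
    """Return True when the file's digest matches the expected hash."""
    return sha256_of(path) == expected_hex.upper()


if __name__ == "__main__":
    # Assumes the extracted files sit in the current directory.
    for name, expected in EXPECTED.items():
        try:
            status = "OK" if verify(name, expected) else "MISMATCH"
        except FileNotFoundError:
            status = "missing"
        print(name, status)
```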
Download usage
autofix - a tool to automatically fix the CRL issues on Internet bared servers

Usage: autofix [OPTION] ...

The tool will return 0 on success and -1 in any other case.

Supported command line arguments (all are case insensitive and optional):

    /fixCRL, /fix, /f
        This will modify Win32 and Win64 machine.config files to prevent CRL related issue.

    /revert, /r,
        Restore CRL checks on the machine.config, in case your servers are allowed again on the Internet (or for testing purposes).

    /debug, /d
        Cause the tool to run in debug mode. This will console output unlike during standard invocation.

    /help, --help, /?
        Print out this help message on the console and return.
References
[1] December 17th 2008: http://www.symantec.com/business/support/index?page=content&id=HOWTO9584
[2] June 14th 2012: https://www-secure.symantec.com/connect/articles/using-altiris-profiler-research-blackholes-where-time-disappears-between-events
[3] July 13th 2012: https://www-secure.symantec.com/connect/blogs/things-know-when-your-servers-dont-have-internet-access
[4] July 24th 2012: http://www.symantec.com/business/support/index?page=content&id=TECH192580