Recently in Windows Administration Category

System Administration Lessons Learned from Star Trek

1. "You have to know how things work on a starship..." (Star Trek II)

Kirk's old enemy, Khan, took command of the Reliant, a Federation starship.  When the Reliant approached the Enterprise, Kirk hesitated to raise his shields.  This gave Khan the opportunity to attack and severely damage the defenseless Enterprise.  Kirk retaliated by using his superior knowledge of Federation technology to remotely order the Reliant to drop its shields, allowing Kirk to launch an effective counterattack.  When asked how he knew the strategy would work, Kirk remarked that (if you're the captain) you have to know how things work on a starship.

Similarly, if you're a Windows System Administrator, you have to know how PC hardware works and how Windows itself works if you're going to be very effective.  Since becoming a Windows administrator, I've had to dig deeply into the Registry, crash logs, technical references, and programming guides to solve some of the more challenging issues to come my way.  The more I know about how things work (or how they're supposed to work), the more effective I tend to be.

2. "A no-win situation is the possibility every commander may face." (Star Trek II)

In Starfleet Academy, the Federation tests potential officers by putting them in a simulated situation that they cannot win.  This is done to see how they react to the pressure and inevitable defeat.  Having just "failed" this unpassable test, a cadet asks Kirk why they are put through it.  He says that a no-win situation is a possibility every commander may face at some point (though Kirk himself cheated his way out of it and won the "no-win" scenario).

In system administration, there are problems that will come your way that you simply can't fix.  Maybe it's a system that's been hit by too many viruses, a Registry that's too corrupted to be sorted out, or hardware that just doesn't work.  You can spend hours or days trying to fix a problem like this without ever really solving it.  You have to know when you're facing a "no-win scenario" and cut your losses by walking away from the problem.  That might mean wiping the system and reinstalling everything instead of spending hours correcting a series of problems, tossing out a piece of hardware that "ought to work" but somehow doesn't, or giving up on software that simply doesn't do what it's advertised to do.  

3. "The needs of the many outweigh the needs of the few... or the one." (Star Trek II)

Spock gave up his life at the end of Star Trek II to save the Enterprise and her crew.  When asked by Captain Kirk why he did it, Spock replied that the needs of the many outweighed the needs of the few, or the one.  In other words, Spock knew that by giving up his life he could save many others.

In system administration, you're probably not going to be faced with a "life or death" choice like this, but almost daily you're faced with situations where the needs of your end user community ("the many") dictate the actions you ("the one") take.  For example, you may find yourself at the office after hours, sacrificing your personal time in order to complete a software upgrade, patch a server, or otherwise do something that would inconvenience users if you tried to do it during the work day.  Chances are, you're also "on call" for those same users if they have problems late at night or on the weekend, and you're expected to help them.  The needs of the many, in this case, outweigh your own needs.

4. "Mr. Scott, have you always multiplied your repair estimates by a factor of four?  Certainly, sir. How else can I keep my reputation as a miracle worker?" (Star Trek III)

Mr. Scott admitted to Captain Kirk in Star Trek III that he had always multiplied his repair estimates by a factor of four.  This gave him the opportunity to take all the time he needed to solve a problem, while still completing the task more quickly than the captain had expected.  As a result, he was seen as a miracle worker by Captain Kirk.

In system administration, you're often asked how long something is going to take.  While I don't recommend multiplying your estimate by four, I do believe that you should always practice the principle of "underpromise and over-deliver" when dealing with others.  A task that looks like it should be a one-hour job can easily become a 2-3 hour job if things go wrong, the system begins responding too slowly, an emergency arises that you need to address first, etc.  If you tell someone something will take an hour and you aren't done two hours later, they're angry.  But if you tell them it will take two hours and you're done in 90 minutes, you're a miracle worker.  I'm not suggesting that you make a habit of lying, but rather that you give yourself a little breathing room to allow for things you might have forgotten, things that take longer than expected, or unexpected circumstances.

5. "The fancier the plumbing, the easier it is to stop up the drain." (Star Trek III)

In Star Trek III, Captain Kirk and the crew of the Enterprise essentially "stole" the ship in order to save Spock and Doctor McCoy.  Mr. Scott expected the Federation's newest, fastest, fanciest ship (the Excelsior) to be given the task of pursuing the older, slower Enterprise.  He removed a handful of critical computer chips from the Excelsior's system while working on it, preventing the ship from being able to give chase.  When asked how he managed to sabotage the Excelsior in a way that they didn't detect, he replied that the fancier the plumbing was, the easier it was to stop up the drain.  In other words, the systems on the Excelsior were so complicated that it was easy to screw them up.

System administrators often have several ways to deal with a situation.  Some ways are simpler than others.  You should always be wary of any solution that has too many potential "points of failure".  While an elaborate Perl script might push out an urgent security patch to 10 systems simultaneously from the comfort of your desk chair, you could over-think the script and end up accidentally applying that patch to 100 systems you didn't want to apply it to.  Sometimes it's better to keep things simple, because it can reduce the chance of failure or allow you to respond more quickly.  Similarly, you can "over-engineer" a solution and spend more time architecting something clever than it would have taken to fix the problem manually in a few minutes.
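
One simple guard against that kind of accident is to have the script refuse to run against anything outside an explicitly approved list. The sketch below is in Python rather than Perl, and the function name, the machine names, and the `max_targets` cap are all hypothetical; it only illustrates the idea and is not a complete deployment tool.

```python
def check_targets(requested, approved, max_targets=10):
    """Return the cleaned-up target list, or raise ValueError if it looks unsafe."""
    # Normalize names so "pc-001 " and "PC-001" are treated as the same machine.
    requested = {name.strip().upper() for name in requested if name.strip()}
    approved = {name.strip().upper() for name in approved}

    # Refuse to touch any machine that was not explicitly approved.
    unexpected = requested - approved
    if unexpected:
        raise ValueError("not on the approved list: " + ", ".join(sorted(unexpected)))

    # Refuse suspiciously large runs (the "100 systems instead of 10" case).
    if len(requested) > max_targets:
        raise ValueError(f"{len(requested)} targets exceeds the cap of {max_targets}")

    return sorted(requested)


if __name__ == "__main__":
    approved = ["PC-001", "PC-002", "PC-003"]
    print(check_targets(["pc-001", "PC-003 "], approved))
```

The check is deliberately dull: a few lines that fail loudly are easier to trust than clever target-selection logic.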

6. "Sometimes the needs of the one outweigh the needs of the many." (Star Trek III)

When Spock asked why the crew of the Enterprise had risked their lives and their careers to save him in Star Trek III, Captain Kirk told him that sometimes the needs of the one outweigh the needs of the many.  In other words, Spock was their friend and they were willing to risk themselves because he meant more to them than their lives or careers.

In systems administration tasks, sometimes you have to do things that make a lot of people very unhappy.  For example, when pushing out security patches it is often necessary to reboot someone's PC to complete the installation.  Naturally, if that person has documents open in Microsoft Office when you reboot them, they're not going to be happy about it.  Multiply that over a large organization, and that simple reboot action can upset a lot of people.  However, as a system administrator, you're responsible for protecting your network from malware.  While "the many" users' needs may dictate that their PCs not be rebooted, your responsibility as "the one" who protects the network must outweigh theirs.  This is not to say that you're more important, or that you should be fine with mid-day reboots as a matter of practice, but rather that there will be times in the job where you've got to risk the wrath of the users for a greater good.

7. "Perhaps 'because it is there' is not sufficient reason for climbing a mountain."  (Star Trek V)

In Star Trek V, Captain Kirk is attempting to climb a mountain when he slips and begins to fall off.  Spock saves him at the last second.  Later, Spock tells Kirk that perhaps "because it is there" isn't a good enough reason to risk your life climbing a mountain.

There are times in system administration when there is something that you can technically do, but which isn't a good idea when examined more closely.  Maybe you have a script that could update all the company's computers with the latest Windows Service Pack overnight.  You might even be tempted to do it, since management keeps asking when you're going to get the job done.  However, just because you can roll that Service Pack out in a heartbeat doesn't mean that's the right thing to do.  You could come in the next morning and find out that the Service Pack you pushed out last night broke the salespeople's contact management software, the accountants' general ledger program, and the CEO's favorite screensaver.  Suddenly, instead of being the miracle worker you thought you were going to be, you're on everyone's hit list.  There are times in system administration when caution is needed, and experience will often help you know when climbing the proverbial mountain is a good idea and when it isn't.

8. "An ancestor of mine maintained that if you eliminate the impossible, whatever remains, however improbable, must be the truth." (Star Trek VI)

In Star Trek VI, when attempting to figure out who assassinated the Klingon Chancellor, Spock began investigating his shipmates to identify the assassins.  When he came up with a seemingly incredible solution, he uttered the famous line above (which is paraphrased from Sherlock Holmes).

System administrators are often called upon to troubleshoot the strangest problems.  Sometimes the solution to those problems can be counterintuitive, and may even sound "impossible".  Here's a real-life example from my Windows 98 days.  The company had just implemented a new application in the Marketing and Finance areas.  For some reason, the laptop users in Marketing were getting a lot of "out of memory" errors when trying to use the application.  They requested more RAM.  We installed it.  The out of memory errors became even more frequent.  I started doing some research online and learned about a table kept by Windows 98 that was used to manage the available RAM.  My research indicated that the table had a fixed size and under certain conditions could "fill up" on the user.  One way you could free up space in this table was to remove some RAM.  I tried this on the Marketing laptops and, sure enough, the "out of memory" errors went away.  So, as impossible as it might seem, removing memory from the machines cleared up an "out of memory" error.

9. "People can be very frightened of change." (Star Trek VI)

In Star Trek VI, the Klingons suffered an environmental disaster that threatened to destroy their civilization.  As a result, they sought peace with the Federation, a change from their long-standing policy of conflict and subjugation.  In both the Federation and the Klingon Empire, there were people who had hated their rivals so much, and for so long, that the prospect of peace between the two governments was something they couldn't stomach.  It was said that such people were frightened of change (the coming peace).

This is very true in the Information Technology (IT) world.  When system administrators are about to make any kind of significant change, they're often required to document, justify, explain, and test the change well in advance of making it.  Inevitably, you will eventually change something that causes a problem.  Perhaps some Excel macros quit working after you upgrade Microsoft Office, or the new version of Internet Explorer doesn't work with an application used in Human Resources.  Those unfortunate consequences tend to make organizations as a whole resistant to change, even fearful of it.  As a system administrator, one of your responsibilities is to introduce change in a manner that allows you to control the potential negative impacts.

When we planned to roll out Windows XP Service Pack 2 (a while ago), I helped test as many of the applications used around the company as possible.  I would try to identify if Firewall changes would be needed, if the application required one of the "compatibility mode" options, if it would need to be patched, etc.  The point of all the hours I put in doing those things was to minimize the disruptive effects of upgrading to Windows XP Service Pack 2.  By all accounts, our hard work paid off and there were few, if any, complaints once the software began rolling out across the organization.


10. "One of the advantages of being a captain, Doctor, is being able to ask for advice without necessarily having to take it." (ST:TOS "Dagger of the Mind")

In the original Star Trek series, Captain Kirk often sought the advice of his senior officers.  Even though he sought their advice on how to deal with a problem, he did not always heed it.

Systems administrators typically work in teams.  Members of teams typically have one or more areas of expertise, and other areas where their expertise may be less extensive.  As a member of the team, you should always be willing to seek the advice of your teammates when you're about to do anything that might reflect negatively on the team if it goes wrong.  Just because you ask for a teammate's advice, however, doesn't mean you have to follow it.  Sometimes your own expertise or experience may "trump" the advice of a teammate, however well-intentioned and intelligent the advice might be.  The key lies in knowing when to take advice and when to ignore it, which is something you learn with time and experience.

11. "Power is danger." (ST:TOS "Balance of Terror")

A commonly uttered security mantra is that you should give users only the amount of administrative ability necessary for them to do their jobs, and no more.  If users don't have a business need for administrator access to their systems, they shouldn't have it.  In this way, if those same users introduce malware to your network via an infected floppy, CD, USB key, etc., that malware will have a hard time spreading.  Having no administrator access will also prevent them from installing unauthorized or pirated software, shutting off their computer's firewall, or doing other things that could compromise the security and stability of your network.

Similarly, as a system administrator you should always be careful and deliberate with your actions when you're using administrator permission on a machine.  Don't do indiscriminate web browsing with the administrator account.  Don't run untested scripts against lots of end user machines.  Don't delete files you aren't sure about.  In short, recognize that your "godlike" powers over the computer make you dangerous, and always use those powers sparingly and carefully.

12. "Leave bigotry in your quarters; there's no room for it on the bridge." (ST:TOS "Balance of Terror")

System administrators tend to be the kind of people who like to tinker with things.  Even though we may be Mac administrators, we dabble in Windows or Linux.  If we're Linux administrators, we can't resist the urge to fiddle with a script on OS X or a batch file on Windows.  Because we have a lot of experience, we can sometimes become opinionated about technology, to the point of bigotry.  In a corporate setting, this kind of bigotry can be suicidal.  If your response to every Windows problem you're asked to resolve is to launch into a lecture about how this wouldn't be a problem on the Mac, you're in the wrong job.  Unless they happen to ask for them, users don't want your opinions about the technology they're using.  Most of them couldn't care less whether they're using Windows, OS X, Linux, or something else.  They just want to do their jobs, and they need you to fix the problem that's keeping them from working.  You may have a long list of reasons why the company should dump Windows and move to Linux or OS X. They might be very intelligent, objective, and thoughtful reasons.  But if you're being paid to administer Windows, you should keep those opinions to yourself unless asked for them. Otherwise you'll just create unrest and friction with your co-workers, and that doesn't help anyone.

13. "The more complex the mind, the greater the need for the simplicity of play." (ST:TOS "Shore Leave")

Most people adorn their offices with a few well-chosen artifacts.  Perhaps they're pictures of loved ones, awards they've won, or souvenirs from their travels.  System administrators have those things too, but they also tend to like little toys.  For example, I've often got a netbook, an MP3 player, and some other gizmo keeping me company.  They might be expensive gadgets to other people, but they're fun toys to me, and it helps me to reduce my stress to play around with them occasionally... such as on my lunch hour.  Systems administrators tend to be fun, playful, and funny people (once you get to know them).  The complex web of information we have to master and use on a daily basis tends to make us seek out "fun" when we're not working or need a break.

14. "Insufficient facts always invite danger." (ST:TOS "Space Seed")

In the original Star Trek, Captain Kirk freed Khan Noonien Singh and his crew from an extended hibernation.  Khan and his crew were evasive about who they were and what they were doing on the ship they were rescued from.  Both Spock and Kirk did their best to extract information from them, but got very little.  Kirk noticed that Spock seemed uncomfortable with their new guests.  When asked why, Spock explained that the crew knew very little about their guests, and that this lack of knowledge could be dangerous.  Later, Khan and his crew attempted to take control of the Enterprise.  Spock was right not to trust them.

System administrators who are willing to jump in and start working with something they know little about often learn through (bad) experience to become more cautious.  In handling security patches, for instance, I'm very careful.  When a new patch comes in, I have no way of knowing if that patch will break a critical business system, prevent systems from booting up, or force a reboot in the middle of the CEO's presentation to the executive board.  Before I release the patch to anyone else, I try it on my own system first to see how it behaves.  I then try it on the systems of my teammates and nearby co-workers.  If it doesn't cause a problem for them, I begin slowly fanning it out to the rest of the company.  Once I learn that the patch seems harmless I will then allow it to make its way on to large numbers of computers.  I make every effort to learn as much as I can about the patch before letting it "run loose" on the network.
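
The staged approach described above can be sketched as a list of "rings" that are patched in order, stopping as soon as anything goes wrong. This is an illustrative Python sketch, not the tooling actually used for those rollouts; the function names, the `batch_size` default, and the `install_and_verify` callback are all assumptions made for the example.

```python
def rollout_rings(all_machines, my_machine, team_machines, batch_size=50):
    """Split machines into rings: my own PC, my team, then everyone else in batches."""
    team = sorted(set(team_machines) - {my_machine})
    already_planned = {my_machine, *team}
    rest = sorted(m for m in set(all_machines) if m not in already_planned)

    rings = [[my_machine], team]
    for start in range(0, len(rest), batch_size):
        rings.append(rest[start:start + batch_size])
    return [ring for ring in rings if ring]  # drop empty rings


def deploy(rings, install_and_verify):
    """Patch ring by ring; stop at the first machine that reports a problem.

    install_and_verify(machine) is a caller-supplied function that returns
    True when the patch installed cleanly and the machine still looks healthy.
    Returns (machines_patched, machine_that_failed_or_None).
    """
    patched = []
    for ring in rings:
        for machine in ring:
            if not install_and_verify(machine):
                return patched, machine  # halt: later rings are never touched
            patched.append(machine)
    return patched, None


if __name__ == "__main__":
    rings = rollout_rings(["A", "B", "C", "D", "E"], "A", ["B"], batch_size=2)
    print(rings)
    print(deploy(rings, lambda machine: machine != "D"))
```

In practice you would also want a waiting period between rings, since some problems only show up after a reboot or a day of normal use.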


15. "Either one of us, by himself, is expendable. Both of us are not." (ST:TOS "The Devil in the Dark")

It's not uncommon in system administration for there to be one person who handles a specific task, with another person serving as backup to that person.  The logic is to ensure that if the primary person gets sick, goes on vacation, takes a job elsewhere, or is hit by a bus, the team can continue to do the things it is responsible for.  Having the primary and the backup out of the office at the same time is a bad idea and should be avoided if at all possible.  Inevitably, the day you're both out of the office there will be a major crisis in your area of expertise, and no one there who can resolve the problem.

16. "If I can have honesty, it's easier to overlook mistakes." (ST:TOS "Space Seed")

Sooner or later, you're going to make a mistake.  Maybe you accidentally deleted some critical files from a server.  Maybe you meant to adjust the firewall settings and ended up turning it off.  It might be something relatively minor, or heart-stoppingly major.  Whatever mistake you make, be willing to own up to it.  There's nothing to be gained by lying to your teammates or management to cover up a mistake.  If you own up to your mistakes, people will respect and trust you.  If you lie about them, they soon realize they can't rely on you and begin to resent you for the time they spend uncovering the truth.  Demand honesty from your coworkers, but deliver it in return.

17. "No one can guarantee the actions of another." (ST:TOS "Day of the Dove")

As part of system administration, or indeed any job, it can be necessary to make assumptions about how people will react to something and predict how they'll deal with it.  But just because a particular reaction seems logical, reasonable, and expected, don't assume everyone will do it.  Always make allowances in your plans, your scripts, and your procedures for your end users to do the illogical, unexpected, and "wrong thing at the wrong time".  Build in the safeguards you can to prevent as many problems as you reasonably can, but realize that no matter how hard you try, there's likely to be someone who does something you didn't plan for.

VBScript to Determine a PC's Need for a Reboot

From time to time in Windows administration and patch management, it's necessary to know whether a machine you're about to do something to is waiting on a reboot. When an installer program needs to replace a file that's in use, it can't do so immediately, so it places the new file on the disk under a temporary name and writes a value to the Windows Registry indicating that the file needs to be renamed at the next reboot. Therefore, if you want to detect whether a given machine needs a reboot in order to complete the work of a previously-applied hotfix, patch, or software install, you can look at that value in the Registry to see if there's any work to be done on the next reboot. If there is, the machine needs a reboot. If there's nothing there, no file replacements are waiting on one.

The Registry value you need to examine is a Multi-String (REG_MULTI_SZ) value called, aptly enough, "PendingFileRenameOperations", located under the following Registry key:

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager

Below is a sample VBScript to perform a test of the local or a remote machine to see if a reboot is needed based on the PendingFileRenameOperations value. The script must be run with Administrator permission on the system to be checked. If run without Administrator permission, the script will be unable to connect with the remote machine and an error will be displayed.

When executed, the script prompts for the name of a PC on the network, which can be the PC you're using at the time. If no PC name is entered, the script aborts. Otherwise, it makes a Windows Management Instrumentation (WMI) call to the Registry provider on the remote machine and requests the contents of the PendingFileRenameOperations value. If the value exists and contains data, that PC requires a reboot. If the value is empty or isn't there, then the PC does not require a reboot. A message is displayed for the user indicating whether or not the machine in question needs to be rebooted.

I hope you'll find the script useful.

dim oReg

'
' Set a constant we'll use later
'
Const HKEY_LOCAL_MACHINE = &H80000002

'
' Ask the user for a PC name to check and abort if they
' don't give us one.
'
strComputer = Trim(InputBox("Which PC do you want to check?", _
                   "Reboot Need Checker"))

If strComputer = "" Then
  WScript.Quit
End If

'
' Use the Windows Management Instrumentation (WMI) capability
' to connect to the remote computer's Registry provider.
'
on error resume next
set oReg = GetObject("winmgmts:{impersonationLevel=impersonate}!\\" & _
           strComputer & "\root\default:StdRegProv")

If Err.Number <> 0 Then
   MsgBox "Could not connect with WMI to PC " & strComputer & _
          "'s Registry.", vbOKOnly, "ERROR!"
   wscript.quit
End If

'
' Use the WMI Registry Provider to look up the reboot status in
' the remote PC's Registry. Display an error if we can't do it.
'
strKeyPath = "SYSTEM\CurrentControlSet\Control\Session Manager"
strValueName = "PendingFileRenameOperations"

oReg.GetMultiStringValue HKEY_LOCAL_MACHINE, _
                         strKeyPath, _
                         strValueName, _
                         arrValues

If Err.Number <> 0 Then
   MsgBox "Could not read reboot status for the PC " & _
          strComputer, vbOKOnly, "ERROR!"
   WScript.Quit
End If
    
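'
' GetMultiStringValue fills arrValues with an array of strings when the
' PendingFileRenameOperations value exists, and leaves it Null when it
' does not. A non-empty array means rename operations are waiting on a
' reboot. (Comparing the array to a number, as in "arrValues > 0",
' raises a type mismatch that On Error Resume Next would silently hide.)
'
needsReboot = False
If IsArray(arrValues) Then
   If UBound(arrValues) >= 0 Then needsReboot = True
End If

If needsReboot Then
   MsgBox strComputer & " requires a reboot at this time.", _
          vbOKOnly, "Reboot Needed"
Else
   MsgBox strComputer & " does not require a reboot.", _
          vbOKOnly, "No Reboot Needed"
End If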
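
For anyone curious about what the value actually holds: as documented for the Windows MoveFileEx function, the strings come in pairs, a source path followed by a destination path, with an empty destination meaning "delete the source at reboot". The Python sketch below (separate from the VBScript above) pairs them up from a list of strings you have already retrieved. The function names are hypothetical, the handling of the `\??\` and `!` prefixes is an assumption about the usual format, and whether a given Registry API hands back the empty strings intact is something to verify on your own systems.

```python
NT_PREFIX = "\\??\\"  # paths are typically stored in the form \??\C:\...


def clean_path(entry):
    # Drop a leading "!" flag and the NT path prefix, if present.
    if entry.startswith("!"):
        entry = entry[1:]
    if entry.startswith(NT_PREFIX):
        entry = entry[len(NT_PREFIX):]
    return entry


def pending_operations(values):
    """Turn the flat list of strings into (source, destination) pairs.

    destination is None when the source is to be deleted at reboot.
    """
    if not values:  # None (value missing) or an empty list
        return []
    ops = []
    for i in range(0, len(values), 2):
        source = clean_path(values[i])
        dest = clean_path(values[i + 1]) if i + 1 < len(values) else ""
        ops.append((source, dest or None))
    return ops


def needs_reboot(values):
    """Same test the VBScript performs: is any rename work queued?"""
    return len(pending_operations(values)) > 0


if __name__ == "__main__":
    sample = [
        r"\??\C:\Windows\Temp\a.tmp", r"!\??\C:\Windows\System32\a.dll",
        r"\??\C:\Temp\old.log", "",
    ]
    for source, dest in pending_operations(sample):
        print(source, "->", dest if dest else "(delete)")
```

Listing the pairs can be handy when you want to know which installer left work behind, not just whether a reboot is pending.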

The Chrome Browser - Google's First "Evil"?

One of the things Google is famous for is a saying along the lines of "don't do anything evil", which is meant to sum up their attitude as a company. Earlier today, they released the "Chrome" web browser for Windows, which takes a new approach to how browsers should work and was designed from the ground up to handle web-based applications.

Having spent a few minutes with the browser, and keeping in mind it's a beta, I'm reasonably impressed. It seemed to be quick, properly rendered the pages I pointed it to, and didn't gobble up lots of system resources in the process. However, being a Windows administrator, I have a couple of problems with it.

Chrome doesn't install in the typical "C:\Program Files" location where (by default) applications are supposed to be installed. Instead, Chrome installs in the "C:\Documents and Settings" directory for the person who runs the installer. That's weird, and not something I'd expect from Google. Still, in and of itself, it's not exactly "evil".

The "evil" thing about Chrome is not just that it ignores the "C:\Program Files" default installation location (and doesn't let the person installing it change that location). Because it installs in the "C:\Documents and Settings" directory, it bypasses the normal protections against unauthorized users installing software on a system. Normally, a user requires administrator permission to install a software package like Internet Explorer, Firefox, or OpenOffice.org. Corporations rely on this to ensure their systems contain only licensed, authorized software. They rely on it to prevent unauthorized and potentially dangerous software from making it onto their systems. Using "Documents and Settings" as a way to get around these protections is, in my view, pretty "evil" and certainly beneath Google.

Open Source Windows System Management

There are quite a few commercial systems management products out there for Windows. As with any product space, each has its strengths and weaknesses. Altiris, for example, offers incredible power. LanDesk may lack some of that power, but is far easier to use. As far as I know, there's no comparable systems management suite consisting primarily of open source software. I'm considering changing that situation.

In the past couple of years, I've begun learning a lot of new things about scripting for systems administration, deploying patches, repackaging and deploying software, and generally maintaining the health of systems on a network. I've shared bits of that knowledge here, as I've had the time and desire to write them up. But I've never taken things to the "next level" and actually converted that knowledge into a usable tool set.

For example, I have a DOS batch script which will deploy a specific Microsoft patch to a specific computer from the command line. I have another script which can simultaneously execute a command on multiple systems. Another set of scripts will run a CHKDSK on a remote system, examine the output, determine if any "significant" errors exist, instruct the system to repair errors on the next reboot, and reboot the system. Other scripts can check for impending disk failure, low disk space conditions, etc. Taken as a whole, these scripts would be useful for a small shop (say, 1000 PCs or fewer) to manage their systems. Extended a bit, they could probably handle a larger network of machines.
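
To give a flavor of how small such a check can be, here is a minimal Python sketch of a low disk space test. It is not one of the batch scripts mentioned above; the function names and the 10% threshold are assumptions chosen for illustration, and it checks a local path only.

```python
import shutil


def is_low(free_bytes, total_bytes, min_free_fraction=0.10):
    """True when the free share of the disk falls below the threshold."""
    if total_bytes <= 0:
        return False  # nothing sensible to report for a zero-size volume
    return free_bytes / total_bytes < min_free_fraction


def is_low_on_space(path, min_free_fraction=0.10):
    """Check the volume that holds `path` using the standard library."""
    usage = shutil.disk_usage(path)
    return is_low(usage.free, usage.total, min_free_fraction)


if __name__ == "__main__":
    print("Low on space:", is_low_on_space("."))
```

Keeping the arithmetic in its own function makes the threshold logic easy to test without needing a nearly full disk.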

Because I'm starting to get the "itch" to create something, I'm toying with the idea of developing my own equivalent of an Altiris or LanDesk that's built using free or freely-available software. That way, the small organization with 20-150 PCs can manage their system like the bigger shops. And the bigger shops who may not have the money for one of the commercial products can still reap the benefits of automated systems management, without the expense.

This is still just the germ of an idea in my head. My existing scripts are too site-specific and undocumented to be widely used without a lot of tweaking. And heck, I may not even have the programming and scripting skill needed to pull off some of the things I would consider critical to such a tool. (For example, minimizing network bandwidth usage by transmitting a software package to one machine on a subnet, then transmitting the package from that machine to others on the same subnet, might be more than I know how to accomplish.)
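
The planning half of that subnet idea is fairly approachable, even if the actual file transfer is not. The Python sketch below only works out who should send the package to whom; it does not copy anything. It assumes every subnet is a /24 (the `prefix_len` default), that you already have a name-to-IP table, and that "SERVER" stands in for the central distribution point. All of those names and values are hypothetical.

```python
import ipaddress
from collections import defaultdict


def plan_distribution(hosts, prefix_len=24, server="SERVER"):
    """Return a list of (sender, receiver) transfers.

    hosts maps machine name -> IPv4 address string. One machine per subnet
    (the first by name) receives the package from the server; every other
    machine on that subnet receives it from that local "seed" machine.
    """
    by_subnet = defaultdict(list)
    for name, ip in hosts.items():
        subnet = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
        by_subnet[subnet].append(name)

    plan = []
    for subnet in sorted(by_subnet):
        names = sorted(by_subnet[subnet])
        seed = names[0]
        plan.append((server, seed))      # one copy crosses the wider network
        for other in names[1:]:
            plan.append((seed, other))   # the rest stay on the local subnet
    return plan


if __name__ == "__main__":
    hosts = {"pc-a": "10.0.1.5", "pc-b": "10.0.1.9", "pc-c": "10.0.2.7"}
    for sender, receiver in plan_distribution(hosts):
        print(sender, "->", receiver)
```

A real tool would also need to pick a seed that is actually powered on and to fall back to the server when the seed fails, which is where most of the difficulty lies.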

Still, it's fun to think about...