Archive for November, 2008

System Administration Lessons Learned from Star Trek

November 25th, 2008

1. “You have to know how things work on a starship…” (Star Trek II)

Kirk’s old enemy, Khan, took command of the Reliant, a Federation starship.  When the Reliant approached the Enterprise, Kirk hesitated to raise his shields.  This gave Khan the opportunity to attack and severely damage the defenseless Enterprise.  Kirk retaliated by using his superior knowledge of Federation technology to remotely order the Reliant to drop its shields, allowing Kirk to launch an effective counterattack.  When asked how he knew the strategy would work, Kirk remarked that (if you’re the captain) you have to know how things work on a starship.

Similarly, if you’re a Windows System Administrator, you have to know how PC hardware works and how Windows itself works if you’re going to be very effective.  Since becoming a Windows administrator, I’ve had to dig deeply into the Registry, crash logs, technical references, and programming guides to solve some of the more challenging issues to come my way.  The more I know about how things work (or how they’re supposed to work), the more effective I tend to be.

2. “A no-win situation is the possibility every commander may face.” (Star Trek II)

In Starfleet Academy, the Federation tests potential officers by putting them in a simulated situation that they cannot win.  This is done to see how they react to the pressure and inevitable defeat.  Having just “failed” this unpassable test, a cadet asks Kirk why they are put through it.  He says that a no-win situation is a possibility every commander may face at some point (though Kirk himself cheated his way out of it and won the “no-win” scenario).

In system administration, there are problems that will come your way that you simply can’t fix.  Maybe it’s a system that’s been hit by too many viruses, a Registry that’s too corrupted to be sorted out, or hardware that just doesn’t work.  You can spend hours or days trying to fix a problem like this without ever really solving it.  You have to know when you’re facing a “no-win scenario” and cut your losses by walking away from the problem.  That might mean wiping the system and reinstalling everything instead of spending hours correcting a series of problems, tossing out a piece of hardware that “ought to work” but somehow doesn’t, or giving up on software that simply doesn’t do what it’s advertised to do.

3. “The needs of the many outweigh the needs of the few… or the one.” (Star Trek II)

Spock gave up his life at the end of Star Trek II to save the Enterprise and her crew.  When asked by Captain Kirk why he did it, Spock replied that the needs of the many outweighed the needs of the few, or the one.  In other words, Spock knew that by giving up his life he could save many others.

In system administration, you’re probably not going to be faced with a “life or death” choice like this, but almost daily you’re faced with situations where the needs of your end user community (“the many”) dictate actions you (“the one”) take.  For example, you may find yourself at the office after hours, sacrificing your personal time in order to complete a software upgrade, patch a server, or otherwise do something that would inconvenience users if you tried to do it during the work day.  Chances are, you’re also “on call” to help those same users if they have problems late at night or on the weekend.  The needs of the many, in this case, outweigh your own needs.

4. “Mr. Scott, have you always multiplied your repair estimates by a factor of four?  Certainly, sir. How else can I keep my reputation as a miracle worker?” (Star Trek III)

Mr. Scott admitted to Captain Kirk in Star Trek III that he had always multiplied his repair estimates by a factor of four.  This gave him the opportunity to take all the time he needed to solve a problem, while still completing the task more quickly than the captain had expected.  As a result, he was seen as a miracle worker by Captain Kirk.

In system administration, you’re often asked how long something is going to take.  While I don’t recommend multiplying your estimate by four, I do believe that you should always practice the principle of “underpromise and over-deliver” when dealing with others.  A task that looks like it should be a one-hour job can easily become a 2-3 hour job if things go wrong, the system begins responding too slowly, an emergency arises that you need to address first, etc.  If you tell someone something will take an hour and you aren’t done two hours later, they’re angry.  But if you tell them it will take two hours and you’re done in 90 minutes, you’re a miracle worker.  I’m not suggesting that you make a habit of lying, but rather that you give yourself a little breathing room to allow for things you might have forgotten, things that take longer than expected, or unexpected circumstances.
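As a rough illustration of building in that breathing room, here is a minimal Python sketch.  The function name `padded_estimate`, the default buffer of 1.5, and the 15-minute rounding are my own assumptions for illustration, not a rule; pick numbers that match your own track record.

```python
import math


def padded_estimate(best_guess_minutes, buffer_factor=1.5, round_to=15):
    """Return a quoted estimate in minutes with breathing room added.

    best_guess_minutes: what you think the job takes if nothing goes wrong.
    buffer_factor: hypothetical multiplier (1.5 here, not Scotty's 4).
    round_to: round the result up to the next multiple of this many minutes.
    """
    if best_guess_minutes <= 0:
        raise ValueError("best_guess_minutes must be positive")
    if buffer_factor < 1:
        raise ValueError("buffer_factor must be at least 1")
    padded = best_guess_minutes * buffer_factor
    return math.ceil(padded / round_to) * round_to


print(padded_estimate(60))     # a "one-hour job" quoted as 90 minutes
print(padded_estimate(60, 2))  # the same job quoted as two hours
```

The point is not the arithmetic, which is trivial, but the habit: the number you quote should come from your best guess plus a deliberate allowance, not from the best guess alone.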

5. “The fancier the plumbing, the easier it is to stop up the drain.” (Star Trek III)

In Star Trek III, Captain Kirk and the crew of the Enterprise essentially “stole” the ship in order to save Spock and Doctor McCoy.  Mr. Scott expected the Federation’s newest, fastest, fanciest ship (the Excelsior) to be given the task of pursuing the older, slower Enterprise.  He removed a handful of critical computer chips from the Excelsior’s system while working on it, preventing the ship from being able to give chase.  When asked how he managed to sabotage the Excelsior in a way that they didn’t detect, he replied that the fancier the plumbing was, the easier it was to stop up the drain.  In other words, the systems on the Excelsior were so complicated that it was easy to screw them up.

System administrators often have several ways to deal with a situation.  Some ways are simpler than others.  You should always be wary of any solution that has too many potential “points of failure”.  While an elaborate Perl script might push out an urgent security patch to 10 systems simultaneously from the comfort of your desk chair, you could over-think the script and end up accidentally applying that patch to 100 systems you didn’t want to apply it to.  Sometimes it’s better to keep things simple, because it can reduce the chance of failure or allow you to respond more quickly.  Similarly, you can “over engineer” a solution to a problem and spend more time architecting a clever solution to something you could fix manually in a few minutes.
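One simple safeguard against the “10 systems became 100” mistake is to make the script refuse to run when its target list doesn’t match what you intended.  The sketch below is in Python rather than Perl, and everything in it (the `check_targets` function, the hostnames, the approved list) is hypothetical and made up for illustration.

```python
def check_targets(targets, expected_count, approved):
    """Return the target list only if it matches what you meant to patch.

    targets: hostnames the script is about to touch.
    expected_count: how many machines you intended to patch.
    approved: hostnames that are allowed to receive this patch.
    """
    unique = sorted(set(targets))
    if len(unique) != expected_count:
        raise RuntimeError(
            f"expected {expected_count} targets, got {len(unique)}; aborting"
        )
    unapproved = [host for host in unique if host not in approved]
    if unapproved:
        raise RuntimeError(f"not on the approved list: {unapproved}")
    return unique


approved_hosts = {f"pc-{n:03d}" for n in range(1, 11)}  # made-up hostnames

# A selection that quietly matched 100 machines instead of 10 is refused.
too_many = [f"pc-{n:03d}" for n in range(1, 101)]
try:
    check_targets(too_many, expected_count=10, approved=approved_hosts)
except RuntimeError as err:
    print("refused:", err)

# The intended 10 machines pass the check.
print(check_targets(sorted(approved_hosts), 10, approved_hosts))
```

A check like this adds only a few lines, so it doesn’t turn a simple script into fancy plumbing, and it fails loudly before anything gets patched.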

6. “Sometimes the needs of the one outweigh the needs of the many.” (Star Trek III)

When Spock asked why the crew of the Enterprise had risked their lives and their careers to save him in Star Trek III, Captain Kirk told him that sometimes the needs of the one outweigh the needs of the many.  In other words, Spock was their friend and they were willing to risk themselves because he meant more to them than their lives or careers.

In systems administration tasks, sometimes you have to do things that make a lot of people very unhappy.  For example, when pushing out security patches it is often necessary to reboot someone’s PC to complete the installation.  Naturally, if that person has documents open in Microsoft Office when you reboot them, they’re not going to be happy about it.  Multiply that over a large organization, and that simple reboot action can upset a lot of people.  However, as a system administrator, you’re responsible for protecting your network from malware.  While “the many” users’ needs may dictate that their PCs not be rebooted, your responsibility as “the one” who protects the network must outweigh theirs.  This is not to say that you’re more important, or that you should be fine with mid-day reboots as a matter of practice, but rather that there will be times in the job where you’ve got to risk the wrath of the users for a greater good.

7. “Perhaps ‘because it is there’ is not sufficient reason for climbing a mountain.”  (Star Trek V)

In Star Trek V, Captain Kirk is attempting to climb a mountain when he slips and begins to fall off.  Spock saves him at the last second.  Later, Spock tells Kirk that perhaps “because it is there” isn’t a good enough reason to risk your life climbing a mountain.

There are times in system administration where there is something that you can technically do, but which isn’t a good idea when examined more closely.  Maybe you have a script that could update all the company’s computers with the latest Windows Service Pack overnight.  You might even be tempted to do it, since your management’s asking you about when you’re going to get the job done.  However, just because you can roll that Service Pack out in a heartbeat doesn’t mean that’s the right thing to do.  You could come in the next morning and find out that the Service Pack you pushed out last night broke the salespeople’s contact management software, the accountants’ general ledger program, and the CEO’s favorite screensaver.  Suddenly, instead of being the miracle worker you thought you were going to be, you’re on everyone’s hit list.  There are times in system administration when caution is needed, and experience will often help you know when climbing the proverbial mountain is a good idea and when it isn’t.

8. “An ancestor of mine maintained that if you eliminate the impossible, whatever remains, however improbable, must be the truth.” (Star Trek VI)

In Star Trek VI, when attempting to figure out who assassinated the Klingon Chancellor, Spock began investigating his shipmates to identify the assassins.  When he came up with a seemingly incredible solution, he uttered the famous line above (which is paraphrased from Sherlock Holmes).

System administrators are often called upon to troubleshoot the strangest problems.  Sometimes the solution to those problems can be counterintuitive, and may even sound “impossible”.  Here’s a real-life example from my Windows 98 days.  The company had just implemented a new application in the Marketing and Finance areas.  For some reason, the laptop users in Marketing were getting a lot of “out of memory” errors when trying to use the application.  They requested more RAM.  We installed it.  The out of memory errors became even more frequent.  I started doing some research online and learned about a table kept by Windows 98 that was used to manage the available RAM.  My research indicated that the table had a fixed size and under certain conditions could “fill up” on the user.  One way you could free up space in this table was to remove some RAM.  I tried this on the Marketing laptops and, sure enough, the “out of memory” errors went away.  So, as impossible as it might seem, removing memory from the machines cleared up an “out of memory” error.

9. “People can be very frightened of change.” (Star Trek VI)

In Star Trek VI, the Klingons suffered an environmental disaster that threatened to destroy their civilization.  As a result, they sought peace with the Federation, a change from their long-standing policy of conflict and subjugation.  In both the Federation and the Klingon Empire, there were people who had hated their rivals so much, and for so long, that the prospect of peace between the two governments was something they couldn’t stomach.  It was said that such people were frightened of change (the coming peace).

This is very true in the Information Technology (IT) world.  When system administrators are about to make any kind of significant change, they’re often required to document, justify, explain, and test the change well in advance of making it.  Sooner or later, you will change something that causes a problem.  Perhaps some Excel macros quit working after you upgrade Microsoft Office, or the new version of Internet Explorer doesn’t work with an application used in Human Resources.  Those unfortunate consequences tend to make organizations as a whole resistant to change, even fearful of it.  As a system administrator, one of your responsibilities is to introduce change in a manner that allows you to control the potential negative impacts.

When we planned to roll out Windows XP Service Pack 2 (a while ago), I helped test as many of the applications used around the company as possible.  I would try to identify if Firewall changes would be needed, if the application required one of the “compatibility mode” options, if it would need to be patched, etc.  The point of all the hours I put in doing those things was to minimize the disruptive effects of upgrading to Windows XP Service Pack 2.  By all accounts, our hard work paid off and there were few, if any, complaints once the software began rolling out across the organization.

10. “One of the advantages of being a captain, Doctor, is being able to ask for advice without necessarily having to take it.” (ST:TOS “Dagger of the Mind”)

In the original Star Trek series, Captain Kirk often sought the advice of his senior officers.  Even though he sought their advice on how to deal with a problem, he did not always heed it.

Systems administrators typically work in teams.  Members of teams typically have one or more areas of expertise, and other areas where their expertise may be less extensive.  As a member of the team, you should always be willing to seek the advice of your teammates when you’re about to do anything that might reflect negatively on the team if it goes wrong.  Just because you ask for a teammate’s advice, however, doesn’t mean you have to follow it.  Sometimes your own expertise or experience may “trump” the advice of a teammate, however well-intentioned and intelligent the advice might be.  The key lies in knowing when to take advice and when to ignore it, which is something you learn with time and experience.

11. “Power is danger.” (ST:TOS “Balance of Terror”)

A commonly uttered security mantra is that you should give users only the amount of administrative ability necessary for them to do their jobs, and no more.  If users don’t have a business need for administrator access to their systems, they shouldn’t have it.  In this way, if those same users introduce malware to your network via an infected floppy, CD, USB key, etc., that malware will have a hard time spreading.  Having no administrator access will also prevent them from installing unauthorized or pirated software, shutting off their computer’s firewall, or doing other things that could compromise the security and stability of your network.

Similarly, as a system administrator you should always be careful and deliberate with your actions when you’re using administrator permission on a machine.  Don’t do indiscriminate web browsing with the administrator account.  Don’t run untested scripts against lots of end user machines.  Don’t delete files you aren’t sure about.  In short, recognize that your “godlike” powers over the computer make you dangerous, and always use those powers sparingly and carefully.

12. “Leave bigotry in your quarters; there’s no room for it on the bridge.” (ST:TOS “Balance of Terror”)

System administrators tend to be the kind of people who like to tinker with things.  Even though we may be Mac administrators, we dabble in Windows or Linux.  If we’re Linux administrators, we can’t resist the urge to fiddle with a script on OS X or a batch file on Windows.  Because we have a lot of experience, we can sometimes become opinionated about technology, to the point of bigotry.  In a corporate setting, this kind of bigotry can be suicidal.  If your response to every Windows problem you’re asked to resolve is to launch into a lecture about how this wouldn’t be a problem on the Mac, you’re in the wrong job.  Unless they happen to ask for them, users don’t want your opinions about the technology they’re using.  Most of them couldn’t care less whether they’re using Windows, OS X, Linux, or something else.  They just want to do their jobs, and they need you to fix the problem that’s keeping them from working.  You may have a long list of reasons why the company should dump Windows and move to Linux or OS X.  They might be very intelligent, objective, and thoughtful reasons.  But if you’re being paid to administer Windows, you should keep those opinions to yourself unless asked for them.  Otherwise, you’ll just create unrest and friction with your co-workers, and that doesn’t help anyone.

13. “The more complex the mind, the greater the need for the simplicity of play.” (ST:TOS “Shore Leave”)

Most people adorn their offices with a few well-chosen artifacts.  Perhaps they’re pictures of loved ones, awards they’ve won, or souvenirs from their travels.  System administrators have those things too, but they also tend to like little toys.  For example, I’ve often got a netbook, an MP3 player, and some other gizmo keeping me company.  They might be expensive gadgets to other people, but they’re fun toys to me, and it helps me to reduce my stress to play around with them occasionally… such as on my lunch hour.  Systems administrators tend to be fun, playful, and funny people (once you get to know them).  The complex web of information we have to master and use on a daily basis tends to make us seek out “fun” when we’re not working or need a break.

14. “Insufficient facts always invite danger.” (ST:TOS “Space Seed”)

In the original Star Trek series, Captain Kirk freed Khan Noonien Singh and his crew from an extended hibernation.  Khan and his crew were evasive about who they were and what they were doing on the ship they were rescued from.  Both Spock and Kirk did their best to extract information from them, but got very little.  Kirk noticed that Spock seemed uncomfortable with their new guests.  When asked why, Spock explained that the crew knew little about Khan’s people, and that this lack of knowledge could be dangerous.  Later, Khan and his crew attempted to take control of the Enterprise.  Spock was right not to trust them.

System administrators who are willing to jump in and start working with something they know little about often learn through (bad) experience to become more cautious.  In handling security patches, for instance, I’m very careful.  When a new patch comes in, I have no way of knowing if that patch will break a critical business system, prevent systems from booting up, or force a reboot in the middle of the CEO’s presentation to the executive board.  Before I release the patch to anyone else, I try it on my own system first to see how it behaves.  I then try it on the systems of my teammates and nearby co-workers.  If it doesn’t cause a problem for them, I begin slowly fanning it out to the rest of the company.  Once the patch seems harmless, I allow it to make its way onto large numbers of computers.  I make every effort to learn as much as I can about the patch before letting it “run loose” on the network.
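The fan-out described above (my own system, then teammates and neighbors, then the rest of the company in waves) can be sketched in a few lines of Python.  This is only an illustration under my own assumptions: `plan_rollout`, `run_rollout`, the wave size, and the `install` and `looks_healthy` hooks are hypothetical names, not part of any real patch management tool.

```python
def plan_rollout(my_machine, teammates, everyone_else, wave_size=50):
    """Return an ordered list of (stage name, machines) for a patch."""
    stages = [("me", [my_machine]), ("teammates and neighbors", list(teammates))]
    seen = {my_machine, *teammates}
    rest = [m for m in everyone_else if m not in seen]
    for i in range(0, len(rest), wave_size):
        wave_number = i // wave_size + 1
        stages.append((f"wave {wave_number}", rest[i:i + wave_size]))
    return stages


def run_rollout(stages, install, looks_healthy):
    """Install stage by stage; stop at the first stage that shows problems.

    install and looks_healthy are caller-supplied callables (hypothetical
    hooks into whatever patch tool you use).  Returns completed stage names.
    """
    completed = []
    for name, machines in stages:
        for machine in machines:
            install(machine)
        if not all(looks_healthy(machine) for machine in machines):
            print(f"problem in stage {name!r}; halting rollout")
            break
        completed.append(name)
    return completed


# Made-up machines; pretend the patch misbehaves on pc-6.
company = [f"pc-{n}" for n in range(1, 8)]
stages = plan_rollout("pc-1", ["pc-2", "pc-3"], company, wave_size=2)
installed = []
print(run_rollout(stages, installed.append, lambda m: m != "pc-6"))
```

The important property is that a bad patch is caught while it is on a handful of machines, and every later wave is skipped automatically rather than relying on someone remembering to stop.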

15. “Either one of us, by himself, is expendable. Both of us are not.” (ST:TOS “The Devil in the Dark”)

It’s not uncommon in system administration for there to be one person who handles a specific task, with another person serving as backup to that person.  The logic is to ensure that if the primary person gets sick, goes on vacation, takes a job elsewhere, or is hit by a bus, the team can continue to do the things it is responsible for.  It’s a bad idea for the primary and the backup to be out of the office at the same time, and it should be avoided if at all possible.  Inevitably, the day you’re both out of the office there will be a major crisis in your area of expertise, and no one there who can resolve the problem.
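If your team keeps its time off in a shared calendar, checking for days when both people are out is easy to automate.  Here is a minimal Python sketch; the function names and the dates are made up for illustration, and it assumes absences are recorded as inclusive (start, end) date pairs.

```python
from datetime import date, timedelta


def days_out(start, end):
    """Return the set of dates from start through end, inclusive."""
    return {start + timedelta(days=n) for n in range((end - start).days + 1)}


def uncovered_days(primary_absences, backup_absences):
    """Return sorted dates on which both the primary and the backup are out.

    Each argument is a list of (start, end) date pairs.
    """
    primary = set().union(*(days_out(s, e) for s, e in primary_absences))
    backup = set().union(*(days_out(s, e) for s, e in backup_absences))
    return sorted(primary & backup)


# Hypothetical holiday plans that overlap by one day.
primary_off = [(date(2008, 12, 22), date(2008, 12, 26))]
backup_off = [(date(2008, 12, 26), date(2009, 1, 2))]
for day in uncovered_days(primary_off, backup_off):
    print("nobody covering on", day.isoformat())
```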

16. “If I can have honesty, it’s easier to overlook mistakes.” (ST:TOS “Space Seed”)

Sooner or later, you’re going to make a mistake.  Maybe you accidentally deleted some critical files from a server.  Maybe you meant to adjust the firewall settings and ended up turning it off.  It might be something relatively minor, or heart-stoppingly major.  Whatever mistake you make, be willing to own up to it.  There’s nothing to be gained by lying to your teammates or management to cover up a mistake.  If you own up to your mistakes, people will respect and trust you.  If you lie about them, they soon realize they can’t rely on you and begin to resent you for the time they spend uncovering the truth.  Demand honesty from your coworkers, but deliver it in return.

17. “No one can guarantee the actions of another.” (ST:TOS “Day of the Dove”)

As part of system administration, or indeed any job, it can be necessary to make assumptions about how people will react to something and predict how they’ll deal with it.  But just because a particular reaction seems logical, reasonable, and expected, don’t assume everyone will do it.  Always make allowances in your plans, your scripts, and your procedures for your end users to do the illogical, unexpected, and “wrong thing at the wrong time”.  Build in the safeguards you can to prevent as many problems as you reasonably can, but realize that no matter how hard you try, there’s likely to be someone who does something you didn’t plan for.

admin Windows Administration

Another Step in the Papillary Carcinoma Treatment

November 6th, 2008

Today, I met with an endocrinologist who will be overseeing my treatment from this point on.  He explained that based on the type of cancer I had on my thyroid (papillary carcinoma) and the fact that it didn’t appear to have spread, the prognosis is extremely good. 

The next step will be for me to meet with the nuclear medicine specialist who will eventually administer radioactive iodine to me.  That appointment has not yet been scheduled, but is expected to take place in the next couple of weeks.  After that meeting, I’ll most likely be placed on an iodine-restricted diet, designed to make any thyroid or cancer cells remaining after surgery starved for iodine.  Then, when the radioactive iodine is administered, those cells will grab up all they can get and die off.

As I understand it, for 5 days after taking the iodine, I’m to minimize my exposure to other people to prevent the radiation from affecting them.  If you’re curious as to what those restrictions might be, there are other web sites that can tell you.

Once we’ve done that treatment, I’ll start taking synthetic thyroid hormone.  The challenge will be to determine the correct dosage for me, as each person is a little different.  Once that’s settled, I’ll have annual checks to see if the cancer has returned.

admin Life