Here is some food for thought for you who own or control or have a vested interest in corporations.
If you were to go to your CIO or your IS manager and ask the following, what would their response be?
- Can you show me the network map?
- Can you show me the documentation on the VLANs?
- Can you give me an accurate inventory of the servers that we have including their age and configuration?
- Can you tell me what is on each server or device and what it does?
- Who has access to what on each server and who decides what that access is?
- Can you tell me how they are connected to the network? Is there a redundant path?
- Can you produce an inventory of what software is on each server?
- Can you show me the recent log files of each server and tell me about what concerns you have regarding what those log files say?
- Where is the actual software that is on the servers and where are the license keys?
You would be surprised how many Sysadmins tell me that they don’t keep the software; they just download it when they need it. Really? You have just had a disaster and your internet is down and will not be up for at least 72 hours. Now what? Not only does it make sense to have the disk for this reason, but it takes time (valuable time) to go and find and download software. They have argued that what is on the disk is not the most current. Why not? Why have you not updated your Software Library? There is a lot to being a Sysadmin (SA). It is not about sitting on your butt in your office surfing the web, reading the news and updating Facebook while being annoyed by the occasional request for a password reset! Old software that is a few versions behind the curve is still better than none! Even if you “don’t have time” to keep your library updated, something is better than nothing.
Speaking of passwords, most companies really need a security officer and really don’t understand why. I have seen some Sysadmins who are so lazy that they assign passwords to people and then keep an Excel list of them on the server. These are not really Sysadmins, because that is genuinely stupid. To open the company to so many different kinds of fraud, industrial espionage, and other forms of abuse of the system, just because the guy does not want to be bothered with password resets, is incredible. This guy would not be working for me as there is no excuse for this! I don’t care how “nice a guy he is.” Laziness and stupidity are a bad combination for a Sysadmin to have.
- What software revision level are we at and is it the most recent? If not, why not?
- Are Firmware rev levels kept up with and checked regularly?
- Are the drivers up to date?
- Can you produce a list of the passwords for each server?
- What are the power requirements for these servers?
- What are the cooling requirements for the equipment and are there any issues?
- How long can we run if there is a power outage?
- When was the last time that the batteries were changed out in the UPSs?
- Is each and every device in the server room labeled?
- Is all networking cable installed in a manner that not only makes sense but looks like it belongs there vs. haphazardly plugged in on the run?
- Can you show me a map of the switches, what port is doing what?
- Tell me about load leveling.
- Have the SNMP passwords (community strings) on all of the intelligent devices been changed from the default?
- If so, what are the passwords? If not, why not?
- Are there traps being sent to a syslog server?
- Who reads the logs, and how often? Are there any concerns?
- How are the concerns addressed?
- Show me the notes from change control or change management meetings.
- Are these notes managed in a responsible manner and are all changes noted in the living document?
- What is the average age of the workstation on the floor/building?
- Describe the policy regarding passwords. How often are they changed?
- Describe your Hardware asset management strategy.
- Describe your Software asset management strategy.
- Who handles the maintenance on the HVAC in the server room?
- When was the HVAC last serviced?
- Tell me about your fire suppression.
It has been my experience, as an IT manager and a Disaster Recovery Specialist who does many audits, that the majority of Sysadmins do a horrible job of hardware and software management, much to the loss of the company and the chagrin of the CFO.
Desktops last about 5 years, laptops 3. When a machine is put into service, a clock should start running to replace it in X years. You don’t want employees working on outdated equipment, and you don’t want to install new software on old computers as the license may very well die with the computer.
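As a rough illustration of that replacement clock, here is a minimal Python sketch. The 5-year and 3-year figures come from the text above; the asset names, the dates, and the function names are made up for the example and are not from any particular asset management product.

```python
from datetime import date

# Service lives taken from the text: desktops about 5 years, laptops about 3.
SERVICE_LIFE_YEARS = {"desktop": 5, "laptop": 3}


def replacement_date(kind: str, in_service: date) -> date:
    """Return the date a machine of this kind is due for replacement."""
    years = SERVICE_LIFE_YEARS[kind]
    try:
        return in_service.replace(year=in_service.year + years)
    except ValueError:
        # Put into service on Feb 29 and the target year is not a leap year.
        return in_service.replace(year=in_service.year + years, day=28)


def overdue(assets, today: date):
    """Return the names of assets whose replacement date has arrived or passed."""
    return [
        name
        for name, kind, in_service in assets
        if replacement_date(kind, in_service) <= today
    ]


# Hypothetical inventory: (asset name, kind, in-service date).
assets = [
    ("FRONT-DESK-01", "desktop", date(2015, 3, 1)),
    ("SALES-LT-07", "laptop", date(2019, 6, 15)),
]
print(overdue(assets, date(2021, 1, 1)))  # ['FRONT-DESK-01']
```

Run something like this against the real inventory once a quarter and the budget request writes itself.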
I have seen too many companies try to get everything they can out of a box. Amortize the box and when the IRS says it is dead, let it go. If there is a use for it in some non-critical function, “user discretion,” but add no more software and remove it from critical areas.
I have seen many people struggling along on a machine that is well past its usable life. Losing files or data, or waiting around for the machine to catch up, costs money. While it may be soft dollars, those soft dollars turn into real dollars quickly if you lose enough data and/or time.
I used to install older computers in the break room with internet access and the usual Windows and Facebook-type games. Employees could use them for their private needs before or after their shift or while on break or lunch, and they were non-critical and on their own VLAN where company data could not be accessed!
Not everyone in the company needs a full version of Office. A lot of companies have a standard load for all computers. That should be revisited as it is wasteful. While Microsoft would like you to purchase everything for every computer, doing so is simply lazy and wasteful.
Software and Hardware management is in itself a job, and proper management of it will produce an ROI. It is also necessary for producing a budget request. The CFO might cringe when he or she sees the request, but at least it is planned and not a surprise!
- What antivirus software is on them? How did you decide on that software?
- Are the workstations locked down?
- Do any users have admin rights? If so, why?
- Are the USB ports locked down?
- Are the CD burners locked down?
- What ports are allowed through the firewall?
- Is the firewall updated to the latest software?
- Are traps from the firewall being sent to a syslog server?
- Who has access to their workstation PC from home? Why?
- Who has access to their home PC from work? Why?
- What software is on each workstation?
I run an inventory program like Spiceworks or some other commercially available software, to obtain an inventory of all of the software on all of the boxes and then go through the task of identifying each executable. I have found numerous Trojans and viruses, remote control software, games galore, software that was not licensed and oh yes, software that they used and did not know that they had as it was installed by previous regimes. This type of activity is mandatory if you want to recover in the case of a disaster. It is also mandatory if you want to be licensed properly and not have your neck on the line if some employee gets upset and calls the software police.
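To give a feel for the reconciliation step, here is a minimal Python sketch. It assumes you have already exported the inventory to a simple mapping of machine to installed titles and have a record of purchased seats; the data layout, the title names, and the function name are hypothetical and are not the output format of Spiceworks or any other tool.

```python
def audit_inventory(installed, licensed_seats):
    """Compare what is installed against what we hold licenses for.

    installed: dict mapping machine name -> list of installed software titles.
    licensed_seats: dict mapping software title -> number of seats purchased.
    Returns (unknown, over_deployed):
      unknown: sorted titles found on machines with no license record at all.
      over_deployed: title -> (install count, seats) where installs exceed seats.
    """
    counts = {}
    for titles in installed.values():
        for title in set(titles):  # count each title once per machine
            counts[title] = counts.get(title, 0) + 1

    unknown = sorted(t for t in counts if t not in licensed_seats)
    over_deployed = {
        t: (n, licensed_seats[t])
        for t, n in counts.items()
        if t in licensed_seats and n > licensed_seats[t]
    }
    return unknown, over_deployed


# Hypothetical data for illustration only.
installed = {
    "PC-01": ["Office Suite", "PDF Editor"],
    "PC-02": ["Office Suite", "Remote Control Tool"],
    "PC-03": ["Office Suite"],
}
licensed_seats = {"Office Suite": 2, "PDF Editor": 5}

unknown, over_deployed = audit_inventory(installed, licensed_seats)
print(unknown)        # ['Remote Control Tool']
print(over_deployed)  # {'Office Suite': (3, 2)}
```

Everything in the "unknown" list still has to be identified by a person; the script only tells you where to look.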
Recently the BSA (Business Software Alliance) has been advertising a lot, trying to get employees to snitch on their company. The rewards to the snitch are inconsequential, but the penalties and fines to the company are enormous. Having that inventory, those licenses and even the receipts in a safe place would, I think, be a really good idea.
Some companies are so cheap that they use free anti-virus software which is not worth what you paid for it. I fight viruses daily. Free is not an option. If you think that it is, you are deluded and clearly don’t know what you are doing.
Free software by definition cannot be maintained as well as commercial software. Who in the hell has money to pay for programmers and security experts and then give the product away?!
Good Anti-Virus software is Patriotic
I made the argument the other night at a speaking engagement that it is actually patriotic to use good anti-virus software. Why? If millions of computers are taken over at the drop of a hat by some “bad guys” and they target, let’s say, the FAA or the Feds or some other institution and are able to cripple the banking industry, or what have you, and your computer is part of the problem, what then? A Trojan could be sitting on your computer unknown to you, just waiting for the instruction to start a DoS attack. Stop being cheap and buy the damned software and protect your computer(s) from being controlled by “evil.”
If a government had more than two neurons firing in their collective heads, they would create a “government approved” anti-virus software and give it to its citizens. Now I know how that would be received by most. If I had a choice I would buy my own, as I really don’t want anything big brother has to offer on my computer, but let’s face facts. You probably have things on your computer right now made by the Russian Mafia or worse! I am certain that a government grant could be created to support a group of “white hat hackers” to help keep America safe from cyber terrorism. If you do this, remember whose idea it was…
Here are a few more questions for you CIO/owner types who might actually have some skin in the game.
- Do you have licenses for that software?
- Where is that software?
- Where are the licenses kept?
- Can we prove that we bought a license for each and every piece of software in the building? If so, do it. If not, why not?
- How many employees use laptops?
- Are they secure?
- Are they encrypted?
- Are USB drives or thumb drives that are necessary for business use, encrypted?
- Do the laptops have up-to-date anti-virus software on them?
- How old are they?
- Do they use a VPN to get into the servers from outside of the office?
- How secure is their VPN? What challenges, if any, are there?
- Do you use security tokens?
- Can you show me a map of the building depicting which PC is hooked up to which drop?
- If you are using VoIP, can you show me that same map for the phones?
- Is the map updated as changes occur?
- Describe your backup policies and procedures.
- Where is the data being sent off-site?
- Are we using the cloud for backup?
- Walk me through the procedure of getting access to the data if this building is blown away.
- Walk me through the procedure of restoring the servers in another location.
- Tell me who can do this if the Sysadmin is not available?
- Have we tested a restore of the data, if so when was the last test and where are the results; if not, why not?
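One way to produce "results" for a restore test is to compare checksums of the restored files against the originals. The following Python sketch assumes both the source and the restored copy are reachable as directories and that the source has not changed since the backup ran; in practice you would more likely save the manifest at backup time and compare against that. The function names are made up for the example.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large files do not fill memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root: Path) -> dict:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare_restore(source: Path, restored: Path) -> dict:
    """Report files that are missing, extra, or different after a restore."""
    src = build_manifest(source)
    dst = build_manifest(restored)
    return {
        "missing": sorted(set(src) - set(dst)),
        "extra": sorted(set(dst) - set(src)),
        "changed": sorted(k for k in src.keys() & dst.keys() if src[k] != dst[k]),
    }
```

Calling `compare_restore` on the two directories returns three lists; a clean restore test is one where all three are empty, and that report, with a date on it, is what goes in the DR binder.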
These few questions and comments are off the top of my head and it took about ten minutes to list them. There are plenty more, but this gives you a small flavor of the kinds of information you should already have and that I gather in a disaster recovery project.
The simple facts are that IT people are loath to document anything. It is kind of like editing your own work, you know what you meant to say and your mind fills in the blanks. Documentation should be written in such a way that a technical person not familiar with your company should be able to pick up the document and pieces and re-build your company without you there.
Often I am met with complete truculence and arrogance and lots of attitude by the IT staff of a company that I do a DR for. They don’t want me there as they don’t want me messing around in their sandbox. Truth be told, they don’t want the fact that they are remiss in their jobs to get to their boss, who thinks everything is running perfectly, until it isn’t!
If you happen to watch or ever have watched Hell’s Kitchen or Kitchen Nightmares, or know who Chef Ramsay is, then you have a clue of who I am, without the foul mouth. I take IT departments and fix them, and I take no prisoners (no excuses). Not only do I fix the hardware and software components, but I fix the personnel issues as well. It may be a training issue or an employee that is a poor fit. It may be a lack of people as most companies try to run too thin on staff. There should be no one person who is sacrosanct. In a disaster you may lose them, so we need things documented in such a way that a rent-a-geek can restore your company. If there is no documentation, I create it. Through a test of the DR, we can then hone that documentation to a fine point.
I am a troubleshooter. Not only am I a problem solver; I have been in management of IT for a large part of my life. I get to the bottom of issues and take corrective action. IT is ancillary to the business. IT is a tool that has to be running smoothly; like a Swiss watch. Your job as CEO is to run the company, not IT. I have built data centers from the ground up, as well as re-built them while the business kept going all over the country.
I have handled everything from data, fire suppression, HVAC, power requirements, UPS requirements, floor height and easy access to the equipment to MDF and IDF designs for data and voice, from the east coast to the west and from north to south. I have worked everywhere from Union areas of the country to the Wild West where “anything goes.” Been there, done that.
Go ask your IT people some of these questions and see if you are satisfied. After 30 years in this business, I would be surprised if you were.
From me, or someone like me, among the deliverables, will be the documentation that so many just don’t do. Without that documentation, you are playing with galloping dominoes. Your risk might be small as you yourself know something about it, or it may be huge in that you, like most who run a company, run it from 20,000 feet, through your management. There are seldom any pleasant surprises in business.
Has anyone at your company done a risk assessment? Where are you located geographically? Are you in an area that is prone to earthquakes, hurricanes or typhoons? How about tornadoes or fire?
One of the largest risks to a company, surprisingly, is none of the above. It is employee error. I have worked for companies where the owners were the issue. One company had their child, who played video games, work on the equipment, and of course he screwed it up constantly. Stay away from those companies as they don’t want to hear the truth. Their child is perfect and knows everything about anything, so it must be the fault of the internet or the software or something else. I have worked for companies where the owners who ran the company also thought they were the be-all and end-all of IT. Pride comes before a fall, and believe me, when you own a company you really don’t want to have that fall. Stick to what you know best and leave the technical things that change daily to those who keep up with them. We who know this stuff are constantly involved with forums and our peers. What works today may not work tomorrow. Unless you can devote your life to this, let those of us who do, do it!
One owner takes a passing interest in the latest and greatest through a magazine and orders his IT guy to make it so. If you have a yes-man working for you, do yourself a favor and fire him. Your people who do this for a living should have the ability to say no. If they say no, you should listen to them. If you want a second opinion, call your VAR. If those two don’t jibe, call another. Bottom line is you never install REV 1.0 of anything into production, ever! If your guy can’t be honest with you, get real and hire a person who will tell you “no!” It may save you tens of thousands of dollars, if not your company. I have had yes-men working for me in the past and got rid of them. I depend on Team Cooperation, and that means I need their input. While humbling oneself to listen to a subordinate can be a challenge at times, they may know something that you don’t.
I once worked for a guy who ran a company selling and servicing office equipment. This was actually my first real job out of school. The guy was from Georgia and had been a tank commander in WWII. His manner was gruff, but he was sincere as the day was long. We became close over the years as I have always made it a point to look at what successful people are doing, how they got there, and basically what made them tick.
He promoted me to the position of service manager of one of his locations. He drove me over there to introduce me to the new team and show me around. While on the road, he told me that one secret of a successful person is to hire people smarter than, or at least as smart as, you are. To me, that is probably one of the most salient bits of advice that I could pass on. It means that the man had humility, and also that he must have thought something of me.
While I still struggle with humility today, I am aware of it and work on it.
Hours of Operation.
I had a guy interview with me. Towards the end of the interview, he asked me if there would be any overtime as he had obligations after work and on weekends. This guy clearly had no clue about the job for which he was applying. Hourly jobs are Burger King, not Sysadmin or Network specialist, etc. We get paid well because this becomes the biggest part of our life! If you are a 9 to 5 guy, don’t look at IT as a career.
As anyone who has been in IT any time at all can attest, this is not a nine-to-five job. One never knows when something will stop working and you are suddenly pulling an all-nighter to fix it. With VMware and the technology we have today, we can minimize that risk through proper configuration of the servers, building in some redundancy and keeping up with the age of our hardware.
Once you get past a twelve-hour day, statistics show that you are much more error-prone, thus shooting yourself in the foot, and possibly the company. Best practice planning and implementation from the beginning mitigates this risk. Having up-to-date documentation as well as partnerships with VARs will allow you to recover faster and employ fewer full-time people. Staff augmentation through a VAR is an excellent way to keep the number of FTEs down, but that relationship really needs to be solid.
If you want to experience what “blood running cold” means, come in late at night to update some software on the server, reboot it, and then see the prompt “drive 0 not found.” This was before the days of RAID. This was when ginning a server started with installing 25 5.25-inch floppies followed by a 12-hour COMPSURF. We have come a long way since then, and so have the folks who create viruses. This is one of the most dynamic industries that I am aware of. One really must be dedicated to be any good at this.
By dedicated, I mean just that. Keep up with what is going on through periodicals, peers in the industry, and (again, I can’t stress this enough) at least one good VAR.
On one of my data center re-builds a vendor was doing our cable plant. They ran long into the night and someone made a mistake. Instead of pulling the old data lines and stopping, they cut and pulled the phone lines as well. On another cable job that I was aware of, at about 3 in the morning a 32-pair cable got stuck. Instead of seeing why, the installer reared back and pulled for everything that he was worth. He snapped an ionized water line and flooded the computer room in a huge hospital. Water poured out of the elevator shaft like it was some sort of an elaborate fountain. Thank goodness that was not my job.
Much like driving less than 500 miles a day on vacation is a good idea, so is limiting the number of hours worked by each person, as mistakes happen. Make sure you have adequate staff to do the job, especially when you are taking on a new project. How do you do that? Proper project management methodologies and relationships with VARs… That is another story…
Here is an example of what a sysadmin is as defined by this site.
The System Administrator (SA) is responsible for effective provisioning, installation/configuration, operation, and maintenance of systems hardware and software and related infrastructure. This individual participates in technical research and development to enable continuing innovation within the infrastructure. This individual ensures that system hardware, operating systems, software systems, and related procedures adhere to organizational values, enabling staff, volunteers, and Partners.
This individual will assist project teams with technical issues in the Initiation and Planning phases of our standard Project Management Methodology. These activities include the definition of needs, benefits, and technical strategy; research & development within the project life-cycle; technical analysis and design; and support of operations staff in executing, testing and rolling-out the solutions. Participation on projects is focused on smoothing the transition of projects from development staff to production staff by performing operations activities within the project life-cycle.
This individual is accountable for the following systems: Linux and Windows systems that support GIS infrastructure; Linux, Windows and Application systems that support Asset Management. Responsibilities on these systems include SA engineering and provisioning, operations and support, maintenance and research and development to ensure continual innovation.
SA Engineering and Provisioning
- Engineering of SA-related solutions for various project and operational needs.
- Install new / rebuild existing servers and configure hardware, peripherals, services, settings, directories, storage, etc. in accordance with standards and project/operational requirements.
- Install and configure systems that support GIS infrastructure applications or Asset Management applications.
- Develop and maintain installation and configuration procedures.
- Contribute to and maintain system standards.
- Research and recommend innovative, and where possible automated approaches for system administration tasks. Identify approaches that leverage our resources and provide economies of scale.
Operations and Support
- Perform daily system monitoring, verifying the integrity and availability of all hardware, server resources, systems and key processes, reviewing system and application logs, and verifying completion of scheduled jobs such as backups.
- Perform regular security monitoring to identify any possible intrusions.
- Perform daily backup operations, ensuring all required file systems and system data are successfully backed up to the appropriate media, recovery tapes or disks are created, and media is recycled and sent off site as necessary.
- Perform regular file archival and purge as necessary.
- Create, change, and delete user accounts per request.
- Provide Tier III/other support per request from various constituencies. Investigate and troubleshoot issues.
- Repair and recover from hardware or software failures. Coordinate and communicate with impacted constituencies.
- Apply OS patches and upgrades on a regular basis, and upgrade administrative tools and utilities. Configure/add new services as necessary.
- Upgrade and configure system software that supports GIS infrastructure applications or Asset Management applications per project or operational needs.
- Maintain operational, configuration, or other procedures.
- Perform periodic performance reporting to support capacity planning.
- Perform ongoing performance tuning, hardware upgrades, and resource optimization as required. Configure CPU, memory, and disk partitions as required.
- Maintain data center environmental and monitoring equipment.
- Bachelor (4-year) degree, with a technical major, such as engineering or computer science.
- Systems Administration/System Engineer certification in Unix and Microsoft.
- Four to six years system administration experience.
- Position deals with a variety of problems and sometimes has to decide which answer is best. The question/issues are typically clear and require determination of which answer (from a few choices) is the best.
- Decisions normally have a noticeable effect department-wide and company-wide, and judgment errors can typically require one to two weeks to correct or reverse.
RESPONSIBILITY/OVERSIGHT – FINANCIAL & SUPERVISORY:
- Functions as a lead worker doing the work similar to those in the work unit; responsibility for training, instruction, setting the work pace, and possibly evaluating performance.
- No budget responsibility.
- Interpret and/or discuss information with others, which involves terminology or concepts not familiar to many people; regularly provide advice and recommend actions involving rather complex issues. May resolve problems within established practices.
- Provides occasional guidance, some of which is technical.
WORKING CONDITIONS/PHYSICAL EFFORT:
- Responsibilities sometimes require working evenings and weekends, sometimes with little advance notice.
- No regular travel required.
This is close, but I would add to this list… I see nothing in this description about documenting anything. Maybe that is why it is not done in so many places? Does your SA do this type of thing?