March 2011 Newsletter
Our hearts, prayers, and thoughts go out to all those in Japan who are suffering from the devastating earthquakes, tsunami, and now nuclear issues. As the days unfold, the human toll is going to grow, and it is just unbearable to see the suffering on their faces as we watch the news. Helping others is always good for your soul, but in times like these, no matter how big or small your help is, I am sure it will be appreciated. Look at how quickly things changed for those people in northern Japan. It could happen in an instant to any of us.
As human beings, we are affected by the human suffering in Japan, and as IT professionals, there are also many lessons we should learn from what happened. Disasters happen, and we need to be prepared for them with proper testing. There is probably no other country that is more prepared for earthquakes than Japan. Their infrastructure takes earthquakes into account: e.g., their buildings are designed to be less rigid so that they can sway, and their citizens practice evacuations. Yet no matter how much you prepare and test, sometimes you can’t account for everything. Look at the nuclear reactors. They were designed to withstand an earthquake, backup pumping stations were in place, and generators to power the pumping stations were ready in case of lost power, yet the designers didn’t realize that the tsunami would flood their generators. They probably tested those generators all the time, as well as the backup pumps. They just never thought about the water from the tsunami. This is what disaster recovery testing is all about. You test different scenarios, you validate your procedures, you try to account for all situations. In this month’s newsletter, I want to address some of the items you should be thinking about.
We have packed a lot of information into this newsletter, and I hope that you find it useful. This issue of our newsletter has six articles. In the first, I want to continue our series on backups by focusing on your tapes. The second article is on connecting your UPS to more than just your IBM i. The third article is on tape compatibility. The fourth article asks “Would you like to have High Availability but don’t have a second machine?” The fifth article lists some of the upcoming events in which iTech Solutions will be participating. The last article is for your reference with updated PTF information.
Of course, if you are still on V5R4, send Pete an email and he can help you upgrade to V6R1 or V7R1. With over 300 V6R1 upgrades done to date, you know iTech Solutions has the expertise and know-how.
iTech Solutions can help you improve performance, upgrade i5/OS, perform security audits, implement a High Availability solution, Health Checks, Systems Management, Remote Administration, PTF management, Blade installations, iSCSI Configurations, Backup/Recovery, upgrade an existing machine, or upgrade to a new machine. If you are thinking of LPAR or HMC, then think iTech Solutions. We have the skills to help you get the most out of your System i.
In last month’s newsletter, I discussed tape management, but in light of everything that has happened recently I want to reemphasize this. No matter how good your backups are, if you don’t properly handle your tapes, there won’t be a tape you can count on for restoring. I want to make sure that point is well understood. No matter what your backup methodology is, no matter how much or how little you back up each night, no matter what type of tape drives you use, encryption or no encryption, if you don’t have the tape when it is time to restore, you are not going to be able to restore!
So, I want you to answer a few questions, because your job will depend on the answers if you ever have to rely on those tapes for recovery. Where are your tapes? How can they be identified? How far are they from your computer? Who else knows where they are stored and how to retrieve them in a recovery situation? Please don’t think you have to answer to us; you need to answer to yourself. Can you sleep at night knowing that the tapes that back up your company’s computers are safe, secure, retrievable, and then usable? If the answer to any of those is no, stop what you are doing and fix this.
You might think I am being a little direct and pushy, but I am trying to make sure that in a recovery you can use your backup media to recover your system. I am at a different customer almost every day, and I can tell you that most of them don’t move their tapes off-site. This is one of the biggest problems we see today. I don’t care if you use BRMS, GO Backup, GO SAVE 21 each night, or have written your own backup procedures. If the tapes stay in the computer room, if the tapes stay in the tape drive, if the tapes aren’t moved off-site, then your backups for disaster recovery are useless. I am not going to sugar coat this: if you aren’t getting your tapes off-site each night, we are going to grade you an “F” for backups when we do an iTech Solutions Health Check.
Today so much of our information is stored electronically. How can a company function and carry on without its computers? The key to recovery is properly planning backups, doing them regularly, moving backup tapes off-site, and then performing a recovery test. If you do a special backup right before your Disaster Recovery test, then you really haven’t tested your DR. You need to be testing your DR with a regular set of tapes. When was the last time that you performed a DR test? Did you have objectives established before your test? Did you meet those objectives? If you have never done a test of your recovery, then you have no idea if you are able to recover your system from your current backup methodology. Don’t wait for a real disaster to happen, when it is too late. Contact iTech Solutions, and schedule a DR test. We can bring a machine to you for testing, or you can come to our location with your tapes for testing, or we can do it remotely, but you must schedule a DR test before an event happens and you are forced to recover. I can tell you that of all the customers who have come to us to test their recovery, most can’t recover everything the first time. We then work with the customer and their backups, making recommendations on what they should be doing. They make the changes we recommend, and then they are able to successfully recover the second time.
Accidents happen. As System Administrators we are tasked with being keepers of the data. We don’t own it, we don’t control it, we shouldn’t update it, but we need to take care of the data. Your job doesn’t stop with the backup, it starts with the backup.
If you are interested in an iTech Solutions Health Check, or in scheduling a test of your recovery using your backups, please send an email to Pete. You won’t know how good your backup is until you have to recover it. Don’t wait until it’s too late. Test your recovery before you need to rely on it for a real recovery.
Connecting your UPS to more than just your IBM i.
I know you have your Power Systems IBM i (AS/400, iSeries) plugged into your UPS, and you have the signal cable from the UPS to your machine so that the UPS can signal your machine when the utility power is lost. Furthermore, I know you have a program on your machine to handle and process the UPS signals to bring the machine down gracefully when utility power is lost for a specified duration. Chances are you don’t have all of the above; in fact I would bet that only 10% of the people reading this newsletter have everything above. If you are unsure, please contact John about a review. Most customers we run into either don’t have the cable connected, or don’t have a UPS Monitoring program to manage the signals that the UPS sends. I would highly recommend you get the above working first and foremost, and we can help you with this if you are unsure.
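To make the idea of a UPS monitoring program concrete, here is a minimal sketch in Python of the decision it has to make: start a graceful shutdown only when utility power has been out for longer than a chosen grace period. This is an illustration of the logic only, not the program that runs on your IBM i. The event names, the `shutdown_time` function, and the 300-second grace period in the usage note are all assumptions made up for this example.

```python
from typing import Iterable, Optional, Tuple

# Hypothetical event labels; a real monitor would translate the
# signals sent by the UPS into something equivalent.
POWER_LOST = "power_lost"
POWER_RESTORED = "power_restored"


def shutdown_time(
    events: Iterable[Tuple[float, str]], grace_seconds: float
) -> Optional[float]:
    """Return the time (in seconds) at which a graceful shutdown
    should begin, or None if power came back within the grace period.

    events: (timestamp_seconds, kind) pairs in time order.
    """
    lost_at: Optional[float] = None
    for timestamp, kind in events:
        # If power has already been out for the full grace period
        # by the time this event arrives, the shutdown was due.
        if lost_at is not None and timestamp - lost_at >= grace_seconds:
            return lost_at + grace_seconds
        if kind == POWER_LOST and lost_at is None:
            lost_at = timestamp
        elif kind == POWER_RESTORED:
            lost_at = None
    # Power is still out at the end of the event log.
    return None if lost_at is None else lost_at + grace_seconds
```

With a 300-second grace period, a one-minute blip such as `[(0, "power_lost"), (60, "power_restored")]` produces no shutdown, while an outage that is still in progress produces a shutdown 300 seconds after power was lost. The grace period should be shorter than the time your UPS batteries can carry everything plugged into them.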
The above is just the basics. So what else is there? Well, there is a lot more that needs to be connected to your UPS. Let me explain. If your machine’s console is either the HMC or Operations Console, then those two devices (and their monitors) must also be on your UPS. Otherwise how will you shut your machine down or connect to it to manage it during a power outage? Last week, I received a call from a customer who asked, “How do I turn my AS/400 off?” I had to ask, “Why?”, and he said, “Well all the power in the industrial park is off, and we want to shut the machine down because we don’t have the UPS cable connected.” I said to him, “The same way you would normally do it. Just sign onto the PC that runs Operations Console in the computer room….” The customer interrupted me and said, “We took that off the UPS because the UPS was beeping as the batteries were getting old and couldn’t handle everything on the UPS.” I don’t need to continue, as you can already see what happened. Their console wasn’t plugged into the UPS any longer, so they couldn’t manage their machine. Not a good scenario. In addition, if you are using Operations Console through your routers, you need to make sure that all your routers are also on a UPS. I know, there are a lot of things to think about, but this is the kind of review you need to complete to ensure you are still able to connect to and manage your machine when you lose power.
If you are unsure, and would like a verification, contact John via email and request an iTech Solutions Health Check.
What do you do when you have old tapes in one format, but your new machine comes in with a different media type or incompatible media? We see this happening more and more as customers on older machines are trading them in for new Power7 machines with LTO4 tape technology. Some of those customers’ older machines had ¼ inch QIC or possibly LTO1 tape drives. LTO4 drives can read and write LTO4 and LTO3 tapes, but can only read LTO2. They can’t read LTO1. This is causing a problem with old tape archives, and even during conversions/migrations from the old to the new machine. iTech Solutions has loaner tape drives that we provide to our customers during a new machine installation to make the migration easier. During a recent migration from a 270 to a new Power7 machine, we were able to install one of our LTO3 tape drives in the customer’s old machine (270) and do their final backup on media that could be read in the new machine. As a bonus, that last backup ran in less than 20% of the time it took to run the backup on the customer’s original media in the old machine. This greatly reduced their outage window during the migration from their old machine to their new machine. In addition, iTech Solutions has many different types of tapes and formats and can convert tapes from one media to another. If you find yourself with a tape for which you no longer have the tape drive, give us a call and get the tape converted.
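The LTO4 behavior described above follows a general pattern, which the short Python sketch below tries to capture: a drive reads and writes its own generation and the one before it, and can only read tapes two generations back. The article only states the LTO4 case; applying the same rule to every generation from LTO1 through LTO7 is an assumption of this sketch, and the function name `lto_compatibility` is made up for the example. Always confirm against the specifications for your actual drive.

```python
def lto_compatibility(drive_generation: int, tape_generation: int) -> str:
    """Classify what a drive can do with a tape of a given LTO generation.

    Assumes the rule "write one generation back, read two generations
    back", which this sketch applies to LTO1 through LTO7 drives only.
    """
    if not 1 <= drive_generation <= 7:
        raise ValueError("this sketch only covers LTO1 through LTO7 drives")
    if tape_generation < 1:
        raise ValueError("tape generation must be 1 or higher")

    gap = drive_generation - tape_generation
    if gap < 0:
        return "incompatible"  # tape is newer than the drive
    if gap <= 1:
        return "read/write"    # same generation or one back
    if gap == 2:
        return "read-only"     # two generations back
    return "incompatible"      # three or more generations back


# The LTO4 cases from the article:
for tape in (4, 3, 2, 1):
    print(f"LTO4 drive, LTO{tape} tape: {lto_compatibility(4, tape)}")
```

The same rule explains why a loaner LTO3 drive works for a migration: an LTO3 tape written on the old machine falls in the read/write range of the LTO4 drive in the new machine.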
Would you like to have High Availability (replication) but don’t have a second machine?
Having good backups isn’t high availability. It’s a necessary practice, but backups require you to restore your system. Don’t forget to understand how long it will take to restore your system, plus the time to get another system. In addition, your recovery will only be up to your last backup. What happens to all the transactions you entered since your last backup? If you back up your entire system every morning at 2:00am, but the system crashes at 4:00pm that afternoon, all the transactions during the 14 hour interval would be lost. High Availability software allows you to replicate transactions as they are entered on your current machine (known as the source machine) using remote journaling to another machine (known as the target machine), where they are read from the journal and “played back” to update the files/objects on this target machine. So when a user updates a file on the source machine with your application software, the high availability software will then modify the same file on the target system with that same update.
Of course, to do this you need a remote system and high availability software. I have seen some customers who have two systems in the same computer room and are using high availability software. This isn’t true high availability, since there is at least a single point of failure (everything is in one machine room). That second system (the target system) needs to be in a different location. What happens if you don’t have a second location or a second machine? iTech Solutions offers a virtualized recovery option for the IBM i (AS/400, iSeries), offering the benefits of a second machine at a remote location with rapid recovery for business continuity at a price point that every business can afford. This virtualized server can be managed by your staff, or it can be managed by our staff.
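The exposure between a 2:00am backup and a 4:00pm crash is simple arithmetic, and it is worth running the same calculation for your own schedule. Here is a small Python sketch; the function name `hours_at_risk` and the calendar date used in the example are made up purely for illustration.

```python
from datetime import datetime


def hours_at_risk(last_backup: datetime, failure: datetime) -> float:
    """Return how many hours of transactions would be lost if the
    system failed at `failure` and was restored from `last_backup`."""
    if failure < last_backup:
        raise ValueError("the failure cannot come before the last backup")
    return (failure - last_backup).total_seconds() / 3600


# Backup at 2:00am, crash at 4:00pm the same day (date is arbitrary).
backup = datetime(2011, 3, 15, 2, 0)
crash = datetime(2011, 3, 15, 16, 0)
print(hours_at_risk(backup, crash))  # 14.0
```

With a nightly backup, the worst case approaches a full 24 hours of lost work, and that is before you add the time needed to restore. Replication is aimed at shrinking that window.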
To protect your company and your data, partner with iTech Solutions. We can provide your business with:
A cost-effective high availability solution for your most critical systems.
It has been shown that over 80% of the customers that lose their machines go out of business within two years after the machine failure. Don’t let this happen to you. Team up with iTech Solutions and save your business. Contact Pete for more information and details.
Join us at the Northeast User Group Conference for IBM i on April 12 & 13 in Framingham, MA.
May 18th iTech Solutions will be at LISUG in Woodbury, NY. Please stop by and visit us to discuss your needs and requirements, and how iTech Solutions can help you.
Pete Massiello will be presenting the following sessions at the Annual COMMON conference in Minneapolis, MN, May 1st to May 4th:
- Getting Started with IBM Systems Director Navigator.
- Performance fundamentals: Tips and Techniques.
- Understanding the HMC, FSP, IBM i, and Firmware.
- Virtual i Partitions Hosted by IBM i.
- What you need to know to upgrade your IBM i to 6.1 & 7.1.
Release levels and PTFs
People are always asking me how often they should be performing PTF maintenance, and when is the right time to upgrade their operating system. I updated this article from last month with the current levels of PTFs. Let’s look at PTFs. First, PTFs are Program Temporary Fixes that are created by IBM to fix a problem that has occurred or to possibly prevent a problem from occurring. In addition, sometimes PTFs add new functionality, tighten security, or improve performance. Therefore, I am always dumbfounded as to why customers do not perform PTF maintenance on their machine at least quarterly. If IBM has come out with a fix for your disk drives, why do you want to wait for your disk drive to fail with that problem, only to be told that there is a fix for that problem, and that if you had applied the PTF beforehand, you would have averted the problem? Therefore, I think a quarterly PTF maintenance strategy is a smart move. Many of our customers are on our quarterly PTF maintenance program, and that provides them with the peace of mind of knowing their system is up to date on PTFs. Below is a table of the major group PTFs for the last few releases. This is what we are installing for our customers on the iTech Solutions Quarterly Maintenance program.
                  7.1      6.1      V5R4     V5R3
Cumul. Pack       10229    10215    10292    8267
Tech. Refresh     1
Grp Hipers        26       84       147      169
DB Group          5        19       30       24
Java Group        5        15       26       23
Print Group       2        20       43       20
Backup/Recov.     7        21       39       33
Security Group    4        19       15       7
Blade/IXA/IXS     4        18       14       –
Http              5        16       25       17
TCP/IP            2        11       18       16
The easiest way to check your levels is to issue the command WRKPTFGRP. They should all have a status of installed, and you should be up to the latest for all the above, based upon your release. Now there are more groups than the ones listed above, but these are the general ones that most people require. We can help you know which group PTFs you should be installing on your machine based upon your licensed programs. Here is a nice tidbit. The Cumulative PTF package number is broken down as YDDD (YYDDD from 2010 onward), where Y is the year and DDD is the day of the year it was released. Therefore, if we look at the cumulative package for V5R4 in the table above, the ID is 10292. We can determine that it was created on the 292nd day of 2010, which is October 19, 2010. Look at your machine and this will give you a quick indication of just how far out of date in PTFs you may be. I left V5R1 off the list, because if you are on V5R1, you don’t need to be worrying about PTFs; you really need to be upgrading your operating system. The same can be said for V5R2 and V5R3, but there are still customers who are on those releases.
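If you would rather not count days on a calendar, the year-and-day breakdown of the cumulative package number can be decoded with a few lines of Python. This is a minimal sketch: the function name `cumulative_package_date` is made up for the example, and it assumes that four-digit IDs are YDDD, five-digit IDs (like the 10xxx values in the table) are YYDDD, and that all years fall in the 2000s.

```python
from datetime import date, timedelta


def cumulative_package_date(package_id: str) -> date:
    """Decode a cumulative PTF package ID (YDDD or YYDDD) into the
    date it was released. Assumes the year is 2000 + Y (or YY)."""
    if not package_id.isdigit() or len(package_id) not in (4, 5):
        raise ValueError("expected a 4 or 5 digit package ID")

    # The last three digits are the day of the year; the rest is the year.
    year = 2000 + int(package_id[:-3])
    day_of_year = int(package_id[-3:])

    days_in_year = (date(year + 1, 1, 1) - date(year, 1, 1)).days
    if not 1 <= day_of_year <= days_in_year:
        raise ValueError("day of year is out of range")

    return date(year, 1, 1) + timedelta(days=day_of_year - 1)


print(cumulative_package_date("9104"))   # 2009-04-14
print(cumulative_package_date("10292"))  # 2010-10-19
```

Compare the decoded date for the cumulative package on your machine with today’s date, and you have a rough measure of how long it has been since your last round of PTF maintenance.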
If you have an HMC, you should be running V7R7.2M0 with Service Pack 1. If your HMC is a C03, then it should stay at V7R3.5 SP2.
For your Flexible Service Processor (FSP) that is inside your Power5 or Power5+ (520, 515, 525, 550, 570), the code level of the FSP should be 01_SF240_403. Power6 (940x M15, M25, & M50 machines, and 8203-E4A & 8204-E4A) customers should be running EL350_085. For Power6 (MMA, 560, and 570 machines) your FSP should be at EM350_085. If you have a Power6 595 (9119-FMA) then you should be on EH350_085. For Power7, the firmware level should be AL720_082 or AM720_064, depending on your model.
If you need help with upgrading your HMC or FSP just give us a call. We will be happy to perform the function for you or assist you in doing it. Contact Pete Massiello.