July 2008 Newsletter

Greetings!
 

We hope you are enjoying your summer and perhaps taking a little vacation time as well.  It has certainly been a busy summer for us, with lots of new Power 6 520s and Power5+ 515s and 525s being installed.  In addition, new customers are coming out of the woodwork asking us to do i5/OS upgrades to V5R4, and many of our existing customers are upgrading to IBM i 6.1.

This issue of our newsletter has four articles.  In the first, we’ll take a look at automatically setting the time on your machine.  As performance is an issue in most shops, our second article deals with the number of disk arms on your system.  The third article deals with having a UPS, since summer is always a time for brown-outs and black-outs.  The last article gives you updated PTF information for your reference.

iTech Solutions can help you improve performance, upgrade i5/OS, perform security audits, implement a High Availability solution, VoIP, Systems Management, PTF management, Blade installations, iSCSI Configurations, upgrade an existing machine, or upgrade to a new machine.  If you are thinking of LPAR or HMC, then think iTech Solutions.  We have the skills to help you get the most out of your System i.  For more information on any of the articles below, please visit us at iTech Solutions or contact us at info@itechsol.com.  We would also like to know what you think of this newsletter and which items you would like us to discuss in future issues.

Who has the Time?? 
I was in a customer’s data center the other day, and I overheard the operator mumbling under his breath, “Every *&^% machine has a different time. How am I supposed to match events on these machines?”  I went over to talk to him and asked if he had ever heard of Network Time Protocol.  I had to duck, as I thought he was going to throw the clock at my head.

I hung around and explained that the Network Time Protocol (NTP) is a protocol for synchronizing the clocks of computer systems over packet-switched TCP/IP networks.  NTP uses UDP port 123 for its transport.  On i5/OS we can configure TCP/IP to use Simple Network Time Protocol (SNTP) to adjust the time, whether we want to keep all our machines on the same time or just keep one machine on the correct time.  i5/OS can be a server, providing the time to others, or a client, getting the time from another NTP server.  On top of all this, it’s quick and easy to set up.

I will run through the steps required to configure your machine to get the time from an NTP server.  To do this, we are going to use the Change SNTP Attributes command, CHGNTPA.  Enter CHGNTPA on a command line and press F4.  The first parameter is the remote system to get the time from.  If you have an NTP server in your organization you can use that; otherwise, I always use SUNDIAL.COLUMBIA.EDU and would enter that name for the remote system.  Change Client Autostart to *YES, Client Poll Interval to 60, Client Minimum Adjustment to 20, and Client Maximum Adjustment to 20 as well.  Then start the TCP/IP service by entering STRTCPSVR SERVER(*NTP).  That’s it!  It’s that simple.  Once this is done, the System Value QTIMADJ is set to QIBM_OS400_SNTP.

Before starting, make sure that you can Ping your time server.  If you Ping the server that I use, you will get back a reply from hickory.cc.columbia.edu, and that is fine.  Obviously, both Ping and SNTP traffic have to be allowed through your firewall.

We performed these steps on the customer’s machines, and all of a sudden both machines had the same time.  You can do this whether you have a single i5 or several.  It is especially worthwhile if you are doing High Availability and Replication, as you are usually looking at the time stamps of messages when you have a problem, and knowing the machines have exactly the same time makes problem determination much easier.  I hope this gets your machines back on time.

You can also easily set up your HMC to get the time if you are on version 7 of the HMC.  If you aren’t at version 7, you don’t know all the great new features and interfaces that you are missing.  Give iTech Solutions a call, and let us upgrade your HMC for you.
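
If you are curious what an SNTP exchange looks like on the wire, here is a minimal Python sketch.  It is only an illustration and not how i5/OS implements its SNTP client: the function names are made up for this example, and it reads the server’s transmit timestamp without correcting for network delay the way a real client would.  The request is a 48-byte UDP packet sent to port 123, and the reply carries the server time as seconds since 1900.

```python
import socket
import struct

# Seconds between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01).
NTP_UNIX_DELTA = 2208988800


def build_sntp_request() -> bytes:
    """Build a 48-byte client request (hypothetical helper name).

    First byte 0x1B = leap indicator 0, version 3, mode 3 (client);
    the remaining 47 bytes are left as zero.
    """
    return bytes([0x1B]) + bytes(47)


def parse_transmit_time(packet: bytes) -> float:
    """Return the server's transmit timestamp as Unix time (seconds)."""
    if len(packet) < 48:
        raise ValueError("SNTP packet must be at least 48 bytes")
    # Bytes 40-47: 32-bit seconds since 1900 plus a 32-bit binary fraction.
    seconds, fraction = struct.unpack("!II", packet[40:48])
    return seconds - NTP_UNIX_DELTA + fraction / 2**32


def query_time(host: str, port: int = 123, timeout: float = 5.0) -> float:
    """Ask a time server for the time over UDP (simplified: no delay correction)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_sntp_request(), (host, port))
        packet, _ = sock.recvfrom(512)
    return parse_transmit_time(packet)
```

Calling query_time with the name of your own time server and comparing the result to the local clock gives a rough idea of how far off a machine is.  If the call times out, that is usually a sign that UDP port 123 is blocked at the firewall.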

Arms, Arms, Arms

I could start the article off saying that you can never have enough arms.  That would be disk arms, and not human limbs, although I think we can all imagine how easy it would be to get a few more chores done around the house if we had another set of arms.  Using that same analogy, our i5 can get a lot more work done when it has a lot of disk arms as well. This is because we can spread the load over more arms and not keep our disk arms so busy reading and writing data from our disks.
We need to talk about the number of arms and the speed of the arms.  We could also discuss the I/O adapters (Disk Controllers) that give work to the arms, but we will leave that for another month.  In case you didn’t know, IBM has announced that as of February 1, 2009 it will stop selling (which means “not sold, but still supported”) 70GB SCSI and 282GB SCSI drives.  So the only SCSI drive will be 141GB.  This is because IBM is transitioning to SAS drives, as the industry is moving away from SCSI drives.

On some of the older systems, you will see disk sizes of 8GB, 17GB, and 35GB, and many recent machines have 70GB disk drives.  So, in order to have a lot of disk space, the customer usually purchased many disk drives.  Each disk drive has an arm to get the data to and from the disk.  The more disk drives a machine has, the more disk I/O requests can be performed per second (up to the maximum that the controller or I/O Adapter can handle).  So the more disk drives the better.  You can think of disk drives as assistants.  If you have 1000 papers to file and 5 assistants, you would give each assistant 200 papers.  But if you had 10 assistants, each one would only need to file 100 papers.  This is a great way to think about the disk drives doing the I/O.

The older and smaller disk drives only spun at 7,200 and 10,000 RPM (Revolutions Per Minute).  As you can imagine, the faster the disk is spinning, the sooner that “spot” on the disk where our data is stored comes around for the disk arm to read it.  We actually measure this rotational delay in milliseconds.  The newer drives spin at 15,000 RPM, so you can see how much faster the data can be accessed.  Back to our assistants example: if we had assistants who moved through the office more quickly and got to the file cabinets sooner, then they could file those same papers much faster.
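
To put numbers on the spinning-disk point, here is a small Python sketch.  It assumes the usual rule of thumb that, on average, the data is half a revolution away from the arm, so average rotational delay is half the time of one revolution.  The function names are made up for this example, and seek time and controller overhead are ignored.

```python
def avg_rotational_delay_ms(rpm: float) -> float:
    """Average rotational delay in milliseconds for a drive at the given RPM.

    Assumes the data is, on average, half a revolution away from the arm.
    """
    ms_per_revolution = 60_000 / rpm  # 60,000 ms in one minute
    return ms_per_revolution / 2


def papers_per_assistant(papers: int, assistants: int) -> float:
    """The filing analogy: the same work spread evenly over more assistants."""
    return papers / assistants


if __name__ == "__main__":
    for rpm in (7_200, 10_000, 15_000):
        print(f"{rpm:>6} RPM: {avg_rotational_delay_ms(rpm):.2f} ms average rotational delay")
    for assistants in (5, 10):
        print(f"{assistants} assistants: {papers_per_assistant(1000, assistants):.0f} papers each")
```

Under that assumption a 15,000 RPM drive waits about 2 ms on average for the data to come around, compared with about 3 ms at 10,000 RPM and a little over 4 ms at 7,200 RPM.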

This increase in the size of disk drives is more of a problem for the small customer than it is for the larger customer.  Let me explain why.  Many times when we are configuring a new system for a customer, their existing system may have 8 drives of 17.5GB for a total of 140GB.  If they are using RAID5, they have about 122GB of usable disk space on that machine.  Now we come in to configure a new replacement system, and the customer says, “We expect our disk capacity will double over the next 3 years. Therefore let’s size a box with about 250GB of disk.”  Three 141GB drives in a RAID5 set would give the customer 282GB of usable disk space, which covers the 250GB, but we wouldn’t want to build a system with only three arms.  We would go with a minimum of four 141GB drives with RAID5, which provides 564GB in total and 423GB usable.

The customer says they only need 250GB of disk space, but we are recommending almost double that.  Why?  Because it’s not about capacity, it’s about the number of arms to handle requests.  It’s important that we look at the usage of the disk arms and not just the capacity, which is where most people tend to concentrate.  Therefore, with these larger disk drives, sometimes we have to put more disk capacity into a configuration, not because we want to sell more disk space, but because the customer needs the arms to handle the load of transactions being sent to disk.  If you do a WRKDSKSTS, the last column on the right is the utilization of each disk.  This is so much more important than the capacity.

When iTech Solutions comes in to configure a new machine, we like to spend a good amount of time looking at how much I/O you are actually performing, not just the capacity.  That is where the performance planning comes in.  We have a lot of experience with this type of analysis.  The last thing you want is to purchase a new machine that doesn’t have enough arms to handle your workload.
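
The capacity arithmetic above can be checked with a few lines of Python.  This is a simplified sketch: it assumes a single RAID5 set in which one drive’s worth of space goes to parity, it ignores formatting overhead, and the function name is made up for this example.

```python
def raid5_capacity(drive_count: int, drive_gb: float) -> tuple[float, float]:
    """Return (total_gb, usable_gb) for a single RAID5 set.

    Simplifying assumption: one drive's worth of capacity is used for parity.
    """
    if drive_count < 3:
        raise ValueError("a RAID5 set needs at least three drives")
    total_gb = drive_count * drive_gb
    usable_gb = (drive_count - 1) * drive_gb
    return total_gb, usable_gb


if __name__ == "__main__":
    configs = [
        ("old system", 8, 17.5),
        ("three drives", 3, 141),
        ("four drives", 4, 141),
    ]
    for label, count, size in configs:
        total, usable = raid5_capacity(count, size)
        print(f"{label}: {count} arms, {total:g}GB total, {usable:g}GB usable")
```

Notice what the capacity figures hide: the old system spreads its I/O over eight arms, while the four-drive replacement has more than three times the usable space on half as many arms.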

UPS

No, I don’t mean the guys in the brown clothes with the brown trucks!  I am talking about Uninterruptible Power Supplies.  Summer is the time of year for brown-outs and black-outs here in the northeast, and July is a good month to discuss this topic.  While a UPS gives you protection from small voltage spikes and momentary power losses, a generator gets you through the longer power outages.  I think that every AS/400, iSeries, or i5 should be plugged into some sort of UPS power protection system.  If your machine is plugged into a UPS, when the power from the street is lost, the machine will run off the batteries in the UPS.  How long the machine can run is determined by the capacity of the batteries as well as the load (how much is plugged in and drawing power).  UPSs come in many sizes, and you have to match the size of the UPS to the requirements of everything in your data center that is plugged into it, along with how long you want to continue to run.

i5/OS can be alerted that the power to the UPS has been lost and that the UPS is currently running off its batteries.  I go to many shops where the machine is plugged into the UPS, but the signal cable is unattached.  The signal cable tells the machine the status of the UPS, i.e. whether the UPS is running on street power or off the batteries.  If your machine is plugged into the UPS, and the UPS has no way to tell your i5 that its batteries are about to run out, then when the batteries do run out the machine will just crash.  We never want our machines to crash, as that involves an Abnormal IPL and error recovery on the way back up, which can really increase your IPL time.  Therefore, you should also ensure that the signaling cable from your UPS to the i5 machine is connected.  This way, when the UPS loses power from the street, it can signal i5/OS that it is providing power from its batteries and that now would be a good time to start an orderly shutdown of the machine.  You can write your own UPS Monitoring program if you want very fine control over bringing your machine down.  We have written these for a few of our customers.

One important note: if your signaling cable is not connected, do not connect it in the middle of the day, as in some instances it will cause your machine to think there has been a loss of power and start a shutdown.  The time to connect the cable is during your planned maintenance window.  You might also want to test what would happen in the event of a real power outage.  First, make sure your machine is connected via the signal cable to the UPS and that ALL of your plugs are plugged into the UPS.  Make sure you are not running anything on your i5 and that you have all your users signed off, just in case.  Then disconnect your UPS from its power; your machine should still be running (a great way to test the batteries of your UPS).  Your UPS should then signal the i5 to start to power down, and the machine will shut down.

By the way, we recommend plugging your console into the UPS, and your Ethernet switches as well, if your UPS has the capacity for the extra load.  This way, in the event of losing power, your connections will not terminate, nor will your Ethernet lines go into a Failed state, preventing connection when the power comes back on.

Release levels and PTFs
People are always asking me how often they should be performing PTF maintenance, and when is the right time to upgrade their operating system.  I updated this article from last month with the current levels of PTFs.  Let’s look at PTFs.  First, PTFs are Program Temporary Fixes that are created by IBM to fix a problem that has occurred or to prevent a problem from occurring.  In addition, sometimes PTFs add new functionality, tighten security, or improve performance.  Therefore, I am always dumbfounded as to why customers do not perform PTF maintenance on their machines at least quarterly.  If IBM has come out with a fix for your disk drives, why would you want to wait for a disk drive to fail with that problem, only to be told that there was a fix and that applying the PTF beforehand would have averted it?  Therefore, I think a quarterly PTF maintenance strategy is a smart move.  Many of our customers are on our quarterly PTF maintenance program, and that gives them the peace of mind of knowing their system is up to date on PTFs.  Below is a table of the major group PTFs for the last few releases.  You might notice that IBM just created a new Security PTF Group this week, so I have added it to our list, as we are installing it for our customers on the iTech Solutions Quarterly Maintenance program.

Releases                6.1     V5R4    V5R3    V5R2
Cumulative Package      8190    8183    8085    6080
Group Hipers            15      78      153     189
Database Group          5       16      22      25
Java Group              3       15      21      27
Print Group             3       23      15      7
Backup/Recovery         2       21      29      31
Security Group          2       2       3       –

The easiest way to check your levels is to issue the command WRKPTFGRP.  The groups should all have a status of installed, and you should be up to the latest level for all of the above, based upon your release.  There are more groups than the ones listed above, but these are the general ones that most people require.  We can help you determine which group PTFs you should be installing on your machine based upon your licensed programs.

Here is a nice tidbit.  The Cumulative PTF package number is broken down as YDDD, where Y is the last digit of the year and DDD is the day of the year it was released.  Therefore, if we look at the cumulative package for V5R4, the ID is 8183.  We can determine that it was created on the 183rd day of 2008, which is July 1st, 2008.  Look at your machine, and this will give you a quick indication of just how far out of date on PTFs you may be.  I left V5R1 off the list, because if you are on V5R1, you don’t need to be worrying about PTFs; you really need to be upgrading your operating system.  The same can be said for V5R2 and V5R3, but there are still customers who are on those releases.
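
The YDDD trick is easy to script.  Here is a minimal Python sketch; the function name is made up for this example, and because the ID carries only one digit for the year, the decade has to be supplied as an assumption (the 2000s by default).

```python
from datetime import date, timedelta


def decode_cum_package(package_id: str, decade_start: int = 2000) -> date:
    """Turn a YDDD cumulative package ID into its release date.

    decade_start is an assumption: the ID holds only the last digit of the year.
    """
    if len(package_id) != 4 or not package_id.isdigit():
        raise ValueError("expected a four-digit YDDD package ID")
    year = decade_start + int(package_id[0])
    day_of_year = int(package_id[1:])
    if not 1 <= day_of_year <= 366:
        raise ValueError("day of year must be between 1 and 366")
    # Day 1 is January 1st, so add (day_of_year - 1) days to the start of the year.
    return date(year, 1, 1) + timedelta(days=day_of_year - 1)


if __name__ == "__main__":
    for release, package_id in [("6.1", "8190"), ("V5R4", "8183"),
                                ("V5R3", "8085"), ("V5R2", "6080")]:
        print(release, package_id, decode_cum_package(package_id).isoformat())
```

Run against the table above, this gives July 8, 2008 for 6.1, July 1, 2008 for V5R4, March 25, 2008 for V5R3, and March 21, 2006 for V5R2.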

If you have an HMC, you should be running V7.3.3, with PTF MH01105 installed. For your Flexible Service Processor (FSP) that is inside your Power 5 or Power5+ (520, 515, 525, 550, 570), the level should be 01_SF240_338. Power 6 customers will have the latest FSP code installed since those processors are new.  If you need help with upgrading your HMC or FSP just give us a call.  We will be happy to perform the function for you.
