POWER9 POWERVM Incomplete State During I/O Concurrent Repair or Dynamic LPAR of IBM i owned hardware

Pete Massiello, President, iTech Solutions

Problem

This problem occurs only on POWER9 servers. The fix requires both an FSP firmware update and a patch for IBM i. If you don’t have the patches, do not use concurrent maintenance or dynamic LPAR on POWER9 servers under the following conditions:

  1. The resource is a physical PCIe adapter, an EMX0 cable, fanout module, chassis management card, or midplane.
  2. The resource is physically owned by an IBM i partition.
  3. The firmware fix is not applied.

Without the fix applied, the server can go to an incomplete state during the operation, and a server-level IPL is required to recover.
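The three conditions above can be read as a simple pre-check before starting a repair or DLPAR operation. The following is a minimal Python sketch of that logic, not an IBM tool: the `Operation` class, its field names, and the resource labels are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass

# Resource types named in condition 1 (labels are illustrative, not HMC values).
AFFECTED_RESOURCES = {
    "pcie_adapter",
    "emx0_cable",
    "fanout_module",
    "chassis_management_card",
    "midplane",
}


@dataclass
class Operation:
    resource_type: str          # one of the labels above, or anything else
    owner_os: str               # operating system of the owning partition
    firmware_fix_applied: bool  # True once the server firmware fix is installed


def must_avoid_concurrent_operation(op: Operation) -> bool:
    """Return True only when all three conditions from the list hold."""
    return (
        op.resource_type in AFFECTED_RESOURCES
        and op.owner_os == "IBM i"
        and not op.firmware_fix_applied
    )
```

For example, `must_avoid_concurrent_operation(Operation("pcie_adapter", "IBM i", False))` evaluates to `True`, while the same adapter on a server with the fix applied evaluates to `False`.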

An incomplete state can occur only on a POWER9 PowerVM managed server at some time after one of the following operations:

  1. Dynamic Logical Partition (DLPAR) operation to remove or move a physical PCIe adapter from a logical partition (LPAR).
  2. Concurrent replacement of a physical PCIe adapter assigned to an LPAR.
  3. Concurrent replacement of a Cable Card, cable, fanout module, mid-plane, or chassis management card from an EMX0 I/O drawer.

The condition is set up when one of these operations is blocked by the hypervisor because the resource is still in use.  When concurrent maintenance is performed, the hypervisor ends with a return code of 0x0300.  When the partition is powered down and the operation retried, the server can go to an incomplete state.  Currently, the only known trigger for this problem resides in the IBM i operating system.

Once the server is in an incomplete state from this defect, the server must be re-IPLed to recover full management capability.

Resolving the Problem

A firmware fix can be applied to prevent this incomplete state from occurring.  This fix is provided in the following system firmware fix levels:

FW Level                Released or Planned Date
FW910.50 (Vx910_xxx)    Planned
FW920.50 (Vx920_118)    Released November 25, 2019
FW930.20 (Vx930_xxx)    Planned
FW940.00 (Vx940_027)    Released November 22, 2019

Future firmware releases contain this fix.
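To check a server against these levels, the comparison can be scripted. The sketch below is a rough illustration in Python under several assumptions: that a level string has the form `V?NNN_NNN` (as in `Vx920_118`), that a higher number after the underscore within the same release means a later level, and that releases numbered above 940 count as "future releases". The function name and the sample strings are hypothetical; releases whose fix level is still listed as Planned are reported as unknown rather than guessed.

```python
import re

# Minimum fixed levels taken from the released rows of the table.
# FW910 and FW930 are absent on purpose: their fix levels are still "Planned".
FIXED_MINIMUMS = {920: 118, 940: 27}


def fix_status(level: str) -> str:
    """Return 'fixed', 'not fixed', or 'unknown' for a level such as 'VL920_118'."""
    match = re.fullmatch(r"V[A-Z](\d{3})_(\d{3})", level.strip().upper())
    if match is None:
        raise ValueError(f"unrecognized firmware level: {level!r}")
    release, build = int(match.group(1)), int(match.group(2))
    if release > max(FIXED_MINIMUMS):
        # Assumption: any release after FW940 is a "future release" with the fix.
        return "fixed"
    if release not in FIXED_MINIMUMS:
        # No fixed level has been published for this release yet.
        return "unknown"
    return "fixed" if build >= FIXED_MINIMUMS[release] else "not fixed"
```

With these assumptions, `fix_status("VL920_101")` returns `"not fixed"` and `fix_status("VM930_035")` returns `"unknown"`. Update `FIXED_MINIMUMS` once the planned levels are published.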

The following IBM i APARs must be applied to partitions owning PCIe hardware in order to prevent the problem from being triggered.

APAR MA47837 LIC-OTHER-SRCB6005120-INCORROUT POWER9 DLPAR-add fails

R740    In progress
R730    MF66695    Not yet in a cumulative package
R720    MF66544    Not yet in a cumulative package

APAR MA47943 LIC-OTHER-SRCB6006965-INCORROUT DLPAR remove after add

R740    In progress
R730    MF66865    Not yet in a cumulative package
R720    In progress

If you need help upgrading your firmware or applying the PTFs for IBM i, please contact us.  We also offer a subscription package where we will do 3 sets of PTFs over 2 years and 1 OS upgrade over that same 2-year period, all for only $295 a month.  Take the worry out of PTFs.

One comment on “POWER9 POWERVM Incomplete State During I/O Concurrent Repair or Dynamic LPAR of IBM i owned hardware”
  1. Diego Kesselman says:

    Pete, nice article, but we have this issue with latest vHMC connected to a brand new S914 with latest server firmware. Any clue?
