I’m trying to debug a PCIe card that misbehaves when given invalid instructions. When it does, OPAL logs the following PHB register dump:
[44337.238118774,3] PHB#0030[8:0]: brdgCtl = 00000002
[44337.238194214,3] PHB#0030[8:0]: deviceStatus = 00060020
[44337.238243488,3] PHB#0030[8:0]: slotStatus = 00402000
[44337.238280889,3] PHB#0030[8:0]: linkStatus = a0840008
[44337.238323609,3] PHB#0030[8:0]: devCmdStatus = 00100107
[44337.238364182,3] PHB#0030[8:0]: devSecStatus = 00000800
[44337.238418968,3] PHB#0030[8:0]: rootErrorStatus = 00000000
[44337.238454884,3] PHB#0030[8:0]: corrErrorStatus = 00000000
[44337.238495038,3] PHB#0030[8:0]: uncorrErrorStatus = 00000000
[44337.238527350,3] PHB#0030[8:0]: devctl = 00000020
[44337.238562198,3] PHB#0030[8:0]: devStat = 00000006
[44337.238599169,3] PHB#0030[8:0]: tlpHdr1 = 00000000
[44337.238638934,3] PHB#0030[8:0]: tlpHdr2 = 00000000
[44337.238675689,3] PHB#0030[8:0]: tlpHdr3 = 00000000
[44337.238736732,3] PHB#0030[8:0]: tlpHdr4 = 00000000
[44337.238767844,3] PHB#0030[8:0]: sourceId = 00000000
[44337.238805307,3] PHB#0030[8:0]: nFir = 0000000000000000
[44337.238846664,3] PHB#0030[8:0]: nFirMask = 0030001c00000000
[44337.238890950,3] PHB#0030[8:0]: nFirWOF = 0000000000000000
[44337.238934677,3] PHB#0030[8:0]: phbPlssr = 0000001c00000000
[44337.238975390,3] PHB#0030[8:0]: phbCsr = 0000001c00000000
[44337.239016337,3] PHB#0030[8:0]: lemFir = 0000000110000080
[44337.239054651,3] PHB#0030[8:0]: lemErrorMask = 0000000000000000
[44337.239095165,3] PHB#0030[8:0]: lemWOF = 0000000000000080
[44337.239136147,3] PHB#0030[8:0]: phbErrorStatus = 00000a8000000000
[44337.239177044,3] PHB#0030[8:0]: phbFirstErrorStatus = 0000020000000000
[44337.239217947,3] PHB#0030[8:0]: phbErrorLog0 = 2148000098000240
[44337.239260276,3] PHB#0030[8:0]: phbErrorLog1 = a008400000000000
[44337.239303788,3] PHB#0030[8:0]: phbTxeErrorStatus = 0000000000000000
[44337.239342215,3] PHB#0030[8:0]: phbTxeFirstErrorStatus = 0000000000000000
[44337.239384268,3] PHB#0030[8:0]: phbTxeErrorLog0 = 0000000000000000
[44337.239425141,3] PHB#0030[8:0]: phbTxeErrorLog1 = 0000000000000000
[44337.239464823,3] PHB#0030[8:0]: phbRxeArbErrorStatus = 0000000800000000
[44337.239508182,3] PHB#0030[8:0]: phbRxeArbFrstErrorStatus = 0000000800000000
[44337.239545050,3] PHB#0030[8:0]: phbRxeArbErrorLog0 = ff10030000000100
[44337.239585980,3] PHB#0030[8:0]: phbRxeArbErrorLog1 = 00002000449f0a00
[44337.239630313,3] PHB#0030[8:0]: phbRxeMrgErrorStatus = 0000000000000000
[44337.239672299,3] PHB#0030[8:0]: phbRxeMrgFrstErrorStatus = 0000000000000000
[44337.239711936,3] PHB#0030[8:0]: phbRxeMrgErrorLog0 = 0000000000000000
[44337.239754141,3] PHB#0030[8:0]: phbRxeMrgErrorLog1 = 0000000000000000
[44337.239794980,3] PHB#0030[8:0]: phbRxeTceErrorStatus = 6000000000000000
[44337.239884589,3] PHB#0030[8:0]: phbRxeTceFrstErrorStatus = 2000000000000000
[44337.239929101,3] PHB#0030[8:0]: phbRxeTceErrorLog0 = 4000000000000000
[44337.239979895,3] PHB#0030[8:0]: phbRxeTceErrorLog1 = 0000000000000000
[44337.240025603,3] PHB#0030[8:0]: phbPblErrorStatus = 0000000000020000
[44337.240067517,3] PHB#0030[8:0]: phbPblFirstErrorStatus = 0000000000020000
[44337.240110980,3] PHB#0030[8:0]: phbPblErrorLog0 = 0000000000000000
[44337.240150225,3] PHB#0030[8:0]: phbPblErrorLog1 = 0000000000000000
[44337.240192242,3] PHB#0030[8:0]: phbPcieDlpErrorLog1 = 0000000000000000
[44337.240234460,3] PHB#0030[8:0]: phbPcieDlpErrorLog2 = 0000000000000000
[44337.240275206,3] PHB#0030[8:0]: phbPcieDlpErrorStatus = 0000000000000000
[44337.240319756,3] PHB#0030[8:0]: phbRegbErrorStatus = 0000004000000000
[44337.240364406,3] PHB#0030[8:0]: phbRegbFirstErrorStatus = 0000004000000000
[44337.240405459,3] PHB#0030[8:0]: phbRegbErrorLog0 = 8800003c00000000
[44337.240446260,3] PHB#0030[8:0]: phbRegbErrorLog1 = 0000000000000200
[44337.240489527,3] PHB#0030[8:0]: PEST[000] = 8000b03800000000 8000000000000000
[44337.240542518,3] PHB#0030[8:0]: PEST[1ff] = 3740002a02000000 0000000000000000
That leaves me with a long dump of register values. Is there a document I can follow to make sense of them, or a decoder tool?
I’ve looked at “POWER9_PCIe_controller_v11_27JUL2018_pub.pdf”, but it uses different register names than OPAL does in the dump, so I suspect it is not the right document. Also, are bit numbers in that document counted from the MSB or the LSB?
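In the meantime, a small script can at least list which bits are set in each nonzero register under both numbering conventions, so the positions can be compared against whichever document turns out to apply. This is only a minimal sketch in Python, assuming the line format shown in the dump above; `parse_dump` and `set_bits` are names I made up, the register width is guessed from the number of hex digits printed, and it says nothing about what any bit means.

```python
import re

# Matches lines such as:
#   [44337.239177044,3] PHB#0030[8:0]: phbFirstErrorStatus = 0000020000000000
# Lines carrying two values (the PEST[...] entries) do not match and are skipped.
LINE_RE = re.compile(r"PHB#\w+\[\d+:\d+\]:\s+(\S+)\s+=\s+([0-9a-fA-F]+)\s*$")


def parse_dump(text):
    """Return {register name: (value, width in bits)} for single-value lines."""
    regs = {}
    for line in text.splitlines():
        m = LINE_RE.search(line)
        if m:
            name, hexval = m.groups()
            # Assumption: the width is inferred from the number of hex digits printed.
            regs[name] = (int(hexval, 16), 4 * len(hexval))
    return regs


def set_bits(value, width):
    """Return (msb0, lsb0): positions of the set bits under each numbering."""
    lsb0 = [i for i in range(width) if (value >> i) & 1]
    msb0 = sorted(width - 1 - i for i in lsb0)
    return msb0, lsb0


if __name__ == "__main__":
    import sys

    # Print only the registers that have at least one bit set.
    for name, (value, width) in parse_dump(sys.stdin.read()).items():
        if value:
            msb0, lsb0 = set_bits(value, width)
            print(f"{name}: MSB-0 bits {msb0}, LSB-0 bits {lsb0}")
```

Saved under a hypothetical name such as `decode.py`, it would be run as `python3 decode.py < dump.txt`, where `dump.txt` holds the log lines pasted above.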