Decoding PCIe controller errors

I’m trying to debug a PCIe card that misbehaves when given invalid instructions. When that happens, OPAL logs the following PHB register dump:

[44337.238118774,3] PHB#0030[8:0]:                  brdgCtl = 00000002
[44337.238194214,3] PHB#0030[8:0]:             deviceStatus = 00060020
[44337.238243488,3] PHB#0030[8:0]:               slotStatus = 00402000
[44337.238280889,3] PHB#0030[8:0]:               linkStatus = a0840008
[44337.238323609,3] PHB#0030[8:0]:             devCmdStatus = 00100107
[44337.238364182,3] PHB#0030[8:0]:             devSecStatus = 00000800
[44337.238418968,3] PHB#0030[8:0]:          rootErrorStatus = 00000000
[44337.238454884,3] PHB#0030[8:0]:          corrErrorStatus = 00000000
[44337.238495038,3] PHB#0030[8:0]:        uncorrErrorStatus = 00000000
[44337.238527350,3] PHB#0030[8:0]:                   devctl = 00000020
[44337.238562198,3] PHB#0030[8:0]:                  devStat = 00000006
[44337.238599169,3] PHB#0030[8:0]:                  tlpHdr1 = 00000000
[44337.238638934,3] PHB#0030[8:0]:                  tlpHdr2 = 00000000
[44337.238675689,3] PHB#0030[8:0]:                  tlpHdr3 = 00000000
[44337.238736732,3] PHB#0030[8:0]:                  tlpHdr4 = 00000000
[44337.238767844,3] PHB#0030[8:0]:                 sourceId = 00000000
[44337.238805307,3] PHB#0030[8:0]:                     nFir = 0000000000000000
[44337.238846664,3] PHB#0030[8:0]:                 nFirMask = 0030001c00000000
[44337.238890950,3] PHB#0030[8:0]:                  nFirWOF = 0000000000000000
[44337.238934677,3] PHB#0030[8:0]:                 phbPlssr = 0000001c00000000
[44337.238975390,3] PHB#0030[8:0]:                   phbCsr = 0000001c00000000
[44337.239016337,3] PHB#0030[8:0]:                   lemFir = 0000000110000080
[44337.239054651,3] PHB#0030[8:0]:             lemErrorMask = 0000000000000000
[44337.239095165,3] PHB#0030[8:0]:                   lemWOF = 0000000000000080
[44337.239136147,3] PHB#0030[8:0]:           phbErrorStatus = 00000a8000000000
[44337.239177044,3] PHB#0030[8:0]:      phbFirstErrorStatus = 0000020000000000
[44337.239217947,3] PHB#0030[8:0]:             phbErrorLog0 = 2148000098000240
[44337.239260276,3] PHB#0030[8:0]:             phbErrorLog1 = a008400000000000
[44337.239303788,3] PHB#0030[8:0]:        phbTxeErrorStatus = 0000000000000000
[44337.239342215,3] PHB#0030[8:0]:   phbTxeFirstErrorStatus = 0000000000000000
[44337.239384268,3] PHB#0030[8:0]:          phbTxeErrorLog0 = 0000000000000000
[44337.239425141,3] PHB#0030[8:0]:          phbTxeErrorLog1 = 0000000000000000
[44337.239464823,3] PHB#0030[8:0]:     phbRxeArbErrorStatus = 0000000800000000
[44337.239508182,3] PHB#0030[8:0]: phbRxeArbFrstErrorStatus = 0000000800000000
[44337.239545050,3] PHB#0030[8:0]:       phbRxeArbErrorLog0 = ff10030000000100
[44337.239585980,3] PHB#0030[8:0]:       phbRxeArbErrorLog1 = 00002000449f0a00
[44337.239630313,3] PHB#0030[8:0]:     phbRxeMrgErrorStatus = 0000000000000000
[44337.239672299,3] PHB#0030[8:0]: phbRxeMrgFrstErrorStatus = 0000000000000000
[44337.239711936,3] PHB#0030[8:0]:       phbRxeMrgErrorLog0 = 0000000000000000
[44337.239754141,3] PHB#0030[8:0]:       phbRxeMrgErrorLog1 = 0000000000000000
[44337.239794980,3] PHB#0030[8:0]:     phbRxeTceErrorStatus = 6000000000000000
[44337.239884589,3] PHB#0030[8:0]: phbRxeTceFrstErrorStatus = 2000000000000000
[44337.239929101,3] PHB#0030[8:0]:       phbRxeTceErrorLog0 = 4000000000000000
[44337.239979895,3] PHB#0030[8:0]:       phbRxeTceErrorLog1 = 0000000000000000
[44337.240025603,3] PHB#0030[8:0]:        phbPblErrorStatus = 0000000000020000
[44337.240067517,3] PHB#0030[8:0]:   phbPblFirstErrorStatus = 0000000000020000
[44337.240110980,3] PHB#0030[8:0]:          phbPblErrorLog0 = 0000000000000000
[44337.240150225,3] PHB#0030[8:0]:          phbPblErrorLog1 = 0000000000000000
[44337.240192242,3] PHB#0030[8:0]:      phbPcieDlpErrorLog1 = 0000000000000000
[44337.240234460,3] PHB#0030[8:0]:      phbPcieDlpErrorLog2 = 0000000000000000
[44337.240275206,3] PHB#0030[8:0]:    phbPcieDlpErrorStatus = 0000000000000000
[44337.240319756,3] PHB#0030[8:0]:       phbRegbErrorStatus = 0000004000000000
[44337.240364406,3] PHB#0030[8:0]:  phbRegbFirstErrorStatus = 0000004000000000
[44337.240405459,3] PHB#0030[8:0]:         phbRegbErrorLog0 = 8800003c00000000
[44337.240446260,3] PHB#0030[8:0]:         phbRegbErrorLog1 = 0000000000000200
[44337.240489527,3] PHB#0030[8:0]:                PEST[000] = 8000b03800000000 8000000000000000
[44337.240542518,3] PHB#0030[8:0]:                PEST[1ff] = 3740002a02000000 0000000000000000

That leaves me with a long dump of register values. Is there a document I can follow to make sense of them, or a tool that decodes them?
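Absent a decoder, a first step could be to turn the dump into name/value pairs and drop the all-zero registers, so only the ones that need looking up remain. A minimal sketch, assuming every line has the shape shown in the dump above; the regular expression and the names `parse_phb_dump` and `nonzero` are made up for illustration and are not part of OPAL:

```python
import re

# Assumed line shape: "[<time>,<level>] PHB#<id>[<chip>:<index>]:  <name> = <hex> [<hex>]"
LINE_RE = re.compile(
    r"^\[\s*[\d.]+,\d+\]\s+PHB#[0-9a-fA-F]+\[\d+:\d+\]:"
    r"\s*(\S+)\s*=\s*([0-9a-fA-F ]+?)\s*$"
)

def parse_phb_dump(text):
    """Map each register name to a list of integer values (PEST lines carry two)."""
    regs = {}
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m:
            regs[m.group(1)] = [int(v, 16) for v in m.group(2).split()]
    return regs

def nonzero(regs):
    """Keep only the registers that have at least one non-zero value."""
    return {name: vals for name, vals in regs.items() if any(vals)}
```

Calling `nonzero(parse_phb_dump(log_text))` on the saved log text would leave only the registers that actually recorded something.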

I’ve looked at “POWER9_PCIe_controller_v11_27JUL2018_pub.pdf”, but it uses different register names from the ones OPAL prints in the dump, so I suspect it is not the right document. Also, are bit numbers in that document counted from the MSB or the LSB?
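Until the bit-numbering question is settled, one option is to list the set bits of a register under both conventions and see which one lines up with the document's bit descriptions. IBM POWER documentation generally numbers bits from the MSB (bit 0 is the most significant bit), but whether this particular PDF does so is an assumption here, not something confirmed. A small sketch, with `set_bits` as a made-up helper name:

```python
def set_bits(value, width=64):
    """Return (msb0, lsb0) index pairs for each set bit, most significant first.

    msb0 counts from the most significant bit (bit 0 = MSB),
    lsb0 counts from the least significant bit (bit 0 = LSB).
    Use width=32 for the 8-hex-digit registers in the dump.
    """
    pairs = []
    for lsb0 in range(width - 1, -1, -1):
        if (value >> lsb0) & 1:
            pairs.append((width - 1 - lsb0, lsb0))
    return pairs

# phbErrorStatus from the dump above
print(set_bits(0x00000A8000000000))
# [(20, 43), (22, 41), (24, 39)]
```

The first number in each pair is the index to look up if the document counts from the MSB, the second if it counts from the LSB.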