Commit c5b48fa

Łukasz authored and suryasaimadhu committed
EDAC, sb_edac: Fix channel reporting on Knights Landing
On the Intel Xeon Phi Knights Landing processor family, the channels of
the memory controller have an atypical arrangement: MC0 is mapped to
CH3,4,5 and MC1 is mapped to CH0,1,2. This causes the EDAC driver to
report the channel name incorrectly.

We missed this change earlier, so the code already contains a similar
comment, but the translation function is incorrect.

Without this patch:
  errors in DIMM_A and DIMM_D were reported in DIMM_D
  errors in DIMM_B and DIMM_E were reported in DIMM_E
  errors in DIMM_C and DIMM_F were reported in DIMM_F

Correct this.

Hubert Chrzaniuk:
 - rebased to 4.8
 - comments and code cleanup

Fixes: d0cdf90 ("sb_edac: Add Knights Landing (Xeon Phi gen 2) support")
Reviewed-by: Tony Luck <[email protected]>
Cc: Mauro Carvalho Chehab <[email protected]>
Cc: Hubert Chrzaniuk <[email protected]>
Cc: linux-edac <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: <[email protected]> # v4.5..
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Lukasz Odzioba <[email protected]>
[ Boris: Simplify a bit by removing char mc. ]
Signed-off-by: Borislav Petkov <[email protected]>
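To make the misreporting described above concrete, here is a minimal sketch that compares the pre-patch and post-patch macros. The helper names (`dimm_letter`, `dimm_before`, `dimm_after`, `knl_channel_remap_old`) are hypothetical and not part of the driver. The sketch assumes the DIMM letter is `'A'` plus the remapped channel, which is what the examples in the message imply, and that the error path only had a per-controller channel in the range 0-2 to work with.

```c
#include <assert.h>

/* Pre-patch macro: takes a single channel index and rotates it by 3. */
#define knl_channel_remap_old(channel) (((channel) + 3) % 6)

/* Post-patch macro from this commit: MC0 -> CH3..5, MC1 -> CH0..2. */
#define knl_channel_remap(mc, chan) ((mc) ? (chan) : (chan) + 3)

/* Assumption: the reported DIMM letter is 'A' + remapped channel. */
static inline char dimm_letter(int channel)
{
	assert(channel >= 0 && channel <= 5);
	return (char)('A' + channel);
}

/*
 * Before the patch the error path remapped only the per-controller
 * channel (0-2), so the result could never fall in CH0..2.
 */
static inline char dimm_before(int chan)
{
	return dimm_letter(knl_channel_remap_old(chan));
}

/* After the patch the memory controller index picks the right half. */
static inline char dimm_after(int mc, int chan)
{
	return dimm_letter(knl_channel_remap(mc, chan));
}
```

With these helpers, `dimm_before(0)` yields `'D'` no matter which controller saw the error, while `dimm_after(1, 0)` yields `'A'` and `dimm_after(0, 0)` yields `'D'`, matching the DIMM_A/DIMM_D case listed in the message.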
1 parent 29b4817 commit c5b48fa

1 file changed

+11
-4
lines changed

drivers/edac/sb_edac.c

Lines changed: 11 additions & 4 deletions
@@ -552,9 +552,9 @@ static const struct pci_id_table pci_dev_descr_haswell_table[] = {
 /* Knight's Landing Support */
 /*
  * KNL's memory channels are swizzled between memory controllers.
- * MC0 is mapped to CH3,5,6 and MC1 is mapped to CH0,1,2
+ * MC0 is mapped to CH3,4,5 and MC1 is mapped to CH0,1,2
  */
-#define knl_channel_remap(channel) ((channel + 3) % 6)
+#define knl_channel_remap(mc, chan) ((mc) ? (chan) : (chan) + 3)
 
 /* Memory controller, TAD tables, error injection - 2-8-0, 2-9-0 (2 of these) */
 #define PCI_DEVICE_ID_INTEL_KNL_IMC_MC 0x7840
@@ -1286,7 +1286,7 @@ static u32 knl_get_mc_route(int entry, u32 reg)
 	mc = GET_BITFIELD(reg, entry*3, (entry*3)+2);
 	chan = GET_BITFIELD(reg, (entry*2) + 18, (entry*2) + 18 + 1);
 
-	return knl_channel_remap(mc*3 + chan);
+	return knl_channel_remap(mc, chan);
 }
 
 /*
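The route decoding in this hunk can be sketched outside the kernel as follows. `get_bitfield` is a hypothetical stand-in for the driver's `GET_BITFIELD` macro (assumed here to extract the inclusive bit range lo..hi), and `knl_get_mc_route_sketch` is an illustrative name. The 0-5 bound on `entry` is an assumption derived from the bit layout fitting in 32 bits, not something this diff states.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for GET_BITFIELD: bits lo..hi, inclusive. */
static inline uint32_t get_bitfield(uint32_t v, unsigned lo, unsigned hi)
{
	unsigned width = hi - lo + 1;
	uint32_t mask = (width >= 32) ? UINT32_MAX
				      : ((UINT32_C(1) << width) - 1);
	return (v >> lo) & mask;
}

/* Post-patch macro from this commit. */
#define knl_channel_remap(mc, chan) ((mc) ? (chan) : (chan) + 3)

/*
 * Each entry has a 3-bit mc field starting at bit entry*3 and a
 * 2-bit chan field starting at bit 18 + entry*2.
 */
static inline uint32_t knl_get_mc_route_sketch(int entry, uint32_t reg)
{
	uint32_t mc, chan;

	assert(entry >= 0 && entry <= 5); /* assumed bound, see lead-in */
	mc = get_bitfield(reg, entry * 3, entry * 3 + 2);
	chan = get_bitfield(reg, entry * 2 + 18, entry * 2 + 18 + 1);

	return knl_channel_remap(mc, chan);
}
```

For mc values of 0 and 1, the old expression `(mc*3 + chan + 3) % 6` and the new `knl_channel_remap(mc, chan)` give the same result, so in that range this hunk adapts the call to the new macro signature without changing what the function returns.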
@@ -2997,8 +2997,15 @@ static void sbridge_mce_output_error(struct mem_ctl_info *mci,
 	} else {
 		char A = *("A");
 
-		channel = knl_channel_remap(channel);
+		/*
+		 * Reported channel is in range 0-2, so we can't map it
+		 * back to mc. To figure out mc we check machine check
+		 * bank register that reported this error.
+		 * bank15 means mc0 and bank16 means mc1.
+		 */
+		channel = knl_channel_remap(m->bank == 16, channel);
 		channel_mask = 1 << channel;
+
 		snprintf(msg, sizeof(msg),
 			 "%s%s err_code:%04x:%04x channel:%d (DIMM_%c)",
 			 overflow ? " OVERFLOW" : "",
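The error-path change in the last hunk can be illustrated with a small standalone sketch. `struct mce_stub` is a hypothetical stand-in that carries only the `bank` field used here, and the two helper names are illustrative. The bank-to-controller mapping (bank 15 is MC0, bank 16 is MC1) is taken from the comment added in the diff.

```c
#include <assert.h>

/* Post-patch macro from this commit. */
#define knl_channel_remap(mc, chan) ((mc) ? (chan) : (chan) + 3)

/* Hypothetical stand-in for the machine check record; only bank is used. */
struct mce_stub {
	int bank;
};

/*
 * The reported channel is 0-2, so the memory controller is inferred
 * from the reporting bank: bank 16 selects MC1, anything else is
 * treated as MC0 (bank 15 in the diff's comment).
 */
static inline int knl_error_channel(const struct mce_stub *m, int channel)
{
	assert(channel >= 0 && channel <= 2);
	return knl_channel_remap(m->bank == 16, channel);
}

/* One bit per physical channel, as in channel_mask = 1 << channel. */
static inline unsigned int knl_error_channel_mask(const struct mce_stub *m,
						  int channel)
{
	return 1u << knl_error_channel(m, channel);
}
```

An error on reported channel 0 from bank 15 therefore lands on physical channel 3, while the same reported channel from bank 16 stays on physical channel 0.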