
I am NOT a programmer but a System Integrator with experience since DOS

I bought a used barebone PC and it has some minor issues:
It sometimes crashes, and the crashes are not related to the RAM.

It's running Debian with KVM (Proxmox) on the host, with CentOS and Windows VMs on top.

I have this error in mcelog on Debian:

Hardware event. This is not a software error.
MCE 0
CPU 0 BANK 8 TSC 25f5e6ef72
MISC 12dc0 ADDR 372c9000007c2f6
TIME 1614950322 Fri Mar  5 14:18:42 2021
MCG status:
MCi status:
Corrected error
MCi_MISC register valid
MCi_ADDR register valid
Threshold based error status: green
MCA: corrected filtering (some unreported errors in same region)
**Generic CACHE Level-3 Generic Error**
STATUS 8c2000800001110b MCGSTATUS 0
MCGCAP c09 APICID 0 SOCKETID 0
MICROCODE ca
CPUID Vendor Intel Family 6 Model 142 Step 9

Question:

Is it generally possible to disable only the L3 cache? The CPU might otherwise work.
I was reading another post on Stack Overflow where the cache was disabled completely (L1, L2 and L3),
and the machine was too slow to run X.

I found this trick. Do I disable the cache with this?

x:~# cat /proc/mtrr
reg00: base=0x080000000 ( 2048MB), size= 2048MB, count=1: uncachable
reg01: base=0x07c000000 ( 1984MB), size=   64MB, count=1: uncachable
reg02: base=0x07b800000 ( 1976MB), size=    8MB, count=1: uncachable
x:~# echo disable=00 > /proc/mtrr
x:~# echo disable=01 > /proc/mtrr
x:~# echo disable=02 > /proc/mtrr
x:~# cat /proc/mtrr
x:~#

I am Curious, if this is my first long lasting stackoverflow post,
maybe unknown will again delete it because unknown has not learned about freedom of speech 🙂
censorship forever!

2 Answers


  1. Intel has not publicly disclosed how to disable only the L3 cache on most processors, including the Core i7-7567U. Disabling the MTRRs effectively disables all three cache levels on your processor because all accesses become type UC (uncacheable), with one possible exception discussed below.

    The /proc/mtrr file lists only the enabled variable-range MTRRs, not all of the MTRRs. Any other variable-range MTRRs are disabled and you don’t have to worry about them. The fixed-range MTRRs are still enabled though. These specify the memory types of fixed ranges in the bottom 1 MB of the physical address space. Disabling all of the MTRRs listed by /proc/mtrr won’t disable or affect the fixed-range MTRRs. It’s typical for some of the fixed ranges to have cacheable memory types.

    According to the relevant memory type resolution rules, a physical memory address not contained in the range of any enabled MTRR has the memory type specified in the lowest 8 bits of the IA32_MTRR_DEF_TYPE MSR. This type is UC on most or all x86 production systems. You can determine the default type by executing sudo rdmsr -a 0x2ff and checking the lowest 8 bits of the output for each logical core. Note that the MTRRs are actually per physical core, but rdmsr offers no switch to run on only one of the logical cores per physical core.

    If you want to disable all MTRRs, the best way is to set bit 11 to zero by executing wrmsr -a 0x2ff 15 0x400, which forces the entire physical address space to be UC. You don’t need to change anything in /proc/mtrr and it’s better to just keep it as is. The -a option is important here because you usually want memory types to not be dependent on which core the code happens to be running on.

    There are still a couple of issues with this simple approach. Modern processors include additional MTRRs specific for memory ranges used in system management mode (SMM). These MTRRs can only be modified in SMM. When enabled, any accesses outside of SMM to the memory ranges configured in its MTRRs are ignored. On my system, the memory type specified for the SMM range is WB, so it’s cacheable.

    [Temporary notice: Thinking more about it, I’m not sure whether IA32_MTRR_DEF_TYPE[11] controls also the SMM range registers. I’ll have to check with Intel. If it doesn’t, then on processors that support SMM MTRRs, which include yours, the only way to disable caching entirely is by setting CR0.CD to 1. If it does, then no problem.]

    Another issue is that I don’t think wrmsr -a 0x2ff 15 0x400 adheres to the recommended procedure for changing MTRRs consistently on all cores of the system, so I would only try it for experimental purposes. On production systems, you may have to write a kernel module to do it properly as described in the manuals.

    I don’t think it’s required to write back and invalidate the caches before disabling the MTRRs because UC accesses are looked up in the caches as well. But I’m unable to find a statement in the Intel or AMD manuals to confirm this at this time.
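
The IA32_MTRR_DEF_TYPE check described in the first answer can be sketched in a few lines of Python. This is only an illustration, not part of the original answer: the function name decode_mtrr_def_type and the sample value c00 are hypothetical, and the bit positions (default type in bits 7:0, fixed-range enable in bit 10, MTRR enable in bit 11) are taken from the Intel SDM description of the register. It assumes you paste in the hexadecimal value that rdmsr printed for one core.

```python
# Sketch: decode a raw IA32_MTRR_DEF_TYPE (MSR 0x2FF) value.
# Assumed bit layout (Intel SDM): bits 7:0 = default memory type,
# bit 10 = fixed-range MTRR enable (FE), bit 11 = MTRR enable (E).

MEMORY_TYPES = {
    0x00: "UC (uncacheable)",
    0x01: "WC (write combining)",
    0x04: "WT (write through)",
    0x05: "WP (write protected)",
    0x06: "WB (write back)",
}


def decode_mtrr_def_type(value):
    """Return the default type and the two enable flags of the MSR value."""
    default_type = value & 0xFF
    return {
        "default_type": MEMORY_TYPES.get(default_type, "reserved"),
        "fixed_range_enabled": bool(value & (1 << 10)),
        "mtrrs_enabled": bool(value & (1 << 11)),
    }


if __name__ == "__main__":
    # "c00" is a hypothetical reading; "400" is the value written by the
    # wrmsr command quoted in the answer.
    for raw in ("c00", "400"):
        print(raw, decode_mtrr_def_type(int(raw, 16)))
```

With the value 400, bit 11 is clear, which is the state in which the MTRRs are disabled and every access is treated as UC.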

  2. Intel CPU Family 6 Model 142 (0x8E) refers to Core processors of the 7th to 9th generations. All of these processors have an "inclusive L3" cache: all lines in any L1 or L2 cache must also be cached in the L3. "Disabling" the L3 could only work if there were a mode bit that prevented the L3 from caching data, while still allowing the L3 directory to perform its function in managing cache coherence.
