The kernel is the core of any operating system, which makes it the happiest place for malware to run its malicious code. If malware finds a place to stay in the kernel, it is out of reach of any antivirus software and the whole system is basically doomed. That is due to the unrestricted, highest privileges it gets when running in ring 0.
Today, we will see what features the Linux kernel provides to prevent malware from finding its happy place inside the kernel. Kernel self-protection covers many features that keep rootkits and other malware from lodging in the kernel and operating in ring 0, or what I like to call The ring of GOD!!
Attack Surface Reduction
If we want to reduce attacks, we should limit the ways in which they can happen. This means finding and fixing any weaknesses in our software and networks. By reducing the number of ways hackers can get in, we can significantly decrease the risk of them accessing our systems without permission and stealing our data.
Executable code and read-only data must not be writable
Any areas of the kernel with executable memory must not be writable. This includes additional places such as kernel modules, JIT memory, etc.
To support things like instruction alternatives, breakpoints, kprobes, etc., there is a temporary exception: the memory is made writable during the update and then returned to its original permissions.
Most architectures have these options on by default and they are not user selectable. Except for ARCH BTW!
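The temporary exception is easier to picture with a small user space analogy. The sketch below is not kernel code: it uses the POSIX calls mmap and mprotect on an ordinary page, and the helper names alloc_readonly_page and patch_readonly_page are made up for this post.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map one anonymous page, fill it, then drop write permission. */
static unsigned char *alloc_readonly_page(const void *src, size_t len)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    unsigned char *p;

    if (len > page)
        return NULL;
    p = mmap(NULL, page, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
    memcpy(p, src, len);
    if (mprotect(p, page, PROT_READ) != 0) {
        munmap(p, page);
        return NULL;
    }
    return p;
}

/*
 * Open the page for writing, patch one byte, restore read-only.
 * Returns 0 on success, -1 on failure.
 */
static int patch_readonly_page(unsigned char *p, size_t off,
                               unsigned char value)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);

    if (off >= page)
        return -1;
    if (mprotect(p, page, PROT_READ | PROT_WRITE) != 0)
        return -1;
    p[off] = value;                       /* the actual update */
    return mprotect(p, page, PROT_READ);  /* back to read-only */
}
```

Calling patch_readonly_page(p, 0, 'z') on a page returned by alloc_readonly_page changes the first byte while the page stays read-only before and after the call. The kernel's own mechanism is more careful than this (it also has to deal with other CPUs), so treat it as an illustration of the idea only.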
Function pointers and sensitive variables must not be writable
Many such variables can be made read-only by setting them const so that they live in the .rodata section instead of the .data section of the kernel, gaining the protection of the kernel’s strict memory permissions as described above.
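Variables that are updated only rarely need a temporary exception similar to the one for kernel code: when being updated, only the CPU thread performing the update would be given uninterruptible write access to the memory.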
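As a rough illustration of the const approach, here is a table of function pointers declared const. The struct and function names (file_ops, demo_open, demo_close, call_open) are invented for this example; the snippet is plain user space C, where toolchains typically place such an object in .rodata.

```c
#include <assert.h>

/* Hypothetical operations table, similar in spirit to kernel *_ops structs. */
struct file_ops {
    int (*open)(int id);
    int (*close)(int id);
};

static int demo_open(int id)  { return id + 1; }
static int demo_close(int id) { return id - 1; }

/*
 * const lets the compiler and linker place the table in a read-only
 * section, so the pointers cannot be redirected by a stray write.
 * In the kernel, variables that are written once during init can use
 * the __ro_after_init annotation instead.
 */
static const struct file_ops demo_ops = {
    .open  = demo_open,
    .close = demo_close,
};

/* Callers only ever see a pointer to const. */
static int call_open(const struct file_ops *ops, int id)
{
    return ops->open(id);
}
```

An assignment such as demo_ops.open = something_else is rejected at compile time, which is the cheap half of the protection; the memory permissions described above are the other half.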
Segregation of kernel memory from user space memory
The kernel must never execute user space memory. The kernel must also never access user space memory without explicit expectation to do so.
By blocking user space memory in this way, execution and data parsing cannot be passed to trivially controlled user space memory, forcing attacks to operate entirely in kernel memory.
This can be implemented either by hardware-based restrictions, e.g. x86’s SMEP/SMAP, or via emulation, e.g. ARM’s Memory Domains.
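To make "explicit expectation" concrete, here is a toy model of the idea behind SMAP. Nothing in it touches real page tables: user_access_open stands in for the CPU flag that x86 toggles with the stac and clac instructions, and toy_user_mem, read_user_byte and toy_copy_from_user are hypothetical names used only in this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for the CPU state that SMAP checks (EFLAGS.AC on x86). */
static int user_access_open;

/* Toy "user space" buffer. */
static char toy_user_mem[16] = "hello";

/* Every read of toy user memory goes through this gate. */
static int read_user_byte(size_t off, char *out)
{
    if (!user_access_open || off >= sizeof(toy_user_mem))
        return -1;              /* real hardware would fault here */
    *out = toy_user_mem[off];
    return 0;
}

/* Explicit accessor: opens the window, copies, closes it again. */
static int toy_copy_from_user(char *dst, size_t off, size_t len)
{
    int ret = 0;

    user_access_open = 1;       /* like stac on x86 */
    for (size_t i = 0; i < len && ret == 0; i++)
        ret = read_user_byte(off + i, &dst[i]);
    user_access_open = 0;       /* like clac on x86 */
    return ret;
}
```

A direct read_user_byte call fails, while the same read inside toy_copy_from_user succeeds. That is the whole point: accidental dereferences of user pointers stop working, and only the deliberate accessors get through.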
Restricting access to kernel modules
The kernel should never allow an unprivileged user to load kernel modules, since that would unexpectedly extend the available attack surface. In other words, only the root user should have access to the module loading feature.
Memory integrity
Many memory structures in the kernel are regularly abused to gain execution control during an attack.
By far the most commonly understood example is the stack buffer overflow, in which the return address stored on the stack is overwritten.
Stack buffer overflow
The classic stack buffer overflow involves writing past the expected end of a variable stored on the stack, ultimately writing a controlled value to the stack frame’s stored return address.
The most widely used defence is the presence of a stack canary between the stack variables and the return address (CONFIG_STACKPROTECTOR), which is verified just before the function returns. Other defences include things like shadow stacks.
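The canary check can be simulated safely in user space. The sketch below models a stack frame as a byte array (buffer, then canary, then saved return address); the layout, the toy_frame struct and the toy_call function are assumptions made for illustration, not what the compiler actually emits.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BUF_SIZE   16
#define CANARY_OFF BUF_SIZE
#define RET_OFF    (BUF_SIZE + sizeof(uint64_t))

/* Toy stack frame: 16-byte buffer, 8-byte canary, 8-byte return address. */
struct toy_frame {
    unsigned char bytes[BUF_SIZE + 2 * sizeof(uint64_t)];
};

/* Returns 0 if the canary survived, -1 if an overflow was detected. */
static int toy_call(const unsigned char *input, size_t len, uint64_t canary)
{
    struct toy_frame f;
    uint64_t ret_addr = 0x1234, check;

    if (len > sizeof(f.bytes))
        len = sizeof(f.bytes);     /* keep the toy itself memory safe */

    /* "Prologue": place the canary below the saved return address. */
    memset(f.bytes, 0, sizeof(f.bytes));
    memcpy(f.bytes + CANARY_OFF, &canary, sizeof(canary));
    memcpy(f.bytes + RET_OFF, &ret_addr, sizeof(ret_addr));

    /* The buggy copy: nothing limits len to BUF_SIZE. */
    memcpy(f.bytes, input, len);

    /* "Epilogue": verify the canary before the return address is used. */
    memcpy(&check, f.bytes + CANARY_OFF, sizeof(check));
    return check == canary ? 0 : -1;
}
```

A 16-byte input leaves the canary alone and toy_call returns 0. Anything longer has to run over the canary before it reaches the return address, so the check fails and the function returns -1 instead of "returning" to an attacker-chosen address.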
Stack depth overflow
A less well-understood attack is using a bug that triggers the kernel to consume stack memory with deep function calls or large stack allocations. With this attack, it is possible to write beyond the end of the kernel’s preallocated stack space and into sensitive structures.
Two important changes need to be made for better protection: moving the sensitive thread_info structure elsewhere and adding a faulting memory hole at the bottom of the stack to catch these overflows.
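The "faulting memory hole" can be demonstrated in user space with a guard page. This is a sketch under assumptions: it uses mmap and mprotect to put an inaccessible page below a small fake stack, then forks a child to show that touching the guard page kills the writer. The helper names alloc_guarded_stack and probe_write are made up here; in the kernel, the comparable protection comes with vmalloc-backed stacks (CONFIG_VMAP_STACK).

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

#define STACK_PAGES 4

/*
 * Map STACK_PAGES usable pages with one PROT_NONE page below them.
 * Returns the lowest address (the guard page), or NULL on failure.
 */
static unsigned char *alloc_guarded_stack(size_t *page_out)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    unsigned char *base = mmap(NULL, (STACK_PAGES + 1) * page,
                               PROT_READ | PROT_WRITE,
                               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    if (base == MAP_FAILED)
        return NULL;
    if (mprotect(base, page, PROT_NONE) != 0) {
        munmap(base, (STACK_PAGES + 1) * page);
        return NULL;
    }
    *page_out = page;
    return base;
}

/*
 * Fork a child that writes one byte at addr.
 * Returns the signal that killed it, 0 if the write worked, -1 on error.
 */
static int probe_write(volatile unsigned char *addr)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        *addr = 1;
        _exit(0);
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

With base returned by alloc_guarded_stack, a write to base + page (the lowest usable byte) succeeds, while a write to base + page - 1 (one byte too deep) is expected to end in SIGSEGV. The overflow gets caught instead of silently landing in whatever sits below the stack.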
Kernel Address Space Layout Randomization (KASLR)
Since the location of kernel memory is almost always instrumental in mounting a successful attack, making the location non-deterministic raises the difficulty of an exploit. (Note that this, in turn, makes the value of information exposures higher, since they may be used to discover desired memory locations.)
Text and module base
Relocating the physical and virtual base address of the kernel at boot time frustrates attacks that need to know where kernel code lives. Additionally, offsetting the module loading base address means that even systems that load the same set of modules in the same order every boot will not share a common base address with the rest of the kernel text.
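A toy calculation shows both sides of KASLR: how a random base might be picked, and why a single leaked address undoes it. The constants below (default base, 2 MiB slots, a 1 GiB window) are hypothetical values chosen for the example, and toy_kaslr_base and toy_base_from_leak are invented names, not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical values for illustration only. */
#define TOY_DEFAULT_BASE 0xffffffff80000000ULL
#define TOY_ALIGN        0x200000ULL   /* 2 MiB slots */
#define TOY_SLOTS        512ULL        /* 512 * 2 MiB = 1 GiB window */

/* Pick a slot-aligned base from some boot-time entropy. */
static uint64_t toy_kaslr_base(uint64_t entropy)
{
    return TOY_DEFAULT_BASE + (entropy % TOY_SLOTS) * TOY_ALIGN;
}

/*
 * The attacker's view: a leaked symbol address minus that symbol's
 * fixed offset inside the image gives back the randomized base.
 */
static uint64_t toy_base_from_leak(uint64_t leaked_addr,
                                   uint64_t symbol_offset)
{
    return leaked_addr - symbol_offset;
}
```

With only 512 possible slots in this toy, the randomization is modest, and toy_base_from_leak(base + 0x1000, 0x1000) recovers the base exactly. That is why information exposures become more valuable once KASLR is enabled.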
Stack base
If the base address of the kernel stack is not the same between processes, or even not the same between syscalls, targets on or beyond the stack become more difficult to locate.
Dynamic memory base
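Much of the kernel’s dynamic memory (e.g. kmalloc, vmalloc, etc.) ends up being relatively deterministic in layout due to the order of early-boot initializations. If the base address of these areas differs between boots, targeting them becomes harder and requires an information exposure specific to the region.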
Structure layout
By performing per-build randomization of the layout of sensitive structures, attacks must either be tuned to known kernel builds or expose enough kernel memory to determine structure layouts before manipulating them.
Kernels 4.14 and older printed the raw address using %p. As of 4.15-rc1, addresses printed with the specifier %p are hashed before printing.
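The effect of hashing can be sketched with a toy keyed mixer. To be clear about the assumptions: the mixing function below is the well-known splitmix64 finalizer, chosen only because it is short; it is NOT the algorithm the kernel uses, and toy_hash_ptr, toy_format_ptr and boot_key are hypothetical names for this example.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Toy keyed mixer (splitmix64 finalizer); not the kernel's hash. */
static uint64_t toy_hash_ptr(uint64_t addr, uint64_t boot_key)
{
    uint64_t x = addr ^ boot_key;

    x ^= x >> 30;
    x *= 0xbf58476d1ce4e5b9ULL;
    x ^= x >> 27;
    x *= 0x94d049bb133111ebULL;
    x ^= x >> 31;
    return x;
}

/* Format the hashed value instead of the raw address. */
static int toy_format_ptr(char *buf, size_t len, uint64_t addr,
                          uint64_t boot_key)
{
    return snprintf(buf, len, "%016" PRIx64, toy_hash_ptr(addr, boot_key));
}
```

The same address always prints the same way under one key, so log lines can still be correlated, but a different key gives a different value and the printed number no longer tells an attacker where the object lives. A real implementation needs a cryptographically sound keyed hash and a secret key generated at boot, which this toy does not provide.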