
Loader: Improve physical memory mapping performance at startup #113

Open · JonasKruckenberg opened this issue on Sep 15, 2024 · 4 comments
Labels: enhancement, good first issue

@JonasKruckenberg (Owner)

The loader automatically maps all of physical memory at a chosen offset in kernel space (so we have access to page tables, can introspect the kernel's ELF file, etc.).
The current implementation is quite sub-par though, as it maps this huge range using the most fine-grained 4 KiB pages (wasting a lot of cycles and memory in the process). Instead we should be mapping with megapages or gigapages (the region allocated by `virt_alloc` for the mapping is already gigapage-aligned).

Maybe we can even figure out a smart way to map the region using the fewest pages possible: map as much as we can using gigapages, then step down to megapages and map as much as possible with those, and only use regular 4 KiB pages for the remainder of the requested range. A sketch of this greedy strategy is below.
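Something like this greedy loop, sketched with the Sv39 page sizes and a hypothetical `map_at_size` helper (illustrative only, not the loader's actual API):

```rust
/// Sv39 page sizes, largest first: 1 GiB gigapages, 2 MiB megapages, 4 KiB pages.
const PAGE_SIZES: [usize; 3] = [1 << 30, 1 << 21, 1 << 12];

/// Hypothetical helper: installs a single leaf mapping of `size` bytes
/// at `virt` -> `phys`, at the page-table level corresponding to `size`.
fn map_at_size(virt: usize, phys: usize, size: usize) {
    let _ = (virt, phys, size); // placeholder body
}

/// Greedily map `len` bytes using the fewest pages possible.
fn map_range_greedy(mut virt: usize, mut phys: usize, mut len: usize) {
    while len > 0 {
        // Pick the largest page size that both addresses are aligned to
        // and that still fits in the remaining length.
        let size = PAGE_SIZES
            .into_iter()
            .find(|&s| virt % s == 0 && phys % s == 0 && len >= s)
            .expect("virt, phys and len must be at least 4 KiB aligned");
        map_at_size(virt, phys, size);
        virt += size;
        phys += size;
        len -= size;
    }
}
```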

@JonasKruckenberg added the good first issue and enhancement labels on Sep 15, 2024
@alegnani

I was having a look at the issue and realized it would need quite a lot of changes.
The current implementation relies on `walk_mut` to traverse the page-table hierarchy and calls `on_leaf` at level 0. Since we want to map superpages, we would need to call this function earlier. If we are only performing a translation, we can get away with checking whether the attributes of the PTE match `Mode::ENTRY_FLAGS_LEAF`, but that is not going to work when we first map the page; we would need to pass additional context to force `walk_mut` to call `on_leaf` early. This could be fixed either by passing a level at which we want to stop the walk (see the sketch below), or by not using the walk function at all. The latter would probably perform better, as it doesn't require us to re-walk the page tables every time.
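For the first option, I'm picturing something along these lines; the types and helpers are made up for illustration, not the real `walk_mut` API:

```rust
/// Illustrative stand-ins for the real page-table types.
struct Entry(u64);
struct PageTable {
    entries: [Entry; 512],
}

/// Sv39 splits a virtual address into three 9-bit indices above the 12-bit page offset.
fn vpn_index(virt: usize, level: usize) -> usize {
    (virt >> (12 + 9 * level)) & 0x1ff
}

/// Hypothetical: follow (or allocate) the next-level table behind a non-leaf entry.
fn next_table_mut(entry: &mut Entry) -> &'static mut PageTable {
    let _ = entry;
    unimplemented!()
}

/// Walk the hierarchy but stop at `target_level` instead of always descending
/// to 0, so the entry at that level becomes the (super)page leaf.
fn walk_mut_to_level(
    mut table: &mut PageTable,
    virt: usize,
    target_level: usize,
    on_leaf: impl FnOnce(&mut Entry),
) {
    for level in (target_level..=2).rev() {
        let entry = &mut table.entries[vpn_index(virt, level)];
        if level == target_level {
            on_leaf(entry);
            return;
        }
        table = next_table_mut(entry);
    }
}
```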

Just wanted to get your opinion before hacking away.

@JonasKruckenberg (Owner, Author) commented Oct 26, 2024

First off: heya ✌️
Second: yeah, what you're saying sounds about right. Thinking ahead, we'll definitely benefit from being able to explicitly specify the level to map at (e.g. for wasm linear memories we'll always map 6 GiB slots).
If you can find a way to do this without having to recursively walk the page table every time, even better!

Also: the current mapping code is a bit overengineered because I thought we'd need to support mutating the page-table tree, which it turns out we don't really, so don't hesitate to make bigger sweeping changes if that's needed for faster mapping; no one is emotionally attached to that code :)

@alegnani commented Dec 8, 2024

Finally managed to work on superpage support.
As part of the `map_range` function, I implemented an algorithm that splits a memory region into a region mappable with superpages (in this case 1 GiB or 2 MiB pages) and two optional padding regions; roughly the idea sketched below.
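A minimal sketch of that split, assuming half-open `start..end` ranges (illustrative, not the exact code in my branch):

```rust
/// Split `start..end` into an optional unaligned head, a `page_size`-aligned
/// body mappable with superpages, and an optional unaligned tail.
/// Head and tail fall back to smaller page sizes.
fn split_for_superpages(
    start: usize,
    end: usize,
    page_size: usize,
) -> (Option<(usize, usize)>, Option<(usize, usize)>, Option<(usize, usize)>) {
    let body_start = start.next_multiple_of(page_size);
    let body_end = (end / page_size) * page_size;
    if body_start >= body_end {
        // No full superpage fits; the whole range is "padding".
        return (Some((start, end)), None, None);
    }
    (
        (start < body_start).then_some((start, body_start)),
        Some((body_start, body_end)),
        (body_end < end).then_some((body_end, end)),
    )
}
```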

Unfortunately, the virtual and physical memory regions that `map_physical_memory` passes to `map_range` are not at the same offset from a superpage boundary, meaning they can only be mapped with 4 KiB pages. This is because both the physical and the virtual address have to be correctly aligned to the superpage size.
A (contrived) example: physical memory 0x0..0x1000000 (aligned on a 2 MiB boundary) and virtual memory 0x1000..0x1001000 (offset 0x1000 from a 2 MiB boundary).
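In other words, a superpage of a given size is only usable when both addresses sit at the same offset within that size; a quick sketch of the check (names invented for illustration):

```rust
/// Largest Sv39 page size that can back a shared virt -> phys mapping:
/// both addresses must sit at the same offset within the page.
fn largest_common_page_size(virt: usize, phys: usize) -> usize {
    [1usize << 30, 1 << 21, 1 << 12]
        .into_iter()
        .find(|&size| virt % size == phys % size)
        .expect("addresses must at least share their offset within a 4 KiB page")
}
```

For the example above this returns 4 KiB, since 0x1000 and 0x0 differ modulo 2 MiB.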

This could be fixed in `map_physical_memory` by breaking up the physical memory into multiple mappings. Continuing the example above, we would map virtual 0x1000..0x1000000 to physical 0x1000..0x1000000 and virtual 0x1000000..0x1001000 to physical 0x0..0x1000, so that the virtual and physical offsets line up within each mapping.

Is there any other way to make this work? If not, are there any downsides to this approach?

@JonasKruckenberg (Owner, Author)

Hey @alegnani, that seems like a totally suitable solution to me, but tbh I don't fully grasp the nuances from just a description, so feel free to open a pull request and we can discuss further there ;) At the same time, I have been working on a different implementation in the meantime (because it was blocking more useful higher-level virtmem features) and came up with #158, specifically its `map_contiguous` function. If you want, have a look at that implementation and see if it aligns with yours, and maybe what we could do to improve it further 🤔
