Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share

Back in 1982, when Intel released the 80286, they added 4 privilege levels to the segmentation scheme (rings 0-3), specified by a 2-bit field in each segment descriptor of the Global Descriptor Table (GDT) and Local Descriptor Table (LDT).

In the 80386 processor, Intel added paging, but surprisingly, it only has 2 privilege levels (supervisor and user), specified by a single bit in the Page Directory Entry (PDE) and Page Table Entry (PTE).
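To make the difference in encoding concrete, here is a minimal C sketch that pulls the privilege information out of both structures. The helper names (`descriptor_dpl`, `pte_is_user`) are made up for illustration; the bit positions are the documented ones (DPL at bits 45-46 of an 8-byte segment descriptor, User/Supervisor at bit 2 of a 32-bit PDE/PTE).

```c
#include <stdint.h>

/* Segment descriptor (8 bytes): the Descriptor Privilege Level (DPL) is a
   2-bit field at bits 45-46, so it can name any of rings 0-3. */
static inline unsigned descriptor_dpl(uint64_t descriptor) {
    return (unsigned)((descriptor >> 45) & 0x3u);
}

/* 32-bit PDE/PTE: bit 2 is the User/Supervisor flag, a single bit,
   so it can only distinguish two privilege classes. */
#define PTE_USER (1u << 2)

static inline int pte_is_user(uint32_t entry) {
    return (entry & PTE_USER) != 0;
}
```

For example, the common flat-model user code descriptor `0x00CFFA000000FFFF` has DPL 3 and the kernel one `0x00CF9A000000FFFF` has DPL 0, while a page table entry can only say "user" or "not user".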

This means that an OS that only uses paging (like most modern OSes) is unable to benefit from the existence of rings 1 and 2, which could be very useful, for example, for drivers. (Win9x, for example, frequently crashed because it was loading buggy unchecked drivers into ring 0).

From the POV of portability, the existence of rings 1 and 2 is a quirk of the x86 architecture and portable OSes shouldn't use them, because other architectures only have 2 privilege levels.

But I am sure that portability to other platforms is not what Intel engineers were thinking back in 1985 when they were designing the 386.

So why didn't Intel allow paging to have 4 privilege levels, like segmentation?

Question from: https://stackoverflow.com/questions/66053495/why-does-x86-paging-have-no-concept-of-privilege-rings


1 Answer

One guess that occurs to me is that Intel intended that when ring 1 code is running, it is the supervisor, "supervising" ring 3 code, rather than ring 1 running under ring 0.
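This matches how the paging check treats the current privilege level (CPL): rings 0, 1 and 2 all count as supervisor, and only ring 3 counts as user. A rough C model of that rule is sketched below. The function names are hypothetical, and the model deliberately ignores write protection, the combination of PDE and PTE flags, and later additions such as SMEP/SMAP.

```c
#include <stdbool.h>
#include <stdint.h>

#define PTE_PRESENT (1u << 0)
#define PTE_USER    (1u << 2)

/* Paging collapses the four rings into two modes:
   CPL 0, 1 and 2 are "supervisor", only CPL 3 is "user". */
static bool is_supervisor_access(unsigned cpl) {
    return cpl < 3;
}

/* Simplified privilege check for one page table entry. */
static bool page_privilege_ok(uint32_t pte, unsigned cpl) {
    if (!(pte & PTE_PRESENT))
        return false;                /* not mapped: page fault */
    if (is_supervisor_access(cpl))
        return true;                 /* rings 0-2 may touch supervisor pages */
    return (pte & PTE_USER) != 0;    /* ring 3 needs the U/S bit set */
}
```

So from the paging unit's point of view, ring 1 code reaching a supervisor page looks no different from ring 0 code doing the same.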

If the ring 1 code wants to call ring 0 code, it can call through a call-gate, and the ring 0 code can change CR3 to a page table that includes mappings for physical pages that weren't present in the page table the ring 1 or 2 code was using.

I really don't know a lot about this stuff, but https://wiki.osdev.org/Task_State_Segment shows that the TSS includes a CR3 field, so with hardware task-switching I'm guessing that a far call through a task gate (a plain call gate does not switch tasks by itself) can trigger the CR3 change directly. (So the call target does not already have to be mapped; otherwise ring 1 / 2 code could have modified it. Or it could be mapped read-only, along with the page table itself and the GDT, to stop the ring 1 code from taking over ring 0 by modifying it.)
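For reference, the 32-bit TSS layout can be written down as a C struct. This is only a sketch: the struct and field names are my own, and the selector fields are declared as 32-bit words even though only their low 16 bits are used. The offsets follow the documented 104-byte layout, with the per-ring stack pointers first and CR3 at offset 28.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 32-bit Task State Segment as laid out in memory (104 bytes).
   Selector fields occupy 32-bit slots; the upper 16 bits are reserved. */
struct tss32 {
    uint32_t prev_task_link;
    uint32_t esp0, ss0;      /* stack loaded on a switch to ring 0 */
    uint32_t esp1, ss1;      /* stack loaded on a switch to ring 1 */
    uint32_t esp2, ss2;      /* stack loaded on a switch to ring 2 */
    uint32_t cr3;            /* page directory base loaded on a task switch */
    uint32_t eip, eflags;
    uint32_t eax, ecx, edx, ebx, esp, ebp, esi, edi;
    uint32_t es, cs, ss, ds, fs, gs;
    uint32_t ldt_selector;
    uint16_t trap;           /* bit 0: debug trap on task switch */
    uint16_t iomap_base;     /* offset of the I/O permission bitmap */
};

static_assert(sizeof(struct tss32) == 104, "32-bit TSS is 104 bytes");
static_assert(offsetof(struct tss32, cr3) == 28, "CR3 sits at offset 28");
```

Note that there are separate stack slots for rings 0, 1 and 2, which suggests the architects did expect all three inner rings to be entered from less privileged code.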

"This means that an OS that only uses paging [...] unable to benefit from the existence of rings 1 and 2"

That's your mistake: you can't "only use paging". Even making interrupt handling from user-space work on a normal x86 OS (with a flat memory model) requires setting up TSS stuff to set ESP to the kernel stack pointer when switching to kernel mode, even if you don't otherwise use hardware task-switching.

x86 has "task gates" and "call gates" and all kinds of really complex stuff I hope I don't ever have to fully understand, but I expect that spending some time reading up on it might shed some light on the kind of things the architects of 386 thought OSes might want to do.

Separate from my previous guess (about ring 1 supervising ring 3), perhaps Intel expected OSes to use segmentation to separate ring 1 / 2 from ring 0 memory in the same page table if desired (see Footnote 1). As you say, they probably weren't trying to create something that portable microkernel OSes could just use as a bonus.

A kernel has the luxury of deciding the layout of virtual address space, so it could well assign chunks of that for use by ring 1 code, setting up CS/DS/ES/SS appropriately when calling it.

I think that would have to mean a non-flat model, though, because x86 segmentation makes addresses go from 0..limit, not e.g. allowing access to a range of virtual addresses from low..high without changing the meaning of a pointer.
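The 0..limit behaviour can be modelled in a few lines of C. This is a simplified sketch with a hypothetical `struct segment` (limit already expressed in bytes) that ignores expand-down segments, the granularity bit and the size of the access.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified view of a segment: linear base and highest valid offset. */
struct segment {
    uint32_t base;
    uint32_t limit;   /* in bytes, after applying granularity */
};

/* Translate a segment-relative offset to a linear address.
   Returns false where real hardware would raise a fault. */
static bool seg_translate(const struct segment *seg, uint32_t offset,
                          uint32_t *linear) {
    if (offset > seg->limit)
        return false;              /* beyond the limit: #GP or #SS */
    *linear = seg->base + offset;  /* offset 0 maps to the segment base */
    return true;
}
```

With a base of 0x40000000, the pointer value 0x1000 refers to linear address 0x40001000, so the same pointer means something different in ring 1's segments than in a flat ring 0 segment.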

Footnote 1:

Is it necessary to have full memory protection between ring 0 and ring 1? An OS might use ring 1 for semi-trusted code.

Some privileged instructions require ring 0, so running code in ring 1 would stop it from executing them by accident. The I/O privilege level (IOPL) can be set separately to allow cli and in/out in rings above 0, but other instructions like invlpg, lgdt, and mov cr, reg require actual ring 0.
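The two kinds of check can be summarised as a pair of predicates. This is a sketch with made-up function names, assuming protected mode without virtual-8086 extensions, and leaving out the TSS I/O permission bitmap, which can still allow in/out on individual ports when CPL > IOPL.

```c
#include <stdbool.h>

/* IOPL-sensitive instructions (cli, sti, in, out): allowed when the
   current privilege level is at least as privileged as EFLAGS.IOPL. */
static bool iopl_allows(unsigned cpl, unsigned iopl) {
    return cpl <= iopl;
}

/* Privileged instructions (invlpg, lgdt, mov to a control register):
   allowed only at CPL 0, regardless of IOPL. */
static bool ring0_only_allowed(unsigned cpl) {
    return cpl == 0;
}
```

With IOPL set to 1, ring 1 code could run cli and in/out while ring 3 code could not, and neither could run lgdt.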

