🚀 Cosmos Gen3 re-write proposal  #3088

Open
@ascpixi

Important

The prototype for the Gen3 re-write is located at valentinbreiz/nativeaot-patcher. We plan to eventually graduate from the prototype to cosmosos/gen3 (the actual repository name may change).

Historically, Cosmos has had two major revisions: gen1 and gen2. I'd like to outline the need for a new major version, and the major improvements that a rewrite would make possible.

I'm also looking for the feedback and approval of the maintainers (coredevs).

The need for a rewrite

In the past, Cosmos has been developed with actual, standalone kernels in mind - developers could use the toolkit to quickly develop domain-specific kernels. Nowadays, Cosmos is primarily used by people wanting to learn systems programming via a high-level language.

Cosmos, in its current state, is not viable for its original use case - that is, domain-specific kernels. The reasons for this are the following:

  • lack of stability. Kernels using Cosmos experience a large amount of unintended behavior that originates in Cosmos itself, not in the consuming projects.
  • lack of performance. The main reason for this is the IL2CPU compiler, which does not perform any kind of optimization.
  • strict reliance on x86. Cosmos assumes that it is going to be compiled for x86 - not x86-64, nor ARM64. For domain-specific kernels, architectures like ARM64 are dominant.

The current development efforts focus on the wrong thing - instead of improving the core design of Cosmos, or working on its stability issues, most of the manpower is dedicated to new features.

Issues with the current code-base

A rewrite that reuses select components of Cosmos would take less time than creating many small PRs to slowly bring the codebase to an acceptable level. I'd like to highlight some of the issues with its current state, namely:

Lack of thread safety

Cosmos as a whole isn't designed with multi-threading in mind - no code has any kind of synchronization, and some methods even use hlt to wait:

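public void Wait(uint timeoutMs)
{
    // waitSignaled is a field shared by every caller of Wait.
    waitSignaled = false;
    RegisterTimer(new PITTimer(SignalWait, timeoutMs * 1000000UL, false));
    while (!waitSignaled)
    {
        // hlt: idles this CPU until the next interrupt fires.
        CPU.Halt();
    }
}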

In order to make such parts multithreaded without the use of a "big kernel lock", some portions of the project would have to be redesigned.

Poor API design

The public API surface of Cosmos could be redesigned for more extensibility, as well as to correct some OOP quirks, such as the Device class:

namespace Cosmos.HAL {
    public abstract class Device {
    }
}

If the API were changed substantially, all users of Cosmos would have to update their projects to use the new API. Such breaking changes should be limited to major versions - in our case, gen3.

Inconsistent code style and readability

There has never been a standardized code style for the repository, and as such, different portions of the codebase use different code styles. The most prominent example is that some locals use Systems Hungarian notation, which does not benefit C# code - the types are already apparent, as opposed to e.g. C. Other locals, however, follow the .NET code style.

A large amount of code was also written with older C# versions in mind, resulting in little use of fluent APIs and poor code quality. Principles like DRY and composition over inheritance are also often overlooked.

The public API documentation is also in a poor state - for example:

/// <summary>
/// To string.
/// </summary>
/// <returns>string value.</returns>
public override string ToString()
{
    return "Partition";
}

Such poor documentation is widespread throughout the repository.

Reliance on IL2CPU

Cosmos currently heavily relies on its compiler, IL2CPU. However, it is the largest cause of odd bugs, lack of debugging support, and performance problems. It also does not follow conventions set by other compilers.

Poor Linux support

Currently, Linux support is quite poor - while one can compile Cosmos kernels under Linux, debugging them is extremely hard. On Windows, Cosmos's Visual Studio debugger is meant to solve this problem, but currently, the debugger isn't fully functional - and it works in only one IDE.

Monolithic

Cosmos has no modularization. This means that every kernel includes components such as the network stack, whether or not it uses them. Additionally, executing binaries from inside the kernel is quite hard.

No defined ABI

While Cosmos does follow cdecl as its calling convention, it does not follow any ABI. This makes interoperability with native libraries (e.g. zlib) especially difficult.

Potential improvements in Gen3

First, let's address how some of the problems outlined in the previous sections can be avoided in the rewrite.

Design with thread safety in mind

All code that could access potentially contended resources would need to make use of some sort of synchronization - be it atomic operations, locks, etc. This could require specialized algorithms for concurrent operations.
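
As a rough illustration of the atomic-operation route, here is a minimal spin lock built on Interlocked.CompareExchange. The names SpinLockSketch and TickCounter are hypothetical and not part of Cosmos or the Gen3 prototype, and the sketch assumes the usual System.Threading primitives are available; a real kernel lock would likely also need to deal with interrupts.

```csharp
using System.Threading;

// Hypothetical type for illustration only; not an existing Cosmos API.
public struct SpinLockSketch
{
    private int _held; // 0 = free, 1 = held

    public void Acquire()
    {
        // Atomically flip 0 -> 1; keep retrying while another core holds the lock.
        while (Interlocked.CompareExchange(ref _held, 1, 0) != 0)
        {
            Thread.SpinWait(1); // assumption: a kernel would issue a pause-style hint here
        }
    }

    public void Release()
    {
        // The volatile write makes the protected writes visible before the lock is freed.
        Volatile.Write(ref _held, 0);
    }
}

// Hypothetical contended resource guarded by the lock above.
public static class TickCounter
{
    private static SpinLockSketch _lock;
    private static long _ticks;

    public static long Increment()
    {
        _lock.Acquire();
        try
        {
            return ++_ticks;
        }
        finally
        {
            _lock.Release();
        }
    }
}
```

Every caller goes through TickCounter.Increment(), so the counter stays consistent no matter how many cores call it at once. For a single counter like this, Interlocked.Increment alone would do; the lock form is shown because it generalizes to multi-field state.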

An interrupt priority level (IPL) system could also be introduced. Interrupt controllers like the APIC already support such mechanisms out-of-the-box.
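
To make the IPL idea a bit more concrete, below is a small model of raising and lowering a priority level on a single CPU. Everything here is an assumption made for illustration: the Ipl enum, its level names and values, and the IplSketch class do not exist in Cosmos. A real implementation would keep the level per core and write it to the interrupt controller instead of a static field.

```csharp
using System;

// Hypothetical levels; on an APIC these would map onto its 16 priority classes (0-15).
public enum Ipl : byte
{
    Passive = 0,  // normal execution, every interrupt may be delivered
    Dispatch = 2, // scheduler work; lower levels are deferred
    Device = 8,   // typical device interrupt handlers
    High = 15     // every maskable interrupt is deferred
}

// Models one CPU only; this is a software stand-in, not hardware access.
public static class IplSketch
{
    private static Ipl _current = Ipl.Passive;

    public static Ipl Current => _current;

    // Raises the level and returns the previous one so the caller can restore it.
    public static Ipl Raise(Ipl level)
    {
        if (level < _current)
            throw new InvalidOperationException("Raise cannot lower the IPL.");
        Ipl previous = _current;
        _current = level;
        return previous;
    }

    // Restores a level previously returned by Raise.
    public static void Lower(Ipl previous)
    {
        if (previous > _current)
            throw new InvalidOperationException("Lower cannot raise the IPL.");
        _current = previous;
    }

    // An interrupt is delivered only if its level is strictly above the current one.
    public static bool CanDeliver(Ipl interruptLevel) => interruptLevel > _current;
}
```

Code that touches data shared with a device handler would call Raise(Ipl.Device), do its work, and then pass the returned value to Lower, so that the handler cannot preempt it halfway through.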

Modularized approach

Instead of shipping code that might not be of interest to all users in every kernel, Cosmos could split these components into NuGet packages - e.g. the network stack could be separated.

New ILC-based compilation pipeline

This change would replace IL2CPU with ILC - the compiler used for Native AOT compilation. The pipeline would integrate with the existing plug concept.

1. Transform standard library assemblies according to the plug set

The plug set refers to a collection of assemblies with classes and methods marked with attributes recognized by the plug weaver. The plug weaver is responsible for transforming one or more target assemblies by replacing the method bodies and adding/modifying fields according to one or more plug assemblies.

Private fields and methods can be accessed from plugs via the [Expose] attribute.
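
The matching step of such a weaver might look roughly like the sketch below. It only models how a plug method is looked up for a target method, using reflection; it does not rewrite any IL, and it does not cover [Expose]. The PlugAttribute defined here, along with HostClock, HostClockPlug and PlugResolver, are hypothetical names invented for the example, and the real attribute shapes in the prototype may differ.

```csharp
using System;
using System.Linq;
using System.Reflection;

// Hypothetical attribute; the actual plug attributes may look different.
[AttributeUsage(AttributeTargets.Class)]
public sealed class PlugAttribute : Attribute
{
    public Type Target { get; }
    public PlugAttribute(Type target) => Target = target;
}

// Stand-in for a standard library type whose body cannot run on bare metal.
public static class HostClock
{
    public static long GetTicks() => throw new PlatformNotSupportedException();
}

// Plug: same method name and signature, with a kernel-specific body.
[Plug(typeof(HostClock))]
public static class HostClockPlug
{
    public static long GetTicks() => 42; // placeholder value for the sketch
}

public static class PlugResolver
{
    // Returns the public static plug method that would replace `target`,
    // or null when no plug class in `plugAssembly` provides a match.
    public static MethodInfo Resolve(MethodInfo target, Assembly plugAssembly)
    {
        Type[] parameterTypes = target.GetParameters()
            .Select(p => p.ParameterType)
            .ToArray();

        return plugAssembly.GetTypes()
            .Where(t => t.GetCustomAttribute<PlugAttribute>()?.Target == target.DeclaringType)
            .Select(t => t.GetMethod(target.Name,
                BindingFlags.Public | BindingFlags.Static, null, parameterTypes, null))
            .FirstOrDefault(m => m != null && m.ReturnType == target.ReturnType);
    }
}
```

A weaver would then copy the body of the resolved method over the target's body in the transformed assembly; that part is deliberately left out here.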

2. Compile main kernel code

ILC is used to compile the main kernel code to an intermediate object file. The consuming assembly is set as the main one, and the plugged standard libraries are provided (alongside any other references defined by the author).

3. Link

Plugs should also be able to specify that methods should be included as imports. With the compilation process, the user can also specify other object files to link alongside the main one.

Note

The ability to automatically compile assembly files to object files and link them could be implemented with an optional NuGet MSBuild extension, named e.g. Cosmos.AsmCompiler.

Plugs would not provide native code directly, as is the case now - the default MSBuild rules would include the appropriate native object file depending on the architecture, and include the right plug set for the architecture.

Addendum A. ABI

Using ILC for compilation also makes Cosmos use a popular, solid ABI - the System V ABI. This means that interop with Unix libraries is greatly simplified.

Addendum B. Compiling on non-Linux hosts

The ILC compiler emits code with a different binary format, ABI, and dependent symbols, depending on the RID. From my experience and past research, linux-(arch) is the most documented, extensible, and solid compilation target. However, as ILC doesn't support cross-compilation, we would need authors to install tools like WSL on Windows.

Addendum C. Debugging

Compiling for a Unix target also unlocks the ability to use QEMU and GDB to debug Cosmos kernels. This means that the following features would be supported with no extra effort:

  • breakpoints on line numbers
  • backtracing
  • viewing local variables
  • catching triple faults
  • stepping line-by-line

Given that GDB also defines a protocol, this can further be integrated with IDEs. A large number of IDEs have GDB support.

Making Cosmos more like scaffolding, rather than a ready kernel

Currently, Cosmos defines an extremely large part of what a kernel should do. Authors who use Cosmos to learn cannot learn how to make systems like a VFS or a thread scheduler by themselves, and users already experienced with these concepts are limited by Cosmos's ready implementations. Crucial design decisions that define a kernel are already made by Cosmos, depriving authors of the chance to choose the design of their own operating system.

As a consequence of modularization, the base Cosmos kernel would not include all of these ready components. For those using Cosmos for education, interactive guides could be written to teach systems programming concepts. This is enticing to beginners who view lower-level languages like C as "too complicated". Experienced developers, on the other hand, could write their own custom components.

All existing APIs, for VFS and whatnot, would be present as optional NuGet packages, for those looking to migrate or to use pre-made components.
