Essentials of computing for new coders
Everything new software engineers need to know about computing.
A computer program is a sequence of instructions for a computer to do a particular task.
A task can be as simple as calculating a value, or something more sophisticated, like playing an audio file or browsing the web.
All computer programs work in the same way; they receive an input and return an output.
For instance, when you type in a URL into your browser’s address bar, you’re performing an input, and the web page you get is the output.
This two-way interaction is called Input/Output, a.k.a. I/O.
In this guide, we’ll discuss what exactly happens – in plain English – from the time you open a program until it sends you back some output.
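The input → processing → output model above shows up in even the smallest program. Here’s a minimal sketch in Python (the function and its name are just an illustration):

```python
# A tiny program following the same input -> processing -> output model.
def shout(text):
    # Input: a string; processing: uppercasing it; output: the result.
    return text.upper() + "!"

print(shout("hello"))  # prints "HELLO!"
```

The input here comes from a function argument instead of a keyboard, but the shape of the program is the same.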
# What happens when you open a program
When you open a program, the operating system allocates a space (as big as needed) in memory for the program.
This space will contain a data structure that keeps everything about the program for as long as it’s running.
Next, the OS puts the instructions of the program, as well as its data into that space.
What’s loaded into memory isn’t the actual program on disk; it’s an instance of the program, transformed into an executable unit.
If programs had a soul, this would be it.
This unit is called a process.
Since we usually have tens of programs running simultaneously, the operating system uses scheduling algorithms to distribute system resources fairly between them.
From here, the CPU (which is the brain of the computer) takes over and executes the instructions of the process that’s scheduled to run, one after another.
We’ll cover these steps in more detail in the coming sections.
# How it started
Operating systems make it easy to run tens of programs with a few mouse clicks.
This wasn’t always the case, though. There was a time when there was no operating system at all.
In the early days of computing, computers were only able to run one program at a time, loaded into memory by a human operator.
Programmers gave their program as a deck of punched cards to an operator and came back a few hours later (probably after having a coffee) for the printed results.
Over time, computers got faster and faster and could execute far more than what bare hands could feed them.
Operating systems were invented to make this slow, human-driven operation as automatic as possible, and to use the available computing resources to the fullest.
# Device Drivers
Back in the day, a program written for a certain device model couldn’t be used with another model.
Imagine a program that could only print its output to one particular model of printer, but not to any other.
There was no specification requiring manufacturers to implement standardized hardware APIs.
Programmers had to know everything about a certain model of hardware to program for it.
And not every device worked the same way.
At the same time, sharing software was becoming a thing; software companies were selling software products to universities, banks, department stores, etc.
However, the hardware inconsistencies were still the nightmare of many programmers; they couldn’t test their code on every single device, and doing so wasn’t practical in the first place.
Operating systems fixed this problem as well, by abstracting away the complexities associated with computer hardware.
They provided a set of contracts computer manufacturers had to implement to be supported by the operating system.
As a result, any manufacturer that wanted to be supported had to supply a program with its hardware that implemented the standard API.
These programs are called device drivers.
Device drivers are installed on the operating system and are the interface between computer programs and the underlying hardware.
Nowadays, device drivers are provided by hardware manufacturers – either shipped with the device (for instance, on a CD) or already included in the operating system’s driver library.
You have probably installed a driver or two for your modem, graphics card, sound card, or printer when installing a fresh operating system.
# How a bunch of electronic circuits can understand a program
Although computer programs are instructions for computers, these instructions are initially written in a human-friendly – high-level – programming language.
However, a program as written cannot be directly executed by a machine.
To be executed by a machine, a computer program has to be translated into a set of low-level instructions that a machine can understand.
This low-level equivalent is called machine code, or native code.
Every application installed on your computer has already been translated to machine-ready code.
The machine code is in the form of primitive instructions hardwired into the CPU’s architecture.
These instructions are known as the CPU’s instruction set.
CPU instructions are basic: memory access, bit manipulation, incrementing, decrementing, addition, subtraction, division, multiplication, comparisons, and so on.
The CPU is extremely fast at executing these operations, though; a modern CPU can execute billions of instructions per second.
Each basic instruction doesn’t necessarily mean something on its own, but they become meaningful when they all come together at a higher level.
Imagine a 3D printer, which extrudes plastic materials bit by bit, and layer after layer, until the final model appears. You can think of those plastic bits as the result of each instruction that makes sense when they are put together at the end.
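You can peek at a lower-level representation yourself. Python compiles source code to bytecode – an intermediate form, not true machine code, but the same idea of breaking high-level code into primitive instructions – and the standard-library `dis` module disassembles it:

```python
import dis

def add(a, b):
    return a + b

# Disassemble the function into its low-level bytecode instructions.
# Each line of output is one primitive operation, such as loading
# a value or performing a binary addition.
dis.dis(add)
```

Running this shows that even a one-line function turns into several primitive load and add instructions.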
# Introducing compilers
Now, the question is: how is a high-level program translated to machine code?
Compilers are programs that translate a program written in a high-level programming language into a set of primitive instructions a CPU can execute.
Each programming language has its own set of compilers; for instance, a program written in C++ cannot be compiled to machine code with a Python compiler, and vice versa.
Additionally, a C++ program that has been compiled to run on Windows cannot run on Linux.
So programmers need to pick a compiler based on both the programming language and the target machine.
# Types of compilers
Compilers basically pass through the source code, translating a few hundred lines of high-level code into thousands of low-level, machine-ready instructions.
Depending on the features of the source language, some compilers might need to pass through the code more than once.
This categorizes compilers into single-pass and multi-pass compilers.
In the early days of computing, compiler design was mostly shaped by resource limitations, as a full compilation in one pass wasn’t possible.
This led scientists to split the compilation process into multiple passes – going through the source code several times, each time for a different purpose.
Nowadays, although hardware is no longer the limitation, some modern language features are complicated enough to require a multi-pass approach anyway.
# Compiling phases
Designing a compiler for a programming language is a complicated job.
So compiler creators split the compiler’s functionality into multiple independent sub-systems, or phases.
Each phase has its own responsibility and might involve multiple passes over the source code.
The phases are:
- The front end
- The middle end
- The back end
In the front end phase, the source program is translated into an intermediate representation, known as the IR.
The IR is no longer the source program, but it isn’t the target machine code yet either.
The IR is reused across the remaining phases until the machine code is eventually generated.
Lexical analysis, syntax analysis, and semantic analysis, as well as type checking (in type-safe languages), are done during this phase to make sure the program has been written correctly.
The middle end, on the other hand, focuses on code optimization, including dead-code elimination (removing code that has no effect on the program).
The back end phase gets the output of the middle end and performs another round of optimization but this time specific to the target machine.
Finally, the target representation is generated, such as machine code or bytecode.
This representation contains the instructions that are ultimately decoded and executed by the CPU.
Although the front end, middle end, and back end are interfaced, they work independently of each other.
This enables compiler creators to mix and match phases to build compilers for different source languages and target machines.
For instance, the same front end can be combined with different back ends to make C++ compilers for macOS, Windows, and Linux.
Developing compilers as decoupled phases means smaller programs to maintain, which results in more reliable compilers.
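To get a feel for what the front end’s lexical analysis does, here’s a toy tokenizer sketch in Python. A real lexer handles far more cases, and the token names below are invented for illustration:

```python
import re

# A toy lexer: split an arithmetic expression into (type, value) tokens.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),          # integer literals
    ("IDENT",  r"[A-Za-z_]\w*"), # variable names
    ("OP",     r"[+\-*/=]"),     # operators
    ("SKIP",   r"\s+"),          # whitespace (ignored)
]
PATTERN = re.compile("|".join(f"(?P<{name}>{rx})" for name, rx in TOKEN_SPEC))

def tokenize(source):
    tokens = []
    for match in PATTERN.finditer(source):
        if match.lastgroup != "SKIP":  # drop whitespace
            tokens.append((match.lastgroup, match.group()))
    return tokens

print(tokenize("x = 40 + 2"))
# [('IDENT', 'x'), ('OP', '='), ('NUMBER', '40'), ('OP', '+'), ('NUMBER', '2')]
```

The syntax-analysis step that follows would then arrange this flat token stream into a tree reflecting the structure of the program.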
# What exactly is a process
An instance of a program loaded into memory is called a process.
When you run a program, the operating system allocates a subset of the main memory (RAM in this case) to the process.
This space contains every piece of information the process needs to run, and every piece of information it will populate during run-time.
This information is contained in a data structure called Process Control Block (PCB).
A PCB contains the following information about a process:
- The program’s instructions in machine-ready code
- The program’s inputs and outputs
- The data generated during run time – this section is called the heap
- A call stack to keep track of called functions (subroutines) in the program
- A list of descriptors (a.k.a. handles) the process is using, including open files, database connections, etc.
- The program’s security settings, such as who is allowed to run the program
Any time you open a program, the operating system creates a PCB for it in memory.
So if you run the same program three times, you’ll have three separate processes loaded into memory.
For example, Google Chrome manages each tab as an independent process.
This means once you open a new tab in Google Chrome, a new process is created.
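You can observe this yourself with a sketch using Python’s `multiprocessing` module: the same worker code is started three times, and each run gets its own process with a distinct process ID (PID):

```python
import multiprocessing
import time

def worker():
    # Each started worker runs as a separate process.
    time.sleep(0.1)

if __name__ == "__main__":
    procs = [multiprocessing.Process(target=worker) for _ in range(3)]
    for p in procs:
        p.start()
    # While all three are alive, their PIDs are guaranteed to differ.
    pids = {p.pid for p in procs}
    for p in procs:
        p.join()
    print(len(pids))  # prints 3
```

Three starts of the same code, three separate processes – each with its own PCB maintained by the operating system.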
# Process security
For security reasons, a process cannot access the memory space (PCB) of another process.
This memory-level isolation ensures a buggy or a malicious process wouldn’t be able to tamper with another process.
This strategy is called Memory Protection.
In Google Chrome’s case, an unresponsive tab wouldn’t affect the other tabs as they are managed in independent processes.
We’ll discuss memory protection in more detail in the next sections.
# Computer’s main memory
Memory – a.k.a. the main memory or RAM – stores process instructions and data, and is one of the main components of a computer.
Memory is the only storage system the CPU can directly access for storing and retrieving data.
That’s why it’s called the Main Memory.
Memory consists of many physical blocks to store data.
Each block has a unique address for reference, e.g. 1, 2, 3, 4, and so forth.
Each time the CPU needs to fetch data (like process instructions or data) from the memory, it sends the memory a set of signals over a set of wires called address bus and control bus.
These signals include the memory address (in binary format) sent over the address bus and a read signal over the control bus.
Once the memory receives the read request from the CPU, it fetches the data and returns it over the data bus, which is a set of electronic wires to transmit data.
The address bus, control bus, and data bus are components of a larger signal transmission network called the system bus. I’ll get back to the system bus in the next sections.
# CPU, the brain of your computer
The Central Processing Unit, known as the CPU, is where the execution of instructions happens.
The CPU is the brain of your computer.
The CPU interacts with the memory (RAM) over the system bus to store and retrieve data while executing each process’s instructions.
The CPU consists of two main units: the Control Unit (CU) and the Arithmetic/Logic Unit (ALU).
The CU controls the whole operation within the CPU; it’s where instructions are fetched (from memory), decoded, and executed.
The ALU contains the electrical circuits that carry out arithmetic and logical operations.
Arithmetic operations include primitive mathematical operations, such as addition, subtraction, division, and multiplication.
Logical operations usually involve operations used to make a decision.
For instance, comparison, which compares two values using a comparison operator such as equal to, greater than, or less than.
Additionally, multiple comparisons can be combined using logical operators, such as AND, OR, and NOT.
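In a high-level language, these comparisons and logical operators look like this (Python shown for illustration); under the hood, each one maps to ALU comparison and logic instructions:

```python
age = 25
has_ticket = True

# Comparison operators: ==, !=, <, >, <=, >=
is_adult = age >= 18                 # True

# Logical operators combine comparisons: and, or, not
can_enter = is_adult and has_ticket  # True
print(can_enter)  # prints True
```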
# What are CPU registers?
Registers are special storage units (the same as memory blocks) inside the CPU.
The data fetched from memory is temporarily placed inside registers so the CPU can access it quickly.
Imagine a carpenter who keeps a big box of nails on a shelf.
He usually puts a handful of nails into a box and puts it on his working desk – for quicker access.
When he needs to nail a piece of wood he picks a few from the small box and puts them aside.
In this analogy, the nail box on the shelf is a persistent storage device, like the hard disk. The small box on the table is the memory. The area he temporarily puts the nails is a register.
The number of registers inside a CPU depends on its architecture; it can be 8, 16, 32, or more.
Some registers are general-purpose and are used to store any value.
However, some registers are specialized for a particular purpose; for example, to store memory addresses, data, or program instructions.
# How CPU executes an instruction
The CPU performs tasks in cycles, called fetch-decode-execute cycles, a.k.a. instruction cycles.
You already know the instructions of each process reside in the memory, and the operating system uses scheduling algorithms to determine the next runnable process.
Each instruction cycle consists of three phases: Fetch, Decode, and Execute.
The CPU fetches the next instruction (of a process) from the memory, decodes it (to know what exactly has to be done), and finally executes it.
This is one cycle.
Let’s see what exactly happens during each phase.
# The fetch phase
The program counter (PC) is a special CPU register, which always contains the memory address of the next instruction to be executed.
On each fetch-decode-execute cycle, the CPU first copies the memory address stored in the PC into another register called the Memory Address Register (MAR).
Then, it sends this address to the memory.
But what does it mean when we say “it sends the data”?
Here’s an explanation:
The data is transmitted across the system in the form of electrical signals over a network of wires on the computer’s motherboard or cables.
This data transmission network is called the system bus.
The system bus transmits data between the CPU and other components.
The address bus is a component of the system bus, which is used to transmit an address (in binary format) to the memory.
The address bus is used any time the CPU needs to access the memory.
So whatever value is on the address bus, it’s a memory address.
Along with the memory address, the CPU sends a read signal to the memory as well, to indicate what it wants to do with that address.
This signal is sent across another component of the system bus, called the control bus; the control bus is also a set of wires to transmit control signals.
Once the memory receives the memory address over the address bus and the control signal (read) over the control bus, it sends the data back to the CPU over the data bus, which is used to transmit the actual data.
The returned data in this case is a process instruction.
The returned data is then copied into another register called the Memory Data Register (MDR).
Until now, two different registers have been used, one for keeping a memory address (MAR), and one for keeping the fetched data (MDR).
Now, the instruction stored in the MDR is copied into another register called the Current Instruction Register (CIR).
The fetch phase is done here. The next thing to do is to decode the CIR’s content.
At this point, the PC register is incremented to point to the next memory address, which contains the next instruction to be used in the next cycle.
However, under certain conditions, the PC can also jump over some memory blocks to point to an instruction somewhere else; this usually happens when the CPU encounters a flow-control instruction, such as the machine-code equivalent of an if statement or a for loop.
# The Decode Phase
During the decode phase, the CU decodes the instruction residing in the CIR register to determine what data is needed to execute it.
The CU then sends signals to other components within the CPU – such as the ALU or the floating-point unit (FPU) – if an arithmetic, logical, or floating-point operation needs to be executed.
Decoding lets the CPU determine which operation is to be performed and how many operands it needs to fetch to perform it.
# The Execute phase
The last phase in the instruction cycle is to execute the instruction that has by now been fetched and decoded.
If the decoded instruction involves arithmetic/logical operations, the ALU is used.
To recap, the CPU keeps reading the PC content on every cycle, fetches the instruction stored in that address, decodes it, and executes it.
This cycle is repeated as long as the computer is switched on.
The fetch-decode-execute cycle begins as soon as the system is turned on, with an initial PC value that is predefined by the system’s architecture.
This initial address in PC points to special instructions in the read-only memory (ROM), to start the system’s firmware and boot up the operating system.
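The whole cycle can be sketched as a toy CPU simulator in Python. The instruction set here (LOAD, ADD, JMP, HALT) is invented for illustration, but the loop mirrors the fetch-decode-execute phases described above:

```python
def run(memory):
    """A toy CPU: memory is a list of (opcode, operand) instructions."""
    acc = 0   # a general-purpose register (an accumulator)
    pc = 0    # program counter: address of the next instruction

    while True:
        opcode, operand = memory[pc]  # FETCH the instruction at PC
        pc += 1                       # PC now points to the next instruction

        # DECODE the opcode, then EXECUTE it.
        if opcode == "LOAD":          # put a value into the accumulator
            acc = operand
        elif opcode == "ADD":         # arithmetic handled by the ALU
            acc += operand
        elif opcode == "JMP":         # flow control: PC jumps elsewhere
            pc = operand
        elif opcode == "HALT":        # stop and return the result
            return acc

program = [
    ("LOAD", 10),
    ("ADD", 32),
    ("HALT", None),
]
print(run(program))  # prints 42
```

Notice that the JMP branch is exactly the “PC jumps somewhere else” case mentioned earlier for flow-control instructions.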
# CPU Clock
Like any digital system, the CPU uses a binary number system (1’s and 0’s) to deal with data. These ones and zeros are determined by the state of billions of transistors (on/off) in the CPU.
If you’re wondering what binary is: it’s a number system that uses only 1 and 0 to represent a value, as opposed to the decimal system, which uses the digits 0 to 9.
For instance, number 3 in binary would be 011, and 4 would be 100.
These ones and zeros are represented as electrical signals with two different states: a high voltage for 1 and a low voltage for 0.
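Python can display these binary representations directly (a quick sketch):

```python
# Convert decimal numbers to their binary representation:
for n in [3, 4, 42]:
    print(n, format(n, "03b"))  # e.g. 3 -> 011, 4 -> 100

# And convert a binary string back to decimal:
print(int("101010", 2))  # prints 42
```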
When executing instructions, the transistors that exist in the CPU keep switching between ON and OFF (upon receiving a signal), to represent binary values.
Let’s suppose an arithmetic operation is done, and as a result, a set of transistors should switch on and off to represent the result in binary format.
However, transistors do not operate in real-time, meaning it takes a short while for a transistor to change its state.
Consequently, it takes a while until a certain value is reflected in a set of transistors.
Now, if the CPU refers to this value, while the said transistors are still transitioning, we might get unexpected or even invalid results.
To solve this problem, the CPU manufacturers came up with the idea of CPU clocks to give the CPU a sense of time and make CPU operations synchronous.
This synchronous approach allows the CPU to wait for a period of time before checking the transistors’ state; the period is long enough to ensure all transistors have settled into their new state.
This period is called one clock cycle. In this simple model, one fetch-decode-execute cycle happens during one clock cycle (modern CPUs pipeline instructions, so the mapping isn’t exactly one-to-one).
This is the CPU’s perception of time, just like seconds that we use to measure our time.
Clock cycles are in the form of electrical pulses generated by an internal oscillator circuit at a fixed rate.
The time between receiving one pulse and the next is one clock cycle.
# Clock Rate
The clock rate of a CPU is the number of clock pulses per second.
The higher it is, the more instructions the CPU can fetch, decode, and execute per second. This is not the whole story, though, as other parameters also define a CPU’s performance – such as its architecture, cache capacity, etc.
Overclocking means increasing the CPU’s clock rate (when possible) to get more performance out of the CPU you already have – without purchasing a faster one.
# Multitasking: the reason you can surf the web while listening to music
Multitasking is a common feature of modern operating systems; it involves switching between processes to ensure each one gets a slice of the CPU’s time in a given period.
To achieve this, the operating system puts a process (that can wait) on hold, and starts (or resumes) another one.
This cycle goes on for as long as the system is switched on.
# A bit of history on OS multitasking
One of the early forms of multitasking in operating systems was cooperative multitasking, where each process could take up the CPU for as long as it needed, and voluntarily gave up the CPU to another waiting process.
There’s a caveat to this approach, though: a poorly written program could hold the CPU indefinitely without sharing it with other processes; or, if it crashed due to a bug, the whole system could crash with it.
Cooperative multitasking used to be a scheduling scheme in early versions of Microsoft Windows and Mac OS; however, except for specific applications, it’s no longer used.
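Cooperative multitasking can be sketched in Python with generators: each `yield` is a task voluntarily giving up the CPU, and a tiny round-robin scheduler resumes tasks in turn. All names here are illustrative:

```python
log = []

def task(name, steps):
    for i in range(steps):
        log.append(f"{name}{i}")  # do one unit of "work"
        yield                     # voluntarily hand the CPU back

def scheduler(tasks):
    # Round-robin: resume each task in turn until all have finished.
    while tasks:
        current = tasks.pop(0)
        try:
            next(current)          # let the task run until its next yield
            tasks.append(current)  # re-queue it: it cooperated
        except StopIteration:
            pass                   # the task finished; drop it

scheduler([task("A", 2), task("B", 2)])
print(log)  # ['A0', 'B0', 'A1', 'B1'] - the tasks take turns
```

The caveat from the text is visible here too: if one `task` never yields, the scheduler loop can never move on to the others.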
# Multitasking in the modern time
The multitasking scheme used in most of today’s systems is preemptive multitasking, where the operating system has full control over resource allocation and determines how long a process gets the CPU.
The operating system keeps track of the states of processes residing in the memory and uses scheduling algorithms to choose the next process to run.
Each process can be in one of these states during its life in the memory:
- Created – the process was just created
- Ready – the process is residing in memory, waiting to be scheduled
- Running – the process’s instructions are being executed by the CPU
- Blocked – the process’s execution has been paused due to an interrupt signal, such as I/O
- Terminated – the process has finished and is about to be removed from memory
The act of switching between processes is called context switching; context switching involves storing the state of the current process – to make sure it can be resumed at a later point.
After that, the operating system starts (resumes) another process.
# How does context switching work?
Context switching happens as a result of a CPU interrupt.
But what is an interrupt? you may ask.
An interrupt is an action taken by the CPU to pause a running process.
It’s in the form of a signal transmitted to the operating system to take necessary actions on the running process.
An interrupt is issued in response to hardware or software events.
This categorizes interrupts into hardware interrupts and software interrupts, respectively.
Upon receiving an interrupt signal (from the CPU), the operating system’s scheduler forcibly swaps the running process out of the CPU, and schedules another process to run.
# When a hardware device causes an interrupt
Hardware interrupts happen due to an electronic or physical state change in a hardware device (e.g. a mouse click or a keyboard keypress) that requires immediate attention from the CPU.
For example, when you move your mouse, an interrupt request (IRQ) is transmitted to the CPU, as soon as the move is detected by the mouse electronics.
Interrupt requests are collected by a dedicated circuit called the interrupt controller, which continuously checks for incoming hardware interrupts and forwards them to the CPU.
During each instruction cycle (fetch-decode-execute), the processor samples the interrupt signal to figure out which device it came from; it could be the keyboard, the mouse, the hard disk, etc.
Upon receiving an interrupt request, the CPU interrupts the current process as soon as possible (not always instantly), and sends the appropriate interrupt signal to the operating system’s kernel.
The kernel takes over and, considering the interrupt signal, schedules another process to run.
Not all interrupt requests are served the same way, though.
Some requests don’t need to be handled instantly, while some are time-sensitive and have to be taken care of in real-time.
That’s why when you move your mouse, you never have to wait a few seconds to see the pointer move.
# What is a Software Interrupt
A software interrupt is issued by the CPU when it reaches a certain instruction with one of the following characteristics:
- The instruction initiates an I/O operation, for instance, to read input from a keyboard, or to write output to the disk or the screen
- The instruction requests a system call (to use the low-level services of the OS)
- The instruction leads to an error, such as dividing a number by zero
You’re probably wondering what is an I/O operation, and why it leads to a CPU interrupt.
Let’s break it down.
I/O, short for Input/Output, refers to every interaction between the CPU and the outside world.
Devices used in this interaction are called peripheral (or I/O) devices; screens, printers, keyboards, hard disks, and network cards are all peripheral devices.
An example of this two-way communication is when you use a keyboard to put data into the computer and use a screen to get the output.
I/O isn’t just a human-to-machine interaction, though; the communication between the CPU and other components in the system is also considered I/O.
For instance, when a file is being fetched from the hard disk (disk I/O), or an image is being downloaded from the Internet (network I/O).
I/O devices are much slower than the CPU.
The reason is that the CPU works at the speed of electric current, while I/O devices might require mechanical movement (like hard disks) or be affected by network latency (like network cards).
Let’s suppose a process initiates an I/O operation, for instance waiting for the user to input a value.
Now, if the CPU couldn’t issue an interrupt, and switch to another task, while a slow I/O operation was in progress, it would end up remaining idle until the I/O operation was completed.
Imagine if Spotify stopped playing every time you typed a URL into your browser’s address bar.
Think of the CPU as a chess grandmaster playing 20 people simultaneously. She moves on to the next player while the current opponent is thinking about their move – and comes back once they’ve made it. Otherwise, the game would take days.
Multitasking was the answer to this significant speed difference between the slow and the ultra-fast components of a computer system – allowing the fast part to switch to other tasks while the slow part is still working.
# How the Operating System Handles an Interrupt Signal
Upon receiving a software interrupt signal from the CPU, the kernel sets the current process status to blocked, and the current execution state is swapped out of the CPU.
Next, it looks up the interrupt signal in a table called the Interrupt Vector Table (IVT) to find the interrupt handler associated with that interrupt.
An interrupt handler is a piece of code that handles the respective interrupt event; the IVT is a data structure that maps each interrupt to its associated handler.
The kernel then runs the interrupt handler.
Once the interrupt handler’s execution is done, the interrupted process is scheduled to resume execution.
For instance, when an instruction writes data to a file (like a program saving a document), the CPU issues an interrupt signal to the OS.
Consequently, the OS blocks the current process, looks up the appropriate interrupt handler in the IVT, and schedules the interrupt handler to run – which involves placing data on the physical storage device.
Once the data is written to the physical disk, the CPU detects an interrupt request from the disk, which indicates the completion of a data transfer.
Consequently, the CPU issues disk interrupt signals to the operating system, which eventually causes the blocked process to resume execution.
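The IVT lookup described above can be sketched as a simple mapping from interrupt numbers to handler routines. The numbers and handler names below are invented for illustration:

```python
handled = []

def keyboard_handler():
    handled.append("keyboard")  # e.g. read the pressed key

def disk_handler():
    handled.append("disk")      # e.g. finish a data transfer

# A toy interrupt vector table: interrupt number -> handler routine.
ivt = {
    1: keyboard_handler,
    14: disk_handler,
}

def dispatch(interrupt_number):
    # Look up the handler for this interrupt and run it.
    handler = ivt[interrupt_number]
    handler()

dispatch(14)
print(handled)  # prints ['disk']
```

A real IVT holds memory addresses of handler code rather than Python functions, but the lookup-then-run idea is the same.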
# Threads and Multi-threading
The instructions of a process are treated by the OS as a stream of instructions, called a thread (or a thread of execution).
So the terms process instructions and thread instructions can be used interchangeably.
The thread’s instructions are normally executed sequentially; however, sometimes certain instructions don’t have to wait for their turn, because they’re completely independent of the others.
Let’s suppose we have a process with 100 instructions, where the last ten can be executed independently and don’t have to wait for the first ninety.
To benefit from the OS’s multitasking capabilities, one approach is to put those last ten instructions into a separate process within the same program.
Most programming languages provide features to help you write a program as multiple processes, so once the program runs, multiple processes are created in memory (and benefit from multitasking).
However, this is not the only way to use the OS multitasking features.
If we just want our independent instructions to run concurrently, and memory protection is not a concern, we can create another thread within the main process – so it’ll be one process with two threads of execution.
Every process has at least one thread of execution.
Threads are described as lightweight processes, because switching between threads of the same process doesn’t require a full context switch – they share their parent process’s data.
Now, if we put those last ten instructions in another thread, we’ll have one process with two threads.
The operating system treats a thread much like a process, with one exception: a full context switch isn’t necessary when switching between the threads of a process.
Multithreading is essentially multitasking within a single process, and it’s the programmer who decides whether the instructions should be split into multiple threads.
Memory protection doesn’t apply between the threads of a process, as they all share the same Process Control Block – inputs, heap, descriptors, etc.
Since threads share the same resources, it is important to understand how to synchronize access to those resources so that the work of one thread isn’t overridden or corrupted by another thread.
# A Process with multiple threads vs multiple processes
To improve the performance of a program and use the OS’s multitasking capabilities, the programmer either writes the program as multiple threads within one process, or as multiple independent processes.
But which approach is the best?
Well, that depends; each strategy has its advantages over the other, under different circumstances.
Creating and switching between threads is cheaper than doing so between processes; threads share the parent process’s context, and no separate Process Control Block has to be created for each thread.
However, using multiple independent processes yields better memory management.
For instance, in case of memory shortage, an inactive process can be swapped to disk, to free up the memory needed for another process.
That cannot happen with multiple threads.
Another benefit of using multiple processes is memory protection, which prevents one process from affecting the others.
Let’s make it more clear with an example:
Google Chrome manages each tab as a separate process; this means anytime you open a new tab, a new process is created.
If the tabs were programmed as multiple threads, a non-responsive tab would affect the other tabs, and finally the web browser itself because threads use the same memory space.
However, if each tab is managed as a process, thanks to the process-level memory protection, bugs, and glitches in one tab (one process), won’t have any effect on the other tabs (other processes).
Although creating and maintaining multiple processes has more overhead compared to threads, the Google Chrome team traded this fixed cost for a more reliable application.
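Memory protection between processes is easy to demonstrate with a sketch: a child process modifies a global variable, but the parent’s copy is untouched, because each process has its own memory space:

```python
import multiprocessing

value = "original"

def child():
    global value
    value = "changed"  # only changes the CHILD's own copy

if __name__ == "__main__":
    p = multiprocessing.Process(target=child)
    p.start()
    p.join()
    # The parent's memory is isolated from the child's.
    print(value)  # prints "original"
```

If `child` ran as a thread instead, the parent would see `"changed"`, because threads share the same memory space.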
# Inter-process communication
Although processes cannot access each other’s data, the operating system may provide mechanisms that enable processes to interact with each other in a safe and predictable way – if needed. This is called inter-process communication (IPC).
# Memory management
Any limited resource needs resource management, and the main memory is no exception.
Memory management is about keeping track of allocated and free memory spaces, to determine how the memory should be allocated to new processes.
If there’s not enough free space in memory to keep all running processes, idle processes may have their memory swapped out to a location on the disk, called the backing store, to make room for the running processes.
Computer programs have no notion of physical memory addresses; they use a virtualized address space, and as far as each one knows, it’s the only program residing in memory.
Every memory address a process uses is a logical address. On each read/write request, this logical address is mapped to the actual physical address by the MMU (memory management unit), a hardware component that is typically part of the CPU.
Alright, now, let’s go through two different approaches the operating systems take to allocate memory to processes.
# Single Contiguous Allocation
Single contiguous allocation is the simplest memory management technique, where the whole memory (except for a small portion reserved for the OS) is allocated to one program at a time.
For example, the MS-DOS operating system allocated memory this way.
A system using a single contiguous allocation may still do multitasking by swapping the whole memory content (temporarily moving it to the disk) to switch between processes.
# Partitioned Allocation
In partitioned allocation, the memory is split into partitions that are allocated to multiple processes at a time.
Partition allocation can be done in two ways:
One approach is to divide the memory into equal-sized partitions and allocate each partition to a process.
The problem with this approach is that not all processes need the same amount of memory, and the leftovers will remain unused as long as the process owns the space.
This might limit the number of simultaneous processes we can have in the memory.
A better approach, however, is to allocate memory to processes based on how much space they need.
Then, keep a list of free memory blocks (known as holes) and allocate the best possible hole (in terms of size) when a certain process is created.
This approach is the most common approach we see today in operating systems.
When allocating a hole to a process, three different strategies can be used:
First Fit: Search the list of holes until a hole big enough for the process is found.
Best Fit: Search the whole list and pick the smallest hole that satisfies the process’s requirements.
This helps keep the bigger holes intact for processes that need a large chunk of memory.
Worst Fit: Pick the largest available hole, much larger than what the process requires.
First fit and best fit are the preferred approaches when allocating memory space to a process.
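The three strategies can be sketched in a few lines of Python (the `find_hole` helper and the hole sizes are illustrative, not an actual allocator):

```python
def find_hole(holes, size, strategy="first"):
    """Return the index of a free block (hole) for a request of `size` units.

    holes    -- list of free-block sizes
    strategy -- "first": first hole big enough
                "best" : smallest hole that is big enough
                "worst": largest hole overall (if it fits)
    """
    candidates = [(i, h) for i, h in enumerate(holes) if h >= size]
    if not candidates:
        return None  # no hole can satisfy the request
    if strategy == "first":
        return candidates[0][0]
    if strategy == "best":
        return min(candidates, key=lambda c: c[1])[0]
    if strategy == "worst":
        return max(candidates, key=lambda c: c[1])[0]
    raise ValueError(f"unknown strategy: {strategy}")

holes = [100, 500, 200, 300, 600]
print(find_hole(holes, 212, "first"))  # 1 -> the 500-unit hole (first one that fits)
print(find_hole(holes, 212, "best"))   # 3 -> the 300-unit hole (smallest that fits)
print(find_hole(holes, 212, "worst"))  # 4 -> the 600-unit hole (largest of all)
```

Notice how best fit leaves the 500- and 600-unit holes intact for future large requests, while worst fit consumes the biggest hole immediately.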
# Memory protection
As mentioned earlier, each memory space that is given to a certain process is a subset of the main memory.
However, since the allocation is done through a virtualized layer, each process thinks it’s the only process using the memory and doesn’t know other processes even exist.
This level of separation prevents a malicious program (such as malware) from interfering with other programs’ data and making them crash.
This strategy is called memory protection.
With memory protection, a process cannot modify the data of another process in the memory – either with good or bad intentions.
As a result, if a program crashes due to a bug or an external issue, the other processes in memory, and even the OS itself, won’t be affected.
For instance, if a Google Chrome tab crashes, your data in MS Word is guaranteed to remain intact.
But how does this mechanism work?
Here’s the explanation:
Every running process is associated with two values: base and limit. Together, they define the range of memory addresses that the instructions of that process can access.
As a result, any memory access instruction within the process is checked against these two registers by the CPU.
If memory access is attempted outside of this range, the CPU issues a fatal error and the process is terminated immediately.
This limitation only applies to user programs (programs you write or install).
The OS itself, however, is exempt from this memory access limitation, as it’s supposed to handle memory allocation for all processes.
Other system programs, such as OS utilities and drivers are normally exempt from this limitation as well.
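The base/limit check itself is simple enough to simulate in a few lines of Python (the `check_access` function and the numbers are a toy illustration of what the CPU does in hardware):

```python
def check_access(base, limit, address):
    """Simulates the CPU's base/limit check: the process may only
    touch addresses in the range [base, base + limit)."""
    if base <= address < base + limit:
        return True  # access allowed
    # in hardware this would be a trap that terminates the process
    raise MemoryError(f"segmentation fault: address {address} is out of range")

# A process whose space starts at address 300040 and spans 120900 bytes:
print(check_access(300040, 120900, 350000))  # True -- inside the allowed range
# check_access(300040, 120900, 500000)       # would raise MemoryError
```

Every single memory access an unprivileged process makes goes through this comparison, which is why it has to be done in hardware rather than software.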
to be continued …