Depending on the memory chips and the CPU, the smallest amount of data that can be read is one byte (or a multiple of bytes). The bytes are numbered, and this number is the address. An 8 bit CPU can access 8 bit = 1 byte at once using one address. But an 8 bit CPU also has to deal with 16 bit data (integer numbers, for example). To do this, two accesses to the memory are required, using two sequential addresses. When a program runs, the program counter counts up and increments the address (it is natural that it counts up and not down). The two bytes read have to be put together to form a single 16 bit value. However, there are two possibilities to do this:
              | First (lower) address used | Second (higher) address used
              | (first byte accessed)      | (second byte accessed)
Big Endian    | High Byte                  | Low Byte
Little Endian | Low Byte                   | High Byte
The two terms big and little endian are used to distinguish between the two methods.
As an example, x86 CPUs are little endian and the 68k is big endian; ARM CPUs can be configured either way (bi-endian), but usually run little endian.
The terms are quite confusing, since the CPU that ends its access with the low byte is called big endian and the one that ends its access with the high byte is called little endian.
My experience: each time I think I know what a big and what a little endian is, I'm wrong and discover that it is the opposite.
To help remember: big endians read the big end of the 16 bit value (= high byte) first. The 16 bit value has two ends (like a sausage), but one end is big and the other little. If you get the big end first it is a big endian, if you get the little end first it is a little endian.
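A small C program can check this at run time: store a known 16 bit value and look at which byte sits at the first (lower) address. A minimal sketch:

#include <stdio.h>
#include <stdint.h>

int main(void)
{
    uint16_t value = 0x1234;            /* big end (high byte) 0x12, little end (low byte) 0x34 */
    uint8_t *first = (uint8_t *)&value; /* the byte at the first (lower) address */

    if (*first == 0x12)
        printf("big endian: the big end 0x12 comes first\n");
    else
        printf("little endian: the little end 0x34 comes first\n");
    return 0;
}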
The same problem also exists for:
Bits forming a byte
Bytes forming a 64 bit value
Data structures, unions and bit fields in C (and other) programming languages.
All kinds of serial communications, where bit after bit and byte after byte are transferred (see the sketch below).
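For serial communication, the two sides must agree on a byte order, independent of what the CPUs on either end do internally. A minimal C sketch (the function name put_be16 is just an illustration) that always sends the big end first by using shifts instead of relying on the memory layout:

#include <stdio.h>
#include <stdint.h>

/* serialize a 16 bit value in big endian order,
   independent of the host's own endianness */
static void put_be16(uint8_t *buf, uint16_t value)
{
    buf[0] = (uint8_t)(value >> 8);   /* big end (high byte) first */
    buf[1] = (uint8_t)(value & 0xff); /* little end (low byte) second */
}

int main(void)
{
    uint8_t buf[2];
    put_be16(buf, 0x1234);
    printf("first byte 0x%02x, second byte 0x%02x\n", buf[0], buf[1]);
    return 0;
}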
Some CPUs cannot access a 16 bit value (or 64 bit, or ...) when its first byte is on an odd address, or can do so only slowly. Therefore compilers put all 16 bit values (data structures, labels, functions) on even addresses. Larger values get correspondingly larger alignment: 32 bit values are typically aligned to multiples of 4 bytes and 64 bit values to multiples of 8 bytes.
When first an 8 bit value is put in memory, an even address will be taken. A second 8 bit value would be put at the following odd address; however a 16 bit value has to be put at an even address as well to have it fast and easily accessible. Therefore a hole is created, a byte that will not be used. Such bytes are called padding bytes.
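In C the compiler inserts such padding bytes into structures automatically. A minimal sketch that makes the hole visible (struct example is just an illustrative name):

#include <stdio.h>
#include <stddef.h>
#include <stdint.h>

struct example {
    uint8_t  a;   /* 1 byte at offset 0 */
                  /* 1 padding byte here, so that b starts on an even address */
    uint16_t b;   /* 2 bytes at offset 2 */
};

int main(void)
{
    printf("sizeof(struct example) = %zu\n", sizeof(struct example));          /* 4, not 3 */
    printf("offset of b            = %zu\n", offsetof(struct example, b));     /* 2 */
    return 0;
}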
To see what you have available type:
free
             total       used       free     shared    buffers     cached
Mem:       1033292     843088     190204          0      56700     369556
-/+ buffers/cache:     416832     616460
Swap:      1554604          0    1554604
Or, if you are a human, type free -h
An alternative command is cat /proc/meminfo
This shows that the system has 1'033'292 kByte, about 1 GByte, of RAM. Out of that, 843'088 kByte is used and 190'204 kByte is free. Some of the used RAM is used for buffers (56'700 kByte) and for cache (369'556 kByte). This gives 426'256 kByte, so the remaining 416'832 kByte is used by the applications running on the system. When the RAM is completely used up, the system gets into serious trouble. To prevent that, the RAM can be expanded to the hard disk by having some space of the hard disk used as swap. On the system above, 1'554'604 kByte of the hard disk is reserved for the swap and nothing of it is used. Again, if the memory including the swap is used up, the system crashes. Unfortunately the hard disk containing the swap space is stressed continuously and might therefore have a reduced lifetime. It is therefore recommended to use a smaller, maybe old, hard disk for the swap space and have the data on another disk.
The program vmstat has many options to see how the memory is used, so type man vmstat.
cat /var/log/dmesg | grep Memory will show how much memory was detected when the PC booted.
Finally there is dmidecode --type 17 (man dmidecode tells that type 17 is the memory devices).
Some hard disk space is often used as swap: if the RAM gets full, data is put onto this swap partition.
As a rule of thumb, the swap partition should be 2 times the size of the RAM. swapon /dev/sd<?>
makes it available.
This sounds cool, but sometimes it literally sounds: the hard disk makes noise and the user does not see any reason for it. The swap can be turned on and off on the fly with swapoff -a and swapon -a
Instead of a swap partition, a swap file can be used; it can be placed wherever wanted and does not require that the disk partitioning is altered:
Create the file: dd if=/dev/zero of=/swapfile bs=1024 count=1048576
(with bs=1024 the count is in kByte, so this creates a 1 GByte file).
Then mkswap /swapfile to convert it into a swap file.
Turn it on with swapon /swapfile
Then add it to /etc/fstab to have it activated at the next boot:
/swapfile none swap sw 0 0
To see how enthusiastic the kernel is to use the swap: cat /proc/sys/vm/swappiness
60 is the default; 100 means the kernel really likes to use the swap and 0 means the kernel avoids using the swap. This is a kernel parameter, but it can be overwritten by echo 0 > /proc/sys/vm/swappiness or, to have it persistent, by adding it to /etc/sysctl.conf:
vm.swappiness=0
Finally swapon -s or cat /proc/swaps shows you what you have.
In the age of solid state disks, care should be taken, since those devices have limited write cycles. The filesystem can mark broken cells as bad blocks; this however makes the situation worse, since the trouble gets hidden from the user. SSDs usually support SMART data, so checks are advisable. Anyway, the better solution than swapping to an SSD is using more RAM and removing the swap space.
When a program is started, it is copied into memory (RAM) and runs as a process scheduled by the kernel. Another program could (intentionally or not) write into the RAM area where the program resides and would cause unpredictable side effects (crash, malfunction, ...). To prevent such side effects, the individual processes have their own virtual memory spaces, kept isolated from each other using a piece of hardware, the MMU (Memory Management Unit).
The programmers do not have to care about the MMU, since they deal just with virtual addresses. The MMU translates the virtual addresses of every individual process into real, unique physical addresses. A virtual address is split into an upper and a lower part. The lower part (the offset) is kept as it is, but the upper part (the page number) is replaced by the MMU. It is as simple as this. The block of memory covered by the lower part is called a page; it has a limited size (for x86: 4 kByte, 2 MByte or 4 MByte), so a single page is too little to hold most programs, and additionally there are many processes running in parallel.
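As a sketch in C, assuming the common 4 kByte pages (12 offset bits) on a 32 bit CPU, the split looks like this (the address 0x0804a123 is just a made-up example):

#include <stdio.h>
#include <stdint.h>

#define PAGE_SHIFT 12                 /* 4 kByte pages -> 12 offset bits */
#define PAGE_SIZE  (1u << PAGE_SHIFT) /* 4096 */

int main(void)
{
    uint32_t virt = 0x0804a123;                    /* made-up virtual address */
    uint32_t page_number = virt >> PAGE_SHIFT;     /* upper part, translated by the MMU */
    uint32_t offset      = virt & (PAGE_SIZE - 1); /* lower part, kept as it is */

    printf("virtual 0x%08x -> page number 0x%05x, offset 0x%03x\n",
           (unsigned)virt, (unsigned)page_number, (unsigned)offset);
    return 0;
}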
The table that holds the page numbers and the corresponding upper address bits of the physical memory (the physical page numbers) is called the Translation Lookaside Buffer (TLB), and its rows of entries are called Page Table Entries (PTE).
Looking up the page number in a table slows down the program execution. Therefore the TLB is implemented as a Content-Addressable Memory (CAM) inside the MMU hardware. Such memory takes chip area and therefore costs money. Smaller processors can therefore not afford an MMU, and native Linux can not run on those devices. However, Linux versions are available that emulate an MMU, such as uClinux.
Since the TLB has not an infinite size, there are many reasons why a virtual address can not be translated to a physical address because no entry can be found. In such cases the MMU creates a page fault interrupt and the operating system (the Linux kernel) has to react. The Linux kernel has its own additional page table in regular memory, where it knows much more about the memory demands than what fits into the MMU's TLB. Maybe the desired page is even swapped out to the hard disk swap space. If swapping is used, the Linux kernel takes the page from the hard disk, puts it into physical memory and updates the TLB. If there was no free space in RAM, pages will be overwritten, and the kernel has to update its page table accordingly. The overwritten TLB entries will be recovered at the next access. When the Linux kernel has found the physical address, it leaves the page fault interrupt routine and restarts the last instruction, which should be successful now, since the TLB has been updated.
If the operating system is also not able to find a corresponding address, it creates a segmentation fault that might appear on the screen.
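A minimal C sketch that provokes exactly this: it dereferences a made-up address for which no page mapping exists, so the kernel answers the page fault with a segmentation fault (SIGSEGV):

#include <stdio.h>

int main(void)
{
    int *p = (int *)0xdeadbeef; /* made-up address, no page mapped there */

    printf("about to write to %p\n", (void *)p);
    *p = 42;                    /* page fault; the kernel finds no mapping -> SIGSEGV */
    printf("never reached\n");
    return 0;
}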
If all that works, every program sees its own 4 GByte (32 address bits) virtual address space that looks as follows:
Address        | Segment     | Contents
Hex FFFF FFFF  | Linux       | Here are all interrupt routines and kernel stuff (the top 1 GByte). Note it is physically not duplicated for each process! All virtual addresses point to the same physical addresses in this area.
               | Stack       | Return addresses of function calls, but also local variables and other temporary data
               | Unused area | Since stack and heap grow, this is the area in between; as long as there is something free here, there is no crash
               | Heap        | Data
Hex 0000 0000  | Code        | Here is the program code to be executed, but also constant data
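A small C program can make this layout visible by printing the addresses of objects in the different segments (a sketch; the exact numbers depend on the system and, with address space randomization, change from run to run):

#include <stdio.h>
#include <stdlib.h>

int global_data = 42;                          /* data, near the bottom with the code */

int main(void)
{
    int  stack_var = 0;                        /* stack, near the top */
    int *heap_var  = malloc(sizeof *heap_var); /* heap, above code and data */

    printf("code : %p (main)\n",        (void *)main);
    printf("data : %p (global_data)\n", (void *)&global_data);
    printf("heap : %p (malloc)\n",      (void *)heap_var);
    printf("stack: %p (stack_var)\n",   (void *)&stack_var);

    free(heap_var);
    return 0;
}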
To initialize such an address space, Linux requires that the program is formatted in a known way: the Executable and Linking Format (ELF) is used.
Type objdump -h <path and program name>
to get an example of what is known. Or even better
readelf --all <path and program name>
to get all of it.
Historically, Intel processors had weird, historically grown methods to access memory. Luckily Linux does not support them and requires the protected mode to address memory, which makes use of the MMU. Therefore all other address modes do not have to be considered, except when the CPU gets reset: for backward compatibility it will then behave as an old Intel processor and has to be configured first, including initializing the MMU, to go into protected mode.