
Memory Management

Memory tuning in Linux is complex because memory usage and I/O throughput are tightly coupled. In most cases, the majority of memory is being used to cache file contents from disk. This is deliberate - Linux aggressively caches disk data in RAM to avoid re-reading from the slower physical disk.

Consequences:

  • Changing memory parameters has a large effect on I/O performance
  • Changing I/O parameters has an equally large effect on the virtual memory subsystem
  • Optimize one without considering the other and you may make things worse

Terminal window
free -h      # human-readable units (GiB/MiB)
free -m      # in mebibytes
free -s 2    # refresh every 2 seconds

Sample output of free -m:

              total        used        free      shared  buff/cache   available
Mem:           7763        3178         646        1022        3938        3262
Swap:          7762        1034        6728
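
| Column | Meaning |
|---|---|
| total | Total installed RAM |
| used | Used by processes |
| free | Completely unused RAM |
| shared | Memory shared between processes (tmpfs, shmem) |
| buff/cache | Used for kernel buffers and file cache (mostly reclaimable) |
| available | Estimate of memory available to new processes without swapping (roughly free plus the reclaimable part of buff/cache; tmpfs and shared memory count as cache but cannot simply be dropped) |
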
Terminal window
cat /proc/meminfo

MemTotal:        7949804 kB
MemFree:          669748 kB
MemAvailable:    3355456 kB
Buffers:              28 kB
Cached:          3777140 kB
SwapCached:        13160 kB
Active:          2357428 kB
Inactive:        3249488 kB
Active(anon):    1659132 kB    # anonymous (heap/stack) active
Inactive(anon):  1201760 kB
Active(file):     698296 kB    # file-backed active cache
Inactive(file):  2047728 kB
Unevictable:      583624 kB
Mlocked:             220 kB
SwapTotal:       7949308 kB
...
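
The MemTotal and MemAvailable fields above can be reduced to a single figure for quick checks. The following is a minimal sketch, assuming a kernel that exports MemAvailable (3.14 or later); the function name mem_available_pct is hypothetical and not part of any package.

```shell
# mem_available_pct [FILE]
# Print MemAvailable as a whole-number percentage of MemTotal.
# FILE defaults to /proc/meminfo; passing a saved copy also works.
mem_available_pct() {
    awk '
        $1 == "MemTotal:"     { total = $2 }
        $1 == "MemAvailable:" { avail = $2 }
        END { if (total > 0) printf "%d\n", int(avail * 100 / total) }
    ' "${1:-/proc/meminfo}"
}

# Report for the running system when /proc is present.
if [ -r /proc/meminfo ]; then
    echo "available: $(mem_available_pct)%"
fi
```

With the sample values shown above (3355456 kB available out of 7949804 kB), the function prints 42.
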

| Tool | Purpose | Package |
|---|---|---|
| free | Brief summary: total, used, free, cache, available | procps |
| vmstat | Detailed memory + swap + I/O + CPU, updateable | procps |
| pmap | Per-process memory map showing segments + sizes | procps |

Terminal window
pmap -x PID # extended output with RSS per segment
pmap -d PID # device format

vmstat is a multi-purpose tool that reports memory, paging, I/O, CPU, and process information in one view.

Terminal window
vmstat [options] [delay] [count]
vmstat 2 4                # report every 2 seconds, 4 times
vmstat -SM -a 2 4         # in MiB, show active/inactive memory
vmstat -p /dev/sda3 2 4   # per-partition I/O stats

Sample output of vmstat:

procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b    swpd   free   buff   cache   si   so    bi    bo   in   cs us sy id wa st
 4  0 1048576 910912     28 4061280    6   16    62    42   52  151  4  2 94  0  0

Column key:

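| Field | Meaning |
|---|---|
| r | Runnable processes (running or waiting for CPU time) |
| b | Processes in uninterruptible sleep (typically waiting on I/O) |
| swpd | Swap space in use |
| free | Idle memory |
| buff | Memory used as kernel buffers |
| cache | Memory used as page cache |
| si | Swap-in rate (KiB/s read from swap into RAM) |
| so | Swap-out rate (KiB/s written from RAM to swap) - if persistently non-zero, the system is memory-constrained |
| bi / bo | Blocks received from / sent to block devices (blocks/s; vmstat counts 1024-byte blocks, not sectors) |
| us / sy / id / wa / st | CPU time %: user / system / idle / I/O wait / stolen by a hypervisor |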

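If so (swap-out) is persistently non-zero, the system is under memory pressure. If r consistently exceeds the number of CPU cores, the system is likely CPU-bound. The first line of vmstat output reports averages since boot, so judge from the later samples.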
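
The "persistently non-zero so" check can be automated. This is a rough sketch, assuming the default vmstat column layout shown above (so is the 8th field); count_swapout_samples is a hypothetical helper name. It skips the first data row because that row holds since-boot averages.

```shell
# count_swapout_samples
# Read vmstat output on stdin and print how many samples, excluding the
# first (since-boot) data row, have a non-zero value in the "so" column.
# Assumes the default layout in which "so" is the 8th field.
count_swapout_samples() {
    awk '
        $1 ~ /^[0-9]+$/ {            # data rows start with a number; headers do not
            if (seen++ && $8 > 0) n++
        }
        END { print n + 0 }
    '
}

# Usage (takes about 20 seconds to collect 10 samples):
#   vmstat 2 10 | count_swapout_samples
```

A result close to the number of samples taken suggests sustained swap-out rather than a one-off burst.
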

Active vs Inactive Memory:

With -a, vmstat shows active and inactive memory instead of buff/cache:

  • Active: recently used pages; may be clean or dirty
  • Inactive: not recently used; likely clean and released first under pressure

Linux implements a virtual memory system - processes can be given more memory than physically exists. This works in two ways:

  1. Memory overcommit (COW): Many processes never use all requested memory. When a child process is forked, it inherits the parent’s address space via Copy-On-Write (COW) - no physical copy is made until either process modifies a page. This allows fork() to be near-instantaneous even for large processes.

  2. Swapping: When physical RAM is exhausted, the kernel moves less active memory pages from RAM to a swap partition or file on disk. Pages are brought back (swapped in) on-demand when accessed again.

General guidance for swap size: equal to installed RAM. Systems with large amounts of RAM may use less swap; hibernation requires at least as much swap as RAM.

Terminal window
# View current swap usage
cat /proc/swaps
swapon --show # tabular, with priority
# Runtime free memory check
free -h
# Create and enable a swap file
sudo dd if=/dev/zero of=/swapfile bs=1M count=4096 # create a 4 GiB file (1 MiB blocks keep dd's buffer small)
sudo chmod 600 /swapfile
sudo mkswap /swapfile # format as swap
sudo swapon /swapfile # enable
# Disable swap (the kernel first moves its pages back into RAM, so enough free memory is needed)
sudo swapoff /swapfile
# Add to /etc/fstab to persist across reboots:
# /swapfile none swap sw 0 0

Linux supports multiple swap areas, each with a priority. Lower priority areas are not used until higher priority areas fill; areas with equal priority are used in round-robin fashion:

Terminal window
swapon -p 10 /dev/sdb1 # enable with priority 10
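
To see the order in which swap areas will fill, the contents of /proc/swaps can be sorted by priority. A small sketch, assuming the usual five-column layout (Filename, Type, Size, Used, Priority); swap_by_priority is a hypothetical helper name.

```shell
# swap_by_priority [FILE]
# List swap areas from highest to lowest priority as:
#   PRIORITY NAME USED%
# FILE defaults to /proc/swaps.
swap_by_priority() {
    awk 'NR > 1 && $3 > 0 {
        printf "%s %s %d%%\n", $5, $1, int($4 * 100 / $3)
    }' "${1:-/proc/swaps}" | sort -k1,1nr
}

# Show the running system's swap areas when /proc is present.
if [ -r /proc/swaps ]; then
    swap_by_priority
fi
```

Areas near the top of the list are used first; a lower entry showing usage usually means the areas above it are full or share its priority.
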

At any moment, most memory is typically used to cache file contents. These file-backed pages never need to go to swap because their backing store is the file on disk: clean pages are simply dropped and re-read later, and dirty pages (modified content not yet written to disk) are written back to the file rather than swapped.

Only memory with no file behind it is written to the swap area when reclaimed: anonymous memory (heap, stack, mmap without a file) and shared memory or tmpfs pages.
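
The split between swappable anonymous memory and droppable file cache can be read from the (anon) and (file) counters in /proc/meminfo. A minimal sketch; anon_vs_file_kb is a hypothetical helper name, and it ignores unevictable and shmem accounting details.

```shell
# anon_vs_file_kb [FILE]
# Sum the active and inactive LRU counters for anonymous and file-backed
# pages and print both totals in kB. FILE defaults to /proc/meminfo.
anon_vs_file_kb() {
    awk '
        $1 == "Active(anon):" || $1 == "Inactive(anon):" { anon += $2 }
        $1 == "Active(file):" || $1 == "Inactive(file):" { file += $2 }
        END { printf "anon_kB=%d file_kB=%d\n", anon, file }
    ' "${1:-/proc/meminfo}"
}

# Report for the running system when /proc is present.
if [ -r /proc/meminfo ]; then
    anon_vs_file_kb
fi
```

For the sample /proc/meminfo values shown earlier, this gives anon_kB=2860892 (candidates for swap) and file_kB=2746024 (reclaimable by dropping or writing back).
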


The /proc/sys/vm directory exposes kernel VM tuning knobs. You can change them live by writing directly or using sysctl:

Terminal window
ls /proc/sys/vm/ # list available parameters
sysctl vm.swappiness # read a parameter
sudo sysctl -w vm.swappiness=10 # set temporarily (lost at reboot)
echo "vm.swappiness=10" | sudo tee -a /etc/sysctl.conf # persist across reboots
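
| Parameter | Default | Effect |
|---|---|---|
| vm.swappiness | 60 | Relative preference for swapping out anonymous pages versus reclaiming file cache (0 = avoid swapping where possible; higher values swap more readily). Values of 10-20 are often suggested for desktops. |
| vm.dirty_ratio | 20 | % of available memory that may hold dirty (unwritten) pages before processes that write are throttled and forced to write data back themselves |
| vm.dirty_background_ratio | 10 | % of available memory holding dirty pages at which background writeback starts |
| vm.overcommit_memory | 0 | Overcommit policy (see below) |
| vm.vfs_cache_pressure | 100 | Tendency to reclaim memory from the directory and inode caches |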

Three primary tuning areas:

  • Flush behavior: how many dirty pages are allowed and how often they are written to disk
  • Swap behavior: when to swap anonymous pages vs keep file-backed pages in RAM
  • Overcommit level: how much memory allocation beyond physical RAM + swap is allowed
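
For flush behavior, the percentage knobs translate into absolute thresholds. The sketch below is a rough worked example; it assumes MemAvailable is a fair stand-in for the memory the kernel counts as able to hold dirty pages (the kernel's own figure differs somewhat), and dirty_threshold_kb is a hypothetical helper name.

```shell
# dirty_threshold_kb AVAILABLE_KB RATIO_PERCENT
# Print the approximate dirty-page threshold in kB (integer arithmetic).
dirty_threshold_kb() {
    echo $(( $1 * $2 / 100 ))
}

# Worked example with MemAvailable = 3355456 kB and the default ratios:
dirty_threshold_kb 3355456 10   # vm.dirty_background_ratio: prints 335545
dirty_threshold_kb 3355456 20   # vm.dirty_ratio:            prints 671091
```

On that sample system, background writeback would start at roughly 330 MB of dirty data, and writers would be throttled at roughly 660 MB.
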

When memory demand outgrows physical RAM, the kernel has three options:

  1. Refuse allocations - malloc() calls fail; applications that do not handle the failure crash
  2. Use swap - slower, but extends capacity until swap fills up as well
  3. Overcommit, then invoke the OOM Killer once RAM and swap are both exhausted - Linux’s default approach

Linux allows memory overcommitment: granting allocation requests beyond what RAM + swap can hold, because many processes never fully use their allocated memory (COW, buffers never filled). The kernel tracks this and selects a victim to kill only when the overcommit becomes real.

Overcommit Policy: /proc/sys/vm/overcommit_memory

| Value | Behavior |
|---|---|
| 0 (default) | Allow overcommit, but refuse obvious overcommits. Root gets more headroom than regular users. |
| 1 | Allow all memory requests unconditionally. Maximum overcommit. |
| 2 | Disable overcommit. Fail if total allocations exceed swap + (overcommit_ratio % of RAM). |
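
The limit applied in mode 2 can be worked out by hand. A minimal sketch of the formula in the table, assuming no huge pages are reserved (the kernel subtracts them from RAM first) and the default vm.overcommit_ratio of 50; commit_limit_kb is a hypothetical helper name.

```shell
# commit_limit_kb MEMTOTAL_KB SWAPTOTAL_KB OVERCOMMIT_RATIO
# Print swap + (ratio % of RAM) in kB, ignoring reserved huge pages.
commit_limit_kb() {
    echo $(( $2 + $1 * $3 / 100 ))
}

# Worked example with the sample /proc/meminfo values and ratio 50:
commit_limit_kb 7949804 7949308 50   # prints 11924210

# The kernel reports its own figures in /proc/meminfo for comparison.
if [ -r /proc/meminfo ]; then
    grep -E '^(CommitLimit|Committed_AS):' /proc/meminfo || true
fi
```

Committed_AS approaching CommitLimit means new allocations are close to being refused when vm.overcommit_memory is 2.
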

Each process has an oom_score in /proc/[pid]/oom_score. Higher scores = more likely to be killed.

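In current kernels the score is based mainly on:

  • How much memory the process uses (resident memory, swap, and page tables) relative to the memory available
  • The process's oom_score_adj value (-1000 to 1000), which is added to the result

Older kernels also weighed how long the process had been running and whether it was privileged.
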
Terminal window
cat /proc/$(pgrep -o firefox)/oom_score # score of the oldest matching process (pgrep may match several PIDs)
# Protect a critical process from being OOM-killed (-1000 = never kill); lowering the value requires root:
echo -1000 | sudo tee /proc/PID/oom_score_adj
# Make a process more killable (positive values, up to 1000):
echo 500 > /proc/PID/oom_score_adj
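
To see which processes the OOM Killer would pick first, the per-process scores can be ranked. This is a sketch under the assumption that /proc/[pid]/oom_score and /proc/[pid]/comm are readable for the processes of interest; top_oom_scores is a hypothetical helper name.

```shell
# top_oom_scores [PROC_ROOT] [COUNT]
# Print "SCORE PID NAME" for the COUNT processes with the highest
# oom_score, highest first. PROC_ROOT defaults to /proc, COUNT to 5.
top_oom_scores() {
    oom_root=${1:-/proc}
    oom_count=${2:-5}
    for f in "$oom_root"/[0-9]*/oom_score; do
        [ -r "$f" ] || continue                     # no match, or process gone
        pid=${f%/oom_score}
        pid=${pid##*/}
        score=$(cat "$f" 2>/dev/null) || continue   # process exited mid-read
        name=$(cat "$oom_root/$pid/comm" 2>/dev/null)
        printf '%s %s %s\n' "$score" "$pid" "${name:-?}"
    done | sort -k1,1nr -k2,2n | head -n "$oom_count"
}

# Rank the running system's processes when /proc is present.
if [ -d /proc/self ]; then
    top_oom_scores /proc 5
fi
```

Processes protected with an oom_score_adj of -1000 report a score of 0 and sink to the bottom of the ranking.
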
Terminal window
dmesg -T | grep -i "oom\|killed process"
journalctl -k | grep -i oom