Memory Management
Memory and I/O Are Intrinsically Linked
Memory tuning in Linux is complex because memory usage and I/O throughput are tightly coupled. On most systems, the majority of memory is used to cache file contents from disk. This is deliberate: Linux aggressively caches disk data in RAM to avoid re-reading it from the much slower disk.
Consequences:
- Changing memory parameters has a large effect on I/O performance
- Changing I/O parameters has an equally large effect on the virtual memory subsystem
- Optimize one without considering the other and you may make things worse
Viewing Memory Usage
free - Memory Summary

```shell
free -h      # human-readable (GiB/MiB)
free -m      # in mebibytes
free -s 2    # update every 2 seconds
```

```
               total        used        free      shared  buff/cache   available
Mem:            7763        3178         646        1022        3938        3262
Swap:           7762        1034        6728
```

| Column | Meaning |
|---|---|
| `total` | Total installed RAM |
| `used` | Memory in use (total minus free, buffers, and cache) |
| `free` | Completely unused RAM |
| `shared` | Memory shared between processes (tmpfs, shmem) |
| `buff/cache` | Used for kernel buffers and file cache (mostly reclaimable) |
| `available` | Estimate of memory available to new processes without swapping (roughly free plus the reclaimable part of the cache) |
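
As a rough illustration of how the `available` column can be read programmatically, the sketch below turns `free` output into a single percentage. The function name `available_percent` is made up for this example, and it assumes a procps-style header row that contains an `available` column and numeric output (`free -m` or `free -k`, not `free -h`).

```shell
# Hypothetical helper: print available memory as a whole-number percentage of total.
# Reads `free -m` style output on stdin; assumes the header names an "available" column.
available_percent() {
    awk '
        NR == 1 {
            # The header has no label above the "Mem:" column, so data fields are shifted by one.
            for (i = 1; i <= NF; i++) if ($i == "available") col = i + 1
        }
        $1 == "Mem:" && col { printf "%d\n", $col * 100 / $2; found = 1 }
        END { exit found ? 0 : 1 }
    '
}

# Fed with the sample output from this section; on a live system: free -m | available_percent
available_percent <<'EOF'
               total        used        free      shared  buff/cache   available
Mem:            7763        3178         646        1022        3938        3262
Swap:           7762        1034        6728
EOF
```

For the sample above this prints 42, meaning roughly 42% of RAM could still be handed to new processes without swapping.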
/proc/meminfo - Detailed Breakdown

```shell
cat /proc/meminfo
```

```
MemTotal:        7949804 kB
MemFree:          669748 kB
MemAvailable:    3355456 kB
Buffers:              28 kB
Cached:          3777140 kB
SwapCached:        13160 kB
Active:          2357428 kB
Inactive:        3249488 kB
Active(anon):    1659132 kB   # anonymous (heap/stack) active
Inactive(anon):  1201760 kB
Active(file):     698296 kB   # file-backed active cache
Inactive(file):  2047728 kB
Unevictable:      583624 kB
Mlocked:             220 kB
SwapTotal:       7949308 kB
...
```

Memory Monitoring Tools

| Tool | Purpose | Package |
|---|---|---|
| `free` | Brief summary: total, used, free, cache, available | procps |
| `vmstat` | Detailed memory + swap + I/O + CPU, updateable | procps |
| `pmap` | Per-process memory map showing segments + sizes | procps |

```shell
pmap -x PID   # extended output with RSS per segment
pmap -d PID   # device format
```

vmstat - Virtual Memory Statistics
vmstat is a multi-purpose tool that reports memory, paging, I/O, CPU, and process information in one view.

```shell
vmstat [options] [delay [count]]
```

```shell
vmstat 2 4                # report every 2 seconds, 4 times
vmstat -SM -a 2 4         # in MiB, show active/inactive memory
vmstat -p /dev/sda3 2 4   # per-partition I/O stats
```

```
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 4  0 1048576 910912     28 4061280    6   16    62    42   52  151  4  2 94  0  0
```

Column key:

| Field | Meaning |
|---|---|
| `r` | Runnable processes (running or waiting for CPU) |
| `b` | Processes in uninterruptible sleep (usually waiting on I/O) |
| `swpd` | Virtual memory used (swap) |
| `free` | Idle memory |
| `buff` | Kernel buffer cache |
| `cache` | Page cache |
| `si` | Swap-in rate (kB/s from disk to RAM) |
| `so` | Swap-out rate (kB/s from RAM to disk); if persistently non-zero, you're memory-constrained |
| `bi` / `bo` | Block I/O in/out (blocks/s; vmstat counts 1024-byte blocks, not sectors) |
| `us`/`sy`/`id`/`wa`/`st` | CPU: user/system/idle/I/O wait/stolen % |
If so (swap-out) is persistently non-zero, the system is under memory pressure. If r consistently exceeds the number of CPU cores, you are CPU-bound.
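
The two rules of thumb above can be checked mechanically. The following is a minimal sketch, not a monitoring tool: `vmstat_flags` is a hypothetical name, it locates the `r` and `so` columns from the header row rather than assuming fixed positions, and it takes the core count as an argument. Keep in mind that the first line vmstat prints reports averages since boot, so only the later samples say anything about current load.

```shell
# Hypothetical helper: flag vmstat samples that show swap-out activity or a
# run queue longer than the number of cores. Reads vmstat output on stdin.
# $1 = number of CPU cores to compare the run queue against.
vmstat_flags() {
    awk -v ncpu="$1" '
        $1 == "r" { for (i = 1; i <= NF; i++) col[$i] = i; next }   # column-name header row
        !("so" in col) || $1 !~ /^[0-9]+$/ { next }                 # skip the banner and any non-data lines
        {
            n++
            if ($(col["so"]) + 0 > 0)
                printf "sample %d: swapping out (so=%d)\n", n, $(col["so"])
            if ($(col["r"]) + 0 > ncpu + 0)
                printf "sample %d: run queue %d exceeds %d cores\n", n, $(col["r"]), ncpu
        }
    '
}

# Example use on a live system (nproc prints the number of available cores):
#   vmstat 2 5 | vmstat_flags "$(nproc)"
```

A single flagged sample means little; the signal is the same flag repeating across most samples.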
Active vs Inactive Memory:
With -a, vmstat shows active and inactive memory instead of buff/cache:
- Active: recently used pages; may be clean or dirty
- Inactive: not recently used; likely clean and released first under pressure
Virtual Memory and Swap
Linux implements a virtual memory system: processes can be given more memory than physically exists. This works in two ways:
- Memory overcommit (COW): Many processes never use all the memory they request. When a child process is forked, it shares the parent's pages via Copy-On-Write (COW); no physical copy is made until either process modifies a page. This allows `fork()` to be near-instantaneous even for large processes.
- Swapping: When physical RAM runs short, the kernel moves less active memory pages from RAM to a swap partition or swap file on disk. Pages are brought back (swapped in) on demand when accessed again.
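
One way to see overcommit on a running system is to compare the address space the kernel has promised to processes (`Committed_AS` in `/proc/meminfo`) with the RAM and swap that could actually back it. The sketch below does that arithmetic; `commit_percent` is a hypothetical name, and the optional file argument exists only so the function can be pointed at a saved copy of meminfo.

```shell
# Hypothetical helper: show committed address space as a percentage of RAM + swap.
# $1 = meminfo-format file (defaults to /proc/meminfo).
commit_percent() {
    awk '
        $1 == "MemTotal:"     { ram = $2 }
        $1 == "SwapTotal:"    { swap = $2 }
        $1 == "Committed_AS:" { committed = $2 }
        END {
            if (ram + swap == 0) exit 1     # nothing to compare against
            printf "committed %d kB of %d kB backing (%d%%)\n", committed, ram + swap, committed * 100 / (ram + swap)
        }
    ' "${1:-/proc/meminfo}"
}

# Example use on a live system:
#   commit_percent
```

A value above 100% means the kernel has promised more memory than RAM and swap combined, which is only safe as long as processes leave part of their allocations untouched.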
Swap Recommended Size
A common rule of thumb is to make swap equal to installed RAM. Systems with large amounts of RAM can use less; hibernation requires at least as much swap as RAM.
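
The hibernation rule is easy to check against `/proc/meminfo`. This is only a sketch of the comparison stated above (swap at least as large as RAM); `swap_vs_ram` is a hypothetical name, and real hibernation setups may have further requirements that it does not examine.

```shell
# Hypothetical helper: report whether swap is at least as large as RAM.
# $1 = meminfo-format file (defaults to /proc/meminfo).
swap_vs_ram() {
    awk '
        $1 == "MemTotal:"  { ram = $2 }
        $1 == "SwapTotal:" { swap = $2 }
        END {
            if (swap >= ram) printf "swap covers RAM (%d kB spare)\n", swap - ram
            else             printf "swap is %d kB smaller than RAM\n", ram - swap
        }
    ' "${1:-/proc/meminfo}"
}

# Example use on a live system:
#   swap_vs_ram
```

With the values from the `/proc/meminfo` sample earlier on this page (MemTotal 7949804 kB, SwapTotal 7949308 kB) it would report that swap is 496 kB smaller than RAM.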
Managing Swap

```shell
# View current swap usage
cat /proc/swaps
swapon --show            # tabular, with priority

# Runtime free memory check
free -h

# Create and enable a swap file
sudo dd if=/dev/zero of=/swapfile bs=1M count=4096   # create a 4 GiB file (1 MiB blocks keep dd's buffer small)
sudo chmod 600 /swapfile
sudo mkswap /swapfile    # format as swap
sudo swapon /swapfile    # enable

# Disable swap (swapped pages are moved back into RAM first)
sudo swapoff /swapfile

# Add to /etc/fstab to persist across reboots:
# /swapfile none swap sw 0 0
```

Swap Priority

Linux supports multiple swap areas, each with a priority. Lower-priority areas are not used until higher-priority areas fill:

```shell
sudo swapon -p 10 /dev/sdb1   # enable with priority 10
```

What Gets Cached vs Swapped

At any moment, most memory is used to cache file contents. These file-backed pages never need to be swapped because their backing store is the file on disk. Clean file-backed pages can simply be dropped and re-read later; dirty ones (modified content not yet written to disk) are written back to their file rather than to swap.
Only anonymous memory (heap, stack, mmap without a file) is swapped to the swap area when reclaimed.
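
To see which processes currently have anonymous memory sitting in swap, the `VmSwap` field of `/proc/[pid]/status` can be collected per process. The sketch below is one way to do that; `swap_by_process` is a hypothetical name, the optional argument (a proc mount point) is an assumption added for testability, and only the first word of each process name is shown.

```shell
# Hypothetical helper: list processes by swapped-out memory, largest first.
# $1 = proc mount point (defaults to /proc). Kernel threads have no VmSwap line and are skipped.
swap_by_process() {
    for status in "${1:-/proc}"/[0-9]*/status; do
        [ -r "$status" ] || continue      # process exited, or the glob matched nothing
        awk '
            $1 == "Name:"   { name = $2 }
            $1 == "Pid:"    { pid = $2 }
            $1 == "VmSwap:" && $2 > 0 { printf "%d kB\t%s\t%s\n", $2, pid, name }
        ' "$status" 2>/dev/null
    done | sort -rn | head -n 10
}

# Example use on a live system (other users' processes may need root to read):
#   swap_by_process
```

Each output line holds the swapped size, the PID, and the process name, separated by tabs.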
Tuning the VM Subsystem: /proc/sys/vm
The /proc/sys/vm directory exposes kernel VM tuning knobs. You can change them live by writing to the files directly or by using sysctl:

```shell
ls /proc/sys/vm/                  # list available parameters
sysctl vm.swappiness              # read a parameter
sudo sysctl -w vm.swappiness=10   # set temporarily (lost at reboot)
echo "vm.swappiness=10" | sudo tee -a /etc/sysctl.conf   # persist across reboots
```

Key Parameters

| Parameter | Default | Effect |
|---|---|---|
| `vm.swappiness` | 60 | How aggressively to swap (0 = avoid swapping, 100 = swap heavily). Values of 10-20 are often suggested for desktops. |
| `vm.dirty_ratio` | 20 | % of available memory (free plus reclaimable pages) that may hold dirty (unwritten) pages before writing processes are forced to flush data themselves |
| `vm.dirty_background_ratio` | 10 | % of available memory holding dirty pages at which background writeback starts |
| `vm.overcommit_memory` | 0 | Overcommit policy (see below) |
| `vm.vfs_cache_pressure` | 100 | Tendency to reclaim memory from the directory/inode caches (higher values reclaim more eagerly) |
Three primary tuning areas:
- Flush behavior: how many dirty pages are allowed and how often they are written to disk
- Swap behavior: when to swap anonymous pages vs keep file-backed pages in RAM
- Overcommit level: how much memory allocation beyond physical RAM + swap is allowed
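
For the flush-behavior knobs, it can help to translate the percentages into sizes. The sketch below is plain arithmetic under a stated assumption: the kernel applies `vm.dirty_background_ratio` and `vm.dirty_ratio` to its own measure of free plus reclaimable memory, and `MemAvailable` is used here only as an approximate stand-in for that figure. `dirty_limits` is a hypothetical name.

```shell
# Hypothetical helper: convert the dirty-page percentages into approximate sizes.
# $1 = memory the percentages apply to, in kB (MemAvailable used as a stand-in, see above)
# $2 = vm.dirty_background_ratio, $3 = vm.dirty_ratio
dirty_limits() {
    echo "background writeback starts near $(( $1 * $2 / 100 / 1024 )) MiB of dirty pages"
    echo "writers are throttled near $(( $1 * $3 / 100 / 1024 )) MiB of dirty pages"
}

# Worked example: default ratios (10 and 20) with MemAvailable from the
# /proc/meminfo sample earlier on this page.
dirty_limits 3355456 10 20
```

On a live system the inputs could come from `/proc/meminfo` and `sysctl -n vm.dirty_background_ratio` / `sysctl -n vm.dirty_ratio`; the results are estimates, not the exact thresholds the kernel enforces.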
OOM (Out of Memory) Killer
When memory demand exceeds what physical RAM can hold, the system has three options:
- Refuse allocations: `malloc()` calls fail and applications may crash
- Use swap: slower, but extends capacity
- Overcommit, then use the OOM Killer: Linux's default approach
Linux allows memory overcommitment: granting allocation requests beyond what RAM + swap can hold, because many processes never fully use their allocated memory (COW, buffers never filled). The kernel tracks this and selects a victim to kill only when the overcommit becomes real.
Overcommit Policy: /proc/sys/vm/overcommit_memory
| Value | Behavior |
|---|---|
| 0 (default) | Allow overcommit, but refuse obvious overcommits. Root gets more headroom than regular users. |
| 1 | Allow all memory requests unconditionally. Maximum overcommit. |
| 2 | Disable overcommit. Fail if total allocations exceed swap + (overcommit_ratio % of RAM). |
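
The mode 2 limit is simple enough to work out by hand. The sketch below applies the formula from the table, swap + (overcommit_ratio % of RAM); `commit_limit_kb` is a hypothetical name, 50 is the kernel's default `vm.overcommit_ratio`, and the calculation ignores huge pages, which the kernel excludes from the RAM term. The kernel reports its own figure as `CommitLimit` in `/proc/meminfo`.

```shell
# Hypothetical helper: allocation limit under vm.overcommit_memory=2, in kB.
# $1 = RAM in kB, $2 = swap in kB, $3 = vm.overcommit_ratio (kernel default: 50)
commit_limit_kb() {
    echo $(( $2 + $1 * $3 / 100 ))
}

# Worked example: MemTotal and SwapTotal from the /proc/meminfo sample on this page.
commit_limit_kb 7949804 7949308 50
```

For the sample values this prints 11924210, so with strict accounting and the default ratio, allocations would start failing at about 11.4 GiB of committed memory.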
How the OOM Killer Selects Victims
Each process has an oom_score in /proc/[pid]/oom_score. Higher scores mean the process is more likely to be killed.
The score is based on:
- How much memory the process uses (the main factor)
- How long it has been running (only on kernels before 2.6.36)
- Whether it is a privileged process (some kernel versions give these a small discount)
- Any manual adjustment written to oom_score_adj
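
To see which processes the kernel currently ranks as the most likely victims, the scores can be collected across `/proc`. This is a minimal sketch: `oom_ranking` is a hypothetical name, the optional proc mount point argument is an assumption added for testability, and process names are read from `/proc/[pid]/comm`.

```shell
# Hypothetical helper: rank processes by oom_score, most likely victim first.
# $1 = proc mount point (defaults to /proc).
oom_ranking() {
    for dir in "${1:-/proc}"/[0-9]*; do
        [ -r "$dir/oom_score" ] || continue
        score=$(cat "$dir/oom_score" 2>/dev/null) || continue   # process may exit mid-scan
        name=$(cat "$dir/comm" 2>/dev/null)
        printf '%s\t%s\t%s\n' "$score" "${dir##*/}" "$name"
    done | sort -rn | head -n 10
}

# Example use on a live system:
#   oom_ranking
```

Each output line holds the score, the PID, and the command name, separated by tabs.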

```shell
cat /proc/$(pgrep -o firefox)/oom_score   # check a specific process's score (-o picks the oldest matching PID)

# Protect a critical process from being OOM-killed (-1000 = never kill):
echo -1000 > /proc/PID/oom_score_adj

# Make a process more killable (positive values, up to 1000):
echo 500 > /proc/PID/oom_score_adj
```

Viewing OOM Events

```shell
dmesg -T | grep -i "oom\|killed process"
journalctl -k | grep -i oom
```
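
When the log is long, it can be handy to reduce the matches to just the victims. The sketch below assumes kernel messages containing the phrase "Killed process <pid> (<name>)"; the exact wording differs between kernel versions, so treat the pattern as a starting point rather than a guaranteed match. `oom_victims` is a hypothetical name.

```shell
# Hypothetical helper: print "PID name" for each OOM-kill message read from stdin.
# Assumes lines of the form "... Killed process <pid> (<name>) ..."; lines that do not match are dropped.
oom_victims() {
    sed -n 's/.*Killed process \([0-9][0-9]*\) (\([^)]*\)).*/\1 \2/p'
}

# Example use on a live system:
#   dmesg -T | oom_victims
#   journalctl -k | oom_victims
```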