CPU Stacks: Why they grow down

Any running process has several memory regions: code, read-only data, read-write data, and so on. Some regions, such as code and read-only data, are static and do not change in size over time. Other regions are dynamic: they can expand and shrink. Usually there are two such regions: a dynamic read-write data region, called the heap, and a region called the stack. The heap holds dynamic memory allocations, and the stack is mostly used for keeping function frames.

Optimal layout for two growing regions

Both the stack and the heap can grow. An OS doesn't know in advance which of the two will be used more heavily. Therefore, it must lay out these two memory regions in a way that leaves maximum room for both. Here is the solution:

  1. Lay out static memory regions at the edges of the process's virtual memory
  2. Put the heap and the stack at the edges too, leaving maximum free space in between
  3. Let the stack and the heap grow towards each other: one grows up, the other grows down
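
To see why this layout wastes no space, here is a small toy model in Python. It is only a sketch: the AddressSpace class, its method names and the 1000-unit address range are made up for illustration and do not correspond to any real OS interface.

```python
class AddressSpace:
    """Toy model: heap grows up from the low edge, stack grows down from the high edge."""

    def __init__(self, low, high):
        self.heap_top = low        # first free address above the heap
        self.stack_bottom = high   # lowest address used by the stack

    def free(self):
        """Size of the gap still available to either region."""
        return self.stack_bottom - self.heap_top

    def grow_heap(self, size):
        if size > self.free():
            raise MemoryError("heap would collide with stack")
        self.heap_top += size

    def grow_stack(self, size):
        if size > self.free():
            raise MemoryError("stack would collide with heap")
        self.stack_bottom -= size


# A heap-heavy program and a stack-heavy program both fit in the same space.
heap_heavy = AddressSpace(low=0, high=1000)
heap_heavy.grow_heap(900)
heap_heavy.grow_stack(100)

stack_heavy = AddressSpace(low=0, high=1000)
stack_heavy.grow_stack(900)
stack_heavy.grow_heap(100)
```

Neither region has a fixed limit of its own. Only the sum of the two is bounded, so no split has to be guessed in advance.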

This solution is reflected in hardware design: most CPUs have built-in support for a stack, and it grows down. For example, 32-bit Linux implements this layout in the following way:

  • code, shared libraries and read-only data are mapped at the beginning of the address space
  • then comes the heap, which grows up
  • the kernel is mapped in the last 1G
  • the stack starts at the kernel's lower boundary and grows down

Here's a simplified illustration of the memory map (Windows uses a similar layout):

+---+------+------+------+------+---------------------------+-------+--------+
|   | DLLs | code | data | heap |--->                   <---| stack | kernel |
+---+------+------+------+------+---------------------------+-------+--------+
^                                                                   ^        ^
0                                                                   3G       4G

Note that the first page is never mapped, so that NULL dereferences are caught.
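
On Linux, the real layout of a process can be read from /proc/<pid>/maps. The sketch below parses text in that format and checks that the heap lies below the stack. The parse_maps helper and the sample addresses are illustrative assumptions, chosen to resemble the 32-bit layout described above; actual addresses vary between kernels and are usually randomized.

```python
def parse_maps(text):
    """Return {name: (start, end)} for the named regions in maps-format text."""
    regions = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 6:
            continue  # anonymous mapping: no name in the last column
        start, end = (int(x, 16) for x in fields[0].split("-"))
        # Keep only the first mapping seen for each name.
        regions.setdefault(fields[5], (start, end))
    return regions


# Made-up sample in the /proc/<pid>/maps format (not taken from a real process).
SAMPLE = """\
08048000-08049000 r-xp 00000000 08:01 1234 /bin/demo
0804a000-0806b000 rw-p 00000000 00:00 0 [heap]
bffdf000-c0000000 rw-p 00000000 00:00 0 [stack]
"""

regions = parse_maps(SAMPLE)
heap_start, heap_end = regions["[heap]"]
stack_start, stack_end = regions["[stack]"]
print(f"heap  {heap_start:#x}-{heap_end:#x}")
print(f"stack {stack_start:#x}-{stack_end:#x}")
print("heap lies below stack:", heap_end <= stack_start)
```

To look at a live process on Linux, pass the contents of /proc/self/maps to parse_maps instead of SAMPLE.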

Multi-threaded programs: each thread has its own stack

That is fine for single-threaded programs. Multi-threading requires a separate stack for each thread, because local variables allocated on the stack are private to each thread. The main thread's stack begins at its usual place, at the kernel boundary. The next thread's stack starts at an offset below it, and that offset defines the maximum stack size of the main thread. Thread APIs allow the stack size to be set.

+---+------+------+------+------+-------------+--------+----+--------+--------+
|   | DLLs | code | data | heap |-->        <-| stack2 |  <-| stack1 | kernel |
+---+------+------+------+------+-------------+--------+----+--------+--------+
^                                                                    ^        ^
0                                                                    3G       4G
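
In the simplified model shown in the diagram, the placement of each stack is plain arithmetic. The sketch below assumes the 3G kernel boundary from the diagram and a hypothetical fixed reservation of 1M per thread. The stack_top helper is made up for illustration, and real thread libraries are free to place stacks elsewhere.

```python
KERNEL_BASE = 3 * 2**30   # the 3G boundary from the diagram
STACK_SIZE = 1 * 2**20    # assumed reservation of 1M per thread


def stack_top(thread_index, stack_size=STACK_SIZE):
    """Address where the stack of thread 0 (main), 1, 2, ... starts."""
    return KERNEL_BASE - thread_index * stack_size


for i in range(3):
    print(f"thread {i}: stack starts at {stack_top(i):#x} and grows down")
```

Each stack can grow only until it reaches the start of the next one, which is why the offset between them acts as the maximum stack size.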

Many threads can consume a lot of virtual address space, which can be a problem on a 32-bit machine. For example, a program with 2000 threads, each with a default stack size of 1M, eats about 2G of virtual memory, leaving very little space for the heap. In such a case, the thread stack size should be reduced.
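
The numbers above are easy to check, and the same snippet shows one way to ask for smaller stacks. Python's threading module is used here only as an example of a thread API. The 256K figure is an arbitrary choice for illustration, and stacks_total is a hypothetical helper.

```python
import threading

MIB = 2**20
GIB = 2**30


def stacks_total(threads, stack_size):
    """Virtual address space reserved for thread stacks, in bytes."""
    return threads * stack_size


print(stacks_total(2000, 1 * MIB) / GIB)     # 1.953125, i.e. about 2G
print(stacks_total(2000, 256 * 1024) / GIB)  # 0.48828125, i.e. about 0.5G

# Request 256K stacks for threads created from now on.
# Some platforms reject the request, so failures are tolerated here.
try:
    threading.stack_size(256 * 1024)
except (RuntimeError, ValueError):
    pass

worker = threading.Thread(target=lambda: None)
worker.start()
worker.join()
```

A smaller stack limits how deep calls can nest and how large local variables can be, so the value should be chosen with the program's actual stack usage in mind.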

Can a stack grow up? Yes!

In the majority of modern architectures, the stack grows down. However, there are some processors (e.g. B5000) where the stack grows up. Some architectures (e.g. System Z, RCA1802A) allow the stack direction to be chosen. SEAforth/GreenArrays and SPARC processors have a cyclical stack.
