As I am going to play with the VMA code and its main data structure (a red-black tree), I have started to collect some data using systemtap, and was a bit impressed by the number of times find_vma() is called by some processes, as the table below shows.
| PID | exec name | nr calls (5 seconds) | doing… |
|---|---|---|---|
| 21984 | cc1 | 1570 | Compiling Linux kernel |
| 11533 | X | 13252 | Switching work spaces |
| 23554 | git-index-pack | 42213 | Resolving deltas |
So, in this specific case gcc has searched its VMA tree 1570 times in five seconds while (probably) compiling some C file. Is that a big number? I am not sure, but X, for example, calls find_vma() more than 2600 times per second if I switch quickly between work spaces in Window Maker. I did not expect this.
Git is the winner so far, with more than 8000 searches per second while resolving deltas in a clone operation.
I also wanted to know which system calls that may search (or change) the VMA tree are called most often, so here are some more numbers:
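The per-second figures quoted above are just the five-second totals from the table divided by five. A minimal sketch of that arithmetic, assuming integer division is good enough here (the helper name `calls_per_second` is made up for this illustration):

```c
/* Hypothetical helper: average number of calls per second over a
 * sampling window, rounded down. Returns 0 for an empty window. */
unsigned long calls_per_second(unsigned long total_calls, unsigned int seconds)
{
    if (seconds == 0)
        return 0;
    return total_calls / seconds;
}
```

With the totals from the table, this gives 314 calls per second for cc1, 2650 for X and 8442 for git-index-pack.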
| system call | nr calls (5 seconds) |
|---|---|
| sys_mmap2 | 11039 |
| sys_munmap | 5102 |
| sys_brk | 3574 |
| sys_execve | 392 |
| sys_clone | 255 |
| sys_vfork | 82 |
This snapshot was taken while compiling the Linux kernel. While the numbers are no surprise, there are two interesting functions in that table: brk() and vfork(). I thought that their usage would be very rare these days: I assumed users of brk() had moved to mmap(), and I was wondering whether it makes sense to use vfork() in a modern kernel like Linux, with very fast process creation and copy-on-write support.
But a quick look at the code shows that sys_vfork() passes the CLONE_VM flag to do_fork(), which in turn avoids the creation of a new address space for the child. In other words, parent and child share the same mm_struct (as threads do). A regular fork() still has to duplicate the VMAs and page tables even with copy on write, and that is what makes vfork() faster.
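To make the user-space side of this concrete, here is a minimal sketch of the usual vfork() pattern: the child borrows the parent's address space just long enough to call exec. The helper name `run_with_vfork` and its return convention are made up for this illustration; it is not what gcc or make actually do internally.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run the program at `path` with the given argv
 * using vfork() + execv(). Returns the child's exit status, or -1 if
 * the child could not be created or did not exit normally. */
int run_with_vfork(const char *path, char *const argv[])
{
    int status;
    pid_t pid = vfork();

    if (pid < 0)
        return -1;              /* vfork() itself failed */

    if (pid == 0) {
        /* Child: it runs in the parent's address space, so it must not
         * modify memory or return from this function. It may only call
         * exec or _exit(). */
        execv(path, argv);
        _exit(127);             /* exec failed */
    }

    /* Parent: it was suspended until the child called exec or _exit(). */
    if (waitpid(pid, &status, 0) < 0)
        return -1;

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling it with `/bin/sh` and the arguments `sh -c "exit 3"` should return 3. The exit code 127 for a failed exec is borrowed from the shell convention and is only an assumption of this sketch.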
Very cool stuff. I am getting good at choosing projects.