Initial VMA data and vfork()
As I am going to play with the VMA code and its main data structure (a red-black tree), I have started collecting some data with SystemTap, and I was a bit impressed by the number of times find_vma() is called by some processes, as the table below shows.
| PID | exec name | nr calls (5 seconds) | doing… |
| 21984 | cc1 | 1570 | Compiling Linux kernel |
| 11533 | X | 13252 | Switching work spaces |
| 23554 | git-index-pack | 42213 | Resolving deltas |
So, in this specific case, gcc (cc1) searched its VMA tree 1570 times in five seconds while (probably) compiling some C file. Is that a big number? I am not sure, but X, for example, calls find_vma() more than 2600 times per second if I switch quickly between work spaces in Window Maker. I did not expect this.
Git is the winner so far: more than 8000 searches per second while resolving deltas in a clone operation.
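The per-second figures quoted above are just the five-second counts from the table divided by the sampling window. A small sketch of that arithmetic follows; the names `WINDOW_SECONDS`, `calls_in_window` and `calls_per_second` are mine, chosen for illustration, and are not part of any SystemTap output.

```python
# Five-second find_vma() call counts, copied from the table above.
WINDOW_SECONDS = 5
calls_in_window = {
    "cc1": 1570,
    "X": 13252,
    "git-index-pack": 42213,
}


def calls_per_second(calls, window=WINDOW_SECONDS):
    """Average call rate over the sampling window."""
    return calls / window


if __name__ == "__main__":
    for name, calls in calls_in_window.items():
        rate = calls_per_second(calls)
        print(f"{name}: about {rate:.0f} find_vma() calls per second")
```

These are averages over the whole window, so short bursts may well be higher than the numbers printed here.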
I also wanted to know which of the system calls that may search (or change) the VMA tree are called most often, and here are some more numbers:
| system call | nr calls (5 seconds) |
| sys_mmap2 | 11039 |
| sys_munmap | 5102 |
| sys_brk | 3574 |
| sys_execve | 392 |
| sys_clone | 255 |
| sys_vfork | 82 |
This snapshot was taken while compiling the Linux kernel. While the numbers are no surprise, there are two interesting functions in that table: brk() and vfork(). I thought that their usage would be very rare these days: I assumed users of brk() had moved to mmap(), and I was wondering whether it makes sense to use vfork() on a modern kernel like Linux, with very fast process creation and copy-on-write support.
But a quick look at the code shows that sys_vfork() passes the CLONE_VM flag to do_fork(), which in turn avoids the creation of a new address space for the child. In other words, parent and child share the same mm_struct, and that makes vfork() faster than a regular fork() (threads also share the same mm_struct).
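The difference between sharing an address space and getting a private one can be observed from user space. The sketch below is only an analogy, not a vfork() demonstration: as far as I know Python's os module does not expose vfork(), so it contrasts a thread (which shares the parent's memory, as mentioned above) with a fork()ed child (which gets its own copy-on-write address space). The names `state`, `mutate`, `value_after_thread` and `value_after_fork` are hypothetical, and the code assumes a Unix-like system where os.fork() is available.

```python
import os
import threading

# A value living in the parent's address space.
state = {"value": 0}


def mutate():
    """Write to the value; who sees the write depends on memory sharing."""
    state["value"] = 42


def value_after_thread():
    """Run mutate() in a thread, which shares the parent's address space."""
    state["value"] = 0
    worker = threading.Thread(target=mutate)
    worker.start()
    worker.join()
    return state["value"]


def value_after_fork():
    """Run mutate() in a fork()ed child, which writes to its own copy."""
    state["value"] = 0
    pid = os.fork()
    if pid == 0:
        # Child: the write triggers copy-on-write and stays in the child.
        mutate()
        os._exit(0)
    # Parent: wait for the child, then read our own (untouched) copy.
    os.waitpid(pid, 0)
    return state["value"]


if __name__ == "__main__":
    print("after thread:", value_after_thread())
    print("after fork:  ", value_after_fork())
```

The thread's write is visible to the parent, while the forked child's write is not. A vfork() child sits on the "shared" side of that line, which is also why it is expected to do little more than call exec or _exit.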
Very cool stuff. I am getting good at choosing projects.