The man page for mlockall on my kernel 3.0 says:
mlockall() locks all pages mapped into the address space of the calling process. This includes the pages of the code, data and stack segment, as well as shared libraries, user space kernel data, shared memory, and memory-mapped files. All mapped pages are guaranteed to be resident in RAM when the call returns successfully; the pages are guaranteed to stay in RAM until later unlocked.
and later says
Real-time processes that are using mlockall() to prevent delays on page faults should reserve enough locked stack pages before entering the time-critical section, so that no page fault can be caused by function calls. This can be achieved by calling a function that allocates a sufficiently large automatic variable (an array) and writes to the memory occupied by this array in order to touch these stack pages. This way, enough pages will be mapped for the stack and can be locked into RAM. The dummy writes ensure that not even copy-on-write page faults can occur in the critical section.
I understand that this system call can't guess the maximum stack size that will be reached, and so can't lock pages for the whole stack in advance. But why does the first part of the man page quoted above say that locking is also done for the stack? Is there an error in the man page, or does it just mean that the locking is done for the initial stack size?
Yes, locking is done for the current stack pages, but not for all possible future stack pages.
It's explained by that first sentence:
mlockall() locks all pages mapped into the address space of the calling process.
So if a page is mapped, it will be locked. If not, it won't.
It just mentions the stack in the original sentence because the stack memory is mapped separately from the heap memory. There's no special treatment for the stack: if it's mapped it'll be locked, otherwise it won't. So, as the second section you quote says, it's important to grow the stack to the maximum size it will reach whilst your code is running before you call mlockall.
Actually, from a quick reading of the mm/mlock.c source code, I'd say it simply locks everything: all currently mapped pages.
    static int do_mlockall(int flags)
    {
        struct vm_area_struct * vma, * prev = NULL;
        unsigned int def_flags = 0;

        if (flags & MCL_FUTURE)
            def_flags = VM_LOCKED;
        current->mm->def_flags = def_flags;
        if (flags == MCL_FUTURE)
            goto out;

        for (vma = current->mm->mmap; vma ; vma = prev->vm_next) {
            vm_flags_t newflags;

            newflags = vma->vm_flags | VM_LOCKED;
            if (!(flags & MCL_CURRENT))
                newflags &= ~VM_LOCKED;

            /* Ignore errors */
            mlock_fixup(vma, &prev, vma->vm_start, vma->vm_end, newflags);
        }
    out:
        return 0;
    }
Despite what larsmans said, I do think it also applies to all future pages if MCL_FUTURE is also specified. In that case current->mm->def_flags is updated to include VM_LOCKED, so mappings created later inherit the flag.
If MCL_FUTURE is specified, it will lock in future pages after they fault in. But it can't predict which pages will be accessed in the future and lock them in now. So if you're trying to avoid a future hard page fault, you must make sure the page is mapped at the time you call mlockall. Otherwise, each page can fault once as it is made resident - David Schwartz 2012-04-07 04:05
mlockall will lock at least that 128k, which is more than any reasonable program will ever need - R.. 2012-04-05 00:56