Introspection: Exploring memory layout
Moderate
In the final exercise, we will create a user-space program named introspection.c. The goal of this exercise is to compare the memory layout of a parent process with that of a child process. In addition to information about the memory layout of the processes, we are also interested in the values at these memory locations. Furthermore, the child process should not print anything itself but will share its layout information with the parent via a UNIX pipe.
Task
Your program should print the locations of:
- the stack
- the heap
- the .text section (the code of your program)
- the .data section (the global variables of your program)
Use the following data structures to store addresses and values:
struct memlayout {
    void* text;
    int* data;
    int* stack;
    int* heap;
};

struct memvalues {
    int data;
    int stack;
    int heap;
};
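The two structs mirror each other: each field of struct memvalues holds the int stored at the matching address in struct memlayout (text has no counterpart, since it points to code rather than to an int). A minimal sketch of that relationship, assuming every pointer in the layout is valid; the helper name snapshot_values is hypothetical and not part of the assignment:

```c
struct memlayout {
    void *text;
    int *data;
    int *stack;
    int *heap;
};

struct memvalues {
    int data;
    int stack;
    int heap;
};

/* Hypothetical helper: copies the ints that the layout's pointers refer to.
 * Assumes layout->data, layout->stack and layout->heap are all valid. */
struct memvalues snapshot_values(const struct memlayout *layout)
{
    struct memvalues values;

    values.data = *layout->data;
    values.stack = *layout->stack;
    values.heap = *layout->heap;
    return values;
}
```

Because the values are copied, a struct memvalues stays meaningful after it has been sent to another process, whereas the pointers in a struct memlayout can only be dereferenced in the process that produced them.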
Your program will need to perform the following steps:
- Initialize a struct memlayout with the addresses from the corresponding program sections in each field of the struct (cf. hints below).
- Ensure that all allocated variables are initialized.
- Set up a pipe, then fork a new process so that the pipe is shared between the parent and child.
- In the child process:
  - Initialize a struct memlayout in the same way as in the previous exercise. You should reuse the stack and data variables but create a new heap allocation.
  - Ensure that all allocated variables are initialized (use a different value than in the parent process).
  - Copy the values into a struct memvalues.
  - Send the struct memlayout followed by the struct memvalues to the parent via the pipe.
- In the parent process:
  - Receive the struct memlayout and struct memvalues from the child process.
  - Print these structs using print_mem (see below).
  - Initialize a struct memvalues with the values from the parent process.
  - Print the structs from the parent process using print_mem.
// who should be either "parent" or "child"
void print_mem(const char* who, struct memlayout* layout, struct memvalues* values) {
    printf("%s:stack:%p:%d\n", who, layout->stack, values->stack);
    printf("%s:heap:%p:%d\n", who, layout->heap, values->heap);
    printf("%s:data:%p:%d\n", who, layout->data, values->data);
    printf("%s:text:%p\n", who, layout->text);
}
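The pipe transfer in the steps above can be sketched as follows. This is not a reference solution: it assumes a POSIX host rather than xv6 (xv6 has no waitpid() or _exit(), so its own headers plus wait() and exit() would take their place), it fills in only the data field for brevity, and the names global_value, write_all, read_all and receive_child_report are hypothetical. The loops handle the case where read() or write() transfers fewer bytes than requested.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

struct memlayout { void *text; int *data; int *stack; int *heap; };
struct memvalues { int data; int stack; int heap; };

int global_value = 1; /* value as set in the parent */

/* Write exactly n bytes; returns 0 on success, -1 on error. */
static int write_all(int fd, const void *buf, size_t n)
{
    const char *p = buf;
    while (n > 0) {
        ssize_t w = write(fd, p, n);
        if (w <= 0)
            return -1;
        p += w;
        n -= (size_t)w;
    }
    return 0;
}

/* Read exactly n bytes; returns 0 on success, -1 on error or early EOF. */
static int read_all(int fd, void *buf, size_t n)
{
    char *p = buf;
    while (n > 0) {
        ssize_t r = read(fd, p, n);
        if (r <= 0)
            return -1;
        p += r;
        n -= (size_t)r;
    }
    return 0;
}

/* Forks a child that reports its layout and values through a pipe.
 * Returns 0 on success, -1 if any call fails. */
int receive_child_report(struct memlayout *layout, struct memvalues *values)
{
    int fds[2];
    if (pipe(fds) < 0) /* create the pipe before fork() so both ends are inherited */
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: prints nothing, only writes to the pipe. */
        struct memlayout child_layout = {0};
        struct memvalues child_values = {0};

        close(fds[0]);    /* the child does not read */
        global_value = 2; /* a different value than in the parent */
        child_layout.data = &global_value;
        child_values.data = *child_layout.data;

        int failed =
            write_all(fds[1], &child_layout, sizeof(child_layout)) < 0 ||
            write_all(fds[1], &child_values, sizeof(child_values)) < 0;
        close(fds[1]);
        _exit(failed ? 1 : 0);
    }

    /* Parent: receive the two structs in the order they were sent. */
    int rc = 0;
    close(fds[1]); /* the parent does not write */
    if (read_all(fds[0], layout, sizeof(*layout)) < 0 ||
        read_all(fds[0], values, sizeof(*values)) < 0)
        rc = -1;
    close(fds[0]);
    if (waitpid(pid, NULL, 0) < 0)
        rc = -1;
    return rc;
}
```

Sending the raw bytes of each struct is acceptable here because the sender and receiver run the same binary and therefore agree on struct size and padding.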
Before you begin the implementation, take a moment to consider what the output of your program will be. Try to predict the relationship between the addresses and values of variables in the parent and child processes. Compare the printed addresses with Figure 3.4 from the xv6 book.
Hints
You can use the methods below to find the addresses of each section:
- To find an address on the stack, declare a local variable (e.g., of type int) within a function. This variable will always be allocated on the call stack of your process, so its address will be a stack address.
- To find an address on the heap, allocate an int using sbrk(sizeof(int)). The return value of this system call provides the previous value of the program break, which is the address of the new allocation.
- To find an address in the .text section, store the address of a function. You can obtain the address of a function by simply using its name (without parentheses): void* function_address = (void*)function_name;
- To find an address in the .data section, declare a global variable (of type int) and retrieve the address of this variable.
The methods described above will return an address from a specific section of the program. These addresses are not necessarily the start or end addresses of these sections, but they will always fall within the range [section start addr, section end addr]. For this exercise, it is not necessary to provide the start address of a section; any address within the section range is sufficient.
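The four hints can be combined into one small routine. The sketch below assumes a POSIX host (on xv6, the usual user-space headers would replace the includes, and the _DEFAULT_SOURCE line is only there so that glibc declares sbrk()). The names global_value and fill_layout are hypothetical. The stack variable is passed in by the caller so that its address remains valid after the helper returns.

```c
#define _DEFAULT_SOURCE /* makes glibc's <unistd.h> declare sbrk() */
#include <stddef.h>
#include <unistd.h>

struct memlayout {
    void *text;
    int *data;
    int *stack;
    int *heap;
};

/* An initialized global variable, used for the .data address. */
int global_value = 42;

/* Hypothetical helper: fills *layout with one address per section.
 * stack_var must point to a local variable owned by the caller.
 * Returns 0 on success, -1 if sbrk() fails. */
int fill_layout(struct memlayout *layout, int *stack_var)
{
    /* sbrk() returns the previous program break, i.e. the start of the
     * newly allocated bytes, or (void *)-1 on failure. */
    void *previous_break = sbrk(sizeof(int));
    if (previous_break == (void *)-1)
        return -1;

    layout->text = (void *)fill_layout; /* function name without parentheses */
    layout->data = &global_value;
    layout->stack = stack_var;
    layout->heap = (int *)previous_break;

    *layout->heap = 7; /* initialize the new heap allocation */
    return 0;
}
```

A caller would declare an initialized local int, for example in main, and pass its address as stack_var.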
Attention points and common mistakes
- Always check the return values of system calls and standard library functions for potential error codes. In this exercise, both pipe() and fork() can return an error code.
- Never hardcode pointer addresses in your application (e.g., as hexadecimal numbers like 0xbadc0debadc0de). Always use the runtime return value of sbrk() as a pointer to an address on the heap. Even if you were to hardcode addresses that you looked up with gdb, this does not guarantee correctness in subsequent executions of your program! The operating system is free to load your application at a different address (even if this is not the case for xv6).
- Never return the address of a local, stack-allocated variable in C. When you read from or write to this memory after the function call, that stack memory will have been freed and may even be overwritten by another function call.
- There is a significant difference between sizeof(struct memlayout) and sizeof(struct memlayout*). The first gives the size in bytes of a struct memlayout, while the second gives the size in bytes of a pointer.
- Uninitialized variables in C have an undefined value. This means that even if you observe different values in practice, the C standard never guarantees that an uninitialized variable has a different value than another variable.