Test project and variables
Recently a curious developer asked: where are global variables initialized? To explain the mechanism I created a simple project based on a Renesas RH850 and examined the startup process. In main.c I defined these variables:
int counter, accumulator = 0, limit_value = 1000000;
unsigned char str_aa55[2] = {0xAA, 0x55};
unsigned int int_11223344 = 0x11223344;
unsigned int int_55667788 = 0x55667788;
int bss_val;

void main(void)
{
}
I ran a simulation and, as expected, these variables were initialized before main, i.e. they were automatically assigned. How does that happen? MCU startup code is usually short, even in assembly, so inspecting it may reveal the answer.
RH850 startup assembly
-- Clear local RAM
  mov   ___ghs_ramstart, r6   -- start of local RAM
  mov   ___ghs_ramend, r7     -- end of local RAM
  mov   r0, r1
1:
  st.dw r0, 0[r6]
  addi  8, r6, r6
  cmp   r7, r6
  bl    1b

-- Jump to the HW initialisation function
  jarl  ___lowinit, lp

-- Jump to the initialisation functions of the library
-- and from there to main()
  jr    __start
The first part of this assembly clears local RAM. In short: RAM is zeroed -> ___lowinit is executed -> __start is executed -> main is entered. Since RAM is cleared first, every global variable is zero at that point, so the nonzero initial values must be assigned in ___lowinit or __start. Both functions are provided by the toolchain libraries, and their source was not visible in this RH850 project. By watching the variable values while stepping through the simulation, I traced the initialization of the globals to __start. Knowing where the initialization happens still leaves the question of how it is done, so I continued investigating.
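The clear loop at the top of the RH850 startup code can be read as the C sketch below. This is only an illustration that runs on a host PC: the names clear_ram and demo_clear_ram are hypothetical, an ordinary array stands in for local RAM, and the end address is treated as exclusive because the loop stops as soon as r6 is no longer below r7.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical C rendering of the clear loop. `start` and `end` stand in
 * for ___ghs_ramstart and ___ghs_ramend; `end` is treated as exclusive. */
void clear_ram(uint64_t *start, const uint64_t *end)
{
    uint64_t *p = start;
    do {
        *p = 0;         /* st.dw r0, 0[r6]: store 64 bits of zero   */
        p++;            /* addi 8, r6, r6: advance by 8 bytes       */
    } while (p < end);  /* cmp r7, r6 / bl 1b: repeat while r6 < r7 */
}

/* Host-side demonstration on an array that plays the role of local RAM. */
void demo_clear_ram(void)
{
    uint64_t fake_ram[4] = { 1, 2, 3, 4 };

    clear_ram(fake_ram, fake_ram + 4);

    for (int i = 0; i < 4; i++) {
        assert(fake_ram[i] == 0);
    }
}
```

Like the assembly, the sketch stores before it compares, so it always writes at least one 8-byte unit.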
Trying another toolchain: NXP S32K
I created an NXP S32K1xx project with the same global variables:
int counter, accumulator = 0, limit_value = 1000000;
unsigned char str_aa55[2] = {0xAA, 0x55};
unsigned int int_11223344 = 0x11223344;
unsigned int int_55667788 = 0x55667788;
int bss_val;
Inspecting the startup code shows a direct call to an initializer:
/* Init .data and .bss sections */
    ldr     r0, =init_data_bss
    blx     r0
Here the startup code loads the address of init_data_bss and branches to it, and that function is what initializes the global variables. Here is a portion of it:
void init_data_bss(void)
{
    /* ...... */

    /* Data */
    data_ram     = (uint8_t *)__DATA_RAM;
    data_rom     = (uint8_t *)__DATA_ROM;
    data_rom_end = (uint8_t *)__DATA_END;
    /* ...... */

    /* BSS */
    bss_start = (uint8_t *)__BSS_START;
    bss_end   = (uint8_t *)__BSS_END;
    /* ...... */

    /* Copy initialized data from ROM to RAM */
    while (data_rom_end != data_rom)
    {
        *data_ram = *data_rom;
        data_ram++;
        data_rom++;
    }
    /* ...... */

    /* Clear the zero-initialized data section */
    while (bss_end != bss_start)
    {
        *bss_start = 0;
        bss_start++;
    }
    /* ...... */
}
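To watch the copy and clear loops work without a target board, the same logic can be run on a host PC against plain arrays. The sketch below is an assumption-laden stand-in, not the vendor function: init_data_bss_sketch and demo_init_data_bss are hypothetical names, and the arrays fake_rom, fake_data_ram and fake_bss replace the regions that the linker symbols describe on the real device.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stand-ins for the memory regions (assumptions for a host run).
 * The first four ROM bytes are 1000000 in little-endian order. */
static const uint8_t fake_rom[8] = { 0x40, 0x42, 0x0F, 0x00,
                                     0xAA, 0x55, 0x00, 0x00 };
static uint8_t fake_data_ram[8];
static uint8_t fake_bss[4];

/* Same two loops as the excerpt, with the region bounds passed in. */
void init_data_bss_sketch(const uint8_t *data_rom, const uint8_t *data_rom_end,
                          uint8_t *data_ram,
                          uint8_t *bss_start, const uint8_t *bss_end)
{
    /* Copy initialized data from "ROM" to "RAM" */
    while (data_rom_end != data_rom) {
        *data_ram++ = *data_rom++;
    }
    /* Clear the zero-initialized region */
    while (bss_end != bss_start) {
        *bss_start++ = 0;
    }
}

void demo_init_data_bss(void)
{
    /* Fill RAM with garbage first, as it might look after power-up. */
    memset(fake_data_ram, 0xFF, sizeof fake_data_ram);
    memset(fake_bss, 0xFF, sizeof fake_bss);

    init_data_bss_sketch(fake_rom, fake_rom + sizeof fake_rom,
                         fake_data_ram,
                         fake_bss, fake_bss + sizeof fake_bss);

    assert(memcmp(fake_data_ram, fake_rom, sizeof fake_rom) == 0);
    for (size_t i = 0; i < sizeof fake_bss; i++) {
        assert(fake_bss[i] == 0);
    }
}
```

Passing the bounds as parameters is only a convenience for the host run; on the target they come from the linker-defined symbols shown in the excerpt.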
Where do the initial values come from?
The initialized values in the RAM data section are copied from the ROM data area defined by __DATA_ROM. What is __DATA_ROM and where does it come from? It is defined by the linker script. For example:
/* Specify the memory areas */
MEMORY
{
  /* ... */
  /* SRAM_L */
  m_data   (RW) : ORIGIN = 0x1FFF8000, LENGTH = 0x00008000
  m_data_2 (RW) : ORIGIN = 0x20000000, LENGTH = 0x00007000
  /* ... */
}

.data : AT(__DATA_ROM)
{
  . = ALIGN(4);
  __DATA_RAM = .;
  __data_start__ = .;      /* Create a global symbol at data start. */
  *(.data)                 /* .data sections */
  *(.data*)                /* .data* sections */
  . = ALIGN(4);
  __data_end__ = .;        /* Define a global symbol at data end. */
} > m_data

__DATA_END = __DATA_ROM + (__data_end__ - __data_start__);
__CODE_ROM = __DATA_END;   /* Symbol is used by code initialization. */

/* Uninitialized data section. */
.bss :
{
  /* This is used by the startup in order to initialize the .bss section. */
  . = ALIGN(4);
  __BSS_START = .;
  __bss_start__ = .;
  *(.bss)
  *(.bss*)
  *(COMMON)
  . = ALIGN(4);
  __bss_end__ = .;
  __BSS_END = .;
} > m_data_2
To summarize: global variables that have nonzero initial values are placed in the .data section, while uninitialized globals (or those initialized to zero) are placed in the .bss section.
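Seen from the C side, this split is what delivers the language guarantee that objects with static storage duration and no explicit initializer start out as zero. The sketch below declares some of the test variables and checks their values; check_static_initial_values is a hypothetical helper name, and the section comments reflect the placement that the map file of this article's S32K build reports. Another compiler or option set may place them differently.

```c
#include <assert.h>

/* counter and bss_val have no initializer: zero by the C standard.
 * In this article's build they are COMMON symbols gathered into .bss. */
int counter;
int bss_val;

/* Explicitly initialized to zero: placed in .bss in this article's build. */
int accumulator = 0;

/* Nonzero initial values: placed in .data, with the values kept in ROM. */
int limit_value = 1000000;
unsigned char str_aa55[2] = { 0xAA, 0x55 };

/* Call this before anything has modified the variables. */
void check_static_initial_values(void)
{
    assert(counter == 0);
    assert(bss_val == 0);
    assert(accumulator == 0);
    assert(limit_value == 1000000);
    assert(str_aa55[0] == 0xAA);
    assert(str_aa55[1] == 0x55);
}
```

On a hosted system the program loader sets up these values; on a bare-metal MCU the startup code has to do it before main runs.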
Confirming with the map file and image
The map file shows variable names and their addresses or sections:
.data   0x1fff8400   0x42c   load address 0x000009cc
    . = ALIGN (0x4)
    0x1fff8400   __DATA_RAM = .
    __data_start__ = .
    *(.data)
    *(.data*)
    .data.limit_value    0x1fff8400   0x4   ./src/main.o
        limit_value
    .data.str_aa55       0x1fff8404   0x2   ./src/main.o
        str_aa55
    *fill*               0x1fff8406   0x2
    .data.int_11223344   0x1fff8408   0x4   ./src/main.o
        int_11223344
    .data.int_55667788   0x1fff840c   0x4   ./src/main.o
        int_55667788

.bss    0x20000000   0x28
    . = ALIGN (0x4)
    0x20000000   __BSS_START = .
    __bss_start__ = .
    *(.bss)
    *(.bss*)
    .bss.accumulator     0x2000001c   0x4   ./src/main.o
        accumulator
    *(COMMON)
    COMMON               0x20000020   0x8   ./src/main.o
        bss_val
        counter
    0x20000028   . = ALIGN (0x4)
    __bss_end__ = .
    __BSS_END = .

__DATA_ROM = .   0x000009cc
From the map file we can see that __DATA_ROM corresponds to address 0x000009cc, so the initial values for variables such as
int limit_value = 1000000;
unsigned char str_aa55[2] = {0xAA, 0x55};
unsigned int int_11223344 = 0x11223344;
unsigned int int_55667788 = 0x55667788;
are stored at that ROM address. You can then inspect the generated hex file to confirm the binary data.
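To know where to look in the hex file, each RAM address from the map file can be translated to its ROM address by adding its offset within .data to __DATA_ROM. The sketch below does that arithmetic with the addresses listed above. The helper names are hypothetical, the byte order assumes a little-endian target (which the Arm Cortex-M cores in the S32K1xx are), and the end-of-image calculation assumes that the .data size of 0x42c equals __data_end__ - __data_start__.

```c
#include <assert.h>
#include <stdint.h>

#define DATA_RAM_BASE 0x1fff8400u  /* __DATA_RAM, from the map file          */
#define DATA_ROM_BASE 0x000009ccu  /* __DATA_ROM, load address in the map    */
#define DATA_SIZE     0x0000042cu  /* size of .data, from the map file       */

/* Translate a run-time (RAM) address inside .data to its load (ROM) address. */
uint32_t data_load_address(uint32_t ram_addr)
{
    return DATA_ROM_BASE + (ram_addr - DATA_RAM_BASE);
}

/* Split a 32-bit value into the byte order a little-endian image stores. */
void put_le32(uint8_t out[4], uint32_t value)
{
    out[0] = (uint8_t)(value & 0xFFu);
    out[1] = (uint8_t)((value >> 8) & 0xFFu);
    out[2] = (uint8_t)((value >> 16) & 0xFFu);
    out[3] = (uint8_t)((value >> 24) & 0xFFu);
}

void demo_rom_layout(void)
{
    uint8_t bytes[4];

    assert(data_load_address(0x1fff8400u) == 0x000009ccu); /* limit_value  */
    assert(data_load_address(0x1fff8404u) == 0x000009d0u); /* str_aa55     */
    assert(data_load_address(0x1fff8408u) == 0x000009d4u); /* int_11223344 */
    assert(data_load_address(0x1fff840cu) == 0x000009d8u); /* int_55667788 */

    /* __DATA_END = __DATA_ROM + (__data_end__ - __data_start__) */
    assert(DATA_ROM_BASE + DATA_SIZE == 0x00000df8u);

    /* limit_value = 1000000 = 0x000F4240, stored as 40 42 0F 00 */
    put_le32(bytes, 1000000u);
    assert(bytes[0] == 0x40 && bytes[1] == 0x42 &&
           bytes[2] == 0x0F && bytes[3] == 0x00);
}
```

Under these assumptions, the hex file should show 40 42 0F 00 at 0x9cc, AA 55 at 0x9d0, 44 33 22 11 at 0x9d4 and 88 77 66 55 at 0x9d8.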

Conclusion
The mechanism is straightforward: the linker stores the initial values for .data in ROM, starting at __DATA_ROM. At startup, the code copies that ROM image into RAM and clears the .bss region. The linker script statement .data : AT(__DATA_ROM) gives the .data section a load address of __DATA_ROM while its run-time addresses stay in RAM (m_data), and the startup code performs the copy at runtime.
Understanding linker scripts, the generated map file, and startup code is essential for tracing low-level initialization and debugging boot issues on MCUs. A systematic review of these topics is valuable when diagnosing platform or toolchain behavior.