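(* arch/arm/boot/bootp/init.S: some people say this file runs first. It is not yet clear to me under what circumstances it is executed; on my platform it is not.)
The quickest way to understand how a program is organized is to start with its directory layout and Makefiles, and the kernel is no exception. The ARM boot code lives in arch/arm/boot.
Linux kernel images broadly come in two kinds, uncompressed and compressed; the uncompressed format is rarely used in practice.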
Common Linux kernel image formats

Format | Description |
---|---|
zImage | zipped (compressed) image |
xipImage | execute-in-place image |
uImage | U-Boot wrapped image |
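On ARM platforms zImage is the most common format. The Makefile in the boot directory shows that the zImage-related code lives in arch/arm/boot/compressed, and the Makefile in the compressed directory shows that zImage is built mainly from head.o + misc.o + piggy.o plus some platform-specific code. The most important files are:

File | Description |
---|---|
head.o | kernel entry point |
misc.o | decompression helper routines |
piggy.o | the compressed kernel image |
*.o | other platform-specific code |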
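According to Documentation/arm/booting.txt, the boot loader must meet the following conditions before calling the kernel:
- Quiesce all DMA-capable devices, so that memory is not corrupted by stray network packets or disk data. This will save you many hours of debugging.
- CPU register settings
  - r0 = 0
  - r1 = machine type ID
  - r2 = physical address of the tagged list in RAM
- CPU mode
  - All interrupts must be disabled (IRQs and FIQs)
  - The CPU must be in SVC mode (Angel is a special exception)
- Caches, MMUs
  - The MMU must be off
  - The instruction cache (I-cache) may be on or off
  - The data cache (D-cache) must be off
- The boot loader is expected to call the kernel image by jumping directly to its first instruction.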
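The files run in the following order, and roughly do the following:
- arch/arm/boot/compressed/head.S
  - After the jump from the boot loader, execution begins at the start: label.
  - To keep the ability to boot without a boot loader, it first executes eight nops, which reserves space for the ARM interrupt vector table.
  - Save the values of r1 and r2 in r7 and r8.
  - Disable FIQ and IRQ.
  - Run the platform-specific code that is linked in here, such as "head-xscale.S".
  - Compute the offset in memory and fix up each section.
  - Set up the C runtime environment.
  - Set up and enable the MMU.
  - Check the compressed image size.
  - Load the arguments into r0 to r3 and call decompress_kernel in arch/arm/boot/compressed/misc.c.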
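- arch/arm/boot/compressed/misc.c
  - This file mainly provides the decompress_kernel function for arch/arm/boot/compressed/head.S to call.
  - It defines functions for printing characters and strings; decompress_kernel uses them to print progress messages while decompressing the kernel.
  - It calls the gunzip function in lib/inflate.c to decompress the kernel image.
  - It returns to arch/arm/boot/compressed/head.S.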
- arch/arm/kernel/head.S
  - Identify the processor and machine ID
- arch/arm/kernel/head-common.S
  - Error report routine for an unrecognized machine ID (when low-level debug is enabled)
- init/main.c
  - start_kernel
Detailed trace:
arch/arm/boot/compressed/head.S

                    .section ".start", #alloc, #execinstr
    /*
     * sort out different calling conventions
     */
                    .align
    start:
                    .type   start,#function
                    .rept   8
                    mov     r0, r0
                    .endr

                    b       1f
                    .word   0x016f2818              @ Magic numbers to help the loader
                    .word   start                   @ absolute load/run zImage address
                    .word   _edata                  @ zImage end address
    1:              mov     r7, r1                  @ save architecture ID
                    mov     r8, r2                  @ save atags pointer

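The comment says this sorts out different calling conventions, hence the .align directive for alignment.
Reference: http://wiki.debian.org/ArmEabiPort#Struct_packing_and_alignment
The code first executes mov r0, r0 (effectively a nop) eight times (.rept = repeat, .endr = end of repeat).
There are two explanations for this: one is that ARM11 has an 8-stage pipeline, the other is that the space is reserved for the interrupt vector table.
Personally I believe the space is reserved for the vector table so that the Linux kernel keeps the ability to boot directly, because flushing the pipeline does not take eight nops.
The next instruction is b 1f (f = forward), which jumps ahead to the label "1:".
Before the label "1:" there are three .word entries: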
zImage header fields

Offset in zImage | Content | Description |
---|---|---|
0x24 | 0x016F2818 | Magic number that identifies an ARM Linux zImage |
0x28 | start | zImage start address (usually 0) |
0x2C | _edata | zImage end address (usually the file size) |

Reference: http://www.simtec.co.uk/products/SWLINUX/files/booting_article.html
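
To make the header layout concrete, here is a minimal Python sketch that reads the three words at offset 0x24. It assumes a little-endian image; `parse_zimage_header` and `make_fake_image` are hypothetical helper names for this post, not part of any kernel tool, and the fake image only imitates the first 0x30 bytes of a real zImage.

```python
import struct

ZIMAGE_MAGIC = 0x016F2818
HEADER_OFFSET = 0x24  # 8 nops (0x20 bytes) plus the "b 1f" instruction


def parse_zimage_header(image: bytes):
    """Return (start, end) from a little-endian zImage, or None if the magic is absent."""
    if len(image) < HEADER_OFFSET + 12:
        return None
    magic, start, end = struct.unpack_from("<III", image, HEADER_OFFSET)
    if magic != ZIMAGE_MAGIC:
        return None
    return start, end


def make_fake_image(size: int = 0x100) -> bytes:
    """Build a dummy image: 8 x 'mov r0, r0', a branch, then the three header words."""
    nop = struct.pack("<I", 0xE1A00000)      # encoding of mov r0, r0
    branch = struct.pack("<I", 0xEA000002)   # b 1f: skips the three header words
    header = struct.pack("<III", ZIMAGE_MAGIC, 0, size)
    body = nop * 8 + branch + header
    return body + bytes(size - len(body))


if __name__ == "__main__":
    print(parse_zimage_header(make_fake_image()))
```

A boot loader could use a check like this to reject a file that is not a zImage before jumping to it.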
Starting at label "1:", the values of r1 (architecture ID) and r2 (atags pointer) are saved in r7 and r8.

    #ifndef __ARM_ARCH_2__
                    /*
                     * Booting from Angel - need to enter SVC mode and disable
                     * FIQs/IRQs (numeric definitions from angel arm.h source).
                     * We only do this if we were in user mode on entry.
                     */
                    mrs     r2, cpsr                @ get current mode
                    tst     r2, #3                  @ not user?
                    bne     not_angel
                    mov     r0, #0x17               @ angel_SWIreason_EnterSVC
                    swi     0x123456                @ angel_SWI_ARM
    not_angel:
                    mrs     r2, cpsr                @ turn off interrupts to
                    orr     r2, r2, #0xc0           @ prevent angel from running
                    msr     cpsr_c, r2
    #else
                    teqp    pc, #0x0c000003         @ turn off interrupts
    #endif

This code mainly disables all interrupts. If the CPU was entered in user mode (the Angel case) it first switches to SVC mode, then it sets the I and F bits (0xc0) in the cpsr.
For more about Angel, see:
http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0066d/Babdcdih.html
From the ARM Architecture Reference Manual:
The 32-bit cpsr register:
Name | Desc | Bits |
---|---|---|
cpsr_f | flags field | [31:24] |
cpsr_s | status field | [23:16] |
cpsr_x | extension field | [15:8] |
cpsr_c | control field | [7:0] |
cpsr bit fields

Bit | 31 | 30 | 29 | 28 | 27 | 26-8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Field | N | Z | C | V | Q | DNM(RAZ) | I | F | T | M4 | M3 | M2 | M1 | M0 |

32-bit mode settings

M[4:0] | Mode |
---|---|
0b10000 | User |
0b10001 | FIQ |
0b10010 | IRQ |
0b10011 | Supervisor (SVC) |
0b10111 | Abort |
0b11011 | Undefined |
0b11111 | System |

The 26-bit pc register:

Bit | 31 | 30 | 29 | 28 | 27 | 26 | 25-2 | 1 | 0 |
---|---|---|---|---|---|---|---|---|---|
Field | N | Z | C | V | I | F | program counter | M1 | M0 |

26-bit mode settings

M[1:0] | Mode |
---|---|
0b00 | User |
0b01 | FIQ |
0b10 | IRQ |
0b11 | Supervisor |

Flag | Meaning |
---|---|
N | Negative |
Z | Zero |
C | Carry |
V | Overflow |
I | IRQ |
F | FIQ |
T | Thumb |
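
As a rough illustration of the tables above, the following Python sketch decodes a 32-bit cpsr value and mirrors the two tests the boot code performs (`tst r2, #3` and `orr r2, r2, #0xc0`). The function names are my own for this example; only the bit positions and mode encodings come from the tables.

```python
MODES_32 = {
    0b10000: "User",
    0b10001: "FIQ",
    0b10010: "IRQ",
    0b10011: "Supervisor (SVC)",
    0b10111: "Abort",
    0b11011: "Undefined",
    0b11111: "System",
}

I_BIT = 1 << 7  # IRQ disable
F_BIT = 1 << 6  # FIQ disable
T_BIT = 1 << 5  # Thumb state


def decode_cpsr(cpsr: int) -> dict:
    """Split a cpsr value into mode name and the I, F, T control bits."""
    return {
        "mode": MODES_32.get(cpsr & 0x1F, "reserved"),
        "irq_masked": bool(cpsr & I_BIT),
        "fiq_masked": bool(cpsr & F_BIT),
        "thumb": bool(cpsr & T_BIT),
    }


def entered_from_user_mode(cpsr: int) -> bool:
    """Mirror 'tst r2, #3': of the listed modes only User has M[1:0] == 0."""
    return (cpsr & 3) == 0


def mask_interrupts(cpsr: int) -> int:
    """Mirror 'orr r2, r2, #0xc0': set both the I and F bits."""
    return cpsr | 0xC0


if __name__ == "__main__":
    print(decode_cpsr(mask_interrupts(0x13)))
```

The `tst r2, #3` shortcut works because User is the only mode in the table whose low two mode bits are both zero.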

                    /*
                     * Note that some cache flushing and other stuff may
                     * be needed here - is there an Angel SWI call for this?
                     */

                    /*
                     * some architecture specific code can be inserted
                     * by the linker here, but it should preserve r7, r8, and r9.
                     */

                    .text
                    adr     r0, LC0
                    ldmia   r0, {r1, r2, r3, r4, r5, r6, ip, sp}
                    subs    r0, r0, r1              @ calculate the delta offset

                                                    @ if delta is zero, we are
                    beq     not_relocated           @ running at the address we
                                                    @ were linked at.

                    /*
                     * We're running at a different address.  We need to fix
                     * up various pointers:
                     *   r5 - zImage base address
                     *   r6 - GOT start
                     *   ip - GOT end
                     */
                    add     r5, r5, r0
                    add     r6, r6, r0
                    add     ip, ip, r0

The comment notes that platform-specific code (such as head-xscale.S) may be linked in at this point, but it must preserve r7, r8 and r9.
This code computes the delta offset and relocates the zImage base address and the GOT (global offset table) addresses.
r0: LC0 runtime address
r1: LC0 linked address
r0 - r1: delta offset
adr is a pseudo-instruction: the assembler works out the offset for you and emits a single pc-relative add or sub.
Here adr puts the runtime address of LC0 in r0, and ldmia then loads the words stored at LC0 into the registers.
LC0 is defined as follows:

                    .type   LC0, #object
    LC0:            .word   LC0                     @ r1
                    .word   __bss_start             @ r2
                    .word   _end                    @ r3
                    .word   zreladdr                @ r4
                    .word   _start                  @ r5
                    .word   _got_start              @ r6
                    .word   _got_end                @ ip
                    .word   user_stack+4096         @ sp
    LC1:            .word   reloc_end - reloc_start
                    .size   LC0, . - LC0

r0: LC0 runtime address
r1: LC0 linked address
r2: BSS start
r3: BSS end
r4: zreladdr (kernel execution address, a physical memory address)
r5: zImage start address
r6: GOT start
ip: GOT end
sp: stack pointer
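
The delta computation can be sketched in Python as below. This is only a model of the `subs` / `beq` / `add` sequence under my own assumptions: registers are a plain dict, `fixup_pointers` is a made-up name, and the addresses in the demo are invented values, not taken from a real build.

```python
MASK32 = 0xFFFFFFFF


def fixup_pointers(lc0_runtime: int, table: dict) -> dict:
    """table holds the link-time words of LC0, keyed by register name."""
    delta = (lc0_runtime - table["r1"]) & MASK32    # subs r0, r0, r1
    fixed = dict(table, r0=delta)
    if delta == 0:                                   # beq not_relocated
        return fixed
    for reg in ("r5", "r6", "ip"):                   # zImage base, GOT start, GOT end
        fixed[reg] = (table[reg] + delta) & MASK32
    return fixed


if __name__ == "__main__":
    # Invented link-time values: image linked at 0, loaded at 0x80800000.
    linked = {"r1": 0x2C0, "r2": 0x1000, "r3": 0x2000, "r4": 0x80008000,
              "r5": 0x0, "r6": 0x900, "ip": 0x940, "sp": 0x3000}
    moved = fixup_pointers(0x808002C0, linked)
    print({name: hex(value) for name, value in moved.items()})
```

Note that r4 (zreladdr) is deliberately left alone: it is an absolute physical address, not something that moves with the image. r2, r3 and sp are adjusted by the code that follows.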
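The stack pointer has 4096 added because the stack grows downward from high addresses to low, so sp starts out pointing at the far end of the stack area.
The last line of the file shows:

    user_stack:     .space  4096

The other registers are used later; r4, for example, is used when setting up the MMU.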

    #ifndef CONFIG_ZBOOT_ROM
                    /*
                     * If we're running fully PIC === CONFIG_ZBOOT_ROM = n,
                     * we need to fix up pointers into the BSS region.
                     *   r2 - BSS start
                     *   r3 - BSS end
                     *   sp - stack pointer
                     */
                    add     r2, r2, r0
                    add     r3, r3, r0
                    add     sp, sp, r0

                    /*
                     * Relocate all entries in the GOT table.
                     */
    1:              ldr     r1, [r6, #0]            @ relocate entries in the GOT
                    add     r1, r1, r0              @ table.  This fixes up the
                    str     r1, [r6], #4            @ C references.
                    cmp     r6, ip
                    blo     1b
    #else

                    /*
                     * Relocate entries in the GOT table.  We only relocate
                     * the entries that are outside the (relocated) BSS region.
                     */
    1:              ldr     r1, [r6, #0]            @ relocate entries in the GOT
                    cmp     r1, r2                  @ entry < bss_start ||
                    cmphs   r3, r1                  @ _end < entry
                    addlo   r1, r1, r0              @ table.  This fixes up the
                    str     r1, [r6], #4            @ C references.
                    cmp     r6, ip
                    blo     1b
    #endif

CONFIG_ZBOOT_ROM is the kernel option for running the compressed image directly from ROM or flash. Both branches relocate the GOT; the first branch also relocates the BSS pointers and sp.
The GOT (Global Offset Table) holds pointers to objects. The linker cannot know where an object will sit in memory at run time, so the address of the object has to be obtained through the GOT.
GOT reference: http://bottomupcs.sourceforge.net/csbu/x3824.htm
The final blo is the same as bcc, so the loop is equivalent to if (r6 < ip) goto 1: (the b in 1b means backward, that is, jump up).
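
The two GOT loops differ only in which entries they adjust. The sketch below models both in Python; `relocate_got` is a hypothetical name, the GOT is a plain list of integers, and the sample values are invented.

```python
MASK32 = 0xFFFFFFFF


def relocate_got(got, delta, bss_start=None, bss_end=None):
    """Return a relocated copy of the GOT entries.

    With bss_start/bss_end left as None this follows the first loop
    (every entry is adjusted). With both given it follows the
    CONFIG_ZBOOT_ROM loop, which skips entries inside [bss_start, bss_end].
    """
    out = []
    for entry in got:
        if bss_start is None or bss_end is None:
            relocate = True
        else:
            # cmp r1, r2 / cmphs r3, r1 / addlo r1, r1, r0
            relocate = entry < bss_start or bss_end < entry
        out.append((entry + delta) & MASK32 if relocate else entry)
    return out


if __name__ == "__main__":
    got = [0x100, 0x5000, 0x9000]          # invented entries
    print([hex(e) for e in relocate_got(got, 0x1000)])
    print([hex(e) for e in relocate_got(got, 0x1000, 0x4000, 0x6000)])
```

The `cmp` / `cmphs` / `addlo` triple is a compact way to write the "or" condition: the second compare only runs when the first one did not already produce "lower".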

    not_relocated:  mov     r0, #0
    1:              str     r0, [r2], #4            @ clear bss
                    str     r0, [r2], #4
                    str     r0, [r2], #4
                    str     r0, [r2], #4
                    cmp     r2, r3
                    blo     1b

This part is simple: it sets the whole BSS section to 0, because the C language requires objects in BSS (uninitialized data) to start out as 0.
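
One detail worth seeing in slow motion is that the loop is unrolled four times, so it always clears whole 16-byte chunks and only checks the end address afterwards. A small Python model (my own sketch, with a bytearray standing in for RAM and `clear_bss` as a made-up name):

```python
def clear_bss(memory: bytearray, bss_start: int, bss_end: int) -> int:
    """Zero memory the way the unrolled loop does; return the final r2."""
    r2 = bss_start
    while True:
        for _ in range(4):                  # four 'str r0, [r2], #4'
            memory[r2:r2 + 4] = b"\x00" * 4
            r2 += 4
        if not r2 < bss_end:                # cmp r2, r3 / blo 1b
            return r2


if __name__ == "__main__":
    ram = bytearray(b"\xff" * 64)
    print(clear_bss(ram, 16, 36), ram.hex())
```

With a 20-byte BSS the model clears 32 bytes, which shows that the loop can write a little past the end address.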
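
    /*
     * The C runtime environment should now be setup
     * sufficiently.  Turn the cache on, set up some
     * pointers, and start decompressing.
     */
                    bl      cache_on

The C runtime environment should be sufficiently set up by now. Turn the cache on, set up some pointers, and start decompressing.
There is only one instruction here, bl cache_on; bl saves the return address (the address of the next instruction) in lr so the callee can return.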

    /*
     * Turn on the cache.  We need to setup some page tables so that we
     * can have both the I and D caches on.
     *
     * We place the page tables 16k down from the kernel execution address,
     * and we hope that nothing else is using it.  If we're using it, we
     * will go pop!
     *
     * On entry,
     *  r4 = kernel execution address
     *  r6 = processor ID
     *  r7 = architecture number
     *  r8 = atags pointer
     *  r9 = run-time address of "start"  (???)
     * On exit,
     *  r1, r2, r3, r9, r10, r12 corrupted
     * This routine must preserve:
     *  r4, r5, r6, r7, r8
     */
                    .align  5
    cache_on:       mov     r3, #8                  @ cache_on function
                    b       call_cache_fn

For the sake of the table that follows, .align 5 aligns the code on a 32-byte boundary. r3 = #8 is the byte offset of the 'cache on' slot within each table entry.

    /*
     * Here follow the relocatable cache support functions for the
     * various processors.  This is a generic hook for locating an
     * entry and jumping to an instruction at the specified offset
     * from the start of the block.  Please note this is all position
     * independent code.
     *
     *  r1  = corrupted
     *  r2  = corrupted
     *  r3  = block offset
     *  r6  = corrupted
     *  r12 = corrupted
     */

    call_cache_fn:  adr     r12, proc_types
    #ifdef CONFIG_CPU_CP15
                    mrc     p15, 0, r6, c0, c0      @ get processor ID
    #else
                    ldr     r6, =CONFIG_PROCESSOR_ID
    #endif
    1:              ldr     r1, [r12, #0]           @ get value
                    ldr     r2, [r12, #4]           @ get mask
                    eor     r1, r1, r6              @ (real ^ match)
                    tst     r1, r2                  @       & mask
                    addeq   pc, r12, r3             @ call cache function
                    add     r12, r12, #4*5
                    b       1b

(1) Get the CPU ID.
(2) Starting at proc_types, scan the table for the entry that matches this CPU.
(3) Run the matching code by writing directly to the pc register.
r12: address of the current proc_types entry
r1: CPU ID match value read from the table
r2: CPU ID mask
r3: #8 (r12 + #8 is the 'cache on' slot)
r6: CPU ID

    /*
     * Table for cache operations.  This is basically:
     *   - CPU ID match
     *   - CPU ID mask
     *   - 'cache on' method instruction
     *   - 'cache off' method instruction
     *   - 'cache flush' method instruction
     *
     * We match an entry using: ((real_id ^ match) & mask) == 0
     *
     * Writethrough caches generally only need 'on' and 'off'
     * methods.  Writeback caches _must_ have the flush method
     * defined.
     */
                    .type   proc_types,#object
    proc_types:

The other table entries are omitted; only the one that will be taken is listed. The PXA3xx CPU ID is 0x690568xx, which belongs to the ARMv5TE family.
Note: see Table 9 in section 2.16.6.2 of Datasheet Volume I.

                    .word   0x00050000              @ ARMv5TE
                    .word   0x000f0000
                    b       __armv4_mmu_cache_on
                    b       __armv4_mmu_cache_off
                    b       __armv4_mmu_cache_flush

So __armv4_mmu_cache_on is what runs next.
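
The table walk in call_cache_fn boils down to the match rule from the comment, ((real_id ^ match) & mask) == 0, plus a fixed stride of five words per entry. Here is a minimal Python sketch of it. Only the ARMv5TE row comes from the listing above; the catch-all last row and the name `find_cache_on` are assumptions I added so that the loop in this example always terminates.

```python
ENTRY_SIZE = 4 * 5          # match, mask, and three branch instructions
CACHE_ON_OFFSET = 8         # value of r3 on entry to call_cache_fn

# (match, mask, label). The second row is a hypothetical terminator for this sketch.
PROC_TYPES = [
    (0x00050000, 0x000F0000, "__armv4_mmu_cache_on"),
    (0x00000000, 0x00000000, "no-op"),
]


def find_cache_on(cpu_id: int, table=PROC_TYPES):
    """Return (byte offset of the 'cache on' slot from the table start, label)."""
    for index, (match, mask, label) in enumerate(table):
        if ((cpu_id ^ match) & mask) == 0:
            return index * ENTRY_SIZE + CACHE_ON_OFFSET, label
    raise LookupError("no matching entry")


if __name__ == "__main__":
    print(find_cache_on(0x69056800))   # a PXA3xx-style ID with the low byte set to 0
```

Because the mask is 0x000f0000, only bits [19:16] of the ID take part in the comparison, which is why any 0x690568xx value selects this row.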

    __armv4_mmu_cache_on:
                    mov     r12, lr
                    bl      __setup_mmu
                    mov     r0, #0
                    mcr     p15, 0, r0, c7, c10, 4  @ drain write buffer
                    mcr     p15, 0, r0, c8, c7, 0   @ flush I,D TLBs
                    mrc     p15, 0, r0, c1, c0, 0   @ read control reg
                    orr     r0, r0, #0x5000         @ I-cache enable, RR cache replacement
                    orr     r0, r0, #0x0030
                    bl      __common_mmu_cache_on
                    mov     r0, #0
                    mcr     p15, 0, r0, c8, c7, 0   @ flush I,D TLBs
                    mov     pc, r12

After saving lr in r12 it uses bl to call __setup_mmu, so look at __setup_mmu first:

    __setup_mmu:    sub     r3, r4, #16384          @ Page directory size
                    bic     r3, r3, #0xff           @ Align the pointer
                    bic     r3, r3, #0x3f00
    /*
     * Initialise the page tables, turning on the cacheable and bufferable
     * bits for the RAM area only.
     */
                    mov     r0, r3
                    mov     r9, r0, lsr #18
                    mov     r9, r9, lsl #18         @ start of RAM
                    add     r10, r9, #0x10000000    @ a reasonable RAM size
                    mov     r1, #0x12
                    orr     r1, r1, #3 << 10
                    add     r2, r3, #16384
    1:              cmp     r1, r9                  @ if virt > start of RAM
                    orrhs   r1, r1, #0x0c           @ set cacheable, bufferable
                    cmp     r1, r10                 @ if virt > end of RAM
                    bichs   r1, r1, #0x0c           @ clear cacheable, bufferable
                    str     r1, [r0], #4            @ 1:1 mapping
                    add     r1, r1, #1048576
                    teq     r0, r2
                    bne     1b

This loop fills in the translation table for the entire 4 GB address space. Each word (4 bytes) describes one 1 MB memory section,
so the translation table size works out as:
4 GB / 1 MB per section = 4096 entries; 4096 entries * 4 bytes per word = 16 KB
The ARM MMU requires the translation table base to be aligned on a 16 KB boundary, which is why the low bits 0x3fff have to be cleared.
They are cleared in two steps, 0xff and then 0x3f00, because an ARM data-processing immediate is an 8-bit value rotated by an even number of bits, and 0x3fff does not fit in a single immediate.
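
The most likely reason for the two bic instructions is the ARM immediate encoding: a data-processing immediate must be an 8-bit constant rotated right by an even amount. The Python sketch below checks that rule for the constants used in __setup_mmu and mirrors the two-step alignment; both function names are my own.

```python
def is_arm_immediate(value: int) -> bool:
    """True if value fits an ARM data-processing immediate:
    an 8-bit constant rotated right by an even number of bits."""
    value &= 0xFFFFFFFF
    for rot in range(0, 32, 2):
        # Rotating left by `rot` undoes a rotate-right by `rot`.
        rotated = ((value << rot) | (value >> (32 - rot))) & 0xFFFFFFFF
        if rotated < 0x100:
            return True
    return False


def align_down_16k(addr: int) -> int:
    """Mirror the two bic instructions."""
    addr &= ~0xFF & 0xFFFFFFFF      # bic r3, r3, #0xff
    addr &= ~0x3F00 & 0xFFFFFFFF    # bic r3, r3, #0x3f00
    return addr


if __name__ == "__main__":
    for constant in (0xFF, 0x3F00, 0x3FFF, 16384, 1048576, 0x10000000):
        print(hex(constant), is_arm_immediate(constant))
    print(hex(align_down_16k(0x80008000 - 16384)))
```

0x3fff has fourteen set bits, so no rotation can squeeze it into eight, while 0xff and 0x3f00 each fit.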
Remember LC0 from earlier: r4 has not been modified since, so r4 is zreladdr. The comment calls it the kernel execution address, which is really just a physical memory address.
So where is zreladdr decided? The answer is in the Makefiles: following the Makefiles in arch/arm/boot and arch/arm/boot/compressed,
the actual setting can finally be found in arch/arm/mach-xxx/Makefile.boot.
Meaning of the registers used:
r0: current position in the translation table (variable)
r1: section descriptor; 0x12 marks the entry type as a section.
r2: translation table end
r3: translation table start (constant)
r4: zreladdr (equal to r2 when zreladdr is 16 KB aligned)
r9: physical memory base address
r10: reasonable memory end address (size: 256M)
In my case physical memory starts at 0x80000000. Drawn as a diagram, the first-level section layout in memory looks like this:

    (R9)  0x80000000 +------------------+ \     Start of RAM
                     | Reserved         |  \
    (R3)  0x80004000 +------------------+\  \
                     | Translation Table| +0x4000 (16K)
    (R2)  0x80008000 +------------------+/    \
                     |                  |      \
                     |                  |       +0x10000000 (256M reasonable memory)
    ---------------------- / Cut Short / ----------------------
                     |                  |      /
                     |                  |     /
    (R10) 0x90000000 +------------------+    /

For the rest, see the MMU chapter of the ARM reference manual, which explains it in detail.
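
To check my reading of the loop, here is a Python model of it that builds all 4096 first-level entries. It is a sketch under the assumptions of my platform (zreladdr = 0x80008000); `build_page_table` is a made-up name, and the model only reproduces the arithmetic, not the actual memory writes.

```python
MASK32 = 0xFFFFFFFF
SECTION = 1 << 20           # 1 MB per first-level entry
CB_BITS = 0x0C              # cacheable + bufferable


def build_page_table(zreladdr: int):
    """Return (table_base, entries) following the __setup_mmu loop."""
    r3 = (zreladdr - 16384) & MASK32 & ~0x3FFF      # table base, 16 KB aligned
    r9 = (r3 >> 18) << 18                           # start of RAM
    r10 = (r9 + 0x10000000) & MASK32                # start of RAM + 256 MB
    r1 = 0x12 | (3 << 10)                           # section descriptor, AP = 0b11
    entries = []
    for _ in range(4096):                           # 16 KB / 4 bytes per entry
        if r1 >= r9:                                # cmp r1, r9 / orrhs
            r1 |= CB_BITS
        if r1 >= r10:                               # cmp r1, r10 / bichs
            r1 &= ~CB_BITS
        entries.append(r1)                          # str r1, [r0], #4
        r1 = (r1 + SECTION) & MASK32                # add r1, r1, #1048576
    return r3, entries


if __name__ == "__main__":
    base, table = build_page_table(0x80008000)
    print(hex(base), hex(table[0x7FF]), hex(table[0x800]), hex(table[0x900]))
```

Entry n maps the 1 MB section at n << 20 onto itself, and only the 256 entries that cover 0x80000000 to 0x8fffffff carry the cacheable and bufferable bits.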

    /*
     * If ever we are running from Flash, then we surely want the cache
     * to be enabled also for our execution instance...  We map 4MB of it
     * so there is no map overlap problem for up to 1 MB compressed kernel.
     * If the execution is in RAM then we would only be duplicating the above.
     */
                    mov     r1, #0x1e
                    orr     r1, r1, #3 << 10
                    mov     r2, pc, lsr #20
                    orr     r1, r1, r2, lsl #20
                    add     r0, r3, r2, lsl #2
                    str     r1, [r0], #4
                    add     r1, r1, #1048576
                    str     r1, [r0], #4
                    add     r1, r1, #1048576
                    str     r1, [r0], #4
                    add     r1, r1, #1048576
                    str     r1, [r0]
                    mov     pc, lr
    ENDPROC(__setup_mmu)

This code uses the pc to find the current execution address, then marks the 4 MB that starts at the 1 MB section containing that address as cached/buffered.
This is so that the kernel can also be cached when it executes from flash. Most kernels now run from RAM, in which case this merely rewrites four descriptors that the loop above has already set.
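
    __common_mmu_cache_on:
    #ifndef DEBUG
                    orr     r0, r0, #0x000d         @ Write buffer, mmu
    #endif
                    mov     r1, #-1
                    mcr     p15, 0, r3, c2, c0, 0   @ load page table pointer
                    mcr     p15, 0, r1, c3, c0, 0   @ load domain access control
                    b       1f
                    .align  5                       @ cache line aligned
    1:              mcr     p15, 0, r0, c1, c0, 0   @ load control register
                    mrc     p15, 0, r0, c1, c0, 0   @ and read it back to
                    sub     pc, lr, r0, lsr #32     @ properly flush pipeline

This part just configures the MMU through the coprocessor; for the p15 details, again see the MMU description in the ARM reference manual.
In the last instruction r0, lsr #32 is always 0, so it simply returns to lr, while its use of r0 makes it wait for the read-back of the control register.
* Execution first returns to __armv4_mmu_cache_on, which then returns to the caller of cache_on through mov pc, r12: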
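
Putting the three orr instructions together shows which control register bits end up set. The Python sketch below combines them and names the bits that the source comments mention, plus bit 2, which I take to be the data cache enable. Bits 4 and 5 (the 0x0030 constant) have no comment in the source, so the sketch leaves them unnamed. Both function names are hypothetical.

```python
CONTROL_BITS = {
    0: "MMU enable",
    2: "data cache enable",
    3: "write buffer enable",
    12: "instruction cache enable",
    14: "round-robin cache replacement",
}


def cache_on_control_value(current: int, debug: bool = False) -> int:
    """Combine the constants ORed into the control register value."""
    value = current | 0x5000 | 0x0030       # __armv4_mmu_cache_on
    if not debug:
        value |= 0x000D                     # __common_mmu_cache_on, #ifndef DEBUG
    return value


def describe(value: int):
    """List the named bits that are set in value."""
    return [name for bit, name in sorted(CONTROL_BITS.items()) if value & (1 << bit)]


if __name__ == "__main__":
    value = cache_on_control_value(0)
    print(hex(value), describe(value))
```

With DEBUG defined only the instruction cache and round-robin bits are added, so the MMU and data cache stay off.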
Once the cache is on, control returns to line 226, right after bl cache_on. What follows is decompress_kernel and then call_kernel.

                    mov     r1, sp                  @ malloc space above stack
                    add     r2, sp, #0x10000        @ 64k max

    /*
     * Check to see if we will overwrite ourselves.
     *   r4 = final kernel address
     *   r5 = start of this image
     *   r2 = end of malloc space (and therefore this image)
     * We basically want:
     *   r4 >= r2 -> OK
     *   r4 + image length <= r5 -> OK
     */
                    cmp     r4, r2
                    bhs     wont_overwrite
                    sub     r3, sp, r5              @ > compressed kernel size
                    add     r0, r4, r3, lsl #2      @ allow for 4x expansion
                    cmp     r0, r5
                    bls     wont_overwrite

                    mov     r5, r2                  @ decompress after malloc space
                    mov     r0, r5
                    mov     r3, r7
                    bl      decompress_kernel

                    add     r0, r0, #127 + 128      @ alignment + stack
                    bic     r0, r0, #127            @ align the kernel length
    /*
     * r0     = decompressed kernel length
     * r1-r3  = unused
     * r4     = kernel execution address
     * r5     = decompressed kernel start
     * r6     = processor ID
     * r7     = architecture ID
     * r8     = atags pointer
     * r9-r14 = corrupted
     */
                    add     r1, r5, r0              @ end of decompressed kernel
                    adr     r2, reloc_start
                    ldr     r3, LC1
                    add     r3, r2, r3
    1:              ldmia   r2!, {r9 - r14}         @ copy relocation code
                    stmia   r1!, {r9 - r14}
                    ldmia   r2!, {r9 - r14}
                    stmia   r1!, {r9 - r14}
                    cmp     r2, r3
                    blo     1b
                    add     sp, r1, #128            @ relocate the stack

                    bl      cache_clean_flush
                    add     pc, r5, r0              @ call relocation code

    /*
     * We're not in danger of overwriting ourselves.  Do this the simple way.
     *
     * r4     = kernel execution address
     * r7     = architecture ID
     */
    wont_overwrite: mov     r0, r4
                    mov     r3, r7
                    bl      decompress_kernel
                    b       call_kernel

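
The overwrite check at the top of this code decides between the simple path (decompress straight to r4) and the path that decompresses elsewhere and relocates. A small Python model of the two comparisons follows; `wont_overwrite` borrows the label name, the addresses in the demo are invented, and 32-bit wrap-around is ignored for simplicity.

```python
def wont_overwrite(r4: int, r5: int, sp: int) -> bool:
    """True if decompressing to r4 cannot clobber the running image.

    r4 = final kernel address, r5 = start of this image,
    sp = stack pointer (top of the image's stack area).
    """
    r2 = sp + 0x10000                 # end of the 64 KB malloc space
    if r4 >= r2:                      # cmp r4, r2 / bhs wont_overwrite
        return True
    r3 = sp - r5                      # upper bound on the compressed kernel size
    r0 = r4 + (r3 << 2)               # allow for 4x expansion
    return r0 <= r5                   # cmp r0, r5 / bls wont_overwrite


if __name__ == "__main__":
    # Invented layout: kernel target 0x80008000, zImage loaded at 0x80800000.
    print(wont_overwrite(0x80008000, 0x80800000, 0x80900000))
    print(wont_overwrite(0x80008000, 0x80800000, 0x80A00000))
```

In the second call the image is assumed to be 2 MB, so four times that size starting at r4 would reach into the image and the relocating path is taken.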

    call_kernel:    bl      cache_clean_flush
                    bl      cache_off
                    mov     r0, #0                  @ must be zero
                    mov     r1, r7                  @ restore architecture number
                    mov     r2, r8                  @ restore atags pointer
                    mov     pc, r4                  @ call kernel