Machine and Assembly Code

Note

Mecrisp-Stellaris 2.4.2-RA is used here.

RA (Register Allocator) Register Usage

========  ===============================  ======================================
Register  Role                             Usage
========  ===============================  ======================================
r0        Free scratch register            Saved on interrupt entry by hardware
r1        Free scratch register            Saved on interrupt entry by hardware
r2        Free scratch register            Saved on interrupt entry by hardware
r3        Free scratch register            Saved on interrupt entry by hardware
r4        Inner loop count                 Needs Push and Pop when used otherwise
r5        Inner loop limit                 Needs Push and Pop when used otherwise
r6        TOS: Top-Of-Stack                Stack design is interrupt safe
r7        PSP: Parameter Stack Pointer     Stack design is interrupt safe
r8        Unused
r9        Unused
r10       Unused
r11       Unused
r12       Unused                           Saved on interrupt entry by hardware
r13       SP: Return Stack Pointer
r14       LR: Link Register
r15       PC: Program Counter, always odd
========  ===============================  ======================================

Executing Assembly/Machine Code

Mecrisp-Stellaris makes it fairly easy to execute or inline assembly/machine code. The following workflow example shows one way of doing it.

Methodology

  • Write your assembly code

  • Assemble your code

  • Paste the machine code into a Word

  • Disassemble the Word

  • Test the Word

Warning

  • You are free to change R0, R1, R2 and R3

  • R4 and R5 need to be pushed and popped if used.

  • R6 is the Top of Stack (TOS) on entry and exit. If there is more than one result, the extra results have to be pushed onto the stack.

  • The stack pointer R7 must be used in an interrupt-safe way.

  • Subroutines cannot be used.

  • When you use “[” Mecrisp-Stellaris assumes “everything might happen” and frames your code with push {lr} / pop {pc} opcodes
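As a minimal sketch of the R4/R5 rule above, the hypothetical Word 4plus below borrows R4 as a scratch register and restores it before returning. The push and pop opcodes are copied from the see listing of the native Forth Blinky later on this page; the movs and adds opcodes were hand-encoded for this sketch and are worth checking with see before relying on them.

```forth
\ Hypothetical example Word: adds 4 to TOS, using r4 as scratch.
: 4plus ( u -- u+4 )
[
$B430 h,   \ push { r4 r5 }   save the loop registers first
$2404 h,   \ movs r4 #4       r4 is now free to use
$1936 h,   \ adds r6 r6 r4    r6 is TOS on entry and exit
$BC30 h,   \ pop { r4 r5 }    restore the loop registers
]
;

3 4plus .   \ expected to print 7
```

R0 to R3 would do the same job without the push and pop, so this layout only pays off when the four free scratch registers are already in use.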

Notes

  1. When writing to the stack, the Parameter Stack Pointer (R7) must first be decremented by 4, and the current contents of R6 (TOS) must then be saved at the new location. This pushes the existing stack contents down one level.

subs r7 #4          \ R7 = R7 - 4. R7 is the Data Stack Pointer, so decrement it by 4 (8-bit bytes)
str r6 [ r7 #0 ]    \ Save the old contents of R6 at the new memory location specified by R7
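A minimal sketch of that push sequence wrapped in a Word, assuming the same "[" workflow used in the examples on this page. The hypothetical Word mydup is not part of Mecrisp-Stellaris; it only performs the two instructions from the note. Both opcodes are taken from the see listing of the native Forth Blinky further down, where the compiler emits the same pair.

```forth
\ Hypothetical example Word: behaves like dup, written as machine code.
: mydup ( x -- x x )
[
$3F04 h,   \ subs r7 #4         make room on the data stack
$603E h,   \ str r6 [ r7 #0 ]   store the old TOS in the new slot
]
;

5 mydup . .   \ expected to print 5 5
```

R6 is left untouched, so after the store the same value is both in TOS and one level down on the stack.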

Example 1: 2plus (adds 2 to any number)

Write 2plus.s assembly code

I’m only interested in the assembled code after “start:” because Mecrisp-Stellaris has its own Vector Table and Initialisation Code.

// Minimal assembler example for stm32f051 to be used in a Machine Code Word "2plus"
// Requires svd2as
// Board: stm32f0 Discovery
// 1 June 2018  Copyright <terry@tjporter.com.au> Released under the GPL
//

.cpu cortex-m0                                       // Tell the assembler what model of CortexM this is for
.thumb                                               // Cortex micros only understand thumb(2) code
.text                                                // what follows is code
.include "bitposn-equates.s"                         // a list of ".equ  BIT0,    0x00000001", ".equ  BIT1,    0x00000002" etc
.include "STM32F0xx-tp1.svd.s"                       // Created by svd2as
.syntax unified // use newer style instructions
Vector_Table:  .word     0x20000000                  // stack pointer value when stack is empty 0x20000000
ResetVector:    .word     start + 1                  // Reset Handler

start:
// Input parameter is passed in r6 (TOS)
        movs r1, #2                            // move 2 into r1
        adds r2, r6, r1                        // add r6 and r1 (updating the flags) and place result into r2. Note: r6 is TOS on entry
        movs r6, r2                            // move the contents of r2 into r6. Note: the result is placed back onto TOS via r6 on exit

Assemble 2plus.s

This text is from 2plus.list, produced by the 2plus.s Makefile shown further down.

2plus.elf:     file format elf32-littlearm

Disassembly of section .text:

00000000 <Vector_Table>:
   0:  20000000        andcs   r0, r0, r0

00000004 <ResetVector>:
   4:  00000009        andeq   r0, r0, r9

00000008 <start>:
   8:  2102            movs    r1, #2
   a:  1872            adds    r2, r6, r1
   c:  0016            movs    r6, r2

Paste The Machine Code Into 2plus Word

Add comments for readability

: 2plus ( u -- 2+ )
[          \ *execute* mode
$2102 h,   \ movs      r1, #2
$1872 h,   \ adds      r2, r6, r1
$0016 h,   \ movs      r6, r2
]          \ *compile* mode
;

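Disassemble 2plus

Compare this with the 2plus Word definition: the machine code is unchanged, and Mecrisp-Stellaris has framed it with push {lr} / pop {pc} opcodes. The disassembler shows $0016 as lsls r6 r2 #0, which is the same Thumb encoding as movs r6, r2.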

see 2plus

200003AC: B500  push { lr }
200003AE: 2102  movs r1 #2
200003B0: 1872  adds r2 r6 r1
200003B2: 0016  lsls r6 r2 #0
200003B4: BD00  pop { pc }
ok.

Test 2plus

2 2plus . 4  ok.
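For comparison, here is a sketch of the same Word reduced to a single instruction that works directly on TOS. The name 2plus-short is hypothetical, and the opcode $3602 (adds r6 #2) was hand-encoded for this sketch following the pattern of adds r0 #84 = $3084 in the native Forth Blinky listing further down, so it is worth confirming with see.

```forth
\ Hypothetical one-instruction variant of 2plus.
: 2plus-short ( u -- u+2 )
[
$3602 h,   \ adds r6 #2   add the immediate 2 straight to TOS
]
;

2 2plus-short .   \ expected to print 4
```

The three-instruction version is kept in the main example because it shows values moving through the scratch registers, which is the part that carries over to larger routines.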

2plus.s Makefile

ARMGNU?=arm-none-eabi

COPS = -Wall  -Os -nostdlib -nostartfiles -ffreestanding -save-temps
AOPS = --warn --fatal-warnings

all : 2plus.bin

2plus.o : 2plus.s
       $(ARMGNU)-as -mthumb -a=2plus.lst -g --gstabs+ 2plus.s -o 2plus.o

2plus.bin : memmap 2plus.o
       $(ARMGNU)-ld -Ttext 0x0 -o 2plus.elf -T memmap 2plus.o
       $(ARMGNU)-objdump -D 2plus.elf > 2plus.list
       $(ARMGNU)-objcopy 2plus.elf 2plus.bin -O binary

erase:
       st-flash erase

flash:
       st-flash write 2plus.bin 0x08000000

clean:
       rm -f *.bin
       rm -f *.o
       rm -f *.elf
       rm -f *.list
       rm -f *.lst

Download all the Files from the 2plus project

Example 2: Blinky (more complex code)

This blinks the blue LED on an STM32F0 Discovery Board.

Methodology

  • Write your assembly code

  • Assemble your code

  • Paste the machine code into a Word

  • Disassemble the Word

  • Test the Word

Blinky Assembly Code

  • Requires svd2gas to generate the hardware equate statements

// Minimal assembler blinky example for stm32f051
// Requires svd2as
// Board: stm32f0 Discovery
// Blue LED is on GPIOC bit 8
// 8th June 2018  Copyright <terry@tjporter.com.au> Released under the GPL

.cpu cortex-m0                                    // Tell the assembler what model of CortexM this is for
.thumb                                            // Cortex micros only understand thumb(2) code
.text                                             // what follows is code
.include "bitposn-equates.s"                      // a list of ".equ  BIT0,    0x00000001", ".equ  BIT1," etc
.include "STM32F0xx.svd.s"                        // Compliments of svd2gas
.syntax unified // use newer style instructions
Vector_Table:  .word     0x20000000               // stack pointer value when stack is empty 0x20000000
ResetVector:    .word     start + 1               // Reset Handler

start:
        ldr r1, = RCC_AHBENR
        ldr r2, = RCC_IOPCEN                      // Turn on clock for GPIOC
        str r2, [r1]

moder:
        ldr r1, = GPIOC_MODER
        movs r2, 0b01                             // Set GPIOC-8 to output
        lsls r2, GPIOC_MODER8_Shift               // BitWidth 2
        str r2, [r1]


led_on:                                           // Main Loop
        ldr r1, = GPIOC_ODR
        ldr r2, = BLUE_LED_ON                     // Turn LED on
        str r2, [r1]
        ldr r0, = DELAY                           // delay
delay_1: subs r0,r0,#1
        bne delay_1

led_off:
        ldr r1, = GPIOC_ODR
        ldr r2, = BLUE_LED_OFF                    // Turn LED off
        str r2, [r1]
        ldr r0, = DELAY                           // Delay
delay_2: subs r0,r0,#1
        bne delay_2

        b led_on                                  // Jump back to led_on (the main loop)

constants:
        .equ DELAY, 0xfffff
        .equ BLUE_LED_ON, GPIOC_ODR8
        .equ BLUE_LED_OFF, 0x0

Blinky.list Output

blinky.elf:     file format elf32-littlearm

Disassembly of section .text:

00000000 <Vector_Table>:
   0:  20000000        andcs   r0, r0, r0

00000004 <ResetVector>:
   4:  00000009        andeq   r0, r0, r9

00000008 <start>:
   8:  4909            ldr     r1, [pc, #36]   ; (30 <constants>)
   a:  4a0a            ldr     r2, [pc, #40]   ; (34 <constants+0x4>)
   c:  600a            str     r2, [r1, #0]

0000000e <moder>:
   e:  490a            ldr     r1, [pc, #40]   ; (38 <constants+0x8>)
  10:  2201            movs    r2, #1
  12:  0412            lsls    r2, r2, #16
  14:  600a            str     r2, [r1, #0]

00000016 <led_on>:
  16:  4909            ldr     r1, [pc, #36]   ; (3c <constants+0xc>)
  18:  4a09            ldr     r2, [pc, #36]   ; (40 <BIT6>)
  1a:  600a            str     r2, [r1, #0]
  1c:  4809            ldr     r0, [pc, #36]   ; (44 <BIT6+0x4>)

0000001e <delay_1>:
  1e:  3801            subs    r0, #1
  20:  d1fd            bne.n   1e <delay_1>

00000022 <led_off>:
  22:  4906            ldr     r1, [pc, #24]   ; (3c <constants+0xc>)
  24:  4a08            ldr     r2, [pc, #32]   ; (48 <BIT6+0x8>)
  26:  600a            str     r2, [r1, #0]
  28:  4806            ldr     r0, [pc, #24]   ; (44 <BIT6+0x4>)

0000002a <delay_2>:
  2a:  3801            subs    r0, #1
  2c:  d1fd            bne.n   2a <delay_2>
  2e:  e7f2            b.n     16 <led_on>

00000030 <constants>:
  30:  40021014        andmi   r1, r2, r4, lsl r0
  34:  00080000        andeq   r0, r8, r0
  38:  48000800        stmdami r0, {fp}
  3c:  48000814        stmdami r0, {r2, r4, fp}
  40:  00000100        andeq   r0, r0, r0, lsl #2
  44:  000fffff        strdeq  pc, [pc], -pc   ; <UNPREDICTABLE>
  48:  00000000        andeq   r0, r0, r0

Paste Blinky machine code into a Word

Note

1) Although the Mecrisp-Stellaris kernel already enables all the GPIOs, GPIOC-8 (the blue LED) is enabled again here. This is redundant but harmless.

: blinky ( -- )
[
$4909 h,    \  ldr     r1, [pc, #36]   ; (30 <constants>)
$4a0a h,    \  ldr     r2, [pc, #40]   ; (34 <constants+0x4>)
$600a h,    \  str     r2, [r1, #0]
$490a h,    \  ldr     r1, [pc, #40]   ; (38 <constants+0x8>)
$2201 h,    \  movs    r2, #1
$0412 h,    \  lsls    r2, r2, #16
$600a h,    \  str     r2, [r1, #0]
$4909 h,    \  ldr     r1, [pc, #36]   ; (3c <constants+0xc>)
$4a09 h,    \  ldr     r2, [pc, #36]   ; (40 <BIT6>)
$600a h,    \  str     r2, [r1, #0]
$4809 h,    \  ldr     r0, [pc, #36]   ; (44 <BIT6+0x4>)
$3801 h,    \  subs    r0, #1
$d1fd h,    \  bne.n   1e <delay_1>
$4906 h,    \  ldr     r1, [pc, #24]   ; (3c <constants+0xc>)
$4a08 h,    \  ldr     r2, [pc, #32]   ; (48 <BIT6+0x8>)
$600a h,    \  str     r2, [r1, #0]
$4806 h,    \  ldr     r0, [pc, #24]   ; (44 <BIT6+0x4>)
$3801 h,    \  subs    r0, #1
$d1fd h,    \  bne.n   2a <delay_2>
$e7f2 h,    \  b.n     16 <led_on>
            \ <constants>:
$40021014 , \  andmi   r1, r2, r4, lsl r0
$00080000 , \  andeq   r0, r8, r0
$48000800 , \  stmdami r0, {fp}
$48000814 , \  stmdami r0, {r2, r4, fp}
$00000100 , \  andeq   r0, r0, r0, lsl #2
$000fffff , \  strdeq  pc, [pc], -pc   ; <UNPREDICTABLE>
$00000000 , \  andeq   r0, r0, r0
]
;

Disassemble the Blinky Word

This Blinky is 68 bytes long, not including the PUSH and POP added by Mecrisp-Stellaris.

see blinky
200003AE: B500  push { lr }
200003B0: 4909  ldr r1 [ pc #24 ]  Literal 200003D8: 40021014
200003B2: 4A0A  ldr r2 [ pc #28 ]  Literal 200003DC: 00080000
200003B4: 600A  str r2 [ r1 #0 ]
200003B6: 490A  ldr r1 [ pc #28 ]  Literal 200003E0: 48000800
200003B8: 2201  movs r2 #1
200003BA: 0412  lsls r2 r2 #10
200003BC: 600A  str r2 [ r1 #0 ]
200003BE: 4909  ldr r1 [ pc #24 ]  Literal 200003E4: 48000814
200003C0: 4A09  ldr r2 [ pc #24 ]  Literal 200003E8: 00000100
200003C2: 600A  str r2 [ r1 #0 ]
200003C4: 4809  ldr r0 [ pc #24 ]  Literal 200003EC: 000FFFFF
200003C6: 3801  subs r0 #1
200003C8: D1FD  bne 200003C6
200003CA: 4906  ldr r1 [ pc #18 ]  Literal 200003E4: 48000814
200003CC: 4A08  ldr r2 [ pc #20 ]  Literal 200003F0: 00000000
200003CE: 600A  str r2 [ r1 #0 ]
200003D0: 4806  ldr r0 [ pc #18 ]  Literal 200003EC: 000FFFFF
200003D2: 3801  subs r0 #1
200003D4: D1FD  bne 200003D2
200003D6: E7F2  b 200003BE
200003D8: 1014  asrs r4 r2 #0
200003DA: 4002  ands r2 r0
200003DC: 0000  lsls r0 r0 #0
200003DE: 0008  lsls r0 r1 #0
200003E0: 0800  lsrs r0 r0 #0
200003E2: 4800  ldr r0 [ pc #0 ]  Literal 200003E4: 48000814
200003E4: 0814  lsrs r4 r2 #0
200003E6: 4800  ldr r0 [ pc #0 ]  Literal 200003E8: 00000100
200003E8: 0100  lsls r0 r0 #4
200003EA: 0000  lsls r0 r0 #0
200003EC: FFFF
200003EE: 000F  lsls r7 r1 #0
200003F0: 0000  lsls r0 r0 #0
200003F2: 0000  lsls r0 r0 #0
200003F4: BD00  pop { pc }
ok.
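The ldr instructions find their constants relative to the program counter, which is why the constants block has to be pasted along with the code. The following lines are a small sketch of that arithmetic at the Forth prompt, using addresses from the listing above. It relies on two points: the Thumb rule that the base is the instruction address plus 4 with the lowest two bits cleared, and the fact that see prints the offsets in hex, so #24 here is the #36 of the objdump listing.

```forth
\ literal address = ((instruction address + 4) with bits 0 and 1 cleared) + offset
$200003B0 4 + 3 bic $24 + hex.   \ first ldr r1:      200003D8
$200003CA 4 + 3 bic $18 + hex.   \ ldr r1 at 200003CA: 200003E4
```

Both results match the Literal addresses printed by see. Because the low bits of the base are cleared, the pasted code probably needs to start on the same 4-byte alignment it had in the assembled file. It does here (offset 8 in the .list file and 200003B0 in RAM), but this is an inference from the arithmetic and is worth checking with see whenever the literals look wrong.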

Test the Blinky Word

blinky

Note

It works fine and the blue LED blinks, but it is an endless loop, so the board has to be reset to stop the blinking.

Comparison to a Blinky written in Forth

How does the assembly Blinky compare in code size with a native Forth Blinky?

Note

Answer: the Forth version is about twice the size in bytes.

Native Forth Blinky

Certainly much easier to write, only 8 lines of code. I pasted the bitfields, complete with memory mappings, from svd2forth-v4 (not ready for release yet), making this a simple task.

This ran correctly on the first try. The code is not the most readable, as I tried to emulate the assembly code layout as closely as possible.

: blinky ( -- )                       \ enable GPIOC-8 as an OUTPUT here for fair comparison to blinky-asm
   %1  19  lshift $40021014  bis!     \ RCC_AHBENR_IOPCEN; I/O port C clock enable
   %01 16  lshift $48000800  bis!     \ GPIOC_MODER_MODER8 set as OUTPUT
   begin                              \ endless loop
      %1  8  lshift $48000818   bis!  \ GPIOC_BSRR_BS8, turn on blue led
      $fffff 0 do loop                \ delay $fffff
      %1  24  lshift $48000818  bis!  \ GPIOC_BSRR_BR8, turn off blue led
      $fffff 0 do loop                \ delay $fffff
   0 until
 ;

Native Forth Blinky Assembly Listing

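The listing runs from 2000040A to 20000493, which is 138 bytes including the push {lr} and pop {pc} (134 bytes without them), making it roughly double the size of the assembly Blinky (68 bytes).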

see blinky
2000040A: 2080  movs r0 #80
2000040C: 0340  lsls r0 r0 #D
2000040E: 3084  adds r0 #84
20000410: 0280  lsls r0 r0 #A
20000412: 6943  ldr r3 [ r0 #14 ]
20000414: 2280  movs r2 #80
20000416: 0312  lsls r2 r2 #C
20000418: 4313  orrs r3 r2
2000041A: 6143  str r3 [ r0 #14 ]
2000041C: 2090  movs r0 #90
2000041E: 04C0  lsls r0 r0 #13
20000420: 3080  adds r0 #80
20000422: 0100  lsls r0 r0 #4
20000424: 6803  ldr r3 [ r0 #0 ]
20000426: 2280  movs r2 #80
20000428: 0252  lsls r2 r2 #9
2000042A: 4313  orrs r3 r2
2000042C: 6003  str r3 [ r0 #0 ]
2000042E: B500  push { lr }
20000430: 2090  movs r0 #90
20000432: 04C0  lsls r0 r0 #13
20000434: 3080  adds r0 #80
20000436: 0100  lsls r0 r0 #4
20000438: 6983  ldr r3 [ r0 #18 ]
2000043A: 2280  movs r2 #80
2000043C: 0052  lsls r2 r2 #1
2000043E: 4313  orrs r3 r2
20000440: 6183  str r3 [ r0 #18 ]
20000442: 3F04  subs r7 #4
20000444: 603E  str r6 [ r7 #0 ]
20000446: B430  push { r4  r5 }
20000448: 2400  movs r4 #0
2000044A: 25FF  movs r5 #FF
2000044C: 022D  lsls r5 r5 #8
2000044E: 35FF  adds r5 #FF
20000450: 012D  lsls r5 r5 #4
20000452: 350F  adds r5 #F
20000454: CF40  ldmia r7 { r6 }
20000456: 3401  adds r4 #1
20000458: 42AC  cmp r4 r5
2000045A: D1FC  bne 20000456
2000045C: BC30  pop { r4  r5 }
2000045E: 2090  movs r0 #90
20000460: 04C0  lsls r0 r0 #13
20000462: 3080  adds r0 #80
20000464: 0100  lsls r0 r0 #4
20000466: 6983  ldr r3 [ r0 #18 ]
20000468: 2280  movs r2 #80
2000046A: 0452  lsls r2 r2 #11
2000046C: 4313  orrs r3 r2
2000046E: 6183  str r3 [ r0 #18 ]
20000470: 3F04  subs r7 #4
20000472: 603E  str r6 [ r7 #0 ]
20000474: B430  push { r4  r5 }
20000476: 2400  movs r4 #0
20000478: 25FF  movs r5 #FF
2000047A: 022D  lsls r5 r5 #8
2000047C: 35FF  adds r5 #FF
2000047E: 012D  lsls r5 r5 #4
20000480: 350F  adds r5 #F
20000482: CF40  ldmia r7 { r6 }
20000484: 3401  adds r4 #1
20000486: 42AC  cmp r4 r5
20000488: D1FC  bne 20000484
2000048A: BC30  pop { r4  r5 }
2000048C: 2300  movs r3 #0
2000048E: 2B00  cmp r3 #0
20000490: D0CE  beq 20000430
20000492: BD00  pop { pc }
 ok.
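The byte counts can be checked at the Forth prompt by subtracting the first address of each see listing from the address just past its last instruction. This is only a sketch of the arithmetic; the addresses are the ones printed on this page and will differ on another system.

```forth
$200003F4 $200003B0 - .   \ assembly Blinky, push and pop excluded: 68
$20000494 $2000040A - .   \ native Forth Blinky, whole listing:     138
```

The second figure covers the whole listing, so it includes the two framing instructions (4 bytes) that the first figure leaves out.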

Native Forth Blinky Written for Readability

: RCC_AHBENR_IOPCEN  19  lshift $40021014 ;  \ GPIOC Enable Bit
: GPIOC_MODER_MODER8 16  lshift $48000800 ;  \ GPIOC-8 is connected to blue led
: LEDON  %1  8  lshift $48000818   bis! ;    \ GPIOC_BSRR_BS8, turn on blue led
: LEDOFF %1  24  lshift $48000818  bis! ;    \ GPIOC_BSRR_BR8, turn off blue led
: DELAY $fffff 0 do loop ;                   \ delay $fffff


: blinky ( -- )
    %1  RCC_AHBENR_IOPCEN  bis!        \ enable I/O port C clock
    %01 GPIOC_MODER_MODER8  bis!       \ set GPIOC_MODER_MODER8 as OUTPUT
    begin                              \ endless loop
       LEDON
         DELAY
       LEDOFF
         DELAY
    0 until
;
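Unlike the machine code version, the Forth loop is easy to make stoppable without a reset. The sketch below is a hypothetical variant, blinky-key, that replaces "0 until" with "key? until", so the loop ends once a character arrives on the terminal. It assumes the same register addresses used above and is written out in full so that it does not depend on the helper Words.

```forth
\ Hypothetical variant: blink until a key is pressed on the terminal.
: blinky-key ( -- )
   %1  19 lshift $40021014 bis!      \ RCC_AHBENR_IOPCEN, I/O port C clock enable
   %01 16 lshift $48000800 bis!      \ GPIOC_MODER_MODER8 set as OUTPUT
   begin
      %1  8 lshift $48000818 bis!    \ GPIOC_BSRR_BS8, turn on blue led
      $fffff 0 do loop               \ delay $fffff
      %1 24 lshift $48000818 bis!    \ GPIOC_BSRR_BR8, turn off blue led
      $fffff 0 do loop               \ delay $fffff
   key? until                        \ leave the loop once a key is waiting
;
```

The key is only tested once per on/off cycle, so the Word returns at the end of the current blink rather than immediately.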