Benchmarks

The question of speed often crops up, so I’ve listed some comparisons of programming languages and MCUs below.

Programming Language Benchmark

Time taken to compute the same 100-term polynomial 500,000 times; smaller is faster.

I’ve borrowed the table below from http://dan.corlan.net/bench.html, which you will need to read for the full story.

Note: The tests were performed on a 300 MHz Pentium running Debian GNU/Linux, so the figures are fairly old.

| Language | single body (s) | with call (s) |
|---|---|---|
| FORTRAN, g77 V2.95.4 | 2.73 | 2.73 |
| Ada 95, gnat V3.13p | 2.73 | 2.74 |
| C, hand optimized, gcc V2.95.4 | 2.73 | |
| Java, gcj V3.0 | 3.03 | 15.53 |
| D, gcc V4.0.3+ | 3.43 1) | 3.98 1) |
| C, gcc V2.95.4 | 3.61 | 3.57 |
| R translated to lisp using R2cl v0.1 and compiled with cmucl | 3.69 | |
| Lisp, CMU Common Lisp V3.0.8 18c+, build 3030 | 4.69 | 10.69 |
| Java, jikes V1.15 (bytecompiled) | 8.23 | 13.54 |
| FORTH, hand optimized Gforth 0.6.1 | 18.21 1) | |
| FORTH,** Gforth 0.6.1 | 27.26 1) | |
| Python** +psyco (interpreted) | 168.50 1) | |
| Perl, more optimized$ V5.6.1 (natively compiled) | 209.20 | |
| Perl, more optimized$ V5.6.1 (interpreted) | 258.64 | |
| Perl, hand optimized*** V5.6.1 (bytecompiled) | 306.18 | |
| Perl* V5.6.1 (natively compiled) | 367.23 | |
| Python** V2.1.2 (interpreted) | 505.50 | |
| Perl* V5.6.1 (bytecompiled) | 515.04 | |
| RUBY*** (interpreted) | 1074.52 | |
| R V1.5.1 (interpreted) | 5662.64 | |

Power Of A Language Or Performance Of An Implementation?

This is the big question to me:

  • What if the project itself doesn’t need the maximum processing speed of the target MCU, but instead needs the minimum development time? I prefer the interactivity of Forth because it lets me quickly determine hardware characteristics when beginning a new project. This saves me a lot of time correcting invalid assumptions in my code later on.

Benchmarking Different Forths/MCUs

Calculate the greatest common divisor of every pair of numbers from 0 to 199 (200 × 200 = 40,000 gcd calls).

Download: GCD

Inspired by: http://weblambdazero.blogspot.com.au/search/label/forth

Benchmark: Calculate The Greatest Common Divisor For 0 to 200

Less time is better; rows are ordered with the fastest at the top.

| MCU | Clock (MHz) | Time (sec) | Comments |
|---|---|---|---|
| STM32F103 | 75 | 0.1505 | Mecrisp-Stellaris RA 2.5.1 with M3 core for STM32F103 |
| STM32F051 | 96 (overclocked) | 0.3 | Mecrisp-Stellaris RA 2.3.7 with M0 core for STM32F051 |
| STM32F051 | 72 (overclocked) | 0.516 | Mecrisp-Stellaris RA 2.5.3 with M0 core for STM32F051 |
| STM32F051 | 48 | 0.6 | Mecrisp-Stellaris RA 2.3.7 with M0 core for STM32F051 |
| STM32F051 | 8 | 2.59 | Mecrisp-Stellaris RA 2.3.7 with M0 core for STM32F051 |
| Atmega328 | 16 | 4 | Flash Forth 5 |
| STM8S003F3P6 (from flash) | 16 | 6.4 | optimized for size, not speed |
| STM8S003F3P6 (from RAM) | 16 | 6.9 | optimized for size, not speed |
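One rough way to compare rows that run at different clock speeds is to convert each time into clock cycles per gcd call. The sketch below is only an illustration: it assumes a Forth with 32-bit (or wider) cells such as Mecrisp-Stellaris or a desktop Gforth, it takes the 40,000 calls from the 200 × 200 loop in the bench word, and the word name cycles/call is made up for this example. The measured times include loop overhead, so the results are indicative rather than exact.

```forth
\ cycles per call = time [us] * clock [MHz] / number of gcd calls
\ */ keeps a double-width intermediate product, so the multiply cannot overflow.
: cycles/call ( time-us clock-mhz -- cycles )  40000 */ ;

150500  75 cycles/call .   \ STM32F103 at 75 MHz          -> 282
600000  48 cycles/call .   \ STM32F051 at 48 MHz          -> 720
2590000  8 cycles/call .   \ STM32F051 at  8 MHz          -> 518
4000000 16 cycles/call .   \ Atmega328, Flash Forth 5     -> 1600
```

Differences for the same chip at different clock speeds may come from flash wait states at the higher clocks or from rounding in the reported times.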

Atmega328

| Forth | Clock (MHz) | Time (sec) |
|---|---|---|
| Flash Forth 5 | 16 | 4 |
| AmForth 6.3 | 16 | 8 |
| Yaffa Forth 0.6.1 | 16 | 70 |

Code

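: gcd ( a b -- gcd )              \ Euclid's algorithm; try: 15 25 gcd .
  begin
    dup                           \ keep looping while b is non-zero
  while
    swap over mod                 \ ( a b -- b a-mod-b )
  repeat
  drop ;                          \ discard the zero, leaving the gcd

: bench ( -- )
  ms.counter.reset                \ start timing (word not defined in this listing)
  200 dup 0 do                    \ j = 0 .. 199
    dup 0 do                      \ i = 0 .. 199
      j i gcd
      drop
    loop
  loop drop
  ms.print cr                     \ print elapsed time (word not defined in this listing)
;

bench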
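The listing relies on two timing words, ms.counter.reset and ms.print, that are not defined in it. To try the benchmark on a desktop machine, the following is a minimal sketch of stand-ins for Gforth only. It assumes Gforth's utime word, which returns the time in microseconds as a double-cell number, and the variable name t0 is invented for illustration. These are not the definitions used on the MCUs above.

```forth
\ Gforth-only stand-ins for the timing words used by bench.
2variable t0                      \ start time in microseconds (double cell)

: ms.counter.reset ( -- )
  utime t0 2! ;                   \ remember the current time

: ms.print ( -- )
  utime t0 2@ d-                  \ elapsed microseconds as a double
  1000 um/mod nip                 \ convert to milliseconds, drop the remainder
  u. ." ms" ;
```

Load these definitions before the gcd and bench listing, since bench needs both words at compile time. On a modern desktop the loop will probably finish too quickly for millisecond resolution to be meaningful, so this is mainly useful for checking that the code runs.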