Benchmarks¶

The question of speed often crops up, so I’ve listed some comparisons of programming languages and MCUs below.

Programming Language Benchmark¶

Computing the same 100-term polynomial 500,000 times, smaller is faster.¶

I’ve borrowed the table below from http://dan.corlan.net/bench.html, which you will need to read for the full story.

Note

Performed on a 300 MHz Pentium running Debian GNU/Linux, so the results are fairly old.

Language	single body (s)	with call (s)
FORTRAN, g77 V2.95.4	2.73	2.73
Ada 95, gnat V3.13p	2.73	2.74
C, hand optimized, gcc V2.95.4	2.73
Java, gcj V3.0	3.03	15.53
D, gcc V4.0.3+	1)3.43	1)3.98
C, gcc V2.95.4	3.61	3.57
R translated to lisp using R2cl v0.1 and compiled with cmucl	3.69
Lisp, CMU Common Lisp V3.0.8 18c+, build 3030	4.69	10.69
Java, jikes V1.15 (bytecompiled)	8.23	13.54
FORTH, hand optimized Gforth 0.6.1	1)18.21
FORTH,** Gforth 0.6.1	1)27.26
Python** +psyco (interpreted)	1)168.50
Perl, more optimized$ V5.6.1 (natively compiled)	209.20
Perl, more optimized$ V5.6.1 (interpreted)	258.64
Perl, hand optimized*** V5.6.1 (bytecompiled)	306.18
Perl* V5.6.1 (natively compiled)	367.23
Python** V2.1.2 (interpreted)	505.50
Perl* V5.6.1 (bytecompiled)	515.04
RUBY*** (interpreted)	1074.52
R V1.5.1 (interpreted)	5662.64

Power Of A Language Or Performance Of An Implementation?¶

This is the big question to me:

What if the project itself doesn’t need the maximum processing speed of the target MCU, but rather requires the minimum development time? I prefer the interactivity of Forth because I can quickly determine hardware characteristics when beginning a new project. This saves me a lot of time correcting invalid assumptions in my code later on.

Benchmarking Different Forths/Mcus¶

Calculate the greatest common divisor of every pair of integers from 0 to 199 (a 200 by 200 loop). Download: GCD

Inspired by: http://weblambdazero.blogspot.com.au/search/label/forth

Benchmark: Calculate The Greatest Common Divisor For 0 to 200¶

Lower times are better; rows are ordered fastest first.

MCU	Clock (MHz)	Time (sec)	Comments
STM32F103	75	0.1505	Mecrisp-Stellaris RA 2.5.1 with M3 core for STM32F103
STM32F051	96 (overclocked)	0.3	Mecrisp-Stellaris RA 2.3.7 with M0 core for STM32F051
STM32F051	72 (overclocked)	0.516	Mecrisp-Stellaris RA 2.5.3 with M0 core for STM32F051
STM32F051	48	0.6	Mecrisp-Stellaris RA 2.3.7 with M0 core for STM32F051
STM32F051	8	2.59	Mecrisp-Stellaris RA 2.3.7 with M0 core for STM32F051
Atmega328	16	4	Flash Forth 5
STM8S003F3P6 ( From Flash )	16	6.4	optimized for size not speed
STM8S003F3P6 ( From Ram )	16	6.9	optimized for size not speed
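
Because the rows above run at different clock speeds, the raw times are hard to compare directly. One rough way to normalise them is to multiply time by clock frequency to get an approximate cycle count. The sketch below does that for a few rows of the table; the words `kcycles` and `.row` are hypothetical helpers introduced here, and the assumption that cycles = clock × time ignores flash wait states and other effects.

```forth
\ Sketch: approximate cycle counts from the table above,
\ assuming cycles = clock * time (an illustration, not a measurement).
\ Milliseconds times MHz gives thousands of cycles.
\ Needs cells wider than 16 bits, for example Gforth on a PC.

: kcycles ( ms mhz -- kcycles )  * ;

: .row ( ms mhz -- )  kcycles . ." kcycles" cr ;

 300 96 .row    \ STM32F051 at 96 MHz, 0.3 s
 600 48 .row    \ STM32F051 at 48 MHz, 0.6 s
2590  8 .row    \ STM32F051 at 8 MHz, 2.59 s
4000 16 .row    \ Atmega328 at 16 MHz with Flash Forth 5, 4 s
```

With the figures from the table, this gives about 28.8 million cycles for the STM32F051 at both 96 MHz and 48 MHz, about 20.7 million at 8 MHz, and about 64 million for the Atmega328.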

Atmega328¶

Forth	Clock (MHz)	Time (sec)
Flash Forth 5	16	4
AmForth 6.3	16	8
Yaffa Forth 0.6.1	16	70

Code¶

: gcd ( a b -- gcd )              \  15 25 gcd .
  begin
   dup
   while
     swap over mod
  repeat
  drop ;

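: bench ( -- )
  ms.counter.reset                \ restart the millisecond counter
  200 dup 0 do                    \ outer loop 0..199, keeps 200 on the stack
     dup 0 do                     \ inner loop 0..199
       j i gcd
       drop
     loop
  loop drop                       \ discard the leftover 200
  ms.print cr                     \ print the elapsed time
;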

bench
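
The words `ms.counter.reset` and `ms.print` are not defined in the listing above and are not standard Forth; presumably they are timing helpers specific to each target. To try the same loop on a desktop, here is a minimal sketch for Gforth, assuming its `utime` word, which returns the current time in microseconds as a double-cell number. The names `gcd-pairs` and `bench-gforth` are made up for this example.

```forth
\ Sketch for Gforth: the same benchmark loop, timed with utime.
\ utime ( -- ud ) is a Gforth word, not part of standard Forth.

: gcd ( a b -- gcd )
  begin dup while swap over mod repeat drop ;

: gcd-pairs ( -- )            \ gcd of every pair j,i in 0..199
  200 0 do
    200 0 do
      j i gcd drop
    loop
  loop ;

: bench-gforth ( -- )
  utime                       \ start time, double-cell microseconds
  gcd-pairs
  utime 2swap d-              \ elapsed = end - start
  d. ." us" cr ;

bench-gforth
```

A PC result is not comparable with the MCU figures above; the sketch is only useful for checking that the loop and `gcd` behave as expected before moving to hardware.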
