LPC2148 vs AT91SAM7X256, battle at 48MHz

I just completed the port of ChibiOS/RT to my Olimex SAM7-EX256, so I ran a speed comparison against the Olimex LPC-P2148.

Of course speed is not everything; there are other factors like:

  • Number and quality of peripherals.

  • Ease of use.

  • Memory.

  • Software support and documentation.

  • Errata.

  • Price and availability.

and so on.

Test setup:

AT91SAM7X256: clock 48054857 Hz, flash accesses with 1 wait state.

LPC2148: clock 48000000 Hz, flash accesses with 3 wait states, MAM mode 2.

Both processors have a timer enabled that generates an interrupt every 1 ms: the PIT on the Atmel, Timer0 on the NXP. BTW, I love the PIT; a fully featured timer is wasted on just generating periodic interrupts.
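For reference, here is a minimal sketch of how a PIT reload value for a 1 ms tick could be computed from the AT91SAM7X256 clock above. It assumes the PIT counts at MCK/16 and fires after PIV + 1 counts, with PIV being a 20-bit field, as I read the AT91SAM7 datasheet. `pit_piv` and `pit_period_ns` are made-up helper names for illustration, not ChibiOS/RT functions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the PIT counter is clocked at MCK/16. */
#define PIT_PRESCALER 16u

/* Hypothetical helper: PIV value for a given master clock and tick rate. */
uint32_t pit_piv(uint32_t mck_hz, uint32_t tick_hz)
{
    assert(tick_hz > 0u);
    uint32_t counts = mck_hz / PIT_PRESCALER / tick_hz;
    /* PIV is assumed to be 20 bits wide, so counts - 1 must fit in it. */
    assert(counts >= 1u && counts <= (1u << 20));
    return counts - 1u;
}

/* Hypothetical helper: tick period actually obtained, in nanoseconds. */
uint32_t pit_period_ns(uint32_t mck_hz, uint32_t piv)
{
    assert(mck_hz > 0u);
    uint64_t counts = (uint64_t)piv + 1u;
    return (uint32_t)(counts * PIT_PRESCALER * 1000000000ull / mck_hz);
}
```

With the 48054857 Hz clock, `pit_piv(48054857, 1000)` gives 3002, so the real tick period comes out a little under 1 ms (about 999.86 us) because of the integer division.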

ChibiOS/RT version 0.5.4 and GCC version 4.2.2 (YAGARTO) for both.

The GCC optimization level is -O2 in ARM mode and -Os in THUMB mode (defaults in ChibiOS/RT).

Note that the benchmark results are quite repeatable: the numbers vary by about +/-1 over many runs.

The results:

ChibiOS/RT 0.5.4 benchmark on AT91SAM7X256, ARM mode

*** Kernel Benchmark, context switch test #1 (optimal):
Messages throughput = 108463 msgs/S, 216926 ctxswc/S
*** Kernel Benchmark, context switch test #2 (no threads in ready list):
Messages throughput = 86020 msgs/S, 172040 ctxswc/S
*** Kernel Benchmark, context switch test #3 (04 threads in ready list):
Messages throughput = 86020 msgs/S, 172040 ctxswc/S
*** Kernel Benchmark, threads creation/termination:
Threads throughput = 65257 threads/S
*** Kernel Benchmark, I/O Queues throughput:
Queues throughput = 235772 bytes/S

ChibiOS/RT 0.5.4 benchmark on LPC2148, ARM mode

*** Kernel Benchmark, context switch test #1 (optimal):
Messages throughput = 136224 msgs/S, 272448 ctxswc/S
*** Kernel Benchmark, context switch test #2 (no threads in ready list):
Messages throughput = 108176 msgs/S, 216352 ctxswc/S
*** Kernel Benchmark, context switch test #3 (04 threads in ready list):
Messages throughput = 108175 msgs/S, 216350 ctxswc/S
*** Kernel Benchmark, threads creation/termination:
Threads throughput = 83739 threads/S
*** Kernel Benchmark, I/O Queues throughput:
Queues throughput = 335792 bytes/S

ChibiOS/RT 0.5.4 benchmark on AT91SAM7X256, THUMB mode

*** Kernel Benchmark, context switch test #1 (optimal):
Messages throughput = 90083 msgs/S, 180166 ctxswc/S
*** Kernel Benchmark, context switch test #2 (no threads in ready list):
Messages throughput = 77394 msgs/S, 154788 ctxswc/S
*** Kernel Benchmark, context switch test #3 (04 threads in ready list):
Messages throughput = 77394 msgs/S, 154788 ctxswc/S
*** Kernel Benchmark, threads creation/termination:
Threads throughput = 65082 threads/S
*** Kernel Benchmark, I/O Queues throughput:
Queues throughput = 237416 bytes/S

ChibiOS/RT 0.5.4 benchmark on LPC2148, THUMB mode

*** Kernel Benchmark, context switch test #1 (optimal):
Messages throughput = 89812 msgs/S, 179624 ctxswc/S
*** Kernel Benchmark, context switch test #2 (no threads in ready list):
Messages throughput = 77185 msgs/S, 154370 ctxswc/S
*** Kernel Benchmark, context switch test #3 (04 threads in ready list):
Messages throughput = 77188 msgs/S, 154376 ctxswc/S
*** Kernel Benchmark, threads creation/termination:
Threads throughput = 64831 threads/S
*** Kernel Benchmark, I/O Queues throughput:
Queues throughput = 235520 bytes/S
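Another way to read the context switch figures is as CPU cycles per switch, that is, the clock frequency divided by the switches per second. A quick sketch follows; the helper name is hypothetical, and since the benchmark also passes a message on each round trip, the result is an upper bound on the bare switch cost rather than an exact figure.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: CPU cycles spent per event, rounded to nearest. */
uint32_t cycles_per_event(uint32_t clock_hz, uint32_t events_per_s)
{
    assert(events_per_s > 0u);
    return (clock_hz + events_per_s / 2u) / events_per_s;
}
```

For test #1 in ARM mode this gives about 176 cycles per switch on the LPC2148 (48000000 / 272448) and about 222 on the AT91SAM7X256 (48054857 / 216926).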

Note that on the Atmel chip the performance is about the same in ARM and THUMB mode for thread creation and the I/O queues. THUMB mode is somewhat slower in the first 3 tests (roughly 10% to 17%) because the processor has to do extra work during the context switch: it has to switch from THUMB to ARM and then back to THUMB for each context switch.

The difference is huge on the LPC2148 instead: it is much faster in ARM mode than in THUMB mode, probably because of the MAM unit.

The LPC2148 is clearly the winner when running in ARM mode: it is about 26% faster than the AT91SAM7X in the context switch tests, 28% faster in thread creation and 42% faster in the I/O queues test. In THUMB mode the two chips have about the same performance. The LPC2148 is also able to clock a bit higher, so if you are looking for just speed then it is probably the best choice.
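The ARM mode comparison can be checked directly from the throughput numbers listed above. This is just a small sketch of the arithmetic; `speedup_percent` is a name I made up for the example.

```c
#include <assert.h>

/* Hypothetical helper: how much faster (in percent) throughput a is than b. */
double speedup_percent(double a, double b)
{
    assert(b > 0.0);
    return (a / b - 1.0) * 100.0;
}
```

For example, `speedup_percent(136224, 108463)` is about 25.6 (context switch test #1, LPC2148 vs AT91SAM7X256) and `speedup_percent(335792, 235772)` is about 42.4 (I/O queues).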

The Atmel chip seems to have an edge in peripherals: it is very rich and the DMA engine is really interesting. It also includes a 10/100 Ethernet port, which is missing on the LPC2148.

Nice comparison. One note tho: the AT91SAM7X256's counterpart is more the LPC23XX than the LPC2148. For example the LPC2378 has an Ethernet port, CAN and DMA just like the AT91SAM7X256, and it can be clocked at a higher frequency, up to 72MHz, while maintaining a lower price.

Regards!

Thanks for the post

free:
Nice comparison. One note tho: the AT91SAM7X256's counterpart is more the LPC23XX than the LPC2148. For example the LPC2378 has an Ethernet port, CAN and DMA just like the AT91SAM7X256, and it can be clocked at a higher frequency, up to 72MHz, while maintaining a lower price.

Regards!

Very true. The SAM7X DMA is different however: it does not rely on dedicated memory areas and is not limited to the USB/MAC. Most peripherals can use a DMA channel, including the USARTs and SPIs. This could be important in I/O intensive applications.

The Atmel chip also has a more complete USART that includes a lot of nice features, like multipoint and ISO7816 mode. The 7816 mode is important in applications using smart cards. It does not have the 16-byte deep FIFOs however, probably because it can use the DMA.

Anyway, LPC chips seem to be a better choice in general. I can’t wait to see the new family using the Cortex-M3 core.