Slow SPI on Artemis RedBoard driving ST7789 240x240 TFT LCD

I’m an old noob trying to port code from an Arduino Uno to the Artemis Redboard. The Redboard seems to be much faster running code, but the SPI is noticeably slower, so much so as to be almost unusable. I’ve timed it at 5X slower using the Adafruit graphicstest code (with all delays removed). This is very apparent in screen fills/overwrites. To update a character on the screen you must first fill that area with black (fillRect) before writing the new character. On the Uno this is very fast and works well. On the Redboard, it’s slow and distracting.

The TFT LCD is a 240x240 using the ST7789 driver. I’m using the standard Adafruit graphics libraries (Adafruit_GFX.h and Adafruit_ST7789.h) and calls. The code and pins (SCK pin 13, MOSI pin 11) are identical for the Uno and Redboard.

Is there something I need to do to increase the Redboard SPI speed? Has anyone else seen this?

Thx.

Ken

Did you ever get this resolved? I am trying to go down the same path, but I can’t get any GFX sketch to even compile. It seems to have to do with the myriad ways the library manipulates the IO pins based on processor architecture capabilities. Do you have any advice for that?

Hi

kjhall - sorry this one slipped through the cracks back in January. I think this should be mostly resolved, or at least much improved.

robin_hodgson - I’m not a big fan of the Adafruit GFX style of device support. It is messy and hard to read. However in many cases it makes for a big improvement.

Ultimately this issue comes down to the amount of overhead to complete a single SPI transaction. In the worst case scenario clearing a screen would consist of writing WxH pixels to a color (black). If the library does not have any optimizations then that could mean WxH SPI transactions - in which case overhead begins to add up.
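To make the transaction-count point concrete, here is a minimal sketch that runs on a desktop, not on the Artemis. `MockSpi`, `fillPerPixel` and `fillPerRow` are made-up names for illustration and are not part of any Arduino or Adafruit library. The mock only counts transactions, and it ignores the address-window setup a real display also needs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an SPI port: it only counts what it is asked to send.
struct MockSpi {
    std::size_t transactions = 0;
    std::size_t bytes = 0;

    // One call models one SPI transaction, whatever its length.
    void transfer(const std::uint8_t* data, std::size_t len) {
        assert(data != nullptr && len > 0);
        ++transactions;
        bytes += len;
    }
};

// Unoptimized fill: one transaction per 16-bit pixel.
void fillPerPixel(MockSpi& spi, int w, int h, std::uint16_t color) {
    const std::uint8_t px[2] = {static_cast<std::uint8_t>(color >> 8),
                                static_cast<std::uint8_t>(color & 0xFF)};
    for (int i = 0; i < w * h; ++i) {
        spi.transfer(px, sizeof px);
    }
}

// Buffered fill: build one row of pixels in RAM, then send it once per row.
void fillPerRow(MockSpi& spi, int w, int h, std::uint16_t color) {
    std::vector<std::uint8_t> row;
    row.reserve(static_cast<std::size_t>(w) * 2);
    for (int x = 0; x < w; ++x) {
        row.push_back(static_cast<std::uint8_t>(color >> 8));
        row.push_back(static_cast<std::uint8_t>(color & 0xFF));
    }
    for (int y = 0; y < h; ++y) {
        spi.transfer(row.data(), row.size());
    }
}
```

For a 240x240 panel the first version issues 57,600 transactions and the second issues 240, for the same 115,200 bytes on the wire. Whatever the fixed cost per transaction is, the second version pays it 240 times less often.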

Previously overhead on Artemis was fairly high. This was reduced a couple months ago for better SD card support / speeds. Those changes should cascade here.

"I’m not a big fan of the Adafruit GFX style of device support. It is messy and hard to read."

Neither am I, but this specific application needs to be written as a very standard Arduino app using standard libraries. Are there better LCD libraries out there?

Now that you mention it, I am reminded that the Ambiq HAL’s SPI routines have a pretty high setup cost. In a different project, I was working with an SPI LoRa device. If I remember right, I found that combining several individual SPI transfers to access sequential registers into one single transfer to multiple registers saved a lot of time on setup costs.
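A sketch of the same idea, assuming a device that auto-increments its register pointer during a multi-byte read. `MockDevice` is invented for illustration and is not modelled on any particular LoRa part or on the Ambiq HAL; it just counts how many transactions each approach needs.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical SPI peripheral with 8 registers and an auto-incrementing
// register pointer. Register contents are arbitrary test values.
struct MockDevice {
    std::array<std::uint8_t, 8> regs{{10, 11, 12, 13, 14, 15, 16, 17}};
    std::size_t transactions = 0;

    // One transaction: an address phase, then `len` bytes read back from
    // consecutive registers starting at `startReg`.
    void read(std::uint8_t startReg, std::uint8_t* out, std::size_t len) {
        assert(out != nullptr && len > 0);
        assert(startReg + len <= regs.size());
        ++transactions;
        for (std::size_t i = 0; i < len; ++i) {
            out[i] = regs[startReg + i];
        }
    }
};

// One transaction per register: pays the setup cost n times.
void readOneByOne(MockDevice& dev, std::uint8_t startReg,
                  std::uint8_t* out, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        dev.read(static_cast<std::uint8_t>(startReg + i), out + i, 1);
    }
}

// One burst covering all n registers: pays the setup cost once.
void readBurst(MockDevice& dev, std::uint8_t startReg,
               std::uint8_t* out, std::size_t n) {
    dev.read(startReg, out, n);
}
```

Reading four sequential registers returns the same data either way, but takes four transactions in the first form and one in the second.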

Exactly! That’s where the bulk of overhead was coming from. We’ve done what we can to minimize it but there will always be some.

I’m partial to the HyperDisplay library (GitHub: sparkfun/SparkFun_HyperDisplay, "Standardized library for control of displays and easy extension to new display families"). Though it supports far fewer displays currently, it is highly extensible. For one, it allows device drivers to override line and rectangle drawing operations, which can result in huge efficiency gains (not that that is unique to this library).

Thanks, I’ll check that library out.

In the meantime, I tried using SoftSPI instead of the normal hardware SPI just to see if my system might run a bit faster by avoiding all the IOM setup costs involved with a hardware transfer.

On the logic analyzer, I can see the soft SPI get right to work and laboriously bang out the bits in the byte being transferred. In contrast, the hard SPI transfer routine does nothing for a fairly long time while the HAL gets the SPI IOM hardware set up, then the IOM spits out the bits of the byte being transferred in a narrow sliver of frenzied activity, followed by another delay as it gets back to the calling program to send the next byte.

Amusingly, the net result is that the hard and soft mechanisms take very nearly the exact same amount of time to perform a 1 byte transfer. The only way to speed SPI transfers is to send more bytes per transfer so the setup time can be amortized across the entire transfer. But if you are trying to use a library like GFX that is very pixel-oriented and therefore performs a ton of short transfers, then it’s just going to be slow. As things stand, it takes nearly 3 seconds to write every pixel on a 128x128 display. That is pretty unusable from a UI standpoint.
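For scale, the figure above can be turned into a per-pixel cost with a couple of lines of arithmetic. The helper names are made up, and "nearly 3 seconds" is treated as exactly 3.0 s, so the result is only a rough estimate.

```cpp
#include <cassert>

// Average time per pixel, in microseconds, for a full-screen write.
double microsPerPixel(double totalSeconds, int width, int height) {
    assert(totalSeconds > 0.0 && width > 0 && height > 0);
    return totalSeconds * 1e6 / (static_cast<double>(width) * height);
}

// The same measurement expressed as pixels written per second.
double pixelsPerSecond(double totalSeconds, int width, int height) {
    assert(totalSeconds > 0.0 && width > 0 && height > 0);
    return (static_cast<double>(width) * height) / totalSeconds;
}
```

With `microsPerPixel(3.0, 128, 128)` this works out to roughly 183 microseconds per pixel, or about 5,500 pixels per second.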

After digging into this for the last couple of days, I have confirmed why things are so slow. The issue is that the original Arduino SPI library mechanisms assume that the underlying hardware supports single-byte SPI transfers that are fast and cheap. That assumption is carried on into the SPITft library, too. Finally, the GFX library defaults to working on a pixel-by-pixel basis. When the GFX library wants to write a pixel, it performs about 13 separate SPI transfers, each of which it assumes is fast, to get the pixel color data written to the display hardware. This can get really ugly: when you copy an RGB bitmap to the display, it gets copied pixel by pixel with 13 individual SPI transfers per pixel.

The bottom line is that single-byte transfers were cheap on an old AVR chip. The world has moved on though. Any modern chip with fancy silicon to manage its SPI port takes a certain amount of time to set up for a transfer regardless of the transfer length. The Apollo3 is no exception. As a result, a modern chip has a strong incentive to send as much data as possible per transfer to amortize the setup cost over as many bytes as possible.
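A simple cost model shows the amortization. It assumes each transfer costs a fixed setup time plus the time to clock out the bits; real hardware has other effects this ignores. The function names are made up, and the 15 microsecond setup time and 8 MHz clock in the note below are placeholders rather than measured Apollo3 values.

```cpp
#include <cassert>
#include <cstddef>

// Time for one transfer in microseconds: fixed setup plus 8 clock cycles per byte.
double transferMicros(double setupMicros, double clockHz, std::size_t bytes) {
    assert(setupMicros >= 0.0 && clockHz > 0.0 && bytes > 0);
    return setupMicros + static_cast<double>(bytes) * 8.0 * 1e6 / clockHz;
}

// Effective cost per byte once the setup time is spread over the whole transfer.
double microsPerByte(double setupMicros, double clockHz, std::size_t bytes) {
    return transferMicros(setupMicros, clockHz, bytes) / static_cast<double>(bytes);
}
```

With those placeholder numbers, a 1-byte transfer costs 16 microseconds per byte, while a 1024-byte transfer costs a little over 1 microsecond per byte. The longer the transfer, the closer the per-byte cost gets to the raw clock rate.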

With that in mind, I modified the SPITft code that displays a bitmap so that instead of doing it pixel by pixel (where each pixel requires 13 distinct SPI transfers), the logo data is set up in memory so that the modified driver can send it all in a single, massive SPI transfer. The original bitmap draw operation of my 110x128 pixel logo was taking 2.92 seconds. After my mods, it dropped to 11 ms, or more than 260 times faster. In addition to that change, I also modified the display address setup sequence so that it only needs 6 SPI transactions instead of 13. That approximately doubles the speed of any per-pixel operation.
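The post does not spell out which 13 transfers are involved, so the following is only one plausible breakdown that happens to reproduce both numbers. It assumes the column-set (0x2A), row-set (0x2B) and memory-write (0x2C) commands used by ST7789-style controllers. `MockBus`, `sendBlock` and `writePixel` are made-up names, and the mock just records what would go out on the wire.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::uint8_t CASET = 0x2A;  // column address set
constexpr std::uint8_t RASET = 0x2B;  // row address set
constexpr std::uint8_t RAMWR = 0x2C;  // memory write

// Hypothetical bus that counts transactions and records the bytes sent.
struct MockBus {
    std::size_t transactions = 0;
    std::vector<std::uint8_t> wire;

    void send(const std::uint8_t* data, std::size_t len) {
        assert(data != nullptr && len > 0);
        ++transactions;
        wire.insert(wire.end(), data, data + len);
    }
};

// Send a block either as one transaction or as one transaction per byte.
void sendBlock(MockBus& bus, const std::uint8_t* data, std::size_t len, bool grouped) {
    if (grouped) {
        bus.send(data, len);
        return;
    }
    for (std::size_t i = 0; i < len; ++i) {
        bus.send(data + i, 1);
    }
}

// Set a 1x1 address window at (x, y), then write one 16-bit color.
void writePixel(MockBus& bus, std::uint16_t x, std::uint16_t y,
                std::uint16_t color, bool grouped) {
    const std::uint8_t xHi = static_cast<std::uint8_t>(x >> 8);
    const std::uint8_t xLo = static_cast<std::uint8_t>(x & 0xFF);
    const std::uint8_t yHi = static_cast<std::uint8_t>(y >> 8);
    const std::uint8_t yLo = static_cast<std::uint8_t>(y & 0xFF);
    const std::uint8_t cols[4] = {xHi, xLo, xHi, xLo};  // start and end column
    const std::uint8_t rows[4] = {yHi, yLo, yHi, yLo};  // start and end row
    const std::uint8_t px[2] = {static_cast<std::uint8_t>(color >> 8),
                                static_cast<std::uint8_t>(color & 0xFF)};

    sendBlock(bus, &CASET, 1, grouped);
    sendBlock(bus, cols, sizeof cols, grouped);
    sendBlock(bus, &RASET, 1, grouped);
    sendBlock(bus, rows, sizeof rows, grouped);
    sendBlock(bus, &RAMWR, 1, grouped);
    sendBlock(bus, px, sizeof px, grouped);
}
```

Called with `grouped` false, this issues 13 single-byte transactions per pixel; with `grouped` true it issues 6, and the byte stream on the wire is identical. In this sketch each command byte stays in its own transaction, on the assumption that the display's data/command line has to change state between a command and its parameters.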

The bottom line is that the Artemis Apollo3 is plenty capable, but in this specific case, it really needs the SPITft/GFX display drivers to be rewritten with the goal of minimizing the number of SPI transfers.

To close this out: I made a new thread describing how to modify the GFX library so that the performance when using an Artemis/Apollo3 becomes pretty good. Details are here: viewtopic.php?f=169&t=53381