Do you have particular transfer speeds for particular targets in mind? Actually, I’m interested in the STR71x target, and OpenOCD is slow with it, at least for flash writes. For RAM writes or JTAG execution, the speed is acceptable. (However, I have no other point of comparison: I’m new to continuous JTAG use and more used to proprietary probes like ATMEL’s AVRISP.)
In the case of Linux/OpenOCD/STR7/Amontec JTAG/flash write, the slow speed has little to do with USB transfer speed or raw processing power. From what the oscilloscope shows, each JTAG query-answer exchange takes 6 ms, probably because of Linux scheduling, the libftd2xx sequence of system calls, etc. The easiest trick to gain speed is to move some of the processing done inside OpenOCD onto the target itself, instead of driving every step of the flash write through the USB bus. That way the 6 ms latency is no longer paid on each query.
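To put rough numbers on that, here is a back-of-the-envelope sketch in Python. Only the 6 ms round trip comes from the oscilloscope measurement; the chunk sizes (one 32-bit word per query versus a 1 KiB buffer handed to code running on the target) and the function name are assumptions chosen for illustration, not what OpenOCD actually does.

```python
ROUND_TRIP_S = 0.006  # 6 ms per JTAG query/answer, as seen on the oscilloscope


def flash_write_time(image_bytes, bytes_per_round_trip, round_trip_s=ROUND_TRIP_S):
    """Estimate the write time when every chunk costs one host round trip.

    Only the round-trip latency is modelled; transfer time and flash
    programming time are ignored.
    """
    round_trips = -(-image_bytes // bytes_per_round_trip)  # ceiling division
    return round_trips * round_trip_s


image = 64 * 1024  # 64k image, the size mentioned in this thread

# Assumption: the host pays one round trip per 32-bit word.
per_word = flash_write_time(image, 4)

# Assumption: the host uploads 1 KiB at a time and a small routine
# running on the target programs the flash from that buffer.
per_block = flash_write_time(image, 1024)

print(f"one word per query:  {per_word:.1f} s")
print(f"1 KiB block per query: {per_block:.2f} s")
```

Under these assumptions the latency alone costs well over a minute in the first case and well under a second in the second, which is why moving the inner loop onto the target looks attractive.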
Thinking about it again, I guess that the performance bottleneck is probably libftd2xx itself: if you strace(1) openocd, you see that each JTAG query makes a lot of system calls. For instance, memory is mapped and unmapped on every query, instead of being mapped once for as long as openocd runs. If you gprof(1) openocd, you’ll also see that some internal functions are called hundreds of thousands of times when you write 64k to the STR7 flash.
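If you want to quantify this from a saved strace log, a small tally script is enough. The sketch below is only an illustration: the function name is made up, the regular expression assumes the usual `name(args) = result` line layout, and the sample lines are invented for the example, not captured from a real openocd run.

```python
import re
from collections import Counter

# Matches the syscall name at the start of a strace line, optionally
# preceded by a PID column ("1234 ") or "[pid 1234] " (as with strace -f).
SYSCALL_RE = re.compile(r"^(?:\[pid\s+\d+\]\s*|\d+\s+)?([a-z_0-9]+)\(")


def count_syscalls(lines):
    """Return a Counter of syscall names found in strace output lines."""
    counts = Counter()
    for line in lines:
        match = SYSCALL_RE.match(line)
        if match:  # skips signal lines, exit messages, blank lines
            counts[match.group(1)] += 1
    return counts


# Invented sample lines, only to show the expected input format.
sample = [
    "mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f0000000000",
    "ioctl(5, 0x1234, 0x7ffc0000) = 0",
    "munmap(0x7f0000000000, 4096) = 0",
    "[pid 4242] mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f0000001000",
    "--- SIGCHLD {si_signo=SIGCHLD} ---",
]

counts = count_syscalls(sample)
for name, n in counts.most_common():
    print(f"{name:10s} {n}")
```

On a real log you would pass the open file object instead of `sample`. A high mmap/munmap count relative to the number of JTAG queries would support the guess above.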
So I guess there is still much to do at the software level before having to think about making dedicated hardware. To start, IMHO one should clearly find where the time is lost; that is still unclear at the moment, but libftd2xx would be a good starting point. Too bad that it is the only part of the software chain that is not open source…
I suspect that the FTDI chip’s clock of 6 MHz is actually very fast. The FTDI chip is not a dedicated JTAG device, so I suspect that it comes nowhere near achieving what could be done at 6 MHz.
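For a sense of scale, here is a rough upper bound in Python on what a 6 MHz JTAG clock could move. It assumes one payload bit per clock cycle and ignores all JTAG state-machine, addressing, and flash programming overhead, so a real transfer will be slower; the function name is just for illustration.

```python
def ideal_transfer_time(n_bytes, tck_hz):
    """Lower bound on transfer time: one payload bit per TCK cycle, no overhead."""
    return n_bytes * 8 / tck_hz


TCK_HZ = 6_000_000  # the 6 MHz clock mentioned above

ideal_64k = ideal_transfer_time(64 * 1024, TCK_HZ)
ideal_rate = TCK_HZ / 8  # bytes per second at one bit per clock

print(f"64k at 6 MHz: {ideal_64k * 1000:.1f} ms at best")
print(f"ideal rate:   {ideal_rate / 1000:.0f} kB/s")
```

So the clock itself would allow a 64k image in under a tenth of a second; anything far above that is overhead somewhere else in the chain.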
I think one way to get good performance would be to run the actual OpenOCD code on a fast microcontroller which has a CPLD attached.
Not everybody needs such fast performance, however. I am using the LPC2103 at the moment, which has only 32K of flash and 8K of RAM, so download speed is not a major issue at this time.
This would change if ARM926EJS support were added, as I would be using it on a platform with 8M of flash and 16M of RAM. Then having a faster version would be more interesting.