I would not worry about the speed at this point. I am sure the speed can be improved once it is working.
The close coupling of the GDB Server to the JTAG interface should make it fast in the long term.
The problem with GDB Remote is that its native way of loading code is very slow. It sends one Word at a time and waits for an ACK after each one.
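To make that cost concrete, here is a rough sketch of what word-at-a-time loading looks like on the wire. The `$payload#checksum` framing and the `M addr,length:data` write-memory packet are standard GDB Remote; the function names and the 4-byte word size are just my assumptions for illustration.

```python
def rsp_frame(payload: str) -> str:
    """Wrap a payload in GDB Remote framing: $<payload>#<two-digit hex checksum>."""
    # The checksum is the sum of the payload characters, modulo 256.
    checksum = sum(payload.encode("ascii")) % 256
    return f"${payload}#{checksum:02x}"


def write_memory_packet(addr: int, data: bytes) -> str:
    """Build an 'M addr,length:XX...' write-memory packet (hex-encoded data)."""
    return rsp_frame(f"M{addr:x},{len(data):x}:{data.hex()}")


def word_at_a_time(addr: int, image: bytes, word_size: int = 4):
    """Yield one packet per word (hypothetical helper, word_size is assumed).

    The sender would wait for the '+' acknowledgement after each packet
    before sending the next one.
    """
    for offset in range(0, len(image), word_size):
        yield write_memory_packet(addr + offset, image[offset:offset + word_size])
```

A 16-byte image loaded this way becomes four separate packets, and each one costs a full round trip on the link before the next can go out.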
This can be solved in two different ways:
1. Bypass GDB by using a native (monitor) way of doing it.
2. Extend GDB’s functionality to do it natively.
The second method has the downside that the patch either has to be accepted upstream or has to be re-applied continually to each new GDB release.
The second method is the best in the long term, however. It would fit in better with the GDB Remote protocol, allowing all the data to pass over the same channel as the other GDB Remote information. It would also allow code to be sent more rapidly over serial ports to GDB Stub type systems.
I think a simple way of getting the code over quicker is to send the hex file over the link. It is pretty robust and split into convenient chunks. I know a hex file is over twice as big as the binary file, but, assuming the link is fast, the main speed-up comes from sending the data in chunks rather than one word at a time.
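As a sketch of why the hex file is robust and already chunked: assuming "hex file" here means Intel HEX, every line is a self-contained record with its own byte count, address, and checksum, so the receiving end can check and write each line independently. The function name below is hypothetical.

```python
def parse_hex_record(line: str):
    """Parse one Intel HEX record and return (address, record_type, data).

    Record layout after the leading ':' is
    count(1) address(2, big-endian) type(1) data(count) checksum(1).
    """
    line = line.strip()
    if not line.startswith(":"):
        raise ValueError("record must start with ':'")
    raw = bytes.fromhex(line[1:])
    # All bytes of a valid record, checksum included, sum to 0 modulo 256.
    if sum(raw) % 256 != 0:
        raise ValueError("checksum mismatch")
    count = raw[0]
    if len(raw) != count + 5:
        raise ValueError("length does not match byte count field")
    address = int.from_bytes(raw[1:3], "big")
    return address, raw[3], raw[4:4 + count]
```

A typical data record carries 16 or 32 bytes, so each line already moves several words per acknowledgement instead of one.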
If the link is a 480 Mbit/s USB link, I think code loading could be pretty swift.
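A back-of-the-envelope way to see where the time goes, as a sketch only: model the load as raw transfer time plus one round trip per chunk. The function and its parameters are my own assumptions, not measurements. It ignores the hex encoding overhead, and real USB throughput is below the 480 Mbit/s signalling rate.

```python
def load_time_seconds(image_bytes: int, chunk_bytes: int,
                      link_bits_per_s: float, round_trip_s: float) -> float:
    """Estimate load time as wire time plus one acknowledged round trip per chunk.

    All four parameters are illustrative inputs, not measured values.
    """
    chunks = -(-image_bytes // chunk_bytes)  # ceiling division
    wire_time = image_bytes * 8 / link_bits_per_s
    return wire_time + chunks * round_trip_s
```

On a fast link the wire time is tiny, so the round-trip term dominates and the total falls almost in proportion to the chunk size.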
Once you have a proof of concept, it can be taken further.