
Is there a way to get to the core of printf? I need USB output as fast as possible, but printf() is very slow, so I want to get rid of all the formatting that happens inside the function. Where is printf declared? It is most likely in the standard I/O library (stdio.h), but where can I find it? (I'm using Visual Studio Code.)

Background (Working on Raspberry Pi Pico with SDK lib – but not relevant for this question):

I want to read 16-bit hex values from SPI1 and write them to USB from within an interrupt in "realtime". So my program looks roughly like this:

Do nothing until SPI Interrupt triggers
Read data from SPI1
Write data to COM19 on PC via USB

The SPI data from the master comes in at a rate of 4 MHz, with a gap of about 480 ns between data words. One data packet takes about 4.5 µs (measured from bit 1 of word 1 to bit 1 of word 2), which means the interrupt must not take longer than 4.5 µs. The interrupt function with printf() takes about 9 µs to process, so that's too slow and some data is not caught. write() is able to finish within one cycle and only takes about 2 µs. The problem with it is that the data is printed as ASCII on the COM port (PuTTY). I disabled a lot of the SDK functions to optimize the runtime, and I was wondering whether I can do the same with printf(). It must be possible to write directly into the TX register and send it to the PC, but I wasn't able to find anything about this topic in my research.

Either I need to speed up the printf() function, or find some alternative, because it is clearly slowing down my program.

2 Answers


  1. How’s this for speed..

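    static const char hex[] = "0123456789ABCDEF";
    char buf[4];                      /* n is the 16-bit value read from SPI */
    buf[3] = hex[(n >>  0) & 0xF];    /* least significant nibble */
    buf[2] = hex[(n >>  4) & 0xF];
    buf[1] = hex[(n >>  8) & 0xF];
    buf[0] = hex[(n >> 12) & 0xF];    /* most significant nibble, sent first */
    write( fd, buf, sizeof buf );     /* 4 characters, no terminator or separator */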
    

    There’s ‘endianness’ to consider, but I can’t imagine you can get faster than this…

    EDIT: Then again, there’s the branchless version without any table. Something like this:

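    /* '0' + 7 skips the ASCII gap between '9' and 'A' for nibbles above 9 */
    buf[3] = (n      ) & 0xF, buf[3] += '0' + (buf[3] > 9) * 7;
    buf[2] = (n >>  4) & 0xF, buf[2] += '0' + (buf[2] > 9) * 7;
    buf[1] = (n >>  8) & 0xF, buf[1] += '0' + (buf[1] > 9) * 7;
    buf[0] = (n >> 12) & 0xF, buf[0] += '0' + (buf[0] > 9) * 7;  /* mask in case n is wider than 16 bits */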
    

    Those "2nd half" tweaks to the values look like they’d be amenable to a parallel operation in systems that might support that.
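For anyone who wants to check the two conversions against each other, here is a minimal sketch that wraps each one in a function. The names `hex16_table`, `hex16_branchless` and `nibble_to_hex` are made up for this illustration, and `n` is assumed to be a `uint16_t`. Which variant is actually faster on a given target would have to be measured.

```c
#include <stdint.h>

/* Table lookup: one array index per 4-bit nibble. */
static void hex16_table(uint16_t n, char out[4])
{
    static const char digits[] = "0123456789ABCDEF";
    out[0] = digits[(n >> 12) & 0xF];   /* most significant nibble first */
    out[1] = digits[(n >>  8) & 0xF];
    out[2] = digits[(n >>  4) & 0xF];
    out[3] = digits[n & 0xF];
}

/* Arithmetic conversion without a table: add '0', plus 7 more for
 * nibbles above 9, because 'A' is 7 code points past '9' + 1 in ASCII. */
static inline char nibble_to_hex(unsigned nib)
{
    return (char)('0' + nib + (nib > 9) * 7);
}

static void hex16_branchless(uint16_t n, char out[4])
{
    out[0] = nibble_to_hex((n >> 12) & 0xFu);
    out[1] = nibble_to_hex((n >>  8) & 0xFu);
    out[2] = nibble_to_hex((n >>  4) & 0xFu);
    out[3] = nibble_to_hex(n & 0xFu);
}
```

Both functions fill a 4-character buffer that is not NUL-terminated, so it can be passed straight to `write( fd, buf, 4 )` as in the snippet above.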

  2. Since you have not specified all of your requirements and environment conditions, this list collects just ideas. Some might work, others not. It all depends on conditions you did not provide in your question. If some of the terms are new to you, use them as a starting point for research.

    1. Use one or more tables to convert from binary to ASCII, as others suggested. This only speeds up the conversion.

    2. Don’t transmit ASCII, send binary values. An ASCII hex value as you probably use needs at least 5 characters, 4 for the number and 1 separator. The same value as binary has just 2 bytes. Additionally, you don’t need any conversion.

    3. Decouple the acquisition of the data from its transmission. For example, you can use a circular buffer big enough to absorb bursts while the transmitter keeps up with the average throughput. Be aware that you then have multiple threads of execution, so design your software carefully.
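Ideas 2 and 3 can be combined. The following is only a sketch under stated assumptions: a single producer (the SPI interrupt) and a single consumer (the main loop) running on the same core, with `volatile` indices being enough for that case; two cores would need proper memory barriers. All names (`ring_t`, `ring_push`, `ring_pop`, `ring_drain_binary`, `RING_SIZE`) are hypothetical and not part of any SDK.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE 1024u   /* hypothetical capacity; must be a power of two */

typedef struct {
    volatile uint16_t data[RING_SIZE];
    volatile uint32_t head;   /* written only by the producer (ISR) */
    volatile uint32_t tail;   /* written only by the consumer (main loop) */
} ring_t;

/* Producer side, meant to be called from the interrupt handler.
 * Returns false when the buffer is full and the sample was dropped. */
static bool ring_push(ring_t *r, uint16_t v)
{
    uint32_t head = r->head;
    if (head - r->tail == RING_SIZE)
        return false;
    r->data[head & (RING_SIZE - 1u)] = v;
    r->head = head + 1u;      /* publish only after the data is stored */
    return true;
}

/* Consumer side, meant to be called from the main loop. */
static bool ring_pop(ring_t *r, uint16_t *v)
{
    uint32_t tail = r->tail;
    if (tail == r->head)
        return false;         /* empty */
    *v = r->data[tail & (RING_SIZE - 1u)];
    r->tail = tail + 1u;
    return true;
}

/* Drain queued words into a byte buffer as raw binary, 2 bytes per
 * word, high byte first. Returns the number of bytes written. */
static size_t ring_drain_binary(ring_t *r, uint8_t *out, size_t out_cap)
{
    size_t n = 0;
    uint16_t v;
    while (n + 2 <= out_cap && ring_pop(r, &v)) {
        out[n++] = (uint8_t)(v >> 8);
        out[n++] = (uint8_t)(v & 0xFF);
    }
    return n;
}
```

The interrupt then only calls `ring_push()`, and the main loop repeatedly calls `ring_drain_binary()` and hands the resulting bytes to whatever transmit call is in use (for example the `write()` from the question). The receiving program has to agree on the byte order and needs some way to find word boundaries, since a raw binary stream has no separators.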

    In any case, make sure the receiving machine can handle the data rate. PCs are notorious for throughput that depends on system load, and they even "sleep" from time to time, because their OS is not a real-time operating system.
