Mecrisp only implements the minimal serial interface required, i.e. USART1 with polled I/O. This is very limited, because the serial port has no buffering capability: if we don’t poll it often enough (over 11,000 times per second at 115200 baud!), we risk losing incoming data.
The standard solution for this is interrupts: by enabling the RX interrupt, we can fetch each byte from the USART before the next one arrives. Although this merely moves the problem around, we can then add a larger buffer in software to store that input data until it’s actually needed.
Let’s implement this - it’s a nice example of how to make hardware and software work together:
- to avoid messing up the only communication we have to Forth, i.e. USART1, we’ll be much better off developing this first for USART2 - as changing the values to adapt it to USART1 will be trivial once everything works
- we’re going to need some sort of buffer, implemented here as a generic “ring buffer”
- we need to set up the USART2 hardware; the easiest way is to start off in polled mode
- lastly, we’re going to add an interrupt-handling structure which ties everything together
Circular buffering
What we want for the incoming data is a FIFO queue, i.e. the incoming bytes are pushed in at one end of the buffer, and then pulled out in arrival order from the other end.
A ring buffer is really easy to implement - this Forth implementation is a mere 16 lines of code. Its public API is as follows - for initialisation, pushing a byte in, and pulling a byte out:
: init-ring ( addr size -- ) \ initialise a ring buffer
: >ring ( b ring -- ) \ save byte to end of ring buffer
: ring> ( ring -- b ) \ fetch byte from start of ring buffer
We also need to deal with “emptiness” and avoiding overrun:
: ring# ( ring -- u ) \ return current number of bytes in the ring buffer
: ring? ( ring -- f ) \ true if the ring can accept more data
Ring buffers are simplest when the size of the ring is a power of two (because modulo 2^N arithmetic can then be done using a bit mask).
Setup requires a buffer with 4 extra bytes:
128 4 + buffer: myring
myring 128 init-ring
With this out of the way, we now have everything needed to buffer up to 127 bytes of input data.
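To make the idea concrete, here is a minimal sketch of how such a ring could be built. It is not the implementation referred to above: the rb- names and the layout of the 4 housekeeping bytes (mask, put index, get index, one spare) are assumptions made for illustration. One slot is always left free so that a full ring can be told apart from an empty one, which would explain why a 128-byte ring holds at most 127 bytes.

```forth
\ Assumed layout: addr+0 = mask (size-1), addr+1 = put index,
\ addr+2 = get index, addr+3 = spare, addr+4.. = data bytes.

: rb-init ( addr size -- )  \ size must be a power of two, 2..256
  1- over c!                \ store the mask in byte 0
  0 over 1+ c!              \ put index = 0
  0 swap 2 + c! ;           \ get index = 0

: rb-count ( rb -- u )      \ number of bytes currently stored
  dup 1+ c@                 \ put index
  over 2 + c@ -             \ minus get index
  swap c@ and ;             \ wrap the difference with the mask

: rb-room? ( rb -- f )      \ true while at least one more byte fits
  dup rb-count swap c@ < ;

: rb-data ( rb idx -- addr )  \ data address for a raw (unmasked) index
  over c@ and + 4 + ;

: rb-put ( b rb -- )        \ append a byte; caller checks rb-room? first
  dup >r
  dup 1+ c@ rb-data c!      \ store the byte at the put slot
  r> 1+ dup c@ 1+ swap c! ; \ then advance the put index (wraps at 256)

: rb-get ( rb -- b )        \ remove the oldest byte; caller checks rb-count first
  dup dup 2 + c@ rb-data c@ \ read the byte at the get slot
  swap 2 + dup c@ 1+ swap c! ; \ then advance the get index (wraps at 256)

16 4 + buffer: demo-ring
demo-ring 16 rb-init
65 demo-ring rb-put  66 demo-ring rb-put
demo-ring rb-count .   \ 2
demo-ring rb-get .     \ 65
demo-ring rb-get .     \ 66
```

The indices are stored as free-running bytes and only masked when used, so the wrap-around costs a single "and". Storing the data byte before advancing the put index means a reader never sees a count that includes a byte which has not been written yet.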
USART hardware driver
Setting up a hardware driver is by definition going to be hardware-specific. Here is a complete implementation for the STM32F103 µC series:
$40004400 constant USART2
USART2 $00 + constant USART2-SR
USART2 $04 + constant USART2-DR
USART2 $08 + constant USART2-BRR
USART2 $0C + constant USART2-CR1
: uart-init ( -- )
  OMODE-AF-PP OMODE-FAST + PA2 io-mode!  \ PA2 = USART2 TX
  OMODE-AF-PP PA3 io-mode!               \ PA3 = USART2 RX
  17 bit RCC-APB1ENR bis!  \ set USART2EN
  $138 USART2-BRR !        \ set baud rate divider for 115200 Baud at PCLK1=36MHz
  %0010000000001100 USART2-CR1 ! ;  \ set UE (bit 13), TE (bit 3), RE (bit 2)
: uart-key? ( -- f ) 1 5 lshift USART2-SR bit@ ;  \ RXNE: a received byte is waiting
: uart-key ( -- c ) begin uart-key? until USART2-DR @ ;
: uart-emit? ( -- f ) 1 7 lshift USART2-SR bit@ ;  \ TXE: transmit register is free
: uart-emit ( c -- ) begin uart-emit? until USART2-DR ! ;
This consists of a few constant definitions to access the real hardware inside the STM32F103 chip (as gleaned from the datasheet), some tricky initialisation code, and then the four standard Forth routines to check for and actually read or write bytes.
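The $138 baud rate divider deserves a word of explanation. On the STM32F103, the BRR register holds the divider as a fixed-point number with 4 fraction bits, so the raw register value works out to the peripheral clock divided by the baud rate: 36,000,000 / 115,200 = 312.5, truncated to 312, which is $138. A small sketch of that calculation follows; the names PCLK1-HZ and baud>brr are made up for this example, and the 36 MHz clock is an assumption taken from the comment in the code.

```forth
36000000 constant PCLK1-HZ  \ assumption: APB1 clock runs at 36 MHz

: baud>brr ( baud -- u )    \ raw value for USART2-BRR, fraction truncated
  PCLK1-HZ swap / ;

115200 baud>brr .           \ prints 312, i.e. $138
```

With a different clock setup, only the constant needs to change.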
It’s fairly tricky to get this going, but a test setup is extremely simple: just connect PA2 and PA3 to create a “loopback” test, i.e. all data sent out will be echoed back as new input.
During development, it’s useful if we can quickly inspect the values of all the hardware registers. Here’s a simple way to do that:
: uart. ( -- )
  cr ." SR " USART2-SR @ h.4
  ." BRR " USART2-BRR @ h.4
  ." CR1 " USART2-CR1 @ h.4 ;
Now, all we need to do to see the registers is to enter “uart.”:
uart.
SR 00C0 BRR 0138 CR1 200C ok.
That’s after calling uart-init. Right after reset, the output would look like this instead:
SR 0000 BRR 0000 CR1 0000 ok.
To test this new serial port with the loopback wire inserted, we can now enter:
uart-init uart-key? . 33 uart-emit uart-key? . uart-key . uart-key? .
The output will be (note that in Forth, false = 0 and true = -1):
0 -1 33 0 ok.
In other words: at first there is no input, then we send one byte, now there is input, we get it and print it, and after that there is again no input.
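A slightly longer loopback check can be sketched with the same four words. The name uart-loop-test is made up for this example. It sends the letters A to Z one at a time and prints each byte as it comes back, so with the wire in place it should print the alphabet. Without the wire, uart-key never sees any input and the word will hang.

```forth
: uart-loop-test ( -- )  \ send A..Z one at a time and print each echoed byte
  [char] Z 1+ [char] A do
    i uart-emit          \ send one byte out on PA2
    uart-key emit        \ wait for it to return on PA3, then print it
  loop ;

uart-init uart-loop-test
```

This only works because each byte is read back before the next one is sent: in polled mode there is nowhere to park a second incoming byte.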
Enabling input interrupts
So far so good, but there is no interrupt handling yet. We now have a second serial port, but unless we poll it constantly, it’ll still “overrun” and lose characters. Let’s fix that next.
Here is the implementation of an extra layer around the above ring and uart code:
128 4 + buffer: uart-ring
: uart-irq-handler ( -- ) \ handle the USART receive interrupt
  USART2-DR @ \ will drop input when there is no room left
  uart-ring dup ring? if >ring else 2drop then ;
$E000E104 constant NVIC-EN1R \ IRQ 32 to 63 Set Enable Register
: uart-irq-init ( -- ) \ initialise the USART2 using a receive ring buffer
  uart-init
  uart-ring 128 init-ring
  ['] uart-irq-handler irq-usart2 !
  6 bit NVIC-EN1R ! \ enable USART2 interrupt 38
  5 bit USART2-CR1 bis! \ set RXNEIE
;
: uart-irq-key? ( -- f ) \ input check for interrupt-driven ring buffer
  uart-ring ring# 0<> ;
: uart-irq-key ( -- c ) \ input read from interrupt-driven ring buffer
  begin uart-irq-key? until uart-ring ring> ;
This sets up a 128-byte ring buffer and initialises USART2 as before.
Then, we set up an “interrupt handler” and tie it to the USART2 interrupt (this requires Mecrisp 2.2.2, which is currently still in beta).
The rest is automatic: as if by magic, every new input character will end up being placed in the ring buffer, and so our key? and key code no longer accesses the USART itself - instead, we now treat the ring buffer as the source of our input data.
Interrupts require great care in terms of timing, because interrupt code can run at any time - including exactly while we’re checking for new input in our application code! In this case, it’s all handled by the ring buffer code, which has been carefully written to avoid any race conditions.
Note that interrupts are only used for incoming data; the outgoing side continues to operate in polled mode. The reason is that we cannot control when new data comes in, whereas slow output will simply throttle our data-sending code. If we don’t deal with input quickly, we lose it - whereas if we don’t keep the output stream going at full speed, it’ll merely come out of the chip a little later.
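Here is a small sketch of how the two sides can be combined with the loopback wire still in place. The words uart-type, uart-irq-drain and hello are made up for this example; everything else comes from the code above. The string goes out in polled mode, while the interrupt handler quietly collects the echoed bytes in the background.

```forth
: uart-type ( addr len -- )  \ send a string, one polled byte at a time
  0 ?do dup c@ uart-emit 1+ loop drop ;

: uart-irq-drain ( -- )      \ print whatever the ring buffer holds right now
  begin uart-irq-key? while uart-irq-key emit repeat ;

: hello ( -- ) s" Hello, ring!" uart-type ;

uart-irq-init hello uart-irq-drain
```

The final byte or two may still be in transit when uart-irq-drain runs, in which case a second uart-irq-drain picks them up. Nothing is lost either way, which is exactly the difference compared to polled input.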
What’s the point?
You might wonder what we’ve actually gained with these few dozen lines of code.
Without interrupts, at 115200 baud, there’s potentially one byte of data coming in every 86.8 µs. If we don’t read it out of the USART hardware before the next data byte is ready, it will be lost.
With a 128-byte ring buffer, the data will be saved up, and even with a full-speed input stream, we only need to check for data and read it (all!) out within 11 milliseconds. Note that - in terms of throughput - nothing has changed: if we want to be able to process a continuous stream of input, we’re going to have to deal with 11,520 bytes of data every second. But in terms of response time, we can now spend up to 11 ms processing the previous data, without worrying about new input.
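These figures are easy to check at the Forth prompt. The sketch below assumes the usual framing of 10 bit times per byte (1 start bit, 8 data bits, 1 stop bit) and the 127 usable bytes mentioned earlier; results are in microseconds and truncated by the integer division.

```forth
10 1000000 * 115200 / .        \ 86     microseconds per byte (86.8 before truncation)
127 10 * 1000000 * 115200 / .  \ 11024  microseconds to fill an empty ring at full speed
```

Both intermediate products stay below 2^31, so the calculation is safe with 32-bit cells.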
For a protocol based on text lines for example, with no more than 80..120 characters each, this means our code can now operate in line-by-line mode without data loss.
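As an illustration, a line reader on top of the ring buffer could look roughly like this. It is only a sketch: the names LINE-MAX, line-buf and uart-irq-line are hypothetical, and it assumes that each line ends with a line feed (ASCII 10). Characters beyond the limit are discarded, and a carriage return, if present, is stored like any other character.

```forth
120 constant LINE-MAX            \ hypothetical limit, matching the 80..120 estimate
LINE-MAX buffer: line-buf        \ hypothetical buffer for one line of text

: uart-irq-line ( -- addr len )  \ block until a full line has arrived
  0                              \ running length
  begin
    uart-irq-key dup 10 <>       \ assumption: lines end with LF (ASCII 10)
  while
    over LINE-MAX < if
      over line-buf + c!  1+     \ store the char and bump the length
    else
      drop                       \ line too long: discard the excess
    then
  repeat
  drop line-buf swap ;

uart-irq-line type               \ wait for one line, then print it
```

While the application is busy with the line it just received, the interrupt handler keeps filling the ring with the next one.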
One use for this is the Mecrisp Forth command line. The built-in polled-only mode is not able to keep up with new input, which is why msend needs to carefully throttle itself to avoid overruns. With interrupts and a ring buffer, this could be adjusted to handle a higher-rate input stream.