An FPGA And Embedded Engineering Blog

Author: Kai

Hi and thanks for stopping by!

My name is Kai Schoos. I am a freelance FPGA and embedded engineer with an M.Sc. in computer science from Karlsruhe Institute of Technology and a strong focus on FPGA design, embedded systems and robotics. I have worked with devices from different vendors (Xilinx, Lattice, Nordic Semiconductor, STM, TI, Espressif) and am fluent in the relevant programming languages (Python, C, C++, and more). My deep understanding of and interest in both computing systems and software architecture enable me to quickly understand, extend and design new systems.

I am currently actively looking for clients, so if you are interested in my expertise, I'd love to get in touch.

nrf5340-DK vs nrf52-DK

The nrf5340-DK from Nordic Semiconductor just arrived in the mail. Let’s have a look at what sets it apart from the nrf52-DK featuring an nrf52832!

A breadboard with an nrf52-DK on the left and an nrf5340-DK on the right. A Saleae Logic Analyzer is trying to sneak into the picture.

The Dev-Board

The differences

Looking at the two boards side-by-side, the first thing we notice is that the nrf5340-DK is bigger. This is probably mostly due to the increased number of GPIO pins the nrf5340 supports: it has two GPIO ports instead of just one, for 48 I/O pins compared to the 32 of the nrf52832.

The nrf53-DK can also be run from a LiPo battery, and the power source is selected with a switch. In addition, the nrf53-DK has a USB connector that is directly interfaced to the target MCU. The ESD protection circuit features a PRTR5V0U2X, which looks interesting and will probably be on the BOM of my next PCB featuring a USB connector, but I digress…

The commonalities

For the rest, the two boards have a lot in common. Both feature an NFC connector as well as a PCB antenna and an SMA connector for adding an external antenna. Both have debug headers that can be used to program and debug external devices, and both have current measurement support, which comes in handy if we need to do current profiling of applications.

The nrf5340-DK actually features two nrf5340s! The first one (U1) is the one that’s actually connected to all the peripherals on the DK. The second one (U2) is only used for programming and debugging. Nordic calls this second MCU the interface MCU. This is similar to the nrf52-DK; however, the older revision of the nrf52-DK I am using still features a Microchip ARM MCU (ATSAM3U2CA) as an interface MCU, while on newer revisions you’d also find an nrf5340.

As usual, the board comes in an Arduino compatible design. Arduino shields are easily applied and used for prototyping. LEDs and Buttons can be found on both, even though they aren’t interfaced to the same pins.

The Documentation

I have to say, when it comes to documentation, Nordic is on top of its game. They provide a lot of detail while not overloading you with information. A single ZIP file containing all the production files is available for download, including Gerber files. Even pick-and-place files are in there, so these development boards are essentially open source hardware! There is also a PDF schematic in the ZIP file, and everything can be found with a simple Google search. (Here for the nrf52-DK and here for the nrf53-DK) That’s top of the line and I can’t see a big difference between the nrf52-DK and the nrf53-DK. Keep that up, Nordic!

The Processor

Now for the heart and brain of the boards, the processors. The nrf5340 is quite a lot beefier than the nrf52832.


Instead of a single processor, the nrf5340 features two Arm® Cortex®-M33 processors, where one is marketed as the “Application core” and one as the “Network core”. The application core comes with a lot of debug-ability, supporting different trace interfaces. It also contains an Arm TrustZone CryptoCell™-312 security subsystem that allows for hardware-accelerated computation of common cryptographic algorithms and PRNG.


While the nrf52832 only has a measly 512kB of flash and 64kB of RAM, the nrf5340 has 1MB of flash memory and 512(!)kB of RAM. Now that’s an improvement.


In terms of peripherals, there isn’t too much of a difference between the two, as both feature the usual suspects: SPI, I²C, I²S and UART. The nrf5340 has the already mentioned USB support and also supports QSPI. Most peripheral signals can be routed to almost any GPIO through the pin-select registers, so we aren’t forced to use certain pins for certain functions.

Wireless communication

Finally, in addition to the nrf52’s BLE, Bluetooth Mesh and ANT™ support, the nrf5340 also supports IEEE 802.15.4, which makes it ready for Matter, Thread and Zigbee, so it is even more applicable for the IoT devices of the coming decade.


The nrf52-DK and nrf53-DK have commonalities and differences. Both devices have their place, and knowing the differences is key to knowing when to use which device. In general, these devices are quality products, and if you can get away with an nrf52832, it should definitely be used. Having both at my disposal enables me to quickly evaluate solutions and find the most appropriate device for the task.

Driver development for Zephyr RTOS and nrfConnect – Part 1


In this tutorial we will look at how we can create a device driver for Zephyr RTOS.
Zephyr RTOS already supports a lot of different devices and it may already have drivers for your target. If that’s not the case, you have come to the right place. We will find out how we can create device drivers for Zephyr RTOS in a streamlined manner. The device that we will be targeting is the MS8607 sensor by TE Connectivity.

We will put the driver together with the source code of our main application. This is the simplest form of implementing the driver and it will serve us well. When projects become larger or if multiple projects depend on the same device, it may become necessary to move the driver out, however for small to medium projects, having the driver right there with the rest of the source is completely sufficient.

The Bare Minimum

The Necessary Files

After generating a hello-world project in VSCode using nrf Connect, the first thing we need to do is add several directories and files, such that our tree looks like the following (ignoring some of the pre-generated files):

├───dts
│   └───bindings
│       └───drivers
│           └───example,test-driver.yaml
├───src
│   └───driver
│       ├───test_driver.c
│       └───test_driver.h
└───nrf52dk_nrf52832.overlay

The attentive reader will see that this is built for the nrf52-DK featuring an nrf52832.

CMakeLists Entry

Before we forget it, let’s quickly add the c file we just added to the CMakeLists.txt file. Simply add the line

target_sources(app PRIVATE src/driver/test_driver.c)

to the bottom of the file and it will be included when building the project.

The Devicetree Bindings

Next, let’s create the simplest possible devicetree binding file. Edit dts/bindings/drivers/example,test-driver.yaml to contain the following:

description: Driver Test

compatible: "example,test-driver"

include: base.yaml

This will create a compatible that we will be able to use in our devicetree. By including base.yaml, our driver automatically inherits some important properties, such as status and compatible, which you will have come across already if you have used Zephyr RTOS before.

The Driver Source

We will create a multi-instance driver. This way, it is possible to have multiple instances of the same device in our project.

Let us fill out the barebones for src/driver/test_driver.c next:

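#include <zephyr/device.h>
#include <zephyr/devicetree.h>
#include <zephyr/kernel.h>

#define DT_DRV_COMPAT example_test_driver

static int driver_init(const struct device *dev) {
    printk("Initializing driver!\n");
    return 0;
}

/* Macro name and init priority are assumptions; adapt them as needed. */
#define TEST_DRIVER_DEFINE(inst) \
    DEVICE_DT_INST_DEFINE(       \
        inst,        \
        driver_init, \
        NULL, \
        NULL, \
        NULL, \
        APPLICATION, \
        CONFIG_APPLICATION_INIT_PRIORITY, \
        NULL);

/* Instantiate the driver once per devicetree node with status "okay". */
DT_INST_FOREACH_STATUS_OKAY(TEST_DRIVER_DEFINE)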

The line …

#define DT_DRV_COMPAT example_test_driver

… registers our driver with the compatible we defined earlier. Commas and hyphens are replaced by underscores and everything needs to be lowercase.
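To make the naming rule concrete, here is a small stand-alone sketch that applies it to a compatible string. The helper compat_to_token is hypothetical and only mirrors the rule described above; in a real driver you simply write the resulting token by hand.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: writes the DT_DRV_COMPAT-style token for `compat`
 * into `out` (at most out_len - 1 characters plus the terminator). */
void compat_to_token(const char *compat, char *out, size_t out_len)
{
    size_t i = 0;

    if (out_len == 0) {
        return;
    }
    for (; compat[i] != '\0' && i + 1 < out_len; i++) {
        unsigned char c = (unsigned char)compat[i];
        /* Letters are lowercased, digits are kept, the rest becomes '_'. */
        out[i] = isalnum(c) ? (char)tolower(c) : '_';
    }
    out[i] = '\0';
}
```

Feeding it "example,test-driver" yields example_test_driver, which is exactly the token used in the define.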

Next, a macro is defined that allows an instance to be initialized. The last line of the file, a call to DT_INST_FOREACH_STATUS_OKAY with that macro as its argument, runs the macro for each instance in the devicetree that has a status of okay. You will want to look at the documentation for DEVICE_DT_INST_DEFINE to see what the different entries stand for. A word of caution though: if you have several instances of the same device, the value for inst will not have any relationship with the ordering of the devices in the devicetree file, nor with the memory address of that device, so you should never rely on that directly.

We set the driver_init function as an initialization function for our driver and we choose to initialize the driver at APPLICATION level, which allows us to see the output of the printk function.

Now only a last change needs to be done before we can build, flash and reap the fruits of our work!

The Devicetree Overlay

In the file nrf52dk_nrf52832.overlay (or whatever board you are using), create an instance for our newly created driver in the soc node:

/ {
    soc {
        drivertest: driver_test0 {
            compatible = "example,test-driver";
            status = "okay";
        };
    };
};

Build and Flash

That’s it for the first part! Build the code by hitting Ctrl+Alt+B. You will probably come across the line…

node '/soc/driver_test0' compatible 'example,test-driver' has unknown vendor prefix 'example'

… which is due to the fact that there is no vendor called example in the file zephyr/dts/bindings/vendor-prefixes.txt in your zephyr installation. This is not an error though and everything compiles fine anyway.

After that, flash the code to your device and you’ll be greeted by the wonderful line…

Initializing driver!

We didn’t even have to add any code to main.c yet our driver is initialized and ready to be used. Next, we will look at adding a bit more functionality to the driver and making it actually useful.


That concludes the first part of the driver development! In the next part, we will look at how to choose an API for our driver and make it actually useful. Stay tuned!

Generating sine-waves using PWM – Sine pulse width modulation with an STM32 Cortex M0 device

Harmonic signals such as sine waves are needed in many applications. However, many microcontrollers do not feature a digital-to-analog converter (DAC) and therefore lack the ability to output analog voltages directly. Practically every microcontroller does come with a pulse width modulation (PWM) output, though, or at least has a hardware timer that can be programmed to create a PWM signal. The idea behind sine pulse width modulation (SPWM) is to generate pulses whose duty cycle corresponds to the amplitude of a sine wave of a given frequency at that point in time. The PWM signal is then smoothed with a capacitor and amplified to a voltage of interest using op-amps or a simple transistor amplifier.

The idea is to have a look-up table with a predefined number of sine-wave values. During the interrupt service routine, the pulse width of the next pulse is set to the current value from that look-up table. The STM32 HAL provides a callback function that can be implemented for this, called HAL_TIM_PWM_PulseFinishedCallback. This callback is called from within the interrupt service routine. The logic for computing the next index is outsourced to a second timer channel, which triggers the aforementioned interrupt handler every time a period is done.

First, we have to set up the timers. For that, we will use the STM32CubeIDE, which provides a nice visual interface for these things. We set timer 1, channel 1 to PWM Generation and channel 2 to Output Compare. The Prescaler is set to 1 and the Counter Period is set to PERIOD (remember to choose “No Check” in order to be able to set a non-hex value).

The look-up-table could for example look as follows:

#define PERIOD 32
#define DC_OFFSET (PERIOD >> 1)

The values used are briefly explained here:

– PERIOD: corresponds to the number of ticks the timer counts before it overflows. Having a lower period enables the generation of higher frequencies.

– DC_OFFSET: As we can only generate pulses with a width greater or equal to 0, we have to add half of the period as dc offset.

– PERIOD_FRACTION: A macro to compute the width of the pulse from a decimal value.

– sin[]: The actual look-up table with the sine-function samples. The number of samples should be as low as possible in order to keep the code size small. In addition to that, the number of samples also dictates the maximum frequency that can be generated, as every value must be held for at least one period. On the other hand, a small number of samples decreases the resolution and the reconstructability of the sinusoidal shape. The table shown is the one that I settled for after a lot of trial and error. One could obviously also get away with only storing a quarter period; however, that would increase the computation time inside of the ISR, so a time-space trade-off must be made.
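As an illustration of how these pieces could fit together, here is a minimal sketch of such a table. Everything beyond PERIOD and DC_OFFSET is an assumption made for the example: the table length NUM_SAMPLES, the names DUTY, sine_lut and lut_next, and the exact scaling are not the values I settled on, they only show the principle.

```c
#include <stdint.h>

#define PERIOD      32
#define DC_OFFSET   (PERIOD >> 1)
#define NUM_SAMPLES 8   /* assumption: chosen small to keep the example short */

/* Assumed scaling: map x in [-1, 1] to a compare value in [0, PERIOD].
 * The signed intermediate avoids converting a negative double to unsigned. */
#define DUTY(x) ((uint16_t)(DC_OFFSET + (int16_t)((x) * (PERIOD >> 1))))

/* One full sine period, sampled every 45 degrees. */
static const uint16_t sine_lut[NUM_SAMPLES] = {
    DUTY(0.0), DUTY(0.7071),  DUTY(1.0),  DUTY(0.7071),
    DUTY(0.0), DUTY(-0.7071), DUTY(-1.0), DUTY(-0.7071),
};

/* Returns the pulse width for the current step and advances the step. */
uint16_t lut_next(uint8_t *step)
{
    uint16_t value = sine_lut[*step];

    *step = (uint8_t)((*step + 1) % NUM_SAMPLES);
    return value;
}
```

With PERIOD set to 32, the table swings between 0 and 32 around the offset of 16, so the duty cycle covers the full range from 0 % to 100 %.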

The implemented callback functions look as follows:

void updatePWMPulse(uint16_t value) {
    htim1.Instance->CCR1 = value;
}

void HAL_TIM_PWM_PulseFinishedCallback(TIM_HandleTypeDef *htim) {
    if (htim == &htim1 && htim->Channel == HAL_TIM_ACTIVE_CHANNEL_2) {
        updatePWMPulse(sin[step]);  /* set the next pulse width */
        step = (step + 1) % num_steps;
    }
}

As STM32 timers only have a single set of these callback functions, we need to manually check if we were called by the one timer and channel of interest. If so, we simply set the pulse for the PWM signal and increase the step.

Note, that this is channel 2, which corresponds to a simple compare timer, which triggers every time a period ends. There are a number of ways to do this, though.

In our main function, we have to start the two channels of the timer:

HAL_TIM_PWM_Start_IT(&htim1, TIM_CHANNEL_1);
HAL_TIM_OC_Start_IT(&htim1, TIM_CHANNEL_2);

And that’s it for basic SPWM generation. Generating different frequencies can be done by either changing the period length, the prescaler or the number of sine-wave samples.
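To get a feeling for how these three knobs interact, the resulting sine frequency can be estimated with a few lines of C. This is a rough sketch that assumes the usual STM32 convention (the prescaler divides by PSC + 1 and the counter wraps after ARR + 1 ticks) and that exactly one table entry is output per PWM period; the function name spwm_frequency_hz is made up for this example.

```c
#include <stdint.h>

/* Estimated sine frequency in Hz.
 * timer_clk_hz: clock feeding the timer
 * psc, arr:     prescaler and counter period as entered in the configuration
 * num_samples:  number of entries in the sine look-up table */
double spwm_frequency_hz(uint32_t timer_clk_hz, uint32_t psc, uint32_t arr,
                         uint32_t num_samples)
{
    /* Frequency of the PWM carrier, i.e. how often a new pulse starts. */
    double pwm_hz = (double)timer_clk_hz / ((psc + 1.0) * (arr + 1.0));

    /* One full sine period needs num_samples pulses. */
    return pwm_hz / num_samples;
}
```

Halving the number of samples or the period roughly doubles the output frequency, at the cost of a coarser approximation of the sine wave.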


Getting Started with TinyFPGA

In this write-up, I am going to document my first steps using an FPGA.

The device that was used is a TinyFPGA BX, featuring a Lattice iCE40LP8K. This FPGA features 7680 logic elements, so-called LUT-4s. This means that every programmable logic element is a 4-input look-up table that can implement any possible 4-input logic function. These LUTs are the basic logic elements (BLEs) of the FPGA. Several BLEs are combined to form a configurable logic block (CLB), which is the basis of the configurability of FPGAs. The clock provided on the board runs at 16 MHz.

A great resource for FPGA architecture can be found under this link: FPGA Architecture.

The first thing was to install everything according to this link: TinyFPGA Setup
After the installation completed successfully, and the first program was synthesized on the FPGA, I could see the LED flashing according to an SOS pattern, just as the tutorial promised.

After reading a bit about Verilog, the language used in the TinyFPGA, I wanted to create my own first program. The idea of the program was a simple counter, that would get incremented whenever a button was pressed. The counter was rather simple:

module counter(trigger, out);

  input trigger;
  output out;

  wire trigger;
  reg out;

  reg [2:0] counter;

  initial begin
    out <= 0;
    counter <= 0;
  end

  always @(posedge trigger) begin
    if (counter == 7) begin
      counter <= 0;
      out <= 1;
    end
    else begin
      counter <= counter + 1;
      out <= 0;
    end
  end

endmodule

It’s a small module that contains a 3-bit wide register. This register is incremented on every positive edge of the trigger input. Whenever it hits the maximum value of 7 (3’b111 in Verilog-speak), the counter outputs a 1 and resets the counter to 0. Otherwise, the counter is just incremented and it outputs a 0.

Now I hooked up a button to one of the IO-Pins of the FPGA with a pull-up resistor and hooked the IO-Pin to the trigger of the counter. However, the mindful reader may have noticed that this will obviously not work. The reason for this is, that the button is not debounced.

Looking up a way to debounce a circuit in an FPGA, I came across this site: Debounce

Using the block diagram, I implemented the debounce logic in Verilog and tested it with a testbench.
The final result was the following code:

module button(clk, in, out);

  input clk;
  input in;
  output out;

  reg step_1;
  reg step_2;
  reg counter_clr;
  reg [19:0] counter;
  reg out;

  initial begin
    step_1 = 1'b0;
    step_2 = 1'b0;
    counter_clr = 1'b0;
    counter = 20'b0;
    out = 1'b0;
  end

  always @(posedge clk) begin
    step_1 <= in;
    step_2 <= step_1;
    counter_clr <= step_1 ^ step_2;

    if (counter_clr)
      counter <= 0;
    else if (counter < (1 << 19))
      counter <= counter + 1;
    else begin
      counter <= counter;
      out <= (counter == (1 << 19)) ? step_2 : out;
    end
  end

endmodule

The basic idea behind this logical circuit is, that there is a counter, which is cleared every time the input signal transitions from high to low or vice-versa. The output of the circuit is only set to the input value (or rather the input value one timestep ago), once the timer reaches a certain value.
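For readers who are more at home in C than in Verilog, the same logic can be modelled in software. The following is only a behavioural sketch: the names debounce_t, debounce_tick and debounce_run are made up, and the threshold is passed as a parameter so the model can be tried with small values instead of 1 << 19.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    bool     step_1, step_2, counter_clr, out;
    uint32_t counter;
} debounce_t;

/* Models one rising clock edge. All new values are computed from the old
 * state, which mimics the non-blocking assignments of the Verilog module. */
void debounce_tick(debounce_t *d, bool in, uint32_t threshold)
{
    const debounce_t old = *d;

    d->step_1      = in;
    d->step_2      = old.step_1;
    d->counter_clr = old.step_1 ^ old.step_2;

    if (old.counter_clr) {
        d->counter = 0;                /* input changed: start over */
    } else if (old.counter < threshold) {
        d->counter = old.counter + 1;  /* input stable: keep counting */
    } else {
        d->out = old.step_2;           /* stable long enough: update output */
    }
}

/* Applies the same input value for a number of clock edges. */
void debounce_run(debounce_t *d, bool in, uint32_t threshold, unsigned ticks)
{
    while (ticks--) {
        debounce_tick(d, in, threshold);
    }
}
```

Starting from an all-zero state, a few alternating inputs never let the counter reach the threshold, so out keeps its old value; only a sufficiently long run of the same input changes it.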

The value that was needed for this was computed as follows:
First, the button was “clicked” several times, and the signal was measured with a logic analyzer. The minimum time for which the button signal showed 0V was measured. This value was divided by two for good measure, then multiplied by the clock frequency of 16 MHz, giving the number of cycles. Finally, the closest power of 2 was used as a counter value, which turned out to be 2^19.
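The calculation above can be written down in a few lines of C. This is a sketch only; the function name debounce_exponent and the 65 ms used in the usage note are hypothetical and not my measured values.

```c
#include <stdint.h>

/* Returns k such that 2^k is the power of two closest to
 * (min_low_us / 2) * clk_mhz, i.e. half the shortest measured low time
 * expressed in clock cycles. */
unsigned debounce_exponent(uint32_t min_low_us, uint32_t clk_mhz)
{
    uint64_t cycles = (uint64_t)(min_low_us / 2) * clk_mhz;
    unsigned k = 0;

    if (cycles <= 1) {
        return 0;
    }
    /* Largest k with 2^k <= cycles. */
    while ((1ULL << (k + 1)) <= cycles) {
        k++;
    }
    /* Round up when cycles is closer to 2^(k+1) than to 2^k. */
    if (cycles - (1ULL << k) > (1ULL << (k + 1)) - cycles) {
        k++;
    }
    return k;
}
```

For a hypothetical shortest press of 65 ms on the 16 MHz clock, this gives 520000 cycles, and the closest power of two is 2^19 = 524288.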

These two modules were bundled together in the top module:

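module top(LED, PIN_2, CLK);

  output LED;
  input PIN_2;
  input CLK;
  wire OUT;  // debounced button signal (a module output must drive a wire)

  button btn2(
    .clk(CLK),
    .in(PIN_2),
    .out(OUT)
  );

  counter cnt(
    .trigger(OUT),
    .out(LED)
  );

endmodule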

The output of the debounced button is directly fed to the trigger input of the counter.
The output of the counter is then used to light an LED.

Now, clicking the button 8 times triggers the onboard LED!

I hope you found this first dabbling in FPGA synthesis interesting.
Please leave a comment and let me know about your thoughts!

Exploring the nrf52-DK – Timers

This first post dives into one of the most fundamental peripherals of any microprocessor: The timer. Let us examine this part very closely. All the information was gathered from looking through an example file and the datasheet for the nRF52832.

The code with the examples can be found in the following github gist:

To start, we need to find an example that makes use of the timer. After some digging I found the peripherals/ppi example to be a good candidate. Reading through the main function and going from there shows the general pattern that we should follow if we were to use this peripheral:

– Generate timer config (This looks interesting so we make a mental note)
– Initialize timer with that config and pass event handler callback
– Convert a period in ms to a number of ticks (A quick glance at the function reveals that it does what we’d expect)
– Configure the compare register (This also looks interesting)
– Enable the timer

The event handler is just a function that is called every time a certain event is triggered by the timer. We will see this in more depth shortly.

Now that we have a broad overview, let’s dig deeper into the different bits. The constant NRFX_TIMER_DEFAULT_CONFIG pops into view and it’s interesting to see what a “default config” for such a timer contains. We follow its definition and see that it is made up of more constants that are provided by the sdk_config.h.

The first parameter is the frequency. The maximum frequency is 16 MHz and it can be reduced by setting this parameter. It is interesting to note that when a frequency of 1 MHz or below is chosen, the slower clock PCLK1M (1 MHz internal clock) is used instead of PCLK16M (16 MHz internal clock) in order to reduce power consumption. This should be kept in mind when choosing the right timer values for certain tasks.
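The available frequencies follow a simple pattern: the 16 MHz base clock is divided by a power of two. The small sketch below assumes a 4-bit prescaler with values from 0 to 9, as described in the datasheet; the function names are made up for illustration and are not part of the SDK.

```c
#include <stdbool.h>
#include <stdint.h>

/* Timer tick frequency in Hz for a prescaler value between 0 and 9. */
uint32_t timer_frequency_hz(uint8_t prescaler)
{
    return 16000000u >> prescaler;  /* 16 MHz / 2^prescaler */
}

/* True when the resulting frequency is low enough for the 1 MHz clock. */
bool timer_uses_pclk1m(uint8_t prescaler)
{
    return timer_frequency_hz(prescaler) <= 1000000u;
}
```

A prescaler of 0 gives the full 16 MHz and a prescaler of 9 gives the slowest setting of 31.25 kHz; from a prescaler of 4 (1 MHz) onwards the slower clock source applies.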

Next, mode can be either Timer or Counter. When mode is configured to be “Timer”, the module acts like an actual timer. The timer’s internal counter register is incremented for every tick. The time between ticks is the period given by the timer frequency. However, when setting the mode to “Counter”, the internal counter register is incremented whenever the COUNT-task is triggered. We will see an example of both.

Bit width sets the width of the counter register that holds the number of ticks that have passed. It can be 8, 16, 24 or 32 bits wide. From this we can easily compute the longest timeline that these constraints allow. For example, 2^32 / 31.25 kHz ≈ 137439 seconds (38 hours and 10 minutes) is the longest possible timeline when using the slowest frequency (remember that a slow clock limits the resolution of the timer). When using the full 16 MHz we only get 4 minutes and 28 seconds worth of timeline. We can extend these timelines by keeping track of time ourselves, updating a variable in software whenever the timer is about to overflow.
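The numbers above are easy to reproduce. The following sketch uses made-up helper names: the first function computes the longest timeline for a given bit width and tick frequency, the second shows one possible way to extend a 32-bit timeline with an overflow count kept in software.

```c
#include <stdint.h>

/* Longest time span in seconds before a counter of bit_width bits
 * (at most 32) overflows at the given tick frequency. */
double timer_max_seconds(unsigned bit_width, double freq_hz)
{
    return (double)(1ULL << bit_width) / freq_hz;
}

/* Combines a software overflow count with the 32-bit hardware counter
 * into one 64-bit tick value. */
uint64_t extended_ticks(uint32_t overflows, uint32_t counter)
{
    return ((uint64_t)overflows << 32) | counter;
}
```

Calling timer_max_seconds(32, 31250.0) gives roughly 137439 seconds, and timer_max_seconds(32, 16e6) gives roughly 268 seconds, matching the figures in the text.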

IRQ priority sets the interrupt priority. Nested interrupts are a thing on the ARM Cortex-M4, so the priority determines by which other interrupts the interrupt service routine for timer interrupts can be … interrupted.

Now that we know the settings, which describe a timer, let us see what this mysterious nrf_drv_timer_extended_compare function is doing.

Every timer in the nRF52 has multiple capture/compare registers associated with it. Timers 0 to 2 have 4 and timers 3 and 4 have 6 so-called CC registers. The concepts behind these functionalities are straightforward and will be briefly explained here:

“Capture” means that the current value of the timeline is written into the specified CC register. Capturing can be used, for example, for time-stamping certain events and interrupts or for precisely measuring the time between two instants, as triggering a capture task has such a small amount of overhead. In order to use a certain CC register in capture mode, simply call the function nrfx_timer_capture with the timer and channel you are interested in. This will trigger the CAPTURE task and return the captured value. The function nrfx_timer_capture_get is a wrapper that simply returns the captured value for a certain channel. This function should be used when you do not want to capture the value again, but just want to read it. Whenever the CAPTURE task is triggered via the PPI (Programmable Peripheral Interconnect), this function must be used, as we don’t want to trigger the CAPTURE task again when reading the value.

“Compare” means that an event occurs whenever the timeline is equal to the value in that CC register. The timer counts up to the value specified in the CC register and triggers an interrupt when it reaches that value. This can be useful when you want periodic things to happen, such as an LED blinking or a sensor measurement being taken. However, when implementing such functionality, make sure to keep your ISRs as short as possible and defer all complex processing to later times. Timer interrupts are also used by preemptive schedulers: a timer interrupt occurs every X milliseconds and checks which task is supposed to run next. To set up a compare timer interrupt, simply call nrfx_timer_compare or nrfx_timer_extended_compare with a timer instance, an appropriate channel, the value at which the interrupt should be triggered, as well as a flag that determines whether interrupts should be enabled or disabled. In addition, nrfx_timer_extended_compare also takes an nrf_timer_short_mask_t, which allows us to specify that the CLEAR and/or the STOP task should be executed whenever the corresponding COMPARE event is raised.

The index of the capture/compare register is equivalent to what is called “channel” in the example code.

Now that we have a good overview of what these timers are capable of, it’s time to play around with them. I created three small examples that showcase what we have learned. The point of these examples is obviously not that things can’t be done differently; it’s simply fun to tinker with these kinds of things to deepen the understanding.

The examples are based on the example project peripherals/bsp in the nRF5 SDK, because it already includes button and LED initialization as well as UART logging. The only thing we need to import is nrfx_timer.h and nrfx_timer.c. In addition, some fields in the sdk_config.h must be added and reconfigured (The part found under “nrfx_timer – TIMER peripheral driver”). After this is done, our playground is ready for action.

There are 3 tasks implemented, which can be cycled through by pressing Button 1 on the nRF52-DK.

Task 1:
The first task creates a single counter. Whenever Button 0 is pressed, the counter is incremented by 1 and the value of the counter is displayed in the form of its bits by the LEDs. When the compare value of 16 is reached, the counter is supposed to be reset. This is achieved by setting the mask to NRF_TIMER_SHORT_COMPARE0_CLEAR_MASK in the call to nrfx_timer_extended_compare. The timer is now cleared every time the compare value of channel 0 is reached.
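The behaviour of this task can be mimicked with a tiny software model, which may help to picture what the shortcut does. This is not SDK code: the struct and function names are invented for this sketch, and the model only covers the COUNT task and the COMPARE0 to CLEAR shortcut.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint32_t counter;       /* internal counter register */
    uint32_t cc0;           /* compare value of channel 0 */
    bool     clear_on_cc0;  /* models the COMPARE0 -> CLEAR shortcut */
} counter_model_t;

/* Models one COUNT task. Returns true when the COMPARE0 event fires. */
bool counter_model_count(counter_model_t *t)
{
    t->counter++;
    if (t->counter != t->cc0) {
        return false;
    }
    if (t->clear_on_cc0) {
        t->counter = 0;  /* the shortcut clears the counter without CPU help */
    }
    return true;
}
```

With cc0 set to 16 and the shortcut enabled, the counter runs through the values 0 to 15, which is exactly what four LEDs can display, and wraps back to 0 on the sixteenth button press.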

Task 2:
In the second task, we create another timer. This timer increments the first timer (the counter), every time it reaches a certain compare value. The compare value is computed, such that it takes exactly 1 second between events. The result is the display of a binary number which is incremented every second.

Task 3:
The final task contains two timers. The timer from task 2 enables the new timer every time it hits its compare value. The second timer has several compare events attached to it. The datasheet tells us that we need to use timer 3 or 4 if we need more than 4 compare registers. Instead of computing the compare values ourselves, we use the function nrfx_timer_ms_to_ticks, which does the computation for us, based on the frequency of the timer. Note that we only set the clear shortcut for the last compare register: we only want to reset the timer once it has cycled to the end. We finally disable the timer inside of its own event handler.
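The conversion from milliseconds to ticks is simple enough to write out by hand. The function below is a stand-alone sketch of the computation, under the assumption that the timer frequency is expressed as a prescaler on the 16 MHz base clock; it is named ms_to_ticks to make clear that it is not the SDK function itself.

```c
#include <stdint.h>

/* Number of timer ticks in time_ms milliseconds for a prescaler of 0 to 9. */
uint32_t ms_to_ticks(uint32_t time_ms, uint8_t prescaler)
{
    /* 16 MHz means 16000 ticks per millisecond, divided by 2^prescaler.
     * The 64-bit intermediate avoids overflow for large time_ms values. */
    return (uint32_t)(((uint64_t)time_ms * 16000u) >> prescaler);
}
```

For the one-second period of task 2 at 1 MHz (prescaler 4), this yields a compare value of 1000000 ticks.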

This wraps up this first post on timers in the nRF52-DK. We will come back to timers again and again, as they are vital for all kinds of embedded devices.