I’ve been working on the HPS/FPGA communication, and how we will handle buffering.


I came up with this completed version of the architecture:

FPGA buffers will be fed by a DMA Controller. The DMAC will read data from RAM using the fpga2hps bus, and write to an AXI bus connected to all FPGA buffers.

On the HPS side, we will have a kernel module controlling the DMAC through the hps2fpga bus. It will be responsible for sequencing the DMA transfers (between all buffers), and also for their timing.

The DMA source will be a pool of contiguous RAM buffers allocated by the driver.

Finally, the driver will expose a Linux char device. Through this device, a user-space app will be able to feed the buffers with data from various sources (an SD card, a Wifi stream).
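As a rough illustration of the user-space side, here is a minimal sketch of an app pushing frame data into such a char device. The device path (for instance something like /dev/cyl3d) and the function name are hypothetical, and the sketch assumes the driver accepts plain blocking write() calls, which is not settled yet.

```python
def feed_frames(device_path, frames):
    """Write each frame (a bytes object) to the device.

    Returns the total number of bytes written.
    device_path is hypothetical: the driver's real node name is not decided.
    """
    total = 0
    # buffering=0 gives a raw file object, so each write() call goes
    # straight to the driver instead of sitting in a Python-side buffer.
    with open(device_path, "wb", buffering=0) as dev:
        for frame in frames:
            view = memoryview(frame)
            # A raw write may be partial, so loop until the frame is sent.
            while len(view) > 0:
                written = dev.write(view)
                view = view[written:]
                total += written
    return total
```

The source of the frames (SD card, Wifi stream) does not matter here: anything that yields bytes objects can be passed as `frames`.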


If you think about it, each half frame-slice will need to be displayed twice for it to be seen from all points of view. Thus, the drawing flow looks like this (where each line is a half turn, and the two entries represent the frames being drawn by each half of the panel):

  • F(n, right) ; F(n+1, left)
  • F(n+1, left) ; F(n+1, right)
  • F(n+1, right); F(n+1, left)
  • F(n+1, left) ; F(n+2, right)

We can see that the first half always gets the data that the second half had in the previous round. Thus, if we swap buffers smartly across drivers, we can use half-a-turn buffers. This results in 3 buffers for 2 drivers, storing F(n, right), F(n+1, left) and F(n+1, right), where at any time one buffer is unused and can be filled by the DMAC with the next half needed.

To start with simpler code, we will approximate this with 1 buffer per driver (+33% data stored), each with a full-turn capacity. We will optimize only if needed, that is, if the HPS -> FPGA throughput is not sufficient to transfer that much data.
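The +33% figure can be checked with a few lines of arithmetic. This is only a sketch of the comparison described above, counting storage in "turns of data" per pair of drivers; the function names are made up for the example.

```python
from fractions import Fraction

HALF_TURN = Fraction(1, 2)  # capacity of a half-turn buffer, in turns
FULL_TURN = Fraction(1)     # capacity of a full-turn buffer, in turns


def shared_scheme_storage(driver_pairs):
    # Optimized scheme: 3 half-turn buffers shared by each pair of drivers.
    return 3 * HALF_TURN * driver_pairs


def simple_scheme_storage(driver_pairs):
    # Simple scheme: 1 full-turn buffer per driver, i.e. 2 per pair.
    return 2 * FULL_TURN * driver_pairs


def simple_scheme_overhead(driver_pairs=1):
    # Extra storage of the simple scheme relative to the shared one.
    return simple_scheme_storage(driver_pairs) / shared_scheme_storage(driver_pairs) - 1
```

`simple_scheme_overhead()` evaluates to 1/3, which is the +33% quoted above, and it does not depend on the number of driver pairs.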

In RAM, we will have a double buffer: one buffer where we write data for the next turn, and one holding the data that should be used during the current turn.
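To make the double-buffer idea concrete, here is a small model of it. It is a plain Python sketch, not the kernel driver: the class and method names are invented for the example, and in the real system the swap would presumably be triggered by the turn synchronization.

```python
class DoubleBuffer:
    """Two equally sized buffers: one read this turn, one filled for the next."""

    def __init__(self, size):
        self._buffers = [bytearray(size), bytearray(size)]
        self._front = 0  # index of the buffer used during the current turn

    @property
    def front(self):
        # Data being consumed (sent to the FPGA) during the current turn.
        return self._buffers[self._front]

    @property
    def back(self):
        # Buffer being filled with the data for the next turn.
        return self._buffers[1 - self._front]

    def write_next_turn(self, data):
        if len(data) != len(self.back):
            raise ValueError("data must fill the whole buffer")
        self.back[:] = data

    def swap(self):
        # Called at the turn boundary: the freshly written buffer becomes
        # the one in use, and the old one becomes writable again.
        self._front = 1 - self._front
```

Writing never touches the buffer in use, so the data for the current turn stays stable until `swap()` is called.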

[CyL3D] PCB done, FPGA started


It’s been a long time since the last post. I’ve been busy finishing the routing of our vertical PCB. We use 4 layers, layer 2 being a GND plane.

Here is the routing layer by layer, with explanations:

Layer 1 (top)
Layer 3
Layer 4 (bottom)

FPGA architecture

Since we’re done (modulo reviewing) with the PCB, we started thinking about the FPGA architecture.

We came up with an architecture based on 4 modules.

  • We will use the Altera On-chip Memory IP as our per-driver buffers. It is practical since it has an AXI-compatible interface that we can use to easily connect to the HPS. 
  • A synchronizer module will be in charge of time-keeping in order to load and display data at the right time. It will also select which column to display.
  • The data provider will load data from the memory and pass it to the LED-driver driver.
  • Finally, the LED-driver driver is in charge of creating the SIN/LAT signals. It will both send data to the driver (as requested by the data provider), and display that data (as requested by the synchronizer).

[CyL3D] LED Routing

Quick post.

I finished placing and routing the LEDs, one every 3.4mm (not the minimum, but required to ease soldering).

We have less space, and the MOSFETs are going to take more room than I originally thought. However, if we use fewer MOSFETs, it should be fine.

Still, we won’t have much free space at the top of the panel for the LED drivers. Since they are quite small, I think we can put them on the two sides.

Here is the updated routing idea:

[CyL3D] Schematics + routing continued

I’ve almost completed the schematics of all the LED panels. Once done, I will start placing and routing them.

I tried routing a small number of LEDs to check the estimations I made in the previous post. I forgot to take the soldering pads into account, so I only managed to place one LED every 3.2mm.

Anyway, this distance is too small and we risk soldering issues. Based on the vertical distance between last year’s SPIROSE LEDs, we will put one LED every 3.4mm (1.2mm between each).

Based on this, the LED panel will take roughly 11cm x 14cm. Since the PCB is 19cm x 19cm, we have plenty of space for other components. I will try to fully route one LED panel and its drivers to check if it is possible.
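As a quick sanity check of the size estimate, the panel footprint can be approximated as LED count times pitch. The 40×30 grid is taken from the earlier specification post; the function name is made up, and the result ignores any margin around the LED matrix, so it is a lower bound rather than the exact figure above.

```python
def panel_size_cm(leds_x, leds_y, pitch_mm):
    """Approximate footprint of the LED matrix, in centimetres.

    Counts one pitch per LED and ignores board margins.
    """
    width_cm = leds_x * pitch_mm / 10
    height_cm = leds_y * pitch_mm / 10
    return round(width_cm, 1), round(height_cm, 1)


# 40 x 30 LEDs (from the earlier post) at the 3.4 mm pitch chosen here.
size_at_3_4 = panel_size_cm(40, 30, 3.4)
```

At a 3.4mm pitch this gives about 13.6cm by 10.2cm, consistent with the rough estimate above once some margin is added.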

For now, here is an idea of how I was hoping to do the routing:

Technically, all components (except for the power supply, which I haven’t checked) fit in. What I need to figure out is whether routing is feasible.

[CyL3D] LED Routing

Although we are not even done with the schematics, we wanted to figure out how much space we will have left on the PCB once the LEDs are routed. Indeed, since we can’t use blind vias, and the column and row signals must cross, we know we are going to end up with a matrix of vias.
Thus, it will be hard to use the back of the LED panel for anything other than routing or placing small components. In particular, the SoM’s connectors probably aren’t going to fit.

Taking inspiration from one of TI’s reference designs, I tried to do a minimal routing of the LEDs only. Using NFC 93-713 class 5, I managed to place one LED every 2.7mm. RGB cathodes are routed on the surface, and anodes are routed on an inner layer using vias.

The good news is:

  • We can have a really small-pitch screen.
  • We have a lot of free space on the PCB. The minimum LED panel size is roughly 11 x 9cm whereas the PCB’s height is 19cm.

I will need to check with the other components we need, but I’m quite confident that we can avoid using an extra horizontal PCB.

[CyL3D] From architecturing to schematics


Last Wednesday, we finished discussing the architecture. We made some major changes.

First, we discovered that we couldn’t use the TLC59582. We had made some assumptions about the way ES-PWM works that aren’t backed by the datasheet. We thought we could control the PWM width over one segment using the 8 MSBs, and that we could send a VSYNC signal in the middle of the PWM period. This would have resulted in an 8-bit PWM, which is ideal. Without these assumptions, we only have a 12-bit PWM whose period is too long compared to the display time of a single point.

After checking a lot of LED drivers and PWM generators, I couldn’t find anything suitable except for the TLC5957. Most drivers either use a 12-bit PWM, or use an 8-bit PWM but with really insufficient data rates.
The TLC5957 has the advantage of being configurable from 9 to 14 bits (resulting in a small enough PWM period when used with 9 or 10 bits), and of having a really high data rate.
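To see why the bit depth matters so much, here is a small sketch of how the PWM period scales with the number of bits. It assumes a plain PWM whose period is 2^bits grayscale-clock cycles; the 20 MHz clock used in the example is a hypothetical value, not one taken from the datasheet or from our design.

```python
def pwm_period_us(bits, gsclk_hz):
    """PWM period in microseconds, assuming one period = 2**bits clock cycles."""
    return (2 ** bits) / gsclk_hz * 1e6


# Hypothetical 20 MHz grayscale clock, for illustration only.
GSCLK_HZ = 20e6
periods = {bits: pwm_period_us(bits, GSCLK_HZ) for bits in (8, 9, 10, 12, 14)}
```

Whatever the clock, each extra bit doubles the period, so under this assumption a 12-bit PWM takes 8 times longer than a 9-bit one to complete.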

We also had to think about how to configure the Wifi. After thinking about a bunch of scenarios, we went with the simplest solution since we are running late. A button on the mobile part will switch the Wifi module into AP mode. It will also start a web server that can be used to choose the Wifi access point to connect to.

Finally, and again for the sake of simplicity, we decided that the fixed and mobile parts will not communicate since it’s not strictly necessary. The fixed part will use a switch to start and stop the motor, and a photosensor to get feedback on the motor’s speed. Since the motor is controlled with 5V logic, and we no longer need Wifi or Bluetooth, we will use an Arduino Micro instead of the ESP32. It can be powered using 12V, and provides a regulated 5V/1A pin and PWM-capable 5V GPIOs.


I’m beginning to work on the schematics. For now, I designed one LED panel including its driver and column multiplexing.

[CyL3D] Choosing components

Rotation synchronization

For the rotation part, we were thinking of an IR LED on the fixed part, and an IR receiver on the mobile part. It turns out there are transmissive photosensors which are exactly what we need: an emitter and a receiver, isolated from the outside world except for a small slot. The photosensor detects when something passes through that slot.
We needed the component to be easily placeable on the PCB, and with the right orientation. It turns out it was not that easy since most photosensors are either SMD or oriented in the wrong way. We managed to find a sensor with the 3 possible orientations, fixed by screws, and connected with wires: Omron EE-SX3164-P2.

We think we are going to use multiple “spikes” on the fixed part. This way, we will have a finer granularity to synchronize LED driving.


We also needed to choose components for the multiplexing. Multiplexing each column will be done using a PMOS since our LEDs are high-side driven.
Moreover, the LEDs need to be powered with ~5V (due to the voltage dropped by the internal LED driver circuitry). Given that we use 3.3V logic, we need a MOSFET driver to generate a signal capable of toggling a PMOS for which Vds = 5V.

I used the TI design note to see what components they were using. It turns out both their PMOS and their driver are suited to our situation (3.3V logic, 5V for LEDs, 5A maximum sink current, and fast switching).
This is why we are going to use the ISL55110 driver and the SI2333CDS (single) or SI4953ADY (dual) MOSFET.

The LED drivers will be directly connected to the FPGA which will be in charge of the multiplexing logic.


I’ve also been thinking about Wifi. We want to process data in the Cyclone V HPS. It has the advantage of being fast and directly connected to the FPGA and shared RAM (which we will be using for buffering).

Since we need to support quite high data-rates, there are only two HPS interfaces we could use: SDIO and USB. Since SDIO is already connected to eMMC on the SoM, only USB is left.

I’ve been looking at Linux-friendly 802.11ac modules since they can handle higher throughput. There are 3 drivers available.

  • Broadcom’s. I couldn’t find a place to easily buy their ICs, so it seems they are reserved for business buyers.
  • Intel’s. All of their cards communicate through PCIe but we don’t have a PCIe interface on the SoM. There are variants of the SoM with PCIe, but we still want to avoid the extra complexity.
  • Realtek’s. Feedback is that the driver is buggy and unstable. Again, unnecessary complexity.

For now, we think it’s better to stick to 802.11n and reduce the resolution or depth of our panel if needed. It was suggested that we use Acmesystem’s WIFI-2. It is a Wifi module based on a Realtek IC. It is compatible with USB IF so it can work out-of-the-box using the default Linux drivers.

We now have all our “logical” components. We can finish by choosing the power components, now that we are sure of our needs.

[CyL3D] More architecturing

Since last week we’ve mainly done three things: defining formal specifications, upgrading the architecture, and starting to choose components.

Formal specifications

We’ve created a spreadsheet that computes some figures like the maximum power, the maximum bandwidth, the bandwidth per LED driver, etc. We can immediately see the effect of each parameter by varying it.

We’ve chosen to use 1200 LEDs using a 4:3 format. This seems sufficient to get a nice result without creating unnecessary complexity.

We also wanted to check the maximal throughput because we think it’s our main bottleneck. Using 8 bits per color and 25 revolutions per second, we computed a total throughput of around 50 Mb/s. Again this seems reasonable since we can transmit that much data through either WiFi or SDIO.
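The kind of formula behind this figure can be sketched as follows. The LED count, bit depth and rotation speed come from this post; the number of angular slices displayed per revolution is not stated here, so `slices_per_rev` is a hypothetical parameter, and 70 is only picked because it lands near the figure quoted above.

```python
def total_throughput_bps(leds, colors, bits_per_color, rev_per_s, slices_per_rev):
    """Raw pixel data rate in bits per second.

    slices_per_rev (angular positions per revolution) is an assumption:
    the actual value lives in the spreadsheet, not in this post.
    """
    bits_per_slice = leds * colors * bits_per_color
    return bits_per_slice * slices_per_rev * rev_per_s


# 1200 RGB LEDs, 8 bits per color, 25 rev/s, hypothetical 70 slices per turn.
estimate = total_throughput_bps(1200, 3, 8, 25, 70)
```

With these inputs the estimate is 50.4 Mb/s. The rate scales linearly with every parameter, so halving the angular resolution or the color depth halves the required throughput.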

Finally we needed to check how many drivers and multiplexers we should use. Using 16-LED drivers and 8-column multiplexers, a block that is at the edge of the panel (thus with the highest update frequency) would have a maximal throughput of around 8 Mb/s. We saw several LED drivers made by TI which have at least 20 MHz bandwidth (and approximately the same throughput).

With all this, we have our final setup: a 40×30 LED panel, decomposed into 8×15 blocks, each controlled by a LED driver and a column multiplexer.
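The block count follows directly from these dimensions. Here is a short sketch of the decomposition; the function name is made up, and it simply assumes the blocks tile the panel exactly.

```python
def split_into_blocks(panel_w, panel_h, block_w, block_h):
    """Return (number of blocks, LEDs per block) for an exact tiling."""
    if panel_w % block_w or panel_h % block_h:
        raise ValueError("blocks must tile the panel exactly")
    blocks_x = panel_w // block_w
    blocks_y = panel_h // block_h
    return blocks_x * blocks_y, block_w * block_h


# 40x30 panel split into 8x15 blocks, as described above.
n_blocks, leds_per_block = split_into_blocks(40, 30, 8, 15)
```

This gives 10 blocks of 120 LEDs each, so 10 LED drivers and 10 column multiplexers. Each block has 15 LEDs per column, which fits within a 16-LED driver.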

Upgrading the architecture

We’ve also been thinking about the fixed part of the project. We needed a way to control the motor and the IR LED used to synchronize the system. We tried to think about the usage flow, and finally came up with this:

  • Push a button to turn the power on. This should power two BLE modules, one fixed and one mobile.
  • When selecting or streaming a file through the WiFi interface, the mobile BLE module would communicate with the fixed part to start IR emission and motor rotation.
  • When paused or stopped, the same process would happen to stop the motor.

It means that we’re going to have to design a (much simpler) circuit in the fixed part to control the motor and LED.

Choosing components

We’ve also started to choose components. In fact this is highly correlated to the architecture, since we need to be aware of what exists to design the system.


In particular, I’ve been searching for a WiFi module. After some research, I found the ESP8266 and its ESP32 family successors. These are particularly well known, and well suited to our usage, because:

  • The processor is dual-core, one core being dedicated to the IP stack, and the other being available for the user
  • The processing power is sufficient for most uses. We are not going to do much work apart from forwarding data to the FPGA.
  • It has 500 KB of SRAM and up to 16 MB of external flash memory. This is clearly enough for our program and data.
  • It is SPI and SDIO capable.
  • It is cheap.
  • It supports WiFi with a UDP throughput of 30 MB/s
  • It supports BLE
  • It has a huge fan community, lots of tutorials and programming guides.

For all these reasons, we are going to use two ESP32 modules. One will be used in the mobile part to handle the WiFi interface. It will also communicate through BLE with the second module on the fixed part, which will drive both the IR LED and the motor.

Moreover, the ESP32 is supported by, and mostly used with, FreeRTOS. This is a sufficient reason for choosing FreeRTOS as our main OS.


I’ve been digging into TI’s website to compare their LED panel drivers. I noticed that two of their drivers are suited for “large high frequency multiplexed panels”. Furthermore, TI even wrote a document explaining the whole process of using those two drivers to build a panel. Since this is pretty much what we’re doing, it looked like a good idea to use those.

Ambroise checked the differences with the other available drivers and noticed that, although the bandwidth is lower, they provide shorter rise/fall times and, most importantly, have buffers to store the whole frame.

As a result, we are going to use the TI TLC59582.