Categories

[SpiROSE] Last few days, close to the end, but we have finished the FPGA’s modules !

Hey, last week  I was quite pessimistic because of the time we had left for the testing, verifications and calibration of our hardware. It seems that this is almost fixed ! We are a little more than two days before the end and the three cards have been done — almost at 100%. There is still an unknown issue with one of the voltage regulator, which allegedly made it burn, but it might run well tomorrow.

We managed to implement the whole new FPGA architecture that Adrien described in today’s post and add a bunch of new SystemC tests. We made a global pinout assignment with quartus TCL definition file for the EP3C40 FPGA and a lot of different testbench for both EP3C40 and DE1_SoC targets so as to exhaustively test every part of the project on the hardware. Tomorrow morning I will finish one of these part which will test if our solutions for synchronizing the RGB input stream with the display will work correctly. The goal is to make a full RGB cylinder in 3D as input for the RGB logic, simulate the rotation and extract only one slice position. If the panel was to change color, we would know it doesn’t work.

Finally, it will probably be a happy ending, or else we can still show that we know how to blink LED.

[SpiROSE] FPGA synchronization architecture

Synchronization signals

The engine may not have cycle accurate speed, the rgb logic will not send data at exactly 33MHz… there are various synchronization issue to tackle. Thus we need some sync signals.
In order to have the simplest and less error prone code, the modules will all rely on a few sync signals, each one output by the relevant module so that the other ones don’t need to worry about his inner working. The framebuffer buffer for instance shouldn’t have to know how the driver controller works, it just needs to know when to send data to it.

The synchronisation signals are the following:

Signal name Output by Used by Role
hall_sensor_trigger hall effect sensor Indicates that we have made half a turn
position_sync hall effect sensor framebuffer, driver controller Indicates that the position has changed
rgb_enable SPI rgb logic Command sent by the SBC to start everything
stream_ready rgb logic framebuffer, driver controller Indicates that enough slices are stored in ram to begin sending data to the driver
driver_ready driver controller framebuffer Indicates that the drivers are ready to receive data (they are configured and we are not in a blanking cycle)
column_ready driver controller multiplexer Indicates that the drivers have latch the data for a column

Modules’ states

The modules will follow those steps:

  1. At reset, the driver controller send the default configuration to the drivers. driver_ready and column_ready will be low so the framebuffer is waiting and the multiplexer doesn’t turn on anything. The rgb_logic waits for rgb_enable to start writing in the ram.
  2. The SPI module receives an rgb_enable command, so it drives the signal high. rgb_logic starts to read the incoming rgb stream, and monitor vsync to ctach the beggining of an image. Then it starts writing in the ram. When enough slices have been written it drives stream_ready high.
  3. The framebuffer and driver controller modules will do nothing before the stream_ready signal. When they receive the stream_ready signal they start a two state behaviour: send data, wait for next slice. In send data state framebuffer listen to the driver_ready signal to send data to the driver controller, driver controller send data to the drivers with the protocol describe in previous posts. When a column has been sent the driver controller drives column ready high for one cycle.
  4. When the multiplexer module receives column_ready it turns on one column only for less than 10µs (so we don’t burn the led when overdriving), then turn it off and wait for the next column_ready assert.
  5. When the whole slice has been displayed (i.e. 8 columns), the frembuffer and driver controller enter wait for next slice state, where they wait for position_sync signal.
  6. The position_sync signal triggers the change from the wait for next slice state to the send data state (go back to step 4)

This is sum up in the following figure:

Hazards

There is an issue if position_sync changes when we are in the send data state. This can occurs when:

  • There are too many slices, so the time required to send the data exceeds the lenght of a slice
  • The motor was too fast, so the hall_sensor_trigger happens too soon

The first issue can be fixed by using fewer slices, but this means a lower resolution.
The second one is unlikely to happen, as the engine speed is steady.

In case one or the other happens, we handle it this way:

  • Continue to send data for the current column
  • Then reset the relevant counters/signals and start sending the next slice normally

Thus we just finish to send our column before moving to the slice n+1, so in the worst case we lose 7 columns out of 8 in slice n. The slice n+1 is displayed normally but slighlty delayed, worst case by 512 cycles. So if wait for next slice state last for more than 512 cycles the delay is gone when we reach slice n+2, otherwise we end up being in the same situation as before.

[SpiROSE] FPGA, the end is near

Hello everyone, I hadn’t posted since Christmas, so here I go ! Since the expedition (no pun intended) of the PCBs to the manufacturer, I kept mainly focused on FPGA development including source code and tests for various modules, which allowed us to correct many bugs we hadn’t seen so far. I will detail some features later on.

SPI slave

The SPI slave implementation is complete. The protocol for sending commands from the SBC to the FPGA is simple, we have a header command byte, followed by as many bytes of data as needed. Let’s have a quick summary of the different possible commands.

  • Enable/Disable the RGB module: we need to tell the RGB module, the one that writes into RAM the data the FPGA receives from the parallel RGB, to start or stop writing in RAM, since we only want to display relevant data. As the RGB module is the first module in the chain, this command starts everything.
  • Configuration command: change the configuration of the drivers. The command may be issued at every moment, the driver controller module can handle the reconfiguration even when it was streaming data to the drivers.
  • Request rotation information: The FPGA should be able to send its the rotation position (the slice we are at), as well as the speed of the motor.

But how can we get the position of the rotating part ? Let’s look at the Hall effect sensors !

Hall effect sensors

Since we finally use Hall effect sensors instead of the rotary encoder that was originally wanted, I did the module that tells the driver controller and the framebuffer when we enter a new slice, to begin displaying for the current slice.

The issue is that we have only two Hall sensors that are opposite one to another and 256 slices per turn. So we need to infer the slice we are at given only 2 positions, over the 256 for a turn, that we are absolutely sure of. Ideally, the synchronization signal is generated without knowing in advance the motor speed, so it must adapt in “real time” to it. An idea, since we have 128 slices per turn, is to estimate the slice positions of the current half-turn given the number of cycles that it took to make the previous half-turn. In a word, we constantly correct the duration of a half-turn to fit the speed variations of the motor.

The Hall effect sensors we chose are hysteresis sensors which, in their case, means that the output of the sensors is set low when the magnetic flux is beyond a certain threshold and is kept low until the flux reaches a second threshold that is lower than the first one. This provides an integrated anti-bounce mechanism. We thus only need to detect the negative edge of the sensor output to have a “top” to synchronize onto. Between the 2 tops of the opposite sensors, we count the number of cycles and we compute (right shift)  the duration for each slice for the following half-turn, sending the synchronization signal accordingly. To give a few figures, if we have 15 rotations per seconds, around 15600 clock cycles are elapsed between 2 slices.

 

For the upcoming week, while we are waiting for the components to be shipped and since all FPGA modules are complete, we will build end-to-end tests with all of them, hoping that all the tests we have implemented for each modules were thorough enough to spot all possible bugs, for it to be ready for a quick deployment on the actual board.

See you soon !

[SpiROSE] Rust, Device Trees and Mechanics

Hello! This have been a long time since I’ve posted here. Lots happened!

First the mechanics

The motor we chose was doing lots of noise, so I had to investigate this and try to get a solution to avoid having to buy another motor. I discovered the noise was coming from a broken ball bearing.

The faulty ball bearing was hiding in the bottom of the motor.

 

After ordering a new ball bearing, the motor was running perfectly… Up to the moment I realized the motor was not the only one to do a big big noise… The planetary gearbox is using 3 needle bearings that does a lot of noise too…

I recycled an old Brushless Motor, a 4 pole 2100kv 3-4S LiPo sensored BLDC motor. This motor is not supposed to fit the low speed requirements (~20-30Hz), so I had to test with the real world load. We engineered PCB prototypes without the components to simulate the load, and mounted them over the newly installed motor.

New sensored motor speed test without load

Load test, with a big protection. We care about our lives 🙂

The two black plates over the vice are the prototype PCBs (pre-sensitized FR4 plate we machined with a CNC to fit the real size). The system is sensored, so we just placed a logic analyzer at the hall effect outputs (and divided by 4 the outputs, as it’s a 4 poles motor). The result is that we can go from ~5Hz to more than 30Hz, so the motor fits perfectly with the load!

Good news: the motor is now fixed in the structure, mechanical engineering is about to end!

 

Discovering Rust

This have been a long time since I wanted to learn Rust programming language, but :

  • I didn’t have time to learn it
  • I didn’t have a good and simple project to start with

Thus, when I saw we had to make a simple Command Line Utility for our SPI communication with the FPGA, we decided to make it in Rust. The first day was hard. Very hard. I discovered how we use rustup and cargo correctly, how we setup Rust to always develop with the nightly version, and how to cross compile for armv7.

SpiROSE Command Line tool

After hours of questioning myself about how it works, reading the docs, asking questions to people who Rust and error reading, I finally got a working SPI CLI for SpiROSE. Ok, that’s cool, but how do I get the SPI device now?

SPI, Yocto, dtsi

As the current distribution is not really made for embedded (ubuntu), we wanted to try Yocto, a build system made for packaging a GNU/Linux inside embedded systems. Yocto is widely used for that, but has a complex curve of learning. As the time was getting short, we rolled back to ubuntu.

So, how to get SPI device in /dev/spidevX.X? The solution is:

  • Install Spidev module (we recompiled the kernel to integrate it inside)
  • Update the Device Tree Blob given to U-Boot to give the address of the SPI hardware to Spidev and enumerate it inside /dev

As nobody knew about Linux Device Trees in the group, I hardly tried to put the new device in the dtsi, to get the new Device Tree Blob to give to U-Boot.

I finally achieved this tonight. Tomorrow will be reserved for testing the SPI output (already tested with a logic analyzer) by connecting it to our FPGA development board.

 

Now this week will be full of tests, to get the PCBs ready to use as short as possible!

[SpiROSE] Routing voxels

Howdy!

Good to see you again after the holidays, Christmas, New Year’s Eve, alcohol, …

I haven’t written here in a while, so I’ll do a large combined post of the week before the holidays, the holidays and this week (i.e. the week after the holidays).

Routing the rotative base

Before the holidays, I did most of the place and route of the rotative base. Obviously, it ended up being heavily modified around the board-to-board connectors. The pinouts (and even the shape!) of those connectors were modified several times, due to space constraints.

In the end, this is our rotative base PCB. This is a 4 layers, 20x20cm board that we now call “The Shuriken” due to its odd shape.

Right now, it is in fab, but with some delay due to a bug in the board house website.

Voxelization

Now that the renderer has a working PoC, it was time to wrap it in something more flexible and reusable.

Introducing libSpiROSE, which allows you to turn any OpenGL scenery in voxel information usable by SpiROSE. It can output both to the screen (which will by piped to the FPGA through the RGB interface), or to a PNG file.

For example, this is actual code that voxelizes a cube to then dump it in a PNG :


#include <iostream>
#include <spirose/spirose.h>
#include <glm/gtx/transform.hpp>

#define RES_W 80
#define RES_H 48
#define RES_C 128

int main(int argc, char *argv[]) {
glfwInit();
GLFWwindow *window = spirose::createWindow(RES_W, RES_H, RES_C);
spirose::Context context(RES_W, RES_H, RES_C);

float vertices[] = {1.f, 1.f, 0.f, 0.f, 1.f, 0.f, 1.f, 0.f,
0.f, 0.f, 0.f, 0.f, 1.f, 1.f, 1.f, 0.f,
1.f, 1.f, 0.f, 0.f, 1.f, 1.f, 0.f, 1.f};
int indices[] = {3, 2, 6, 2, 6, 7, 6, 7, 4, 7, 4, 2, 4, 2, 0, 2, 0, 3,
0, 3, 1, 3, 1, 6, 1, 6, 5, 6, 5, 4, 5, 4, 1, 4, 1, 0};
spirose::Object cube(vertices, 8, indices, sizeof(indices) / sizeof(int));
// Center the cube around (0, 0, 0)
cube.matrixModel = glm::translate(glm::vec3(-.5f));

// Voxelize the cube
context.clearVoxels();
context.voxelize(cube);

// Render voxels as slices
context.clearScreen();
context.synthesize(glm::vec4(1.f));

context.dumpPNG("cube.png");
return 0;
}

This leads to this :

Not impressive, but the code to generate it is all contained above.

Now, add an MVP matrix, a few rotations to the cube and swap context.synthesize with context.visualize and you’ll get this :

Next week

We have a first PCB that arrived from fab this week, so we’ll assemble it next week. However, this is the LED panels, sooooooo that’s going to take a long time.

We will also focus on building actual demos, now that we have libSpiROSE in an usable state.

See you next week!

[SpiROSE] Led panel assembly and slice format

Led panel assembly

One major issue to design our two PCB was the board to board connection and the led panel position.

In order to have a 80 pixel horizontal resolution with only 40 columns per panel, we need to have the rotation axis between two columns, as the following image shown :

This means two things :

  • The panel axis is not the same as the rotation axis, they are shifted by 1.125mm (half a column), thus the steadiness of the whole must be fixed by adding weight on the rotative base
  • As the panels are meant to be perfectly aligned with one another, the leds and holes has to be symmetrical to the panel axis (not the rotation axis) in order to be face to face with their clone on the other side

For one panel, the connectors are symmetrical to the rotation axis, but when we rotate it by 180°, the connectors don’t align with their clone. This is shown below:

The connector on the rotative base have to be shifted by 2.25mm between the two lines.

The holes on the rotative base also have to be shifted by 1.125mm from the rotation axis to be aligned with the ones on the panel.

Consequences on the framebuffer and slices format

At a given position p, a column on a panel and his clone on the other one display the same voxels. Thus only 40 columns out of 80 are displayed. We have to wait half a turn to reach the position p again, but now the panels have rotated by 180° so the new 40 columns are the ones missing. Therefore we still have our 80 voxels wide resolution.

Hence a slice is displayed in two steps :

  • the even columns first
  • the odd columns half a turn later

This could be handled by the framebuffer module, but would imply to remember at which position we are. In order to remain position agnostic, it is better to let the SBC directly handle this by outputting the slices in an adequate format.

The SBC will cut a slice in two batches, one with the even columns, the other with the odd ones. However, because the panel rotate by 180°, the odd columns need to be reversed to be displayed correctly. All of this is shown in the following figures:

Let’s assume we want to send this slice

At position p we will display the even columns

At position p plus half a turn we will display the odd columns

Thus the SBC will send the slices with the following format

[SpiROSE] Last updates on the hardware, happy new year !

It has been time since the last update, I hope you had a good time with the New Year and holidays. Happy new year to everyone and I hope the best for every other ROSE project.

Since the last time I posted, we managed to finish the most important subjects in our work:

  • The last components of our projects have been chosen.
  • The hardware part is (at last) completely done.

MOSFET Driver

First, about the components, we had to choose a mosfet driver so as to have good switching timing. Indeed, we want to overdrive the LEDs so they are visible even if we don’t light them long. So we have to be very precise on the

However, the LEDs are connected to the +5V by their common anode, and to the mass through the LED driver. As we don’t have much place, we have to put the mosfet to the bottom of the LED panel, so it has to connect the LEDs to the +5V. Thus, we have no choice but using P-channel high side mosfet. Difficulties stemmed from this choice.

Actually, P-mosfet are bigger than N-mosfet, and as described by Vincent last time it has been difficult to put them on the PCB. But finding a P-channel high side mosfet driver is even worse. We finally end up with the LTC1693-5 from Linear Technology. It can drive a single channel but we can safely put eight MOSFET on each channel and still meet the timing requirement. It eases the wiring as we light eight columns at a time during a multiplexing step.

Basically, it was the only one MOSFET driver we found within our criteria, so there was no room for choice.

Card-to-card connectors

Then we had to find connectors between our cards. Our first thoughts went to single row 28-pin connectors with a spacing of 1,27mm between pins, with male ones being on the rotative base and female ones on the LEDs panel. However it failed to be routable because it put the flat angle brackets 0.85cm too far from the center, making the LEDs panel bigger than 20cm, too big to go to a pick&place machine. We hereby swapped them with dual-row 1,27mm 28-pin connectors (so 14-pin long) after computing that dual-row 2mm 28-pin connectors weren’t enough and failing to find smaller connectors.

What next ?

Next we will have to reorganize our PSSC for the software, as we noticed before the holidays that most of the time they were either obsolete or overlapped: one critera would ask the work for two and both would be validated at the same time, so there was no use of having them on their previous state. However most of the new plan has been done and accepted by the teacher just before the holidays.

We also have to finish SystemC tests for SPI and framebuffer, implement end-to-end SystemC tests (from SBC output to FPGA output).

We will probably not have the PCB before the end of the project but at least we will be able to show it for the final presentation, so software has to be tested enough so that we can focus on the hardware problems as soon as we receive them.

[SpiROSE] OpenGL ES and mainboard

Howdy :]

OpenGL ES

The first half of this week was dedicated to porting the renderer to OpenGL ES, which I managed to pull off. In the end, the port was really easy, as the desktop version used OpenGL 3.3, which is very close to OpenGL ES.

The renderer PoC running on the wandboard

You may notice there are no 3D visualisation in the middle. This is actually expected, as this view relied on geometry shaders, not available on this GPU. Still, I did not waste time re-implementing it as the only thing that matters are the white things on the bottom left hand of the window.

However, some issues do remain. Those issues only happen on the Wandboard with the Vivante drivers. Any other GL ES environment is working just fine, without those issues (for example, the VMware Mesa driver that now supports OpenGL ES 3.2).

Mainly, I have uniforms that are not found for a shader stage, and steadily returns -1, which is “no uniform found” in GL terminology. strange thing is, only this specific shader is problematic.

Furthermore, the keen-eyed amongst you may have noticed a small glitch on the above picture. The top-right hand slice is missing a pixel column, effectively cutting poor Suzanne’s ear.

What is infuriating is, neither the glitch nor the missing uniforms happen on my Arch Linux VM, where the VMware OpenGL ES driver is consistent and reliable.

Mainboard

This week I also seriously attacked the mainboard and finished up the schematics. As a quick recap, this board is what we called previously the “rotative base”. Its main features are :

  • SBC
  • FPGA
  • Hall Effect sensors
  • Power supplies:
    • 5V (SBC)
    • 4V (LEDs)
    • 3V3 (Drivers, buffers)
    • 3V (FPGA IO)
    • 2V5 (FPGA IO and FPGA PLL)
    • 1V2 (FPGA core)
  • Board-to-board connector to link to the LED panels

Phew!

PSU

I made the PSU with some help on switching ones from Alexis. You may have noticed, we may have gone overboard with the supply rails (man, 7 different voltages on a single board!). However, each have a purpose, and I’ll only bother explaining the two odd ones:

  • 4V: the LEDs are driven to ground by the drivers. Said drivers dissipate the excess voltage themselves. Thus, to avoid overheating, we chose a supply voltage for the LEDs as close to their forward voltage as possible.
  • 3V: The FPGA I/Os can work up to 3V3. However, they are poorly protected, and some overshoot on a 3V3 signal might kill the IO ping. (to be short, the protection diode used by altera is too weak). That’s why the banks which output any signal are using 3V, which is still readable by the buffers and the drivers, while being much more tolerant to overshoot, and other signal integrity problems.

Now, upon its architecture. A picture is worth a thousand words:

As for power, well… With overdrive, at the maximum power of the drivers, we can chew through 22A of current per panel. That’s 44A of blinky goodness. Oooops. Well, fortunately, people already have had this problem, and Ti happens to have a wonderful buck converter module (PTH12040WAD) that can deliver up to 50A. Fantastic!

Place and route

This turned up to be much harder than I expected. It is quite a mess, for one simple reason: the LED panels are identical, but back to back. This means our board-to-board connectors are parallel, but opposite of each other. Most of the signals that go to one go to the other. This means that most of the signals have to cross. Oooopsie!

However, this is getting together. Placement is mostly done, except for a few components that can be tucked anywhere. And most submodules are, internally placed and routed. Think of buck converters, that once grouped and routed, can be moved around, almost like a single component.

The board is quite spacious because of the SoM that takes quite a lot of space. The smallest bounding circle I could do has a diameter of 175mm.

Next week

Simple: moar routing. Right now, this looks more like the giant spaghetti monster than a PCB….

[SpiROSE] Placing and routing in restrictive environment

Hello everybody !

Recently, I have worked on the PCB of the LED panels. It needs to be finished really soon in order to work on it shortly. For a brief recap, a LED panel has in its centre a 1920 LED matrix which is surrounded by 15 LED drivers, 5 of which are beneath the matrix and the remaining 10 above. Now we had to add the MOSFETs for the multiplexing as well as the clock buffers (2 buffers for SCLK, GCLK and LAT, for a total of 6 buffers). Since the drivers and the matrix had already been placed and routed, we tried to figure out what the optimal placing location for the MOSFETs, in order no to mess up the PCB too much.

Routing MOSFETS

Some MOSFETs with their multiplexing lines

Under each LED column, there is a plane for its corresponding multiplexing signal (the filled purple vertical plane). Since we have one MOSFET for each LED column, we chose to place the MOSFETs right where this plane ends, beneath the LED matrix. Yet, there is very little place: we indeed have vertical traces (the blue ones, layer 1) between all LEDS, which restrain the MOSFET place. With Unidan, I did run placing and routing tests to determine whether placing MOSFETs vertically or horizontally would make the routing more convenient. It appeared that the horizontal one gave best results, so I placed then routed this pattern. The MOSFET area is filled with a thin 4V plane.

Routing everything

Lower part of the LED panel, showing some MOSFETs, 2 LED drivers and 1 clock buffer (blue=top, red=bottom)

After that the struggle began, welcome to the trace jungle. Between the 5 bottom drivers are now placed 6 clock buffers, all aligned in a tiny place. The challenge was to route the output signals of the clock buffers up to the 10 upper drivers as well as routing the signals that will be transmitted from the rotary motherboard. The big issue is that we should only use 2 layers to do so, ie route all buffer/upper-driver, buffer/lower-driver, driver/matrix, MOSFET/MOSFET and MOSFET/multiplexing connections in the 2 external layers, since the 2 inner layers are used for Vcc and ground. I tried not to use the inner layers at all, but it was not entirely possible, so instead, I tried to minimize the length and the number of traces that occupied the aforementioned layers. By the way, using many custom colours for the different nets/traces/planes really helped a lot.

I have almost finished routing all nets, some still require a consequent length in inner layers and thus need to be improved. But is is not over, we still need to add the MOSFETs drivers as well as the board-to-board connector. This task is now our priority and will be carried out in the beginning of the upcoming week.

[SpiROSE] Testing systemverilog with SystemC, final round

Among the tasks I’ve been busy with these weeks, there is the making of integration tests for the FPGA’s SystemVerilog code.

SystemC is a very good asset to put in a CI environment. Basically, its advantages are:

  • It is quite light and free.
  • You only need tools to compile C++ code in your CI, instead of having modelsim or other full simulator.
  • You can customize the output to your needs, although formatting with SystemC with only 80 characters a line is quite awful.
  • You can use other library more instinctively.
  • You can even get VCD files back from the CI.
  • The compilation files are almost usual makefiles and testbenchs are C++ (sc_)main.

But eventually, we want to use this tool against the SystemVerilog code, so we need a bridge between them.

The situation

To put it clear, I will describe the situation for the driver_controller module. This module is the one generating control and data signal to the drivers. Amid these signals there are:

  • The GCLK and SCLK signals, both are clocks.
  • The LAT signal, which has strong timing requirements. It has different possible duration at the high state, each of these associated with a different command in the driver.
  • The SIN signal, which is the input of the driver itself

Before using the driver, it has to be configured and its the role of the driver_controller to do it too.

The inputs of the driver controller are first two clocks: clk_hse running at 66MHz and clk_lse running at 33MHz. This last one is generated thanks to the clk_hse clock in the clk_lse module, that we will need to include in the SystemC testbench too.

Then we have a framebuffer_sync signal, which should be generated each time we start a new frame. Finally there is a framebuffer_data signal of size 30bits, giving the SIN for each driver.

The tools

To achieve this, we will need to translate our SystemVerilog code into SystemC modules. It is time for the Verilator tool to enter the scene.

This tool was first included as a linter into our project, but revealed to be more useful than first thought. It is capable to translate SystemVerilog code into either special C++ to use with verilator, or SystemC modules which are a bit slower than the former but far easier to use.

Our little clock module becomes the following beautiful SC_MODULE:

SC_MODULE(Vclock_lse) {
    public:

    // PORTS
    // The application code writes and reads these signals to
    // propagate new values into/out from the Verilated model.
    sc_in<bool> clk_hse;
    sc_in<bool> nrst;
    sc_out<bool> clk_lse;

    // LOCAL SIGNALS
    // Internals; generally not touched by application code

    // LOCAL VARIABLES
    // Internals; generally not touched by application code
    VL_SIG8(__Vcellinp__clock_lse__nrst,0,0);
    VL_SIG8(__Vcellinp__clock_lse__clk_hse,0,0);
    VL_SIG8(__Vcellout__clock_lse__clk_lse,0,0);
    VL_SIG8(__VinpClk__TOP____Vcellinp__clock_lse__nrst,0,0);
    VL_SIG8(__Vclklast__TOP____Vcellinp__clock_lse__clk_hse,0,0);
    VL_SIG8(__Vclklast__TOP____VinpClk__TOP____Vcellinp__clock_lse__nrst,0,0);
    VL_SIG8(__Vchglast__TOP____Vcellinp__clock_lse__nrst,0,0);
    VL_SIG(__Vm_traceActivity,31,0);

    // INTERNAL VARIABLES
    // Internals; generally not touched by application code
    Vclock_lse__Syms* __VlSymsp; // Symbol table

    // ...
};

 

Verilator will generate its files in an obj_dir/ directory, which will be in the sim/ directory.

It will generate Vmodule_name.{h,cpp} file, containing the module itself and glue code to go with verilator library, Vmodule_name__Syms.{h,cpp} files making it available in the verilator library and some Vmodule_name__Trace.{h,cpp} files.

For the simulation, you only need to link against Vmodule_name.o and Vmodule_name__Syms.o. You can add the Vmodule_name__Trace.o file if you want to get the trace written into a VCD file, but you’ll need to have a deeper look into Verilated, the Verilator library, and especially the VerilatedVCD object to make it work.

The methodology

 

The first idea with testing module in SystemC is to put all the testing code in the sc_main function, where you are able to control the progress of the simulation :

sc_main(int argc, char**argv) {
    sc_time T(33, SC_NS);
    sc_clock clk(T);
    sc_signal<bool> nrst(“nrst”);
    // create the IN/OUT signal for the DUT
    Vmodule_name dut(“module_name”);
    dut.clk(clk);
    dut.nrst(nrst);
    // bind the IN/OUT signal of the DUT
    while(sc_time_stamp() < sc_time(10, SC_MS)) {
        // advance the simulation of T
        sc_start(T);
        // do your tests
    }
}

 

However it looks like software testing and it is usually a bad idea because you won’t be able to describe every interaction at the module scale you want. In our context within the driver_controller module, this drawback is visible and won’t allow us to do correct testing.

 

Instead, we write a Monitor SystemC module, which will interact with the DUT and run the tests as SystemC threads.

 
SC_MODULE(Monitor) {
    SC_CTOR(Monitor) {
        SC_THREAD(run_test_1);
        SC_THREAD(run_test_2);
    }

    void run_test_1() {}
    void run_test_2() {}

    sc_in<bool> clk;
    sc_in<bool> nrst;
    // create sc_out for output signal to the DUT
    // create sc_in for input signal from the DUT
};

I will explain later how we benefit from this in the different tests.

The compilation

 

Now we have to compile the SystemC code into an executable simulation. We will want to recompile SystemVerilog into C++ each time there is a change and write makefile as small as possible for each testbench.

 

In the FPGA/ directory, we currently have the following structure:

  • src/ : containing the SystemVerilog code.
    • systemc/ : containing some SystemC modules, including the driver model we developed.
  • tb_src/ :
    • systemc/ : containing the SystemVerilog testbench we developed in SystemC.
  • sim/ : containing what’s needed to generate the simulation.
    • Makefile: will be the entrypoint to launch tests
    • base_testbench.mk: base makefile from which will inherit the others
    • module_name.mk: makefile for the module_name testbench

 

In Makefile, we will have a variable listing the different module we want to test. It will serve as generator to create FORCE-like tasks to generate and launch the simulation.

 

What we want for the module_name.mk is the following:

 
MODULE := driver_controller
DEPS := clock_lse

ROOT = tb_$(MODULE)
OBJS += $(ROOT)/main.o $(ROOT)/monitor.o driver.o driver_cmd.o

-include base_testbench.mk

$(MODULE).simu: $(OBJS)
    $(LINK.o) $(OBJS) $(LOADLIBES) $(LIBS) $(TARGET_ARCH) -o $@

I’ve currently put the target to generate the simulation in this makefile but it could have been in the base_testbench.mk too. I just feel that it has more sense in the module_name.mk file.

 

MODULE is the device under test and DEPS is a list of SystemVerilog modules needed to build the testbench.

 

Then, the base_testbench will first define variables describing the environment:

export SYSTEMC_INCLUDE ?= /usr/include/
export SYSTEMC_LIBDIR ?= /usr/lib/

SV_MODULE_PATH = ../src

VERILATOR = verilator
VERILATOR_ROOT ?= /usr/share/verilator/include/
VERILATOR_FLAGS = --sc --trace
VERILATOR_BASE = verilated.o verilated_vcd_c.o verilated_vcd_sc.o

Then it will create objects list and dependency list:

 
VERILATOR_OBJS = \
    obj_dir/V$(MODULE).o obj_dir/V$(MODULE)__Syms.o \
    $(patsubst %,obj_dir/V%.o,$(DEPS)) $(patsubst %,obj_dir/V%__Syms.o,$(DEPS))
OBJS += $(VERILATOR_BASE) $(VERILATOR_OBJS)
DEPSFILES = $(subst .o,.d,$(OBJS))

all: $(MODULE).simu

Then it defines the compilation options:

LDFLAGS = -L$(SYSTEMC_LIBDIR)
LINK.o = g++ $(LDFLAGS) $(TARGET_ARCH)
LIBS = -lsystemc
VPATH = ../src/ ../src/systemc ../tb_src/ ../tb_src/systemc ./obj_dir/ $(VERILATOR_ROOT) ../lib
CPPFLAGS = -I../src/systemc/ \
    -I./obj_dir/ \
    -I../lib/ \
    -I../tb_src/systemc/ \
    -I$(VERILATOR_ROOT) \
    -I$(SYSTEMC_INCLUDE)
CXXFLAGS = -DSC_INCLUDE_DYNAMIC_PROCESSES -g

And finally defines the compilation target to rebuild Verilator files or handle dependencies:

obj_dir/V%.cpp obj_dir/V%__Syms.cpp obj_dir/V%.h: %.sv
    $(VERILATOR) $(VERILATOR_FLAGS) -y $(SV_MODULE_PATH) $<

%.d: %.cpp
    $(CXX) $(CPPFLAGS) $(CXXFLAGS) $(TARGET_ARCH) -MM -MP $^ -MF $@

clean:
    rm -rf $(DEPSFILES) $(OBJS) $(MODULE).simu

-include $(DEPSFILES)

Issues

 

What has been done currently doesn’t fulfill our requirements yet. It seems that even if dependency files are correctly generated and included, the simulation won’t recompile if header files are modified. As we are prioritizing the development of the hardware at the moment, we’re not trying to debug it more. But this issue should be solved by the end of the year to make the development of the last testbench more enjoyable.

 

Except this task, I’ve been working on the improvement of the tests made by Adrien on the renderer, mainly by gathering code into sh functions so that tests and especially our use of Xvfb are more robust. I’ve also been working on the routing of the LED panel with Adrien, trying a slightly different version than Vincent’s one, but it seems that his one will be better. Finally I’ve written some tests for the column_mux module, a mux choosing the active columns on the LCD screen, but it doesn’t pass them.

 

I will continue to implement some tests as soon as the hardware is finished or if I get a window of time free.