[ZeROSEro7] SX1276 DIO, ANT and now low power

Last week I improved my LoRa driver by using the IRQ through the DIO pins and greatly improved range. I increased the output power to 14 dBm and made sure the RX sensitivity was correctly set. I then played with the CRC and spreading factor parameters to get the best range. I still have to do an outdoor range test with the antenna we will use, but with the dev kit, we were able to receive the signal at the other end of Telecom, through all its walls.

I am now working on power consumption, that is to say on sleep modes and wakeup times. Since I am focusing on the spy talk, I preferred the ON base current standby mode for the nRF: it saves roughly 100 ms on wakeup for only double the sleep consumption. I think this tradeoff is justified because, even if the spy talk has low throughput, it should be able to send alarm messages really quickly. I must now try to disable as many peripherals as possible, including the SX12, when I want to save energy. Depending on the algorithm, deep sleep may also be needed later.

I had a good look at the MAC layer from Semtech, but I’m not sure we will be using it. As mentioned above, we are aiming for very low latency and I’m not sure it can achieve that. Moreover, there is a lot of overhead and not everything is customizable. I still need to investigate this more, but for now I think I will design my own TCP-like protocol.

My next step will be to run a GATT server on BLE and I should have all my basic blocks to start the Spy talk software.

[SpiROSE] OpenGL ES and mainboard

Howdy :]

OpenGL ES

The first half of this week was dedicated to porting the renderer to OpenGL ES, which I managed to pull off. In the end, the port was really easy, as the desktop version used OpenGL 3.3, which is very close to OpenGL ES.

The renderer PoC running on the wandboard

You may notice there is no 3D visualisation in the middle. This is actually expected, as this view relied on geometry shaders, which are not available on this GPU. Still, I did not waste time re-implementing it, as the only thing that matters is the white things at the bottom left of the window.

However, some issues do remain. Those issues only happen on the Wandboard with the Vivante drivers. Any other GL ES environment is working just fine, without those issues (for example, the VMware Mesa driver that now supports OpenGL ES 3.2).

Mainly, some uniforms are not found for one shader stage: looking them up consistently returns -1, which means “no uniform found” in GL terminology. The strange thing is that only this specific shader is problematic.

Furthermore, the keen-eyed amongst you may have noticed a small glitch on the above picture. The top-right hand slice is missing a pixel column, effectively cutting poor Suzanne’s ear.

What is infuriating is, neither the glitch nor the missing uniforms happen on my Arch Linux VM, where the VMware OpenGL ES driver is consistent and reliable.

Mainboard

This week I also seriously attacked the mainboard and finished up the schematics. As a quick recap, this board is what we called previously the “rotative base”. Its main features are :

  • SBC
  • FPGA
  • Hall Effect sensors
  • Power supplies:
    • 5V (SBC)
    • 4V (LEDs)
    • 3V3 (Drivers, buffers)
    • 3V (FPGA IO)
    • 2V5 (FPGA IO and FPGA PLL)
    • 1V2 (FPGA core)
  • Board-to-board connector to link to the LED panels

Phew!

PSU

I made the PSU with some help from Alexis on the switching supplies. As you may have noticed, we may have gone overboard with the supply rails (man, 7 different voltages on a single board!). However, each has a purpose, and I’ll only bother explaining the two odd ones:

  • 4V: the LEDs are driven to ground by the drivers. Said drivers dissipate the excess voltage themselves. Thus, to avoid overheating, we chose a supply voltage for the LEDs as close to their forward voltage as possible.
  • 3V: The FPGA I/Os can work up to 3V3. However, they are poorly protected, and some overshoot on a 3V3 signal might kill the IO pin (in short, the protection diode used by Altera is too weak). That’s why the banks that output any signal use 3V, which is still readable by the buffers and the drivers while being much more tolerant to overshoot and other signal integrity problems.

Now, onto its architecture. A picture is worth a thousand words:

As for power, well… With overdrive, at the maximum power of the drivers, we can chew through 22A of current per panel. With two panels, that’s 44A of blinky goodness. Oooops. Well, fortunately, people have already had this problem, and TI happens to have a wonderful buck converter module (PTH12040WAD) that can deliver up to 50A. Fantastic!

Place and route

This turned out to be much harder than I expected. It is quite a mess, for one simple reason: the LED panels are identical, but back to back. This means our board-to-board connectors are parallel, but opposite to each other. Most of the signals that go to one go to the other. This means that most of the signals have to cross. Oooopsie!

However, it is coming together. Placement is mostly done, except for a few components that can be tucked anywhere, and most submodules are placed and routed internally. Think of buck converters which, once grouped and routed, can be moved around almost like a single component.

The board is quite spacious because of the SoM that takes quite a lot of space. The smallest bounding circle I could do has a diameter of 175mm.

Next week

Simple: moar routing. Right now, this looks more like the giant spaghetti monster than a PCB….

[SpiROSE] Placing and routing in restrictive environment

Hello everybody !

Recently, I have worked on the PCB of the LED panels. It needs to be finished really soon so that we can work on it shortly. As a brief recap, a LED panel has in its centre a 1920-LED matrix surrounded by 15 LED drivers, 5 of which are beneath the matrix and the remaining 10 above. We now had to add the MOSFETs for the multiplexing as well as the clock buffers (2 buffers each for SCLK, GCLK and LAT, for a total of 6 buffers). Since the drivers and the matrix had already been placed and routed, we tried to figure out the optimal location for the MOSFETs, in order not to mess up the PCB too much.

Routing MOSFETS

Some MOSFETs with their multiplexing lines

Under each LED column, there is a plane for its corresponding multiplexing signal (the filled purple vertical plane). Since we have one MOSFET for each LED column, we chose to place the MOSFETs right where this plane ends, beneath the LED matrix. Yet, there is very little room: vertical traces (the blue ones, layer 1) run between all the LEDs and restrict where the MOSFETs can go. With Unidan, I ran placing and routing tests to determine whether placing the MOSFETs vertically or horizontally would make the routing more convenient. The horizontal placement gave the best results, so I placed then routed this pattern. The MOSFET area is filled with a thin 4V plane.

Routing everything

Lower part of the LED panel, showing some MOSFETs, 2 LED drivers and 1 clock buffer (blue=top, red=bottom)

After that the struggle began: welcome to the trace jungle. Between the 5 bottom drivers are now placed 6 clock buffers, all aligned in a tiny space. The challenge was to route the output signals of the clock buffers up to the 10 upper drivers, as well as the signals that will be transmitted from the rotary motherboard. The big issue is that we should only use 2 layers to do so, i.e. route all buffer/upper-driver, buffer/lower-driver, driver/matrix, MOSFET/MOSFET and MOSFET/multiplexing connections on the 2 external layers, since the 2 inner layers are used for Vcc and ground. I tried not to use the inner layers at all, but it was not entirely possible, so instead I tried to minimize the length and the number of traces that occupy those layers. By the way, using many custom colours for the different nets/traces/planes really helped a lot.

I have almost finished routing all the nets; some still run for a significant length on inner layers and thus need to be improved. But it is not over: we still need to add the MOSFET drivers as well as the board-to-board connector. This task is now our priority and will be carried out at the beginning of the upcoming week.

DFU

This week, I worked on the DFU (Device Firmware Update) over BLE. I succeeded in doing a DFU with our firmware; now I need to automate it.

How it works

To do a DFU, we need another application called the DFU bootloader. It does the following:

  • Check whether a valid app is present (it checks a CRC32 checksum)
  • If so, run it, unless someone is pressing button 4 (of course we will change this)
  • If there is no valid app or someone is pressing button 4, enter DFU mode and wait for a signed DFU packet (there is a public key somewhere in the bootloader code)
  • If nobody is doing a DFU and there was a valid application, run it after a timeout
  • If someone is doing a DFU, write the new code to the flash, check the content, and run it.
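
As a rough illustration, the decision flow described in this list can be written as one small C function. This is only a sketch: the enum, the function name and its four boolean inputs are hypothetical and are not part of the Nordic SDK bootloader.

```c
#include <assert.h>
#include <stdbool.h>

/* All names below are hypothetical; this is not the Nordic SDK API. */
typedef enum {
    BOOT_RUN_APP,       /* start the application already in flash */
    BOOT_WAIT_IN_DFU,   /* stay in DFU mode and keep waiting */
    BOOT_FLASH_NEW_APP  /* write the received image, check it, run it */
} boot_action_t;

boot_action_t choose_boot_action(bool app_valid, bool button_pressed,
                                 bool signed_packet_received,
                                 bool timeout_elapsed)
{
    /* A valid app starts right away unless the DFU button is held. */
    if (app_valid && !button_pressed)
        return BOOT_RUN_APP;

    /* Otherwise we are in DFU mode: a signed packet triggers the update. */
    if (signed_packet_received)
        return BOOT_FLASH_NEW_APP;

    /* Nobody is doing a DFU: fall back to the valid app after the timeout. */
    if (app_valid && timeout_elapsed)
        return BOOT_RUN_APP;

    return BOOT_WAIT_IN_DFU;
}

/* Usage example: a valid app with the button released boots directly. */
void boot_example(void)
{
    assert(choose_boot_action(true, false, false, false) == BOOT_RUN_APP);
}
```

Keeping the decision in a pure function like this makes it easy to check every combination on a laptop before touching the real bootloader.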

So the steps to make it work are:

  • Compile the bootloader and flash it
  • Compile the firmware
  • Create a zip package with nrfutil : nrfutil pkg generate --application _build/nrf52832_xxaa.bin app.zip --debug-mode --hw-version 52 --sd-req 0xFFFE --key-file bootloader/key
  • Send it to your phone, start nRF Connect, and launch the DFU

The DFU took 30 seconds (our app is not stripped, so it should actually be 15 seconds).

The softdevice update took 1 min 30.

Now I need to make the CI create the zip package, with a good private key handling, and see how we will work locally (on our laptops) with it.

[Little Brosers] From Python to C

Soooooo… No joke this week, I’m almost gonna write an essay this time.

 

First step: Allocating memory for our Merkle tree

So, the first issue about implementing our Merkle tree structure using C is memory allocation.

A simple way would be to use a recursive build function for each node and malloc() each of them. But let’s remember that we will use this code on an embedded system: dynamic allocation is not allowed.

So the first struggle was to find out how to statically allocate the space needed for our Merkle tree.

 

First I need to refresh your memory. Our Merkle tree is 4-ary and balanced-ish. That is to say, each child of a node represents the time frame of the node divided by four. But this balanced time distribution rule is broken for the first 4 children of the tree, a.k.a. the root’s children. Each of those nodes gets an arbitrary part of the root’s time frame. This choice has been made because we think a lot of messages will be recent. Thus, the very last 24 hours will be more likely to contain messages than the last week.

To know when to stop splitting a node’s time frame, we have a constant in the code called BUCKET_SIZE. It gives the minimum time frame a node can represent. Thus, if the time frame width of a node divided by the number of children (4 here) is greater than BUCKET_SIZE, we split it. Otherwise, this node becomes a leaf.
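
To make the splitting rule concrete, here is a minimal sketch of how the number of nodes of one sub tree follows from its time frame width and the bucket size. The function names are hypothetical, and it assumes that every node on a given level of a sub tree has the same width, so a whole level either splits or does not.

```c
#include <assert.h>
#include <stdint.h>

#define NB_CHILDREN 4u /* the tree is 4-ary */

/* A node is split when each child would still cover more than bucket_size. */
int must_split(uint64_t width, uint64_t bucket_size)
{
    return (width / NB_CHILDREN) > bucket_size;
}

/* Number of nodes in a sub tree whose root covers `width` time units. */
uint32_t subtree_node_count(uint64_t width, uint64_t bucket_size)
{
    uint32_t total = 1; /* the sub tree root */
    uint32_t level = 1; /* number of nodes on the current level */

    assert(bucket_size > 0);
    while (must_split(width, bucket_size)) {
        level *= NB_CHILDREN;   /* every node of the level gets 4 children */
        total += level;
        width /= NB_CHILDREN;   /* each child covers a quarter of its parent */
    }
    return total;
}
```

For instance, a sub tree covering 64 time units with a bucket size of 1 is split twice (64 to 16 to 4) and needs 1 + 4 + 16 = 21 nodes; its leaves cover 4 units each because a further split would not give children wider than the bucket size.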

Here is a basic representation:

So how do we statically allocate such a structure?

Several solutions were examined and to be short, the chosen one is the magic “python generated C header”.

In other words, a Python script included in our Makefile is called whenever the file containing our Merkle tree’s constants is modified, and rewrites a second header which only contains the sizes of our sub trees. This way we only have to include this generated header in our merkle_tree.c file and we know at compile time the size needed for each subtree.

Oh and the sub trees are stored in C arrays.

 

Second step: Building the Merkle tree

Ok, now we know we have all the memory pre-allocated. The thing is…. we only have four dumb arrays. We need to turn this into a tree structure.

See below, we have 4 arrays of ‘node’ C structs and one node C struct:

The sub trees are stored in their array in a Breadth First Search manner. The sizes of the arrays have been “#define” declared by the python script in a header.

The image above is a simplified representation of the sub trees and some arrows are missing: there are also pointers to the parent node. The job in this part has been to make sure each node C struct of the arrays is filled correctly and that each of its pointers is set.
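
As an illustration of what “setting the pointers” can look like for one array, here is a sketch under a strong assumption: each sub tree is a perfect 4-ary tree stored breadth-first, so the children of the node at index i sit at indices 4*i+1 to 4*i+4. The struct is a simplified stand-in with hypothetical names, and depth is counted from the sub tree root.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NB_CHILDREN 4u

/* Simplified stand-in for the node struct, keeping only the links. */
typedef struct mini_node mini_node_t;
struct mini_node {
    mini_node_t *p_parent;
    mini_node_t *p_children[NB_CHILDREN];
    uint16_t depth; /* relative to the sub tree root in this sketch */
};

/* Wire up a perfect 4-ary tree stored breadth-first in nodes[0..count-1]. */
void link_bfs_subtree(mini_node_t *nodes, size_t count)
{
    assert(nodes != NULL && count > 0);
    nodes[0].p_parent = NULL; /* the caller attaches it to the real root */
    nodes[0].depth = 0;

    for (size_t i = 0; i < count; i++) {
        for (size_t k = 0; k < NB_CHILDREN; k++) {
            size_t child = NB_CHILDREN * i + 1 + k;
            if (child < count) {
                nodes[i].p_children[k] = &nodes[child];
                nodes[child].p_parent = &nodes[i];
                nodes[child].depth = (uint16_t)(nodes[i].depth + 1);
            } else {
                nodes[i].p_children[k] = NULL; /* this node is a leaf */
            }
        }
    }
}
```

Because a parent always has a smaller index than its children, a single pass over the array is enough: the depth of a node is already known when its children are visited.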

For the curious ones, here is the content of a node C struct (for now):

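#include <stdint.h>

/* Forward declarations so the structs can point to each other. */
typedef struct node_t node_t;
typedef struct slot_t slot_t;

/* LB_MT_NB_CHILDREN and SHA256_BLOCK_SIZE come from the project headers. */
struct node_t {
    uint64_t start;                        /* time frame covered by the node */
    uint64_t end;
    uint32_t child_width;
    node_t *p_parent;
    node_t *p_children[LB_MT_NB_CHILDREN];
    uint16_t depth;
    uint16_t child_id;
    uint16_t array_id;

    slot_t *p_slots;                       /* bucket of a leaf (linked list) */
    uint8_t p_hash[SHA256_BLOCK_SIZE];
};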

 

 

Third step: Attaching messages to leaves

We have now a Merkle tree structure we can explore without knowing it’s an array and just using the provided pointers in each node.

The next step is to find out which message has to be attached to which leaf. Also, the actual messages will be stored in external flash so we can’t “put” the message in the leaf structure. We will instead use structures called ‘slot’. Each slot can only correspond to one message and will have an integer which will correspond to the address of its message in the external flash. The only thing we will copy from the message is its time stamp, for convenience. This is what it looks like :

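typedef struct slot_t slot_t;

struct slot_t {
    slot_t *p_previous;
    slot_t *p_next;
    uint32_t msg_addr;   /* address of the message in external flash */
    uint32_t timestamp;  /* copied from the message for convenience */
};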

So each leaf will be able to have what we call a bucket, a.k.a. a linked list of slot_t C structs.

By the way, the slots are stored in a global array of slot_t. The array’s size is the maximum number of messages we can store in the external flash. We also have three global variables of type slot_t* in the program, named begin_busy, end_busy and begin_free. We will see that our array of slots can be seen as a pool of slots.

At startup, none of the slots are used. Thus, we chain all the slots together, then assign begin_free to the first slot of the array, making begin_free the head of a linked list of ready-to-use slots.

During the tree building process, a slot will be detached from this linked list each time a message is attached to a leaf’s bucket. The begin_busy and end_busy pointers will be set later.
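
The pool idea can be sketched as follows. The slot layout and begin_free come from the description above; MAX_MESSAGES, the array name and the two function names are hypothetical, and the capacity of 8 is only there to keep the example small.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_MESSAGES 8u /* hypothetical capacity, for the example only */

typedef struct slot_t slot_t;
struct slot_t {
    slot_t *p_previous;
    slot_t *p_next;
    uint32_t msg_addr;
    uint32_t timestamp;
};

slot_t slots[MAX_MESSAGES]; /* the global pool */
slot_t *begin_free;         /* head of the ready-to-use list */

/* Chain every slot of the array together and mark them all as free. */
void slot_pool_init(void)
{
    for (size_t i = 0; i < MAX_MESSAGES; i++) {
        slots[i].p_previous = (i > 0) ? &slots[i - 1] : NULL;
        slots[i].p_next = (i + 1 < MAX_MESSAGES) ? &slots[i + 1] : NULL;
    }
    begin_free = &slots[0];
}

/* Detach the first free slot, or return NULL when the pool is exhausted. */
slot_t *slot_pool_take(uint32_t msg_addr, uint32_t timestamp)
{
    slot_t *s = begin_free;
    if (s == NULL)
        return NULL;
    assert(s->p_previous == NULL); /* the head of a list has no predecessor */

    begin_free = s->p_next;
    if (begin_free != NULL)
        begin_free->p_previous = NULL;

    s->p_next = NULL;
    s->msg_addr = msg_addr;
    s->timestamp = timestamp;
    return s;
}
```

The returned slot is fully detached, so the caller is free to insert it anywhere in a leaf’s bucket.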

At the end of this process, each message will be attached to its slot. This slot will be placed in a leaf’s bucket following those rules:

  • When there is only one message belonging to a leaf’s time frame, the bucket of this leaf will be of size 1 and will contain this message’s slot.
  • If several messages with different time stamps all belong to the same leaf, they will be attached to this leaf’s bucket. The bucket’s order will be based on the slots’ time stamps.
  • Finally, if several messages with the same time stamp all belong to the same leaf, they will be attached to this leaf’s bucket. We will not be able to order them by time stamp since they are equal. We will use their SHA256 instead.
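
These ordering rules boil down to a two-level comparison. The sketch below assumes that buckets are sorted oldest first and that the SHA256 of each message is available to the caller as a 32-byte array; neither detail is stated above, and the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define HASH_SIZE 32 /* a SHA-256 digest is 32 bytes */

/* Returns <0, 0 or >0 like memcmp: older messages first, hash as tie-break. */
int slot_order(uint32_t ts_a, const uint8_t hash_a[HASH_SIZE],
               uint32_t ts_b, const uint8_t hash_b[HASH_SIZE])
{
    assert(hash_a != NULL && hash_b != NULL);

    if (ts_a != ts_b)
        return (ts_a < ts_b) ? -1 : 1;

    /* Same time stamp: fall back to a byte-wise comparison of the hashes. */
    return memcmp(hash_a, hash_b, HASH_SIZE);
}
```

Using the hash as a tie-break gives the same order on every device, which matters if two devices are to end up with identical trees.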

Fourth step: Building our linked list of time-ordered messages

 

So now, we have a Merkle tree with (small) linked lists attached to its leaves. The only thing left to do is to attach all those linked lists together to get a long one. The beginning and end of this long list will be stored in the begin_busy and end_busy pointers. This long linked list will be used to travel across the messages in a time-ordered manner. Being able to do so will be very useful because it means we do not have to keep the messages ordered in the external flash.
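
A possible way to stitch the buckets together is sketched below. It assumes the leaves are visited in time order and that each bucket is already sorted; the function name is hypothetical, and how a leaf remembers where its own bucket stops inside the long list is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct slot_t slot_t;
struct slot_t {
    slot_t *p_previous;
    slot_t *p_next;
    uint32_t msg_addr;
    uint32_t timestamp;
};

slot_t *begin_busy; /* first slot of the long list */
slot_t *end_busy;   /* last slot of the long list */

/* Append one leaf's bucket to the end of the long list.
 * Buckets must be appended leaf by leaf, in time order. */
void busy_list_append_bucket(slot_t *bucket_head)
{
    if (bucket_head == NULL)
        return; /* empty leaf, nothing to attach */
    assert(bucket_head->p_previous == NULL);

    if (begin_busy == NULL) {
        begin_busy = bucket_head;
    } else {
        end_busy->p_next = bucket_head;
        bucket_head->p_previous = end_busy;
    }

    /* Walk to the last slot of the bucket: it becomes the new tail. */
    end_busy = bucket_head;
    while (end_busy->p_next != NULL)
        end_busy = end_busy->p_next;
}
```

Walking from begin_busy through p_next then visits every message in time order, whatever its position in the external flash.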

 

What to do next ?

 

So yeah, it has been quite challenging to find an efficient way to store and build this Merkle tree in C. I now understand why Python is such an easy-to-prototype language.

 

Even if I have some message pools for which the tree building process succeeds, I know for sure that it is not over yet. I found a particular message pool which makes my building process crash, so I might spend the following week fixing it.

 

By the way, there is still no automatic test using criterion in my code. I plan on adding some as soon as possible. For now I only use my eyes to read the text-representation of my tree and printf() error reporting in order to detect errors.

 

See you next week !

 

Antony Lopez

[Little Brosers] Who Are You?

Last week I mentioned that I would start working on the authentication part. However, there was a PSSC that needed to be closed fast, so I worked on it for most of the week. It was basically about making the drop able to read a message from a sequence of bytes and verify its signature. This was simple considering I did exactly the same thing in Java before, but in a more complex way.

Besides that, I started working on the authentication service. I created a new GATT service and added characteristics associated with the data that will be exchanged. The drop will always request the user’s time certificate, and the certificate with his public key, both signed by our server. After that, it’s a challenge-response mechanism; the user sends a random number to the drop which the latter has to sign (to make sure it’s not a rogue drop), and vice-versa to authenticate the user. After this stage is passed, both parties can start the Merkle tree comparison and the message exchange.

For the moment, the service is up and running, and the characteristics are available with the proper read/write permissions. The next step is to work on the backend of this service, that is, the challenge creation, response and verification, as well as the verification of the user and time certificates.

[ZeROSEro7] Host USB still get no descriptor…

This week, as last week, I worked on schematics and USB Keyboard.

Schematics and PCB

We have almost finished the schematics! The second review of the USB Sniffer and the Stealth Drop passed, and a few details have to be updated on the Spy Talk. The PCBs will be updated very soon.

USB Keyboard

I continued to work on the USB Keyboard issue. I updated the Makefile to compile with ChibiOS_Contrib so that I can use the hal_usbh.h library.

In this file, there are some interesting typedefs:

  • USBHDriver
  • usbh_status
  • usbh_device
  • usbh_port
  • usbh_ep

…and some interesting functions:

  • usbhStart(USBHDriver *usbh);
  • usbhMainLoop(USBHDriver *usbh);
  • usbhDevicePrintInfo(usbh_device_t *dev);
  • usbhDevicePrintConfiguration(const uint8_t *descriptor, uint16_t rem);

I also found a very nice post on the ChibiOS forum which explains the USB Host stack and driver for STM32. The contributor said it was still in development, but he got it working with some devices like a keyboard. That was 2 years ago, but I found some //TODO comments in the latest ChibiOS version.

I took the files from the HOST_USB example to integrate them into my project. I’m able to compile it and I added code in main.c to use it. I configured the OTG_HS (OTG2) pads correctly. I even verified the 5V voltage on USB_VBUS with a multimeter and with a device like a mouse, whose LED I can see turn on.

I used usbhDevicePrintInfo and usbhDevicePrintConfiguration on the USBHDriver USBHD2 to print all the device information over the RTT connection. Nevertheless, nothing happens and I can only see:

usbhDevicePrintInfo
----- Device info -----
Device descriptor:
USBSpec=0000, #configurations=0, langID0=0000
Class=00, Subclass=00, Protocol=00
VID=0000, PID=0000, Release=0000
----- End Device info -----

usbhDevicePrintConfiguration
----- Configuration info -----
Configuration descriptor:
Configuration 101, #IFs=104
----- End Configuration info -----

I’m still working on retrieving the USB Keyboard descriptor…

Next Week

Next week, I will order all the PCBs before Christmas and I will continue to work on the USB Keyboard.

[SpiROSE] Testing systemverilog with SystemC, final round

Among the tasks I’ve been busy with these weeks, there is the making of integration tests for the FPGA’s SystemVerilog code.

SystemC is a very good asset to put in a CI environment. Basically, its advantages are:

  • It is quite light and free.
  • You only need tools to compile C++ code in your CI, instead of ModelSim or another full simulator.
  • You can customize the output to your needs, although formatting SystemC code with only 80 characters per line is quite awful.
  • You can use other libraries more intuitively.
  • You can even get VCD files back from the CI.
  • The build files are almost ordinary makefiles and the testbenches are C++ (sc_)main functions.

But eventually, we want to use this tool against the SystemVerilog code, so we need a bridge between them.

The situation

To make things clear, I will describe the situation for the driver_controller module. This module is the one generating the control and data signals for the drivers. Among these signals there are:

  • The GCLK and SCLK signals, which are both clocks.
  • The LAT signal, which has strong timing requirements. Its high state can last for different durations, each associated with a different command in the driver.
  • The SIN signal, which is the input of the driver itself.

Before using the driver, it has to be configured, and it’s the role of the driver_controller to do that too.

The first inputs of the driver controller are two clocks: clk_hse running at 66MHz and clk_lse running at 33MHz. The latter is generated from the clk_hse clock in the clock_lse module, which we will need to include in the SystemC testbench too.

Then we have a framebuffer_sync signal, which should be generated each time we start a new frame. Finally there is a 30-bit framebuffer_data signal, giving the SIN for each driver.

The tools

To achieve this, we will need to translate our SystemVerilog code into SystemC modules. It is time for the Verilator tool to enter the scene.

This tool was first included in our project as a linter, but turned out to be more useful than first thought. It can translate SystemVerilog code into either special C++ to use with Verilator, or SystemC modules, which are a bit slower than the former but far easier to use.

Our little clock module becomes the following beautiful SC_MODULE:

SC_MODULE(Vclock_lse) {
    public:

    // PORTS
    // The application code writes and reads these signals to
    // propagate new values into/out from the Verilated model.
    sc_in<bool> clk_hse;
    sc_in<bool> nrst;
    sc_out<bool> clk_lse;

    // LOCAL SIGNALS
    // Internals; generally not touched by application code

    // LOCAL VARIABLES
    // Internals; generally not touched by application code
    VL_SIG8(__Vcellinp__clock_lse__nrst,0,0);
    VL_SIG8(__Vcellinp__clock_lse__clk_hse,0,0);
    VL_SIG8(__Vcellout__clock_lse__clk_lse,0,0);
    VL_SIG8(__VinpClk__TOP____Vcellinp__clock_lse__nrst,0,0);
    VL_SIG8(__Vclklast__TOP____Vcellinp__clock_lse__clk_hse,0,0);
    VL_SIG8(__Vclklast__TOP____VinpClk__TOP____Vcellinp__clock_lse__nrst,0,0);
    VL_SIG8(__Vchglast__TOP____Vcellinp__clock_lse__nrst,0,0);
    VL_SIG(__Vm_traceActivity,31,0);

    // INTERNAL VARIABLES
    // Internals; generally not touched by application code
    Vclock_lse__Syms* __VlSymsp; // Symbol table

    // ...
};

 

Verilator will generate its files in an obj_dir/ directory, which will be in the sim/ directory.

It will generate Vmodule_name.{h,cpp} file, containing the module itself and glue code to go with verilator library, Vmodule_name__Syms.{h,cpp} files making it available in the verilator library and some Vmodule_name__Trace.{h,cpp} files.

For the simulation, you only need to link against Vmodule_name.o and Vmodule_name__Syms.o. You can add the Vmodule_name__Trace.o file if you want to get the trace written into a VCD file, but you’ll need to have a deeper look into Verilated, the Verilator library, and especially the VerilatedVCD object to make it work.

The methodology

 

The first idea when testing a module in SystemC is to put all the testing code in the sc_main function, where you are able to control the progress of the simulation:

#include <systemc.h>
#include "Vmodule_name.h"   // generated by Verilator

int sc_main(int argc, char **argv) {
    sc_time T(33, SC_NS);
    sc_clock clk("clk", T);
    sc_signal<bool> nrst("nrst");
    // create the IN/OUT signal for the DUT
    Vmodule_name dut("module_name");
    dut.clk(clk);
    dut.nrst(nrst);
    // bind the IN/OUT signal of the DUT
    while (sc_time_stamp() < sc_time(10, SC_MS)) {
        // advance the simulation of T
        sc_start(T);
        // do your tests
    }
    return 0;
}

 

However, this looks like software testing, and it is usually a bad idea because you won’t be able to describe every interaction at the module scale you want. In our context, with the driver_controller module, this drawback is visible and won’t allow us to do correct testing.

 

Instead, we write a Monitor SystemC module, which will interact with the DUT and run the tests as SystemC threads.

 
SC_MODULE(Monitor) {
    SC_CTOR(Monitor) {
        SC_THREAD(run_test_1);
        SC_THREAD(run_test_2);
    }

    void run_test_1() {}
    void run_test_2() {}

    sc_in<bool> clk;
    sc_in<bool> nrst;
    // create sc_out for output signal to the DUT
    // create sc_in for input signal from the DUT
};

I will explain later how we benefit from this in the different tests.

The compilation

 

Now we have to compile the SystemC code into an executable simulation. We want to recompile the SystemVerilog into C++ each time there is a change, and to write a makefile as small as possible for each testbench.

 

In the FPGA/ directory, we currently have the following structure:

  • src/ : containing the SystemVerilog code.
    • systemc/ : containing some SystemC modules, including the driver model we developed.
  • tb_src/ :
    • systemc/ : containing the SystemVerilog testbench we developed in SystemC.
  • sim/ : containing what’s needed to generate the simulation.
    • Makefile: will be the entrypoint to launch tests
    • base_testbench.mk: base makefile from which will inherit the others
    • module_name.mk: makefile for the module_name testbench

 

In Makefile, we will have a variable listing the different modules we want to test. It will serve as a generator to create FORCE-like targets that generate and launch the simulation.

 

What we want for the module_name.mk is the following:

 
MODULE := driver_controller
DEPS := clock_lse

ROOT = tb_$(MODULE)
OBJS += $(ROOT)/main.o $(ROOT)/monitor.o driver.o driver_cmd.o

-include base_testbench.mk

$(MODULE).simu: $(OBJS)
    $(LINK.o) $(OBJS) $(LOADLIBES) $(LIBS) $(TARGET_ARCH) -o $@

I’ve currently put the target that generates the simulation in this makefile, but it could have been in base_testbench.mk too. I just feel that it makes more sense in the module_name.mk file.

 

MODULE is the device under test and DEPS is a list of SystemVerilog modules needed to build the testbench.

 

Then, the base_testbench will first define variables describing the environment:

export SYSTEMC_INCLUDE ?= /usr/include/
export SYSTEMC_LIBDIR ?= /usr/lib/

SV_MODULE_PATH = ../src

VERILATOR = verilator
VERILATOR_ROOT ?= /usr/share/verilator/include/
VERILATOR_FLAGS = --sc --trace
VERILATOR_BASE = verilated.o verilated_vcd_c.o verilated_vcd_sc.o

Then it will create objects list and dependency list:

 
VERILATOR_OBJS = \
    obj_dir/V$(MODULE).o obj_dir/V$(MODULE)__Syms.o \
    $(patsubst %,obj_dir/V%.o,$(DEPS)) $(patsubst %,obj_dir/V%__Syms.o,$(DEPS))
OBJS += $(VERILATOR_BASE) $(VERILATOR_OBJS)
DEPSFILES = $(subst .o,.d,$(OBJS))

all: $(MODULE).simu

Then it defines the compilation options:

LDFLAGS = -L$(SYSTEMC_LIBDIR)
LINK.o = g++ $(LDFLAGS) $(TARGET_ARCH)
LIBS = -lsystemc
VPATH = ../src/ ../src/systemc ../tb_src/ ../tb_src/systemc ./obj_dir/ $(VERILATOR_ROOT) ../lib
CPPFLAGS = -I../src/systemc/ \
    -I./obj_dir/ \
    -I../lib/ \
    -I../tb_src/systemc/ \
    -I$(VERILATOR_ROOT) \
    -I$(SYSTEMC_INCLUDE)
CXXFLAGS = -DSC_INCLUDE_DYNAMIC_PROCESSES -g

And finally defines the compilation target to rebuild Verilator files or handle dependencies:

obj_dir/V%.cpp obj_dir/V%__Syms.cpp obj_dir/V%.h: %.sv
    $(VERILATOR) $(VERILATOR_FLAGS) -y $(SV_MODULE_PATH) $<

%.d: %.cpp
    $(CXX) $(CPPFLAGS) $(CXXFLAGS) $(TARGET_ARCH) -MM -MP $^ -MF $@

clean:
    rm -rf $(DEPSFILES) $(OBJS) $(MODULE).simu

-include $(DEPSFILES)

Issues

 

What has been done currently doesn’t fulfill our requirements yet. It seems that even if dependency files are correctly generated and included, the simulation won’t recompile if header files are modified. As we are prioritizing the development of the hardware at the moment, we’re not trying to debug it more. But this issue should be solved by the end of the year to make the development of the last testbench more enjoyable.

 

Apart from this task, I’ve been working on improving the tests made by Adrien on the renderer, mainly by gathering code into sh functions so that the tests, and especially our use of Xvfb, are more robust. I’ve also been working on the routing of the LED panel with Adrien, trying a slightly different version from Vincent’s, but it seems that his will be better. Finally, I’ve written some tests for the column_mux module, a mux choosing the active columns on the LCD screen, but it doesn’t pass them.

 

I will continue to implement some tests as soon as the hardware is finished or if I get a window of time free.

[SpiROSE] Driver protocol and renderer test

Driver protocol

This week I have made a lot of changes in the driver controller code. This module controls all 30 drivers, so it has to send the data, sclk (shift register clock), gclk (displayed clock) and lat commands. To test it we have a PCB with 8 columns of LEDs (7 columns with only one LED, and a last column with 16 LEDs). This simulates what a driver will have to drive: 8 columns of 16 LEDs, with multiplexing. We have connected a driver to this PCB, and use the DE1-SoC board we have at school as our FPGA. On those boards the FPGA is a Cyclone V, not a Cyclone III as we plan to use; however, the driver controller code is device agnostic, so it is not a problem.

The driver controller module is supposed to receive data from another module, called framebuffer, which reads the images from the FPGA RAM. Thus, for the test, I wrote a simple framebuffer emulator which sends data directly, without reading any RAM.

The driver controller is a state machine: it can send a new configuration to the drivers, dump this configuration for debug purposes, do a LED Open Detection, or be in stream mode where it sends data and commands to actually display something. This last state has to be in sync with the framebuffer, thus the framebuffer sends a signal to the driver controller when it starts sending data from a new slice. When the driver controller goes from any state to the stream state, it has to wait for this signal. This signal will also be used by the multiplexing module.

The drivers have a lot of timing requirements: after each lat command, sclk or gclk needs to be paused to give the driver time to latch its buffers, and data and lat need to change several ns before the sclk rising and falling edges. Therefore I added a second clock, twice as fast (66 MHz), and used it to generate two 33 MHz clocks in phase quadrature. One is used to clock the state machine, the other is used to generate sclk and gclk. This means that the lat and data edges will occur a quarter cycle before or after the sclk/gclk edges, which is enough to respect the timing requirements.
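
To put a number on that quarter cycle, here is a small sketch that computes the offset between the edges of two clocks in phase quadrature. The function name is hypothetical, and the driver’s actual setup and hold figures are not given here, so only the available margin is computed.

```c
#include <assert.h>

/* Delay, in nanoseconds, between edges of two clocks in phase quadrature:
 * a quarter of the clock period. */
double quadrature_offset_ns(double clock_hz)
{
    assert(clock_hz > 0.0);
    return 1e9 / clock_hz / 4.0;
}
```

With a nominal 33 MHz clock this gives roughly 7.6 ns, which is also half a period of the 66 MHz clock used to generate both phases.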

In stream mode, gclk has to be input as segments of length 2^n. During a segment we have to send all the new data for the 16 LEDs, and input the right lat commands to latch the buffers. The final command has to be input precisely at the last gclk cycle of the segment. In poker mode we send only 9 bits per LED, which takes 9*48 = 432 sclk cycles. The closest power of two is 512, thus we have 512-432 = 80 cycles of blanking to put somewhere. It was first decided to do all the blanking at the beginning of a segment, and then stream all the data. However, as stated before, we need to pause sclk after each WRTGS command, and these are sent every 48 cycles. Fortunately one cycle is enough, thus we have 8 blanking cycles not occurring at the beginning of a segment. So now we have 72 cycles of blanking, then one cycle of blanking every 48 cycles.

This means that the framebuffer has to take those blanking cycles into account. Skipping just one shifts all the data, resulting in a weird color mix.
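
The blanking budget can be checked with a few lines of C. The layout below is one reading of the description (72 blanking cycles first, then 9 groups of 48 data cycles separated by a single pause cycle, with the last data cycle on the last cycle of the segment); the constant and function names are hypothetical and this is not the actual SystemVerilog implementation.

```c
#include <assert.h>
#include <stdbool.h>

enum {
    SEGMENT_CYCLES = 512,                    /* gclk segment, a power of two */
    GROUP_CYCLES   = 48,                     /* sclk cycles per WRTGS command */
    GROUPS         = 9,                      /* 9 bits per LED in poker mode */
    DATA_CYCLES    = GROUPS * GROUP_CYCLES,  /* 432 */
    PAUSE_CYCLES   = GROUPS - 1,             /* 8 pauses between the groups */
    LEAD_BLANKING  = SEGMENT_CYCLES - DATA_CYCLES - PAUSE_CYCLES /* 72 */
};

/* True when cycle `c` of a segment carries no data (assumed layout). */
bool is_blanking_cycle(int c)
{
    assert(c >= 0 && c < SEGMENT_CYCLES);

    if (c < LEAD_BLANKING)
        return true;

    /* After the lead-in, each group is 48 data cycles plus 1 pause cycle. */
    return ((c - LEAD_BLANKING) % (GROUP_CYCLES + 1)) == GROUP_CYCLES;
}
```

A framebuffer emulator could call such a predicate on every cycle to decide whether to emit data or hold, which keeps it aligned with the driver controller.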

Demo

To test all this I wrote two simple demos. One simply lights all the LEDs in the same color (red, blue or green). The second one lights the LEDs with the following pattern:

A button allows shifting the pattern, resulting in a nice Christmas animation.

You can notice that the colors don’t have the same luminosity. Fortunately this can be controlled with the driver configuration: each color has a 512-step brightness control. What is still unclear to me is whether the driver simply diminishes the power sent to a color, or divides the same amount among the three colors. The current measurements we have made seem to suggest the latter, as the global amount of power doesn’t change when reducing the green intensity, for instance.

Renderer test

The renderer allows us to voxelize an OpenGL scene. It is still a proof of concept and will soon be turned into a library. To test it, I wrote an sh script that does the following steps:

  • Start the renderer with the default configuration to check that there is no error and that the shaders load properly
  • Take a screenshot of the rendering with ImageMagick, and check that something is actually displayed by checking the color of the central pixel
  • Start the renderer with a simple sphere, take a screenshot and compare it (with ImageMagick) to a reference image to detect any changes
  • Start the renderer in xor and non-xor mode, and compare two screenshots taken at the same time

[AmpeRose] And yet another problem… Experts… A word…

Hello,

So, in previous posts, we’ve shown you the results of simulating Switches, initial calibration, automatic calibration sub-system, etc. In addition, we’ve talked about some problems and how we’ve approached them and shared the results with you. However, we are still having a little problem, let’s see what’s happening… The origin of the problem is the DUT itself, to be more precise, it’s the decoupling capacitors that are bothering us. Yesterday, we showed you (In this post) that  a decoupling capacitor must be used by the DUT to ensure that we do not have a big voltage drop in case of a very fast transition between current ranges. This guarantees that the DUT will always be powered up.

Now, we’re getting closer to the problem.

Problem Statement

A negative current is being fed to AmpeRose by the decoupling capacitors during fast current transitions, therefore delaying the stabilisation of the op amp output.

Remember that we’re sampling every 10 µs. That being said, a lot of samples (~70, so roughly 700 µs) would need to be discarded before the voltage stabilizes.

What do we want?

We want to block the current flowing from the decoupling capacitor of the DUT. Easy: let’s use a diode.

Not that easy! In fact, an ideal diode would have definitely solved our problem(?), because it allows the current to flow in one direction only and has no voltage drop (voltage drop = 0). However, such a diode does not exist… 🙁

Different solutions

Manufacturers (Linear Technology, Maxim, …) provide different near-ideal diode solutions. However, the voltage drop is still too large, and therefore not suitable for our needs.

Well, I hope I made the problem clear… If you have any questions or clarification requests, we’re here to answer… And, it goes without saying… We’re open for any and all suggestions.

Thank you.