So Long, and Thanks for All the Fish

A big thanks to you all! This year has been a wonderful experience. We hope that you enjoyed it as much as we did.

Let’s keep in touch and best wishes!

[SpiROSE] Last few days, close to the end, but we have finished the FPGA’s modules !

Hey, last week I was quite pessimistic about the time we had left for testing, verifying and calibrating our hardware. It seems that this is almost fixed! We are a little more than two days from the end and the three boards are done, almost at 100%. There is still an unidentified issue with one of the voltage regulators, which apparently made it burn, but it might run well tomorrow.

We managed to implement the whole new FPGA architecture that Adrien described in today’s post and added a bunch of new SystemC tests. We made a global pinout assignment with a Quartus TCL definition file for the EP3C40 FPGA, and a lot of different testbenches for both the EP3C40 and DE1_SoC targets, so as to exhaustively test every part of the project on the hardware. Tomorrow morning I will finish one of these testbenches, which will check whether our solution for synchronizing the RGB input stream with the display works correctly. The goal is to feed a full RGB cylinder in 3D to the RGB logic, simulate the rotation and extract only one slice position. If the panel were to change color, we would know it doesn’t work.

In the end, it will probably be a happy ending, or else we can still show that we know how to blink an LED.

[ZeROSEro7] Transparent keyboard

This week, I worked on the USB keyboard, the SPI slave, reading/writing in flash and the casing.

USB Keyboard

The USB keyboard is now fully transparent. All inputs work, and the Caps Lock and Num Lock LEDs are driven according to the SET REPORT requests from the computer as well.

I could add a feature that directly copies important descriptors of the plugged-in keyboard, like the PID and VID, to be even more transparent. It will be done if time allows.

SPI Slave

Currently, the most important part is the SPI slave. It’s used on two devices (USB Sniffer and Stealth Drop) to communicate between the nRF52 (master) and the STM32 (slave).

The configuration of the STM32 is as follows:

  • 16bit
  • CPHA = 0, CPOL = 0
  • MSB-first
  • NSS Hardware
  • Motorola mode
  • Slave mode
  • Rx interrupt on

I also added SPI_IRQHandler() to the vector table; it is called on every interrupt.

I received some data from the nRF52, but I only got the first byte of each frame and the overrun flag was set. I fixed it by using a mailbox to manage the received data.

Flash

I tried to read and write in the flash of the STM32F407 (Olimex). It’s not the most important feature, but there is more storage in flash than in RAM, and passwords will be safe in flash memory if the device is unplugged or the host computer is turned off.

First of all, I added a section reserved for password data in the linker script. I chose a dedicated sector to be sure not to conflict with other data. In the software, I retrieve the address of this section.

I can easily read from my section with the right address. However, to write to it, I have to unlock flash access by writing the right key to the right register. A first write always works while the flash still holds 0xFF… But to write again, I have to erase the sector first. And I have some trouble with the erase instruction because the STM32 crashes… and the device no longer boots at all. I have to reflash a program to fix it…
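To illustrate why the first write works and the second one needs an erase, here is a minimal Python sketch of how NOR flash generally behaves. It is a toy model, not the STM32 flash driver: the `FlashSector` class and its methods are made up for the illustration, and it assumes that programming can only flip bits from 1 to 0.

```python
class FlashSector:
    """Toy model of one NOR flash sector (hypothetical, not the STM32 HAL)."""

    def __init__(self, size):
        self.cells = bytearray(b"\xff" * size)  # erased state is all ones

    def program(self, offset, data):
        # Programming can only clear bits (1 -> 0): result is old AND new.
        for i, byte in enumerate(data):
            self.cells[offset + i] &= byte

    def erase(self):
        # Erasing works on the whole sector and sets every bit back to 1.
        self.cells = bytearray(b"\xff" * len(self.cells))

    def read(self, offset, length):
        return bytes(self.cells[offset:offset + length])


sector = FlashSector(16)
sector.program(0, b"\x12\x34")
first = sector.read(0, 2)       # matches what was written

sector.program(0, b"\xab\xcd")  # second write without an erase
corrupted = sector.read(0, 2)   # bitwise AND of both values

sector.erase()
sector.program(0, b"\xab\xcd")
second = sector.read(0, 2)      # correct again after the erase
```

In this model the second write silently produces the AND of the old and new bytes, which is why the whole sector has to be erased before new passwords can be stored.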

Casing

With Vincent, we designed the prototype casings for the USB Sniffer and Spy Talk. We worked in Autodesk Fusion 360. See the results in the following pictures:

Spy Talk

USB Sniffer

Last week

I have to fix the SPI slave and code the USB Sniffer demo software.

[ZeROSEro7] BLE fully operational

I made the last improvements to the BLE control and enabled a fast two-way connection to a characteristic with a simple programming interface. I am using writes without response on the GATT client side and notifications on the server side, along with the highest MTU available for the nRF52 (200+ bytes). This will be useful for the USB Sniffer, where we will download a large amount of data.

I ran a test program on our SpyTalk custom PCB, and the BLE and LoRa chips worked. However, because of difficulties when soldering, one of the crystals doesn’t seem to work, but this is not critical. It will only raise power consumption during off time.

I prepared a test program on the nRF SPI for the USB Sniffer that sends packets to the STM32 and expects a response. This program is known to work because it did with the LoRa chip, so now Erwan and I can implement the SPI driver on the STM side and devise a simple protocol to exchange our data. This part will be even easier on the Stealth Drop because only a wakeup is sent. It could be done with a simple GPIO.

My next priorities are to finish the USB Sniffer, including the final user interface, and to design and implement a simple protocol over LoRa for the SpyTalk. This protocol will include at least addressing, retransmission, and basic AES encryption. I abandoned the idea of using LoRaMAC because this protocol, despite being very efficient for what we want to achieve, requires gateway stations, which are usually public, and that is not what we planned for our network architecture. It would also require different hardware. Given the time left, I may also add synchronization for lower RX time, better security with changing keys and any other ideas that come to mind. I’d also like to add bonding and security on the BLE connections.

Erwan and Vincent made great progress on their side and I think we are very close to having all 3 of our gadgets fully working!

[SpiROSE] FPGA synchronization architecture

Synchronization signals

The motor may not have a cycle-accurate speed, the rgb logic will not send data at exactly 33 MHz… there are various synchronization issues to tackle. Thus we need some sync signals.
In order to have the simplest and least error-prone code, the modules will all rely on a few sync signals, each one output by the relevant module so that the other ones don’t need to worry about its inner workings. The framebuffer, for instance, shouldn’t have to know how the driver controller works; it just needs to know when to send data to it.

The synchronisation signals are the following:

| Signal name | Output by | Used by | Role |
|---|---|---|---|
| hall_sensor_trigger | hall effect sensor | | Indicates that we have made half a turn |
| position_sync | hall effect sensor | framebuffer, driver controller | Indicates that the position has changed |
| rgb_enable | SPI | rgb logic | Command sent by the SBC to start everything |
| stream_ready | rgb logic | framebuffer, driver controller | Indicates that enough slices are stored in RAM to begin sending data to the driver |
| driver_ready | driver controller | framebuffer | Indicates that the drivers are ready to receive data (they are configured and we are not in a blanking cycle) |
| column_ready | driver controller | multiplexer | Indicates that the drivers have latched the data for a column |

Modules’ states

The modules will follow those steps:

  1. At reset, the driver controller sends the default configuration to the drivers. driver_ready and column_ready are low, so the framebuffer waits and the multiplexer doesn’t turn anything on. The rgb_logic waits for rgb_enable to start writing to the RAM.
  2. The SPI module receives an rgb_enable command, so it drives the signal high. rgb_logic starts to read the incoming rgb stream and monitors vsync to catch the beginning of an image. Then it starts writing to the RAM. When enough slices have been written, it drives stream_ready high.
  3. The framebuffer and driver controller modules do nothing before the stream_ready signal. When they receive the stream_ready signal, they start a two-state behaviour: send data, wait for next slice. In the send data state, the framebuffer listens to the driver_ready signal to send data to the driver controller, and the driver controller sends data to the drivers with the protocol described in previous posts. When a column has been sent, the driver controller drives column_ready high for one cycle.
  4. When the multiplexer module receives column_ready, it turns on one column only, for less than 10 µs (so we don’t burn the LEDs when overdriving), then turns it off and waits for the next column_ready assertion.
  5. When the whole slice has been displayed (i.e. 8 columns), the framebuffer and driver controller enter the wait for next slice state, where they wait for the position_sync signal.
  6. The position_sync signal triggers the change from the wait for next slice state to the send data state (go back to the send data state of step 3).

This is summed up in the following figure:

Hazards

There is an issue if position_sync changes while we are in the send data state. This can occur when:

  • There are too many slices, so the time required to send the data exceeds the duration of a slice
  • The motor was too fast, so the hall_sensor_trigger happens too soon

The first issue can be fixed by using fewer slices, but this means a lower resolution.
The second one is unlikely to happen, as the motor speed is steady.

In case one or the other happens, we handle it this way:

  • Continue to send data for the current column
  • Then reset the relevant counters/signals and start sending the next slice normally

Thus we just finish sending our column before moving to slice n+1, so in the worst case we lose 7 columns out of 8 in slice n. Slice n+1 is displayed normally but slightly delayed, by 512 cycles in the worst case. So if the wait for next slice state lasts for more than 512 cycles, the delay is gone when we reach slice n+2; otherwise we end up in the same situation as before.
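The two-state behaviour and the hazard handling described above can be sketched with a small Python model. This is only an illustration under simplifying assumptions, not the actual SystemVerilog: one tick stands for the time needed to send one column, `run` and its arguments are hypothetical names, and a position_sync that arrives during a column is handled once that column is finished.

```python
COLUMNS_PER_SLICE = 8


def run(sync_ticks, total_ticks):
    """Simulate the framebuffer/driver controller pair.

    One tick is the time needed to send one column. sync_ticks holds the
    ticks at which position_sync is asserted. Returns, for each slice that
    was started, the number of columns that were actually sent.
    """
    syncs = set(sync_ticks)
    state = "WAIT_FOR_NEXT_SLICE"
    column = 0
    sent = []
    for tick in range(total_ticks):
        if state == "SEND_DATA":
            sent[-1] += 1  # the current column is always finished
            column += 1
        if tick in syncs:
            # New slice: reset the column counter and (re)start sending,
            # whether we were waiting or still busy (the hazard case).
            sent.append(0)
            column = 0
            state = "SEND_DATA"
        elif state == "SEND_DATA" and column == COLUMNS_PER_SLICE:
            state = "WAIT_FOR_NEXT_SLICE"
    return sent


normal = run([0, 10], 20)     # slices far enough apart
worst_case = run([0, 1], 12)  # position_sync right after the first column
```

In the worst case the first slice only gets one column out before the next position_sync, so 7 columns out of 8 are lost, while the following slice is sent in full.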

[Little Brosers] Indeed, dealing with flash is lightning fast

Hello everyone!

Don’t ask me how, but my brain seems to have enough battery life left to keep writing posts here.

Let’s get to it, shall we?

First: The flash

Last week I left you with some very rough code trying to exercise the flash.

One week later, we fully control it. I just needed more of those sweet full nights of sleep (and some ROSE guru’s advice). After reading even more of the datasheet and insulting my computer for no reason, I got it to work as expected. We can now read/write any amount of data in a page, up to the page size (528 bytes). Also, I tested the “super-duper amazing ultra low power mode” of the flash. Just kidding, just note that we are able to put the flash in its deepest power-down mode and wake it up.

Second: We gotta make it fit in 64 KB

Indeed, our nRF52832 only provides 64 KB of RAM. These few kilobytes are supposed to hold our Merkle tree and the slots (cf. older posts if you don’t know what ‘slots’ refers to). And let’s not forget all the crypto code as well as the RAM that BLE needs.

So, with Guillaume, we tried to reduce the size of our structures as much as we could, choosing space saving over run-time optimization. For instance, the width of a node’s children can be deduced from its ‘start’ and ‘end’ fields. Same for its depth in the tree. So yes, we reduced the size of our structures by removing as many ‘not so essential’ fields as possible (which were previously stored in each node).

Also, we dramatically reduced the size and count of the messages that will be stored in a drop. Being optimistic, I would say that we could fit 1000 messages in a drop. We’ll see. The ideal would be to use an nRF52840 (which has way more RAM), but this SoC was not available to buy when we started the project, and still isn’t, I think.

Third: the diff

Ok, the rest of the week was dedicated to the diff. If you forgot what ‘diff’ means in our super fancy vocabulary, it means “comparing two Merkle trees and determining which messages those trees need to send to each other in order to be identical”.

This was done in Python by Guillaume at the beginning of the project. The thing is, he did not have the RAM limitation that I encounter now with the Drop’s CPU. So with Guillaume we worked on translating his diff algorithm to C. I started by implementing in C the stack he was using in Python. On Saturday, I spent the day connecting the diff code to the C++ simulation. And dude, that’s so satisfying. I can launch as many drops as I want now and ask them to launch a diff with another one, all of this being done in separate threads. Since I use UDP to emulate BLE communication, we can even launch one drop per terminal. It should even work for computers on the same local network.

Oh yes, and the best part is, I tried to keep the diff hardware independent. Indeed, at first we thought we would have to code a diff in Java for the Android part in addition to the C one, mostly because of the differences between Drops and Android phones regarding BLE, events and interrupts. But since we now master the JNI and the Android NDK, my fellow mates only need to provide two C function pointers, one for sending, one for receiving. Those functions need to be blocking, and write to/read from a byte array for which I will provide a size, that’s all. There is also a void* in the function signature, in case the language which uses the diff needs to pass something to its send/recv functions. (I do for the C++ simulation.)
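To give an idea of what this transport abstraction looks like, here is a small Python sketch of the same pattern (the real code is in C with function pointers). Everything in it is hypothetical: `exchange_root_hashes` is a made-up first step that only swaps root hashes, the callbacks take a `user_data` argument playing the role of the void*, and two in-memory queues stand in for BLE or UDP.

```python
import queue
import threading


def exchange_root_hashes(my_root, send, recv, user_data):
    """Hypothetical first step of a diff: swap root hashes with the peer.

    send(user_data, payload) and recv(user_data, size) are blocking
    callbacks supplied by the platform; user_data plays the role of void*.
    """
    send(user_data, my_root)
    peer_root = recv(user_data, len(my_root))
    return my_root == peer_root


class LoopbackLink:
    """Test transport: two in-memory queues standing in for BLE or UDP."""

    def __init__(self, outgoing, incoming):
        self.outgoing = outgoing
        self.incoming = incoming


def link_send(link, payload):
    link.outgoing.put(bytes(payload))


def link_recv(link, size):
    data = link.incoming.get()  # blocks until the peer has sent something
    return data[:size]


# Two simulated drops with the same root hash, each running in its own thread.
a_to_b, b_to_a = queue.Queue(), queue.Queue()
link_a = LoopbackLink(a_to_b, b_to_a)
link_b = LoopbackLink(b_to_a, a_to_b)
results = {}


def run_b():
    results["b"] = exchange_root_hashes(b"\x01" * 4, link_send, link_recv, link_b)


thread_b = threading.Thread(target=run_b)
thread_b.start()
results["a"] = exchange_root_hashes(b"\x01" * 4, link_send, link_recv, link_a)
thread_b.join()
```

The point of the design is that the diff code never touches BLE, UDP or JNI directly: swapping the two callbacks is enough to move it from the simulation to a drop or to Android.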

And finally, because unit testing is always around the corner, we started today testing each part of the diff algorithm with Criterion. The diff is not fully functional yet, but it shouldn’t be long.

What’s next…?

Basically everything. I mean, there are only 4 days left. I still need to write code for the merge (not very complicated I think, the diff does all the hard work), the date shift, the flash monitor to know which pages of the flash are dead, and the battery monitor.

That is my program for the following days.

Have fun guys!

Antony

[Little Brosers] New levels of exhaustion

Hello, I guess this might be one of the last posts from me, if not the last. I am very tired and I have a lot of things to do before next week, so I will keep it short. Here is what I’ve done this week:

Android GUI

I dedicated a few days to building a basic GUI for our Android messaging app. Nothing really fancy was done, but I do think what I have done will be decent enough for what we want. I used the base template, which I adapted using Fragments. The UI is functional, so unless I have some extra time before the end, the only work left on it will be connecting it semantically to the rest of the application. It took more time than I anticipated because I had actually never done any Android development before, apart from really, really basic stuff, so I had to learn on the fly.

Merkle tree debugging and modifications

I worked on modifying Antony’s core lib code so that it does not rely on pointers so much. This way, the Merkle tree and the slot pool can be serialized either to a file (on Android native) or to the external flash (on the drop). It took a lot of time, but the tree and the slot pool can successfully be saved to and restored from a file. We still need to do it for the external flash.
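Here is a rough Python sketch of why dropping pointers makes serialization easy. It is not our actual structure: the node layout (start, end, plus the indices of the first child and next sibling) and the 0xFFFF "none" marker are assumptions made up for the example. The idea is simply that indices into a node array stay valid after a save and restore, whereas memory addresses would not.

```python
import struct

# Hypothetical node layout: start, end, then the indices of the first child
# and of the next sibling in the node array (0xFFFF meaning "none").
NODE_FORMAT = "<IIHH"
NONE = 0xFFFF


def save(nodes):
    """Pack a list of (start, end, first_child, next_sibling) tuples."""
    return b"".join(struct.pack(NODE_FORMAT, *node) for node in nodes)


def load(blob):
    """Rebuild the node list from bytes produced by save()."""
    return [tuple(fields) for fields in struct.iter_unpack(NODE_FORMAT, blob)]


tree = [
    (0, 100, 1, NONE),      # root, its first child is node 1
    (0, 50, NONE, 2),       # left child, its sibling is node 2
    (50, 100, NONE, NONE),  # right child
]
blob = save(tree)       # could be written to a file or to external flash
restored = load(blob)   # identical to the original list
```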

 

Signature Keystore

I had terrible flashbacks when the signature started to fail with key pairs generated by the Android KeyStore. I thought about the week and a half I lost in December on the RSA key generation. Fortunately, after removing a priority given to the SpongyCastle provider, everything worked.

Merkle tree diff implementation

I helped Antony write the C implementation of the Merkle tree comparison. We are still hunting some pesky bugs; this will be priority number one for now, since we absolutely need a working diff in C for the project to succeed. I really hope it will be done before tomorrow night, because we still have to manage the proper implementations on both platforms and connect the storage to the rest of the app.

I’m off to sleep now, dreaming about trees.

Guillaume

[Little Brosers] The End is Near

Here I am, writing my last post in the logbook. Time passed quickly and right now we’re working around the clock to finish the project before Friday.

As far as I’m concerned, the certificate verification problem that I was stuck on 2 weeks ago was solved a long time ago. The authentication process on the drops is working like a charm, with the drop disconnecting if there is a problem with the user certificate or the challenge response.

On the Android side, I implemented the filters based on the networks we’re active in. Then the results are analysed to tell if we should connect to a drop or not. In addition to that, I have laid out the entire connection process on drop discovery, with stubs for the yet unwritten processes. I’m currently working on finishing the authentication process on the Android side. Challenging a drop and verifying its response, as well as responding to its challenge are working perfectly well. However, given that the user certificate is not yet ready, I’m putting that part on hold for now, and working on communicating with the server to obtain the user certificate on signup.

During last week, while implementing the authentication on Android, I realized that we couldn’t write more than 20 bytes (the default MTU size on the drop) to a characteristic, which was problematic given that we have much larger characteristics, like the challenge/response and the certificate. I solved this problem by increasing the MTU size on the drop to 214 and implementing long write operations, which allow Android to break the data into multiple packets that are sent sequentially to the drop. Upon reception the drop’s BLE stack asks for a memory block that will hold the temporary data, and then writes the whole data in the characteristic.
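To give a feel for what a long write does, here is a small Python sketch of the splitting and of the reassembly on the receiving side. It is a simplified model, not the Android or Nordic API: the function names are made up, and it assumes each packet loses 5 bytes of the MTU to the ATT header (opcode, handle and offset), so an MTU of 214 leaves 209 bytes of data per packet.

```python
ATT_PREPARE_WRITE_OVERHEAD = 5  # opcode (1) + handle (2) + offset (2)


def split_long_write(value, mtu):
    """Split a characteristic value into (offset, chunk) pairs."""
    chunk_size = mtu - ATT_PREPARE_WRITE_OVERHEAD
    return [(offset, value[offset:offset + chunk_size])
            for offset in range(0, len(value), chunk_size)]


def reassemble(chunks, total_length):
    """What the receiving side does: copy each chunk at its offset."""
    buffer = bytearray(total_length)
    for offset, chunk in chunks:
        buffer[offset:offset + len(chunk)] = chunk
    return bytes(buffer)


certificate = bytes(range(256)) * 2  # 512 bytes of made-up payload
chunks = split_long_write(certificate, mtu=214)
rebuilt = reassemble(chunks, len(certificate))
```

With these numbers, a 512-byte value goes out as three packets, at offsets 0, 209 and 418.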

I hope you enjoyed reading my blog posts as much as I enjoyed writing them, but now, I have to get back to work.

[AmpeROSE] How to discover the AmpeROSE device?

Hi everybody,

As you might expect, here we are going to see how the AmpeROSE could be discovered by the software on the computer. We thought of 2 ways to do that. The first one relies on Zeroconf technologies and the second one is our own solution, using a UDP server on the AmpeROSE.

  1. Zeroconf : Zeroconf (zero-configuration networking) provides 3 technologies: IP address assignment (1), name resolution (2) and service discovery (3). The first two are usually handled by a DHCP server and a DNS server respectively, but Zeroconf does them without any server. In our application, we would not use address assignment, since we assume that a DHCP server will be available, but we would use the two other services. In fact, we only need to implement the second one, because service discovery is done by the computer. To implement name resolution, there are 2 protocols: multicast DNS (mDNS) and Link-Local Multicast Name Resolution (LLMNR), developed by Apple and Microsoft respectively. If we choose this option, we will use mDNS since it is the most widely used.
  2. UDP server : If we take this option, we will implement a simple UDP server on the AmpeROSE that responds to a specific broadcast with a message giving its name.
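The UDP option could look roughly like the Python sketch below. It is only an illustration: the probe message, the device name and the function names are invented, and the demonstration runs over the loopback interface with a unicast probe instead of a real broadcast so that it works on a single machine.

```python
import socket
import threading

DISCOVERY_REQUEST = b"AMPEROSE_DISCOVER"  # hypothetical probe message
DEVICE_NAME = b"amperose-01"              # hypothetical device name


def serve_once(sock):
    """Device side: answer one discovery probe with the device name."""
    data, sender = sock.recvfrom(1024)
    if data == DISCOVERY_REQUEST:
        sock.sendto(DEVICE_NAME, sender)


def discover(address, timeout=2.0):
    """Computer side: send a probe and return (name, ip) of the responder."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as client:
        client.settimeout(timeout)
        # On a real network the probe would go to the broadcast address,
        # which requires SO_BROADCAST to be set on the socket.
        client.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        client.sendto(DISCOVERY_REQUEST, address)
        name, (ip, _port) = client.recvfrom(1024)
        return name.decode(), ip


# Loopback demonstration: device and computer run in the same process.
device = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
device.bind(("127.0.0.1", 0))  # let the OS pick a free port
worker = threading.Thread(target=serve_once, args=(device,))
worker.start()
found = discover(device.getsockname())
worker.join()
device.close()
```

The reply carries the name, and the source address of the reply gives the IP of the device, which is all the computer needs to connect afterwards.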

We have not decided yet which one will be used, but here are some of the pros and cons of these solutions:

| | Pros | Cons |
|---|---|---|
| Zeroconf | Compatible with a widely used protocol. | Only partial implementation available. |
| UDP server | Simple implementation. | Not standard. |

We need to discuss this a little more, but if you have suggestions, feel free to leave a comment :)

[SpiROSE] FPGA, the end is near

Hello everyone, I haven’t posted since Christmas, so here I go! Since the expedition (no pun intended) of the PCBs to the manufacturer, I have mainly kept focused on FPGA development, including source code and tests for various modules, which allowed us to fix many bugs we hadn’t seen so far. I will detail some features below.

SPI slave

The SPI slave implementation is complete. The protocol for sending commands from the SBC to the FPGA is simple: a header command byte, followed by as many bytes of data as needed. Let’s have a quick summary of the different possible commands.

  • Enable/Disable the RGB module: we need to tell the RGB module, the one that writes into RAM the data the FPGA receives from the parallel RGB, to start or stop writing to RAM, since we only want to display relevant data. As the RGB module is the first module in the chain, this command starts everything.
  • Configuration command: change the configuration of the drivers. The command may be issued at any time; the driver controller module can handle the reconfiguration even while it is streaming data to the drivers.
  • Request rotation information: The FPGA should be able to send the rotation position (the slice we are at), as well as the speed of the motor.

But how can we get the position of the rotating part? Let’s look at the Hall effect sensors!

Hall effect sensors

Since we ended up using Hall effect sensors instead of the rotary encoder that was originally planned, I wrote the module that tells the driver controller and the framebuffer when we enter a new slice, so that they begin displaying the current slice.

The issue is that we have only two Hall sensors, placed opposite each other, and 256 slices per turn. So we need to infer the slice we are at given only 2 positions, out of the 256 in a turn, that we are absolutely sure of. Ideally, the synchronization signal is generated without knowing the motor speed in advance, so it must adapt to it in “real time”. An idea, since we have 128 slices per half-turn, is to estimate the slice positions of the current half-turn from the number of cycles that the previous half-turn took. In a word, we constantly correct the duration of a half-turn to fit the speed variations of the motor.

The Hall effect sensors we chose are hysteresis sensors which, in their case, means that the output of the sensor is set low when the magnetic flux goes beyond a certain threshold and is kept low until the flux reaches a second threshold that is lower than the first one. This provides an integrated anti-bounce mechanism. We thus only need to detect the negative edge of the sensor output to have a “top” to synchronize onto. Between the 2 tops of the opposite sensors, we count the number of cycles and we compute (with a right shift) the duration of each slice for the following half-turn, sending the synchronization signal accordingly. To give a few figures, if we have 15 rotations per second, around 15600 clock cycles elapse between 2 slices.
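The right-shift computation can be illustrated with a few lines of Python. This is a sketch of the arithmetic only, not the FPGA module: it assumes 128 slices per half-turn (256 per turn with two opposite sensors), the function names are hypothetical, and the 2 000 000-cycle measurement is a made-up value picked so that the result lands near the figure quoted above.

```python
SLICES_PER_HALF_TURN = 128
SHIFT = 7  # 2**7 == 128, so dividing by 128 is a right shift by 7


def slice_duration(half_turn_cycles):
    """Cycles per slice for the next half-turn, from the last one's length."""
    return half_turn_cycles >> SHIFT


def sync_cycles(half_turn_cycles):
    """Cycle offsets, counted from the last top, of each sync pulse."""
    step = slice_duration(half_turn_cycles)
    return [i * step for i in range(SLICES_PER_HALF_TURN)]


# Hypothetical measurement: 2 000 000 cycles between two tops.
step = slice_duration(2_000_000)
pulses = sync_cycles(2_000_000)
```

Because the slice count per half-turn is a power of two, the division costs nothing in hardware, and the estimate is refreshed at every top, so a speed change is absorbed within one half-turn.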

 

For the upcoming week, while we are waiting for the components to be shipped, and since all FPGA modules are complete, we will build end-to-end tests with all of them, hoping that the tests we have implemented for each module were thorough enough to spot all possible bugs, so that everything is ready for a quick deployment on the actual board.

See you soon !