BeagleLogic: now also analog

A year and a half ago I published a survey on the BeagleLogic wiki asking prospective users and visitors which feature they most wanted to see in BeagleLogic. I compiled the raw responses so far; here they are: BeagleLogic survey

It’s interesting that a majority of prospective users wanted to be able to do analog sampling with BeagleLogic.

And today it becomes a reality for BeagleLogic users, thanks to the efforts of a team at Google Research who wanted to use the BeagleBone for data acquisition and had been working independently on an ADC cape for it for a while. When we first got in touch, they already had a board fabricated and assembled, and they kindly agreed to send me a prototype so that I could help BeagleLogic support the board. This excited me for two reasons: analog sampling support in BeagleLogic would become a reality, and their project could benefit from the kernel infrastructure BeagleLogic has established to capture data using the PRUs on the BeagleBone and move it to userspace. That infrastructure lets the ADC reach its full performance, which would be difficult to achieve with a libprussdrv-based solution.

I am happy to tell you today about the PRUDAQ project (link to announcement on Google Open Source Blogs) – an ADC cape for the BeagleBone for doing high-speed analog data acquisition. At the heart of the board is an Analog Devices AD9201, a 10-bit ADC that can sample two channels up to 20MSPS.
PRUDAQ board
The board has been designed by Jason Holt and his team at Google Research. The team at GroupGets, as the manufacturing partner, has been instrumental in getting the boards manufactured and ready for sale. The board itself is available at their store, and they also offer a bundle with a pre-loaded SD card and other accessories.

GroupGets is a secure platform to create or join group buys for things that are out of reach for a single buyer like a high minimum order quantity for a specialty sensor. Their GetLab engineering team also creates custom hardware and software to enhance group buy targets or make them easier to use.

PRUDAQ with the BeagleLogic stack can be a great way to do high-performance analog data acquisition: the PRUs and the kernel driver can handle both channels of the ADC simultaneously at sample rates up to 19.9 MSPS [1], which corresponds to a data rate of 79.6 MB/s [2]. The dual-PRU architecture of BeagleLogic maintains a clear separation between the sampling operation itself (done by PRU1) and data transfer to memory (done by PRU0). At the application level this means that only the firmware on PRU1 needs to change in order to support PRUDAQ; no modification is necessary on the kernel driver side, and raw analog data can be read directly through /dev/beaglelogic. This also opens the door to other ADC boards or sensors making use of the high-speed capture framework provided by BeagleLogic by writing suitable firmware extensions.

There’s also an update to the BeagleLogic system image that builds in PRUDAQ support, and this will be bundled by default in all new BeagleLogic image releases. It is immediately available for download from the images page on the BeagleLogic wiki. For those interested in the firmware, it’s now present in the main repository here. To learn more about the PRUDAQ project, their wiki is a good place to start.

The performance of PRUDAQ has been documented really well, and the later part of that document also highlights the bottlenecks, which are relevant to the performance of BeagleLogic as well.

I’d also take the opportunity to announce collaboration with GroupGets for the first production batch of BeagleLogic capes and a bundle offer – while the BeagleLogic capes aren’t available today, those who purchase the first batch of the PRUDAQ boards from GroupGets will get 10% off their purchase of a BeagleLogic cape as and when it is made available from GroupGets. The first batch buyers of the board will receive a discount code in their email when it’s ready.

Congratulations to the team at Google Research and GroupGets for a successful launch! I am also very grateful for the support provided to me by the Google Summer of Code program and the BeagleBoard.org Foundation in bringing this project to you over the summer two years ago, and I am looking forward to seeing the new applications that this will make possible.


  1. Due to the nature of the assembly code that samples the PRU pins, it is not possible to sample both channels at 20 MSPS with an external clock: the clock needs to be tracked, and at 20 MSPS samples were found to be dropped. It works fine at 19.9 MSPS with an external clock [tests done by Jason and his team]. It is, however, possible to sample both channels simultaneously at the maximum 20 MSPS if PRU1 is allowed to clock the ADC. This is a special case and the firmware to do so will be made available in the future.
  2. ADC samples are 10-bit but use up 16 bits = 2 bytes. So 19.9 MSPS * 2 channels * 2 bytes/sample = 79.6 MB/s
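
The arithmetic in footnote 2 can be checked with a minimal Python sketch. The function name and its default arguments are made up for illustration and are not part of BeagleLogic or PRUDAQ; 1 MB is taken as 10^6 bytes.

```python
def data_rate_mb_per_s(sample_rate_msps, channels=2, bytes_per_sample=2):
    """Raw capture data rate in MB/s (1 MB = 10**6 bytes).

    Each 10-bit sample is stored in 16 bits, i.e. 2 bytes.
    """
    return sample_rate_msps * channels * bytes_per_sample


# Both ADC channels at 19.9 MSPS, as in footnote 2.
rate = data_rate_mb_per_s(19.9)
print(f"{rate:.1f} MB/s")
```

The same helper gives 80 MB/s for the 20 MSPS case in which PRU1 clocks the ADC.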

A day with Hackaday

The year 2015 saw my project BeagleLogic as a Best Product finalist in the 2015 Hackaday Prize, and my other project Smarter Power Pack as a semifinalist in the main event. I also got to attend the Google Summer of Code Mentor Summit in Sunnyvale, CA (thanks to BeagleBoard.org) in November 2015, where I met many interesting people and had a great time.

Just the next weekend was Hackaday’s first hardware conference – the Hackaday SuperConference – which would go on to be a grand success. However, I knew that I couldn’t stretch my 8000-mile trip to the weekend because my university end-term examinations began the following week, and I had to leave on the Thursday before, the 12th of November. I let Sophi (of Hackaday) know, and they invited me for lunch with them on Thursday afternoon at the Supplyframe office in San Francisco.

On Thursday morning I boarded the Caltrain to San Francisco, and from the station it was roughly a 10-minute walk to the office. This is how my entry was announced on the “Hackaday SuperConference” group chat channel at Hackaday.io:

supercon-chat-screensnip

The office at that time was packed and brimming with activity, as the entire Hackaday crew had gathered and was working together towards the SuperConference. I sat at a table beside [Aleksandar Bradic] (CTO, Supplyframe) and [Chris Gammell] (of AmpHour podcasts and Contextual Electronics). Behind me were [Rich] and [Sophi] working on a video shoot for the Hackaday Omnibus 2015 release (yes, I saw a part of the shoot that day). Here is a link to the released video on Vimeo.

supplyframe-office-1

Thanks to [Jasmine], I did not miss the SuperCon badge and T-Shirt. As a proofreader for the Hackaday Omnibus 2015, I received my copy as well.

badge

Now there’s an interesting backstory of the badge here that I think I can share with you: As I put on the lanyard onto the badge, they realized that the drill hole for putting the lanyard on the badges was a tight fit and it would be cumbersome to remove and re-attach the badge without scratching the solder mask on top. The solution? A bunch of keyring rings which would then go on between the badge and the lanyard – like here.

I got to play with the new NVIDIA Jetson TX1 for a while, a test unit of which had just arrived at Hackaday ([Brian] would later publish a hands-on with it, which can be read here). I saw [Brian] and [Mike] typing on what would be the posts that would later go on to the Hackaday blog, and also the HaD tip line, in the short time I was there.

Once [Sophi] and [Rich] were done with a round of shooting for the Omnibus video, we proceeded for lunch to a nearby restaurant. The food was great. The lunchtime conversations, even better.

lunch-group-pic

After lunch we walked back to the Supplyframe office and I bid goodbye to the entire team, and boarded my flight back to India later that night. I had got to see the team behind Hackaday up really close, and my respect for them has only grown since. They are doing a great job with their blog, the Hackaday.io community and the SuperConference which I really missed being a part of this year. A big shout out to them, and also to [Sophi] for making an exception while you all were so busy preparing for the SuperCon so that I could get to be there, and interact with you all. Thanks for having me there! Looking forward to meeting you all again.

Wishing all readers a very Happy New Year 2016 from The Embedded Kitchen.

~Abhishek

An Adventure in self-fabricating PCBs

An Embedded Kitchen wouldn’t have been complete without the ability to fabricate circuit boards for rapidly prototyping circuits without waiting on the board house [fail fast, fail often applied to hardware design]. This week, it gained the capability to fabricate boards using the well-known toner transfer method.

I first did toner transfer when I was at school, and the results were terrible. So terrible that I turned to wire-wrap prototyping for my circuits, even when I started using SMD components later. However, the realization that dead-bugging SMD parts is too much effort and doesn’t scale brought me back to PCBs. This time, because I couldn’t wait two weeks to get my boards manufactured, I decided to take on toner transfer again. I knew the steps: clean the board, print the design, iron it on. When I rinsed the board under cold water after the ironing, I was pleasantly surprised to see near-perfect pattern transfer from paper to board. Then I went on to etch the boards with ferric chloride.

The first boards to be so fabricated are an early stage prototype of my project Smarter Power Pack which uses spare laptop batteries to realize a better power bank (interested? here’s a link for further reading), and a respin of the LCD connector board for my BeagleBone LCD cape. Here are pics of the boards so fabricated:

IMG_20150904_135429_HDR

IMG_20150904_135451_HDR

IMG_20150904_134059_HDR

After this was done, I threw in an additional step of tinning the whole board to protect it from tarnishing and to patch up any imperfections in the traces. Placing components and reflowing the boards then proceed as usual. Here are the results.

IMG_20150905_233900_HDR

IMG_20150905_204424_HDR

They’re not as pretty as the ones you get from a board house, but they’re beautiful in their own way. They’re cheaper in terms of money but more expensive in terms of extra effort (drilling and hole plating). And I don’t end up with 5 or 10 copies of tea coasters if I used a wrong footprint or inverted it. But in the end, it’s all worth it when you can get from an idea to a prototype in days and not weeks.

BeagleLogic: Building a logic analyzer with the PRUs: Part 1

At demonstrations, whenever I am asked how BeagleLogic works, it takes time to explain how a low-cost SBC can actually sample digital signals at 100 MHz, what makes the BeagleBone Black so special, and why this can’t be done with something like a Pi without adding extra hardware.

This blog post is an attempt to start from (almost) scratch and explain the nuts and bolts of the BeagleLogic assembly and document the design decisions made last year as a reference for future application scenarios. This should be the first in a series of posts.

The PRUs, or the “Programmable Real-Time Units”, on the AM3358 SoC on the BeagleBone Black are two 200 MHz microcontrollers that run alongside the 1 GHz ARM CPU. They can be started, stopped, reset and programmed via the CPU. They also share the same bus, so the PRUs can access the core peripherals like GPIO, memory, ADC, DMA, … independently of the CPU, and they have a GPI/GPO subsystem of their own (“Enhanced GPIO subsystem”).

These PRUs can be programmed in C using the TI PRU C Compiler (recommended), in assembly using PASM (now considered obsolete but still works), or with a GCC compiler port (in progress). And of course, one can write hand-tuned assembly code for the PRU as well.

At the heart of a logic analyzer…

… is a register that samples the input signal(s) at regular intervals (decided by the sample rate in case of a free running sampler) or at the edge of an external clock signal. These samples are then recorded into the sample buffer which can be then used to extract and analyze the captured digital signal.

Note that this is the simplest case. In real-life scenarios there are often trigger conditions like “Hey, start recording when there is, say, a falling edge on pin B when pin A is high”, or “after 5 rising edges on pin C when pin D is high and pin E is low”, and so on.

The core of BeagleLogic is the simplest possible implementation. It simply samples and records the inputs into a buffer at a sample rate that can be configured in integer divisions of 100 MHz i.e. 100 MHz (100 / 1), 50 MHz (100 / 2), 33.33 MHz (100 / 3) and so on.
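
To make the "integer divisions of 100 MHz" rule concrete, here is a small Python sketch that lists the achievable sample rates. The names `BASE_RATE_HZ` and `achievable_rates_hz` are made up for illustration and do not come from the BeagleLogic sources.

```python
BASE_RATE_HZ = 100_000_000  # fastest sample rate described in the text


def sample_rate_hz(divisor):
    """Sample rate for a given integer divisor of the 100 MHz base rate."""
    if divisor < 1:
        raise ValueError("divisor must be a positive integer")
    return BASE_RATE_HZ / divisor


def achievable_rates_hz(max_divisor):
    """Rates for divisors 1..max_divisor, fastest first."""
    return [sample_rate_hz(d) for d in range(1, max_divisor + 1)]


for divisor, rate in enumerate(achievable_rates_hz(5), start=1):
    print(f"100 / {divisor} -> {rate / 1e6:.2f} MHz")
```

Note how the steps are coarse near the top (100, 50, 33.33 MHz) and become finer as the divisor grows.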

Now let us look at the building blocks available in the PRUs of the BeagleBone Black that enable us to achieve this.

MOV, SBCO and LBCO Instructions

A quick primer on the 3 most important instructions for data transfer in PRU assembly. For a detailed overview refer to the PRU instruction set manual.

MOV is used for moving data between registers

MOV R1, R2         // R1 gets entire contents of R2
MOV R1.b0, R2.b0   // least significant byte of R2 copied to LSbyte of R1
MOV R2.w2, R3.w0   // least significant halfword of R3 copied to most significant halfword of R2 [halfword = 16 bits; .w1 would be bits 8-23]

SBCO is used for moving data from a register to a physical address. This physical address can be within the PRU (e.g. PRU data RAM, shared RAM, power and control registers) or in outside peripherals like the GPIO subsystem, the ADC subsystem, the EMIF subsystem (the DDR SDRAM controller) or the GPMC. Strictly speaking, SBCO takes its base address from a constant-table entry (C0 to C31), while its sibling SBBO takes it from a register; the two are otherwise used the same way. General usage:

SBCO &src, destination, offset, n

Moves n bytes, starting from the src register, to the address destination + offset; this is indirect addressing. If n > 4, then the subsequent registers are accessed as well. Offset can be a register or an immediate value.

When using SBCO to write to the DDR memory, note that we must provide only physical memory addresses as there is no MMU involved when PRU accesses it. This is what most examples of the PRU that demonstrate shared memory access do.

Using R0 = 0x40000000, R1 = 0x100, try and guess what SBBO &R2, R0, R1, 32 does (SBBO being the form that takes its base address from a register).

LBCO loads data from an external address into a destination register. If n > 4, data gets loaded into the subsequent registers as well.

LBCO &dest, src, offset, n
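
For readers who want to check their answer to the puzzle above, here is a rough Python model of a burst store and load. It is only a sketch: `store_burst` and `load_burst` are hypothetical helper names, memory is modelled as a dictionary of bytes, and the little-endian byte order is assumed from the PRU being a little-endian core.

```python
def store_burst(regs, first_reg, base, offset, nbytes, memory):
    """Copy nbytes from regs[first_reg], regs[first_reg + 1], ... to memory.

    Bytes are taken least significant first; after every 4 bytes the
    next register is used.
    """
    for i in range(nbytes):
        reg_value = regs[first_reg + i // 4]
        memory[base + offset + i] = (reg_value >> (8 * (i % 4))) & 0xFF


def load_burst(regs, first_reg, base, offset, nbytes, memory):
    """Inverse of store_burst: fill registers from consecutive memory bytes."""
    for i in range(nbytes):
        index = first_reg + i // 4
        shift = 8 * (i % 4)
        byte = memory.get(base + offset + i, 0)
        regs[index] = (regs[index] & ~(0xFF << shift)) | (byte << shift)


# The puzzle: R0 = 0x40000000, R1 = 0x100, store 32 bytes starting at R2.
regs = [0] * 32
regs[0], regs[1] = 0x40000000, 0x100
for n in range(2, 10):
    regs[n] = 0x11111111 * (n - 1)  # recognisable test patterns

memory = {}
store_burst(regs, 2, regs[0], regs[1], 32, memory)
print(hex(min(memory)), hex(max(memory)))
```

In this model the contents of R2 through R9 (8 registers of 4 bytes each) land in the 32 bytes starting at address 0x40000100.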

Clock cycle counting

I’ve used the Cortex M3/4 microcontrollers, and these cores have a nice register called DWT_CYCCNT which counts the clock cycles for which the processor has executed code. So one can take the difference of the DWT_CYCCNT register before and after a code block and get the number of cycles this code takes to execute. This allows cycle-accurate code profiling.

Lucky for us, the PRU has this neat feature as well, and I used it to determine the number of processor cycles each instruction takes to execute. This is known as the CYCLE register [1]. But before we can use it, we’ll have to enable it by setting bit 3 in the CTRL register using this code snippet:

MOV    R1, CTPPR_0
MOV    R2, 0x00000220      // C28 = 00_0220_00h = PRU0 control registers
SBBO   &R2, R1, 0, 4

LBCO   &R1, C28, 0, 4      // Enable CYCLE counter
SET    R1, 3
SBCO   &R1, C28, 0, 4

Notice that we first modify the CTPPR_0 register, which allows us to use the C28 constant-table entry to refer to the PRU control registers instead of using up another register to hold their base address.

So, whenever we need to time a section of the code, we can do something like:

LBCO &R1, C28, 0xC, 4     // Load "before" cycle count into R1
// your assembly code here
LBCO &R2, C28, 0xC, 4     // Load "after" cycle count into R2

Now we can examine the contents of R1 and R2 to determine how many cycles it takes. You would also have to account for the “extra” clock cycles of the LBCO instruction.

I ran a lot of tests using this initially, here’s what I found:

  1. Every MOV operation is one cycle. In fact, any operation that does not access external memory or peripherals completes in a single cycle, i.e. 5 ns. This is by design.
  2. LBCO and SBCO instructions with byte count 4 take 2 cycles. My hypothesis is that 1 cycle is spent generating the address by adding the offset to the base, and then the actual data transfer takes 1 cycle per 32 bits transferred, thus O(n) time. Therefore the 32-byte store example in the previous section should take 9 cycles to complete (1 + 32/4), assuming there is no bus stall while writing the data to the memory.
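
The hypothesis in point 2 can be written down as a tiny cost model. This is a sketch of that hypothesis only, not a statement of how the hardware is documented to behave; the function names are invented here, and rounding partial words up to a full cycle is an extra assumption that the measurements above do not cover.

```python
PRU_CLOCK_HZ = 200_000_000
NS_PER_CYCLE = 1e9 / PRU_CLOCK_HZ  # 5 ns per cycle at 200 MHz


def burst_cycles(nbytes):
    """Hypothesised cost: 1 cycle for the address + 1 cycle per 32-bit word.

    Assumes no bus stalls. Partial words are rounded up (an assumption).
    """
    words = -(-nbytes // 4)  # ceiling division
    return 1 + words


def cycles_to_ns(cycles):
    """Convert PRU cycles to nanoseconds."""
    return cycles * NS_PER_CYCLE


for nbytes in (4, 8, 32):
    cycles = burst_cycles(nbytes)
    print(f"{nbytes:2d} bytes: {cycles} cycles = {cycles_to_ns(cycles):.0f} ns")
```

Under this model a 4-byte transfer costs 2 cycles and a 32-byte transfer costs 9 cycles, matching the figures quoted above.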

We will use this information to help us with the timings needed for sampling.

Enhanced GPI/GPO feature

The PRU has an enhanced GPIO that operates at 200 MHz [2] and implements a “Direct Input” GPI mode. This means that whatever value is present at the PRU input pins at the sampling instant is captured whenever register R31 of the PRU is read. (Edited for more clarity, referring to issue #9 on GitHub.) The first line of the snippet below shows how one can sample the lowest 8 bits of R31, which is connected to the PRU input pins. The entire snippet shows how to take 5 samples of the input pins and store them into successive register bytes.

MOV R10.b0, R31.b0
MOV R10.b1, R31.b0
MOV R10.b2, R31.b0
MOV R10.b3, R31.b0
MOV R11.b0, R31.b0

Observe:

  1. We can pack up to 4 samples into a single 32-bit register using the PRU assembly instruction (Rn.bx refers to the x’th byte in the register – each register is 4 bytes).
  2. You might ask, why sample to the registers, and why not store this data in the 8 KB SRAM, or the 12 KB shared RAM, or even the DDR RAM?
    • From the previous sections, we see that every SBCO would take 2 cycles but register access is just 1 cycle, so we can achieve higher sampling rates.
    • While accessing the DDR RAM there is a very low but finite probability of bus conflicts while data is being written, and having such an instruction within the real-time sampling loop has the potential to compromise the sampling operation. So we would like to separate both of them.
  3. Right now, we’ve only stored data in the registers, and there isn’t enough space to store all samples in the registers. So we need to somehow get the data out of there.
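
The packing described in point 1 can be mimicked on a PC with a few lines of Python. This is only an illustration of the byte layout (sample 0 in `.b0`, sample 3 in `.b3`); `pack_samples` and `unpack_samples` are made-up names.

```python
def pack_samples(samples):
    """Pack four 8-bit samples into one 32-bit word, first sample in byte 0."""
    if len(samples) != 4:
        raise ValueError("exactly 4 samples fit into a 32-bit register")
    word = 0
    for byte_index, sample in enumerate(samples):
        word |= (sample & 0xFF) << (8 * byte_index)
    return word


def unpack_samples(word):
    """Recover the four 8-bit samples from a packed 32-bit word."""
    return [(word >> (8 * byte_index)) & 0xFF for byte_index in range(4)]


packed = pack_samples([0x11, 0x22, 0x33, 0x44])
print(hex(packed))
```

Reading the packed word back byte by byte returns the samples in the order they were taken, which is what any software reading the capture buffer relies on.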

The PRU0/1 Scratchpad and the XIN/XOUT instructions

I remember initially reading this section in the PRU reference manual with skepticism, and I was disappointed to find scarce resources and example applications in the early phase of my GSoC period. But this turned out to be one of the important pieces of the puzzle.

Apart from the 30 general-purpose registers in each of the two PRUs (R30 and R31 are connected to the GPO / GPI respectively), there are also 3 independent banks of 30 registers each available as a scratchpad, and this is connected to the registers of both PRUs using a “broadside interface”. Broadside means that all 30 registers are connected, and all of them can be moved in parallel. This means that in a single clock cycle one can copy or swap all 30 registers of a PRU with one of these 3 banks.

Here’s an example [syntax:: XOUT <bank_id>, &Rn, count]:

// On PRU1
XOUT 10, &R16, 32  // Copies R16-R23 into Bank0 (Bank0 = 10)

// On PRU0 - after XOUT has executed on PRU1
XIN 10, &R16, 32   // Reads R16-R23 from Bank0 into PRU0

This ability of moving bytes across the PRU barrier is crucial for BeagleLogic as it means that:

  1. We can cleanly separate pin sampling (handled by PRU1) and data transfer (handled by PRU0).
  2. Since pin sampling operates only on registers, it is effectively shielded from bus latencies.
  3. Because register manipulation is cycle accurate we can design delay loops in PRU1 to give us a programmable sample rate, independent of PRU0 operation.
  4. PRU0 can now push data directly into the 512 MB(!) of DDR memory, giving BeagleLogic a buffer capacity that is very large at this price point. Note that because the samples are packed into 32-bit words, the number of write transactions is cut to one-fourth (8-bit samples) or one-half (16-bit samples) of the sample rate, giving us cycles to spare. [The actual PRU firmware of BeagleLogic writes 32 bytes at a time into the DDR memory.]
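
To put numbers on point 4, the following Python sketch computes how many write transactions per second PRU0 has to issue, and how many 200 MHz PRU cycles are available between them. The function names are invented for illustration; the 32-byte case corresponds to the burst size mentioned above.

```python
PRU_CLOCK_HZ = 200_000_000


def writes_per_second(sample_rate_hz, bits_per_sample, bytes_per_write):
    """Number of memory write transactions needed per second."""
    bytes_per_second = sample_rate_hz * bits_per_sample // 8
    return bytes_per_second / bytes_per_write


def cycles_between_writes(sample_rate_hz, bits_per_sample, bytes_per_write):
    """PRU cycles available to PRU0 between two consecutive writes."""
    rate = writes_per_second(sample_rate_hz, bits_per_sample, bytes_per_write)
    return PRU_CLOCK_HZ / rate


cases = [
    ("8-bit samples, 4-byte writes", 8, 4),
    ("16-bit samples, 4-byte writes", 16, 4),
    ("8-bit samples, 32-byte writes", 8, 32),
]
for label, bits, size in cases:
    rate = writes_per_second(100_000_000, bits, size)
    gap = cycles_between_writes(100_000_000, bits, size)
    print(f"{label}: {rate / 1e6:.3f} M writes/s, {gap:.0f} cycles apart")
```

At 100 MHz with 8-bit samples this gives 25 million 4-byte writes per second (one every 8 cycles), and 3.125 million writes per second (one every 64 cycles) with 32-byte bursts.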

Inter-PRU signaling

The final piece in the puzzle for a basic implementation is a way for PRU1 to signal PRU0 that it has pushed data into Bank0 using XOUT, so that PRU0 can take the data in using XIN and write it to the DDR memory. The answer: interrupts. By configuring the mapping appropriately in the PRU interrupt controller (PINTC), one PRU can signal the other by writing a value into the R31 register, which triggers an interrupt (note that we generally read from R31, not write). Similarly, by waiting for bit 30 of R31 to be set, the other PRU can know when an interrupt has occurred. The interrupt configuration and mapping is in general handled by the library (libprussdrv) or the pru_remoteproc kernel driver; within the PRU we just assume that everything is already configured for us and is working.

Putting it all together

Let us try writing a sketch of a very simple firmware that implements a basic form of an 8-bit logic analyzer. This should be helpful when developing for a similar application scenario and for understanding the under-the-hood working of BeagleLogic.

PRU1 code:

    MOV R16.b0, R31.b0     // take an 8-bit sample
    NOP                    // pad so that samples are 2 cycles apart
    MOV R16.b1, R31.b0
    NOP
    MOV R16.b2, R31.b0
    NOP
L1: MOV R16.b3, R31.b0     // 4th sample completes the 32-bit word
    XOUT 10, &R16, 4       // Move 4 samples to Bank0
    MOV R16.b0, R31.b0
    MOV R31, PRU0_INTR     // Signal PRU0
    MOV R16.b1, R31.b0
    NOP
    MOV R16.b2, R31.b0
    JMP L1

Observe how writing an infinite loop this way allows us to sample 4 bytes, move them into Bank0 and signal PRU0 that data is ready. Also, the time gap between two samples is 2 clock cycles (10 ns at 200 MHz), so this infinite loop samples the pins at 100 MHz. See? How could one sample at, say, 50 MHz?
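
The 2-cycle spacing can be checked with a short Python simulation of the PRU1 loop. It is a sketch under one assumption taken from the timing discussion earlier in this post: every instruction in the loop, including XOUT and JMP, takes exactly one cycle. The lists simply mirror the instruction order of the listing, with "SAMPLE" standing for a MOV that reads R31.

```python
PRU_CLOCK_HZ = 200_000_000

# One entry per instruction; "SAMPLE" marks a MOV that reads R31.
PROLOGUE = ["SAMPLE", "NOP", "SAMPLE", "NOP", "SAMPLE", "NOP"]
LOOP_BODY = ["SAMPLE", "XOUT", "SAMPLE", "SIGNAL",
             "SAMPLE", "NOP", "SAMPLE", "JMP"]


def sample_cycles(iterations):
    """Cycle numbers at which samples are taken (1 cycle per instruction)."""
    program = PROLOGUE + LOOP_BODY * iterations
    return [cycle for cycle, op in enumerate(program) if op == "SAMPLE"]


def sample_gaps(iterations):
    """Set of distinct cycle gaps between consecutive samples."""
    cycles = sample_cycles(iterations)
    return {later - earlier for earlier, later in zip(cycles, cycles[1:])}


gaps = sample_gaps(iterations=100)
rate_hz = PRU_CLOCK_HZ / min(gaps)
print(gaps, rate_hz / 1e6, "MHz")
```

Every gap comes out as 2 cycles, including across the jump back to L1, so the effective rate is 200 MHz / 2 = 100 MHz. By the same reasoning, a 50 MHz capture would presumably need a 4-cycle gap between samples, for example by padding with extra single-cycle instructions.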

Next, let’s write the segment of code in PRU0 to receive this and write it into memory

// Assume DDR buffer start address is in R0, offset is in R1
loop:
WBS R31, 30               // Wait till interrupt (bit 30 of R31)
// an SBCO to clear interrupt flag (omitted here for clarity)
XIN 10, &R16, 4           // Receive data
SBBO &R16, R0, R1, 4      // Store it into the DDR mem
ADD R1, R1, 4             // Increment dest offset
JMP loop

Now, observing how BeagleLogic does it (the PRU1 sampling code and the PRU0 memory writing code) should give an idea of the basic processes that make BeagleLogic possible. Of course, there is some overhead, but it’s the same basic principle.

A similar post explaining the BeagleLogic kernel module is on its way soon.


  1. section 4.5.1.4 of the AM335x reference manual 
  2. section 4.4.1.2.3 of the AM335x reference manual 

Coming Soon: (Yet another) BeagleBone Display+CapTouch cape

LCD and CTP

TL;DR: This creates a cape for the BeagleBone Black using readily available replacement spare TFT Panels and a capacitive touchscreen originally found in cheap tablets.

[ You can also follow the project at Hackaday.io ]

Thanks to economies of scale, the market for lower-end tablets is flooded with N brands available at almost throwaway prices. Here one can buy a cheap one for less than ₹3000 ($45) and get a decent 7″ screen with capacitive touch. But when it came to the BeagleBone Black, I saw that most of the touch screens available were resistive, and panels with capacitive screens were out of my budget, especially when you know you could build one yourself by leveraging the same economies of scale 🙂 . I decided to take on the challenge and build a cape for the BeagleBone out of these readily available parts.

This December during my winter break at Jakarta, I shopped at Glodok and Roxy – the electronics and mobile spare part shopping hubs respectively [compare that to HQB, Shenzhen but at a smaller scale though]. What really caught my attention was 7″ TFT panels and capacitive touch digitizers in shops – the LCDs looked very close to the cheap 7″ LCD panels being sold on eBay and other places with a Realtek-based HDMI converter which I wanted to check out lately (like this).

These panels are known by the name AT070TN9x (x=0,2,3,4), are originally manufactured by Innolux, and have 50-pin connectors with a TTL interface (datasheet here). But were the panels being sold in the markets the same AT070TN92 panels, their clones or something different? I decided to find out and asked at the shops to be able to take photos of a few models. I got home and tried to match the pinout on the panels with the AT070TN92. Bingo. Perfect match for almost every one of them. Even though the panels have slight dimensional differences in the bezel, they have the same pin layout and should (hopefully) be the same on the inside.

Tablet LCD - back

Here you can see the one which has KR070PM7T written on it. The giveaways: pins 1 & 2 (VLED+) are shorted, and pins 3 & 4 (VLED-) are shorted. Pins 45, 49 and 50 are not connected. If you refer to the datasheets, the pin definitions seem to align.

I bought two panels and mix-matched them with available capacitive touch panels to see which one fitted the screen the best. I also bought from a nearby shop the ZIF FPC connectors for the display and the touchscreen. The display one is 50 pins, 0.5mm pitch and the touch panel has 6 pins at a 0.5mm pitch. Not exactly breadboard friendly but very PCB friendly. As seen on the image at the top of the post (not this one) the LCD is sitting over the capacitive touch panel and you can see the 6-pin connections there. No reverse engineering needed for the capacitive touch FPC as the connections for the 6 pins are already highlighted in the image!

Okay, now I had to find a good driver for the LCD. I was aware of the TI TFP401 DVI (HDMI) receiver and could get samples if I wanted. But hey, the BeagleBone converts from TTL to HDMI and then I’m gonna convert HDMI to TTL, right? Why not just cut through the layers and wire the display directly? Should just be D0-D15, VSYNC, HSYNC, PCLK, 3V3, backlight, adjust LCD driver resolution, timings and done. Turns out we’re not done, yet.

The catch

Every LCD requires a high voltage to control the twist of individual liquid crystals. This voltage is usually internally generated using charge pumps but turns out that this “dirty” LCD panel ( DirtyPCBs 😛 ) expects the voltages to be supplied externally to it. The LCD expects approximately 10.4V for AVDD, 16V for VGH and -7V for VGL to be supplied to it. Hmm, how do I generate these?

The answer was not very hard to find. I was able to get Allwinner’s A13 based reference design for tablets that use a display (no points for guessing which one) with a 50 pin interface. Looking at their gate voltage generation circuits, we get this:

lcd-driving-circuit

This app note from Maxim Integrated explains what we’re looking at [scroll down to the end of the appnote]. The AVDD rail draws the maximum current, so it gets powered by the boost converter. Then the diodes and the capacitors form a charge pump generating approximately +21 V and -10.4 V from that rail, and the Zeners regulate these down to the needed voltages. Very cool.
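
As a back-of-the-envelope check of those charge pump voltages, here is a small Python sketch. It assumes a textbook topology that is not confirmed by the schematic itself: a voltage doubler stacked on AVDD for the positive rail and a simple inverter for the negative rail, with an optional per-diode forward drop. The function name and parameters are made up for illustration.

```python
def ideal_charge_pump_rails(avdd, diode_drop=0.0):
    """Unregulated rails of an assumed doubler + inverter charge pump.

    The switching node is assumed to swing between 0 V and avdd.
    Each stage loses two diode drops.
    """
    vgh_raw = 2 * avdd - 2 * diode_drop    # positive doubler stacked on AVDD
    vgl_raw = -(avdd - 2 * diode_drop)     # inverting stage
    return vgh_raw, vgl_raw


vgh_raw, vgl_raw = ideal_charge_pump_rails(10.4)
print(f"raw VGH = {vgh_raw:.1f} V, raw VGL = {vgl_raw:.1f} V")
```

With AVDD = 10.4 V and ideal diodes this gives about +20.8 V and -10.4 V, in line with the approximate figures above, and leaves headroom for the Zeners to bring the rails down to the 16 V and -7 V the panel expects. Real diode drops would lower both magnitudes somewhat.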

I’ve ordered some boost converters from AliExpress, the ones called SY7201 and XR1151 which are as of now stuck in the Chinese New Year holiday shutdown. Until then I would test with a TPS61061 which I have at hand.

The design

The schematic of a beta cape is almost done, and I’m proceeding with routing of the PCB at the time of writing. Here’s a peek at how the schematic looks right now; the final one will be different from this:

bb-disp-sch1

The beta version is to be a locally fabricated quick turnaround prototype so that I get something to work up the software side until the PCB for the first batch is manufactured. The production cape may include termination resistors or a 74LVC322245 buffer.

The capacitive touch side is simple. Two I2C pins and an interrupt pin to inform of touches. Turns out that the LCD uses only 47 out of the 50 signals and I can squeeze these three lines into the same FPC as the display using an extension cable and adapter PCB. So I’ve done it this way as can be seen above. The Linux kernel already contains a driver for the ft5x06 in drivers/input/touchscreen/edt_ft5x06.c . So getting the touch for the LCD should just be equivalent to writing some device tree code to invoke the module.

That was a long post. The next posts would feature testing of the capacitive touch panel and of the prototype.

The BeagleLogic cape: Assembly and Testing

A day before Christmas, I was greeted by a yellow DHL packet from Hong Kong. The panels had finally arrived after 8 days – I had chosen rush (48hrs turnaround) and express shipping, and it did work out well as I received them before Christmas. The boards turned out to be nice. Not dirty at all and at a price which was reasonable to me.

Assembly

Ready for assembly
Before assembly – the board with applied solder paste and a few parts

I reflowed the board with a hot air gun. Applied solder paste with a toothpick and placed the parts with the naked eye using a pair of good quality tweezers. The passives are all 0603 size (1.5mm x 0.76mm). Since I was yet to receive the batch of BSS138 FETs I ordered, I used a BC847 NPN transistor for the build instead. Reflowing left a couple of solder bridges on the 74LVCH16T245 (now onwards referred to as ‘the 245’) buffer chip (smallest pitched part on board – 0.5mm) which I removed using a solder wick. Then soldered the pin headers manually.

Cleaned up after soldering using nail polish remover. It’s terribly inefficient, and I plan to get a bottle of concentrated isopropyl alcohol (IPA) at some point of time. Here’s the finished result:

“Smoke Test”

A colloquial reference to the first power-on of the assembled circuit. I plugged the cape into the BeagleBone Black and powered on the assembly. The LEDs lit up and the board booted. No smoke or burning smell. Woohoo, test passed, or …

Errata

I then probed the BeagleBone P8 header pins which serve as inputs to BeagleLogic using an LED, and did the same with the input pin headers. The LED was brighter on the inputs to the ‘245 than on the P8 pins, which was contrary to expectations. A quick check of the schematics showed that I had wired the 245 to translate from rail B to rail A while connecting the inputs to rail A and the Bone pins to rail B. Whoops! However, I quickly resolved the problem by lifting pins 1 and 24 (DIR1 and DIR2) off the PCB (which was simple as those were on the ends) and soldering a bit of magnet wire from them to one of the pads of C1 going to +3.3V. Neatly done, and a lesson not to design and send off PCBs to a fab while pulling an all-nighter 😛 .

Testing

Once this was done all was back on track and the circuit worked as expected. The cape as designed did not interfere with the boot-up process due to the action of the transistor pull-down that ensured that the buffer did not drive the P8 pins until SYS_RESETn signal was high.

I tested it with SPI signals up to 24 MHz using the BeagleBoard itself, and the results are good at a 100 Msps sample rate. I could also subject the input pins to 5V freely, without fear that I would burn the board. That’s exactly what the cape is meant for, and I’m happy with the results.

I am planning to give away the surplus boards from the first batch, so anyone interested in testing it out can get in touch. Depending upon where you are, I may be able to send you one of the surplus unpopulated boards (no parts included), which you can assemble yourself (1 IC, 4 passives, 1 transistor and pin headers). The Cape EEPROM section can be left unpopulated without affecting the functionality.

The design files are now available here. I made some changes to the design after fixing the errata, so the cape version is bumped to 1.1 .

Suggestions and feedback on the cape are welcome. The design is Open Hardware, so you have the freedom to use it and improve it as you like. Let me know if there’s anything that could be added to the cape, as there is plenty of board real estate.

Introducing: The BeagleLogic Cape

BeagleLogic Cape - 3D Render

After coding up the BeagleLogic project, I thought that it would be great to have an add-on cape for the project that provides buffering and also makes the inputs of the BeagleBone Black tolerant to TTL logic voltage levels (up to 5.5V) allowing BeagleLogic to debug external projects with ease. Hence introducing the BeagleLogic cape, the 3D render of which you can see above. The design is done in KiCad.

The design source and gerbers will be made available on the BeagleLogic GitHub repository after I physically assemble and verify the design.

Design & Layout

The cape design is simple enough to need just a single-layer layout: as you can see in the render above, the top layer is entirely a ground plane but for a single trace. Since the top isn’t much populated, I added useful information on the top silkscreen, including indexing the pin headers on the Bone on both sides.

The logic channels are accessed via 2×14 right-angled pin headers. The upper row of pins carries the actual logic channels while the bottom row is all GND pins. The pin headers are arranged in an MSB-to-LSB fashion. This means that the rightmost pin when viewed from the top is raw bit 0 of the captured logic samples. Note that sigrok will use the names of the actual Bone pins, so bit 0 (Channel 1) is identified as P8_45, bit 1 (Ch2) is P8_46 and so on. The numbering is a little non-obvious, but that’s the way the pins are arranged on the BeagleBone GPIO header. Don’t worry though, as the cape lists the pin ID of each logic channel so you don’t have to look it up in the pin diagrams.
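To make the "raw bit" numbering concrete, here is a minimal sketch of pulling the individual channel levels out of one captured sample word. The helper name is hypothetical, and only the two pin names given above are listed; the remaining names should be read off the cape silkscreen rather than guessed:

```python
def channel_levels(sample: int, num_channels: int = 14) -> list:
    """Return the level (0 or 1) of each channel; index 0 is raw bit 0."""
    return [(sample >> bit) & 1 for bit in range(num_channels)]


# Only these two bit-to-pin names are stated in the text.
KNOWN_PIN_NAMES = {0: "P8_45", 1: "P8_46"}

# Example sample word: bit 1 set, everything else low.
levels = channel_levels(0b10)
for bit, name in KNOWN_PIN_NAMES.items():
    print(f"bit {bit} (Ch{bit + 1}, {name}) = {levels[bit]}")
```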

One important point here: only the first 12 channels can be used by default. To use the last two channels, you must first disable the eMMC and then either solder 0R resistors at R8 and R9 on the bottom side or bridge their pads. Otherwise the buffer will drive those two pins and you will damage the eMMC of the board and also void the warranty.

Here’s a shot of the schematic. This is for reference only with respect to the current board, and the released schematic may or may not be the same.

Cape Schematic

The active buffer is a TI 74LVCH16T245 or equivalent. The buffer is powered from the VDD_3V3B power rail. The OE pin, initially pulled to its inactive level, is driven using an arrangement of a BSS138 N-MOSFET whose gate is connected to SYS_RESETn of the Bone. This should ensure that the logic input pins, which are also the system boot pins, are not driven by the buffer until the startup has completed.

This version of the design has a 0R resistor through which the VDD_3V3B powers the VDDA side rail of the 16245. If you remove the short and connect it to a 1.8V supply it should become compatible with 1.8V logic levels. I am however thinking of a better solution to the problem and should address this in the next released design.

There’s the officially required cape EEPROM on the bottom side as well; I presume this could be rendered redundant as the community moves towards the Universal Cape concept. But the footprints are there, just in case.

Manufacturing

The first prototype cape has been manufactured by DirtyPCBs.com as a 2-layer black 10x10cm protopack. It has been shipped as of the time of writing and should reach me next week. I ordered the boards as a rush order (48h turnaround time) and had them shipped via DHL so that I could have the boards in hand before the Christmas rush. I will be using their services again if the boards work out well. Looking forward to receiving them!

Since I had left space on the panel and panelizing is free, I managed to squeeze some more of my designs into the panel and make the best use of the available real estate. I will write more about those in the coming posts.

So that’s pretty much it. Design suggestions are welcome, and I’ll see if they can be accommodated in the subsequent hardware revisions. Once I have tested the board and it all works, the design files will be made available as I have written above.

Re-furnishing

I just refurnished The Embedded Kitchen, both on the outside and on the inside, or I should say, at the front-end and hosting levels; it is still powered by WordPress. I was contemplating moving the site to static pages using Jekyll or Nikola, but WordPress (with Jetpack) now also supports Markdown-based content authoring. With the biggest reason for migrating to a static system gone, this decision may be deferred indefinitely.

The site is now hosted on OpenShift, the PaaS from Red Hat.

Phase 1 – I just want a Sandbox

I had come across OpenShift some time before and (almost immediately) signed up for the free tier of the service, which gives me 3 “gears” to experiment with whatever I like: PHP, Node.JS, MySQL, MongoDB and so on. I had been contemplating installing a new theme on the website and didn’t want to fiddle with the old site too much, so I exported all my data in the WordPress XML format, created a test site on OpenShift and used it as a sandbox to try out various themes. I settled on Nirvana first (too many choices, I was overwhelmed) and then stumbled across the Freelancer child theme based on the GeneratePress theme framework and liked it more. So that’s the current theme on the website.

After checking my site with PingDom I thought it might be a good idea to migrate the site to OpenShift.

And so wp-embeddedkitchen.rhcloud.com was born.

Phase 2 – blog.theembeddedkitchen.net

I decided to give the OpenShift version of my blog a proper subdomain, blog.theembeddedkitchen.net (which does not exist as of now). This was simple: just one CNAME record to add to my DNS zone settings, then telling OpenShift and WordPress (admin panel) about it, and boom. Checkpoint 2 crossed.

Now the acid test: getting theembeddedkitchen.net to point to my blog. I had read enough about CNAME records at the root DNS level to be wary of this (the IP address of RedHat’s service could change without notice, so I could not just use an A record with the current IP).

In my excitement, I changed the blog address in the WordPress admin panel from blog.theembeddedkitchen.net to theembeddedkitchen.net before the DNS had moved. Yikes! The transition was going to happen faster than it should have.

I added a “Moving” sticky post to the old blog just in case someone might view the site in the transition phase.

Phase 3 – Enter CloudFlare

Of all the alternatives out there, CloudFlare looked (at least to me) the safest. As a side effect, its CDN infrastructure should accelerate my website and further help Akismet in deterring spammers (in over 7 months I guess I had tens of thousands of spam comments).

Configuring CloudFlare was as easy as advertised: just a quick DNS server change and I was on board. I then added CNAME records and configured OpenShift to catch theembeddedkitchen.net and www.theembeddedkitchen.net for my application. I waited for the DNS changes to take effect and opened theembeddedkitchen.net. All looked well, and I thought I could call it a day.

But when I opened www.theembeddedkitchen.net, the site just hung and I got “too many redirects”. Wait. A redirect loop! OK, no problem: I changed my blog URL in WordPress to www.theembeddedkitchen.net and the problem seemed to be fixed. Then I opened http://theembeddedkitchen.net and got the same redirect loop. It seemed like a cat-and-mouse game.

However, on more digging I figured out that the missing piece of the jigsaw puzzle was to set up a redirect rule on CloudFlare that gives an HTTP 301 redirect from http://www.theembeddedkitchen.net/* to http://theembeddedkitchen.net/$1. So if you are stuck in a redirect loop and using both OpenShift and CloudFlare, here’s what you have to do (assuming your domain is your-domain.com):

  • In the DNS Control panel in CloudFlare settings, add a CNAME for your-domain.com to your destination (theembeddedkitchen.net => wp-embeddedkitchen.rhcloud.com).
    If you do this correctly, CloudFlare will tell you that “CNAME Flattening will be applied to this record”.
  • Add CNAME www and point it to your-domain.com (theembeddedkitchen.net)
  • Go to the Rules section and set up a redirect rule [HTTP 301] from http://www.your-domain.com/* to http://your-domain.com/$1. This is important to curtail that nasty redirect loop: the WordPress installation then gets requests only for the non-www domain, and CloudFlare takes care of the redirect even before OpenShift kicks in. Needless to say, the blog address in WordPress should be http://your-domain.com.
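The effect of that last rule can be modelled in a few lines. This is only a sketch of what the wildcard redirect does to an incoming URL; it is not CloudFlare's API or rule syntax, and the function name is made up for illustration:

```python
from urllib.parse import urlsplit, urlunsplit


def www_redirect(url: str, domain: str = "your-domain.com"):
    """Return (301, target) for http://www.<domain>/* requests, else None."""
    parts = urlsplit(url)
    if parts.scheme == "http" and parts.hostname == f"www.{domain}":
        # "$1" in the rule stands for whatever "*" matched: the path and query.
        target = urlunsplit(("http", domain, parts.path, parts.query, ""))
        return 301, target
    # Requests for the bare domain pass through untouched, so no loop can form.
    return None


print(www_redirect("http://www.your-domain.com/about"))
print(www_redirect("http://your-domain.com/about"))
```

Because only the www host is ever rewritten and the target is always the bare domain, a request is redirected at most once before it reaches WordPress.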

With all this done, the site was back online in its full glory. Now just a few finishing touches…

Phase 4 – Plugins

I installed the Jetpack plugin first so that I could get all the goodies and a site check every 24 hours, which ensures that the site never “idles” (apps in the free tier are put into standby if they don’t receive an HTTP request in 24 hours).

Also I decided to put a Disqus comment system in place of the old comment form. This was as simple as setting up my Disqus account and installing the Disqus comments plugin.

There’s also WP Super Cache, which makes this (almost) a static site. Not sure if N levels of caching are going to help anyway, but I think it might speed things up a bit.

I am thankful to the shared PHP hosting service provided by edaboard.com to its members which powered my website so far. The old content would still be there, albeit not really accessible for the time being.

And yes, as I write this I am finalizing a panel to be sent to the board house. What the circuits are about is in the next post.
EDIT: Panels are already sent to the board house and I’m waiting for them to arrive. Yay!

BeagleLogic: 14 weeks and counting; The journey so far

BeagleLogic was born out of a project idea using the BeagleBone Black, and I am thankful to Jason Kridner and the BeagleBoard.org community for accepting it as a mentored project under the Google Summer of Code 2014. As the GSoC program officially ended last week, here is a video report highlighting what’s been accomplished so far, over a period of 3 months from 19th May to 18th August.

It was an awesome experience this summer, interacting with the community and developing the codebase for a project that showcases the capability of the two little yet powerful Programmable Real-Time Units (PRUs) integrated on the AM335x SoC that powers the BeagleBone Black. The result is a 14-channel, 100Msps logic analyzer with a 360 MB buffer, which offers by far the best value for a $55 board: you can use it not only to develop your embedded projects but also to debug them on-site without any extra hardware. All the magic happens inside the firmware loaded into the PRUs and the kernel driver which manages sharing the system memory with them.

This project also saw me doing many things for the first time. It was the first time I compiled a Linux kernel and wrote myself a kernel module that is now a part of the BeagleBone community kernel, and I look forward to getting it merged upstream. The Web Client for BeagleLogic (link) is my first web application using HTML5 and Bootstrap for a lightweight web client for BeagleLogic. The backend I’ve written for the web client is my first Node.JS based application. I found these frameworks quite interesting and hope to work on projects in these areas in the future.

Google Summer of Code has been the best thing to happen to me so far, and I’ll be looking forward to it in the years to come.

Have fun with BeagleLogic. Also, do write back to me with how BeagleLogic helped you debug your circuit and learn more about logic protocols!

BeagleLogic and sigrok: The beginning

This week I spent some time familiarizing myself with the libsigrok components, and developed bindings for BeagleLogic into libsigrok. The development work walked me through non-blocking I/O, which I implemented in the BeagleLogic kernel module only to realize that it had a major bug that caused the client to sleep uninterruptibly and froze my BeagleBone Black every time. The bug was identified and fixed, and after that non-blocking I/O worked as expected.

Memory Mapped Kernel Buffers
This was another optimization I made use of in the sigrok bindings. The sigrok library internally keeps logic data in packets, which are exchanged with the host application via pointers. I exploited this opportunity to use the mmap() functionality I designed into the kernel driver, so the sigrok implementation just passes an mmap()’ed pointer to the host application instead of read()-ing the data.
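The idea of handing out pointers into a mapped buffer rather than copying with read() can be sketched as follows. The actual bindings are written in C inside libsigrok; this Python version only illustrates the technique, the helper names are hypothetical, and whether the device node accepts exactly this mapping call is an assumption:

```python
import mmap
import os


def map_capture_buffer(path: str, length: int) -> mmap.mmap:
    """Map `length` bytes of `path` read-only and return the mapping."""
    fd = os.open(path, os.O_RDONLY)
    try:
        return mmap.mmap(fd, length, flags=mmap.MAP_SHARED, prot=mmap.PROT_READ)
    finally:
        # Python's mmap keeps its own duplicate of the descriptor.
        os.close(fd)


def segment_view(buf: mmap.mmap, offset: int, size: int) -> memoryview:
    """Zero-copy view of one segment; nothing is copied until the caller asks."""
    return memoryview(buf)[offset:offset + size]


# Assumed usage on the BeagleBone (path and sizes are placeholders):
#   buf = map_capture_buffer("/dev/beaglelogic", buffer_size_in_bytes)
#   first_segment = segment_view(buf, 0, segment_size_in_bytes)
```

The view plays the role of the pointer passed to the host application: it refers to the mapped memory directly, so no per-packet copy is made.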

However, the driver’s record of which buffer is currently being read still needs to be updated for poll() to work properly. For this reason I implemented dummy reads: when a NULL pointer is passed as the buffer to read(), the driver does not copy any data, and just updates its read state so that poll() works fine.

The flow of data is like this:

  • The libsigrok bindings signal the BeagleLogic module to start the sampling operation, and start polling on the /dev/beaglelogic node using a GPollFD
  • Once the PRU fills up a data buffer, it sends an IRQ that wakes up the polling thread and the beaglelogic_receive_data() callback in protocol.c executes
  • The callback sends a packet with the pointer to that segment of data required, and adjusts the internal offsets for the next callback
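The steps above can be sketched as a small loop. This is a simplified stand-in, not the libsigrok code: it uses Python's select.poll instead of a GPollFD in a GLib main loop, and `mark_consumed` is a hypothetical callback standing in for whatever tells the driver that a segment has been consumed (the dummy read, in BeagleLogic's case):

```python
import select


def iter_segments(fd, buf, segment_size, sample_limit_bytes,
                  mark_consumed, timeout_ms=1000):
    """Yield zero-copy views of `buf`, one per buffer-ready event on `fd`."""
    poller = select.poll()
    poller.register(fd, select.POLLIN)
    view = memoryview(buf)
    offset = 0
    while offset < sample_limit_bytes:
        # Sleep until the driver reports that another buffer has been filled.
        if not poller.poll(timeout_ms):
            raise TimeoutError("no buffer became ready in time")
        size = min(segment_size, sample_limit_bytes - offset)
        # The "packet" is just a slice pointing into the shared buffer.
        yield view[offset:offset + size]
        # Adjust the internal offset for the next wake-up.
        offset += size
        # Hypothetical hook: update the read state so poll() blocks again.
        mark_consumed(size)
```

Each iteration corresponds to one run of the callback: wake up, hand out a pointer to the next segment, advance the offset, and go back to waiting.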

The process goes on till the sample limit has been reached. BeagleLogic also supports streaming but…

sigrok-cli on the BeagleBone Black
To test the results of the bindings, I ran a sample operation on BeagleLogic using the sigrok bindings and it worked fine. I was able to decode a wave file from my trusty audio player hardware 🙂 to wav right there on the BeagleBone Black using the article on the sigrok blog as a reference. I got 6 seconds of a wave file directly from the dump. This took ~750 seconds to process from RAM on the BeagleBone Black.

I also tested sampling into a 256 MB buffer in the RAM