Upcoming Adventure: 100 Days of FPGA and Digital Design

In the next 100 days starting Monday, 10th June 2019, I will be actively tracking my efforts in ramping up on Digital Design and FPGAs. They aren’t easy topics to master by any means, and 100 days may not be enough but still I consider it worth giving a shot as it would give my learning effort a momentum. Because of my full-time commitments I’ll only be able to give 1-2 hours to this on weekdays, let’s see how it goes!

I’ve created a separate site to track my progress – it’s at https://100dayfpga.theembeddedkitchen.net. There’s a lot of FAQ on the front page of the sub-site, please go ahead and read it to understand why I am doing this challenge. I’ll also be using the hashtag #100DayFPGA on social media to tag my efforts!

If you have any suggestions on what I could take up during my 100-day effort, please feel free to email me or leave a comment below. Wish me luck!


The sub-site is created in Hugo and served over GitLab Pages. I’ve just started to get a feel for static sites and one day I plan to migrate over The Embedded Kitchen over to Hugo as well but there’s no rush – it’ll take time and I would not like to break anything during the migration.

There’s now also a Discourse discussion forum at https://discourse.theembeddedkitchen.net to discuss articles on the site as well as about this challenge – I plan to migrate away from Disqus comments in the near future to inline Discourse threads on each article.

Announcing: BeagleLogic Standalone

Today, I’m delighted to announce something that I’ve been working on the past four months, which is now alive and doing exactly what it was meant for. A Standalone, turnkey logic analyzer based on the work I did for BeagleLogic. And is aptly titled – BeagleLogic Standalone.beaglelogic-standalone-top

BeagleLogic Standalone is a specialized version of the BeagleBone which is intended to be used a logic analyzer based on BeagleLogic.

Describing it as an equation, (BeagleLogic Standalone) > (BeagleBone) + (BeagleLogic). The greater than comes from the fact that the BeagleBone + BeagleLogic has up to 14 logic inputs, BeagleLogic Standalone has 16. It has 1000Mbps Ethernet, while the BeagleBone has just 100Mbps.

This logic analyzer has networking capabilities (10/100/1000Mbps Ethernet); it can be used to used to debug circuits remotely. And as it is a full-featured Linux computer, you can run the sigrok set of tools directly on the BeagleLogic Standalone board (they come preinstalled in the BeagleLogic system image), or on your host PC. It has 16 channels and can sample up to 1.5 seconds of data at the maximum sample rate, which is 100MSamples/sec (3 seconds of data if using only the first 8 channels).

If you are interested in purchasing one of these boards, please fill up this form, and I’ll keep you informed.

The rest of this post is dedicated to describing the story behind the creation of BeagleLogic Standalone.

Read moreAnnouncing: BeagleLogic Standalone

The Printed Circuit Name Card

The events described in this article take place in July, when I spent some of the best days of my life so far as a summer intern based at Mountain View, CA.

My internship was coming to a close. I wanted to have a namecard to give out to the new connections I made while I was there – and with a little more than two weeks to go then before winding up, I had to make it fast. Having seen a lot of interesting projects that make business/name cards out of printed circuit boards, I decided to do my take on the concept and create a printed circuit namecard for myself. This article describes the thoughts behind the design and what I learned in the process. It was fun nonetheless. Before continuing over to the rest of the article, here’s what the finished card looks like:


The dream phase

“My card would have a small multi-colored LED that would glow whenever someone would hold it in their hands. It would contain my contact information as a QR code and on the back would be whitespace where I can leave someone a handwritten note”.

Which color would it be? The soldermask colors of choice were – OSHPark purple, blue, black and white. Had recently seen a board with a matte black soldermask and liked the feel of it so I was interested in trying it out too.

The card designs I had seen placed all the text on the silkscreen, but maybe putting some part of the text as exposed copper would add more contrast? The effect would look more dramatic when the board gets gold plated (ENIG) vs the usual solder finish (HASL), also the boards would cost more as a result.

The Mock-up

Business card size is 3.5″x2″ ( 89x51mm ). Mine’s 89x50mm to fit a 10cmx5cm bounding box for standard fabrication size.

Mock-up of the card was done in Inkscape. I generated a QR code for myself as a SVG file from here

The QR code is in vCard format, so scanning it reveals my contact details which can be straightaway saved on your phone. Also it was at this point that the fonts and font sizes were frozen. Once it looked good to me, I exported different sections of the file in PNG format, which I would later convert to footprint using KiCAD’s handy utility and position in the board layout as any other component.

The Electronics

For choosing the right microcontroller, there were four requirements – low power, 3 PWM channels, touch sensing and low price. Wanted to use the AVRs initially for ease of use, familiarity (the Arduino framework) but they were roughly around a dollar or more on DigiKey/Mouser. The PICs seemed to fit the bill though in just about half a dollar – selected the PIC12LF1572 microcontroller – that sounded good enough. Also, my first time using a PIC microcontroller – a new platform to learn!

The battery? CR2032s are pretty standard, and also have the highest possible mAh. The next smaller one with a comparable mAh was the CR1632. I decided to go with the CR1632, though next time I might consider a CR1220.

I needed a PIC programmer, found a project on Hackaday.io which meant I could repurpose one of my Arduino boards to do the programming – this project actually gave me faith that I could somehow program the PICs while still being on a tight deadline (thanks to hackaday.io user jaromir.sukuba for maintaining this!) and I decided to give it a shot.


Submitted a Mouser order for all the parts and it all arrived in 3-4 days. The BOM cost was ~$60 for enough to build 40 of these.

The Board Layout

As this was a relatively low complexity design, it was completed in a few hours on an extended night shift.

The key detail here was my name – unlike the other text I put it on the copper layer so that it would get gold plated . To ensure that the soldermask opened up around it, I added a 0.2mm “stroke” around the vector graphic corresponding to my name in Inkscape and put it on the F.Mask layer so that the text is exposed through the mask and would eventually get gold plated. Here you can see what I mean [red = top copper, magenta = top soldermask, green = bottom copper]:


Here’s a shot of the completed layout:


Board Fab

I used Elecrow for this order – they offered matte black soldermask option without a custom quote. I had used them once before and have been satisfied with the quality. One more good thing about them is that they don’t add production codes on the silkscreen of your boards [which can be so annoying on a board design like this] and also send you a picture of the finished board before they dispatch your order [which makes you even more excited to receive the boards once you’ve seen them made as you wanted them to be]. The quality was excellent. From silkscreen to plating and all the vias were properly covered.

I had selected DHL Shenzhen as the shipping method which is supposed to be faster than DHL Hong Kong, the price difference being roughly $5. For a total of 40 boards with matte black soldermask and ENIG finish, paid $100 total. Order was submitted on a Wednesday and the boards were delivered next Wednesday. The boards looked beautiful, just as I had designed them to be 🙂 .


There aren’t a lot of components here, this should have been pretty standard stuff but I had lead-free solder paste and that was a big pain to work with a toothpick, without a stencil. I should have probably ordered one along with my boards. And the last week internship rush meant that I couldn’t have enough time – had to choose between doing the firmware or building all 40 on a night stretch – I got a few assembled and that was it.


This was the the first time I was using a PIC microcontroller. Moving to a new microcontroller has its own surprises – it took me an hour to realize that the PWM generator won’t update its duty cycle automatically on changing contents in the duty cycle register and a bit has to be set in another register. The MPLAB IDE was mostly okay-ish, although I’d try and see if I can get it working with the SDCC compiler for future firmware revisions.

The MPLAB code configurator provided an easy way to implement the touch sensing functionality here as it had an mTouch library implementation and I was able to get it working just in time. [jaromir]’s PIC programming software worked well and I was able to quickly iterate and build v0.1 .

The chip being a LF series has pretty low standby current but there wasn’t enough time to bake that functionality in. I chose to run the MCU at a lower clock frequency instead (500kHz) to get active currents of the order of 0.3mA. Battery life is two weeks after which the LED dims out.

Here’s a close-up of the finished result. The 5 pins are for programming. A 2mm berg strip is manually held to the headers to program.


I was very satisfied with the finished result. Matte black with a gold finish looks beautiful and to me, was worth it. Also there was excellent service from Elecrow which I would whole-heartedly recommend based on my experience with them. [jaromir]’s PIC programmer allowed me to use one of my Arduino boards to program the PICs, it helped me a lot during this project. Thanks a lot!

With a working v0.1, I was able to build a few and give away. I’ve given away some blank boards too, they look pretty in their own right.

I want to rework the firmware, most likely with SDCC and add some more fade effects on the tri-color LED. And also implement proper low power support (likely by using a WDT to sleep every second and wake up and check for a touch). In the next PCB rev, I would:

  • Make the boards 0.8mm thick. Or even 0.6mm.
  • Make the QR code larger for better readability and push it to the back side of the PCB – it’s currently all white silkscreen
  • Add a little more clearance between the touch sensor pad and the ground plane all around it. It probably decreases the touch sensitivity a bit.
  • Use a slot in the board and add a split battery holder leading to a lower Z-height.
  • Make a pogo pin set and a 3D printed jig to hold the programming header in place. It’s awkward to hold and press the headers everytime something needs to be flashed.

I’m not posting the board design files yet. However I might post the firmware after adding more effects.

That’s all for this one. If we ever meet in person, ask me to show you and I might even give you one of these.

Here’s an animated GIF:


BeagleLogic: now also analog

A year and half ago I published a survey on the BeagleLogic wiki where prospective visitors and users were asked which feature they wanted to see in BeagleLogic the most. I compiled the raw responses so far, here they are:BeagleLogic survey

It’s interesting that a majority of prospective users wanted to be able to do analog sampling with BeagleLogic.

And today, it becomes a reality for BeagleLogic users, thanks to the efforts of a team working at Google Research who wanted to use the BeagleBone for data acquisition and had been working independently on an ADC cape for the BeagleBone for a while. When we first got in touch, they already had a board fabricated and assembled and kindly agreed to send me a prototype to let me help BeagleLogic support the board. It excited me as not only would analog sampling support in BeagleLogic become a reality but also their project could benefit from the kernel infrastructure BeagleLogic has established to capture data using the PRUs on the BeagleBone and move it to userspace to be able to realize the full performance of the ADC which would be difficult to achieve through a libprussdrv based solution.

I am happy to tell you today about the PRUDAQ project (link to announcement on Google Open Source Blogs) – an ADC cape for the BeagleBone for doing high-speed analog data acquisition. At the heart of the board is an Analog Devices AD9201, a 10-bit ADC that can sample two channels up to 20MSPS.
PRUDAQ board
The board has been designed by Jason Holt and his team at Google Research and the team at GroupGets as the manufacturing partner have been instrumental in getting the boards manufactured and ready for sale – the board itself is available at their store, and they also offer a bundle with pre-loaded SD card and other accessories.

GroupGets is a secure platform to create or join group buys for things that are out of reach for a single buyer like a high minimum order quantity for a specialty sensor. Their GetLab engineering team also creates custom hardware and software to enhance group buy targets or make them easier to use.

PRUDAQ with the BeagleLogic stack can be a great way for high performance analog data acquisition – the PRUs and the kernel driver can handle simultaneously both channels of the ADC at sample rates up to 19.9 MSPS 1, which corresponds to a data rate of 79.6 MB/s 2. The dual-PRU architecture of BeagleLogic that maintains a clear separation between the sampling operation itself (done by PRU1) and data transfer to memory (done by PRU0) means that on an application level, only the firmware on PRU1 needs to be changed in order to support PRUDAQ, no modification is necessary on the kernel driver side, raw analog data can be directly read through /dev/beaglelogic and this also opens the door to other ADC boards or sensors being able to make use of the high-speed capture framework provided by BeagleLogic by writing suitable firmware extensions.

There’s also an update to the BeagleLogic system image that also builds in PRUDAQ support, and this will be bundled by default in all new BeagleLogic image releases. It is immediately available for download from the images page on the BeagleLogic wiki. For those interested in the firmware, it’s present in the main repository now here. To learn more about the PRUDAQ project, their wiki is a good place to start.

The performance of PRUDAQ has been documented really well and it also highlights the bottlenecks in the later part of the document, which is also relevant to the performance of BeagleLogic.

I’d also take the opportunity to announce collaboration with GroupGets for the first production batch of BeagleLogic capes and a bundle offer – while the BeagleLogic capes aren’t available today, those who purchase the first batch of the PRUDAQ boards from GroupGets will get 10% off their purchase of a BeagleLogic cape as and when it is made available from GroupGets. The first batch buyers of the board will receive a discount code in their email when it’s ready.

Congratulations to the team at Google Research and GroupGets for a successful launch! I am also very grateful to the support provided to me by the Google Summer of Code program and the BeagleBoard.org Foundation in bringing this project to you over the summers 2 years ago, and looking forward to see the new applications that this will make possible.

  1. Due to the nature of the assembly code that samples the PRU pins, it is not possible to sample both channels at 20MSa/s as the external clock needs to be tracked and at 20MSa/s, samples were found to be dropped. However it works OK at 19.9 MSPS with an external clock [tests done by Jason and his team]. It is however, possible to sample both channels simultaneously at the max possible 20MSPS if PRU1 is allowed to clock the ADC. This is a special case and the firmware to do so will be made available in the future. 
  2. ADC samples are 10-bit but use up 16 bits = 2 bytes. So 19.9 MSa/S * 2 channels * 2 bytes/channel = 79.6 MB/s 

A day with Hackaday

The year 2015 saw my project BeagleLogic as a Best Product finalist in the 2015 Hackaday Prize, and my other project Smarter Power Pack  as a semifinalist in the main event. I also got to attend the Google Summer of Code Mentor Summit at Sunnyvale, CA (thanks to BeagleBoard.org) in November 2015 where I met many interesting people and had a great time.

Just the next weekend was Hackaday’s first hardware conference – the Hackaday SuperConference – which would later go on to be a grand success. I however knew that I couldn’t stretch my 8000-mile trip to the weekend because my university end-term examinations began next week and I had to leave on the Thursday before, the 12th of November. I let Sophi (of Hackaday) know and they invited me for lunch with them Thursday afternoon at the Supplyframe office at San Francisco.

Thursday morning, I boarded the Caltrain to San Francisco, and then roughly a 10-minute walk from the station to the office. This is how my entry was announced at the “Hackaday SuperConference” group chat channel at Hackaday.io :


The office at that time was packed and brimming with activity as the entire Hackaday crew had gathered and working together towards the Superconference. I sat on a table beside [Aleksandar Bradic] (CTO, Supplyframe) and [Chris Gammell] (of AmpHour podcasts and Contextual Electronics). Behind me were [Rich] and [Sophi] working on a video-shoot for the Hackaday Omnibus 2015 release (yes, I saw a part of the shoot that day). Here is a link to the released video on Vimeo.


Thanks to [Jasmine], I did not miss the SuperCon badge and T-Shirt. As a proofreader for the Hackaday Omnibus 2015, I received my copy as well.


Now there’s an interesting backstory of the badge here that I think I can share with you: As I put on the lanyard onto the badge, they realized that the drill hole for putting the lanyard on the badges was a tight fit and it would be cumbersome to remove and re-attach the badge without scratching the solder mask on top. The solution? A bunch of keyring rings which would then go on between the badge and the lanyard – like here.

I got to play with the new NVIDIA Jetson TX1 for a while, a test unit of which had just arrived at Hackaday ([Brian] would later publish a hands-on with it, which can be read here). I saw [Brian] and [Mike] typing on what would be the posts that would later go on to the Hackaday blog, and also the HaD tip line, in the short time I was there.

Once [Sophi] and [Rich] were done with a round of shooting for the Omnibus video, we proceeded for lunch to a nearby restaurant. The food was great. The lunch time conversations, even more great.


After lunch we walked back to the Supplyframe office and I bid goodbye to the entire team, and boarded my flight back to India later that night. I had got to see the team behind Hackaday up really close, and my respect for them has only grown since. They are doing a great job with their blog, the Hackaday.io community and the SuperConference which I really missed being a part of this year. A big shout out to them, and also to [Sophi] for making an exception while you all were so busy preparing for the SuperCon so that I could get to be there, and interact with you all. Thanks for having me there! Looking forward to meeting you all again.

Wishing all readers a very Happy New Year 2016 from The Embedded Kitchen.


An Adventure in self-fabricating PCBs

An Embedded Kitchen wouldn’t have been complete without the ability to fabricate circuit boards for rapidly prototyping circuits without waiting on the board house [fail fast, fail often applied to hardware design]. This week, it gained the capability to fabricate boards using the well-known toner transfer method.

I had done toner transfer first when I was at school, the results had been terrible. So terrible that I turned to wire-wrap prototyping for my circuits, even when I started using SMD components later. However the realization that dead-bugging SMD parts is too much effort which doesn’t scale brought me back to PCBs. This time because I couldn’t wait for two weeks to get my boards manufactured so I decided to take on toner transfer again. I knew the steps. Clean the board, print the design, iron it on. After the ironing when I rinsed the board under cold water, I was pleasantly surprised to see near-perfect pattern transfer from paper to board. Then I went on to etch them with ferric chloride.

The first boards to be so fabricated are an early stage prototype of my project Smarter Power Pack which uses spare laptop batteries to realize a better power bank (interested? here’s a link for further reading), and a respin of the LCD connector board for my BeagleBone LCD cape. Here are pics of the boards so fabricated:




After this was done, I threw in an additional step of tinning the whole board to protect it from tarnishing and also patch-up any imperfections in traces. Then placing components and reflowing the boards are usual. Here are the results.



They’re not as pretty as the ones got from the board house, but they’re beautiful in their own way. They’re cheaper in terms of money but more expensive in terms of extra effort (drilling and hole plating). And I don’t need to have 5 or 10 copies of tea coasters if I used a wrong footprint or inverted it. But in the end, it’s all worth it when you can get from an idea to prototype in days and not weeks.

BeagleLogic: Building a logic analyzer with the PRUs: Part 1

At demonstrations whenever I am asked a question about how BeagleLogic works, it takes time to be able to explain how a low cost SBC can actually sample at digital signals at 100 MHz, what makes the BeagleBone Black so special, why can’t this be done with something like a Pi without adding any extra hardware.

This blog post is an attempt to start from (almost) scratch and explain the nuts and bolts of the BeagleLogic assembly and document the design decisions made last year as a reference for future application scenarios. This should be the first in a series of posts.

The PRUs, or the “Programming Real-Time Units” on the AM3358 SoC on the BeagleBone Black are two 200 MHz microcontrollers that run side-by-side the 1 GHz ARM CPU. They can be started, stopped, reset, programmed via the CPU and also share the same bus so the PRUs can independent of the CPU access the core peripherals like GPIO, memory, ADC, DMA, … and also have a GPI/GPO subsystem of their own (“Enhanced GPIO subsytem”).

These PRUs can be programmed in C using the TI PRU C Compiler (recommended) or PASM (now considered obsolete but still works) or a GCC Compiler port (in progress) and of course, hand tuned assembly code one can write for the PRU as well.

At the heart of a logic analyzer…

… is a register that samples the input signal(s) at regular intervals (decided by the sample rate in case of a free running sampler) or at the edge of an external clock signal. These samples are then recorded into the sample buffer which can be then used to extract and analyze the captured digital signal.

Note that this is the simplest case, in real life scenarios there are often trigger conditions like ‘Hey, start recording when there is, say, “a falling edge on pin B when pin A is high”, or “after 5 rising edges on pin C when pin D is high and pin E is low”‘ and so on.

The core of BeagleLogic is the simplest possible implementation. It simply samples and records the inputs into a buffer at a sample rate that can be configured in integer divisions of 100 MHz i.e. 100 MHz (100 / 1), 50 MHz (100 / 2), 33.33 MHz (100 / 3) and so on.

Now let us look at the building blocks available in the PRUs of the BeagleBone Black that enable us to achieve this.

MOV, SBCO and LBCO Instructions

A quick primer on the 3 most important instructions for data transfer in PRU assembly. For a detailed overview refer to the PRU instruction set manual.

MOV is used for moving data between registers

MOV R1, R2         // R1 gets entire contents of R2
MOV R1.b0, R2.b0   // least significant byte of R2 copied to LSbyte of R1
MOV R2.w1, R3.w0   // least significant halfword of R3 copied to MSHWord [HWord=16bits]

SBCO is used for moving data between a physical address and a register. This physical address can be within the PRU (e.g. PRU data RAM, shared RAM, power and control registers) or outside peripherals like the GPIO subsystem, the ADC subsystem, the EMIF subsystem (the DDR SDRAM controller), the GPMC. General usage:

SBCO &src, destination, offset, n

Moves n bytes from the src register into (*destination)+offset, this is indirect addressing. If n > 4, then the subsequent registers are accessed as well. Offset can be a register or an immediate value.

When using SBCO to write to the DDR memory, note that we must provide only physical memory addresses as there is no MMU involved when PRU accesses it. This is what most examples of the PRU that demonstrate shared memory access do.

Using R0 = 0x40000000, R1 = 0x100, try and guess what SBCO &R2, R0, R1, 32 does

LBCO loads data from an external address into a destination register. If n > 4, data gets loaded into the subsequent registers as well.

LBCO &dest, src, offset, n

Clock cycle counting

I’ve used the Cortex M3/4 microcontrollers and these cores have this nice register called DWT_CYCCNT which provides number of clock cycles the processor has executed code. So one can take the difference of DWT_CYCCNT register before and after a code block and get the number of cycles this code takes to execute. This allows cycle-accurate code profiling.

Lucky for us, the PRU has this neat feature as well, that I used for determining the number of processor cycles each instruction takes to execute. This is known as the CYCLE register1. But before we can use it, we’ll have to enable it by setting bit 3 in the CTRL register using this code snippet:

MOV    R1, CTPPR_0
MOV    R2, 0x00000220      // C28 = 00_0220_00h = PRU0 CFG Registers
SBBO   &R2, R1, 0, 4

LBCO   &R1, C28, 0, 4      // Enable CYCLE counter
SET    R1, 3
SBCO   &R1, C28, 0, 4

Notice that we first modify the CTPPR_0 register which allows us to use the C28 register to refer to the PRU control registers instead of using up another register to hold the PRUCFG register address.

So, whenever we need to time a section of the code, we can do something like:

LBCO &R1, C28, 0xC, 4     // Load "before" cycle count into R1
// your assembly code here
LBCO &R2, C28, 0xC, 4     // Load "after" cycle count into R2

Now we can examine the contents of R1 and R2 to determine how many cycles it takes. You would also have to account for the “extra” clock cycles of the LBCO instruction.

I ran a lot of tests using this initially, here’s what I found:

  1. every MOV operation is one cycle. In fact, any operation that does not access external memory or peripherals completes in a single cycle i.e. 5ns. This is by design.
  2. LBCO and SBCO instructions with byte count 4 take 2 cycles. The way I hypothesize is that 1 cycle is spent to generate the address by adding the offset to it and then the actual data transfer operation takes 1 cycle per 32 bits transferred, thus O(n) time. Therefore the SBCO example in the previous section should take 9 cycles to complete (1+32/4), assuming there is no bus stall while writing the data to the memory.

We will use this information to help us with the timings needed for sampling.

Enhanced GPI/GPO feature

The PRU has an enhanced GPIO that operates at 200 MHz2 and it implements a “Direct Input” GPI mode. What it means that whatever be the pin value at the PRU inputs at the sampling instant will be captured whenever register R31 of the PRU is read. So, to sample first 8 bits of R31 into the register file, we can use the following PRU assembly code: (edited for more clarity, referring issue #9 on GitHub) The first line of the snippet below shows how one can sample the lowest 8 bits of R31 which is connected to the PRU input pins. The entire snippet shows how to make 5 samples of the input pins and store them into successive registers.

MOV R10.b0, R31.b0
MOV R10.b1, R31.b0
MOV R10.b2, R31.b0
MOV R10.b3, R31.b0
MOV R11.b0, R31.b0


  1. We can and pack up to 4 samples into a single 32-bit register using the PRU assembly instruction (Rn.bx refers to the x’th byte in the register – each register is 4 bytes).
  2. You might ask, why sample to the registers, and why not store this data in the 8 KB SRAM, or the 12 KB shared RAM, or even the DDR RAM?
    • From the previous sections, we see that every SBCO would take 2 cycles but register access is just 1 cycle, so we can achieve higher sampling rates.
    • While accessing the DDR RAM there is a very low but finite probability of bus conflicts while data is being written, and having such an instruction within the real-time sampling loop has the potential to compromise the sampling operation. So we would like to separate both of them.
  3. Right now, we’ve only stored data in the registers, and there isn’t enough space to store all samples in the registers. So we need to somehow get the data out of there.

The PRU0/1 Scratchpad and the XIN/XOUT instructions

​I remember initially reading this section in the PRU reference manual with skepticism and was disappointed to find scarce resources and/or example applications in the early phase of my GSoC period. But this turned out to be one of the important links in the puzzle.

​Apart from the 30 registers in the two PRUs (R30 and R31 are connected to the GPO / GPI respectively), there’s also 30×3 independent register banks available as a scratchpad; and this is connected to the registers on both the PRUs using a “broadside interface”. Broadside means that all 30 registers are connected, and all of them can be moved in parallel. This means that in a single clock cycle one can copy or swap all 30 registers of a PRU with one of these 3 banks.

Here’s an example [syntax:: XOUT <bank_id>, &Rn, count]:

// On PRU1
XOUT 10, &R16, 32  // Copies R16-R23 into Bank0 (Bank0 = 10)

// On PRU0 - after XOUT has executed on PRU1
XIN 10, &R16, 32   // Reads R16-R23 from Bank0 into PRU0

This ability of moving bytes across the PRU barrier is crucial for BeagleLogic as it means that:

  1. We can cleanly separate pin sampling (handled by PRU1) and data transfer (handled by PRU0).
  2. Since pin sampling operates only on registers, it is effectively shielded from bus latencies.
  3. Because register manipulation is cycle accurate we can design delay loops in PRU1 to give us a programmable sample rate, independent of PRU0 operation.
  4. PRU0 can now directly push data into the 512 MB(!) of DDR memory directly giving BeagleLogic a buffer capacity so large at this price point. Note that due to packing the samples into 32 bit words the number of write transactions is cut down by a factor of 1/4th (8 bit samples) or half (16-bit samples) as compared to the sample rate, giving us cycles to spare. [The actual PRU firmware of BeagleLogic writes 32 bytes at a time into the DDR memory.]

Inter-PRU signaling

The final piece in the puzzle for a basic implementation is to have a way so that PRU1 can signal PRU0 that it has pushed data into Bank0 using XOUT, and that it can take data in using XIN and write it to the DDR memory.​ Interrupts. By configuring the mapping appropriately in the PRU interrupt controller (PINTC) one can send a signal to the other PRU by writing a value into the R31 register which triggers an interrupt (note we generally read from R31, not write). Similarly by waiting on bit 30 of R30 to be set, one can know from the other PRU when an interrupt has occurred. The interrupt configuration and mapping is in general handled by the library (libprussdrv) or the pru_remoteproc kernel driver, we just assume within the PRU that everything is already configured for us and is working.

​Putting it all together

Let us try writing a sketch of a very simple firmware which implements a basic form of a 8-bit logic analyzer. This should be helpful when developing for a similar application scenario​ and understanding the behind-the-hood working of BeagleLogic.

PRU1 code:

    MOV R16.b0, R31.b0     // take an 8bit sample
​    MOV R16.b1​, R31.b0
    MOV R16.b2, R31.b0
L1: MOV R16.b3, R31.b0
    XOUT 10, R16, 4        // Move 4 samples to Bank0
    MOV R16.b0, R31.b0
    MOV R31, PRU0_INTR     // Signal PRU0
​    MOV R16.b1, R31.b​0
    MOV R16.b2, R31.b0
    JMP L1

Observe how writing an infinite loop this way allows us to sample 4 bytes, move it into Bank0 and signal the PRU0 that data is ready. Also, the time gap between two samples is 2 clock cycles, so this infinite loop samples the pins at 100 MHz. See? How could one sample at, say, 50 MHz?

Next, let’s write the segment of code in PRU0 to receive this and write it into memory

// Assume DDR buffer start address is in R0, offset is in R1
WBS R30, 30                // Wait till interrupt
// an SBCO to clear interrupt flag (omitted here for clarity)
XIN 10, R16, 4            // Receive data
SBBO &R16, R0, R1, 4      // Store it into the DDR mem
ADD R1, 4                 // Increment dest address
JMP loop

Now observing how BeagleLogic does it – PRU1 Sampling code and PRU0 memory writing code should give an idea of the basic processes happening that make BeagleLogic possible. Of course, there is the overhead but it’s the same basic principle.

A similar post explaining the BeagleLogic kernel module is on it’s way soon.

  1. section of the AM335x reference manual 
  2. section of the AM335x reference manual 

Coming Soon: (Yet another) BeagleBone Display+CapTouch cape

TL;DR: This creates a cape for the BeagleBone Black using readily available replacement spare TFT Panels and a capacitive touchscreen originally found in cheap tablets.

[ You can also follow the project at Hackaday.io ]

Thanks to the economy of scale, the market of lower-end tablets is flooded with N brands available at almost throwaway prices. Here one can buy a cheap one for less than ₹3000 ($45) and get a decent 7″ screen with capacitive touch. But when it comes to the BeagleBone Black I either saw that most of touch screens available were resistive and panels with capacitive screens were out of my budget, especially when you know you could leveraging the same economy of scale build one 🙂 . I decided to take on the challenge and build a cape for the BeagleBone out of these readily available parts.

This December during my winter break at Jakarta, I shopped at Glodok and Roxy – the electronics and mobile spare part shopping hubs respectively [compare that to HQB, Shenzhen but at a smaller scale though]. What really caught my attention was 7″ TFT panels and capacitive touch digitizers in shops – the LCDs looked very close to the cheap 7″ LCD panels being sold on eBay and other places with a Realtek-based HDMI converter which I wanted to check out lately (like this).

These panels are known by the name AT070TN9x (x=0,2,3,4) and are originally manufactured by Innolux and have 50-pin connectors with a TTL interface (datasheet here). But were the panels being sold in the markets the same AT070TN92 panels, their clones or something different? I decided to find out and requested at the shops to be able take a photo of a few models of these. Got home and tried to match the pinout on the panel with the AT070TN92. Bingo. Perfect match for almost every one of them. Even though the panels have slight dimensional differences in the bezel, they have the same pin layout and should (hopefully) be the same from the inside.

Tablet LCD - back

Here you can see the one which has KR070PM7T written on it. The giveaways – Pins 1 & 2 (VLED+) shorted, 3 & 4 (VLED-) shorted. Pins 45, 49 and 50 are not connected. If you refer the datasheets, the pin definitions seem to align.

I bought two panels and mix-matched them with available capacitive touch panels to see which one fitted the screen the best. I also bought from a nearby shop the ZIF FPC connectors for the display and the touchscreen. The display one is 50 pins, 0.5mm pitch and the touch panel has 6 pins at a 0.5mm pitch. Not exactly breadboard friendly but very PCB friendly. As seen on the image at the top of the post (not this one) the LCD is sitting over the capacitive touch panel and you can see the 6-pin connections there. No reverse engineering needed for the capacitive touch FPC as the connections for the 6 pins are already highlighted in the image!

Okay, now I had to find a good driver for the LCD. I was aware of the TI TFP401 DVI (HDMI) receiver and could get samples if I wanted. But hey, the BeagleBone converts from TTL to HDMI and then I’m gonna convert HDMI to TTL, right? Why not just cut through the layers and wire the display directly? Should just be D0-D15, VSYNC, HSYNC, PCLK, 3V3, backlight, adjust LCD driver resolution, timings and done. Turns out we’re not done, yet.

The catch

Every LCD requires a high voltage to control the twist of individual liquid crystals. This voltage is usually internally generated using charge pumps but turns out that this “dirty” LCD panel ( DirtyPCBs 😛 ) expects the voltages to be supplied externally to it. The LCD expects approximately 10.4V for AVDD, 16V for VGH and -7V for VGL to be supplied to it. Hmm, how do I generate these?

The answer was not very hard to find. I was able to get Allwinner’s A13 based reference design for tablets that use a display (no points for guessing which one) with a 50 pin interface. Looking at their gate voltage generation circuits, we get this:


This app note from Maxim Integrated explains what we’re looking at [scroll down to the end of the appnote]. The AVDD rail draws the maximum current so it gets powered it by the boost converter. Then the diode and the capacitors form a charge pump generating approximately +21 V and -10.4 V from that rail and the Zener regulates it down to the needed voltages. Very cool.

I’ve ordered some boost converters from AliExpress, the ones called SY7201 and XR1151 which are as of now stuck in the Chinese New Year holiday shutdown. Until then I would test with a TPS61061 which I have at hand.

The design

The schematic of a beta cape is almost done and I’m proceeding with routing of the PCB as at the time of writing. Here’s a peek on how the schematic looks right now, the final will be different from this one:


The beta version is to be a locally fabricated quick turnaround prototype so that I get something to work up the software side until the PCB for the first batch is manufactured. The production cape may include termination resistors or a 74LVC322245 buffer.

The capacitive touch side is simple. Two I2C pins and an interrupt pin to inform of touches. Turns out that the LCD uses only 47 out of the 50 signals and I can squeeze these three lines into the same FPC as the display using an extension cable and adapter PCB. So I’ve done it this way as can be seen above. The Linux kernel already contains a driver for the ft5x06 in drivers/input/touchscreen/edt_ft5x06.c . So getting the touch for the LCD should just be equivalent to writing some device tree code to invoke the module.

That was a long post. The next posts would feature testing of the capacitive touch panel and of the prototype.

The BeagleLogic cape: Assembly and Testing

A day before Christmas, I was greeted by a yellow DHL packet from Hong Kong. The panels had finally arrived after 8 days – I had chosen rush (48hrs turnaround) and express shipping, and it did work out well as I received them before Christmas. The boards turned out to be nice. Not dirty at all and at a price which was reasonable to me.


Ready for assembly
Before assembly – the board with applied solder paste and a few parts

I reflowed the board with a hot air gun. Applied solder paste with a toothpick and placed the parts with the naked eye using a pair of good quality tweezers. The passives are all 0603 size (1.5mm x 0.76mm). Since I was yet to receive the batch of BSS138 FETs I ordered, I used a BC847 NPN transistor for the build instead. Reflowing left a couple of solder bridges on the 74LVCH16T245 (now onwards referred to as ‘the 245’) buffer chip (smallest pitched part on board – 0.5mm) which I removed using a solder wick. Then soldered the pin headers manually.

Cleaned up after soldering using nail polish remover. It’s terribly inefficient, and I plan to get a bottle of concentrated isopropyl alcohol (IPA) at some point of time. Here’s the finished result:

“Smoke Test”

A colloquial reference to the first power-on of the assembled circuit. I plugged the cape into the BeagleBone Black and powered on the assembly. The LEDs lit up and the board booted. No smoke or burning smell. Woohoo, test passed, or …


I then probed the BeagleBone P8 header pins which serve as inputs to BeagleLogic using an LED, and did the same with the input pin headers. The LED were brighter on the inputs to the ‘245 than the P8 pins, which was contrary to expectations. A quick check on the schematics and turns out that I wired the 245 to translate from rail B to rail A while connecting the inputs to rail A and the Bone pins to rail B. Whoops! However I quickly resolved the problem by lifting pads 1 and 24 (DIR1 and DIR2) off the PCB (which was simple as those were on the ends) and soldered a bit of magnet wire through and to one of the pads of C1 going to +3.3V. Neatly done, and a lesson not to design and send off PCBs to a fab while pulling an all-nighter 😛 .


Once this was done all was back on track and the circuit worked as expected. The cape as designed did not interfere with the boot-up process due to the action of the transistor pull-down that ensured that the buffer did not drive the P8 pins until SYS_RESETn signal was high.

I tested it with SPI signals upto 24 MHz using the BeagleBoard itself and results are good at a 100Msps sample rate. I could also subject the input pins to 5V freely, without fear that I would burn the board. That’s exactly what the cape is meant for, and I’m happy with the results.

I am planning to give away the surplus boards from the first batch so anyone interested in testing it out can get in touch and depending upon where you are, I may be able to send you one of the surplus unpopulated boards (no parts included) which can be assembled yourself (1 IC, 4 passives, 1 transistor and pin headers). The Cape EEPROM section can be left unpopulated without affecting the functionality.

The design files are now available here. I made some changes to the design after fixing the errata, so the cape version is bumped to 1.1 .

Suggestions and feedback on the cape are welcome. The design is Open Hardware, so you have the freedom to use it and improve it as you like. Let me know if there’s anything that could be added in the cape as there is plenty of board real estate.

Introducing: The BeagleLogic Cape

BeagleLogic Cape - 3D Render

After coding up the BeagleLogic project, I thought that it would be great to have an add-on cape for the project that provides buffering and also makes the inputs of the BeagleBone Black tolerant to TTL logic voltage levels (up to 5.5V) allowing BeagleLogic to debug external projects with ease. Hence introducing the BeagleLogic cape, the 3D render of which you can see above. The design is done in KiCad.

The design source and gerbers will be made available on the BeagleLogic GitHub repository after I physically assemble and verify the design.

Design & Layout

The cape design is simple enough to just have a single layer layout, as you can see in the render above the top layer is entirely a ground plane but for a single trace. Since the top isn’t much populated I added useful information on the top silkscreen including indexing the pin headers on the Bone on both sides.

The logic channels are accessed via 2×14 right angled pin headers. The upper row of headers are the actual logic channels while the bottom row is all GND pins. The pin headers are arranged in a MSB-to-LSB fashion. This means that the rightmost pin when viewed from the top is raw bit 0 of the captured logic samples. Note that sigrok will use the names of the actual Bone pins so bit 0 (Channel 1) is to be identified as P8_45, bit 1 (Ch2) is P8_46 and so on. The numbering is a little non-obvious but it’s because that’s the way the pins are arranged on the BeagleBone GPIO header. But don’t worry as the cape lists the pin ID of each logic channel so you don’t have to look it up in the pin diagrams.

One important point here. Only the first 12 channels can be used by default. To use the last two channels, you must disable eMMC first and solder 0R resistors or bridge the two resistors R8 and R9 on the bottom side to enable them. Otherwise the buffer will drive those two pins and you will damage the eMMC of the board and also void the warranty.

Here’s a shot of the schematic (click to enlarge). This is for reference only with respect to the current board and the released schematic may or may not be the same

Cape Schematic

The active buffer is a TI 74LVCH16T245 or equivalent. The buffer is powered from the VDD_3V3B power rail. The OE pin initially pulled is driven using an arrangment of a BSS138 N-MOSFET whose gate is connected to SYS_RESETn of the Bone. This should ensure that the logic input pins, which are also the system boot pins, are not driven by the buffer until the startup has completed.

This version of the design has a 0R resistor through which the VDD_3V3B powers the VDDA side rail of the 16245. If you remove the short and connect it to a 1.8V supply it should become compatible with 1.8V logic levels. I am however thinking of a better solution to the problem and should address this in the next released design.

There’s the officially required cape EEPROM on the bottom side as well, I presume this could be rendered redundant as the community moves towards the Universal Cape concept. But the footprints are there, just in case.


The first prototype cape has been manufactured by DirtyPCBs.com as a 2-layer Black 10x10cm protopack. It has been shipped as of the time of writing and should reach me next week. I ordered the boards as a Rush order (48h turnaround time) and got it shipped via DHL so that I could have the boards in hand before Christmas rush. I would be using their services further if the boards work out well, looking forward to receive them!

Since I had left space on the panel, and there’s free panelizing so I managed to squeeze some more of my designs into the panel and make the best use of the available real estate. I would write more about those in the coming posts.

So that’s pretty much it. Design suggestions are welcome, and I’ll see if they can be accomodated in the subsequent hardware revisions. Once I test and it all works, the design files will be made available as I have written above.