Author Archive for Allan Wang

28
Mar

Adventures in VLSI: D flip-flop with asynchronous set/reset

D flip-flop with asynchronous set/reset (top), compared to a D-FF without set/reset (bottom). The major change is replacing the inverters with NAND gates.

14
Feb

Adventures in VLSI: edge-triggered D flip-flop layout

The next cell is an edge-triggered D flip-flop. Baker’s book introduces the transmission gate version first and from my simulations it seems to work at higher frequencies and use less power than the standard NAND gate implementation.

Here’s the initial layout. It is 190 λ x 105 λ ≈ 20,000 λ² (λ is half the drawn gate length, Ldrawn)

After some tweaking, now it’s 154 λ x 105 λ ≈ 16,200 λ²

With some more iterations, I’ve kept the size the same but now only two metal layers are used, freeing up vertical routing resources. Also, the power rails are now on M1 instead of higher metal, so that power vias don’t block routing.

12
Feb

Adventures in VLSI: 4:1 and 16:1 multiplexer layout

I’m doing some digital VLSI design without too much experience. First up is a 4:1 mux implemented using transmission gates:

And this is why using the autoplacer/autorouter for small cells is a bad idea:

After getting some protips, I re-layed it out. It is 230 λ x 82 λ = 18,860 λ² (λ is half the drawn gate length, Ldrawn)

And here’s the 16:1 mux in 914 λ x 175 λ ≈ 160,000 λ²

07
Feb

Current starved ring oscillator VCO for PDN characterization

As part of my ongoing research project, I designed a pretty simple wideband VCO in a 45nm process without too much effort. This will clock a bank of inverters to simulate a digital load at different clock frequencies to characterize the power supply impedance as seen on-die. Of course, a single current impulse is sufficient to characterize the entire impedance curve across all frequencies, but this requires precise knowledge of the current profile and its di/dt, which is difficult to measure on a real chip. I’m using this approach because it is the one described in Altera’s paper, On-Chip PDN Noise Characterization and Modeling. What will actually be measured is more like VDD droop vs. clock frequency, to which impedance is only loosely related.

It would be best to implement a scheme like On-Die Power Supply Noise Measurement Techniques, but I only have a limited amount of time to design this chip and space is limited.

Here are its pre-layout specs:
VDD = 0.9V, 3 stages.
Tuning range 500MHz – 9GHz for Vtune = 0.1V to 0.9V
Current consumption @ 0.9V, 27C: 60uA @ 500MHz, 280uA @ 9GHz
Current consumption @ 1.0V, 100C: 130uA @ 1.9GHz, 382uA @ 11.1GHz
Does not oscillate for smaller values of Vtune.
Poor temperature stability, especially at low frequencies.
Duty cycle is not 50%, especially at lower frequencies.

A series of flip-flops will follow the output, dividing the output clock by 2 at each stage. This restores the duty cycle to 50%. The frequency dividers work up to 20GHz.
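
To see why a divide-by-2 stage fixes the duty cycle, here is a minimal behavioral sketch in C (not the actual circuit). It assumes an ideal stage that toggles on every rising input edge, and the function name and sample-based clock model are made up for illustration. Because the output changes only on rising edges, it stays high for exactly one input period and low for the next, however lopsided the input is.

```c
/*
 * Behavioral model of an ideal divide-by-2 stage (hypothetical helper).
 * The input clock is sampled period_samples times per cycle and is high
 * for the first high_samples of them (0 < high_samples < period_samples).
 * Returns how many samples, out of one full output period
 * (2 * period_samples), the divided output spends high.
 */
unsigned divided_high_samples(unsigned period_samples, unsigned high_samples)
{
    unsigned out = 0;        /* divider output state */
    unsigned prev = 0;       /* previous input sample, for edge detection */
    unsigned high_count = 0;

    for (unsigned t = 0; t < 2 * period_samples; t++) {
        unsigned in = (t % period_samples) < high_samples;
        if (in && !prev)
            out = !out;      /* rising edge: toggle the output */
        prev = in;
        high_count += out;
    }
    return high_count;
}
```

With a 20% input duty cycle (`divided_high_samples(10, 2)`), the output is high for 10 of 20 samples, i.e. 50%.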

The thermal time constant will be very short due to the test chip being small, so the poor temperature performance is not too bad for characterizing PDN impedance. The generated frequency will be measured so it does not matter if it varies with total power dissipation. However, it is probably nice to have it be more stable. I’ll have to add a temperature-insensitive bias circuit.

28
Jan

XMOS SPI speed

On the ARM Cortex-M3 96MHz mbed, I was able to SPI write out 3456 bits of data at 1.6kHz — plus some integer math and memory look-up time. How fast is the XMOS, where each core is clocked at 400MHz, four times the speed of the mbed? Here’s my XMOS code, which does SPI writes in 32-bit chunks. There is no hardware peripheral specifically for SPI, so you have to make use of their buffered I/O ports.

It came in at 2.5kHz (108 transfers of 32 bits), which isn’t that much faster than my mbed code. The mbed ran its SPI clock at 16MHz, but the XMOS code runs at 12.5MHz, so the comparison isn’t 100% fair. Below is the main code that writes out data. Note that the XMOS has to do some extra work for each transfer. Reconfiguring the port seems to take around 1 us, which is a waste of time.

Of course, the majority of the time is spent in the serial transfer itself. If there were absolutely no overhead, the maximum transfer rate would have been 12.5MHz / 3456 = 3.62kHz. The measured 2.5kHz is around 30% below that, overhead that comes from the nature of XMOS buffered I/O and SPI mode 0.

void send_word(unsigned int data) {
	//check MSB of data and set MOSI line accordingly
	//skip check if you need SPI mode 3, and set clock pattern to 0xFFFFFFFF

	if (data & 0x80000000)
		configure_out_port(mosi, blk2, 1);
	else
		configure_out_port(mosi, blk2, 0);

	//reverse the bit order so the data goes out MSB first;
	//the MSB itself was already driven on MOSI above, so shift it off
	mosi <: (bitrev(data) >> 1);

	//clock pattern
	sclk <: 0x55555555;
	sclk <: 0x55555555;
	// wait until clocked output operation is finalized.
	sync(sclk);
}
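
As a rough check of the transfer-rate arithmetic above, the following C sketch computes the ideal frame rate and the average extra time per 32-bit word implied by a measured rate. The function names are hypothetical, and the model simply assumes all lost time is spread evenly over the words.

```c
/* Ideal frames per second if the SPI clock never paused between words. */
double ideal_frame_rate_hz(double sclk_hz, unsigned bits_per_frame)
{
    return sclk_hz / (double)bits_per_frame;
}

/*
 * Average extra time per word, in microseconds, implied by a measured
 * frame rate. Assumes the overhead is spread evenly across all words.
 */
double overhead_per_word_us(double sclk_hz, unsigned bits_per_frame,
                            unsigned words_per_frame, double measured_hz)
{
    double ideal_s = (double)bits_per_frame / sclk_hz;   /* pure shifting time */
    double measured_s = 1.0 / measured_hz;               /* observed frame time */
    return (measured_s - ideal_s) / (double)words_per_frame * 1e6;
}
```

Plugging in 12.5MHz, 3456 bits, 108 words and the measured 2.5kHz gives an ideal rate of about 3.62kHz and a little over 1.1 us of overhead per word, which is in line with the roughly 1 us port reconfiguration time noted above.
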
20
Jan

RGB Propeller Clock – Final Day

What can 2 ECE’s, a MechE and a CS major do in only 5 days of hacking? This is a project done for CMU’s hack-a-thon session called Build18. The LED boards were previously designed by me but all other work was done in the week-long session.

96 RGB LED’s driven by 12x TLC5947 LED drivers. 12-bit PWM resolution per color channel per LED.
Diameter: 1.5 feet
Full brightness: 6 amps of LED current
Didn’t have time to figure out how to transfer power through the motor, so it is battery powered by 2x 4AA NiMH cells which have the perfect discharge curve for what the LED drivers need.

Don’t forget the obligatory Jimmy Wales face:

Flickering is due to interaction between camera frame rate and update rate of the propeller clock and is not that bad in real life. Thanks to all my group members who have done the majority of the work!

Costas Akrivoulis – Junior ECE, Coding, mounting, lots of stuff

Jitu Das – Junior CS, algorithms for display and pre-processing of images

John Howland – Senior MechE, designed the mount and all mechanical parts — the hardest part for us ECE majors

Allan Wang – Senior ECE, Designed LED boards, hardware

Many more technical details will be posted shortly after the dust settles from demo day.

16
Jan

RGB Propeller Clock: Day 1/5 — mbed Benchmarking

Start from Day 0: Introduction here

I’ll first start off with some speed tests to see the maximum refresh rate achievable with the mbed, which has a 96 MHz ARM Cortex-M3 processor.

Interfacing the MCU to the TLC5947s is interesting because the TLC5947s run off 5V, but the MCU runs off 3.3V. The TLC5947 datasheet also states the maximum clock is 15 MHz when cascaded. I tested with 2 cascaded sticks (8x TLC5947), and found I can clock SPI to 24 MHz (maximum speed supported by the MCU)! With 3 cascaded sticks, only 16 MHz works. This is still pretty amazing because the MCU output is driving a large capacitive load with mismatched logic levels. From looking at the scope captures, the difference in logic high amplitude between 16 MHz and 24 MHz is very small, yet 16 MHz works and 24 MHz fails. Even dialing down the TLC5947 supply voltage doesn’t fix anything until it’s brought to around 3.6V. All of this isn’t a huge deal, because SPI communication is at most 5% of the total time spent in each cycle.

The test is to cycle through all the hues for each LED, displaying a rainbow across the stick. Therefore a conversion must be done from HSV space to RGB space. Then to display each pixel, a gamma correction calculation must be performed for each color, since the human senses are logarithmic. This involves doing an exponential and a few floating point operations. How fast can the ARM perform these operations?

There are 3 sticks cascaded, for a total of 96 RGB LED’s — 288 pixels with 12-bit resolution. My optimized code refreshes at 256 Hz, which is great for a stationary display, but probably a bit on the low side for a rapidly moving one, like using it for a propeller clock. I noticed a few tricks that improved performance by about 20% total:

  1. Explicitly cast all constant floating point numbers into floats. Otherwise the C compiler will default them to doubles and the ARM will perform double calculations, which is slower.
  2. Turn division of floating point numbers into multiplication.
  3. When possible, use integers as an operand for floating point math. For example, if hue is a float, hue / 60 is actually faster than hue * 0.0166667.
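
The first and third tricks come down to C's type promotion rules, which the small sketch below makes visible. The helper names are made up for illustration. `sizeof` does not evaluate its operand, it only reports the type the expression would have. An unsuffixed literal such as `0.0166667` is a `double`, so multiplying a `float` by it promotes the whole operation to `double`, while an `f` suffix or an integer operand keeps it in `float`. That may be why `hue / 60` beat `hue * 0.0166667` here, though the post does not confirm it.

```c
#include <stddef.h>

/* float * double literal: the result type is double */
size_t size_with_double_literal(float hue)
{
    return sizeof(hue * 0.0166667);
}

/* float * float literal (note the f suffix): the result type stays float */
size_t size_with_float_literal(float hue)
{
    return sizeof(hue * 0.0166667f);
}

/* float / int: the int is converted to float, the result type stays float */
size_t size_with_int_operand(float hue)
{
    return sizeof(hue / 60);
}
```

On a typical Cortex-M3 toolchain the first function returns 8 and the other two return 4.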

My messy benchmark code can be found here. P.S. the mbed interrupt API is really neat and simple to use.

I’m sure removing the floating point code entirely will greatly speed up the refresh rate, but this is harder to develop. Another thing to try is to use a lookup table to save the use of the exp() function and a floating point multiply. Just tried the LUT and got 600 Hz — pretty good improvement.
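
A minimal sketch of the lookup-table idea, not the original benchmark code. It assumes an 8-bit input level and a 12-bit output, and it uses a simple square law (gamma of 2) as a stand-in for the actual exp()-based curve, which the post does not give. The table is filled once at startup, so the per-channel cost drops to a single array read.

```c
#include <stdint.h>

#define GAMMA_LUT_SIZE 256

static uint16_t gamma_lut[GAMMA_LUT_SIZE];

/*
 * Fill the table once at startup.
 * Assumed curve: out = 4095 * (level / 255)^2, rounded to nearest.
 * Swap in whatever correction curve the display really needs.
 */
void gamma_lut_init(void)
{
    for (uint32_t i = 0; i < GAMMA_LUT_SIZE; i++) {
        gamma_lut[i] = (uint16_t)((i * i * 4095u + 32512u) / 65025u);
    }
}

/* Per-channel correction is now a table read instead of floating point math. */
uint16_t gamma_correct(uint8_t level)
{
    return gamma_lut[level];
}
```

The table costs 512 bytes of RAM. Marking it `const` and precomputing it offline would move it to flash instead.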

Then I converted the HSV to RGB code into pure integer math. This got it to 1590 Hz! Maybe the XMOS won’t be needed after all.
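
For reference, here is one common way to do a rainbow hue sweep in pure integer math. It is a sketch under stated assumptions, not the code from the benchmark: saturation and value are fixed at maximum, the hue runs from 0 to 1535 (six sectors of 256 steps), and the function name is made up.

```c
#include <stdint.h>

/*
 * Integer-only hue to RGB at full saturation and value.
 * hue: 0..1535, six sectors of 256 steps each (larger values wrap).
 */
void hue_to_rgb(uint16_t hue, uint8_t *r, uint8_t *g, uint8_t *b)
{
    uint16_t h = (uint16_t)(hue % 1536u);
    uint8_t sector = (uint8_t)(h >> 8);     /* 0..5 */
    uint8_t rise = (uint8_t)(h & 0xFFu);    /* ramps 0 -> 255 within a sector */
    uint8_t fall = (uint8_t)(255u - rise);  /* ramps 255 -> 0 within a sector */

    switch (sector) {
    case 0:  *r = 255;  *g = rise; *b = 0;    break; /* red -> yellow */
    case 1:  *r = fall; *g = 255;  *b = 0;    break; /* yellow -> green */
    case 2:  *r = 0;    *g = 255;  *b = rise; break; /* green -> cyan */
    case 3:  *r = 0;    *g = fall; *b = 255;  break; /* cyan -> blue */
    case 4:  *r = rise; *g = 0;    *b = 255;  break; /* blue -> magenta */
    default: *r = 255;  *g = 0;    *b = fall; break; /* magenta -> red */
    }
}
```

There is no division or floating point per pixel, only a shift, a mask and a subtraction (the `% 1536` can be dropped if the caller keeps hue in range).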

Tomorrow we’ll give the XMOS dev board a try, which has 1600 MIPS, over 16x the speed of the mbed.

15
Jan

RGB Propeller Clock: Day 0/5 — Introduction

As part of my school’s electronics hacking/tinkering session called Build18, I’ll finally be spinning the RGB sticks I designed over two years ago. Starting this week on Sunday Jan. 15, groups will only have 5 days to complete their project. This “annual engineering festival” is a great way for ECE students of all years to participate in designing a practical embedded system with help from their peers.

This is the inspiration for the project:

I believe this is one of the best ones out there right now. But note that only 8 colors are possible — each R, G, B pixel can only be off or totally on. My RGB sticks use the TLC5947 to achieve 12-bit resolution per color, for 4096 viewable levels. However, this greatly complicates the controller. The refresh rate must be very high since the LED’s are in rapid motion, but adding resolution will greatly slow down the update rate.

Power will be transmitted through the spinning motor to the RGB sticks. I’m going to try to spin as many of the chained sticks as possible, each one adding 8″ to the diameter of the propeller clock. I have no idea if the mechanics of this will work, or even if modifying the motor to transmit power will work. I don’t think we’ll get DC power out of the motor due to the three brushes alternating contacts, but I could be entirely wrong. If all else fails, I’ll just bruteforce it and just spin 4x AA batteries and live with the few hour battery life :)

To communicate with the microcontroller on the spinning device, a pair of nRF2401A transceivers will be used. In the ground frame, there will be an additional microcontroller hooked up to sensors and controlled by a laptop which will send high-level command signals to the spinning device.

For maximum refresh speed and awesomeness, we’ll try using the extreme overkill XMOS XC-1A dev board, which has an XS1-G4 micro, capable of operating at 1600 MIPS with high parallelism, perfect for generating data to send to shift registers really fast. If this ends up being too complicated, the fallback plan is to use an mbed, which has an ARM Cortex-M3 running at a measly 96 MIPS. Each RGB stick has 32x RGB LED’s, and the shift registers are 12-bits, so there are 1152 bits to update per stick per refresh. Displaying an intensity correctly involves doing some math to perform gamma curve correction, or resorting to using a 4096×12 bit lookup table per pixel, which can be slow.
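
To make the 1152-bit figure concrete: 32 LED’s x 3 colors x 12 bits = 1152 bits, or 144 bytes per stick. The C sketch below packs 12-bit channel values into a byte stream, two values per three bytes. The MSB-first bit order and the channel ordering are assumptions made for illustration, so check the TLC5947 datasheet before relying on them. `pack12` is a hypothetical helper name.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Pack count 12-bit values into out, MSB first, two values per three bytes.
 * count must be even; out must hold count * 3 / 2 bytes.
 * Returns the number of bytes written.
 */
size_t pack12(const uint16_t *values, size_t count, uint8_t *out)
{
    size_t n = 0;
    for (size_t i = 0; i + 1 < count; i += 2) {
        uint16_t a = values[i] & 0x0FFFu;
        uint16_t b = values[i + 1] & 0x0FFFu;
        out[n++] = (uint8_t)(a >> 4);                        /* a[11:4] */
        out[n++] = (uint8_t)(((a & 0x0Fu) << 4) | (b >> 8)); /* a[3:0], b[11:8] */
        out[n++] = (uint8_t)(b & 0xFFu);                     /* b[7:0] */
    }
    return n;
}
```

Packing the 96 channel values of one stick yields 144 bytes, which is the 1152 bits mentioned above.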

I’ll be posting progress updates throughout the week. Day 1: mbed Benchmarking

15
Aug

Yet another PCB business card (with touchpad!), part 1

I thought I’d jump on the PCB business card bandwagon and make one of my own. Except mine is going to be cooler and more functional, obviously ;) . Microchip has PIC microcontrollers in the $2 – $3 range that have 28 (!) capacitive inputs. This is big enough for a 16×12 matrix touchpad, for example. Also, they have a great app note on touchpad design and software algorithms for touch detection. It seems like using mutual capacitance is the way to go, allowing for multi-touch. My plan is using the PCB copper itself, making it very cheap and not reliant on an external touchpad component, which is difficult to obtain in small quantities. This is easy enough to do on a business-card sized PCB.

However, what use is a touchpad sensor if there’s no way to display information to the user? I was inspired by EEVBlog Dave’s uCalc design, which sandwiches a coin cell battery in between two PCB’s. Not only does this allow an extremely thin final product, but it makes use of the 3rd dimension to add more features! I can put LED’s on the bottom PCB and shine their light through the top PCB so that the user can see it. Something neat would be where a touch causes an outward ripple effect, or something.

My design will have two PCB’s: one on top, silkscreened with the usual business card info. All the components will be surface-mounted on the bottom surface of this top PCB, so that the top surface is entirely flat. The bottom one will have a grid of LED’s, arranged in a matrix. The top PCB will be very thin, 0.8mm, and the soldermask color will be white. Hopefully this will allow the light from the bottom LED’s to shine through the top. Of course, I won’t be able to use a regular LED matrix due to height reasons. Instead I’ll have to solder all the 60+ LED’s manually… To communicate between the two boards, a low profile, 1.8mm height SMT connector from Samtec is used.

Now, a coin cell battery is nominally 3.0V, but drops to 2.5 – 2.8V in operation. This is enough to light up red, orange, or yellow LED’s, but green and blue LED’s require over 3V to light up. So, I have to decide on whether to use two batteries. The single battery version is only 3.6mm thick (note that a credit card is 0.8mm thick). The stacked version is 4.8mm thick. I could try to use two batteries but space them out so they don’t need to stack, but then I lose the nice rectangular touchpad shape I have.

The difficult part will be controlling a matrix of LED’s from the PIC, which is relatively slow at 4 MIPS. Since my design will be low current, I’ve decided on using 74HC595 shift registers to both source and sink current for the matrix rows and columns. I’d like to be able to do PWM dimming of the LED matrix as well. From a quick prototype, it takes around 50us to update one row (the majority of the time is spent deciding what to send to the shift registers, not the communications themselves). At a target update frequency of 100Hz and 8 rows to update, I can only get 16 levels of PWM dimming, using up 64% of the processor, which is unfortunate… However I don’t really see a better way of doing this. I may end up getting an ATtiny or ATmega slave to do the LED matrix control since they are 16 MIPS. This would free up processing time for the capacitive sensing algorithms, which aren’t trivial anyway.
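
The 64% figure can be reproduced with a quick back-of-the-envelope model, sketched here in C. It assumes the simplest PWM scheme, where every dimming level needs one full pass over all rows per refresh. The post does not spell out the scheme, so treat this as an illustration; `matrix_cpu_load` is a hypothetical name.

```c
/*
 * Fraction of CPU time (0.0 to 1.0) spent refreshing the LED matrix,
 * assuming one row update per row per PWM step per refresh.
 */
double matrix_cpu_load(double row_update_s, unsigned rows,
                       unsigned pwm_levels, double refresh_hz)
{
    return row_update_s * (double)rows * (double)pwm_levels * refresh_hz;
}
```

With 50us per row, 8 rows, 16 levels and 100Hz this gives 0.64. The same model shows that 32 levels would need more than 100% of the processor at that refresh rate.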

The total costs are less than $20, including PCB costs (thanks cheap prototyping manufacturer Seeed Studio). Once the boards come back in a few weeks I’ll post some pics and some initial results. For now, here’s a preview of the top board:



06
May

CMOS Op-Amp Design

Here’s a two-stage op-amp I designed for an undergraduate class. Fun times.

Parameter              | Spec       | Vcm = -0.1V | Vcm = 0V    | Vcm = 0.1V
Open Loop DC Gain      | > 5000 V/V | 5130 V/V    | 5074 V/V    | 4996 V/V
Unity Loop Gain Freq.  | > 10 MHz   | 10.89MHz    | 11.0MHz     | 11.07MHz
Loop Gain Phase Margin | > 60°      | 60.48°      | 60.25°      | 60.09°
Input Offset Voltage   | < 3 mV     | 11.91uV     | -3.943uV    | -15.89uV
Slew Rate Positive     | > 10 V/us  | 56.24 V/us  | 59.34 V/us  | 56.11 V/us
Slew Rate Negative     | < -10 V/us | -10.2 V/us  | -10.18 V/us | -10.17 V/us
Output Swing Low       | -300mV     | -598mV      | -586mV      | -545mV
Output Swing High      | 300mV      | 497mV       | 557mV       | 554mV

 

Current          | 96.26 uA
Power            | 115.5 uW
Area             | 46.68 um^2
Compensation Cap | 0.13 pF
Figure of Merit  | 1.427