Customers complaining about excessive power consumption? Here is how to find the "culprit"
I remember a time when a customer walked into my office with a processor board that was consuming too much power and draining the battery. Since we had proudly claimed that the processor was an ultra-low power device, the burden of proof was on us.
I was about to follow my customary routine of cutting power to the various components on the board one by one until I found the real culprit, when I remembered a similar case from not long ago where the culprit was an LED hanging alone between the power rail and ground, with no current limiting resistor to go with it. Whether the LED eventually failed due to overcurrent or simply because it was bored, I'm not entirely sure, but that's a digression.
Guided by that experience, the first thing I did was check the board for blinking LEDs. Unfortunately, there were no such hopeful signs this time. I also discovered that the processor was the only device on the board, so there was nothing else I could point the finger at. My mood was further dampened by the next piece of information the customer dropped: in lab testing he had found power consumption and battery life to be at the expected levels, but once the system was deployed in the field, the battery drained quickly. Problems like this are the hardest to troubleshoot because they are so difficult to reproduce in the first place. They bring the unpredictability and challenges of the analog world into the digital world, which is usually a predictable, simple place of 1s and 0s.
In the simplest terms, processor power consumption has two main components: core and I/O. When I suspect the core, I check things such as the PLL configuration and clock speed, the core power rail, and the number of operations the core is executing. There are several ways to lower core power consumption, such as reducing the core clock speed or executing instructions that halt the core or put it into sleep or hibernate mode. If I suspect that I/O is eating up the power, I focus on the I/O power supply, the I/O switching frequency, and the load being driven.
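As a rough illustration of why those three I/O factors matter, the switching power of a CMOS output driving a capacitive load is commonly approximated as P = N × C × V² × f. The sketch below applies that approximation. The pin count, load capacitance, rail voltages, and toggle rate are made-up values for illustration, not figures from the customer's board.

```python
def io_dynamic_power(c_load_farads, v_io_volts, f_toggle_hz, n_pins=1):
    """Approximate CMOS switching power: P = N * C * V^2 * f.

    f_toggle_hz is the number of full charge/discharge cycles per second
    on each pin; static leakage is ignored in this approximation.
    """
    return n_pins * c_load_farads * v_io_volts ** 2 * f_toggle_hz


# Hypothetical example: 16 pins, 15 pF load each, toggling at 10 MHz.
p_at_3v3 = io_dynamic_power(15e-12, 3.3, 10e6, n_pins=16)
p_at_1v8 = io_dynamic_power(15e-12, 1.8, 10e6, n_pins=16)

print(f"3.3 V I/O rail: {p_at_3v3 * 1e3:.2f} mW")
print(f"1.8 V I/O rail: {p_at_1v8 * 1e3:.2f} mW")
```

Because the voltage term is squared, lowering the I/O rail has a larger effect than an equal percentage reduction in load or switching frequency.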
These were the only two areas I could investigate. It turned out that the problem had nothing to do with the core, so it had to be I/O related. At this point, the customer mentioned that he was using the processor purely for computation and had very little I/O activity. In fact, most of the I/O interfaces available on the device were not being used.
"Wait! There are I/Os you're not using? You mean those I/O pins are unused? How did you connect them?"
"I didn't connect them to anything!"
"I see!"
It was an ecstatic moment: I had finally found the problem. Although I didn't run around screaming, it took me a while to get over my excitement and sit down to explain it to him.
A typical CMOS digital input looks like this:
Figure 1. Typical CMOS input circuit (left) and CMOS level logic (right)
When this input is driven to the recommended high (1) or low (0) level, the PMOS and NMOS FETs turn on one at a time, never both together. Between those levels lies a zone of uncertainty in the input voltage, called the "threshold region," where the PMOS and NMOS may both be partially on at the same time, creating a leakage path between the supply rail and ground. This can happen when the input is left floating and picks up stray noise. It explains both why the power dissipation on the customer's board was high and why the high dissipation occurred at random.
Figure 2. Both the PMOS and NMOS are partially turned on, creating a leakage path between power and ground.
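The threshold region can be expressed as a simple classification of the input voltage against the data sheet limits. The sketch below is only an illustration: the limits (0.8 V maximum for a valid low, 2.0 V minimum for a valid high) are assumed values for a 3.3 V input, and the sampled voltages on the floating pin are invented.

```python
def input_region(v_in, v_il_max, v_ih_min):
    """Classify an input voltage as a valid low, a valid high, or in between."""
    if v_in <= v_il_max:
        return "low"
    if v_in >= v_ih_min:
        return "high"
    return "threshold"  # both FETs may be partially on: leakage path


# Assumed limits for a 3.3 V CMOS input (check the actual data sheet).
V_IL_MAX = 0.8
V_IH_MIN = 2.0

# Invented samples of a floating pin drifting with stray noise.
floating_pin_samples = [0.1, 1.4, 3.2, 1.9, 0.9]

regions = [input_region(v, V_IL_MAX, V_IH_MIN) for v in floating_pin_samples]
fraction_leaking = regions.count("threshold") / len(regions)

print(regions)
print(f"fraction of samples in the threshold region: {fraction_leaking:.0%}")
```

A pin tied to a valid level would report only "low" or only "high"; a floating pin can land anywhere, which is why the extra current appears intermittently.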
In some cases, this can lead to a condition like latch-up, where the device keeps drawing excessive current and eventually burns out. That is arguably easier to find and fix, because the smoke coming from the device is a literal smoking gun. The problem my customer reported was harder to deal with because the board behaved well when tested in a cool lab environment but caused a lot of trouble once it was in the field.
Now that we know the source of the problem, the obvious solution is to drive all unused inputs to a valid logic level (high or low). However, there are some subtleties to be aware of. Let's look at a few more cases where improper handling of CMOS inputs can cause trouble. We need to broaden the scope and consider not only inputs that are completely disconnected/floating, but also inputs that appear to be connected to appropriate logic levels.
If you tie the pin to the supply rail or ground through a resistor, pay attention to the value of the pull-up or pull-down resistor. Together with the current the pin sources or sinks, it can shift the actual pin voltage to an undesired level. In other words, make sure the pull-up or pull-down is strong enough, meaning its resistance is low enough.
If you choose to drive the pin actively, make sure the drive strength is sufficient for the CMOS load. If it is not, noise around the circuit may be strong enough to overwhelm the driving signal and force the pin into an unintended state.
A processor that works fine in the lab may inexplicably reboot in the field because noise couples into a RESET line whose pull-up resistor is not strong enough.
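The effect of the resistor value can be checked with Ohm's law: a pin that sinks a current I through a pull-up R sits at VDD − I × R. The sketch below is a minimal illustration under assumed numbers (3.3 V rail, 2.0 V minimum valid high, 10 µA of pin current, 0.5 V of desired noise margin); real values must come from the device data sheet.

```python
def pulled_up_voltage(v_dd, r_pullup_ohms, i_pin_amps):
    """Pin voltage when the pin sinks i_pin_amps through a pull-up to v_dd."""
    return v_dd - i_pin_amps * r_pullup_ohms


def max_pullup_resistance(v_dd, v_ih_min, i_pin_amps, margin_volts=0.0):
    """Largest pull-up that still keeps the pin at v_ih_min plus a margin."""
    return (v_dd - v_ih_min - margin_volts) / i_pin_amps


# Assumed values for illustration only.
V_DD = 3.3        # supply rail, volts
V_IH_MIN = 2.0    # minimum valid high level, volts
I_PIN = 10e-6     # current sunk by the pin, amps
MARGIN = 0.5      # desired noise margin above V_IH_MIN, volts

r_max = max_pullup_resistance(V_DD, V_IH_MIN, I_PIN, MARGIN)
print(f"maximum pull-up: {r_max / 1e3:.0f} kOhm")

for r in (10e3, 100e3):
    v_pin = pulled_up_voltage(V_DD, r, I_PIN)
    ok = v_pin >= V_IH_MIN + MARGIN
    print(f"{r / 1e3:.0f} kOhm -> {v_pin:.2f} V, margin met: {ok}")
```

With these assumed numbers the weaker resistor still produces a valid high but leaves less margin than requested, which is the kind of subtlety that only shows up once noise is present.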
Figure 3. Noise coupled into the RESET pin with a weak pull-up resistor can cause the processor to reset.
Now imagine that the CMOS input belongs to a gate driver controlling a high-power MOSFET or IGBT, and the switch accidentally turns on when it should be off. The result could be catastrophic.
Figure 4. Noise overdrives the weakly driven CMOS input of a gate driver, shorting the high-voltage bus.
Another related but less obvious problem arises when the driving signal has a very slow rise or fall time. The input then lingers at intermediate voltages, inside the threshold region, for part of every transition, and both FETs can conduct during that time.
Figure 5. A slowly rising or falling CMOS input causes a temporary short circuit during the transition.
Now that we have discussed some of the problems that can occur with CMOS inputs in general, it is worth noting that some devices are designed to handle them better than others. For example, devices with Schmitt trigger inputs cope better with noisy signals and slow edges.
Some of our latest processors were designed with this issue in mind: they either take special precautions in the design or come with clear published guidelines to ensure smooth operation. For example, the ADSP-SC58x/ADSP-2158x data sheet clearly states that some pins have internal termination resistors or other logic circuitry to ensure that these pins do not float.
Figure 6. ADSP-SC58x/ADSP-2158x data sheet quick reference
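The advantage of a Schmitt trigger input on a slow, noisy edge can be illustrated with a small simulation. The sketch below is a simplified model, not a description of any particular device: the ramp, the sinusoidal ripple standing in for noise, and all threshold voltages are assumed values.

```python
import math


def slow_noisy_ramp(n_samples=1000, v_dd=3.3, ripple_volts=0.1, ripple_cycles=50):
    """Slow 0 V to v_dd ramp with a deterministic ripple standing in for noise."""
    return [
        v_dd * i / (n_samples - 1)
        + ripple_volts * math.sin(2 * math.pi * ripple_cycles * i / n_samples)
        for i in range(n_samples)
    ]


def count_transitions_single_threshold(samples, v_threshold):
    """Count output changes of an input with one switching threshold."""
    state = samples[0] > v_threshold
    transitions = 0
    for v in samples[1:]:
        new_state = v > v_threshold
        if new_state != state:
            transitions += 1
            state = new_state
    return transitions


def count_transitions_schmitt(samples, v_low, v_high):
    """Count output changes of an input with hysteresis (Schmitt trigger)."""
    state = samples[0] > v_high
    transitions = 0
    for v in samples[1:]:
        if not state and v > v_high:
            state = True
            transitions += 1
        elif state and v < v_low:
            state = False
            transitions += 1
    return transitions


ramp = slow_noisy_ramp()
plain_count = count_transitions_single_threshold(ramp, v_threshold=1.65)
schmitt_count = count_transitions_schmitt(ramp, v_low=1.2, v_high=2.1)

print(f"single threshold: {plain_count} output transitions")
print(f"Schmitt trigger:  {schmitt_count} output transition(s)")
```

In this model the single-threshold input toggles several times as the ripple carries the signal back and forth across the threshold, while the input with hysteresis switches once, because the ripple is smaller than the gap between its two thresholds.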
Finally, as always, it pays to tie up all the loose ends, especially when those loose ends are CMOS digital inputs.