Introducing two methods of measuring CPU cycles for DSP OMAP program time consumption

In DSP development we often need to measure how many cycles a function or a section of code consumes. The usual tools, profiling and clock(), are mostly used in simulation. When emulating on the board, results measured that way are less reliable, because they also depend on where the data and the code under test are stored on the board and how long they take to read.

In fact, the C64x+ core has two counting registers, TSCL and TSCH. They run at the CPU frequency and together form a 64-bit counter that increases by 1 on every CPU cycle, so they can accurately measure the cycles the CPU consumes in a given stretch of execution. Generally TSCL alone is enough: at 594 MHz, 32 bits cover about 7 s (2^32 / 594 MHz ≈ 7.2 s). TSCH holds the upper 32 bits and is generally not needed unless the entire project is being timed.

Let's talk about the specific usage.

First, use the linker to place the function under test in L1P and the data it uses in L1D. This eliminates data and instruction transfer time while the code executes (otherwise the measured time includes moving data and instructions from off-chip to on-chip).

Then, before the function or code under test, write to TSCL (for example, write register A0 to it). The write enables the counter and starts it counting; per TI's CPU documentation, the value written is ignored.

Finally, read TSCL at the end of the function or right after the code segment under test. The value read is the number of CPU cycles consumed by the function or code segment.

Remember to restart the CPU before each test. Once started, the counter cannot be stopped or cleared by programming, so the value read in the last step is a clean count only if the counter was first started just before the code under test. The counter stops counting only under two conditions: a. after exiting the reset state, that is, after a restart; b. when the CPU is completely powered down.
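The 7-second figure and the behaviour at wraparound can be checked with a small host-side sketch. It is plain C and does not touch the real registers; the function names wrap_seconds_32bit and elapsed_cycles_32bit are made up for illustration. It assumes the two readings are taken less than one full wrap apart.

```c
#include <stdint.h>

/* Seconds until a free-running 32-bit cycle counter wraps
 * at the given CPU clock frequency in Hz. */
static inline double wrap_seconds_32bit(double cpu_hz)
{
    return 4294967296.0 / cpu_hz;   /* 2^32 cycles */
}

/* Cycles elapsed between two 32-bit counter readings.
 * Unsigned subtraction is modulo 2^32, so the result is still
 * correct if the counter wrapped once between the readings. */
static inline uint32_t elapsed_cycles_32bit(uint32_t start, uint32_t end)
{
    return end - start;
}
```

For a 594 MHz clock, wrap_seconds_32bit(594e6) gives roughly 7.2 s. Anything that may run longer than that needs the full 64-bit TSCH:TSCL value.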
In general, because these two registers sit inside the core and run at the CPU frequency, timing with them is very accurate; even the consumption of the compressed instruction package fpread statement (1 cycle) is taken into account. It is especially effective for testing handwritten assembly, where it can even show clearly how many cycles an instruction is delayed.

Usage:

Long-time, wide-range clock measurement:

    #include <stdio.h>
    #include <c6x.h>   /* declares TSCL, TSCH and _itoll() */

    unsigned long long t1, t2;
    t1 = _itoll(TSCH, TSCL);
    code_wait_test;
    t2 = _itoll(TSCH, TSCL);
    printf("#cycle=%llu", t2 - t1);   /* 64-bit value needs %llu, not %d */

Short-time (7 seconds), narrow-range clock measurement:

    unsigned int t1, t2;
    t1 = TSCL;
    ...process code ...
    t2 = TSCL;
    printf("#cycle=%u", t2 - t1);

Method 2: you can also use the BIOS API:

    LgUns time1 = CLK_gethtime();
    ...process code ...
    LgUns time2 = CLK_gethtime();
    /* CLK_cpuCyclesPerHtime() returns the CPU cycles per high-resolution tick */
    unsigned int cpucycles = (unsigned int)((time2 - time1) * CLK_cpuCyclesPerHtime());
    printf("#cycle=%u", cpucycles);
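For readers without the TI toolchain at hand, the arithmetic behind the snippets above can be sketched in portable C. This is only an illustration: combine_hi_lo mirrors what _itoll(hi, lo) produces (a 64-bit value built from two 32-bit halves), and htime_to_cycles mirrors the Method 2 multiplication. Both function names, and the factor 4.0 used in the note below, are hypothetical and not taken from any TI header.

```c
#include <stdint.h>

/* Build a 64-bit count from the upper and lower 32-bit halves,
 * the same value _itoll(hi, lo) yields on the target. */
static inline uint64_t combine_hi_lo(uint32_t hi, uint32_t lo)
{
    return ((uint64_t)hi << 32) | lo;
}

/* Convert two high-resolution time readings into CPU cycles.
 * cycles_per_htime stands in for the factor the BIOS reports;
 * the caller supplies it here. */
static inline uint64_t htime_to_cycles(uint32_t time1, uint32_t time2,
                                       float cycles_per_htime)
{
    uint32_t ticks = time2 - time1;   /* modulo 2^32, wrap-safe once */
    return (uint64_t)((double)ticks * cycles_per_htime);
}
```

With an assumed factor of 4.0, readings of 100 and 350 ticks correspond to 1000 CPU cycles. The 64-bit form also makes differences across the TSCL rollover straightforward: combine_hi_lo(1, 0) minus combine_hi_lo(0, 0xFFFFFFFF) is 1.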

This post is from DSP and ARM Processors
 
