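J*******i Posts: 2162 | 1 I need to find the performance bottleneck in a program, so I have to profile it and see how much time
is spent in each function.
I haven't done much of this before. Which tool is the easiest to use and still accurate? How about gprof?
Thanks in advance, everyone~ |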
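T*******i Posts: 4992 | 2 AQTime
[Quoting J*******i] : I need to find the performance bottleneck in a program, so I have to profile it and see how much time is spent in each function. : I haven't done much of this before. Which tool is the easiest to use and still accurate? How about gprof? : Thanks in advance, everyone~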
|
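w***g Posts: 5958 | 3 gprof is basically useless for C++. I have used VTune before and it mostly did the job. I don't know how AQTime compares.
[Quoting J*******i] : I need to find the performance bottleneck in a program, so I have to profile it and see how much time is spent in each function. : I haven't done much of this before. Which tool is the easiest to use and still accurate? How about gprof? : Thanks in advance, everyone~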
|
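h***i Posts: 1970 | 4 Google profiler
[Quoting J*******i] : I need to find the performance bottleneck in a program, so I have to profile it and see how much time is spent in each function. : I haven't done much of this before. Which tool is the easiest to use and still accurate? How about gprof? : Thanks in advance, everyone~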
|
m***i Posts: 2480 | 5 To find which functions use most of the time (hot functions), use gprof.
The instrumented program (built with -pg) writes a profile data file (gmon.out) when it runs, and you can
use gprof to generate a report from it later on.
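A minimal sketch of that workflow, assuming GCC on Linux. The function names and the file name `demo.cpp` are made up for illustration, and the build commands in the comments assume you add a `main()` that calls `run_workload()`.

```cpp
// Hypothetical workload for trying out gprof; every name here is illustrative.
// Assumed workflow (GCC on Linux), given a main() that calls run_workload():
//   g++ -pg demo.cpp -o demo    # build with profiling instrumentation
//   ./demo                      # a normal exit writes gmon.out
//   gprof ./demo gmon.out       # print the flat profile and call graph
#include <cassert>
#include <cstdint>

// Deliberately expensive: sums (i * i) % 7 for i in [0, n).
std::uint64_t hot_function(std::uint64_t n) {
    std::uint64_t acc = 0;
    for (std::uint64_t i = 0; i < n; ++i) {
        acc += (i * i) % 7;
    }
    return acc;
}

// Cheap by comparison; it should barely register in the report.
std::uint64_t cold_function(std::uint64_t n) {
    return n % 7;
}

// Entry point of the workload to be profiled.
std::uint64_t run_workload(std::uint64_t n) {
    assert(n > 0);
    return hot_function(n) + cold_function(n);
}
```

With a large `n`, the flat profile should attribute nearly all of the time to `hot_function`. Note that at higher optimization levels the compiler may inline small functions, in which case they no longer appear as separate entries.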
For a particular hot function, if you'd like to know whether it is spending
its time on memory, floating-point, or integer instructions, use Intel VTune.
If most of the time is spent in memory instructions, cache optimizations are
important (tiling, prefetching).
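To illustrate the tiling idea, here is a minimal sketch, assuming a square row-major matrix stored in a `std::vector`. The function names and the default tile size of 32 are my own choices, not something from the post; the best tile size depends on the cache, so it should be measured rather than assumed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Straightforward transpose of an n x n row-major matrix.
// The writes to `out` stride through memory by n elements, which is
// cache-unfriendly once the matrix no longer fits in cache.
std::vector<double> transpose_naive(const std::vector<double>& a, std::size_t n) {
    assert(a.size() == n * n);
    std::vector<double> out(n * n);
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j)
            out[j * n + i] = a[i * n + j];
    return out;
}

// Tiled (blocked) transpose: processes the matrix in tile x tile blocks so
// the source and destination block can both stay in cache while in use.
std::vector<double> transpose_tiled(const std::vector<double>& a, std::size_t n,
                                    std::size_t tile = 32) {
    assert(a.size() == n * n);
    assert(tile > 0);
    std::vector<double> out(n * n);
    for (std::size_t ii = 0; ii < n; ii += tile) {
        for (std::size_t jj = 0; jj < n; jj += tile) {
            // Clamp the block bounds so n need not be a multiple of tile.
            const std::size_t i_end = std::min(ii + tile, n);
            const std::size_t j_end = std::min(jj + tile, n);
            for (std::size_t i = ii; i < i_end; ++i)
                for (std::size_t j = jj; j < j_end; ++j)
                    out[j * n + i] = a[i * n + j];
        }
    }
    return out;
}
```

Both functions produce the same result; only the order of memory accesses differs, which is the whole point of tiling.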
If there are a lot of floating point instructions, compiling it to 64 bit
binary coul
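[Quoting J*******i] : I need to find the performance bottleneck in a program, so I have to profile it and see how much time is spent in each function. : I haven't done much of this before. Which tool is the easiest to use and still accurate? How about gprof? : Thanks in advance, everyone~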
|
J*******i Posts: 2162 | 6 Thanks a million!
[Quoting m***i] : To find which function is using most of the time (hot functions) use gprof. : It'll create a trace file and you can use gprof to generate a report later : on. : For a particular hot function, if you'd like to know whether it is spending : time on memory / floating point or integer instructions, use intel VTune. : If most of the time is spent in memory instructions, cache optimizations are : important (tiling, prefetching). : If there are a lot of floating point instructions, compiling it to 64 bit : binary coul
|
c*******3 Posts: 21 | 7 On a Mac, I found Shark to be very useful:
http://developer.apple.com/tools/sharkoptimize.html
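[Quoting J*******i] : I need to find the performance bottleneck in a program, so I have to profile it and see how much time is spent in each function. : I haven't done much of this before. Which tool is the easiest to use and still accurate? How about gprof? : Thanks in advance, everyone~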
|
t****t Posts: 6806 | 8 You don't really need 64-bit to enable SSE/2/3. gcc has separate options for
this. 64-bit just enables them by default (as it's also required by the ABI).
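A small sketch of how one might check this, assuming GCC or Clang on x86. The flags in the comments are real GCC options; the file name `check.cpp` and the helper name are illustrative only.

```cpp
// Assumed GCC invocations on x86 (file name is illustrative):
//   g++ -m32 -msse2 -mfpmath=sse check.cpp   # 32-bit build, SSE2 used for floating point
//   g++ -m64 check.cpp                        # x86-64 build, SSE2 available by default

// Returns true when the compiler was allowed to emit SSE2 instructions
// for this translation unit. GCC and Clang define __SSE2__ in that case.
constexpr bool sse2_enabled_at_compile_time() {
#if defined(__SSE2__)
    return true;
#else
    return false;
#endif
}
```

Printing or asserting on this value in a 32-bit build with and without `-msse2` shows whether the option took effect, without needing a 64-bit target.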
[Quoting m***i] : To find which function is using most of the time (hot functions) use gprof. : It'll create a trace file and you can use gprof to generate a report later : on. : For a particular hot function, if you'd like to know whether it is spending : time on memory / floating point or integer instructions, use intel VTune. : If most of the time is spent in memory instructions, cache optimizations are : important (tiling, prefetching). : If there are a lot of floating point instructions, compiling it to 64 bit : binary coul
|
m***i Posts: 2480 | 9
Thanks, good to know.
[Quoting t****t] : You don't really need 64-bit to enable SSE/2/3. gcc have separate option for : this. 64-bit just enable them by default (as it's also required by ABI).
|
z******i Posts: 59 | 10 To be frank, this is one of the most difficult problems.
On Windows:
- On an Intel CPU, Intel VTune.
- On an AMD CPU, AMD's free profiler (I forgot the name, CodeAnalyst?)
- Microsoft's free Kernrate tool
On Windows, if the program is I/O-bound or the problem is not in your own code
(like a graphics driver problem):
- use the profiler in Sysinternals' Process Explorer, which gives you call stacks.
On Linux:
- OProfile
- Valgrind?
If you are almost sure where your problem is:
- just use -- timerstart(
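[Quoting J*******i] : I need to find the performance bottleneck in a program, so I have to profile it and see how much time is spent in each function. : I haven't done much of this before. Which tool is the easiest to use and still accurate? How about gprof? : Thanks in advance, everyone~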
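For the last case in the post above (you already suspect a specific spot), manual timing can be sketched roughly as follows. The post is cut off at `timerstart(`, so this is not a reconstruction of that code; `Stopwatch` and `time_ms` are hypothetical helpers built on the standard `std::chrono::steady_clock`.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical helper: starts timing on construction and reports the
// elapsed wall-clock time on request.
class Stopwatch {
public:
    Stopwatch() : start_(std::chrono::steady_clock::now()) {}

    // Milliseconds elapsed since construction.
    double elapsed_ms() const {
        const auto dt = std::chrono::steady_clock::now() - start_;
        return std::chrono::duration<double, std::milli>(dt).count();
    }

private:
    std::chrono::steady_clock::time_point start_;
};

// Runs `work` once and returns how many milliseconds it took.
template <typename F>
double time_ms(F&& work) {
    Stopwatch sw;
    work();
    return sw.elapsed_ms();
}
```

Usage is a one-liner around the suspect code, for example `double ms = time_ms([&] { suspect_function(); });` where `suspect_function` stands for whatever you want to measure. `steady_clock` is used because it never jumps backwards; for very short regions, time many iterations and divide.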
|
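s****i Posts: 150 | 11 xperf is very good, you can give it a try.
But if you want to profile memory, you're basically out of luck...
[Quoting J*******i] : I need to find the performance bottleneck in a program, so I have to profile it and see how much time is spent in each function. : I haven't done much of this before. Which tool is the easiest to use and still accurate? How about gprof? : Thanks in advance, everyone~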
|
c****e Posts: 1453 | 12 On Windows, XPerf is also a very good tool.
[Quoting z******i] : To be frank, this is one most difficult problem. : If in windows platform: : - If it is Intel CPU, intel vtune. : - If it is AMD CPU, AMD's free profiler (forgot the name, codeanalyst?) : - Microsoft's free kernrate tool : If it is windows platform, IO limited or problem is not in your own code ( : like : graphic driver problem). : - use system internal's process explorer profiler, give u call stacks. : If it is in Linux
|