To measure how long routines take and to find bottlenecks in C++ sources (that is, to do profiling), especially if we have enough time, we can try one of the existing profilers: http://stackoverflow.com/questions/67554/whats-the-best-free-c-profiler-for-windows-if-there-are
But a simple and quite powerful alternative is a self-written profiler (suitable for single-threaded code, since it is not thread-safe). Let's create a C++ class that you just insert at the beginning of the functions under test, or even of individual {} scopes, for example:
void f()
{
    Profile me;
    f1();
    f2();
}
void f1()
{
    Profile me;
    ...
}
void f2()
{
    Profile me;
    ...
}
What does it do?
Class Profile:
- measures the time interval between its constructor and destructor calls, and in the destructor saves the result somewhere: to a memory stream or via OutputDebugString, but not to a file!
- counts its instances in order to support indentation in reports.
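#include <windows.h> // LARGE_INTEGER, QueryPerformanceCounter

class Profile
{
    static int st_profile_indent_count; // current nesting depth; must also be defined once in a .cpp file
    LARGE_INTEGER start;                // counter value taken in the constructor
public:
    Profile()
    {
        st_profile_indent_count++;
        ::QueryPerformanceCounter(&start);
    }
    ~Profile()
    {
        st_profile_indent_count--;
        LARGE_INTEGER stop;
        ::QueryPerformanceCounter(&stop);
        OutputIndent(st_profile_indent_count);          // helper not shown here: indents by nesting depth
        OutputTimeSpan(stop.QuadPart - start.QuadPart); // helper not shown here: receives elapsed counter ticks
    }
};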
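OutputTimeSpan is not shown in this post. Keep in mind that the difference it receives is in counter ticks, not in time units; on Windows, QueryPerformanceFrequency reports how many ticks there are per second. Here is a minimal sketch of the conversion. The helper name ticks_to_microseconds and its signature are hypothetical, made up for illustration:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of the original Profile class).
// Converts a tick difference from a high-resolution counter to microseconds.
// `frequency` is the number of ticks per second, e.g. the value that
// QueryPerformanceFrequency reports on Windows.
inline double ticks_to_microseconds(std::int64_t ticks, std::int64_t frequency)
{
    assert(frequency > 0);
    // Work in floating point so that large tick counts do not overflow
    // when multiplied by 1,000,000.
    return static_cast<double>(ticks) * 1e6 / static_cast<double>(frequency);
}
```

Inside OutputTimeSpan you could call it with the tick difference and a frequency obtained once at startup.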
That's it. We can then improve it, for example by adding a comment or a scope name:
void f()
{
    Profile me("f()");
    f1();
    f2();
}
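To try the idea outside Windows, here is a rough portable sketch of a named, scope-based profiler. It is not the class above: the name ScopedProfile is hypothetical, std::chrono::steady_clock is swapped in for QueryPerformanceCounter, and the report goes to an in-memory std::ostringstream. Like the original, it is not thread-safe.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical portable variant of Profile (assumption: steady_clock
// instead of QueryPerformanceCounter, in-memory report, single thread).
class ScopedProfile {
public:
    explicit ScopedProfile(std::string name)
        : name_(std::move(name)), start_(Clock::now())
    {
        ++depth(); // one more live instance means one more nesting level
    }

    ~ScopedProfile()
    {
        const auto stop = Clock::now();
        assert(depth() > 0);
        --depth();
        const auto us =
            std::chrono::duration_cast<std::chrono::microseconds>(stop - start_).count();
        // Indent by nesting depth, two spaces per level.
        report() << std::string(static_cast<std::size_t>(2 * depth()), ' ')
                 << name_ << ": " << us << " us\n";
    }

    ScopedProfile(const ScopedProfile&) = delete;
    ScopedProfile& operator=(const ScopedProfile&) = delete;

    // In-memory report shared by all instances.
    static std::ostringstream& report()
    {
        static std::ostringstream out;
        return out;
    }

private:
    using Clock = std::chrono::steady_clock;

    // Number of ScopedProfile objects currently alive.
    static int& depth()
    {
        static int d = 0;
        return d;
    }

    std::string name_;
    Clock::time_point start_;
};

void f1() { ScopedProfile me("f1()"); }
void f2() { ScopedProfile me("f2()"); }

void f()
{
    ScopedProfile me("f()");
    f1();
    f2();
}
```

After calling f(), ScopedProfile::report().str() holds one line per scope. Inner scopes are listed first, because their destructors run before the destructor of the enclosing scope.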
We can teach Profile to report each scope's share (%) of the total time, for example:
void f()
{
    Profile me("f()"); // text for the trace output
    for (int i = 0; i < 10000000; i++)
    {
        f1();
    }
    f2();
}
It's clear that f1 is the critical function here.
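With ten million calls to f1, printing one line per call is not practical, so a percentage report needs per-name totals. Below is one possible sketch of that bookkeeping. The names ProfileTotals and AccumulatingProfile are hypothetical, and std::chrono::steady_clock is again an assumption standing in for QueryPerformanceCounter. The map lookup on every scope exit adds overhead of its own, so treat the numbers as approximate.

```cpp
#include <cassert>
#include <chrono>
#include <map>
#include <string>
#include <utility>

// Hypothetical accumulator: sums the elapsed microseconds per scope name.
class ProfileTotals {
public:
    void add(const std::string& name, double microseconds)
    {
        totals_[name] += microseconds;
    }

    // Share of `name` in percent, relative to the total recorded for `root`.
    // Throws std::out_of_range if `root` was never recorded.
    double percent_of(const std::string& name, const std::string& root) const
    {
        const double total = totals_.at(root);
        assert(total > 0.0);
        const auto it = totals_.find(name);
        return it == totals_.end() ? 0.0 : 100.0 * it->second / total;
    }

private:
    std::map<std::string, double> totals_;
};

// Hypothetical RAII timer that adds its lifetime to a ProfileTotals object.
class AccumulatingProfile {
public:
    AccumulatingProfile(ProfileTotals& totals, std::string name)
        : totals_(totals), name_(std::move(name)), start_(Clock::now())
    {
    }

    ~AccumulatingProfile()
    {
        const std::chrono::duration<double, std::micro> elapsed = Clock::now() - start_;
        totals_.add(name_, elapsed.count());
    }

    AccumulatingProfile(const AccumulatingProfile&) = delete;
    AccumulatingProfile& operator=(const AccumulatingProfile&) = delete;

private:
    using Clock = std::chrono::steady_clock;

    ProfileTotals& totals_;
    std::string name_;
    Clock::time_point start_;
};
```

In f() you would create an AccumulatingProfile named "f()", do the same in f1() and f2(), and after the run ask for percent_of("f1()", "f()").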
P.S. These are just practical ideas, not instructions. I used this technique on a KLA-Tencor project when optimizing a 3D engine.
There is an enhanced scope-based profiler here: http://code.google.com/p/high-performance-cplusplus-profiler/