I'd recommend using the Hatari debugger's profiling functionality instead:
* It doesn't add overhead the way -pg does, and it's more accurate (on ST/STE it's cycle-accurate, because Hatari's emulation of those machines is cycle-accurate)
* It can give much more information (callgraphs for instruction counts & cycle counts, cycle counts per memory address...)
* You don't need to recompile anything, binaries just need to be unstripped (have debug symbols)
See: https://hg.tuxfamily.org/mercurialroot/ ... #Profiling
To see function-level information, it's enough to compile binaries with debug symbols, as the Hatari v2.1 debugger automatically loads GCC's a.out and Atari DRI/GST debug symbols from the binaries.
The main thing to be aware of with function-level profiling info is the quality of the debug symbol data. E.g. some MiNTlib versions strip debug symbols from functions that matter for performance, in which case their noticeable perf costs get assigned to the functions preceding them instead. And if loops have labels (debug symbols), call/visit counts may cause some confusion. If in doubt, all these issues can be resolved by looking at the profiler disassembly output (this assumes you're familiar with what function calls look like at the m68k assembly level).
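
To illustrate why stripped symbols shift costs onto the preceding function, here is a minimal Python sketch. It is not Hatari's actual code: it only assumes that each profiled address gets attributed to the nearest symbol at or below it. The function name `attribute_cycles`, the addresses and the cycle counts are all made up for the example.

```python
from bisect import bisect_right
from collections import defaultdict


def attribute_cycles(symbols, cycles_by_address):
    """Sum per-address cycle counts under the nearest preceding symbol.

    symbols: dict of symbol name -> start address (hypothetical data)
    cycles_by_address: dict of address -> cycles spent at that address
    """
    # Sort symbols by address so we can binary-search for the owner.
    starts = sorted((addr, name) for name, addr in symbols.items())
    addrs = [addr for addr, _ in starts]

    totals = defaultdict(int)
    for addr, cycles in cycles_by_address.items():
        # Index of the last symbol whose start address is <= addr.
        i = bisect_right(addrs, addr) - 1
        name = starts[i][1] if i >= 0 else "<unknown>"
        totals[name] += cycles
    return dict(totals)


# Made-up memory layout: memcpy sits between main and draw.
full = {"main": 0x1000, "memcpy": 0x1100, "draw": 0x1200}
# Same layout, but with the memcpy symbol stripped away.
stripped = {name: addr for name, addr in full.items() if name != "memcpy"}

# Made-up profile: most cycles are spent inside memcpy's address range.
samples = {0x1004: 50, 0x1108: 900, 0x1204: 120}

with_symbols = attribute_cycles(full, samples)
without_symbols = attribute_cycles(stripped, samples)

print(with_symbols)
print(without_symbols)
```

With the memcpy symbol present, its 900 cycles are reported under memcpy. With it stripped, the same cycles get added to main, which merely happens to precede it in memory. That is the kind of mismatch the disassembly output lets you spot.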