Diese Präsentation wurde erfolgreich gemeldet.
Wir verwenden Ihre LinkedIn Profilangaben und Informationen zu Ihren Aktivitäten, um Anzeigen zu personalisieren und Ihnen relevantere Inhalte anzuzeigen. Sie können Ihre Anzeigeneinstellungen jederzeit ändern.

Gcc opt

1.764 Aufrufe

Veröffentlicht am

Veröffentlicht in: Technologie
  • Als Erste(r) kommentieren

Gcc opt

  1. 1. The lesser known GCC optimizations Mark Veltzer mark@veltzer.net
  2. 2. Using profile information for optimization
  3. 3. The CPU pipe line ● Most CPUs today are built using pipelines ● This enables the CPU to clock at a higher rate ● This also enables the CPU to use more of it's infrastructure at the same time and so utilizing the hardware better. ● It also speed things up because computation is done in parallel. ● This is even more so in multi-core CPUs.
  4. 4. The problems with pipe lines ● The pipe line idea is heavily tied to the idea of branch prediction ● If the CPU has not yet finished certain instructions and sees a branch following them then it needs to be able to guess where the branch will go ● Otherwise the pipeline idea itself becomes problematic. ● That is because the execution of instructions in parallel at the CPU level is halted every time there is a branch.
  5. 5. Hardware prediction ● The hardware itself does a rudimentary form of prediction. ● The assumption of the hardware is that what happened before will happen again (order) ● This is OK and hardware does a good job even without assistance. ● This means that a random program will make the hardware miss-predict and will cause execution speed to go down. ● Example: "if(random()<0.5) {"
  6. 6. Software prediction ● The hardware manufacturers allow the software layer to tell the hardware which way a branch could go using hints left inside the assembly. ● Special instructions were created for this by the hardware manufacturers ● The compiler decides whether to use hinted branches or non hinted ones. ● It will use the hinted ones only when it can guess where the branch will go. ● For instance: in a loop the branch will tend to go back to the loop.
  7. 7. The problem of branching ● When the compiler sees a branch not as part of a loop (if(condition)) it does not know what are the chances that the condition will evaluate to true. ● Therefor it will usually use a hint-less branch statement. ● Unless you tell it otherwise. ● There are two ways to tell the compiler which way the branch will go.
  8. 8. First way - explicit hinting in the software ● You can use the __builtin_expect construct to hint at the right path. ● Instead of writing "if(x) {" you write: "if(__builtin_expect((x),1)) {" ● You can wrap this in a nice macro like the Linux kernel folk did. ● The compiler will plant a hint to the CPU telling it that the branch is likely to succeed. ● See example
  9. 9. Second way - using profile information ● You leave your code alone. ● You compile the code with -fprofile-arcs. ● Then you run it on a typical scenario creating files which show which way branches went (auxname.gcda for each file). ● Then you compile your code again with -fbranch- probabilities which uses the gathered data to plant the right hints. ● In GCC you must compile the phases using the exact same flags (otherwise expect problems) ● See example
  10. 10. Second way - PGO ● This whole approach is a subset of a bigger concept called PGO – Profile Generated Optimizations ● This includes the branch prediction we saw before but also other types of optimization (reordering of switch cases as an example). ● This is why you should use the more general flags -fprofile-generate and -fprofile-use which imply the previous flags and add even more profile generated optimization.