Optimization is the process of transforming the code to make it more efficient (uses fewer resources) while preserving its semantics (has the same meaning).
In general, during optimization less efficient high-level structures are replaced with more efficient low-level structures. Optimization is usually implemented as a series of optimizing transformations (algorithms). The main goal is usually to make the program run faster, and - to a lesser degree - to use less memory.
Optimization is tipically a very CPU and memory intensive process that can significantly increase compilation times.
With that said, optimization algorithms should be designed with a few important objectives in mind:
- The optimization must be correct, it must not change the semantics of the program
- It should increase the performance of the program
- The running time of the optimizing algorithm must be kept at the minimum, it should not delay the whole compilation process significantly
Optimization can happen at different levels of the compilation process:
- Before the compilation via the programmer, by manually rearranging/rewriting parts of the source code.
- After generating the intermediate representation. This is called machine-independent optimization because it is applied to the IR and thus it does not depend on the target architecture.
- During or after generating the target machine code. It is called machine-dependent optimization because
it depends on the specifics of the target architecture. It generally involves steps to optimize
access to variables by caching certain values in CPU registers and using absolute memory
addresses instead of relative ones.
Examples of optimization steps
There are a lots of optimization techniques, let’s see a couple of examples:
Common subexpression elimination: In an expression
(x * y) + (x * y) / 2the compiler realizes that
x * yis part of the expression multiple times and it will calculate this value only once.
Constant folding: Replacing expressions consisting of only constants with their final value.
5 - 2becomes
3during compilation and will be used everywhere in place of the original expression.
Dead code elimination: Removing the instructions that will not affect the correctness of the program. For example, definitions which are not used. This reduces the final size of the executable and eliminates unnecessary computations.
Loop unrolling: Duplicating the body of a loop multiple times. This increases the size of the code but becomes more efficient due to eliminating testing of the loop condition, and decreasing the number of jumps which would hurt performance by impairing the instruction pipeline. Completely unrolling the loop requires the number of iterations to be known at compile time.
You can find a more comprehensive list on Wikipedia.
Next time I’ll talk about code generation.
Stay tuned. I’ll be back.