USING SOURCE CODE TRANSFORMATION TO AUTOMICALLY OPTIMIZE ALGORITHM-ORIENTED PROGRAMS
Objective
More automated compared to OpenMP. (OpenMP enables multiprocessing but requires manual notations)
Higher-level, multithreaded compared to compiler PGO. (PGO usually focuses on low-level modifications like instructions)
Some other changes including IO/memory/variable scope...
Idea overview
An input program is parsed first to get a graph, each node stands for a possible modification, nodes are connected by edges for dependency and conflicts
A not necessarily feasible random binary vector $m$ is drawn from a distribution $\pi$, each bit in it corresponds to whether a possible modification is applied. It is fixed according the dependency and conflicts information from the graph. Then a modified program can be obtained from $m$
Modified programs are compiled and run, its behaviour is compared to the original program
According to the result, $\pi$ is updated in order to prefer a better solution space
Finally, a best modified candidate is selected for output.
Requirement: a data generator for testing application behaviour.
Technical overview
The optimizer takes C++ source code as input and output processed C++ code and a summerized report
C++ code analysis and modification is based on Clang and LLVM
Both AST visiter and AST matcher are used to match and rewrite code patterns
The whole task is splitted into batches and sent to a PBS system, utilizing nodes from a cluster
Singularity container is used to satisfy GLIBC dependencies and create an isolated environment for untrusted code execution
Around 760 code samples are collected from Codeforces and then used for testing and benchmarking the optimizer
The implementation is mainly in C++, with some Python, Bash, JavaScript and PHP
Results
After the transformation, the running time reduced by 14.5% on average. The baseline (running the original program for all epochs) obtained 2.9% decreasement (due to running time fluctuation).
[1] Stelios Sidiroglou-Douskos, Eric Lahtinen, Fan Long, Martin Rinard:
Automatic Error Elimination by Horizontal Code Transfer across Multiple
Applications. PLDI'15, 2015.
[2] Gábor Antal, Dávid Havas, István Siket, Árpád Beszédes, Rudolf
Ferenc, József Mihalicza: Transforming C++11 Code to C++03 to Support
Legacy Compilation Environments. Source Code Analysis and Manipulation
(SCAM), 2016 IEEE 16th International Working Conference, 2016
[3] Xi Wang, Nickolai Zeldovich, M. Frans Kaashoek, and Armando
Solar-Lezama: Towards Optimization-Safe Systems: Analyzing the Impact of
Undefined Behavior. SOSP'13, 2013
[4] Dehao Chen: Taming hardware event samples for fdo compilation.
CGO '10 Proceedings of the 8th annual IEEE/ACM international symposium
on Code generation and optimization, P. 42-52. 2010
[5] “Clang” CFE Internals Manual
https://clang.llvm.org/docs/InternalsManual.html
[6] Clang doxygen documentation https://clang.llvm.org/doxygen/
[7] GCC, the GNU Compiler Collection https://gcc.gnu.org/
[8] LLVM clang compiler infrastructure https://llvm.org/
[9] Kurtzer GM, Sochat V, Bauer MW: Singularity: Scientific containers
for mobility of compute. PLoS ONE 12(5): e0177459. 2017
[10] Xin Dong, Gene Cooperman, and John Apostolakis: Multithreaded
Geant4: Semi-automatic Transformation into Scalable Thread-Parallel
Software. Euro-Par 2010 - Parallel Processing, P. 287-303. 2010
[11] D. Dheeraj, B. Nitish, Shruti Ramesh: Optimization of Automatic Conversion of Serial C to Parallel OpenMP. Cyber-Enabled Distributed
Computing and Knowledge Discovery (CyberC), 2012 International
Conference. 2012
[12] Gerald Aigner, Urs Hölzle: Proceedings chapter Eliminating virtual
function calls in C++ programs. ECOOP ‘96 Conference Proceedings,
Springer Verlag LNCS 1098, P. 142-166. 1996
[13] Clang performance drop for specific C++ random number
generation https://stackoverflow.com/questions/23240586/
clang-performance-drop-for-specific-c-random-number-generation
[14] Olaf Krzikalla, Kim Feldhoff, Ralph Müller-Pfefferkorn, Wolf-Gang
E. Nagel: A Source-to-Source Transformator for SIMD-Optimizations.
Euro-Par 2011: Parallel Processing Workshops. 2011.
[15] M. Alkhalaf, A. Aydin, and T. Bultan. Semantic differential repair
for input validation and sanitization. International Symposium on Software
Testing and Analysis, ISSTA '14 P. 225–236, 2014.
[16] M. Gabel, J. Yang, Y. Yu, M. Goldszmidt, and Z. Su. Scalable and
systematic detection of buggy inconsistencies in source code. Proceedings of
the 25th Annual ACM SIGPLAN Conference on Object-Oriented
Programming, Systems, Languages, and Applications, OOPSLA 2010, P.
175–190. 2010
[17] F. Logozzo and T. Ball. Modular and verified automatic program
repair: Proceedings of the ACM International Conference on Object Oriented
Programming Systems Languages and Applications, OOPSLA '12',
P.133–146. 2012
[18] S. Misailovic, D. Kim, and M. Rinard: Parallelizing sequential
programs with statistical accuracy tests. ACM Transactions on Embedded
Computing Systems (TECS) Volume 12, issue 2, Article 88. 2013
[19] Kazuhiro Suzuki1, Dongvu Tonien, Kaoru Kurosawa, and Koji Toyota: Birthday Paradox for Multi-collisions Information Security and
Cryptology – ICISC 2006, P. 29-40. 2006