Performances of Methods for Solving a Linear System of Equations in the Architecture of GPU Accelerator in Case of Small Matrices
Keywords:
GPU accelerator, MAGMA, linear algebra operation, LU factorization, small matrix, performance, batched computation, Random Butterfly Transformation.
Abstract
Algebraic operations on large numbers of small matrices arise in many scientific applications. Solving linear systems of equations by LU factorization is a typical example of such operations. In this work we consider the performance of methods for solving linear systems of equations with batched LU factorization for small complex matrices on the NVIDIA K40c graphics processor. Three versions of batched LU factorization for small matrices are presented: with partial pivoting, without pivoting, and with Random Butterfly Transformations. We show which of these versions is the most effective and in which cases high performance is achieved.
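As an illustration of the difference between the pivoted and non-pivoted variants named above, the following is a minimal CPU sketch in pure Python. It is not the MAGMA implementation and does not run on a GPU. The function names (`lu_factor`, `lu_solve`, `solve_batched`) are hypothetical and chosen only for this example, and the Random Butterfly Transformation variant is not covered.

```python
def lu_factor(a, pivot=True):
    """LU-factor one small square matrix given as a list of rows.

    Returns (lu, perm): lu holds the unit-lower L below the diagonal and U on
    and above it; perm[i] is the original index of the row now in position i.
    """
    n = len(a)
    lu = [row[:] for row in a]
    perm = list(range(n))
    for k in range(n):
        if pivot:
            # Partial pivoting: bring the largest |entry| of column k to the diagonal.
            p = max(range(k, n), key=lambda i: abs(lu[i][k]))
            if p != k:
                lu[k], lu[p] = lu[p], lu[k]
                perm[k], perm[p] = perm[p], perm[k]
        if lu[k][k] == 0:
            raise ZeroDivisionError("zero pivot in column %d" % k)
        for i in range(k + 1, n):
            lu[i][k] /= lu[k][k]
            for j in range(k + 1, n):
                lu[i][j] -= lu[i][k] * lu[k][j]
    return lu, perm


def lu_solve(lu, perm, b):
    """Solve A x = b from the factors returned by lu_factor."""
    n = len(lu)
    # Permute the right-hand side, then forward-substitute with unit-lower L.
    x = [b[perm[i]] for i in range(n)]
    for i in range(n):
        for j in range(i):
            x[i] -= lu[i][j] * x[j]
    # Back-substitute with U.
    for i in reversed(range(n)):
        for j in range(i + 1, n):
            x[i] -= lu[i][j] * x[j]
        x[i] /= lu[i][i]
    return x


def solve_batched(mats, rhs, pivot=True):
    """Solve many independent small systems; on a GPU each would map to its own thread group."""
    return [lu_solve(*lu_factor(a, pivot), b) for a, b in zip(mats, rhs)]


if __name__ == "__main__":
    # A batch of two small complex systems (illustrative values only).
    mats = [
        [[0j, 1 + 0j], [1 + 0j, 0j]],          # zero leading entry: needs pivoting
        [[4 + 1j, 1 + 0j], [2 + 0j, 3 - 1j]],  # well conditioned without pivoting
    ]
    rhs = [[2 + 0j, 3 + 0j], [1 + 0j, 1j]]
    print(solve_batched(mats, rhs, pivot=True))
```

The first matrix in the usage example has a zero in its leading position, so the non-pivoted variant fails on it while the pivoted variant succeeds. This is the kind of stability cost that skipping pivoting trades for speed.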
License
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.