Matrix-Vector Multiplication Performances in Multi-Accelerator Architectures

Authors

  • Edita E. Gichunts, Institute for Informatics and Automation Problems of NAS RA

DOI:

https://doi.org/10.51408/1963-0146

Keywords:

Multiple GPUs, MAGMA, Matrix-Vector Multiplication

Abstract

This paper presents implementations and performance evaluations of symmetric and Hermitian matrix-vector multiplication on two Volta 100 graphics processors, in single and double precision. The implementations use the MAGMA library.
The performance of the two-GPU implementations is compared with that of a single-GPU implementation.

Published

2026-06-01

How to Cite

Gichunts, E. E. (2026). Matrix-Vector Multiplication Performances in Multi-Accelerator Architectures. Mathematical Problems of Computer Science, 65, 63–71. https://doi.org/10.51408/1963-0146