Don't let that CPU sit idle: Hardware-aware heterogeneous general matrix multiplication (GEMM)

dc.contributor.author	Warawa, Johnathan
dc.contributor.author	Chester, Sean
dc.date.accessioned	2026-04-20T21:43:30Z
dc.date.available	2026-04-20T21:43:30Z
dc.date.issued	2026
dc.description.abstract	Matrix multiplication is a fundamental operation used to train neural networks for machine learning. GPUs are well optimized for several stages of this operation and are thus used to accelerate the work. However, GPUs must be "hosted" by CPUs that remain underutilized while the GPU works, wasting cycles that a more sophisticated, heterogeneous algorithm could put to use by running on both the GPU and the CPU at the same time. In this project, we increase the speed of these operations beyond what a single processor could accomplish by developing a heterogeneous algorithm that efficiently divides and interleaves these operations.
dc.description.reviewstatus	Reviewed
dc.description.scholarlevel	Undergraduate
dc.description.sponsorship	Jamie Cassels Undergraduate Research Awards (JCURA)
dc.identifier.uri	https://hdl.handle.net/1828/23644
dc.language.iso	en
dc.publisher	University of Victoria
dc.subject	GPU
dc.subject	SIMD
dc.subject	heterogeneous computing
dc.subject	GEMM
dc.subject	linear algebra
dc.subject	tensor cores
dc.subject	Jamie Cassels Undergraduate Research Awards (JCURA)
dc.subject.department	Department of Computer Science
dc.title	Don't let that CPU sit idle: Hardware-aware heterogeneous general matrix multiplication (GEMM)
dc.type	Poster
