Don't let that CPU sit idle: Hardware-aware heterogeneous general matrix multiplication (GEMM)
| dc.contributor.author | Warawa, Johnathan | |
| dc.contributor.author | Chester, Sean | |
| dc.date.accessioned | 2026-04-20T21:43:30Z | |
| dc.date.available | 2026-04-20T21:43:30Z | |
| dc.date.issued | 2026 | |
| dc.description.abstract | Matrix multiplication is a fundamental operation used to train neural networks for machine learning. GPUs are well-optimized for several stages of this operation and are thus used to accelerate the work. However, GPUs must be "hosted" by CPUs that remain underutilized while the GPU works, burning cycles that could be put to use by a more sophisticated, heterogeneous algorithm that uses both the GPU and CPU at the same time. In this project, we increase the speed of these operations beyond what a single processor could accomplish by developing a heterogeneous algorithm that efficiently divides and interleaves these operations. | |
| dc.description.reviewstatus | Reviewed | |
| dc.description.scholarlevel | Undergraduate | |
| dc.description.sponsorship | Jamie Cassels Undergraduate Research Awards (JCURA) | |
| dc.identifier.uri | https://hdl.handle.net/1828/23644 | |
| dc.language.iso | en | |
| dc.publisher | University of Victoria | |
| dc.subject | GPU | |
| dc.subject | SIMD | |
| dc.subject | heterogeneous computing | |
| dc.subject | GEMM | |
| dc.subject | linear algebra | |
| dc.subject | tensor cores | |
| dc.subject | Jamie Cassels Undergraduate Research Awards (JCURA) | |
| dc.subject.department | Department of Computer Science | |
| dc.title | Don't let that CPU sit idle: Hardware-aware heterogeneous general matrix multiplication (GEMM) | |
| dc.type | Poster |