Document Type

Thesis

Date of Award

Spring 5-31-2012

Degree Name

Master of Science in Computer Engineering - (M.S.)

Department

Electrical and Computer Engineering

First Advisor

Sotirios Ziavras

Second Advisor

Roberto Rojas-Cessa

Third Advisor

Edwin Hou

Abstract

Matrix multiplication is at the core of high-performance numerical computation. Software methods for accelerating it fall into two categories: those that simplify the calculation by reducing the number of arithmetic operations, and those that increase memory access efficiency. Matrix multiplication can also be accelerated in hardware using vector processors. In this investigation, various matrix multiplication algorithms and a vector-based hardware acceleration method are analyzed and compared in terms of performance and memory requirements. Results are presented for Intel platforms and Xilinx FPGA platforms. They show that when the CPU is fast, Goto's algorithm runs faster than Strassen's algorithm because data access speed is the bottleneck. Conversely, when the CPU is slow, Strassen's algorithm runs faster because computational complexity becomes the key factor. The results also show that SIMD platforms, such as an Intel Xeon with SIMD extensions and an in-house developed VP (Vector co-Processor) for an FPGA, can accelerate matrix multiplication substantially. The VP is even shown to run faster than MKL (Intel's optimized Math Kernel Library), because it not only takes advantage of larger vector lengths but also minimizes inherent hardware overheads.
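
The two software categories named in the abstract can be illustrated with a minimal sketch. It is not the thesis code. The function names (`matmul_naive`, `matmul_blocked`, `strassen`) and the `block` and `cutoff` parameters are hypothetical and chosen for illustration only. The blocked version shows only the general tiling idea behind memory-efficient approaches such as Goto's algorithm, not Goto's algorithm itself (which also packs panels and uses tuned kernels). The Strassen version assumes square matrices whose size is a power of two.

```python
def matmul_naive(a, b):
    """Reference triple loop: n*m*p multiplications."""
    n, m, p = len(a), len(b), len(b[0])
    return [[sum(a[i][k] * b[k][j] for k in range(m)) for j in range(p)]
            for i in range(n)]


def matmul_blocked(a, b, block=2):
    """Memory-oriented variant: same arithmetic, visited tile by tile
    so that each tile of A and B is reused while it is still in cache."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0] * p for _ in range(n)]
    for i0 in range(0, n, block):
        for k0 in range(0, m, block):
            for j0 in range(0, p, block):
                for i in range(i0, min(i0 + block, n)):
                    for k in range(k0, min(k0 + block, m)):
                        aik = a[i][k]
                        for j in range(j0, min(j0 + block, p)):
                            c[i][j] += aik * b[k][j]
    return c


def _add(x, y):
    return [[u + v for u, v in zip(r, s)] for r, s in zip(x, y)]


def _sub(x, y):
    return [[u - v for u, v in zip(r, s)] for r, s in zip(x, y)]


def strassen(a, b, cutoff=2):
    """Computation-oriented variant: 7 recursive products per level
    instead of 8. Assumes square inputs with power-of-two size."""
    n = len(a)
    if n <= cutoff:
        return matmul_naive(a, b)
    h = n // 2
    # Split both operands into quadrants.
    a11 = [r[:h] for r in a[:h]]
    a12 = [r[h:] for r in a[:h]]
    a21 = [r[:h] for r in a[h:]]
    a22 = [r[h:] for r in a[h:]]
    b11 = [r[:h] for r in b[:h]]
    b12 = [r[h:] for r in b[:h]]
    b21 = [r[:h] for r in b[h:]]
    b22 = [r[h:] for r in b[h:]]
    # The seven Strassen products.
    m1 = strassen(_add(a11, a22), _add(b11, b22), cutoff)
    m2 = strassen(_add(a21, a22), b11, cutoff)
    m3 = strassen(a11, _sub(b12, b22), cutoff)
    m4 = strassen(a22, _sub(b21, b11), cutoff)
    m5 = strassen(_add(a11, a12), b22, cutoff)
    m6 = strassen(_sub(a21, a11), _add(b11, b12), cutoff)
    m7 = strassen(_sub(a12, a22), _add(b21, b22), cutoff)
    # Recombine the products into the quadrants of C.
    c11 = _add(_sub(_add(m1, m4), m5), m7)
    c12 = _add(m3, m5)
    c21 = _add(m2, m4)
    c22 = _add(_add(_sub(m1, m2), m3), m6)
    top = [r1 + r2 for r1, r2 in zip(c11, c12)]
    bottom = [r1 + r2 for r1, r2 in zip(c21, c22)]
    return top + bottom
```

The blocked routine performs exactly the same multiplications as the naive one and differs only in traversal order, while Strassen's recursion trades multiplications for extra additions and temporary storage. This is the trade-off that the abstract reports as depending on whether data access or computation is the bottleneck.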

Recommended Citation

Li, Gang, "High-performance matrix multiplication on Intel and FPGA platforms" (2012). Theses. 136.
https://digitalcommons.njit.edu/theses/136

Included in

Computer Engineering Commons
