Abstract - Parallel programs were previously implemented on complex parallel computers, so only a small number of software developers wrote programs for them: those who understood the application domain and had the resources and skill to program such platforms. Most general-purpose computers today have multicore processors and are therefore parallel in architecture. Hence, software developers can cost-effectively implement multicore, multithreaded parallel applications and gain the benefits of this computing power. This study explores shared-memory multicore, multithreaded programming with OpenMP on small-scale computers. Laptops running the Unix- and Linux-based platforms Mac OS X and Ubuntu are used to run a popular regular program, matrix multiplication, to show the improvement in performance. OpenMP design issues are emphasized, including threading the parallelized loop, loop scheduling and partitioning, and sharing and declaring data or memory. This can motivate software developers to implement multicore, multithreaded parallel programs on smaller-scale systems and advance to specialized hardware as requirements grow.
Keywords: Multicore, multithread, OpenMP, shared memory programming, small-scale system
1 Introduction
Multicore and multithreaded processing is the prevailing trend in parallel computing today, a consequence of the ubiquity of multicore processors in general-purpose computers. Newer and faster multicore, multithreading processor designs make processing power more affordable and manageable for novice software and application developers creating programs from scratch. Many shared-memory parallel algorithms have been tested on complex parallel architectures. In [1], the authors implemented parallel Gaussian elimination on an IBM RS/6000 SP machine. The authors in [2] discussed a parallel matrix-multiplication algorithm on the Intel Server System SR1670HV and the Intel Server System SR1600UR. [3] implemented the Genehunter genetic program on Compaq Alpha Tru64 systems. [15] implemented a parallel Boundary Integral Method on a Sequent Symmetry 5000 SE30. Most of these implementations claimed improved performance, but none of the algorithms can clearly be claimed to achieve optimal performance. Hence, developers have reason to move from complex and expensive parallel computers to small-scale computers, which are less complex and easier to manage. Running parallel applications on these general-purpose computers is effortless compared to using a parallel computer. Advanced scientific and engineering communities have long used parallel computing to solve large-scale problems on parallel computers, but these scientists find it hard to implement parallel applications effectively due to the complexity of the...
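For illustration, a minimal sketch of an OpenMP-parallelized matrix multiplication of the kind examined in this study is given below. It is not the paper's exact source code; the matrix dimension N and the static schedule are illustrative assumptions.

/* Minimal sketch (illustrative only): OpenMP matrix multiplication C = A * B.
 * N and schedule(static) are assumptions, not the paper's configuration. */
#include <omp.h>
#include <stdio.h>

#define N 1024  /* assumed matrix dimension */

int main(void)
{
    static double A[N][N], B[N][N], C[N][N];

    /* Initialize A and B with arbitrary values. */
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++) {
            A[i][j] = i + j;
            B[i][j] = i - j;
        }

    double start = omp_get_wtime();

    /* Thread the outer loop: A, B, C are shared, loop indices are private;
     * schedule(static) partitions the rows evenly across the threads. */
    #pragma omp parallel for schedule(static) shared(A, B, C)
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++) {
            double sum = 0.0;
            for (int k = 0; k < N; k++)
                sum += A[i][k] * B[k][j];
            C[i][j] = sum;
        }

    printf("Elapsed: %f s with %d threads\n",
           omp_get_wtime() - start, omp_get_max_threads());
    return 0;
}

A program of this form can be compiled with an OpenMP-enabled compiler (for example, gcc -fopenmp) on both Ubuntu and Mac OS X with a GCC toolchain, and the thread count can be varied through the OMP_NUM_THREADS environment variable to observe the performance behavior discussed in this paper.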