Engineering Mechanics

International Conference

Proceedings Vol. 22 (2016)


ENGINEERING MECHANICS 2016

22nd INTERNATIONAL CONFERENCE
May 9 – 12, 2016, Svratka, Czech Republic
Editors: Igor Zolotarev and Vojtěch Radolf

Copyright © 2016 Institute of Thermomechanics, Academy of Sciences of the Czech Republic, v.v.i., Prague

ISBN 978-80-87012-59-8 (printed)
ISSN 1805-8248 (printed)
ISSN 1805-8256 (electronic)


ON PARALLELIZATION OF ASSEMBLY OPERATIONS IN FINITE ELEMENT SOFTWARE
Bošanský M., Patzák B.
pages 87 - 92

Current developments in computer hardware bring new opportunities in numerical modelling. Computers with a single processing unit, where only one instruction can be processed at any moment in time, allow simulation codes to run only sequentially. The performance of single processing units is reaching physical limits imposed by transmission delays and heat build-up on silicon chips. The future of scientific computing seems to lie in parallel computing, which makes it possible to overcome the limitations of traditional sequential processing units. Parallel computing is based on the simultaneous use of multiple processing units. Its fundamental paradigm is the decomposition of work into pieces that can be processed simultaneously. This contribution focuses on the parallelization of sparse matrix and global vector assembly operations, which are typical of any finite element code. The aim of the presented work is to propose an alternative approach to the assembly operation, based on decomposing the work into independent element groups whose members can be processed concurrently without blocking operations. Each group contains elements that contribute to distinct entries in the sparse matrix or global vector. The decomposition is obtained using a colouring algorithm. Because the elements in a group contribute to distinct locations, there is no need to guard against the race condition that can occur when the same location is updated simultaneously. It is only necessary to enforce synchronization before processing each element group. The efficiency of the implemented approach is compared with an approach based on decomposition of the assembly loop using OpenMP directives and POSIX threads with explicit locking of the updated locations in the sparse matrix or global vector, which was published by the authors in (Bosansky & Patzak, 2016a) and (Bosansky & Patzak, 2016b).
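
The colouring-based assembly described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the colouring algorithm, so a simple greedy colouring is assumed here, and Python threads stand in for the OpenMP and POSIX threads used in the paper (they show the structure of the scheme, not its speed-up). The names `colour_elements`, `group_by_colour`, `assemble_vector`, `element_dofs` and `element_vectors` are hypothetical, and only the global vector case is shown.

```python
from concurrent.futures import ThreadPoolExecutor


def colour_elements(element_dofs):
    """Greedy colouring (an assumed choice): elements that share a
    degree of freedom (DOF) never receive the same colour."""
    dof_colours = {}  # DOF -> colours already used by elements touching it
    colours = []
    for dofs in element_dofs:
        used = set()
        for d in dofs:
            used |= dof_colours.get(d, set())
        c = 0
        while c in used:  # pick the smallest colour not used by a neighbour
            c += 1
        colours.append(c)
        for d in dofs:
            dof_colours.setdefault(d, set()).add(c)
    return colours


def group_by_colour(colours):
    """Collect element indices into one list per colour."""
    groups = {}
    for e, c in enumerate(colours):
        groups.setdefault(c, []).append(e)
    return [groups[c] for c in sorted(groups)]


def assemble_vector(element_dofs, element_vectors, n_dofs, workers=4):
    """Assemble a global vector group by group, without any locking."""
    f = [0.0] * n_dofs
    groups = group_by_colour(colour_elements(element_dofs))

    def scatter(e):
        # Elements of one group write to distinct DOFs, so no lock is needed.
        for d, v in zip(element_dofs[e], element_vectors[e]):
            f[d] += v

    with ThreadPoolExecutor(max_workers=workers) as pool:
        for group in groups:
            # Draining the map waits for the whole group to finish,
            # which synchronizes the threads between element groups.
            list(pool.map(scatter, group))
    return f


# Hypothetical example: a chain of four two-node elements on five DOFs.
element_dofs = [(0, 1), (1, 2), (2, 3), (3, 4)]
element_vectors = [(1.0, 1.0)] * 4
colours = colour_elements(element_dofs)
f = assemble_vector(element_dofs, element_vectors, 5)
```

In this example neighbouring elements share a node, so they alternate between two colours, and each colour group can be scattered into the global vector concurrently. The same grouping would apply to sparse matrix assembly, since elements that share no DOF also share no matrix entry.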



Text and facts may be copied and used freely, but credit should be given to these Proceedings.

All papers were reviewed by members of the scientific committee.

