Results of the Work of the National Center for Supercomputing Applications for 2009




YACS Module


The YACS module allows building, editing and executing calculation schemes. A calculation scheme defines a chain or a coupling of computer codes (SALOME components or calculation components; see General Principles of YACS).

The YACS module was introduced in SALOME series 4x and replaces the obsolete Supervisor module in SALOME series 5x.



General operations:

  • Activate YACS module

  • Import/Export a schema

  • Open/Save a study

  • Set user preferences

  • Select an object

  • Activate context popup menu

  • Set active schema or run of a schema

Modification of a schema:

  • Create an object

  • Edit an object

  • Delete an object

Representation of a schema:

  • Change 2D representation of a schema

  • Auto-arrange schema nodes

  • Rebuild links between nodes

Execution of a schema:

  • Execute a schema

  • Save/Restore execution state


PETSc-FEM

Centro Internacional de Métodos Computacionales en Ingeniería, Argentina

9.88 MB

PETSc-FEM is a general-purpose, parallel, multi-physics FEM (Finite Element Method) program for CFD (Computational Fluid Dynamics) applications, based on PETSc. PETSc-FEM comprises both a library that allows the user to develop FEM (or FEM-like, i.e. non-structured mesh oriented) programs, and a suite of application programs. It is written in C++ with an OOP (Object Oriented Programming) philosophy, with efficiency in mind. PETSc-FEM may run in parallel using the MPI standard on a variety of architectures.



Theoretical Modeling

In this section a mathematical model for simulating 3D, time-dependent electrokinetic flow and transport phenomena in microchannels is presented. First the fluid mechanics and the basis of electroosmotic flow are discussed, then the mass transport balance is presented. We consider the case of microchannel networks filled with an aqueous strong electrolyte solution.



Governing Equations

Electrokinetic effects arise when there is a movement of the liquid, relative to some solid wall, caused by the migration of ions under the effect of an electric field. In the framework of continuum fluid mechanics, the fluid velocity u, the pressure p and the electric field E are governed by a set of coupled equations (a sketch of their standard form is given at the end of this section).

Equation 1 expresses the conservation of mass for incompressible fluids. Equation 2 (the Navier–Stokes equation) expresses the conservation of momentum for Newtonian fluids of density ρ, viscosity μ and stress tensor σ, subjected to a gravitational field of acceleration g and an electric field of intensity E. The last term on the right-hand side of equation 2 represents the contribution of electrical forces to the momentum balance, where ρ_e = F Σ_k z_k c_k is the electric charge density of the electrolyte solution, obtained as the summation over all type-k ions with valence z_k and molar concentration c_k, and F is the Faraday constant.

Equation 3 (the Poisson equation) establishes the relation between the electric potential and the charge distribution in a fluid of permittivity ε. Here it is relevant to mention that the ion distributions c_k (which enter equations 2 and 3 through ρ_e) must be derived from the Nernst–Planck equation, which accounts for the flux of type-k ions due to electrical forces, fluid convection and Brownian diffusion.

The mass transport of sample ions and buffer electrolyte constituents can be modeled by a linear superposition of migrative, convective and diffusive transport mechanisms plus a reactive term. Considering only strong electrolytes, the reactive term vanishes. Thus, in a non-stationary mode, the present work considers for the type-j species the transport equation 4 sketched below, which governs the molar concentration c_j of the type-j species in the electrolyte solution. In equation 4, D_j is the diffusion coefficient, Ω_j is the mobility, and F is the Faraday constant. Therefore, once the velocity u and the electric field E are obtained from equations 1-3, the molar concentration profile c_j of the different j species is derived from equation 4.
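
A sketch of the standard form of equations 1-4, consistent with the description above, is given below; the electric potential φ and the mobility symbol Ω_j are notation adopted here, and a Newtonian stress tensor is assumed:

\begin{align}
\nabla\cdot\mathbf{u} &= 0 \tag{1}\\
\rho\left(\frac{\partial\mathbf{u}}{\partial t} + \mathbf{u}\cdot\nabla\mathbf{u}\right)
  &= \nabla\cdot\boldsymbol{\sigma} + \rho\,\mathbf{g} + \rho_e\,\mathbf{E},
  \qquad \boldsymbol{\sigma} = -p\,\mathbf{I} + \mu\left(\nabla\mathbf{u} + \nabla\mathbf{u}^{T}\right) \tag{2}\\
\nabla\cdot(\epsilon\,\mathbf{E}) &= \rho_e,
  \qquad \mathbf{E} = -\nabla\phi,
  \qquad \rho_e = F\sum_k z_k c_k \tag{3}\\
\frac{\partial c_j}{\partial t}
  + \nabla\cdot\left(c_j\,\mathbf{u} - D_j\,\nabla c_j - \Omega_j\, z_j F c_j\,\nabla\phi\right) &= 0 \tag{4}
\end{align}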


PETSc-FEM provides a core library in charge of managing parallel data distribution and assembly of residual vectors and Jacobian matrices, as well as facilities for general tensor algebra computations at the level of problem-specific finite element routines. Additionally, PETSc-FEM provides a suite of specialized application programs built on top of the core library but targeted to a variety of problems (e.g., compressible/incompressible Navier–Stokes and compressible Euler equations, general advective-diffusive systems, weak/strong fluid-structure interaction). In particular, the fluid flow computations presented here are carried out within the Navier–Stokes module available in PETSc-FEM. This module provides the required capabilities for simulating mass transport and incompressible fluid flow through a monolithic SUPG/PSPG stabilized formulation for linear finite elements. Electric computations are carried out with the Laplace and Poisson–Boltzmann modules.

ELMER

CSC, Finnish IT Center for Science

169 MB
Elmer is a finite element software package for multiphysical problems.
Physical models in Elmer

The Elmer package contains solvers for a variety of mathematical models. The following list summarizes the capabilities of Elmer in specialized fields.

-Heat transfer: models for conduction, radiation and phase change

-Fluid flow: the Navier-Stokes, Stokes and Reynolds equations, the k-ε model

-Species transport: generic convection-diffusion equation

-Elasticity: general elasticity equations, dimensionally reduced models for plates and shells

-Acoustics: the Helmholtz equation

-Electromagnetism: electrostatics, magnetostatics, induction

-Microfluidics: slip conditions, the Poisson-Boltzmann equation

-Levelset method: Eulerian free boundary problems

-Quantum Mechanics: density functional theory (Kohn-Sham)

Numerical methods in Elmer

For approximation and linear system solution Elmer offers a great number of possibilities. The following list summarizes some of the most essential ones:

- All basic element shapes in 1D, 2D and 3D with the Lagrange shape functions of degree k ≤ 2

- Higher degree approximation using p-elements

-Time integration schemes for the first and second order equations

-Solution methods for eigenvalue problems

-Direct linear system solvers (Lapack & Umfpack)

- Iterative Krylov subspace solvers for linear systems

- Multigrid solvers (GMG and AMG) for some basic equations

- ILU preconditioning of linear systems

-Parallelization of iterative methods

-The discontinuous Galerkin method

-Stabilized finite element formulations, including the methods of residual free bubbles and SUPG

-Adaptivity, particularly in 2D

-BEM solvers (without multipole acceleration)
Discretization

-Galerkin, Discontinuous Galerkin (DG)

-Stabilization: SUPG, bubbles

-Lagrange, edge, face, and p-elements

Matrix equation solvers

-Direct: Lapack, Umfpack, (SuperLU, Mumps, Pardiso)

-Iterative Krylov space methods (own & Hypre)

-multigrid solvers (GMG & AMG) for “easy” equations (own & Hypre)

-Preconditioners: ILU, Parasails, multigrid, SGS, Jacobi,…
Parallelism

-Parallel assembly and solution (matrix-vector product)


Adaptivity

-For selected equations, works well in 2D


Pros and Cons of Elmer

Potential users may find a list of the possible pros and cons of the Elmer package useful. The following summary is naturally open to subjective judgment and is not complete.

-Since Elmer is an open source product, it is possible to verify and modify the solution procedures

-Elmer has a modern programmable graphical user interface

-Elmer can handle the coupling of field equations in a flexible manner and new field variables can be introduced easily

-All material parameters may depend on the field variables and other parameters in a free manner

-Elmer offers a large selection of modern numerical methods

-Elmer enables the user to use most generally used finite elements

-Assembly and iterative solution can also be done in parallel

-Elmer has a graphical preprocessing interface for simple problem setups

-Elmer has a steadily growing user community and has already been used in dozens of scientific papers

-The different aspects of the code (solver, interface, documentation) are not always at the same stage of development. For example, the documentation is not fully up-to-date and the interface lacks many of the more esoteric physical models provided by the solver.

- Getting acquainted with the large package may take time. Previous experience with FEM packages may therefore be useful.

- Elmer itself does not include proper geometry or mesh generation tools for geometrically complicated problems. Only mesh import interfaces are supported.

- As a multiphysical solver Elmer may sometimes lack features in some areas that are standard for established single-field codes. Thus, some users may find the capabilities of Elmer inadequate for their needs.
Program libraries

F. PETSc (Portable, Extensible Toolkit for Scientific Computing)

Mathematics and Computer Science Division, Argonne National Laboratory

The Portable, Extensible Toolkit for Scientific Computation (PETSc) is a suite of data structures and routines that provide the building blocks for the implementation of large-scale application codes on parallel (and serial) computers. PETSc uses the MPI standard for all message-passing communication.

PETSc includes an expanding suite of parallel linear and nonlinear equation solvers and time integrators that may be used in application codes written in Fortran, C, and C++. PETSc provides many of the mechanisms needed within parallel application codes, such as parallel matrix and vector assembly routines. The library is organized hierarchically, enabling users to employ the level of abstraction that is most appropriate for a particular problem. By using techniques of object-oriented programming, PETSc provides enormous flexibility for users.
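
As a minimal sketch of this workflow (assuming a recent PETSc release, 3.5 or later, and MPI), the following program assembles a distributed 1-D Laplacian and solves it with a KSP Krylov solver whose method and preconditioner can be chosen at run time:

/* Minimal PETSc sketch: assemble a distributed 1-D Laplacian and solve Ax = b. */
#include <petscksp.h>

int main(int argc, char **argv)
{
  Mat      A;
  Vec      x, b;
  KSP      ksp;
  PetscInt i, n = 100, Istart, Iend;

  PetscInitialize(&argc, &argv, NULL, NULL);

  /* Distributed sparse matrix; each rank assembles only the rows it owns. */
  MatCreate(PETSC_COMM_WORLD, &A);
  MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, n, n);
  MatSetFromOptions(A);
  MatSetUp(A);
  MatGetOwnershipRange(A, &Istart, &Iend);
  for (i = Istart; i < Iend; i++) {
    if (i > 0)     MatSetValue(A, i, i - 1, -1.0, INSERT_VALUES);
    if (i < n - 1) MatSetValue(A, i, i + 1, -1.0, INSERT_VALUES);
    MatSetValue(A, i, i, 2.0, INSERT_VALUES);
  }
  MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);
  MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);

  /* Right-hand side and solution vectors with a matching parallel layout. */
  VecCreate(PETSC_COMM_WORLD, &b);
  VecSetSizes(b, PETSC_DECIDE, n);
  VecSetFromOptions(b);
  VecDuplicate(b, &x);
  VecSet(b, 1.0);

  /* Krylov solver; method and preconditioner are run-time options,
     e.g. -ksp_type gmres -pc_type bjacobi. */
  KSPCreate(PETSC_COMM_WORLD, &ksp);
  KSPSetOperators(ksp, A, A);
  KSPSetFromOptions(ksp);
  KSPSolve(ksp, b, x);

  KSPDestroy(&ksp);
  VecDestroy(&x);
  VecDestroy(&b);
  MatDestroy(&A);
  PetscFinalize();
  return 0;
}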

PETSc is a sophisticated set of software tools; as such, for some users it initially has a much steeper learning curve than a simple subroutine library. In particular, for individuals without some computer science background or experience programming in C or C++, it may require a significant amount of time to take full advantage of the features that enable efficient software use. However, the power of the PETSc design and the algorithms it incorporates may make the efficient implementation of many application codes simpler than “rolling them” yourself.

- For many simple (or even relatively complicated) tasks a package such as Matlab is often the best tool; PETSc is not intended for the classes of problems for which effective Matlab code can be written.

- PETSc should not be used to attempt to provide a “parallel linear solver” in an otherwise sequential code. Certainly not all parts of a previously sequential code need to be parallelized, but the matrix generation portion must be in order to expect any kind of reasonable performance. Do not expect to generate your matrix sequentially and then “use PETSc” to solve the linear system in parallel.



G. ParMETIS (Parallel Partitioning and Sparse Matrix Ordering Library)

University of Minnesota, Department of Computer Science and Engineering

Army HPC Research Center

Minneapolis
The algorithms implemented in METIS are based on the multilevel recursive-bisection, multilevel k-way, and multi-constraint partitioning schemes developed in our lab.

METIS is being developed entirely in ANSI C, and is therefore portable on most Unix systems that have an ANSI C compiler (the GNU C compiler will do).


Provides high quality partitions:
Experiments on a large number of graphs arising in various domains including finite element methods, linear programming, VLSI, and transportation show that METIS produces partitions that are consistently better than those produced by other widely used algorithms. The partitions produced by METIS are consistently 10% to 50% better than those produced by spectral partitioning algorithms.

Produces low fill orderings:
The fill-reducing orderings produced by METIS are significantly better than those produced by other widely used algorithms including multiple minimum degree. For many classes of problems arising in scientific computations and linear programming, METIS is able to reduce the storage and computational requirements of sparse matrix factorization, by up to an order of magnitude. Moreover, unlike multiple minimum degree, the elimination trees produced by METIS are suitable for parallel direct factorization.
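
As a small illustration of the partitioning interface, the sketch below assumes the METIS 5.x serial API (ParMETIS offers analogous ParMETIS_V3_* routines for distributed graphs); the toy ring graph is purely illustrative:

/* Sketch: k-way partitioning of a toy 6-vertex ring graph (CSR adjacency)
 * with the serial METIS 5.x API. */
#include <stdio.h>
#include <metis.h>

int main(void)
{
  idx_t nvtxs = 6, ncon = 1, nparts = 2, objval;
  idx_t xadj[]   = { 0, 2, 4, 6, 8, 10, 12 };              /* CSR offsets */
  idx_t adjncy[] = { 1, 5, 0, 2, 1, 3, 2, 4, 3, 5, 4, 0 }; /* neighbours  */
  idx_t part[6];

  /* NULL weights, target fractions and options select the defaults
     (multilevel k-way partitioning, edge-cut objective). */
  if (METIS_PartGraphKway(&nvtxs, &ncon, xadj, adjncy,
                          NULL, NULL, NULL, &nparts,
                          NULL, NULL, NULL, &objval, part) != METIS_OK)
    return 1;

  for (idx_t i = 0; i < nvtxs; i++)
    printf("vertex %d -> part %d\n", (int)i, (int)part[i]);
  printf("edge-cut: %d\n", (int)objval);
  return 0;
}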

H. LAPACK (Linear Algebra PACKage)

LAPACK is written in Fortran 90 and provides routines for solving systems of simultaneous linear equations, least-squares solutions of linear systems of equations, eigenvalue problems, and singular value problems. The associated matrix factorizations (LU, Cholesky, QR, SVD, Schur, generalized Schur) are also provided, as are related computations such as reordering of the Schur factorizations and estimating condition numbers. Dense and banded matrices are handled, but not general sparse matrices. In all areas, similar functionality is provided for real and complex matrices, in both single and double precision.
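
A minimal sketch of calling one of these routines from C (here dgesv, the LU-based driver for a general dense system), assuming the conventional Fortran name mangling with a trailing underscore and linking against -llapack -lblas:

/* Sketch: solve a 3x3 dense system A x = b with LAPACK's dgesv (LU with
 * partial pivoting) called from C through the Fortran interface.
 * LAPACK expects column-major storage. */
#include <stdio.h>

extern void dgesv_(const int *n, const int *nrhs, double *a, const int *lda,
                   int *ipiv, double *b, const int *ldb, int *info);

int main(void)
{
  int n = 3, nrhs = 1, lda = 3, ldb = 3, info;
  int ipiv[3];
  double A[9] = { 4.0, 1.0, 0.0,    /* first column  */
                  1.0, 3.0, 1.0,    /* second column */
                  0.0, 1.0, 2.0 };  /* third column  */
  double b[3] = { 1.0, 2.0, 3.0 };  /* overwritten with the solution x */

  dgesv_(&n, &nrhs, A, &lda, ipiv, b, &ldb, &info);
  if (info != 0) { printf("dgesv failed: info = %d\n", info); return 1; }
  printf("x = (%g, %g, %g)\n", b[0], b[1], b[2]);
  return 0;
}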

The original goal of the LAPACK project was to make the widely used EISPACK and LINPACK libraries run efficiently on shared-memory vector and parallel processors. On these machines, LINPACK and EISPACK are inefficient because their memory access patterns disregard the multi-layered memory hierarchies of the machines, thereby spending too much time moving data instead of doing useful floating-point operations. LAPACK addresses this problem by reorganizing the algorithms to use block matrix operations, such as matrix multiplication, in the innermost loops. These block operations can be optimized for each architecture to account for the memory hierarchy, and so provide a transportable way to achieve high efficiency on diverse modern machines. We use the term "transportable" instead of "portable" because, for fastest possible performance, LAPACK requires that highly optimized block matrix operations be already implemented on each machine.
I. ScaLAPACK (Scalable LAPACK)
The ScaLAPACK library includes a subset of LAPACK routines redesigned for distributed-memory MIMD parallel computers. It is currently written in a Single-Program-Multiple-Data style using explicit message passing for interprocessor communication. ScaLAPACK is designed for heterogeneous computing and is portable to any computer that supports MPI and PVM.

J. SuperLU
Computer Science Division, University of California, Berkeley

SuperLU is a general-purpose library for the direct solution of large, sparse, nonsymmetric systems of linear equations on high-performance machines. The library is written in C and is callable from either C or Fortran. The library routines perform an LU decomposition with partial pivoting and triangular system solves through forward and back substitution. The LU factorization routines can handle non-square matrices, but the triangular solves are performed only for square matrices. The matrix columns may be preordered (before factorization) either through library or user-supplied routines. This preordering for sparsity is completely separate from the factorization. Working-precision iterative refinement subroutines are provided for improved backward stability. Routines are also provided to equilibrate the system, estimate the condition number, calculate the relative backward error, and estimate error bounds for the refined solutions.
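
A sketch of the simple driver dgssv on a tiny 3x3 tridiagonal system stored in compressed-column form (sequential SuperLU; the distributed SuperLU_DIST interface differs, and helper names may vary slightly between releases):

/* Sketch: SuperLU's simple driver dgssv on a 3x3 tridiagonal system. */
#include <stdio.h>
#include "slu_ddefs.h"

int main(void)
{
  SuperMatrix A, L, U, B;
  superlu_options_t options;
  SuperLUStat_t stat;
  int info;

  int    m = 3, n = 3, nnz = 7, nrhs = 1;
  double a[]    = { 2.0, -1.0,  -1.0, 2.0, -1.0,  -1.0, 2.0 }; /* nonzeros, column by column  */
  int    asub[] = { 0, 1,  0, 1, 2,  1, 2 };                   /* row indices of the nonzeros */
  int    xa[]   = { 0, 2, 5, 7 };                              /* column pointers             */
  double rhs[]  = { 1.0, 0.0, 1.0 };                           /* overwritten with the solution */
  int   *perm_r = intMalloc(m);   /* row permutation from partial pivoting   */
  int   *perm_c = intMalloc(n);   /* column permutation chosen for sparsity  */

  dCreate_CompCol_Matrix(&A, m, n, nnz, a, asub, xa, SLU_NC, SLU_D, SLU_GE);
  dCreate_Dense_Matrix(&B, m, nrhs, rhs, m, SLU_DN, SLU_D, SLU_GE);

  set_default_options(&options);  /* default column ordering, pivot threshold, ... */
  StatInit(&stat);

  dgssv(&options, &A, perm_c, perm_r, &L, &U, &B, &stat, &info);
  if (info == 0) printf("x = (%g, %g, %g)\n", rhs[0], rhs[1], rhs[2]);

  StatFree(&stat);
  Destroy_SuperMatrix_Store(&A);
  Destroy_SuperMatrix_Store(&B);
  Destroy_SuperNode_Matrix(&L);
  Destroy_CompCol_Matrix(&U);
  SUPERLU_FREE(perm_r);
  SUPERLU_FREE(perm_c);
  return 0;
}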



K. MUMPS (MUltifrontal Massively Parallel sparse direct Solver)

ENSEEIHT – IRIT

The system Ax = b is solved in three main steps:

1. Analysis. The host performs an ordering based on the symmetrized pattern A + A^T and carries out the symbolic factorization. A mapping of the multifrontal computational graph is then computed, and symbolic information is transferred from the host to the other processors. Using this information, the processors estimate the memory necessary for the factorization and solution.

2. Factorization. The original matrix is first distributed to the processors that will participate in the numerical factorization. Based on the so-called elimination tree, the numerical factorization is then a sequence of dense factorizations of so-called frontal matrices. The elimination tree also expresses independence between tasks and enables multiple fronts to be processed simultaneously; this is called the multifrontal approach. After the factorization, the factor matrices are kept distributed (in core memory or on disk); they will be used in the solution phase.

3. Solution. The right-hand side b is broadcast from the host to the working processors, which compute the solution x using the (distributed) factors computed during factorization. The solution is then either assembled on the host or kept distributed on the working processors.

Each of these phases can be called separately, and several instances of MUMPS can be handled simultaneously. MUMPS allows the host processor to participate in the factorization and solve phases, just like any other processor.

For both the symmetric and the unsymmetric algorithms used in the code, we have chosen a fully asynchronous approach with dynamic scheduling of the computational tasks. Asynchronous communication is used to enable overlapping between communication and computation. Dynamic scheduling was initially chosen to accommodate numerical pivoting in the factorization. The other important reason for this choice was that, with dynamic scheduling, the algorithm can adapt itself at execution time to remap work and data to more appropriate processors. In fact, we combine the main features of static and dynamic approaches; we use the estimation obtained during the analysis to map some of the main computational tasks; the other tasks are dynamically scheduled at execution time. The main data structures (the original matrix and the factors) are similarly partially mapped during the analysis phase.
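
A sketch of driving these phases through the MUMPS C interface, modeled on the classic c_example.c shipped with the library (field names such as id.nz follow that example; recent releases also accept id.nnz). The tiny diagonal system is purely illustrative, and JOB=6 chains analysis, factorization and solution, which could equally be run as separate calls with JOB=1, 2 and 3:

/* Sketch of the MUMPS C interface on a 2x2 diagonal system. */
#include <stdio.h>
#include <mpi.h>
#include "dmumps_c.h"

#define JOB_INIT       -1
#define JOB_END        -2
#define USE_COMM_WORLD -987654
#define ICNTL(I) icntl[(I)-1]   /* Fortran-style 1-based access to ICNTL */

int main(int argc, char **argv)
{
  DMUMPS_STRUC_C id;
  int    n = 2, nz = 2;
  int    irn[] = { 1, 2 };           /* 1-based row indices    */
  int    jcn[] = { 1, 2 };           /* 1-based column indices */
  double a[]   = { 1.0, 2.0 };       /* matrix entries         */
  double rhs[] = { 1.0, 4.0 };       /* overwritten with x     */
  int    myid;

  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &myid);

  /* Initialize one MUMPS instance: unsymmetric, host participates (par = 1). */
  id.job = JOB_INIT;  id.par = 1;  id.sym = 0;
  id.comm_fortran = USE_COMM_WORLD;
  dmumps_c(&id);

  /* Centralized input: the matrix and right-hand side live on the host. */
  if (myid == 0) {
    id.n = n;  id.nz = nz;
    id.irn = irn;  id.jcn = jcn;  id.a = a;  id.rhs = rhs;
  }

  /* Quiet the diagnostic output streams. */
  id.ICNTL(1) = -1;  id.ICNTL(2) = -1;  id.ICNTL(3) = -1;  id.ICNTL(4) = 0;

  /* JOB = 6 chains the three phases; JOB = 1, 2, 3 would run them separately. */
  id.job = 6;
  dmumps_c(&id);

  if (myid == 0) printf("solution: %g %g\n", rhs[0], rhs[1]);

  /* Release the instance and shut down MPI. */
  id.job = JOB_END;
  dmumps_c(&id);
  MPI_Finalize();
  return 0;
}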



L. HYPRE (High Performance Preconditioners)

Lawrence Livermore National Laboratory

University of California

HYPRE is a software library for solving large, sparse linear systems of equations on massively parallel computers. The library was created with the primary goal of providing users with advanced parallel preconditioners. Issues of robustness, ease of use, flexibility, and interoperability also play an important role.

The library features parallel multigrid solvers for both structured and unstructured grid problems:

• Scalable preconditioners provide efficient solution on today’s and tomorrow’s systems: hypre contains several families of preconditioner algorithms focused on the scalable solution of very large sparse linear systems. (Note that small linear systems, systems that are solvable on a sequential computer, and dense systems are all better addressed by other libraries that are designed specifically for them.) hypre includes “grey-box” algorithms that use more than just the matrix to solve certain classes of problems more efficiently than general-purpose libraries. This includes algorithms such as structured multigrid.

  • Suite of common iterative methods provides options for a spectrum of problems: hypre provides several of the most commonly used Krylov-based iterative methods to be used in conjunction with its scalable preconditioners. This includes methods for nonsymmetric systems such as GMRES and methods for symmetric matrices such as Conjugate Gradient (a usage sketch pairing Conjugate Gradient with the BoomerAMG preconditioner follows this list).

• Intuitive grid-centric interfaces obviate need for complicated data structures and provide access to advanced solvers: hypre has made a major step forward in usability from earlier generations of sparse linear solver libraries in that users do not have to learn complicated sparse matrix data structures. Instead, hypre does the work of building these data structures for the user through a variety of conceptual interfaces, each appropriate to different classes of users. These include stencil-based structured/semi-structured interfaces most appropriate for finite-difference applications; a finite-element based unstructured interface; and a linear-algebra based interface. Each conceptual interface provides access to several solvers without the need to write new interface code.

• User options accommodate beginners through experts: hypre allows a spectrum of expertise to be applied by users. The beginning user can get up and running with a minimal amount of effort. More expert users can take further control of the solution process through various parameters.

  • Configuration options to suit your computing system: hypre allows a simple and flexible installation on a wide variety of computing systems. Users can tailor the installation to match their computing system. Options include debug and optimized modes, the ability to change required libraries such as MPI and BLAS, a sequential mode, and modes enabling threads for certain solvers. On most systems, however, hypre can be built by simply typing configure followed by make.
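
The sketch below loosely follows the ex5 example distributed with hypre: each rank assembles its rows of a 1-D Laplacian through the linear-algebra (IJ) conceptual interface and solves the system with PCG preconditioned by BoomerAMG. Exact setup calls can differ between hypre versions (very recent releases also require HYPRE_Init/HYPRE_Finalize), so treat this as an outline rather than a definitive example:

/* Sketch: BoomerAMG-preconditioned CG on a 1-D Laplacian via hypre's IJ interface. */
#include <mpi.h>
#include <stdio.h>
#include "HYPRE.h"
#include "HYPRE_IJ_mv.h"
#include "HYPRE_parcsr_ls.h"
#include "HYPRE_krylov.h"

int main(int argc, char **argv)
{
  int myid, nprocs, N = 100;
  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &myid);
  MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

  /* Contiguous block row partitioning of the global matrix. */
  int local = N / nprocs, ilower = myid * local;
  int iupper = (myid == nprocs - 1) ? N - 1 : ilower + local - 1;

  HYPRE_IJMatrix A;     HYPRE_ParCSRMatrix parA;
  HYPRE_IJVector b, x;  HYPRE_ParVector parb, parx;

  HYPRE_IJMatrixCreate(MPI_COMM_WORLD, ilower, iupper, ilower, iupper, &A);
  HYPRE_IJMatrixSetObjectType(A, HYPRE_PARCSR);
  HYPRE_IJMatrixInitialize(A);
  for (int i = ilower; i <= iupper; i++) {        /* row-by-row assembly */
    int cols[3];  double vals[3];  int nnz = 0;
    if (i > 0)     { cols[nnz] = i - 1; vals[nnz++] = -1.0; }
                     cols[nnz] = i;     vals[nnz++] =  2.0;
    if (i < N - 1) { cols[nnz] = i + 1; vals[nnz++] = -1.0; }
    HYPRE_IJMatrixSetValues(A, 1, &nnz, &i, cols, vals);
  }
  HYPRE_IJMatrixAssemble(A);
  HYPRE_IJMatrixGetObject(A, (void **)&parA);

  HYPRE_IJVectorCreate(MPI_COMM_WORLD, ilower, iupper, &b);
  HYPRE_IJVectorSetObjectType(b, HYPRE_PARCSR);
  HYPRE_IJVectorInitialize(b);
  HYPRE_IJVectorCreate(MPI_COMM_WORLD, ilower, iupper, &x);
  HYPRE_IJVectorSetObjectType(x, HYPRE_PARCSR);
  HYPRE_IJVectorInitialize(x);
  for (int i = ilower; i <= iupper; i++) {
    double one = 1.0, zero = 0.0;
    HYPRE_IJVectorSetValues(b, 1, &i, &one);
    HYPRE_IJVectorSetValues(x, 1, &i, &zero);
  }
  HYPRE_IJVectorAssemble(b);  HYPRE_IJVectorGetObject(b, (void **)&parb);
  HYPRE_IJVectorAssemble(x);  HYPRE_IJVectorGetObject(x, (void **)&parx);

  /* Conjugate Gradient preconditioned by BoomerAMG. */
  HYPRE_Solver solver, precond;
  HYPRE_ParCSRPCGCreate(MPI_COMM_WORLD, &solver);
  HYPRE_PCGSetMaxIter(solver, 500);
  HYPRE_PCGSetTol(solver, 1e-8);
  HYPRE_BoomerAMGCreate(&precond);
  HYPRE_BoomerAMGSetMaxIter(precond, 1);          /* one V-cycle per application */
  HYPRE_PCGSetPrecond(solver, (HYPRE_PtrToSolverFcn)HYPRE_BoomerAMGSolve,
                      (HYPRE_PtrToSolverFcn)HYPRE_BoomerAMGSetup, precond);
  HYPRE_ParCSRPCGSetup(solver, parA, parb, parx);
  HYPRE_ParCSRPCGSolve(solver, parA, parb, parx);

  int iters;  HYPRE_PCGGetNumIterations(solver, &iters);
  if (myid == 0) printf("converged in %d iterations\n", iters);

  HYPRE_BoomerAMGDestroy(precond);
  HYPRE_ParCSRPCGDestroy(solver);
  HYPRE_IJMatrixDestroy(A);  HYPRE_IJVectorDestroy(b);  HYPRE_IJVectorDestroy(x);
  MPI_Finalize();
  return 0;
}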



M. GotoBLAS
The GOTO library (GotoBLAS) is a highly optimised implementation of the BLAS routines, developed by Kazushige Goto at the University of Texas at Austin. Its high efficiency is achieved by exploiting architecture-specific information.

The GotoBLAS distribution provides implementations of the BLAS functions and a select number of routines from LAPACK. As of version 1.20 this consisted of (Xgesv, Xgetf2, Xgetrf, Xgetrs, Xlaswp, Xlauu2, Xlauum, Xpotf2, Xpotrf, Xpotri, Xtrti2, Xtrtri, lsame), where X denotes the data type, i.e. (d)ouble, (s)ingle or complex (z). To combine GotoBLAS with a full suite of LAPACK functions one could build LAPACK using the GotoBLAS library, but this would preclude using the LAPACK symbols defined within the GotoBLAS library. To get around this, I wrote a script which combines GotoBLAS and LAPACK symbols into two new libraries.
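
A minimal sketch of calling an optimised BLAS kernel (dgemm) through the CBLAS interface, assuming the CBLAS wrappers were built and the program is linked against the GotoBLAS library (e.g. -lgoto2); the same call works with any BLAS that ships cblas.h:

/* Sketch: dense matrix multiply C = alpha*A*B + beta*C via CBLAS. */
#include <stdio.h>
#include <cblas.h>

int main(void)
{
  /* 2x3 times 3x2 gives 2x2, row-major storage. */
  double A[6] = { 1, 2, 3,
                  4, 5, 6 };
  double B[6] = { 7,  8,
                  9, 10,
                 11, 12 };
  double C[4] = { 0, 0,
                  0, 0 };

  cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
              2, 2, 3,        /* M, N, K       */
              1.0, A, 3,      /* alpha, A, lda */
              B, 2,           /* B, ldb        */
              0.0, C, 2);     /* beta, C, ldc  */

  printf("C = [%g %g; %g %g]\n", C[0], C[1], C[2], C[3]);
  return 0;
}
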
N. PVFS2 (A Parallel File System for Linux Clusters and Supercomputers)

Argonne National Laboratory


As Linux clusters and supercomputers have matured as platforms for low-cost, high-performance parallel computing, software packages to provide many key services have emerged, especially in areas such as message passing and networking. One area devoid of support, however, has been parallel file systems, which are critical for high-performance I/O on such clusters. We have developed a parallel file system for Linux clusters and the Blue Gene/L and Blue Gene/P supercomputers, called the Parallel Virtual File System (PVFS). PVFS is intended both as a high-performance parallel file system that anyone can download and use and as a tool for pursuing further research in parallel I/O and parallel file systems for Linux clusters.
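
Applications typically reach PVFS through MPI-IO (the ROMIO implementation includes a PVFS2 driver). A minimal sketch in which every rank writes its own block of a shared file follows; the mount point /mnt/pvfs2/out.dat is hypothetical:

/* Sketch: parallel I/O to a PVFS2 volume through MPI-IO. The "pvfs2:" prefix
 * lets ROMIO bypass the kernel VFS and talk to the servers directly. */
#include <mpi.h>
#include <stdio.h>

#define BLOCK 1024   /* doubles written per rank */

int main(int argc, char **argv)
{
  int rank;
  double buf[BLOCK];
  MPI_File fh;

  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);

  for (int i = 0; i < BLOCK; i++) buf[i] = rank + i * 1e-6;  /* fill local block */

  MPI_File_open(MPI_COMM_WORLD, "pvfs2:/mnt/pvfs2/out.dat",
                MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);

  /* Each rank writes at its own offset; the collective variant lets ROMIO
     optimize access to the striped file. */
  MPI_Offset offset = (MPI_Offset)rank * BLOCK * sizeof(double);
  MPI_File_write_at_all(fh, offset, buf, BLOCK, MPI_DOUBLE, MPI_STATUS_IGNORE);

  MPI_File_close(&fh);
  MPI_Finalize();
  return 0;
}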

Configuration



Block Allocation

•I/O server exports file/object oriented API

– Storage object (“dataspace”) on an I/O server, addressed by a numeric handle

– Dataspace can be stream of bytes or key/value pairs

– Create dataspace, delete dataspace, read/write

•Files & directories mapped onto dataspaces

– File may be single dataspace, or chunked/striped over several

•Each I/O server manages block allocation for its local storage

•I/O server uses local filesystem to store dataspaces

•Key/value dataspace stored using Berkeley DB table

Metadata Management

•Directory dataspace contains list of names & metafile handles

•Metafile dataspace contains

– Attributes (permissions, owner, xattrs)

– Distribution function parameters

– Datafile handles

•Datafile(s) store file data

– Distribution function determines pattern

– Default is 64 KB chunk size and round-robin placement

•Directory and metadata updates are atomic

– Eliminates need for locking

– May require “losing” node in race to do significant cleanup

•System configuration (I/O server list, etc.) stored in static file on all I/O servers

Reliability

• Similar to GPFS

– RAID underneath I/O server to handle disk failures & sector errors

– Dual-attached RAID to primary/backup I/O servers to handle I/O server failures

• Linux HA used for generic failover support

• Sequenced operations provide well-defined crash behavior

– Create datafiles

– Create metafile that points to datafiles

– Link metafile into directory (atomic)




Chairman of the Management Board


Stoyan Markov

