Modern High Performance Computing systems are being
built from larger and larger numbers of computers.
Today's largest installations can consist of thousands
of nodes across which computation, file and storage
management, system administration, and communications
tasks are allocated. Managing such large systems so
that they meet users' requirements for performance,
availability, conformance to established operational
procedures, and other characteristics is a major challenge.
Some system managers find themselves augmenting standard
management infrastructures with tools of their own,
possibly because of specific operational requirements
or to help diagnose a specific problem. The HP CCN
HPC System Management Collaboration provides a forum
for members to exchange ideas and benefit from each
other's experience in meeting the challenge. The HPC
System Management Collaboration is currently focusing
its attention on performance and fault management,
including consideration of both compute nodes and file
system nodes of large, multi-computer systems. Its subject
is tools and techniques for measuring performance and for
analyzing detailed subsystem performance data to understand
overall system behavior and to diagnose faults.
Performance management using these tools and techniques
may include performance optimization for workloads
with different characteristics. In its exploration
and sharing of members' experiences, findings, and recommendations,
the collaboration considers tools for performance characterization
and analysis from HP and other sources.

Members participate in forums to exchange experiences
as well as the latest information on system management
tools and methods. Opportunities for collaboration
and competency-sharing in problem-solving methods are
to be found among HP CCN members and different parts
of HP, including the High Performance Computing Division
and HP Labs. Members will have the opportunity to promote
the capabilities of their institutions through this
HP CCN-sponsored web portal and our annual conference.

- Web Portal: HP CCN
provides a dedicated System Management web portal
to host research results, white papers and software
tools from members and HP.
- Annual Conference: Members
congregate for presentations and discussions with
other members interested in System Management issues
at the annual HP CCN conference.
- Webcast Conferences: Periodic
webcast conferences are scheduled to keep members
updated and to discuss research projects.

For additional information about high performance computing system management resources, including
white papers and related links, go to the resources page.
To obtain a free copy of the latest Adobe Acrobat Reader, go to the Adobe download page.

For a list of the current members of the System Management collaboration, and information about them, go to the members page.

The collaboration advocate for the System Management collaboration can be contacted via this e-mail form.
If you are interested in joining the System Management collaboration, please fill out and submit this application form.