Real Time Power and Performance Monitoring of Supercomputer Application Shankar Prajapati BS in...

Post on 01-Apr-2015

Real Time Power and Performance Monitoring of Supercomputer Application

Shankar Prajapati BS in Computer Science

Claflin University

shan08kar@gmail.com

Nate Rini (Mentor)

3 flops vs. 34,000,000,000,000,000 flops

History and Progress of Supercomputers

Our demand for faster supercomputers is increasing faster than the combined might of Moore’s law and Dennard scaling.

Most of the world’s computing superpowers have announced their intentions to create exascale (1000 petaflops) supercomputers by 2020.

Supercomputer in progress

The speed limit for modern supercomputers is now set by power consumption.

Performance ∝ Power Relationship

How can applications best be optimized to use the available system resources efficiently?

We are monitoring from the system level

How?

Objectives

Understand the relationship between power consumption and performance

Monitor the power and performance of supercomputer applications

We don’t want to interfere with the users

Benchmarking

Measuring Performance

High Performance LINPACK (HPL)

Tools to measure Efficiency and Performance

High Performance LINPACK (HPL)

Since 1993, the fastest supercomputers have been ranked on the TOP500 list according to their LINPACK benchmarking results.
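As a rough illustration of how a LINPACK run turns into a ranking number, the sketch below uses the operation count that HPL credits for solving a dense n x n system, (2/3)n^3 + (3/2)n^2, and divides it by the wall-clock time. The efficiency and performance-per-watt helpers are illustrative additions, and any problem size, timing, peak rate, or wattage passed to them is a hypothetical input, not a result from this project.

```python
def hpl_flop_count(n: int) -> float:
    """Floating-point operations HPL credits for an n x n dense solve."""
    return (2.0 / 3.0) * n**3 + (3.0 / 2.0) * n**2


def hpl_gflops(n: int, seconds: float) -> float:
    """Sustained rate in GFLOPS for a run of size n that took `seconds`."""
    return hpl_flop_count(n) / seconds / 1e9


def efficiency(sustained_gflops: float, peak_gflops: float) -> float:
    """Fraction of theoretical peak that the benchmark actually reached."""
    return sustained_gflops / peak_gflops


def gflops_per_watt(sustained_gflops: float, average_watts: float) -> float:
    """Performance per watt, given an average power draw over the run."""
    return sustained_gflops / average_watts
```

Dividing the sustained rate by the average power measured during the same run gives one simple way to relate performance to power.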

Message Passing Interface (MPI)

High Performance

Portability

Scalability

Most programs now running on highly parallel computers are built on the Message-Passing Interface, or MPI.

OSU Micro-Benchmarking

Bandwidth test

Latency test

Message Rate test
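The three tests above each reduce a timed message exchange to a single number. The sketch below shows the usual arithmetic under stated assumptions: latency is taken as half the average ping-pong round trip, and bandwidth and message rate are computed over a window of back-to-back messages. It is not the OSU benchmark source, and the function and parameter names are hypothetical.

```python
def one_way_latency_us(total_seconds: float, iterations: int) -> float:
    """Half the mean round-trip time of a ping-pong loop, in microseconds."""
    return total_seconds * 1e6 / (2.0 * iterations)


def bandwidth_mb_per_s(message_bytes: int, iterations: int,
                       window_size: int, total_seconds: float) -> float:
    """Bytes moved per second, reported in MB/s (1 MB = 10**6 bytes)."""
    total_bytes = message_bytes * iterations * window_size
    return total_bytes / total_seconds / 1e6


def message_rate_per_s(iterations: int, window_size: int,
                       total_seconds: float) -> float:
    """Messages sent per second across all iterations and window slots."""
    return iterations * window_size / total_seconds
```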

The Ganglia test cluster we set up mimics the real cluster on a smaller scale.

Ganglia test cluster

In VirtualBox, we installed the latest version of CentOS.

We installed Ganglia from source on the main node.

We also installed the packages Ganglia depends on, such as APR, libConfuse, expat, pkg-config, Python, PCRE, and RRDtool.

Ganglia Architecture

The Ganglia MONitor Daemon (GMOND)

The Ganglia METAdata Daemon (GMETAD)

Round Robin Database Tool (RRDtool)

PHP-based web interface

Web server
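To make the data flow between these components concrete: gmond answers a TCP connection on its accept channel (port 8649 by default) with an XML dump of the cluster state, which gmetad polls and stores in RRD files. The sketch below fetches that XML and pulls one metric per host. The host names and values in SAMPLE_XML are made up for illustration, and the function names are hypothetical.

```python
import socket
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down example of the XML a gmond daemon returns.
SAMPLE_XML = """<GANGLIA_XML VERSION="3.6.0" SOURCE="gmond">
  <CLUSTER NAME="testcluster" LOCALTIME="0">
    <HOST NAME="node01" IP="10.0.0.1" REPORTED="0">
      <METRIC NAME="load_one" VAL="1.25" TYPE="float" UNITS=" "/>
      <METRIC NAME="cpu_num" VAL="16" TYPE="uint16" UNITS="CPUs"/>
    </HOST>
    <HOST NAME="node02" IP="10.0.0.2" REPORTED="0">
      <METRIC NAME="load_one" VAL="0.5" TYPE="float" UNITS=" "/>
    </HOST>
  </CLUSTER>
</GANGLIA_XML>"""


def fetch_gmond_xml(host: str, port: int = 8649) -> bytes:
    """Read the full XML dump that gmond sends before closing the socket."""
    chunks = []
    with socket.create_connection((host, port), timeout=5) as sock:
        while True:
            data = sock.recv(65536)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)


def metric_by_host(xml_data, metric_name: str) -> dict:
    """Map each host name to the numeric value of one named metric."""
    root = ET.fromstring(xml_data)
    values = {}
    for host in root.iter("HOST"):
        for metric in host.iter("METRIC"):
            if metric.get("NAME") == metric_name:
                values[host.get("NAME")] = float(metric.get("VAL"))
    return values
```

On a live cluster, `metric_by_host(fetch_gmond_xml("headnode"), "load_one")` would return the one-minute load for every reporting node, where "headnode" stands in for whichever host runs gmond.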

Ganglia Web Interface for Jellystone test cluster

Ganglia load vs. time graph: LINPACK over 28 nodes on the Jellystone test cluster

Model-Specific Registers (MSRs)

Various control registers in the x86 instruction set used for debugging, program execution tracing, computer performance monitoring, and toggling certain CPU features.

MSR and Librapl

Librapl simplifies access to the RAPL values in the MSRs of modern Intel CPUs such as Sandy Bridge processors.
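For readers unfamiliar with what sits underneath such a library, the sketch below reads the RAPL registers directly through the Linux msr driver and converts two energy counter readings into average power. It does not use librapl's API. It assumes the msr kernel module is loaded, root privileges, and the documented Intel register addresses (0x606 for the RAPL power unit register, 0x611 for package energy status). The function names are hypothetical.

```python
import os
import struct

MSR_RAPL_POWER_UNIT = 0x606    # bits 12:8 hold the energy status unit
MSR_PKG_ENERGY_STATUS = 0x611  # bits 31:0 hold a wrapping energy counter


def read_msr(cpu: int, register: int) -> int:
    """Read one 64-bit MSR via /dev/cpu/<cpu>/msr (needs root and msr module)."""
    fd = os.open(f"/dev/cpu/{cpu}/msr", os.O_RDONLY)
    try:
        return struct.unpack("<Q", os.pread(fd, 8, register))[0]
    finally:
        os.close(fd)


def energy_unit_joules(power_unit_msr: int) -> float:
    """Joules represented by one counter tick: 1 / 2**ESU."""
    esu = (power_unit_msr >> 8) & 0x1F
    return 1.0 / (1 << esu)


def average_power_watts(raw_before: int, raw_after: int,
                        unit_joules: float, seconds: float) -> float:
    """Average power between two readings, tolerating one 32-bit wraparound."""
    delta_ticks = (raw_after - raw_before) & 0xFFFFFFFF
    return delta_ticks * unit_joules / seconds
```

Because the counter is only 32 bits wide, the two readings should be taken close enough together that it cannot wrap more than once between them.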

Intel® Power Gadget

Software-based power usage monitoring tool for 2nd generation Intel Core processors or later.

• Package power limit

• Energy of the CPU/processor cores

• Energy of the processor graphics

Logfile data

Host metrics from rvitals

rvitals retrieves hardware vital information from the on-board service processor for a single node or a range of nodes and groups.

IBM iDataplex iPDU

The PDU is queried via SNMP. SNMP is a set of protocols for managing complex networks.

The data collection includes power usage from Intel processors, PDUs and node power supplies.

The data is collected while running selected jobs.

The analysis of sampled data helps us understand how jobs affect real-time power usage in supercomputers.
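One way such an analysis could line up readings from sources that sample at different rates is to average each source into one-minute buckets and keep only the minutes every source covers. The sketch below is an assumption about how that might be done, not the project's actual scripts. The source names and sample values are hypothetical.

```python
from collections import defaultdict


def one_minute_means(samples):
    """Average (unix_seconds, watts) samples into buckets keyed by minute start."""
    sums = defaultdict(lambda: [0.0, 0])
    for timestamp, watts in samples:
        bucket = int(timestamp // 60) * 60
        sums[bucket][0] += watts
        sums[bucket][1] += 1
    return {bucket: total / count
            for bucket, (total, count) in sorted(sums.items())}


def align_sources(sources):
    """Return (minute, {source: mean_watts}) rows for minutes all sources share."""
    binned = {name: one_minute_means(s) for name, s in sources.items()}
    common = set.intersection(*(set(b) for b in binned.values()))
    return [(minute, {name: binned[name][minute] for name in binned})
            for minute in sorted(common)]


# Hypothetical readings: (seconds since start, watts) from three tools.
example = {
    "processor": [(0, 100.0), (30, 110.0), (60, 120.0)],
    "pdu": [(10, 150.0), (70, 160.0)],
    "node_supply": [(5, 140.0)],
}
rows = align_sources(example)
```

Each row then holds one value per source for the same minute, which makes differences between the measurement points easy to compare.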

Data Collection

Intel® Power Gadget

IPMI via rvitals

IBM iDataplex iPDU via SNMP

Tools

VS

Observations

Conclusion

We were able to monitor the power usage of different nodes while running jobs, with one-minute granularity.

We also compared the data output from three different power measurement tools.

We achieved our goal of monitoring the real-time power and performance of a supercomputer.

Future

Supercomputer Systems Group

Shawn Strande

Nathan Rini (Mentor)

Irfan Elahi

Shawn Needham

Rohan Rodrigues

Jonathan Roberts

Stormy Knight

Tom Gowan

Benjamin Mathews

Acknowledgement

Ananta Tiwari

Laura Carrington

Questions