Page 1: Real Time Power and Performance Monitoring of Supercomputer Application Shankar Prajapati BS in Computer Science Claflin University shan08kar@gmail.com.

Real Time Power and Performance Monitoring of Supercomputer Application

Shankar Prajapati

BS in Computer Science, Claflin University

shan08kar@gmail.com

Nate Rini (Mentor)

History and progress of supercomputers

3 flops vs. 34,000,000,000,000,000 flops

Our demand for faster supercomputers is increasing faster than the combined might of Moore’s law and Dennard scaling.

Most of the world’s computing superpowers have announced their intentions to create exascale (1000 petaflops) supercomputers by 2020.

Supercomputer in progress

The speed limit for modern supercomputers is now set by power consumption.

Performance ∝ Power

Relationship

How can applications best be optimized to use the available system resources efficiently?

We are monitoring from the system level

How?

Objectives

Understand the relationship between power consumption and performance

Monitor the power and performance of supercomputer applications

Avoid interfering with users

Benchmarking

Measuring Performance

High Performance LINPACK (HPL)

Tools to measure Efficiency and Performance

High Performance LINPACK (HPL)

Since 1993, the fastest supercomputers have been ranked on the TOP500 list according to their LINPACK benchmarking results.
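
As a rough illustration of how a LINPACK-style figure is derived, the sketch below times a dense solve of Ax = b and converts the elapsed time to flop/s using the operation count conventionally credited for an n x n system, 2/3·n³ + 2·n². This is not HPL itself (HPL is a distributed MPI program); the function names and the pure-Python solver are assumptions made for illustration only.

```python
import random
import time


def hpl_flop_count(n: int) -> float:
    """Operations conventionally credited for solving an n x n dense system."""
    return (2.0 / 3.0) * n**3 + 2.0 * n**2


def solve_dense(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting (inputs are copied)."""
    n = len(b)
    a = [row[:] for row in a]
    b = b[:]
    for k in range(n):
        # Choose the row with the largest pivot to keep the elimination stable.
        p = max(range(k, n), key=lambda i: abs(a[i][k]))
        a[k], a[p] = a[p], a[k]
        b[k], b[p] = b[p], b[k]
        for i in range(k + 1, n):
            f = a[i][k] / a[k][k]
            for j in range(k, n):
                a[i][j] -= f * a[k][j]
            b[i] -= f * b[k]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(a[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - s) / a[i][i]
    return x


def measure_flops(n: int, seed: int = 0) -> float:
    """Time one random n x n solve and return the credited flop/s."""
    rng = random.Random(seed)
    a = [[rng.uniform(-0.5, 0.5) for _ in range(n)] for _ in range(n)]
    b = [rng.uniform(-0.5, 0.5) for _ in range(n)]
    start = time.perf_counter()
    solve_dense(a, b)
    elapsed = time.perf_counter() - start
    return hpl_flop_count(n) / elapsed


if __name__ == "__main__":
    print(f"{measure_flops(100) / 1e6:.1f} Mflop/s (pure Python, illustrative only)")
```

The absolute number from such a toy is meaningless for ranking; the point is only that the reported rate is a fixed operation count divided by wall-clock time.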

Message Passing Interface (MPI)

High performance

Portability

Scalability

Most programs now running on highly parallel computers are built on the Message-Passing Interface, or MPI.

OSU Micro-Benchmarking

Bandwidth test

Latency test

Message Rate test
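
The figures these three tests report reduce to simple arithmetic on timed message exchanges. The sketch below assumes a ping-pong exchange for latency and a window of back-to-back messages for bandwidth and message rate; the function names are hypothetical and the sample timings are made up. The real OSU benchmarks are MPI programs written in C.

```python
def one_way_latency_us(total_round_trip_s: float, iterations: int) -> float:
    """Average one-way latency in microseconds from timed ping-pong round trips."""
    return total_round_trip_s / (2 * iterations) * 1e6


def bandwidth_mb_s(message_bytes: int, messages_sent: int, elapsed_s: float) -> float:
    """Bandwidth in MB/s (1 MB = 10**6 bytes here) for a window of messages."""
    return message_bytes * messages_sent / 1e6 / elapsed_s


def message_rate(messages_sent: int, elapsed_s: float) -> float:
    """Messages per second for the same window."""
    return messages_sent / elapsed_s


if __name__ == "__main__":
    # Made-up timings, only to show the calculations.
    print(one_way_latency_us(0.002, 1000), "us")
    print(bandwidth_mb_s(1_000_000, 64, 0.5), "MB/s")
    print(message_rate(64, 0.5), "messages/s")
```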

The Ganglia test cluster we set up mimics the real cluster on a smaller scale.

Ganglia test cluster

In VirtualBox, we installed the latest version of CentOS.

We installed Ganglia from source code on the main node.

We installed all the dependent packages, such as APR, libConfuse, expat, pkg-config, Python, PCRE, RRDtool, and a few other packages on which Ganglia depends.

Ganglia Architecture

The Ganglia MONitor Daemon (GMOND)

The Ganglia METAdata Daemon (GMETAD)

Round Robin Database Tool (RRDtool)

PHP-based web interface

Web server
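
The "round robin" idea behind the metric store can be sketched as a fixed-size buffer in which the newest sample overwrites the oldest, so storage never grows. The class below is a hypothetical illustration of that idea only; it is not RRDtool's API or file format, and RRDtool additionally consolidates older samples into coarser averages.

```python
from collections import deque


class RoundRobinArchive:
    """Fixed-size store: once full, each new sample evicts the oldest one."""

    def __init__(self, slots: int) -> None:
        self._samples = deque(maxlen=slots)

    def update(self, timestamp: int, value: float) -> None:
        self._samples.append((timestamp, value))

    def fetch(self):
        """Return the retained samples, oldest first."""
        return list(self._samples)


if __name__ == "__main__":
    archive = RoundRobinArchive(slots=3)
    for minute in range(5):
        archive.update(minute * 60, 100.0 + minute)
    print(archive.fetch())  # only the three most recent samples remain
```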

Ganglia Web Interface for Jellystone test cluster

Ganglia load vs. time graph: LINPACK over 28 nodes on the Jellystone test cluster

Model-Specific Registers (MSRs)

Various control registers in the x86 instruction set used for debugging, program execution tracing, computer performance monitoring, and toggling certain CPU features.

MSR and Librapl

Librapl simplifies access to the RAPL values in the MSR registers of modern Intel CPUs such as Sandy Bridge processors.
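
To show what such a library has to do with the raw register contents, the sketch below decodes RAPL energy readings into joules and watts. The register addresses and bit layout follow Intel's published MSR documentation rather than the slides, the function names are hypothetical (this is not Librapl's API), and the raw values are passed in as plain integers because actually reading MSRs needs privileged access to the CPU.

```python
MSR_RAPL_POWER_UNIT = 0x606    # bits 12:8 hold the energy status unit (ESU)
MSR_PKG_ENERGY_STATUS = 0x611  # bits 31:0 hold a wrapping package energy counter


def energy_unit_joules(power_unit_raw: int) -> float:
    """Joules per counter increment: 1 / 2**ESU."""
    esu = (power_unit_raw >> 8) & 0x1F
    return 1.0 / (1 << esu)


def energy_delta_joules(prev_raw: int, cur_raw: int, unit_j: float) -> float:
    """Energy used between two counter reads, tolerating one 32-bit wraparound."""
    delta = (cur_raw - prev_raw) & 0xFFFFFFFF
    return delta * unit_j


def average_power_watts(prev_raw: int, cur_raw: int, unit_j: float, interval_s: float) -> float:
    """Average package power over the sampling interval."""
    return energy_delta_joules(prev_raw, cur_raw, unit_j) / interval_s


if __name__ == "__main__":
    unit = energy_unit_joules(0xA1003)  # example raw value with ESU = 16
    print(unit)
    print(average_power_watts(0xFFFFFF00, 0x100, unit, 1.0))
```

Because the counter is only 32 bits wide, samples must be taken often enough that it wraps at most once between reads.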

Intel® Power Gadget

Software-based power usage monitoring tool for 2nd generation Intel Core processors or later.

• Package power limit

• Energy of the CPU/processor cores

• Energy of the processor graphics

Logfile data

Host metrics from rvitals

rvitals retrieves hardware vital information from the on-board service processor for a single node or a range of nodes and groups.

IBM iDataPlex iPDU

The PDU is queried via SNMP (Simple Network Management Protocol), a set of protocols for managing complex networks.

Data Collection

The data collection includes power usage from Intel processors, PDUs, and node power supplies.

The data is collected while running selected jobs.

Analysis of the sampled data helps us understand how jobs affect real-time power usage in supercomputers.
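
One simple analysis of such samples is to integrate power over a job's runtime to estimate its energy use. The sketch below assumes samples arrive as (time in seconds, power in watts) pairs and uses trapezoidal integration; the function names and the sample values are illustrative, not measurements from this project.

```python
def energy_joules(samples):
    """Trapezoidal integration of (time_s, watts) samples, in joules."""
    total = 0.0
    for (t0, p0), (t1, p1) in zip(samples, samples[1:]):
        total += 0.5 * (p0 + p1) * (t1 - t0)
    return total


def mean_power_watts(samples):
    """Time-weighted mean power between the first and last sample."""
    duration = samples[-1][0] - samples[0][0]
    return energy_joules(samples) / duration


if __name__ == "__main__":
    # Made-up readings taken one minute apart.
    job_samples = [(0, 100.0), (60, 200.0), (120, 200.0)]
    print(energy_joules(job_samples), "J")
    print(mean_power_watts(job_samples), "W")
```

The same two functions can be applied to readings from each measurement source so that the results are compared on equal terms.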

Tools

Intel® Power Gadget vs. IPMI via rvitals vs. IBM iDataPlex iPDU via SNMP

Observations

Conclusion

We were able to monitor the power usage of different nodes while running jobs, with one-minute granularity.

We also compared the data output from three different power measurement tools.

We achieved our goal of monitoring the real-time power and performance of a supercomputer.

Future

Acknowledgement

Supercomputer Systems Group

Shawn Strande, Nathan Rini (Mentor), Irfan Elahi, Shawn Needham, Rohan Rodrigues, Jonathan Roberts, Stormy Knight, Tom Gowan, Benjamin Mathews

Ananta Tiwari, Laura Carrington

Questions