Semantics-based Distributed I/O for mpiBLAST. P. Balaji ά, W. Feng β, J. Archuleta β, H. Lin δ, R. Kettimuthu ά, R. Thakur ά and X. Ma δ

1

Transcript of Semantics-based Distributed I/O for mpiBLAST P. Balaji ά, W. Feng β, J. Archuleta β, H. Lin δ,...


Semantics-based Distributed I/O for mpiBLAST
P. Balaji ά, W. Feng β, J. Archuleta β, H. Lin δ, R. Kettimuthu ά, R. Thakur ά and X. Ma δ

ά Argonne National Laboratory

β Virginia Tech

δ North Carolina State University

Issues with Distributed I/O

NSF TeraGrid

Traditional Distributed I/O

Distributed I/O in mpiBLAST

Sequence Search with mpiBLAST

ParaMEDIC Architecture

ParaMEDIC-powered mpiBLAST

ParaMEDIC Framework (architecture diagram): applications (mpiBLAST, Communication Profiling, Remote Visualization) sit on top of the ParaMEDIC API (PMAPI). Beneath the API are the Application Plugins (Basic Compression, mpiBLAST plugin, Communication Profiling plugin), the ParaMEDIC Data Tools (Data Encryption, Data Integrity), and the Communication Services (Direct Network, Global File-System).

mpiBLAST Working Model (diagram): the master scatters the query and database across workers, each worker searches its database fragment, and the results are gathered into the output.

Estimated Output of an All-to-All NT Search (table below).

ParaMEDIC-powered mpiBLAST (diagram): on the compute side, an mpiBLAST master and workers under a Compute Master search the query and emit raw metadata; the processed metadata is shipped to the I/O side, where an mpiBLAST master and workers under an I/O Master generate a temporary database, read it, and write the results to the I/O servers hosting the file system.

Impact of Network Latency

Network Delay: Performance Breakup

Trading Computation and I/O

Impact of Encrypted File-Systems

Argonne-VT Distributed Setup

Performance on TeraGrid

Other Applications: MPE Communication Profiler

Conclusion

[Chart: execution time (sec) vs. network delay (ms, 0 to 100) for mpiBLAST and ParaMEDIC.]

[Chart: mpiBLAST Performance Breakup: execution time (sec) vs. network delay (ms), split into compute time and I/O time.]

[Chart: ParaMEDIC Performance Breakup: execution time (sec) vs. network delay (ms), split into compute time, post-processing time, and I/O time.]

[Chart: execution time (sec) vs. application compute to post-processing resource ratio (20 down to 4) for mpiBLAST and ParaMEDIC.]

[Three charts: execution time (sec) vs. query size (KB, 10 to 100) for mpiBLAST and ParaMEDIC.]

Understanding ParaMEDIC

• High Latency: synchronization operations are heavily affected
• Low Bandwidth: high-bandwidth links are not accessible to everyone
• Data Encryption: distributed I/O over the Internet might need to be encrypted in some environments
• And yet distributed I/O is essential:
• Large-scale computations might require resources that are not available at a single site
• Scientists may need to access remote large-scale supercomputers which are not available locally

• Primary Idea: data transformation, but not at the byte level (unlike compression techniques)
• The application specifies an appropriate description of its data, which allows ParaMEDIC to process the output as high-level objects rather than as a stream of bytes

• With respect to mpiBLAST, the overall output is just a concatenation of matches and other descriptive information (match scores, alignment information, etc.) for each query sequence
• Finding the database matches for each sequence is the compute-intensive portion, so only the matches need to be stored
• The rest can be discarded and regenerated
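The mpiBLAST plugin's encode/regenerate idea can be sketched as follows. This is a minimal, hypothetical illustration, not mpiBLAST's actual code: the record fields, the `encode_metadata`/`regenerate_output` names, and the toy `align` routine are all assumptions. The point is that only the hard-won matches (query, subject, score) cross the wide-area link, while the bulky descriptive output is recomputed at the destination.

```python
# Hypothetical sketch of ParaMEDIC-style semantics-based I/O for
# BLAST-like output. Field names and helpers are illustrative only.

def encode_metadata(results):
    """Keep only the expensive-to-recompute part of each result:
    the (query, matched subject, score) triples. Alignment text and
    other descriptive output are discarded."""
    return [(r["query"], r["subject"], r["score"]) for r in results]

def align(query, subject):
    """Stand-in for a real alignment routine: here, just the longest
    common prefix, to keep the sketch self-contained."""
    n = 0
    while n < min(len(query), len(subject)) and query[n] == subject[n]:
        n += 1
    return query[:n]

def regenerate_output(metadata, database):
    """On the I/O side, rebuild the full output by re-aligning each
    query only against its known matches, which is cheap compared to
    the original all-against-all search."""
    output = []
    for query, subject, score in metadata:
        alignment = align(query, database[subject])  # recompute descriptive info
        output.append({"query": query, "subject": subject,
                       "score": score, "alignment": alignment})
    return output

# Tiny demo with made-up data.
results = [{"query": "ACGT", "subject": "seq1", "score": 42,
            "alignment": "ACG"}]
database = {"seq1": "ACGA"}
metadata = encode_metadata(results)            # only the matches cross the wire
regenerated = regenerate_output(metadata, database)
```

Re-aligning a query against only its known matches is far cheaper than the original search, which is why spending a little extra computation on the I/O side is a good trade.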

• Distributed I/O is a necessary evil: large applications need resources from multiple centers, and scientists might need to use remote large-scale resources
• It is impacted by several issues: high latency, low bandwidth, and encryption requirements
• We propose the ParaMEDIC framework, a semantics-based distributed I/O mechanism
• ParaMEDIC uses application-specific plugins to "understand" what the output data means and process it accordingly
• Output data is converted to application-specific metadata, transported, and then converted back
• ParaMEDIC trades a small amount of additional computation for potentially large benefits in I/O
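As a purely illustrative sketch of the plugin idea: the poster does not show the real PMAPI, so the `ParamedicPlugin` interface and all names below are inventions. The byte-level `BasicCompressionPlugin` stands in for the fallback path; semantic plugins like the mpiBLAST one can shrink output much further because they know what the bytes mean.

```python
# Hypothetical plugin interface for a ParaMEDIC-style framework.
# Class and method names are assumptions, not the actual PMAPI.

class ParamedicPlugin:
    """An application plugin tells the framework how to shrink its
    output into semantic metadata and how to expand it back."""
    def encode(self, output):    # compute side: output -> metadata
        raise NotImplementedError
    def decode(self, metadata):  # I/O side: metadata -> output
        raise NotImplementedError

class BasicCompressionPlugin(ParamedicPlugin):
    """Fallback byte-level plugin: run-length encode the output."""
    def encode(self, output):
        runs, prev, count = [], None, 0
        for ch in output:
            if ch == prev:
                count += 1
            else:
                if prev is not None:
                    runs.append((prev, count))
                prev, count = ch, 1
        if prev is not None:
            runs.append((prev, count))
        return runs

    def decode(self, metadata):
        return "".join(ch * count for ch, count in metadata)

def distributed_write(output, plugin, transport):
    """Encode locally, ship the (small) metadata, decode remotely."""
    metadata = plugin.encode(output)
    transport(metadata)            # the expensive wide-area hop
    return plugin.decode(metadata)  # runs at the remote site
```

The framework only ever sees `encode` and `decode`; everything application-specific lives behind the plugin boundary, which is what lets the same machinery serve mpiBLAST, communication profiling, and visualization.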

• Scientific applications have periodic communication profiles
• ParaMEDIC uses an FFT to find the periodicity and process the output accordingly
• Preliminary results show a 2-5X reduction in output size
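A toy version of the periodicity idea, assuming none of the actual MPE plugin's code: a naive discrete Fourier transform (pure Python, standing in for a real FFT) picks the dominant frequency of a communication trace, and if the trace really repeats with that period, only one period plus a repeat count needs to be stored.

```python
# Illustrative sketch, not the actual MPE communication profiler plugin.
import cmath

def dominant_period(trace):
    """Return the period (in samples) of the strongest non-DC
    frequency component, via a naive O(n^2) DFT."""
    n = len(trace)
    mean = sum(trace) / n
    centered = [x - mean for x in trace]
    best_k, best_mag = 1, 0.0
    for k in range(1, n // 2 + 1):  # skip the DC term
        coeff = sum(centered[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t in range(n))
        if abs(coeff) > best_mag:
            best_k, best_mag = k, abs(coeff)
    return n // best_k

def compress_periodic(trace):
    """If the trace is exactly periodic at the detected period,
    store one period plus a repeat count; otherwise keep it raw."""
    p = dominant_period(trace)
    if p and len(trace) % p == 0 and trace == trace[:p] * (len(trace) // p):
        return ("periodic", trace[:p], len(trace) // p)
    return ("raw", trace, 1)

trace = [0, 3, 1, 0] * 8  # 32 samples of a period-4 pattern
tag, pattern, repeats = compress_periodic(trace)
```

Here the 32-sample trace shrinks to a 4-sample pattern and a count, an 8X reduction; real traces are only approximately periodic, which is why the poster reports a more modest 2-5X.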

Query Size (KB)       | 0-5       | 5-50   | 50-150 | 150-200 | 200-500 | >500 | Total
Number of Queries     | 3,305,170 | 87,506 | 25,920 | 26,524  | 9,592   | 248  | 3,455,000
Estimated Output (GB) | 1,139     | 593    | 23,555 | 3,995   | ?       | ?    | >29,282