Semantics-based Distributed I/O for mpiBLAST
Embed Size (px)
description
Transcript of Semantics-based Distributed I/O for mpiBLAST

Semantics-based Distributed I/O for mpiBLASTP. Balajiά, W. Fengβ, J. Archuletaβ, H. Linδ, R. Kettimuthuά, R. Thakurά and X. Maδ
ά Argonne National Laboratoryβ Virginia Tech
δ North Carolina State University
Issues with Distributed I/O NSF TeraGrid
Traditional Distributed I/ODistributed I/O in mpiBLAST
Sequence Search with mpiBLAST
ParaMEDIC ArchitectureParaMEDIC-powered
mpiBLAST
ParaMEDIC Framework
mpiBLAST
DataEncryption
ParaMEDIC Data ToolsData
Integrity
Communication Services
DirectNetwork
GlobalFile-System
Application Plugins
BasicCompression
mpiBLASTplugin
CommunicationProfiling Plugin
CommunicationProfiling
RemoteVisualization
Applications
ParaMEDIC APIPMAPI)
mpiBLAST Working Model
s
Scatter Search Gather
Output
Workers
Database
Query
Estimated Output of an All-to-All NT Search
Compute Master I/O Master
mpiBLAST Master
mpiBLASTWorker
mpiBLASTWorker
mpiBLASTWorker
mpiBLAST Master
mpiBLASTWorker
mpiBLASTWorker
Query Raw Metadata QueryWrite Results
Generate TempDatabase
Read TempDatabase
I/O WorkersCompute
Workers
I/O Servershosting file
system
Processed Metadata
Impact of Network Latency Network Delay: Performance Breakup Trading Computation and IO Impact of Encrypted File-Systems
Argonne-VT Distributed Setup Performance on TeraGrid Other Applications: MPE Communication Profiler Conclusion
0
20
40
60
80
100
120
140
160
180
200
0 1 5 10 20 40 70 100
Networy Delay (ms)
Exe
cutio
n Ti
me
(sec
)
mpiBLAST
ParaMEDIC
mpiBLAST Performance Breakup
0
50
100
150
200
0 1 5 10 20 40 70 100Execution Time (sec)
Net
wor
k D
elay
(ms)
I/O Time
Compute Time
ParaMEDIC Performance Breakup
0
50
100
0 1 5 10 20 40 70 100
Execution Time (sec)
Net
wor
k D
elay
(ms)
Compute Time Post-Processing Time I/O Time
0
100
200
300
400
500
600
20 18 16 14 12 10 8 6 4Application Compute to Post-processing Resource ratio
Exe
cutio
n Ti
me
(sec
)
mpiBLAST
ParaMEDIC
0
50
100
150
200
250
300
350
10 20 30 40 50 60 70 80 90 100
Query Size (KB)
Exe
cutio
n Ti
me
(sec
)
mpiBLAST
ParaMEDIC
0
1000
2000
3000
4000
5000
6000
10 20 30 40 50 60 70 80 90 100
Query Size (KB)
Exe
cutio
n Ti
me
(sec
)
mpiBLAST
ParaMEDIC
0
500
1000
1500
2000
2500
3000
3500
4000
10 20 30 40 50 60 70 80 90 100
Query Size (KB)
Exe
cutio
n Ti
me
(sec
)
mpiBLAST
ParaMEDIC
Understanding ParaMEDIC
• High Latency• Synchronization operations heavily effected
• Low Bandwidth• High-bandwidth links are not accessible to everyone
• Data Encryption• Distributed I/O over the Internet might need to be
encrypted in some environments
• And yet distributed I/O is essential• Large-scale computations might require resources
that are not available at a single site• Scientists may need to access remote large-scale
supercomputers which are not available locally
• Primary Idea:• Data transformation, but not at the byte-level (unlike
compression techniques)• Application specifies the appropriate description of the
data allows ParaMEDIC to process output as high-level objects, and not a stream of bytes
• With respect to mpiBLAST• Overall output is just a concatenation of matches and
other descriptive information for each query sequence
• Match scores, alignment information, etc.
• Finding the database matches for each sequence is the compute intensive portion needs to be stored
• Rest can be discarded and regenerated
• Distributed I/O is a necessary evil• Large applications need resources from multiple centers• Scientists might need to use remote large-scale resources
• Impacted by several issues• High Latency, Low Bandwidth, Encryption requirements
• We propose the ParaMEDIC framework• Semantics-based Distributed I/O mechanism• ParaMEDIC uses application-specific plugins to
“understand” what the output data means and process it• Output data converted to application-specific metadata,
transported, and then converted back
• Trades a small amount of additional computation for potentially large benefits in I/O
• Scientific applications have periodic communication profiles• ParaMEDIC uses FFT to find periodicity and process output
• Preliminary results show 2-5X reduction in output
Query Size(KB)
0-5 5-50 50-150 150-200 200-500 >500 Total
Number of Queries
3,305,170 87,506 25,920 26,524 9,592 248 3,455,000
Estimated Output (GB)
1,139 593 23,555 3,995 ? ? >29,282