ΕΠΛ372 Παράλληλη ΕπεξεργάσιαInfrastructure and Costs of WSC • Cooling...

23
ΕΠΛ372 Παράλληλη Επεξεργάσια Warehouse Scale Computing and Services Γιάννος Σαζεϊδης Εαρινό Εξάμηνο 2014 READING 1. Read Barroso The Datacenter as a Computer http://www.morganclaypool.com/doi/pdf/10.2200/S00193ED1V01Y200905CAC006?cookieSet=1 Slides based on Hennesy and Patterson CAQA 5 th edition Morgan and Kaufman

Transcript of ΕΠΛ372 Παράλληλη ΕπεξεργάσιαInfrastructure and Costs of WSC • Cooling...

Page 1: ΕΠΛ372 Παράλληλη ΕπεξεργάσιαInfrastructure and Costs of WSC • Cooling system also uses water (evaporation and spills) – E.g. 70,000 to 200,000 gallons per

ΕΠΛ372 Παράλληλη Επεξεργάσια

Warehouse Scale Computing and Services

Γιάννος ΣαζεϊδηςΕαρινό Εξάμηνο 2014

READING1. Read Barroso The Datacenter as a Computerhttp://www.morganclaypool.com/doi/pdf/10.2200/S00193ED1V01Y200905CAC006?cookieSet=1

Slides based on Hennesy and Patterson CAQA 5th edition Morgan and Kaufman

Page 2: ΕΠΛ372 Παράλληλη ΕπεξεργάσιαInfrastructure and Costs of WSC • Cooling system also uses water (evaporation and spills) – E.g. 70,000 to 200,000 gallons per

Introduction• Warehouse-scale computer (WSC)

– Provides Internet services• Search, social networking, online maps, video sharing, online shopping,

email, cloud computing, etc.

– Differences with HPC “clusters”:• Clusters have higher performance processors and network• Clusters emphasize thread-level parallelism, WSCs emphasize request-

level parallelism– Differences with co-located-datacenters:

• Datacenters consolidate different machines and software into one location

• Datacenters emphasize virtual machines and hardware heterogeneity in order to serve varied customers

• Private and public cloud…

– Often term DC refers to both WSC and co-located-DC

Introduction

Page 3: ΕΠΛ372 Παράλληλη ΕπεξεργάσιαInfrastructure and Costs of WSC • Cooling system also uses water (evaporation and spills) – E.g. 70,000 to 200,000 gallons per

Introduction

• Important design factors for WSC: (same)– Cost-performance

• Small savings add up– Energy efficiency

• Affects power distribution and cooling• Work per joule

– Dependability via redundancy• Commodity HW and software based redundancy

– Network I/O• Consistency and interface to world

– Interactive and batch processing workloads• Search but also calculate meta data (rank pages)

Introduction

Page 4: ΕΠΛ372 Παράλληλη ΕπεξεργάσιαInfrastructure and Costs of WSC • Cooling system also uses water (evaporation and spills) – E.g. 70,000 to 200,000 gallons per

Introduction

• Important design factors for WSC: (differ.)– Ample computational parallelism is not important

• Most jobs are totally independent• “Request-level parallelism”• Mostly read and often write to not shared data• Data in storage for batch

– Operational costs count• Power consumption is a primary, not secondary,

constraint when designing system– Scale and its opportunities and problems

• Can afford to build customized systems since WSC require volume purchase. Failures more common

Introduction

Page 5: ΕΠΛ372 Παράλληλη ΕπεξεργάσιαInfrastructure and Costs of WSC • Cooling system also uses water (evaporation and spills) – E.g. 70,000 to 200,000 gallons per

Causes of Outages and Anomalies (2400 servers 1st yr)

Number Cause1-2 Power outage4 Upgrades

1000sHard-diskDramProblematic machines

5000 Server crashes

Page 6: ΕΠΛ372 Παράλληλη ΕπεξεργάσιαInfrastructure and Costs of WSC • Cooling system also uses water (evaporation and spills) – E.g. 70,000 to 200,000 gallons per

Figure 6.3 Average CPU utilization of more than 5000 servers during a 6-month period at Google. Servers are rarely completely idle or fully utilized, in-stead operating most of the time at between 10% and 50% of their maximum utilization. (From Figure 1 in Barroso and Hölzle [2007].) The column the third from the right in Figure 6.4 calculates percentages plus or minus 5% to come up with the weightings; thus, 1.2% for the 90% row means that 1.2% of servers were between 85% and 95% utilized.

Page 7: ΕΠΛ372 Παράλληλη ΕπεξεργάσιαInfrastructure and Costs of WSC • Cooling system also uses water (evaporation and spills) – E.g. 70,000 to 200,000 gallons per

Computer Architecture of WSC

• WSC often use a hierarchy of networks for interconnection

• Each 19” rack holds 48 1U servers connected to a rack switch

• Rack switches are uplinked to switch higher in hierarchy– Uplink has 48 / n times lower bandwidth,

where n = # of uplink ports• “Oversubscription”

– Goal is to maximize locality of communication relative to the rack

Com

puter Ar4chitecture of W

SC

Page 8: ΕΠΛ372 Παράλληλη ΕπεξεργάσιαInfrastructure and Costs of WSC • Cooling system also uses water (evaporation and spills) – E.g. 70,000 to 200,000 gallons per

Figure 6.5 Hierarchy of switches in a WSC. (Based on Figure 1.2 of Barroso and Hölzle [2009].)

Page 9: ΕΠΛ372 Παράλληλη ΕπεξεργάσιαInfrastructure and Costs of WSC • Cooling system also uses water (evaporation and spills) – E.g. 70,000 to 200,000 gallons per

Storage

• Storage options:– Use disks inside the servers, or– Network attached storage through

Infiniband

– WSCs generally rely on local disks– Google File System (GFS) uses local disks

and maintains at least three replicas

Com

puter Ar4chitecture of W

SC

Page 10: ΕΠΛ372 Παράλληλη ΕπεξεργάσιαInfrastructure and Costs of WSC • Cooling system also uses water (evaporation and spills) – E.g. 70,000 to 200,000 gallons per

Array Switch

• Switch that connects an array of racks– Array switch should have 10 X the

bisection bandwidth of rack switch– Cost of n-port switch grows as n2

– Often utilize content addressable memory chips and FPGAs

Com

puter Ar4chitecture of W

SC

Page 11: ΕΠΛ372 Παράλληλη ΕπεξεργάσιαInfrastructure and Costs of WSC • Cooling system also uses water (evaporation and spills) – E.g. 70,000 to 200,000 gallons per

WSC Memory Hierarchy

• Servers can access DRAM and disks on other servers using a NUMA-style interface

Com

puter Ar4chitecture of W

SC

Page 12: ΕΠΛ372 Παράλληλη ΕπεξεργάσιαInfrastructure and Costs of WSC • Cooling system also uses water (evaporation and spills) – E.g. 70,000 to 200,000 gallons per

Copyright © 2011, Elsevier Inc. All rights Reserved.

Figure 6.7 Graph of latency, bandwidth, and capacity of the memory hierarchy of a WSC for data in Figure 6.6 [Barroso and Hölzle 2009].

Page 13: ΕΠΛ372 Παράλληλη ΕπεξεργάσιαInfrastructure and Costs of WSC • Cooling system also uses water (evaporation and spills) – E.g. 70,000 to 200,000 gallons per

Infrastructure and Costs of WSC• Location of WSC

– Proximity to Internet backbones, electricity cost, property tax rates, low risk from earthquakes, floods, and hurricanes

• Power distribution

PhyscicalInfrastrcuture

and Costs of W

SC

Page 14: ΕΠΛ372 Παράλληλη ΕπεξεργάσιαInfrastructure and Costs of WSC • Cooling system also uses water (evaporation and spills) – E.g. 70,000 to 200,000 gallons per

Infrastructure and Costs of WSC• Cooling

– Air conditioning used to cool server room– 64 F – 71 F

• Keep temperature higher (closer to 71 F)– Cooling towers can also be used

PhyscicalInfrastrcuture

and Costs of W

SC

Page 15: ΕΠΛ372 Παράλληλη ΕπεξεργάσιαInfrastructure and Costs of WSC • Cooling system also uses water (evaporation and spills) – E.g. 70,000 to 200,000 gallons per

Infrastructure and Costs of WSC• Cooling system also uses water (evaporation and

spills)– E.g. 70,000 to 200,000 gallons per day for an 8 MW facility

• Power cost breakdown:– Chillers: 30-50% of the power used by the IT equipment– Air conditioning: 10-20% of the IT power, mostly due to fans

• How many servers can a WSC support?– Each server:

• “Nameplate power rating” gives maximum power consumption• To get actual, measure power under actual workloads

– Oversubscribe cumulative server power by 40%, but monitor power closely

PhyscicalInfrastrcuture

and Costs of W

SC

Page 16: ΕΠΛ372 Παράλληλη ΕπεξεργάσιαInfrastructure and Costs of WSC • Cooling system also uses water (evaporation and spills) – E.g. 70,000 to 200,000 gallons per

Measuring Efficiency of a WSC• Power Utilization Effectiveness (PUE)

– = Total facility power / IT equipment power

– Median PUE on 2006 study was 1.69– Today large facilities close 1.1

• Performance– Latency is important metric because it is

seen by users– Bing study: users will use search less as

response time increases– Service Level Objectives (SLOs)/Service

Level Agreements (SLAs)• E.g. 99% of requests be below 100 ms

PhyscicalInfrastrcuture

and Costs of W

SC

Page 17: ΕΠΛ372 Παράλληλη ΕπεξεργάσιαInfrastructure and Costs of WSC • Cooling system also uses water (evaporation and spills) – E.g. 70,000 to 200,000 gallons per

Copyright © 2011, Elsevier Inc. All rights Reserved.

Figure 6.22 Power usage effectiveness (PUE) of 10 Google WSCs over time. Google A is the WSC described in this section. It is the highest line in Q3 ‘07 and Q2 ’10. (From www.google.com/corporate/green/datacenters/measuring.htm.) Facebook recently announced a new datacenter that should deliver an impressive PUE of 1.07 (see http://opencompute.org/). The Prineville Oregon Facility has no air conditioning and no chilled water. It relies strictly on outside air, which is brought in one side of the building, filtered, cooled via misters, pumped across the IT equipment, and then sent out the building by exhaust fans. In addition, the servers use a custom power supply that allows the power distribution system to skip one of the voltage conversion steps in Figure 6.9.

Page 18: ΕΠΛ372 Παράλληλη ΕπεξεργάσιαInfrastructure and Costs of WSC • Cooling system also uses water (evaporation and spills) – E.g. 70,000 to 200,000 gallons per

Cost of a WSC• Capital expenditures (CAPEX)

– Cost to build a WSC– Acquire servers, switchets

• Operational expenditures (OPEX)– Cost to operate a WSC

PhyscicalInfrastrcuture

and Costs of W

SC

Page 19: ΕΠΛ372 Παράλληλη ΕπεξεργάσιαInfrastructure and Costs of WSC • Cooling system also uses water (evaporation and spills) – E.g. 70,000 to 200,000 gallons per

Cloud Computing• WSCs offer economies of scale that

cannot be achieved with a datacenter:– 5.7 times reduction in storage costs– 7.1 times reduction in administrative

costs– 7.3 times reduction in networking costs– This has given rise to cloud services

such as Amazon Web Services• “Utility Computing”• Based on using open source virtual machine

and operating system software

Cloud C

omputing

Page 20: ΕΠΛ372 Παράλληλη ΕπεξεργάσιαInfrastructure and Costs of WSC • Cooling system also uses water (evaporation and spills) – E.g. 70,000 to 200,000 gallons per

Prgrm’g Models and Workloads

• Batch processing framework: MapReduce

– Map: applies a programmer-supplied function to each logical input record

• Runs on thousands of computers• Provides new set of key-value pairs as

intermediate values

– Reduce: collapses values using another programmer-supplied function

Program

ming M

odels and Workloads for W

SC

s

Page 21: ΕΠΛ372 Παράλληλη ΕπεξεργάσιαInfrastructure and Costs of WSC • Cooling system also uses water (evaporation and spills) – E.g. 70,000 to 200,000 gallons per

Prgrm’g Models and Workloads

• Example:– map (String key, String value):

• // key: document name• // value: document contents• for each word w in value

– EmitIntermediate(w,”1”); // Produce list of all words

– reduce (String key, Iterator values):• // key: a word• // value: a list of counts• int result = 0;• for each v in values:

– result += ParseInt(v); // get integer from key-value pair• Emit(AsString(result));

Program

ming M

odels and Workloads for W

SC

s

Page 22: ΕΠΛ372 Παράλληλη ΕπεξεργάσιαInfrastructure and Costs of WSC • Cooling system also uses water (evaporation and spills) – E.g. 70,000 to 200,000 gallons per

Prgrm’g Models and Workloads

• MapReduce runtime environment schedules map and reduce task to WSC nodes

• Availability:– Use replicas of data across different servers– If one slow or fails start on another– Use relaxed consistency:

• No need for all replicas to always agree

• Workload demands– Often vary considerably

Program

ming M

odels and Workloads for W

SC

s

Page 23: ΕΠΛ372 Παράλληλη ΕπεξεργάσιαInfrastructure and Costs of WSC • Cooling system also uses water (evaporation and spills) – E.g. 70,000 to 200,000 gallons per

Search CloudSuite Benchmark OverviewClient

Multithreaded client

Frontend serverTomcat 7.0.23 running Apache Nutch 1.2

1 to n Index serversHadoop 0.20 IPC server

runningApache Nutch 1.2

1 to n Document serversHadoop 0.20 IPC server running

Apache Nutch 1.2

Query flow description

• Client send query request to frontend server

• Frontend asks each index server to search in it’s own part of index dataset for document matches

• All index servers search their part of index dataset in parallel

• Frontend server collects, sorts the replies and asks the index servers the title and url of top results (detail request)

• Frontend receives the urls and asks the document servers to search in it’s own part of dataset for the summaries of the top documents.

• Frontend receives the summaries and assemblies the reply HTML page and sends the answer back to client

Notes• Index and document dataset is distributed to the number of index and document servers

• Apache Nutch 1.2 gives you the choice of running index and document server seperatedor on the same process together

23