Database_Cache Replacement Policies (Lyras)

Course: Advanced Topics in Databases. Konstantinos Lyras

How Workload Characteristics Affect the Performance of Cache Replacement Policies

• Workload Characteristics (Web Traces)

• Performance (Cache Efficiency)

• Cache Replacement (Remove Documents)

Workload Characteristics

• “One-timer” documents

• File Popularity follows Zipf law

• File Size follows heavy-tail distribution

• Temporal Locality

One-Timers

• Most files are extremely unpopular.

• Over 90% of the distinct files were requested only a few times.

• There is no benefit in caching one-timers.

• 90% of the requests go to only 2%-4% of the files (concentration of references).

File Popularity

• Some Web files are more popular than others.

• Popularity: the number of times a file was requested.

• File popularity follows Zipf's law.

Files fall into three groups: extremely popular files (the top 1% of the unique files received 39% of all client requests), moderately popular files (the top 37% received 78% of the requests), and unpopular files (one-timers).

Files are sorted in decreasing order of the number of times they were requested. Rank 1 is given to the file with the most references and rank N to the file with the fewest requests.
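The ranking described above can be sketched in a few lines of Python. This is only an illustration: the toy trace and the helper names `rank_by_popularity` and `zipf_slope` are assumptions, not part of the original slides. Under Zipf's law the request count of the file at rank r is roughly proportional to 1/r^α, so a straight-line fit of log(count) against log(rank) should give a slope close to -α.

```python
import math
from collections import Counter

def rank_by_popularity(requests):
    """Return (file, count) pairs in decreasing order of request count (rank 1 first)."""
    return Counter(requests).most_common()

def zipf_slope(counts):
    """Least-squares slope of log(count) versus log(rank).

    For Zipf-like popularity the slope is close to -alpha.
    """
    xs = [math.log(rank) for rank in range(1, len(counts) + 1)]
    ys = [math.log(c) for c in counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den

# Hypothetical toy trace: "a" is requested 8 times, "b" 4, "c" 2, "d" once.
trace = ["a"] * 8 + ["b"] * 4 + ["c"] * 2 + ["d"]
ranked = rank_by_popularity(trace)
slope = zipf_slope([count for _, count in ranked])
print(ranked, slope)
```

On a real proxy trace the fit would normally be done over many thousands of distinct files; four files are used here only to keep the example readable.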

File Size

• Files on the Web vary in size.

• File size follows a heavy-tailed distribution.

• The probability of obtaining extremely large values is non-negligible.

1) Small Files (100B – 10KB) 20%

2) Medium Files (10 – 15KB) 65%

3) Large Files (15 – KB) 15%

90% of the files were HTML or images.

These objects account for only 50% of the total size.

40% of the total size is due to a few large files (audio, video).

Pareto: many small observations mixed in with a few large observations.
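A small simulation can make the heavy-tail behaviour concrete. The sketch below draws file sizes from a Pareto distribution using Python's standard `random.paretovariate` and measures how much of the total size is held by the largest 1% of the files. The helper names, the sample count, and the fixed seed are assumptions for illustration; the tail index 1.4 and the 100-byte minimum are borrowed from values that appear elsewhere in these slides.

```python
import random

def sample_file_sizes(n, tail_index, min_size, seed=0):
    """Draw n file sizes (bytes) from a Pareto distribution with the given tail index."""
    rng = random.Random(seed)
    # paretovariate(alpha) returns values >= 1, so scaling sets the minimum size.
    return [min_size * rng.paretovariate(tail_index) for _ in range(n)]

def share_of_largest(sizes, fraction):
    """Fraction of the total bytes held by the largest `fraction` of the files."""
    ordered = sorted(sizes, reverse=True)
    k = max(1, int(len(ordered) * fraction))
    return sum(ordered[:k]) / sum(ordered)

sizes = sample_file_sizes(n=10_000, tail_index=1.4, min_size=100.0)
top_share = share_of_largest(sizes, 0.01)
print(f"largest 1% of files hold {top_share:.1%} of all bytes")
```

With a light-tailed distribution the largest 1% of files would hold little more than 1% of the bytes; a share far above that is the signature of a heavy tail.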

Temporal Locality

• Files that have recently been referenced are likely to be re-referenced in the near future.

• There is a temporal correlation between recent past and near future references.

• 30% of all re-references to a file occurred within an hour of the previous reference to the same file. 60% of all re-references occurred within 24 hours of the previous request.
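The percentages above are statistics over the time gaps between consecutive references to the same file. A minimal sketch of that measurement follows, assuming a trace given as time-ordered `(timestamp_seconds, file_id)` pairs; the trace format, the toy data, and the function names are hypothetical.

```python
def rereference_gaps(trace):
    """Return the time gaps between consecutive references to the same file.

    `trace` is an iterable of (timestamp_seconds, file_id) pairs in time order.
    """
    last_seen = {}
    gaps = []
    for timestamp, file_id in trace:
        if file_id in last_seen:
            gaps.append(timestamp - last_seen[file_id])
        last_seen[file_id] = timestamp
    return gaps

def fraction_within(gaps, window_seconds):
    """Fraction of re-references that happened within the given time window."""
    if not gaps:
        return 0.0
    return sum(1 for gap in gaps if gap <= window_seconds) / len(gaps)

# Hypothetical toy trace with three re-references.
trace = [(0, "a"), (600, "b"), (1800, "a"), (7200, "b"), (90000, "a")]
gaps = rereference_gaps(trace)
within_hour = fraction_within(gaps, 3600)
within_day = fraction_within(gaps, 24 * 3600)
print(gaps, within_hour, within_day)
```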

Performance Metrics

• File Hit Rate (HR): percentage of requested files found in the cache. HR = 70% means that 7 of 10 requests (files) are fulfilled by the proxy.

• Byte Hit Rate (BHR): percentage of requested bytes found in the cache. BHR = 70% means that 7 of every 10 bytes are returned from the cache; the remaining 3 bytes are retrieved across the external network.
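Both metrics can be computed from the same request log. The sketch below assumes each request is recorded as a `(size_bytes, was_hit)` pair; that record format and the toy numbers are assumptions for illustration only.

```python
def hit_rates(requests):
    """Return (HR, BHR) as percentages.

    `requests` is a list of (size_bytes, was_hit) pairs, one per client request.
    """
    total_files = len(requests)
    total_bytes = sum(size for size, _ in requests)
    hit_files = sum(1 for _, hit in requests if hit)
    hit_bytes = sum(size for size, hit in requests if hit)
    return 100.0 * hit_files / total_files, 100.0 * hit_bytes / total_bytes

# Hypothetical log: seven small files served from the cache, three large files missed.
requests = [(1_000, True)] * 7 + [(50_000, False)] * 3
hr, bhr = hit_rates(requests)
print(f"HR = {hr:.1f}%, BHR = {bhr:.1f}%")
```

In this toy log the file hit rate is 70% while the byte hit rate is under 5%, because the hits are all small files. The two metrics can diverge sharply on the same workload.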

Tradeoff HR-BHR

File Hit Rate                        Byte Hit Rate
Maximize: many small files           Maximize: few large files
Reduces Web server overload          Reduces network traffic

Web Replacement

• LRU: evicts the file that has not been accessed for the longest time (temporal locality). The most recently referenced files are the most likely to be referenced again in the near future.

• LFU-Aging: evicts the file with the lowest reference count (file popularity).

• GDS (Greedy-Dual-Size): associates a value H = 1/s with each file, where s is the file size. It evicts the file with the lowest H value, H(min), and the H values of all other files are reduced by H(min). This policy therefore considers both the file size and its temporal locality.
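The GDS rule above can be sketched as a small Python class. This is a naive illustration, not a production implementation: the class name, method names, and toy workload are assumptions. One further assumption not stated on the slide is that a hit restores the file's H value to 1/s, which is how Greedy-Dual-Size is usually described and is what gives the policy its sensitivity to temporal locality.

```python
class GreedyDualSizeCache:
    """Naive Greedy-Dual-Size cache with H = 1/size (illustrative sketch)."""

    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0
        self.size = {}  # file_id -> size in bytes
        self.h = {}     # file_id -> current H value

    def access(self, file_id, size_bytes):
        """Process one request; return True on a hit, False on a miss."""
        if file_id in self.h:
            # Assumption: a hit restores H to its initial value 1/s.
            self.h[file_id] = 1.0 / self.size[file_id]
            return True
        if size_bytes > self.capacity:
            return False  # file can never fit, so it is not cached
        while self.used + size_bytes > self.capacity:
            self._evict()
        self.size[file_id] = size_bytes
        self.h[file_id] = 1.0 / size_bytes
        self.used += size_bytes
        return False

    def _evict(self):
        """Evict the file with the lowest H and reduce all other H values by H(min)."""
        victim = min(self.h, key=self.h.get)
        h_min = self.h.pop(victim)
        self.used -= self.size.pop(victim)
        for other in self.h:
            self.h[other] -= h_min

# Hypothetical workload: the 80-byte file is evicted to make room for the 30-byte one.
cache = GreedyDualSizeCache(capacity_bytes=100)
results = [
    cache.access("small", 10),
    cache.access("big", 80),
    cache.access("mid", 30),
    cache.access("small", 10),
]
print(results, sorted(cache.h))
```

Reducing every remaining H value on each eviction costs time proportional to the number of cached files. Practical implementations usually avoid this by keeping a running offset and a priority queue, which this sketch leaves out for clarity.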

Comparison of Web Replacements

• Higher HR is achieved using size-based replacement policies, because these policies store a large number of small files.

• Higher BHR is achieved using frequency-based replacement policies, because these policies keep the most popular files, regardless of size.

How SENSITIVE are the Web Cache Replacements to Workload Characteristics?

TARGET

• The goal is to examine the sensitivity of proxy caching to certain workload characteristics.

• Generate proxy workloads with a generator tool that differ in one chosen characteristic, and investigate the sensitivity of the cache replacement policies to each characteristic.

Characteristic               Trace 1    Trace 2
Zipf Slope                   0.80       0.80
Tail Index                   1.4        1.4
Percentage of One-timers     60%        80%
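To show what varying a single characteristic might look like, here is a minimal sketch of a synthetic trace generator. It is not the generator tool used in the slides, which is not named. The function name, parameters, and file naming scheme are hypothetical, and the sketch models only popularity and one-timers, ignoring file sizes and temporal locality.

```python
import random
from collections import Counter

def generate_trace(n_files, n_requests, zipf_slope, one_timer_fraction, seed=0):
    """Return a shuffled list of requested file ids.

    A fixed fraction of the distinct files are one-timers (requested exactly
    once); the remaining files are requested at least twice, with the extra
    requests drawn using Zipf-like weights 1 / rank**zipf_slope.
    """
    rng = random.Random(seed)
    n_one_timers = round(n_files * one_timer_fraction)
    n_popular = n_files - n_one_timers
    if n_popular < 1 or n_requests < n_one_timers + 2 * n_popular:
        raise ValueError("not enough files or requests for this configuration")
    trace = [f"once-{i}" for i in range(n_one_timers)]
    popular = [f"file-{rank}" for rank in range(1, n_popular + 1)]
    trace += popular * 2  # guarantees popular files are not one-timers
    weights = [1.0 / rank ** zipf_slope for rank in range(1, n_popular + 1)]
    trace += rng.choices(popular, weights=weights, k=n_requests - len(trace))
    rng.shuffle(trace)
    return trace

# Same Zipf slope as in the table, differing only in the share of one-timers.
trace_1 = generate_trace(100, 1000, zipf_slope=0.80, one_timer_fraction=0.60)
trace_2 = generate_trace(100, 1000, zipf_slope=0.80, one_timer_fraction=0.80)
for trace in (trace_1, trace_2):
    counts = Counter(trace)
    print(sum(1 for c in counts.values() if c == 1), "one-timers of", len(counts))
```

Feeding both traces to the same replacement policy and comparing HR and BHR would then isolate the effect of the one-timer percentage.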

Analysis of Performance