SEESAW: Set Enhanced Superpage Aware caching

Mayank Parasar∑, Abhishek Bhattacharjee Ω, Tushar Krishna∑

∑ School of Electrical and Computer Engineering, Georgia Institute of Technology

Ω Department of Computer Science, Rutgers University

http://synergy.ece.gatech.edu/


Outline

- Motivation
- SEESAW: Concept
- SEESAW: Micro-architecture
- Evaluation Methodology
- Results
- Conclusion

Mayank Parasar, School of Electrical and Computer Engineering, Georgia Tech


6/26/18


L1 Cache Characteristics

Fast lookup

High hit-rate

Energy efficiency


Virtually Indexed Physically Tagged [VIPT] Cache

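[Figure: VIPT cache lookup. The virtual address (VA) consists of a VPN and a page offset; the page offset supplies the set index and block offset, which select one set (set-1 ... set-N, four ways each, each way holding a valid bit, tag, and data block). In parallel, the TLB translates the VPN into the PPN of the physical address (PA). The PPN is compared (=) with the stored tags to produce the cache HIT/MISS signal.]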


VIPT caches require:

(set-index bits + block-offset bits) <= page-offset bits
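As a worked example of this constraint, the sketch below computes the smallest associativity a VIPT cache needs for a given capacity and page size. The 64-byte line size and the function names are assumptions made for illustration; they do not come from the slides.

```python
def min_vipt_associativity(cache_bytes, page_bytes=4096):
    """Smallest way count that keeps index + block-offset bits inside the page offset.

    Each way may cover at most one page of data, so ways >= cache size / page size.
    """
    return max(1, cache_bytes // page_bytes)


def index_and_block_bits(cache_bytes, ways, line_bytes=64):
    """Return (set-index bits, block-offset bits) for a power-of-two geometry."""
    sets = cache_bytes // (ways * line_bytes)
    return sets.bit_length() - 1, line_bytes.bit_length() - 1


if __name__ == "__main__":
    for size_kb in (32, 64):
        cache_bytes = size_kb * 1024
        ways = min_vipt_associativity(cache_bytes)
        idx_bits, blk_bits = index_and_block_bits(cache_bytes, ways)
        print(f"{size_kb} KB cache, 4 KB pages: at least {ways} ways "
              f"({idx_bits} index + {blk_bits} block-offset bits)")
```

Under these assumptions, growing the cache while keeping 4 KB pages forces the way count up rather than the set count, which is the pressure on associativity that the following slides examine.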


Impact of Associativity on Access Latency and Energy of cache

[Figure: two plots, "Cache Access Latency" and "Cache Access Energy".]


Effect of associativity on MPKI of cache


High Associativity hurts latency and energy without commensurately improving hit rate


Revisiting L1 Cache Characteristics for VIPT Cache

Fast lookup

High hit-rate

Energy efficiency

Virtual memory! (callout shown twice on the slide)


Opportunity: Superpage

Is it possible to relax the constraints of a traditional VIPT cache? Yes.

How? More page-offset bits for superpages!

HW and OS support for superpages in modern processors:

- Baseline page, 4 KB: 12 offset bits
- Superpage, 2 MB: 21 offset bits
- Superpage, 1 GB: 30 offset bits
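The offset-bit counts above follow directly from the page sizes. The short sketch below derives them and shows how many untranslated bits each page size leaves for set indexing; the 64-byte line size and all names are illustrative assumptions.

```python
PAGE_SIZES = {
    "4 KB": 4 * 1024,
    "2 MB": 2 * 1024 ** 2,
    "1 GB": 1024 ** 3,
}


def offset_bits(page_bytes):
    """Number of page-offset bits for a power-of-two page size."""
    return page_bytes.bit_length() - 1


def max_vipt_sets(page_bytes, line_bytes=64):
    """Largest set count that can be indexed using page-offset bits only."""
    return page_bytes // line_bytes


if __name__ == "__main__":
    base_bits = offset_bits(PAGE_SIZES["4 KB"])
    for name, size in PAGE_SIZES.items():
        extra = offset_bits(size) - base_bits
        print(f"{name}: {offset_bits(size)} offset bits, "
              f"{extra} more than a 4 KB page, up to {max_vipt_sets(size)} sets")
```

With these assumptions, a 2 MB superpage exposes 9 additional untranslated address bits compared with a 4 KB page, so some of them can be used to index into the cache before translation completes.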


Prevalence of superpages in modern OSes under memory fragmentation


Ran on a 32-core Sandy Bridge system with 32 GB RAM.

Memhog causes memory fragmentation; a higher percentage indicates higher fragmentation.


SEESAW: Concept

[Figure: two organizations of the same cache. Base page: fewer sets, more associativity (sets 1-3, each with Way-1 to Way-3). Superpage: more sets, less associativity (sets 1-9, each with a single way), which is faster and more energy-efficient.]


SEESAW: Micro-architecture

[Figure: SEESAW micro-architecture. The cache ways are split into Partition-0 (Way-1, Way-2) and Partition-1 (Way-3, Way-4), each spanning set-1 ... set-N. The virtual address (VA) supplies the VPN, set index, and block offset; the TLB translates the VPN into the PPN of the physical address (PA). The Translation Filter Table (TFT) predicts whether the page is a superpage. The partition decoder decodes the partition index from the partition bit, which is taken from the superpage offset.]


SEESAW: Superpage access

[Figure: superpage access. The TFT identifies the page as a superpage; the partition decoder uses the partition bit from the superpage offset to select Partition-0 or Partition-1, and the tag comparison (=) against the PPN produces HIT/MISS.]


SEESAW: Basepage access

[Figure: base-page access. The TFT indicates "Not a Super Page"; the partition decoder receives the partition index, and the tag comparison (=) against the PPN produces HIT/MISS.]
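The superpage and base-page access paths can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the authors' implementation: the 32 KB, 8-way, 64-byte-line geometry, the two 4-way partitions, and the use of VA bit 12 as the partition bit are all assumed, as are the function names. The sketch also assumes that a base-page access probes every way of the set, as in a conventional VIPT cache.

```python
LINE_BITS = 6            # 64-byte lines (assumed)
SET_BITS = 6             # 64 sets = 32 KB / (8 ways * 64 B) (assumed)
BASE_OFFSET_BITS = 12    # 4 KB base pages
WAYS = 8                 # assumed associativity
PARTITIONS = 2           # assumed number of way partitions
WAYS_PER_PARTITION = WAYS // PARTITIONS


def set_index(va):
    """Set index taken from the base-page offset bits of the virtual address."""
    return (va >> LINE_BITS) & ((1 << SET_BITS) - 1)


def partition_index(va):
    """Partition selected by the bit(s) just above the base-page offset.

    For a superpage these bits belong to the page offset, so they are
    identical in the virtual and physical address.
    """
    return (va >> BASE_OFFSET_BITS) & (PARTITIONS - 1)


def ways_to_probe(va, predicted_superpage):
    """Ways whose tags are compared for this access."""
    if predicted_superpage:
        first = partition_index(va) * WAYS_PER_PARTITION
        return list(range(first, first + WAYS_PER_PARTITION))
    # Base page (or TFT miss): fall back to probing the whole set.
    return list(range(WAYS))


if __name__ == "__main__":
    va = 0x1040
    print(set_index(va), ways_to_probe(va, True), ways_to_probe(va, False))
```

In this sketch a predicted superpage access reads half as many ways as a base-page access, which is where the latency and energy benefit would come from.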


SEESAW: TFT and Partition Decoder

[Figure: the Translation Filter Table (TFT), tagged with VA[63:21], produces a "Superpage?" signal that feeds the partition decoder.]

Translation Filter Table

- TFT lookup
  - Direct mapped
  - False negatives due to size
- TFT update
  - VA misprediction
  - 2MB L1-TLB fill
  - 2MB L1-TLB invalidation

Partition Decoder

- For 32kB cache
- For 64kB cache
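The TFT behavior listed above can be modeled as a small direct-mapped table of 2 MB region tags. The sketch below is a hypothetical software model: the class and method names are invented for illustration, the 16-entry size is borrowed from the TFT analysis in the results, and the way an entry is indexed (region number modulo table size) is an assumption.

```python
SUPERPAGE_SHIFT = 21  # a 2 MB region is identified by VA[63:21]


class TranslationFilterTable:
    """Direct-mapped table that remembers which 2 MB regions are superpages."""

    def __init__(self, entries=16):
        self.entries = entries
        self.tags = [None] * entries

    def _region(self, va):
        return va >> SUPERPAGE_SHIFT

    def _slot(self, va):
        # Assumed indexing: low bits of the region number.
        return self._region(va) % self.entries

    def predict_superpage(self, va):
        """True only if this region was recorded and not evicted since."""
        return self.tags[self._slot(va)] == self._region(va)

    def on_superpage_tlb_fill(self, va):
        """Record the region when a 2 MB translation is filled into the L1 TLB."""
        self.tags[self._slot(va)] = self._region(va)

    def on_superpage_tlb_invalidate(self, va):
        """Forget the region when its 2 MB translation is invalidated."""
        if self.predict_superpage(va):
            self.tags[self._slot(va)] = None


if __name__ == "__main__":
    tft = TranslationFilterTable()
    a = 5 << SUPERPAGE_SHIFT
    b = (5 + 16) << SUPERPAGE_SHIFT   # maps to the same slot as a
    tft.on_superpage_tlb_fill(a)
    tft.on_superpage_tlb_fill(b)      # evicts a
    print(tft.predict_superpage(a), tft.predict_superpage(b))
```

A conflict eviction makes the table report "not a superpage" for a region that actually is one. In this model that false negative only costs the fast path, since the access can still proceed as an ordinary base-page lookup.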


SEESAW: Cache line insertion policy

[Figure: the SEESAW micro-architecture diagram, repeated.]

Into which partition should a cache line be inserted?


SEESAW: Cache line insertion policy

- 4way-8way
  - Superpage miss: victim within the partition
  - Basepage miss: victim within the set
- 4way
  - Uses LRU within the associated partition
  - Avoids installing the same line twice
  - Saves energy
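The two insertion policies can be compared with a small victim-selection sketch. It rests on several assumptions: an 8-way set split into two 4-way partitions, LRU state kept as a per-way last-use timestamp, and a partition number already chosen for the missing line. The function names are hypothetical.

```python
def victim_candidates(policy, is_superpage, partition, ways=8, partitions=2):
    """Ways that may be evicted on a miss under the given insertion policy."""
    per_partition = ways // partitions
    first = partition * per_partition
    in_partition = list(range(first, first + per_partition))
    if policy == "4way":
        # Every line, base page or superpage, stays inside its partition.
        return in_partition
    if policy == "4way-8way":
        # Superpage lines stay in their partition; base-page lines may use the whole set.
        return in_partition if is_superpage else list(range(ways))
    raise ValueError(f"unknown policy: {policy}")


def pick_lru_victim(last_used, candidates):
    """Among the candidate ways, pick the one with the oldest last-use time."""
    return min(candidates, key=lambda way: last_used[way])


if __name__ == "__main__":
    last_used = [5, 1, 7, 3, 9, 0, 2, 8]   # last-use time per way
    for policy in ("4way-8way", "4way"):
        ways = victim_candidates(policy, is_superpage=False, partition=0)
        print(policy, "base-page miss evicts way", pick_lru_victim(last_used, ways))
```

Restricting every insertion to one partition, as in the 4way case, means a line can only ever live in one place in the set, which is one way to read "avoids installing the same line twice".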


SEESAW: System Level Optimization

- Cache coherence
  - Cache coherence lookups use the physical address
  - Snoopy coherence provides higher energy benefits than directory-based coherence
- Page table modifications
  - Superpage splintered into multiple basepages
  - Multiple basepages promoted to a superpage


SEESAW: Simulated system


SEESAW: Workloads

- Spec
- Parsec
- Cloudsuite
  - Tunkrank
- Biobench
  - Mummer
  - Tiger
- MongoDB
- Server workload
  - graph500
  - Nutch Hadoop
- Social-event web service
  - Olio
- Key value store
  - Redis


SEESAW: Performance improvement


SEESAW achieves 3-10% better runtime than the baseline


[Figure: performance improvement for an out-of-order CPU and an in-order CPU.]

~10% performance improvement for 64kB cache in OoO CPUs


SEESAW: Energy savings


10-20% more energy savings over CPUs using baseline VIPT caches!

Approx. one-third of energy savings from coherence


SEESAW: TFT analysis and Way-Prediction


[Figure: two plots, "TFT Analysis" and "SEESAW + Way-prediction".]

- A 16-entry TFT drives the miss rate under 10%
- SEESAW + way-prediction (WP) shows symbiotic behavior


Revisiting L1 Cache Characteristics

Fast lookup

High hit-rate

Energy efficiency


SEESAW: Conclusion


- L1 caches are optimized for latency
- VIPT imposes an indirect restriction on the number of sets in an L1 cache, increasing associativity
- There is a non-linear relation between associativity and the access latency/energy of the L1 cache
- Superpages are often used in modern OSes
- SEESAW provides low-associative access to superpages, providing both latency and energy benefits
- Up to 10% performance improvement and 20% energy reduction in modern workloads
- SEESAW has extremely low overhead and is readily implementable