# DEUCE: WRITE-EFFICIENT ENCRYPTION FOR PCM



March 16<sup>th</sup> 2015 ASPLOS-XX Istanbul, Turkey

Vinson Young Prashant J. Nair Moinuddin K. Qureshi



# **EXECUTIVE SUMMARY**

- Bit flips expensive in Phase Change Memory (PCM)
  - Affects Lifetime, Power, and Performance
  - PCM system optimized to reduce bit flips (~12%)
- PCM more vulnerable to attacks 
   stolen module
- Secure PCM with encryption
- Problem: Encryption increases bit flips 12%→50%
- Goal: Encrypt PCM without causing 4x bit flips
- Insight: re-encrypt only modified words
- DEUCE: write-efficient encryption scheme
- Result: bit flips reduced  $50\% \rightarrow 23\%$  (27% speedup)

# PCM AS MAIN MEMORY

#### Phase Change Memory (PCM)

- Improved scaling + density
- Non-volatile (no refresh)



#### Key Challenge

- Writes: Limited endurance (10-100 million writes)
- Writes: Slow and limited throughput
- Writes: Power-hungry



# PCM AS MAIN MEMORY

#### Phase Change Memory (PCM)

- Improved scaling + density
- Non-volatile (no refresh)



#### Key Challenge

- Writes: Limited endurance (10-100 million writes)
- Writes: Slow and limited throughput
- Writes: Power-hungry



#### PCM systems are designed to reduce writes

















Optimizations reduce bit-flips per write to 10-12%

Non-volatility: Power savings 🖒 Security 🖓



More vulnerable to stolen-memory attack

Non-volatility: Power savings 🚺 Security 🖓



More vulnerable to stolen-memory attack

#### **Stolen memory** attack



Non-volatility: Power savings 🚺 Security 🖓



More vulnerable to stolen-memory attack

#### **Stolen memory** attack



Non-volatility: Power savings 🚺 Security 🛀





More vulnerable to stolen-memory attack

**Stolen memory** attack

**Bus snooping attack** also possible





Non-volatility: Power savings 🚺 Security 🖓





More vulnerable to stolen-memory attack

**Stolen memory** attack

**Bus snooping attack** also possible



Non-volatility: Power savings 🚺 Security 🛀





More vulnerable to stolen-memory attack

**Stolen memory** attack

**Bus snooping attack** also possible



We want to protect PCM from both "stolen memory attack" and "bus snooping attack"

# **ENCRYPTION ON PCM**

Protect PCM using memory encryption
Encryption causes 50% bit flips on each write
1 bit flip in line → 50% bit flips in encrypted line



# **ENCRYPTION ON PCM**

Protect PCM using memory encryption
Encryption causes 50% bit flips on each write
1 bit flip in line → 50% bit flips in encrypted line



#### **Avalanche Effect**

# **ENCRYPTION ON PCM**

Protect PCM using memory encryption
Encryption causes 50% bit flips on each write
1 bit flip in line → 50% bit flips in encrypted line



Encryption causes 50% bit flips → Write-intensive

Data-Comparison-Write
Flip-N-Write



Data-Comparison-Write
Flip-N-Write



Data-Comparison-Write



Data-Comparison-Write



Encryption increases bit flips from 12% to 50% (4x!)

## **GOAL: WRITE-EFFICIENT ENCRYPTION**

# Goal: How do we implement memory encryption, without increasing bit flips by 4x?

## OUTLINE

- Introduction to PCM and encryption
- Background on Counter-mode Encryption
- DEUCE
- Results
- Summary

















## **NEED FOR UNIQUE PADS**

#### $(0000) \oplus Pad = Pad$


#### **NEED FOR UNIQUE PADS**

#### $(0000) \oplus Pad = Pad$



#### **NEED FOR UNIQUE PADS**

#### $(0000) \oplus Pad = Pad$



Secure pad encryption cannot re-use pads

#### Different pad per line address



#### Different pad per line address



#### Different pad per line address



Different pad per write to same line → per-line counter



#### Different pad per line address



#### Different pad per write to same line → per-line counter



#### Different pad per line address



#### Different pad per write to same line → per-line counter



### OUTLINE

- Introduction to PCM and encryption
- Background on Counter-mode Encryption
- DEUCE 🛑
- Results
- Summary







What if we re-encrypt only modified words?



Reduce bit flips by re-encrypting only modified words

What if we re-encrypt only modified words?



Reduce bit flips by re-encrypting only modified words

# Naïve implementation needs per-word counters/pads









Naïve implementation needs per-word counters/pads



Reduce storage overhead by using only two counters



Naïve implementation needs per-word counters/pads 5 4 3 6 2 8 7 Encrypted **PCM** Expensive Reduce storage overhead by using only two counters 0 0 0 0 0 0 0 0 Encrypted PCM



Naïve implementation needs per-word counters/pads 5 4 3 6 2 8 7 Encrypted **PCM** Expensive! Reduce storage overhead by using only two counters 0 4 4 0 0 0 4 4 Encrypted **PCM** 

Naïve implementation needs per-word counters/pads



Reduce storage overhead by using only two counters



Do efficient partial re-encryption with only two counters

# **DEUCE: DUAL COUNTER ENCRYPTION**

Each line has two counters:

- Leading counter (*LeadCTR*): incremented every write
- Trailing counter (*TrailCTR*): updated every N writes (*Epoch*)
- "Modified" bit per word to choose between counters





#### **DEUCE: OPERATION**



#### **DEUCE: OPERATION**



#### **DEUCE: OPERATION**



DEUCE re-encrypts only words modified since Epoch








































































**DEUCE** does partial re-encryption between Epochs



Get TrailCTR by masking 2 LSB off LeadCTR DEUCE needs only one physical counter per line



**DEUCE** does partial re-encryption between Epochs

DEUCE generates two pads with LCTR and TCTR

#### DEUCE generates two pads with LCTR and TCTR











### OUTLINE

- Introduction to PCM and encryption
- Background on Counter-mode Encryption
- DEUCE
- Results 🛑
- Summary



### **Core Chip**

- 8 cores, each 4GHz 4-wide core
- L1/L2/L3→32KB/256KB/1MB





Shared L4 Cache



#### Phase Change Memory

Shared L4 Cache:

- 64 MB capacity
- 50 cycle latency



# Phase Change Memory [Samsung ISSCC'12]

- 4 ranks, each 8GB → 32GB total
- Read latency 75ns
- Write latency 150ns (per 128-bit write slot)



#### Workloads

- SPEC2006: High MPKI, rate mode
- 4 billion instruction slice

### DEUCE: Epoch=32; Word size=2B

#### Encrypted+FNW









DEUCE eliminates two-thirds of the extra bit flips caused by encryption

















Bit flip reduction improves speedup and EDP

### **RESULTS: LIFETIME ANALYSIS**

Heavily-written bits still heavily-written with DEUCE

### **RESULTS: LIFETIME ANALYSIS**

Heavily-written bits still heavily-written with DEUCE Solution: Zero-cost "Horizontal" Wear Leveling



### **RESULTS: LIFETIME ANALYSIS**

Heavily-written bits still heavily-written with DEUCE Solution: Zero-cost "Horizontal" Wear Leveling



### OUTLINE

- Introduction to PCM and encryption
- Baseline–Counter-mode Encryption
- DEUCE
- Results
- Summary

# **EXECUTIVE SUMMARY**

- Bit flips expensive in Phase Change Memory (PCM)
  - Affects Lifetime, Power, and Performance
  - PCM system optimized to reduce bit flips (~12%)
- PCM more vulnerable to attacks 
   stolen module
- Secure PCM with encryption
- Problem: Encryption increases bit flips 12%→50%
- Goal: Encrypt PCM without causing 4x bit flips
- Insight: re-encrypt only modified words
- DEUCE: write-efficient encryption scheme
- Result: bit flips reduced 50%→23% (27% speedup)
# THANK YOU



#### **EXTRA SLIDES**

# **DEUCE FOR OTHER NVM**

- In-place writing NVM
  - RRAM
  - STT-MRAM

# WRITE SLOTS

#### PCM is power-limited

Write-slot of 128 Flip-N-Write-enabled bits (64 bit flips)



Average number of write slots used per write request. On average, DEUCE consumes 2.64 slots whereas unencrypted memory takes 1.92 slots out of the 4 write slots





#### Vertical Wear Leveling

• Vertical Wear Leveling (Start-gap)—Inter-line leveling





Vertical Wear Leveling

• Vertical Wear Leveling (Start-gap)—Inter-line leveling





Vertical Wear Leveling

• Vertical Wear Leveling (Start-gap)—Inter-line leveling





Vertical Wear Leveling

• Vertical Wear Leveling (Start-gap)—Inter-line leveling





Vertical Wear Leveling

• Vertical Wear Leveling (Start-gap)—Inter-line leveling





Vertical Wear Leveling

• Vertical Wear Leveling (Start-gap)—Inter-line leveling





Vertical Wear Leveling

• Vertical Wear Leveling (Start-gap)—Inter-line leveling





Vertical Wear Leveling

• Vertical Wear Leveling (Start-gap)—Inter-line leveling





Vertical Wear Leveling

- Vertical Wear Leveling (Start-gap)—Inter-line leveling
- Horizontal Wear Leveling—Intra-line leveling





Vertical Wear Leveling

- Vertical Wear Leveling (Start-gap)—Inter-line leveling
- Horizontal Wear Leveling—Intra-line leveling



Vertical Wear Leveling

- Vertical Wear Leveling (Start-gap)—Inter-line leveling
- Horizontal Wear Leveling—Intra-line leveling



- Vertical Wear Leveling (Start-gap)—Inter-line leveling
- Horizontal Wear Leveling—Intra-line leveling



Vertical Wear Leveling

- Vertical Wear Leveling (Start-gap)—Inter-line leveling
- Horizontal Wear Leveling—Intra-line leveling



Vertical Wear Leveling

- Vertical Wear Leveling (Start-gap)—Inter-line leveling
- Horizontal Wear Leveling—Intra-line leveling



- Vertical Wear Leveling (Start-gap)—Inter-line leveling
- Horizontal Wear Leveling—Intra-line leveling



- Vertical Wear Leveling (Start-gap)—Inter-line leveling
- Horizontal Wear Leveling—Intra-line leveling



Vertical Wear Leveling

- Vertical Wear Leveling (Start-gap)—Inter-line leveling
- Horizontal Wear Leveling—Intra-line leveling



Re-use vertical wear leveling to level inside line

# **QUESTION: IS IT SECURE?**

- Aren't you re-using the pad?
- AES-CTR-mode
  - Unique full pad per ever write.
- DEUCE
  - Epoch start, uses unique full pad
  - Between epochs, uses a unique full pad per write, to reencrypt modified words. Unmodified words simply not touched
  - →whole line is always encrypted, and pads are not re-used

### **MINIMUM RE-ENCRYPT**



### **DEUCE RE-ENCRYPT**



# **QUESTION: IS IT SECURE?**

- What about information leak?
- AES-CTR-mode
  - Can tell when a line is modified
- DEUCE
  - Can tell when a line is modified, and
  - Can sometimes tell when a word is modified
  - Similar information leakage

#### **CIPHER BLOCK CHAINING**



Cipher Block Chaining (CBC) mode encryption

#### **COUNTER-MODE ENCRYPTION**



Counter (CTR) mode encryption

Security bounds no worse than CBC (NIST-approved) Weakness to input is accepted to due to the underlying block cipher and not the mode of operation

#### Encrypted+FNW









DEUCE eliminates two-thirds of the extra bit flips caused by encryption



DEUCE eliminates two-thirds of the extra bit flips caused by encryption



DEUCE eliminates two-thirds of the extra bit flips caused by encryption