



ALMA MATER STUDIORUM UNIVERSITÀ DI BOLOGNA

# Security and safety using open-source hardware, the story so far

Frank K. Gürkaynak kgf@iis.ee.ethz.ch

**PULP Platform** Open Source Hardware, the way it should be!



@pulp\_platform >> pulp-platform.org



# In 2020 at a workshop at HiPEAC I gave a talk



**ETH** zürich

PULP PLATFORM Open Source Hardware, the way it should be!

# Will open-source hardware solve your security issues?

Frank K. Gürkaynak, ETH Zürich

CS<sup>2</sup> - 7<sup>th</sup> Workshop on Cryptography and Security in Computing Systems, 20.01.2020 Bologna, ITALY











### And at that time I said...





Security and safety using open-source hardware, the story so far - Crosscon 2024



### It has been 1621 days, what has changed?

- Open source is still a tool to help tackle security issues
  - It allows us to investigate the issues better
  - We can experiment with relevant systems easier
  - Share information more easily
  - Make changes and drive the developments openly (i.e. through RISC-V technical WGs)
- But that was about security, how about safety and reliability?
  - Exactly the same arguments can be made for safety and reliability as well
  - In addition, several concerns/solutions for one have the potential to help the other
- We no longer have to rely on vendor solutions, we can develop our own

Unfortunately, this does not really solve the problems

### Still need to develop our own solutions for safe, reliable & secure systems


# Safety, Security and Reliability: Not same, but related

[Figure: definitions of Security, Reliability and Safety and how they relate]



# Knowing how things exactly work is vital

From the "ZombieLoad" paper


From section 3.2, emphasis added for this presentation

"While we identified some necessary building blocks to observe the leakage (cf. Section 5), we **can only provide a hypothesis on why** the interaction of the building blocks leads to the observed leakage. As we could only observe data leakage on Intel CPUs, we assume that this is indeed an implementation issue (such as Meltdown) and not an issue with the underlying design (as with Spectre)."

Closed implementations hide/abstract many secrets from users

Ability to see inside and run experiments is vital for safety and security experts

M. Schwarz, M. Lipp, D. Moghimi, J. Van Bulck, J. Stecklina, T. Prescher, D. Gruss, "ZombieLoad: Cross-Privilege-Boundary Data Sampling", arXiv:1905.05726







### How can open source HW help?

- Know what is really inside
- More and independent verification of blocks
- Be able to experiment without constraints
- Share the information freely
- Fairer benchmarking

- Access to SoA systems to work on
- After all, open-source SW has proven useful; why should open-source HW be different?







# Counterpoint: Just because it is open, bugs won't go away

- Debian OpenSSL: predictable random number generator
  - On 2 May 2006, a patch was applied to Debian sources to fix uninitialized variables
  - These uninitialized variables were intentionally used for the random numbers
  - ... which generated seeds for (among others) RSA keys in the OpenSSL library
  - After the patch, there was very little entropy left. It was possible to guess (private) RSA keys
    - It took only 6 hours to generate all possible 4096-bit RSA keys using 32 Xeon cores.
  - The issue was discovered on 18 May 2008, almost exactly two years later
- A serious security bug remained in plain sight for two years
  - Although everything was open source and logged, nobody noticed!


David Ahmad, "Two Years of Broken Crypto: Debian's Dress Rehearsal for a Global PKI Compromise", IEEE Security & Privacy, vol. 6, no. 5, pp. 70-73, Sep/Oct 2008
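To make "very little entropy" concrete, here is a toy sketch of why such keys can simply be enumerated. It assumes, as was widely reported for this bug but is not stated on the slide, that the process ID (at most 32768 values) was the only varying input left. `weak_key` and `recover_pid` are hypothetical names, and the hash stands in for real RSA key generation.

```python
import hashlib

MAX_PID = 32768  # assumption: default upper bound for Linux process IDs


def weak_key(pid: int) -> bytes:
    """Toy key generator whose only varying input is the process ID."""
    return hashlib.sha256(b"toy-keygen" + pid.to_bytes(2, "big")).digest()


def recover_pid(observed_key: bytes):
    """Brute-force the whole seed space; return the matching PID or None."""
    for pid in range(1, MAX_PID + 1):
        if weak_key(pid) == observed_key:
            return pid
    return None


if __name__ == "__main__":
    victim = weak_key(4242)
    print("recovered seed:", recover_pid(victim))
```

With only tens of thousands of candidates per key type and size, an attacker can precompute every possible key once and look up any victim afterwards.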




# RISC-V: A free ISA to build SoA computer systems



- It is FREE
  - Everybody can build, sell, and make RISC-V cores available
  - The description is **FREE**, implementations can be FREE or proprietary
- It is a modern design, no historical baggage
  - Some common ISAs (ARM, Intel..) have been around for 20+ years; newer implementations still need to be compatible with older designs
  - RISC-V benefited from the mistakes made by others, cleaner design
  - Major design decisions have been properly motivated and explained
- Reserved space for extensions, modular
- Open standard, you can help decide how it is developed





# Are RISC-V processors better than XYZ?

- Actual performance depends on the implementation
  - RISC-V does not specify implementation details (on purpose)
- It is a modern design, should deliver comparable performance
  - If implemented well, it should perform as well as other modern ISA implementations
  - In our experiments, we see no weaknesses when compared to other ISAs
  - It also is not magically 2x better
- High-end processor performance is not much about ISA
  - Implementation details like technology capabilities, memory hierarchy, pipelining, and power management are more important.



# It is not that cores from XYZ are insecure or unreliable



- Most commercial processors have well thought out solutions
  - In all likelihood better than anything we have in open-source hardware
- But researchers do not always have access
  - Work and insights cannot be shared freely between researchers
  - Experimenting is limited, you work with what is given
  - Results and changes cannot be verified independently

This is where we expect the most from open-source hardware: Access



# A processor core is but one part of a modern SoC








# In a typical design, innovation is only in a limited scope





### **Open-source silicon-proven SoC template helps concentrate work where it counts**








# IC design process is complicated, with many stakeholders











# Reliability, Safety, Security are VERTICAL problems



| Abstraction Layer      | Example                   | Attacks                    |
|------------------------|---------------------------|----------------------------|
| Service                | E-Voting service          | Legal challenges           |
| Users                  | Voters                    | Social engineering         |
| Application            | Swiss Post                | Bugs / backdoors in SW     |
| Algorithms / libraries | RSA / OpenSSL             | Weaknesses in algorithms   |
| Operating Systems      | seL4                      | Privilege elevation        |
| Architecture           | NXP - i.M.                | Memory/cache attacks       |
| Microarchitecture      | Open                      | HW attacks on control flow |
| Digital Electronics    | Adders, gates             | Side channel leakage       |
| Physics                | Electrons, quantum states | Environment                |

Issues at ALL levels






### RISC-V and open source HW are great to support..

- Research in Reliability, Safety, Security
  - In the last 5 years more and more work on RISC-V based systems
- We need representative example systems to
  - Investigate/expose flaws/problems
  - Find solutions to fix these issues
  - Properly understand the cost/benefit trade-offs of these solutions
- But open-source and RISC-V alone are not sufficient
  - Reliability, Safety and Security need solutions at all levels.
  - Open source can help but is not sufficient
    - False expectations can actually hinder development
  - There is still more work to be done

### What has the PULP team done in these areas?




# Team of 100 people in ETH Zürich – University of Bologna


Research on open-source energy-efficient computing





# Our research focus: cluster-based many-core accelerators


### **Innovation factors**

#### Extensions to processor cores

- Explore new extensions
- **Efficient implementations**

#### **Shared-memory Accelerators**

- Domain specific
- Local memory

#### **Multiple computing clusters**

- Communication
- Synchronization

[Figure: Computing cluster with tightly coupled accelerators. Compute cores with extensions (EXT), an instruction cache, DMA and accelerators share multi-banked memory through a tightly coupled data memory interconnect. Clusters connect to L2 memory, peripherals, an external memory controller and additional accelerators over a high-speed on-chip interconnect (NoC, AXI, other..).]



### We have created a sandbox to design Systems-on-Chip





# Accelerating cryptographic functions



- Key challenge: I/O bandwidth
  - Not so difficult to design fast crypto HW
  - Need to match the rest of the system
  - Bandwidth to memory/bus is the issue
- Fulmine (UMC65)
  - 2 TCDM ports, 64 bits/cycle
  - AES unit (2 rounds/cycle)
    - 0.38 cpb (8 kByte block); Intel Xeon AES-NI 1.18 cpb
    - @0.8V and 84 MHz, 1.76 Gbit/s, 120 pJ per byte (chip)
  - Also SHA3 unit and other accelerators
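The Fulmine figures above are mutually consistent, which a few lines of arithmetic can confirm. This sketch only recomputes the numbers quoted on the slide; the function names are made up for illustration, and the implied power figure is a derived value, not one reported in the source.

```python
def throughput_gbit_s(freq_mhz: float, cycles_per_byte: float) -> float:
    """Throughput in Gbit/s of a unit that needs `cycles_per_byte` at `freq_mhz`."""
    bytes_per_second = freq_mhz * 1e6 / cycles_per_byte
    return bytes_per_second * 8 / 1e9


def ideal_cpb(rounds: int, rounds_per_cycle: int, block_bytes: int = 16) -> float:
    """Cycles per byte if the datapath is never stalled by I/O."""
    return (rounds / rounds_per_cycle) / block_bytes


def implied_power_mw(freq_mhz: float, cycles_per_byte: float, pj_per_byte: float) -> float:
    """Power implied by an energy-per-byte figure at full throughput."""
    bytes_per_second = freq_mhz * 1e6 / cycles_per_byte
    return bytes_per_second * pj_per_byte * 1e-12 * 1e3


if __name__ == "__main__":
    print(f"AES-128, 2 rounds/cycle, ideal: {ideal_cpb(10, 2):.4f} cpb")
    print(f"84 MHz at 0.38 cpb: {throughput_gbit_s(84, 0.38):.2f} Gbit/s")
    print(f"implied power at 120 pJ/byte: {implied_power_mw(84, 0.38, 120):.1f} mW")
```

The gap between the ideal 0.3125 cpb and the measured 0.38 cpb is the cost of moving data in and out, which is the point of the slide.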



### Let's look at how the accelerator works a bit more in detail

F. Conti et al., "An IoT Endpoint System-on-Chip for Secure and Energy-Efficient Near-Sensor Analytics," in IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 64, no. 9, pp. 2481-2494, Sept. 2017.



# For a typical AES accelerator (encryption)



- You need
  - 128 bits of input data (plaintext)
  - 10 rounds of operation (AES-128)
- This data will need
  - 128 bits of key (AES-128)
- And produce
  - 128 bits of output data (ciphertext)
- **Output dependence**
  - Some cryptographic modes (CFB, OFB, CBC) feed the output of one block into the processing of the next

- What are your options?
  - How many bits you process per step
    - 1/2/4/8/16/32/64/128
    - Let's take the maximum: 128 bits, one round every step
  - How many rounds you process per clock
    - 1/2/5/10
    - Let's take two rounds every clock cycle
- What is the I/O?
  - Assuming the key is taken care of
  - We need 256 bits (128 in, 128 out) every 5 cycles, or 51.2 bits/cycle

### You can easily get 10Gb/s as long as you can transfer 50+ bits/cycle
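The design-space arithmetic on this slide can be written down as a small sketch. The constants come from the slide (AES-128, 128-bit blocks, key handled separately); the function names and the derived clock frequency are illustrative assumptions.

```python
from fractions import Fraction

BLOCK_BITS = 128      # AES block size
AES128_ROUNDS = 10    # rounds for a 128-bit key


def io_bits_per_cycle(rounds_per_cycle: int) -> Fraction:
    """Plaintext in plus ciphertext out, averaged over the cycles one block takes."""
    cycles_per_block = Fraction(AES128_ROUNDS, rounds_per_cycle)
    return Fraction(2 * BLOCK_BITS) / cycles_per_block


def clock_mhz_for(target_gbit_s: float, rounds_per_cycle: int) -> float:
    """Clock frequency needed to encrypt `target_gbit_s` of plaintext."""
    cycles_per_block = AES128_ROUNDS / rounds_per_cycle
    blocks_per_second = target_gbit_s * 1e9 / BLOCK_BITS
    return blocks_per_second * cycles_per_block / 1e6


if __name__ == "__main__":
    for rpc in (1, 2, 5, 10):
        print(f"{rpc:2d} rounds/cycle -> {float(io_bits_per_cycle(rpc)):6.1f} I/O bits/cycle")
    print(f"10 Gbit/s at 2 rounds/cycle needs about {clock_mhz_for(10, 2):.1f} MHz")
```

Unrolling more rounds per cycle does not help unless the memory interface grows with it: a fully unrolled datapath would need 256 I/O bits every cycle.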





# Notice that it is the I/O bandwidth that is your issue



### How you can accelerate

#### **Extensions to processor cores**

- Limited by the processor bit-width
- Vector could help, but is large
- People mostly want this

#### **Shared-memory Accelerators**

- Can adapt memory I/O to your needs
- This is extremely efficient

#### **Independent Accelerators**

- Communication (Bus I/O)
- **Synchronization**
- Compute units are fairly small

High-speed on-chip interconnect (NoC, AXI, other..)



### For most cases, shared memory accelerators are very efficient, but overlooked





# Leakage resilient cryptography

- Reduce Attack surface
  - A new key (K<sup>\*</sup>) is generated per data block
- Encryption example (2PRG)
  - E function is AES
  - g is a finite field multiplication with 1<sup>st</sup>-order masking
  - Max throughput 5.29 Gbit/s @ 256 MHz
  - Needs 2x Block ciphers for same throughput
- Strong side channel resilience within IoT Power budget
  - Implemented and tested in Fulmine (last slide)

Robert Schilling, Thomas Unterluggauer, Stefan Mangard, Frank Gürkaynak, Michael Muehlberghuber, Luca Benini, "High-Speed ASIC Implementations of Leakage-Resilient Cryptography", DATE 2018
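The re-keying idea can be illustrated with a minimal sketch, under several loudly stated assumptions. The Python standard library has no AES, so `E` below is an HMAC-based stand-in and NOT the AES used on the slide. The constants `P_NEXT` and `P_BLOCK`, the function names, and the choice to XOR the per-block value onto the data are all assumptions of this sketch. The initial re-keying function g is not modeled.

```python
import hashlib
import hmac

BLOCK = 16  # bytes per block, matching the AES block size
P_NEXT = bytes(BLOCK)                    # hypothetical public constant: derives the next chain key
P_BLOCK = bytes([1]) + bytes(BLOCK - 1)  # hypothetical public constant: derives K* for the block


def E(key: bytes, block: bytes) -> bytes:
    """Stand-in for the block cipher E. The slide uses AES; this is NOT AES."""
    return hmac.new(key, block, hashlib.sha256).digest()[:BLOCK]


def two_prg_xor(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt `data`; every chain key is used for exactly two E calls."""
    out = bytearray()
    for offset in range(0, len(data), BLOCK):
        block_key = E(key, P_BLOCK)  # fresh K* for this block
        key = E(key, P_NEXT)         # advance the chain; the old key is never used again
        chunk = data[offset:offset + BLOCK]
        out += bytes(d ^ k for d, k in zip(chunk, block_key))
    return bytes(out)


if __name__ == "__main__":
    secret = bytes(range(16))
    message = b"leakage resilient encryption demo"
    ciphertext = two_prg_xor(secret, message)
    assert two_prg_xor(secret, ciphertext) == message
    print(ciphertext.hex())
```

Because each key only ever processes two fixed inputs, a side-channel attacker gets very few traces per key. The price is two cipher calls per data block, which matches the "2x block ciphers for same throughput" remark above.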

[Figure: 2PRG construction with keys $K$ and $K_0^*$]

### Attacks against the control flow



- Can be realized in both HW and SW
  - A successful attack on a processor changes the order of executed instructions
  - Can be used to execute malicious code or jump over security checks
- HW attacks can be realized by controlling environment
  - Clock or voltage glitches
  - Injecting electromagnetic pulses
- Small IoT devices more vulnerable
  - They operate in potentially hostile environment
  - Have fewer resources to withstand attacks from a capable adversary





- Sponge based construction to decrypt instructions
  - AEE Light with 32 bit state and 32 bit capacity in APE mode
  - Used **Prince** for permutation allowing single cycle execution



Attacker has to change instructions and state at the same time
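Why the attacker must change instructions and state together can be shown with a deliberately simplified toy. This is not AEE-Light, not APE mode, and not Prince: `permute` is a hash-based stand-in, and the duplex-style chaining is an assumption chosen for brevity. It only illustrates that decryption of each instruction depends on a state shaped by everything fetched before it.

```python
import hashlib

MASK32 = 0xFFFFFFFF


def permute(state: int) -> int:
    """Hypothetical 64-bit mixing function standing in for the real permutation."""
    digest = hashlib.sha256(state.to_bytes(8, "big")).digest()
    return int.from_bytes(digest[:8], "big")


def encrypt(program, state: int):
    """Encrypt 32-bit instructions; the upper state half acts as the keystream."""
    out = []
    for instr in program:
        state = permute(state)
        cipher = instr ^ (state >> 32)
        state = (cipher << 32) | (state & MASK32)  # absorb the ciphertext word
        out.append(cipher)
    return out


def decrypt(encrypted, state: int):
    """Mirror of encrypt(); only yields the program if state and code both match."""
    out = []
    for cipher in encrypted:
        state = permute(state)
        out.append(cipher ^ (state >> 32))
        state = (cipher << 32) | (state & MASK32)
    return out


if __name__ == "__main__":
    program = [0x00000013, 0x00100093, 0x00208133, 0x00008067]
    image = encrypt(program, 0x1234)
    image[1] ^= 1  # attacker flips one bit in memory
    print([hex(word) for word in decrypt(image, 0x1234)])
```

In this toy the tampered word itself only sees a bit flip, but every later instruction decrypts to unrelated bits, which on real hardware is likely to raise an illegal instruction trap within a few cycles. The actual scheme in the cited paper is stronger than this sketch.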



### Patronus: RISC-V system with CFI

- Additional pipeline stage in Ibex for decryption
  - LLVM based compilation flow
- Only 25-35% power/area overhead
- Additional instructions for branches added as instruction set extensions
- About 10% runtime overhead due to patches
- Probability of illegal instruction trap when instruction altered
  - 91.51% within 1 cycle
  - 99.19% within 2 cycles
  - 99.95% within 3 cycles





Mario Werner, Thomas Unterluggauer, David Schaffenrath, Stefan Mangard, "Sponge-Based Control-Flow Protection for IoT devices", 2018 IEEE European Symposium on Security and Privacy




### Securing covert channels



- Several attacks are based on passing information between tasks
- Covert channels are used to pass information between tasks
  - Most channels are based on state of hardware that is retained between task switches
  - Branch prediction history, caches, reorder buffers
- Attacks can be mitigated by 'securing' covert channels



### **Timing Channel**









# CVA6 (Ariane)



- Open-source 64-bit application-class RISC-V processor
- Boots Linux (or seL4)
- Developed by PULP team at ETH
- Now owned and maintained by OpenHW Group
- Widely used in academia and industry





[Figure: RTL simulation; axes: speed, turn-around time, cost]



### **Timing Channels on CVA6**






### fence.t instruction to clear microarchitectural state










### How the Microreset affects timing channels

[Figure: probability plots, panels "Unmitigated" and "fence.t (Microreset)"]

$N = 10^{6}$, $M = 21.7$ mb, $M_{0} = 27.8$ mb
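The figures above read like a mutual-information test on $N$ samples: $M$ would be the measured leakage in millibits and $M_0$ the level still explainable by chance, so $M < M_0$ means no evidence of a channel. That interpretation is an assumption, as is everything in the sketch below, which uses a plain discrete estimator and a shuffle-based baseline. The actual evaluation may bin or smooth the timing samples differently.

```python
import math
import random
from collections import Counter


def mutual_information_bits(samples) -> float:
    """Mutual information between input and output from (input, output) pairs."""
    samples = list(samples)
    n = len(samples)
    joint = Counter(samples)
    p_in = Counter(x for x, _ in samples)
    p_out = Counter(y for _, y in samples)
    total = 0.0
    for (x, y), count in joint.items():
        # p(x,y) / (p(x) p(y)) simplifies to count * n / (count_x * count_y)
        total += (count / n) * math.log2(count * n / (p_in[x] * p_out[y]))
    return total


def shuffled_baseline_bits(samples, rounds: int = 20, seed: int = 0) -> float:
    """Largest value seen after destroying any real input/output relation."""
    samples = list(samples)
    rng = random.Random(seed)
    inputs = [x for x, _ in samples]
    outputs = [y for _, y in samples]
    worst = 0.0
    for _ in range(rounds):
        rng.shuffle(outputs)
        worst = max(worst, mutual_information_bits(zip(inputs, outputs)))
    return worst


if __name__ == "__main__":
    leaky = [(0, 0), (1, 1)] * 50  # output always reveals the input
    m = mutual_information_bits(leaky)
    m0 = shuffled_baseline_bits(leaky)
    print(f"M = {m * 1000:.1f} mb, M0 = {m0 * 1000:.1f} mb, leak: {m > m0}")
```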




### When particles play tricks on you

- There are environments that are not friendly to Integrated Circuits
  - Space
  - High-energy physics experiments
- Particles can cause all kinds of issues
  - Flip bits in memory / FFs
  - Affect timing and corrupt next state calculations
  - Destroy part of the circuit

### More random (unlike for example fault attacks)

- It is about statistics, and statistics can help us
- Particle hits also affect circuits in the vicinity (spatial separation can help)
- Redundancy is the key approach. Find a balance between redundancy and performance
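How statistics help can be seen in a one-line model of triple redundancy with majority voting. The sketch assumes each replica fails independently with probability `p` within some time window. That independence is exactly what a particle hitting neighbouring circuits would break, hence the remark on spatial separation.

```python
def tmr_failure_probability(p: float) -> float:
    """Probability that at least two of three independent replicas fail."""
    return 3 * p**2 * (1 - p) + p**3


if __name__ == "__main__":
    for p in (1e-2, 1e-4, 1e-6):
        print(f"single: {p:.0e}  voted triple: {tmr_failure_probability(p):.3e}")
```

For small `p` the voted system fails roughly with probability 3p², so tripling the hardware buys a quadratic improvement rather than a linear one.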




### Take our 32bit Micro controller - PULPissimo




https://github.com/pulp-platform/pulpissimo

### Trikarenos – PULPissimo with Reliability









### Trikarenos – ASIC implementation in TSMC28




- 2 mm<sup>2</sup>@250 MHz
- 3 separate Ibex cores
- 256 KiB Memory in 8 word-interleaved banks
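Word interleaving means consecutive words land in different banks, so neighbouring accesses do not contend for the same bank. A minimal sketch of such an address mapping follows. It assumes 32-bit words and a simple modulo mapping; the real design may map addresses differently.

```python
WORD_BYTES = 4             # assumption: 32-bit words
NUM_BANKS = 8
TOTAL_BYTES = 256 * 1024   # 256 KiB


def bank_and_row(byte_addr: int):
    """Return (bank index, row inside that bank) for a byte address."""
    word = (byte_addr % TOTAL_BYTES) // WORD_BYTES
    return word % NUM_BANKS, word // NUM_BANKS


if __name__ == "__main__":
    for addr in range(0, 40, 4):
        print(f"0x{addr:05x} -> bank {bank_and_row(addr)[0]}, row {bank_and_row(addr)[1]}")
```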

[Figure: chip layout. Legend: cores; HMR unit; memory (w/ ECC en-/decode); interconnect; debugger; logging & control registers, ROM, ...]






### What is next for Trikarenos? Actually go to Space







### Carfield SoC Floorplan – Arrived this Tuesday





# The safety island in Carfield implemented in Intel16

- SentryCore Configuration
  - CV32RT, FPU (32bit), CLIC
  - TCLS, ECC Memory
  - 64 bit AXI interface, no DMA
  - Total 128 KiB ECC-protected Memory
- Physically separated TCLS Cores
  - 20 µm margins
  - Avoids multi-bit error from single particle
- Implementation Results
  - Clock Frequency: 500 MHz
  - Area: **0.42 mm**<sup>2</sup>
  - Power (preliminary): 50-70 mW




### Pre-demonstrator to show what can be achieved




### **Independent DMA unit**

Transfer of data in and out of the mega-IP


### **Triple-Core Lockstep**

Majority Voters on all outputs
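A bitwise majority voter is small enough to show in full. The sketch below models the voting on one output bus as integer arithmetic. The function names are illustrative, and the real lockstep logic additionally handles resynchronisation of a diverged core, which is not modeled here.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-out-of-3 majority of three replica outputs."""
    return (a & b) | (a & c) | (b & c)


def replicas_disagree(a: int, b: int, c: int) -> bool:
    """True if any replica differs, so an error can be logged or a core resynced."""
    return not (a == b == c)


if __name__ == "__main__":
    good = 0xDEADBEEF
    upset = good ^ (1 << 4)  # one core suffers a single bit flip
    print(hex(majority_vote(good, upset, good)), replicas_disagree(good, upset, good))
```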

### **ECC-protected Memory**

- 39-32 Hsiao Code for single error correction, double error detection
- CV32RT [Balas et al., 2023]
  - CV32E40P-based real-time core
- **Core-Local Interrupt Controller (CLIC)**
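Single error correction with double error detection on a 39-bit word holding 32 data bits can be sketched as follows. Note the swap: this is an extended Hamming code, not the Hsiao code named above. Both use 7 check bits for 32 data bits and offer the same correction capability; the Hsiao construction picks a different parity-check matrix that is cheaper in hardware. All names are illustrative.

```python
PARITY_POS = (1, 2, 4, 8, 16, 32)                             # Hamming check bits
DATA_POS = [p for p in range(1, 39) if p not in PARITY_POS]  # 32 data positions


def encode(data: int):
    """Encode 32 data bits into 39 bits; index 0 holds the overall parity."""
    word = [0] * 39
    for i, pos in enumerate(DATA_POS):
        word[pos] = (data >> i) & 1
    for p in PARITY_POS:
        word[p] = sum(word[pos] for pos in range(1, 39) if pos & p) & 1
    word[0] = sum(word) & 1
    return word


def decode(word):
    """Return (data, status); data is None when the error cannot be corrected."""
    word = list(word)
    syndrome = 0
    for pos in range(1, 39):
        if word[pos]:
            syndrome ^= pos  # XOR of set positions is zero for a valid word
    odd_overall = sum(word) & 1
    if syndrome and not odd_overall:
        return None, "double"         # two flips: detected, not correctable
    if syndrome >= len(word):
        return None, "uncorrectable"  # three or more flips
    status = "ok"
    if odd_overall:
        word[syndrome] ^= 1           # syndrome 0 means the parity bit itself
        status = "corrected"
    data = sum(word[pos] << i for i, pos in enumerate(DATA_POS))
    return data, status


if __name__ == "__main__":
    stored = encode(0xCAFEBABE)
    stored[17] ^= 1  # single particle-induced bit flip
    value, status = decode(stored)
    print(hex(value), status)
```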

### fastIRQ extension



# How open source helped us in our projects

- Start from a working system, no need to reinvent everything
  - Silicon proven SoC templates available
- Easy to extend with accelerators and custom blocks/memories
  - Platforms designed for heterogeneous acceleration (Occamy, Mempool)
- Not limited by previous design choices
  - Everything can be adapted and changed
- Full source code allows you to observe/record everything
  - Not limited to available performance counters/timers, build/add your own
- Possible to exploit the results commercially



# Most projects we have involve security, safety, reliability


- There is much more to be done
  - We have made a decent start
  - Different SoC templates available to implement a variety of solutions
  - And we have some great ideas
- We need partners to reach our goals
  - We are not experts in security, safety, reliability
  - We need partners to co-operate and learn from
- Open source hardware helps reach these goals
  - Easier to collaborate, fewer roadblocks
  - Collaborations do not block commercial exploitation (permissive licensing)

The future is bright



