FHE Hardware

blogpage sketch fhe


Advances in specialized hardware (ASICs)

While algorithmic improvements boost FHE speed by about 8x per year, hardware developments compound this effect. There is growing interest in designing FHE-specific hardware.


ASIC stands for Application-Specific Integrated Circuit. ASICs are specialized, custom-designed chips built to perform a well-defined set of operations. Mainstream CPUs are designed to be reasonably performant across many types of computation. If you know beforehand which math operations you will run (and in what sequence), you can tailor your chip to be very performant on those.


GPUs play a similar role relative to CPUs: they are designed to be very performant on matrix multiplications (matmuls). That means GPUs are also advantageous for FHE computations that involve lots of matmuls. But FHE, in addition to matmuls, also relies on other operations, such as number-theoretic transforms (NTTs) and Chinese Remainder Theorem (CRT) conversions.


Therefore, there are opportunities for FHE ASICs that can perform better than generic GPUs by optimizing these operations.


An example accelerator design (HE-Booster, which runs on GPUs) is below. Notice that it includes modules for CRT, NTT, and some other HE-specific operations.


Pasted image 20250716181005.png
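To give a feel for what a CRT module does, here is a minimal Python sketch (it is not HE-Booster's actual code). It splits a large integer into small residues, multiplies residue-wise, and recombines the result with the Chinese Remainder Theorem. The moduli and function names are hypothetical and chosen only for illustration; real FHE libraries typically use several much larger primes.

```python
from math import prod

# Hypothetical toy moduli (pairwise coprime), for illustration only.
MODULI = [97, 101, 103]


def to_residues(x, moduli=MODULI):
    """Split one large integer into small residues, one per modulus."""
    return [x % m for m in moduli]


def mul_residues(a, b, moduli=MODULI):
    """Multiply residue-wise; each lane is independent of the others."""
    return [(x * y) % m for x, y, m in zip(a, b, moduli)]


def from_residues(residues, moduli=MODULI):
    """Recombine residues into an integer mod prod(moduli) using the CRT."""
    big_q = prod(moduli)
    total = 0
    for r, m in zip(residues, moduli):
        q_i = big_q // m
        # pow(q_i, -1, m) is the modular inverse of q_i mod m (Python 3.8+).
        total += r * q_i * pow(q_i, -1, m)
    return total % big_q


a, b = 1234, 567
result = from_residues(mul_residues(to_residues(a), to_residues(b)))
print(result == (a * b) % prod(MODULI))
```

Each residue lane works on small numbers and never talks to the other lanes, which is why this representation maps well onto parallel hardware.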


FHE ciphertexts are 40-1000x larger than the plaintext, meaning that ASICs need lots of memory. Additionally, FHE is embarrassingly parallel (a feature of lattice-based math), which creates other opportunities for special hardware.
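As a rough back-of-the-envelope check on the expansion factor, the sketch below estimates the size ratio for a CKKS-like ciphertext. It assumes a fresh ciphertext consists of two polynomials with ring_dim coefficients each, and that it packs ring_dim / 2 values. The parameter values are hypothetical and only meant to show where a number in the quoted range can come from.

```python
def ciphertext_expansion(ring_dim, modulus_bits, slots, bits_per_value):
    """Ratio of ciphertext size to the size of the plaintext values it packs.

    Assumes a ciphertext made of 2 polynomials, each with `ring_dim`
    coefficients of `modulus_bits` bits (an illustrative simplification).
    """
    ciphertext_bits = 2 * ring_dim * modulus_bits
    plaintext_bits = slots * bits_per_value
    return ciphertext_bits / plaintext_bits


# Hypothetical CKKS-like parameters: ring dimension 2^15, an 880-bit modulus,
# and 2^14 packed 64-bit values.
ratio = ciphertext_expansion(
    ring_dim=2**15, modulus_bits=880, slots=2**14, bits_per_value=64
)
print(ratio)  # 55.0
```

With fewer packed values per ciphertext the ratio grows quickly, since the ciphertext size stays the same while the useful payload shrinks.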


Hardware acceleration

a-high-level-technical-overview-of-fully-homomorphic-encryption--archived


One major aspect of FHE research is the work on designing custom hardware to accelerate FHE operations. There are many projects, but let me summarize a few I am more familiar with.


The most prominent group of projects are part of a DARPA program called DPRIVE. In short, DARPA is funding FHE hardware designers with a challenge to perform logistic regression, CNN inference, and CNN training as quickly as possible in FHE. There are currently four participants:



All DPRIVE participants are working on ASICs that accelerate arithmetic FHE schemes (BFV/BGV and CKKS), and they are at various points along the path toward fabrication. Their initial performance claims are based on simulation. To trumpet the horn of the underdog, Niobium: their initial paper claims a 5,000x speedup over CPU, with logistic regression on a 1,024-sample, 10-feature dataset estimated to take 40 seconds versus 60 hours on CPU. To me this is a lower bound on what’s possible with hardware acceleration. At the core of most of these accelerators is the acceleration of number-theoretic transforms (NTTs) and other polynomial operations in the relevant polynomial rings. To my understanding, the hard part of building these accelerators is packing in enough RAM to store all of the ciphertexts and auxiliary key material while still getting good memory locality.
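To make the NTT workload concrete, here is a minimal Python sketch of polynomial multiplication in the ring Z_q[x]/(x^n + 1) using a recursive NTT. The parameters (q = 17, n = 8, psi = 3) are toy values chosen only so the math works out by hand; real schemes use much larger primes and ring dimensions, and real accelerators use iterative, heavily pipelined variants rather than this recursion.

```python
Q = 17    # toy prime, chosen so that 2 * N divides Q - 1
N = 8     # ring dimension (a power of two)
PSI = 3   # primitive 2N-th root of unity mod Q, so PSI**N == -1 (mod Q)


def ntt(a, omega, q=Q):
    """Recursive Cooley-Tukey NTT; omega is a primitive len(a)-th root of unity."""
    n = len(a)
    if n == 1:
        return list(a)
    omega_sq = omega * omega % q
    even = ntt(a[0::2], omega_sq, q)
    odd = ntt(a[1::2], omega_sq, q)
    out = [0] * n
    w = 1
    for k in range(n // 2):
        t = w * odd[k] % q
        out[k] = (even[k] + t) % q
        out[k + n // 2] = (even[k] - t) % q
        w = w * omega % q
    return out


def intt(a, omega, q=Q):
    """Inverse NTT: forward transform with omega^-1, then scale by n^-1."""
    inv_n = pow(len(a), -1, q)
    return [x * inv_n % q for x in ntt(a, pow(omega, -1, q), q)]


def negacyclic_mul(a, b, q=Q, psi=PSI):
    """Multiply a and b modulo (x^n + 1, q) with pointwise products in NTT form."""
    omega = psi * psi % q
    # Twisting by powers of psi turns cyclic convolution into negacyclic.
    a_t = [x * pow(psi, i, q) % q for i, x in enumerate(a)]
    b_t = [x * pow(psi, i, q) % q for i, x in enumerate(b)]
    c_hat = [x * y % q for x, y in zip(ntt(a_t, omega, q), ntt(b_t, omega, q))]
    c_t = intt(c_hat, omega, q)
    psi_inv = pow(psi, -1, q)
    return [x * pow(psi_inv, i, q) % q for i, x in enumerate(c_t)]


def schoolbook_negacyclic_mul(a, b, q=Q):
    """O(n^2) reference implementation, used here only as a correctness check."""
    n = len(a)
    c = [0] * n
    for i in range(n):
        for j in range(n):
            if i + j < n:
                c[i + j] = (c[i + j] + a[i] * b[j]) % q
            else:
                # x^n wraps around to -1 in this ring.
                c[i + j - n] = (c[i + j - n] - a[i] * b[j]) % q
    return c


a = [1, 2, 3, 4, 5, 6, 7, 8]
b = [8, 7, 6, 5, 4, 3, 2, 1]
print(negacyclic_mul(a, b) == schoolbook_negacyclic_mul(a, b))
```

The transform turns an O(n^2) polynomial product into pointwise multiplications plus O(n log n) butterflies, and those butterflies are the kind of operation these chips are built around.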


Another project I’m familiar with is an FPGA-based approach to accelerating CGGI, which goes by the name FPT, out of the COSIC research lab at KU Leuven. They use an Alveo U280, and they functionally bootstrap pipelined batches of 16 ciphertexts at a time to achieve a throughput of 1 bootstrap / 35 microseconds. I’ve seen their live demo, in which they run Conway’s Game of Life in CGGI, and the animation is effectively in real time. Unlike the NTT-crunching machines from the DPRIVE program, FPT is an FFT-crunching machine. Naturally, this project starts from the TFHE-rs API for CGGI.
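For contrast with the NTT approach, here is a minimal Python sketch of the FFT route: multiplying two integer polynomials modulo x^n + 1 with a floating-point FFT and rounding at the end. This is a generic illustration and not FPT's or TFHE-rs's actual implementation; the function names are hypothetical, and real implementations have to manage floating-point error much more carefully than a plain round().

```python
import cmath


def fft(a, invert=False):
    """Recursive radix-2 FFT over complex numbers (unnormalized)."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def negacyclic_mul_fft(a, b):
    """Multiply integer polynomials a and b modulo x^n + 1 using the FFT."""
    n = len(a)
    # Twist by powers of psi = exp(i*pi/n); psi^n = -1 gives the wraparound sign.
    twist = [cmath.exp(1j * cmath.pi * i / n) for i in range(n)]
    fa = fft([x * t for x, t in zip(a, twist)])
    fb = fft([x * t for x, t in zip(b, twist)])
    fc = [x * y for x, y in zip(fa, fb)]
    c_t = [x / n for x in fft(fc, invert=True)]
    # Undo the twist and round away the floating-point error.
    return [round((x / t).real) for x, t in zip(c_t, twist)]


print(negacyclic_mul_fft([1, 2, 3, 4], [5, 6, 7, 8]))  # [-56, -36, 2, 60]
```

The trade-off is that the FFT works with approximate complex numbers instead of exact modular integers, so the result is only correct as long as the accumulated rounding error stays below one half.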


Then there are approaches I’m less familiar with. The folks at Intel have a HEXL project that focuses on targeting Intel CPUs using AVX and similar modern CPU fanciness. There are also folks at NVIDIA working on GPU acceleration, and the HEaaN library (CKKS) also supports GPU acceleration. There is also a company called Optalysys that is building an optical computing chip for FHE. The idea there is that, by using interference patterns of light passing through lenses (or rather, nanoscale equivalents), one can compute Fourier transforms “at the speed of light,” and in doing so accelerate bootstrapping.


And finally, I’m working on my own hardware acceleration approach: CGGI on TPUs. This is in an open source library called jaxite (named so because it’s written in JAX). The performance is nothing to write home about yet, but my hope is that if I can get performance to be 10-100x faster than CPU, then I can use the fact that Google already has TPUs deployed at scale to ship some FHE products before more intense hardware acceleration is ready at scale.


For some more details on these and other accelerators I know less about, see this paper, “SoK: Fully Homomorphic Encryption Accelerators”.












FHE Hardware Startup Ecosystem

A new wave of companies is building FHE-specific acceleration:


Niobium Microsystems

  • Raised $5.5M in 2024 for FHE accelerator chips

  • Focus: Custom ASICs optimized for lattice-based cryptography

  • Potential: 1,000-10,000× speedup for FHE operations


Fabric Cryptography

  • Building custom silicon for cryptographic workloads

  • FHE-optimized processors with specialized arithmetic units

  • Targeting cloud deployment and edge computing


Agita Labs

  • Hardware acceleration for privacy-preserving computation

  • Focus on making FHE practical for real-time applications


Duality Technologies

  • Software-hardware co-design approach

  • Real deployments in healthcare (patient privacy) and finance (fraud detection)

  • Notable: the CTO founded the PALISADE library, and the chief cryptographer is a developer of the BGV scheme



From Fully Homomorphic Encryption to Silicon - What is Microsoft's HEAX?


https://x.com/i/grok/share/zUMo4KqWBszye3FbM4Qv8ozjW


https://www.jeremykun.com/2024/05/04/fhe-overview/#hardware-acceleration



Niobium - https://www.biometricupdate.com/202405/niobium-raises-5-5m-to-develop-fully-homomorphic-encryption-accelerator-chip


Chips to Compute With Encrypted Data Are Coming

  • FHE Hardware-accelerator startups:

    • https://niobiummicrosystems.com/

    • https://www.fabriccryptography.com/

    • https://agitalabs.com/

    • https://dualitytech.com/

      • preserving patient privacy in healthcare

      • financial firms, check for fraud

      • The CTO of Duality is also the founder of the PALISADE library, while their chief cryptographer is a developer of a leading FHE scheme called BGV.



FHE Market Landscape



Pasted image 20250714153614.png

  • https://x.com/Dod_2206/status/1943312631227650410

  • TFHE-rs (github)

    • https://docs.zama.ai/tfhe-rs






Barış Özmen © 2025