Final-Level Cache (FLC)

Fundamental High Memory Bandwidth Technology

FLC is an architecture that redefines memory on modern devices. It offloads traditional memory usage to less expensive flash memory and solid-state drives while using only a small amount of expensive DRAM as cache. It dramatically reduces the size, cost, and power requirements of anything from personal devices to servers capable of running generative AI.


ADVANCED & SCALABLE

Memory Hierarchy

High-bandwidth, moderate-capacity DRAM is inserted as a final-level cache (FLC) to enhance the performance of standard DDR memory. Additionally, the DDR memory can serve as a massive workload cache that hides the latency of storage (e.g., SSD) when storage is used as the final memory.
Optimal combination of bandwidth, latency, capacity, and power dissipation
Economical & energy-efficient way to build a petabyte-scale accessible DRAM/SSD pool

INNOVATIVE & DISRUPTIVE

Massive Cache Architecture

Very High (>95%) Hit Rate for FLC-1 High Bandwidth Cache; FLC-2 ~100% hit rate
Fully-associative look-up engine with a very large number of entries (e.g., 32K/64K for a 128MB cache)
Large cache line (e.g., 2KB, 4KB, 16KB, or larger)
Multi-level (2 or more) caching
Effective at inspecting & managing (i.e., masking or mapping out) defective or failing memory addresses
Cache DRAM or HBM3 for FLC level 1
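
The entry counts quoted above follow from dividing the cache size by the line size: 128MB with 4KB lines gives 32K entries, and with 2KB lines gives 64K. The sketch below checks that arithmetic and models a fully-associative look-up in software. It is only an illustration: the class name, the LRU replacement policy, and the access pattern are assumptions made for the example, not details of the FLC hardware.

```python
from collections import OrderedDict

KIB = 1024
MIB = 1024 * KIB


def num_entries(cache_bytes: int, line_bytes: int) -> int:
    """Number of cache lines (tag entries) for a given cache geometry."""
    return cache_bytes // line_bytes


class FullyAssociativeCache:
    """Toy fully-associative cache: any line may occupy any entry.

    LRU replacement is an assumption made for this sketch.
    """

    def __init__(self, cache_bytes: int, line_bytes: int):
        self.line_bytes = line_bytes
        self.capacity = num_entries(cache_bytes, line_bytes)
        self.lines = OrderedDict()  # line tag -> None, oldest first
        self.hits = 0
        self.misses = 0

    def access(self, address: int) -> bool:
        """Return True on a hit, False on a miss (the line is then filled)."""
        tag = address // self.line_bytes
        if tag in self.lines:
            self.lines.move_to_end(tag)  # mark as most recently used
            self.hits += 1
            return True
        self.misses += 1
        if len(self.lines) >= self.capacity:
            self.lines.popitem(last=False)  # evict the least recently used line
        self.lines[tag] = None
        return False

    def hit_rate(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


print(num_entries(128 * MIB, 4 * KIB))  # 32768 (32K entries)
print(num_entries(128 * MIB, 2 * KIB))  # 65536 (64K entries)

# Hypothetical workload: sequential 64-byte reads over a 64 KiB region.
cache = FullyAssociativeCache(128 * MIB, 4 * KIB)
for addr in range(0, 64 * KIB, 64):
    cache.access(addr)
print(cache.hit_rate())  # 0.984375 (16 line fills out of 1024 accesses)
```

In this toy workload the large line is what drives the hit rate: one miss fills 4KB, and the next 63 sequential 64-byte reads hit the same line.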


Memory Latency When Fully Active (Without FLC)

Low latency (the published spec, e.g., ~60 ns) when idle

High latency (e.g., >200 ns) when fully active

Why High-Bandwidth FLC Wins

Sufficient bandwidth available in FLC1 for full memory access requests

Economical & energy-efficient way to build a very large (petabyte-scale) accessible DRAM/SSD pool

What Happens When FLC-1 Misses?

Low latency from the almost idle DDR used as FLC-2

Much lower than a conventional implementation without FLC-1

Low FLC-2 activity (only the few percent of the time when FLC-1 misses)

Without FLC

Typical DDR memory has very high latency when fully active, and CXL memory adds the inherent overhead of CXL on top.

With FLC

The >95% hit rate of the high-bandwidth Cache DRAM results in significantly reduced latency, as shown in the second graph. Even on an FLC-1 High Bandwidth Cache miss, latency remains low because the DDR is lightly utilized, as shown in the third graph.
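
The latency argument can be approximated with a simple weighted average: most requests are served at the FLC-1 hit latency, and the remainder go to a lightly loaded DDR. The sketch below is a simplified model, not a measurement. The ~60 ns idle and >200 ns fully active DDR figures and the 95% hit rate are taken from this page; the 50 ns FLC-1 hit latency is a hypothetical placeholder, and the model ignores any extra look-up cost on a miss.

```python
def average_latency_ns(hit_rate: float, hit_latency_ns: float,
                       miss_latency_ns: float) -> float:
    """Weighted average latency for a single cache level in front of memory."""
    return hit_rate * hit_latency_ns + (1.0 - hit_rate) * miss_latency_ns


# Figures quoted on this page.
DDR_IDLE_NS = 60.0     # published spec, DDR nearly idle
DDR_LOADED_NS = 200.0  # DDR when fully active (stated as >200 ns)
FLC1_HIT_RATE = 0.95   # stated as >95%

# Hypothetical placeholder: FLC-1 hit latency is not given in the source.
FLC1_HIT_NS = 50.0

without_flc = DDR_LOADED_NS
with_flc = average_latency_ns(FLC1_HIT_RATE, FLC1_HIT_NS, DDR_IDLE_NS)

print(f"without FLC: {without_flc:.1f} ns")  # without FLC: 200.0 ns
print(f"with FLC:    {with_flc:.1f} ns")     # with FLC:    50.5 ns
```

Under these assumptions the miss path contributes only about 3 ns to the average, because it is taken 5% of the time and the DDR behind FLC-1 stays near its idle latency.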

>95% Cache Hit Rate

PROVEN PERFORMANCE BOOST

FLC Consumer Benchmark

Details

Test Conditions: GPU 12 Core / Quad-A77 2 GHz / Quad-A55 1.6 GHz / CMN-600 1066 MHz / LPDDR4X 3466 MT/s / IPM 1733 MT/s Enable/Disable

Software: Linux / FXBench 5.0 - Offscreen

Note: Similar conclusion on YOLOv3
