

#### Envisioning Post-CMOS Nanofabrics and Nanocomputing Approaches "An Architectural Perspective"

Csaba Andras Moritz UMASS ECE March 2<sup>nd</sup>, 2011









Center for Hierarchical Manufacturing, an NSF Nanoscale Science & Engineering Center





- Some key questions
  - How can we beat CMOS performance at the nanoscale?
  - 3D integration: what overlay precision is needed?
  - How many faults can we manage, and how?
  - Overcome cost trends. New business model for nano ICs?
  - Crossing physical domains for additional benefits
  - Putting it all together: vision for nanosystem fabric/platform
    - Lower cost, improve performance & power, and scale. Is it possible?

# How can we beat CMOS?

| State variable              | Electric charge | Electron spin: single spin | Electron spin: single domain | Electron spin: spin wave | Molecular state        | Phase change  |
|-----------------------------|-----------------|----------------------------|------------------------------|--------------------------|------------------------|---------------|
| Information encoding method | Field effect    | Magnetic field coupling    | Magnetic field coupling      | Magnetic field coupling  | Photon, heat, electron | Joule heating |
| Switching speed             | ~1 ps           | ~10 ps                     | ~10 ps                       | ~10 ps                   | 0.1 ps                 | 1 ns          |
| Power dissipation           | ~MW/cm²         | ~MW/cm²                    | 3.6 kW/cm²                   | <100 W/cm²               | ~MW/cm²                | ~MW/cm²       |

K. Galatsis et al., "Alternate state variables for emerging nanoelectronic devices," IEEE Transactions on Nanotechnology, vol. 8, no. 1, 2009.

- Faster/lower-power nanoscale switches are often seen as the key goal
  - But no breakthrough alternative is on the horizon
- Two possible mindsets we envision to beat CMOS performance
  - Integrated fabric mindset: assembling devices and interconnect together
  - New devices implementing complex logic functions vs. a simple switch




## Approach #1: Nanowire grids (NASICs)



#### Nanoscale Application Specific Integrated Circuits:

Nanowire grid-based but also graphene nanoribbon crossbars

- Integrated assembly of novel circuits: No arbitrary device sizing, placement
- ~30X density advantage; up to 10% defect rate and 30% parameter variation managed
- Experimental NASIC fabric prototype at UMass Amherst and UCLA (ongoing)
  - Scalable *in-situ*, *ex-situ* and direct patterning of nanowire arrays

# Approach #2: Logic Functions as the device



- Implement logic function in one step with a single device
  - High fan-in, high fan-out, and input multiplexing
- A generalized high fan-in multivalue threshold logic
- Spin Wave Functions



\*Key physical components of a spin-wave based computing fabric

- Leverage collective precession of spins in ferromagnetic materials
  - Encode information in amplitude and phase of spin-wave
  - Computation through interference
  - Waveguides for spin propagation
  - Magneto-electric (ME) cells for I/O and amplitude modulation
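
A minimal sketch of computation through interference, under idealized assumptions that are not stated on the slide: every input launches a wave of equal amplitude, logic 0 and 1 are encoded as phase 0 and π, and all waves reach the output with equal path length and no attenuation. The function name `superpose` is hypothetical.

```python
import cmath
import math

def superpose(bits, amplitude=1.0):
    """Superpose phase-encoded waves; return (majority bit, output amplitude).

    Assumed encoding: bit 0 -> phase 0, bit 1 -> phase pi.
    An odd number of inputs is required so the result is never a tie.
    """
    if len(bits) % 2 == 0:
        raise ValueError("use an odd number of inputs to avoid a tie")
    # Each input contributes a phasor; interference is their complex sum.
    total = sum(amplitude * cmath.exp(1j * math.pi * b) for b in bits)
    # A negative real part means the resulting phase is pi, read out as logic 1.
    out_bit = 1 if total.real < 0 else 0
    return out_bit, abs(total)

bit, amp = superpose([1, 1, 0])
print(bit, round(amp, 6))
```

In this toy model the output phase gives the majority of the inputs, while the output amplitude takes one of several discrete levels. That is one way to read the link between interference and high fan-in, multivalue threshold logic.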

### Benefits/Intuition: (7;3) Parallel Counter Design



M. Mehta, V. Parmar, E. Swartzlander, "High-speed multiplier design using multi-input counter and compressor circuits," ARITH, pp. 43-50, 1991.

#### 45nm Standard Cell Library based CMOS Layout for (7;3) Counter





Threshold logic (few threshold gates, but individual gates are highly complex)



P. Celinski et al., "Compact parallel (m,n) counters based on self-timed threshold logic," Electronics Letters 38, no. 13 (2002): 633-635.
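
To make the (7;3) counter concrete: it takes 7 one-bit inputs and outputs the 3-bit binary count of inputs that are 1. The sketch below uses one textbook-style construction with three weighted threshold gates, where earlier outputs feed later gates with negative weights. It is an illustration only and is not claimed to match the circuit in Celinski et al.; the names `threshold_gate` and `counter_7_3` are hypothetical.

```python
from itertools import product

def threshold_gate(inputs, weights, threshold):
    """Output 1 when the weighted sum of the inputs reaches the threshold."""
    return 1 if sum(w * x for w, x in zip(weights, inputs)) >= threshold else 0

def counter_7_3(x):
    """(7;3) parallel counter built from three threshold gates.

    Returns (s2, s1, s0) with 4*s2 + 2*s1 + s0 == number of ones in x.
    """
    assert len(x) == 7
    x = list(x)
    # Most significant bit: at least 4 inputs are high.
    s2 = threshold_gate(x, [1] * 7, 4)
    # Middle bit: subtract 4 when s2 fired, then check for at least 2.
    s1 = threshold_gate(x + [s2], [1] * 7 + [-4], 2)
    # Least significant bit: subtract what s2 and s1 already account for.
    s0 = threshold_gate(x + [s2, s1], [1] * 7 + [-4, -2], 1)
    return s2, s1, s0

# Exhaustive check over all 128 input combinations.
ok = all(4 * s2 + 2 * s1 + s0 == sum(x)
         for x in product([0, 1], repeat=7)
         for s2, s1, s0 in [counter_7_3(x)])
print(ok)
```

Only three gates are needed, but each has a fan-in of 7 to 9 with weighted inputs, which is the "few gates, complex gates" trade-off noted above.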



#### SPWF vs. CMOS Complexity

| Fabric    | Complexity                                     |
|-----------|------------------------------------------------|
| CMOS      | Transistor count ≈ 100                         |
| Spin Wave | No "device" to compute; I/O ME cell count = 13 |

- Significant reduction in logic complexity
- Translates into performance

### 3D Integration: How much precision needed?



- 3D integration and 2D functionalization require considering registration and overlay precision between process steps
  - How much overlay precision is needed, and how does it impact yield?
- We can mitigate with the choices we make: regular design & order of process steps
  - First mask may be 'offset' with tolerance since the underlying pattern is uniform (grid)
  - Overlay for subsequent litho masks must be precise ($3\sigma = \pm 5.7$ nm, per ITRS 2009)
  - 75% yield for such overlay in a nanoprocessor design at 10 nm nanowire pitch
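
A rough sketch of how overlay precision can be related to yield, assuming (this is not from the slides) that the overlay error of each mask is an independent zero-mean Gaussian with $3\sigma = 5.7$ nm, and that a mask step succeeds when the error stays within some tolerance. The tolerance values and mask count below are hypothetical, and the sketch does not attempt to reproduce the 75% figure, which comes from the authors' own design analysis.

```python
import math

def overlay_within_tolerance(tolerance_nm, three_sigma_nm=5.7):
    """Probability that a Gaussian overlay error lies within +/- tolerance_nm."""
    sigma = three_sigma_nm / 3.0
    return math.erf(tolerance_nm / (sigma * math.sqrt(2.0)))

def multi_mask_yield(tolerance_nm, num_masks, three_sigma_nm=5.7):
    """Yield when every one of num_masks independent overlays must succeed."""
    return overlay_within_tolerance(tolerance_nm, three_sigma_nm) ** num_masks

# Hypothetical tolerances, chosen only to show the trend.
for tol in (2.0, 3.0, 5.0):
    print(tol, round(multi_mask_yield(tol, num_masks=3), 3))
```

Under these assumptions yield falls quickly once the allowed misalignment drops below about $2\sigma$, which is why relaxing the first mask and keeping the design regular matters.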

# What about 3-D integration with CMOS?



- Requires mixing CMOS design rules with nano
  - CMOS lambda design rules for integration with metal stacks (ITRS 2009)
    - Determines via size and overhang, metal and nanowire spacing



CMOS layer



- N<sup>3</sup>ASIC built on a single SOI substrate with 3D Integration
  - Area-distributed interfacing using standard lithographic vias
  - Nanowire logic/memory tiles integrated with CMOS
  - <u>No special manufacturing constraints beyond bottom nanolayer</u>

#### Cost: Programmable N<sup>3</sup>ASIC "Fabric/Platform"



- Uniform with all devices at junctions programmable
- <u>Fabric/platform model</u>: potentially game-changing cost reduction
- We can program logic and SRAM-like memory on same fabric

# Dealing with Defects, Faults @ Nanoscale



- Perhaps it is time for a mindset change: CMOS-level 0.01 defects/cm<sup>2</sup> is not possible
- Aggressive <u>multi-level built-in masking</u>
  - Error correction masking integrated into physical fabric
- Runtime Re-Calibration enabled by Stochastic Resilience Sensors
  - Estimate fault rates at runtime, adjust with reconfiguration (inhibition, enhancement, etc.)

#### Crossing Physical Domains: Hybrid Spin-Charge Fabrics?



M. Sellier et al., "Predictive Delay Evaluation on Emerging CMOS Technologies: A Simulation Framework," in Quality Electronic Design, 2008. ISQED 2008. 9th International Symposium on, 2008, 492-497, 10.1109/ISQED.2008.4479784.



S. Rakheja, A. Naeemi, and J.D. Meindl, "Physical limitations on delay and energy dissipation of interconnects for post-CMOS devices," in Interconnect Technology Conference (IITC), 2010 International, 2010, 1-3, 10.1109/IITC.2010.5510448.

- Spin wave propagation may be inferior to charge transfer by 10X
  - Minimum-width wire delay at 45nm CMOS: ~10ps for 1µm length; spin wave delay: ~100ps/µm (Khitun); another study (Rakheja) also puts CMOS ahead by 10X
- <u>Communication may be more efficient in charge domain</u>
- Glue logic at low fan-in may be better in the charge-based domain
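
The delay figures quoted above can be turned into a back-of-the-envelope comparison. The sketch assumes delay scales linearly with length in both domains, which is a simplification introduced here (unrepeated RC wires, for example, do not scale linearly); `link_delay_ps` and the dictionary keys are hypothetical names.

```python
# Approximate per-micron delays taken from the figures quoted above.
DELAY_PS_PER_UM = {
    "charge": 10.0,      # ~10 ps for 1 um of minimum-width wire at 45nm CMOS
    "spin_wave": 100.0,  # ~100 ps/um spin wave propagation
}

def link_delay_ps(length_um, domain):
    """Delay of a link of the given length, assuming linear scaling."""
    return DELAY_PS_PER_UM[domain] * length_um

for length in (0.5, 1.0, 5.0):
    charge = link_delay_ps(length, "charge")
    spin = link_delay_ps(length, "spin_wave")
    print(length, charge, spin, spin / charge)
```

With these numbers the charge domain stays about 10X ahead at every length, consistent with keeping communication in the charge domain and using spin waves for dense local logic.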

#### Multi-domain IC platform: Spin-Charge-Memristor Fabric?



Ultimate vision: fully re-programmable multi-domain nanofabric/platform

- N3P: Hybrid programmable nanowire, spin wave function, memristor, and CMOS fabric; further functionalize with photovoltaics, sensing, etc.
- Post-manufacturing & runtime programming/calibration of hardware
  - Shifts chip fabless business model ... similar to software!
  - CPU, GPU, other logic share same platform: LOW COST!



#### Thank you!

# Nanoarch 2011 in San Diego, June 8-9 <u>http://www.nanoarch.org</u>