# ReNoC: A Network-on-Chip Architecture with Reconfigurable Topology

Mikkel B. Stensgaard and Jens Sparsø, Technical University of Denmark




### Outline

- Motivation
- ReNoC
  - Basic Concepts
  - Physical Architecture
  - Logical Topology
  - Generalization
- Evaluation
- Conclusion

### Motivation

- System-on-Chips
  - Increasing ... Transistor count and complexity
  - Increasing ... Development time
  - Increasing ... Test time
  - Increasing ... Production costs



Pushes towards a general SoC platform

### General SoC Platform

- FPGA like platform for SoC
  - Pre-tested
  - Large volumes
  - Shorter time-to-market
- Domain specific SoC platforms
  - No single platform can be used for everything
- Typical IP-Blocks
  - RAMs, CPUs, IOs, FPGAs
  - Other coarse grained blocks
- Communication infrastructure
  - Flexible NoC



### Flexible NoC for Platform chip

- Challenge
  - Flexibility
    - Support a wide range of communication scenarios
    - QoS and other advanced features
  - Energy and area efficient
- Current Solution: Packet-switched NoC
  - General topology (typically 2D mesh)
  - Only fraction of total capacity is ever used
  - Large part of chip area and power



- Application specific topologies
  - Much more power and area efficient [Murali, Srinivasan]
  - Only possible for a single application

## **Switching Methods**

- Packet-switching
  - (Packets routed individually)
  - − Routing, buffering and arbitration are needed
  - + Links can be shared [Æthereal, Xpipes, and more]
- Physical circuit-switching
  - (Physical point-to-point connections)
  - + No routing, buffering or arbitration is needed
  - − Links are dedicated (no sharing)
    ["An energy-efficient reconfigurable circuit-switched network-on-chip", Wolkotte et al.]

|          | Packet-switching | Circuit-switching |
|----------|------------------|-------------------|
| Size     | -                | +                 |
| Energy   | -                | +                 |
| Flexible | +                | -                 |

## Reconfigurable NoC (ReNoC)

- Topology can be configured by application
  - Application specific topology
  - Minimize amount of packet-switching
- Best from packet- and circuit-switching
  - Energy efficiency from circuit-switching
  - Flexibility from packet-switching





## Physical Architecture

- Links
- Network nodes
  - Topology switch
  - Router
- Can use any existing router
  - Quality-of-Service
  - Virtual Channels
  - Clocked or Clockless



Simple physical architecture:

## **Topology Switches**

- Inserted as a layer between routers and links
- Goal: Minimal area and energy overhead
  - Infrequent configuration
  - Non-full connectivity

Example: Topology switch for 2D mesh

- 5 links/IP-block
- 5 router ports
- Full connectivity → 10x10 switch




- Router port → corresponding link
- Link → Any other Link (Except itself)
- Link → Router port
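
The restricted connectivity listed above can be made concrete with a small sketch. It enumerates the permitted source-to-destination connections of a topology switch at a 2D mesh node and compares the count with a fully connected 10x10 switch. The port names, and the reading that a link reaches only its own corresponding router port, are assumptions made for illustration and are not taken from the slides.

```python
from itertools import product

# Hypothetical port names: four mesh directions plus the local IP-block link.
DIRECTIONS = ["north", "east", "south", "west", "local"]

def allowed_connections(directions):
    """Return the set of (source, destination) pairs the switch permits."""
    allowed = set()
    for d in directions:
        # Router port -> corresponding link.
        allowed.add((("router", d), ("link", d)))
        # Link -> router port (assumed here: the corresponding port only).
        allowed.add((("link", d), ("router", d)))
    for src, dst in product(directions, repeat=2):
        # Link -> any other link, but never back onto itself.
        if src != dst:
            allowed.add((("link", src), ("link", dst)))
    return allowed

connections = allowed_connections(DIRECTIONS)
full_crossbar = (2 * len(DIRECTIONS)) ** 2  # 10 sources x 10 destinations

print(len(connections), "of", full_crossbar, "possible connections are kept")
```

Under these assumptions only a fraction of the full crossbar remains, which is the motivation for non-full connectivity.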



### **Implementation**

- Analogous to switch boxes in FPGAs
- Efficient implementations
  - Pass-gates, tristate buffers, or multiplexers
- Configured using
  - Serial interface, separate network or network itself
- Example: Topology switch for 2D mesh
  - Five 4-input multiplexers



## Logical Topology

- The application experiences this as a static topology
- Widely different topologies are possible
- Routers/links become a sharable resource
- Unused routers/links can be power- and clock-gated
- Logical links
  - Router to Router
  - IP-Block to IP-Block
  - IP-Block to Router
  - Local / long links



### Generalization

### Any Physical Topology

- Tree, Mesh, etc
- Heterogeneous
- Hierarchical
- Network Nodes
  - Router
  - Topology Switch
  - Topology Switch + Router
- Links
  - Uni- and bi-directional
  - Local and non-local
- Router
  - Fewer ports than the number of links, since the router is a sharable resource



### **Evaluation**

- Demonstrate ReNoC
- Evaluate overhead of Topology Switches
- (Configuration is not considered)
- Physical architecture:



### Application

Video Object Plane Decoder (VOPD) Application

["Mapping of MPEG-4 decoding on a flexible architecture platform", van der Tol and Jaspers]

### Task graph:



(Bandwidth in Mbit/second)

### **Architectures**

- Static mesh:
  - 2D mesh topology without topology switches
  - Used as reference
- ReNoC mesh:
  - ReNoC architecture configured as 2D mesh
  - Estimate overhead
- ReNoC specific:
  - ReNoC architecture configured with application specific topology
  - Estimate power savings


ReNoC specific:

### **Implementation**

#### Router

- Simple, Low power router @ 100 MHz, single-cycle
- Source-routed, input buffered, 32 bit flits
- 2 Virtual Channels per input port (4 flits deep)
- Credit-based flow-control
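
As a reminder of how credit-based flow control works in general, the following sketch models a single virtual channel with a 4-flit buffer, matching the buffer depth listed above. It is a generic textbook-style model, not the authors' router: the class and method names are hypothetical, and the latency of returning credits is ignored.

```python
from collections import deque

class CreditLink:
    """One virtual channel between an upstream sender and a downstream buffer."""

    def __init__(self, buffer_depth=4):
        self.credits = buffer_depth  # free downstream slots known to the sender
        self.buffer = deque()        # downstream input buffer

    def send(self, flit):
        """Send a flit if a credit is available; return True on success."""
        if self.credits == 0:
            return False             # no free slot: the sender must stall
        self.credits -= 1
        self.buffer.append(flit)
        return True

    def drain(self):
        """Downstream forwards one flit and returns a credit upstream."""
        if not self.buffer:
            return None
        flit = self.buffer.popleft()
        self.credits += 1
        return flit

link = CreditLink(buffer_depth=4)
accepted = [link.send(f"flit{i}") for i in range(5)]
print(accepted)             # the fifth flit is refused: no credits left
link.drain()                # one slot is freed, one credit returns
print(link.send("flit4"))   # the retry now succeeds
```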

### Topology Switch

- Multiplexer based
- Configuration by registers
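
A multiplexer-based topology switch configured by registers could behave roughly as in the sketch below. It is a behavioural illustration only: the class, the `ROUTER` sentinel and the port names are hypothetical, and only the outgoing-link side is modelled (the path from a link into a router port is left out).

```python
ROUTER = "router"  # sentinel: driven by the corresponding router port

class TopologySwitch:
    """Behavioural model with one select register per outgoing link."""

    def __init__(self, directions):
        self.directions = tuple(directions)
        # Configuration registers; None means the outgoing link is unused.
        self.select = {d: None for d in self.directions}

    def configure(self, out_link, source):
        """Write the select register of `out_link`."""
        if out_link not in self.select:
            raise ValueError(f"unknown link: {out_link}")
        if source == out_link:
            raise ValueError("a link cannot be connected to itself")
        if source != ROUTER and source not in self.directions:
            raise ValueError(f"unknown source: {source}")
        self.select[out_link] = source

    def outputs(self, link_in, router_out):
        """Return the value driven onto every outgoing link."""
        driven = {}
        for out_link, source in self.select.items():
            if source is None:
                driven[out_link] = None
            elif source == ROUTER:
                driven[out_link] = router_out.get(out_link)
            else:
                driven[out_link] = link_in.get(source)
        return driven

switch = TopologySwitch(["north", "east", "south", "west", "local"])
switch.configure("east", "west")   # bypass: west input feeds east output
switch.configure("north", ROUTER)  # north output driven by the router
result = switch.outputs(link_in={"west": "flit-A"}, router_out={"north": "flit-B"})
print(result["east"], result["north"], result["south"])
```

Because the registers are written only when the topology changes, the switch has no per-packet routing or arbitration logic in this model.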

### Technology

- 90 nm, low-leakage cells, 1 V
- Routers and topology switches were synthesized
- Power estimated using random-data at 20% utilization

#### Link

SPICE simulated

["A power and energy exploration of network-on-chip architectures", Banerjee et al]

## Area/Energy figures

| Module              | Area (mm²) | Energy/packet (pJ) | Idle Power (µW) |
|---------------------|------------|--------------------|-----------------|
| 5x5 Router          | 0.061      | 32                 | 136             |
| 5x5 Topology Switch | 0.007      | 0.6-0.8            | -               |
| Link                | -          | 21                 | -               |

- Router vs. topology switch
  - ~9 times larger
  - ~45 times more energy / packet
  - +Idle power
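
The two ratios can be checked directly from the table. The sketch below uses the quoted figures. Taking the midpoint of the 0.6-0.8 pJ range for the topology switch is an assumption made here, since the slide does not say which value the ratio is based on.

```python
# Figures from the table above.
router = {"area_mm2": 0.061, "energy_pj": 32.0}
topology_switch = {"area_mm2": 0.007, "energy_pj_range": (0.6, 0.8)}

area_ratio = router["area_mm2"] / topology_switch["area_mm2"]

# Assumption: use the midpoint of the quoted energy range.
low, high = topology_switch["energy_pj_range"]
mid_energy_pj = (low + high) / 2
energy_ratio = router["energy_pj"] / mid_energy_pj

print(f"area ratio: about {area_ratio:.1f}x")
print(f"energy ratio: about {energy_ratio:.1f}x")
```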

### Results

| Architecture   | Area (mm²) | Power (mW) |
|----------------|------------|------------|
| Static mesh    | 0.53       | 4.56       |
| ReNoC mesh     | 0.58       | 4.69       |
| ReNoC specific | 0.58       | 2.02       |

- ReNoC mesh vs. static mesh
  - Area increase: 10%
  - Power increase: 3%
- ReNoC specific vs. static mesh
  - Power decrease: 56%
  - Topology switches use 5% of power

(Note: Details can be found in article)
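
The percentages can be recomputed from the rounded table values, as in the sketch below. With two-decimal inputs the area increase comes out slightly below the 10% stated on the slide, which may simply reflect rounding of the published figures. The helper name is hypothetical.

```python
# Values from the results table (area in mm², power in mW).
results = {
    "static mesh":    {"area_mm2": 0.53, "power_mw": 4.56},
    "renoc mesh":     {"area_mm2": 0.58, "power_mw": 4.69},
    "renoc specific": {"area_mm2": 0.58, "power_mw": 2.02},
}

def percent_change(new, reference):
    """Relative change of `new` with respect to `reference`, in percent."""
    return 100.0 * (new - reference) / reference

ref = results["static mesh"]
area_increase = percent_change(results["renoc mesh"]["area_mm2"], ref["area_mm2"])
power_increase = percent_change(results["renoc mesh"]["power_mw"], ref["power_mw"])
power_decrease = -percent_change(results["renoc specific"]["power_mw"], ref["power_mw"])

print(f"ReNoC mesh area increase:      {area_increase:.1f}%")
print(f"ReNoC mesh power increase:     {power_increase:.1f}%")
print(f"ReNoC specific power decrease: {power_decrease:.1f}%")
```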

### Discussion

- Presentation focused on main ideas
- Additional issues include
  - Configuration of topology switches
  - Slowest logical link determines clock-frequency
  - Clock-skew
  - Few router ports were used in evaluation
  - High-performance (pipelining)
- Routers with fewer ports might be a choice
  - Ports become a sharable resource
  - Smaller routers, but general 2D mesh not possible



### **Future Work**

- Automatic generation of
  - Physical architectures
  - Logical topologies
- Topology switch implementations
- Configuration methods
  - Serial link
  - Separate network
  - Network itself

### Conclusion

- ReNoC enables logical topology to be configured
  - Application Specific topologies
  - Exploit knowledge of communication
- Best from packet- and circuit-switching
  - Efficiency from circuit-switching
  - Flexibility from packet-switching
- Enables general SoC platforms

## Thank you


## Results, detailed

| Architecture   | Area: routers (mm²) | Area: topology switches (mm²) | Area: total (mm²) | Power: routers (mW) | Power: topology switches (mW) | Power: links (mW) | Leakage power (mW) | Idle power (mW) | Total power (mW) |
|----------------|---------------------|-------------------------------|-------------------|---------------------|-------------------------------|-------------------|--------------------|-----------------|------------------|
| Static mesh    | 0.53                | -                             | 0.53              | 2.39                | -                             | 0.84              | 0.08               | 1.25            | 4.56             |
| ReNoC mesh     | 0.53                | 0.05                          | 0.58              | 2.39                | 0.12                          | 0.84              | 0.08               | 1.25            | 4.69             |
| ReNoC specific | 0.53                | 0.05                          | 0.58              | 0.65                | 0.09                          | 0.84              | 0.03               | 0.41            | 2.02             |
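
As a consistency check, the power components in each row can be summed and compared with the reported total. The sketch below does this with the table values and also derives the share of power spent in topology switches for the application specific configuration. The small deviation in the ReNoC mesh row is within what rounding to two decimals would explain.

```python
# (routers, topology switches, links, leakage, idle) in mW, plus reported total.
rows = {
    "Static mesh":    ([2.39, 0.00, 0.84, 0.08, 1.25], 4.56),
    "ReNoC mesh":     ([2.39, 0.12, 0.84, 0.08, 1.25], 4.69),
    "ReNoC specific": ([0.65, 0.09, 0.84, 0.03, 0.41], 2.02),
}

deviations = {}
for name, (parts, reported_total) in rows.items():
    deviations[name] = abs(sum(parts) - reported_total)
    print(f"{name}: sum {sum(parts):.2f} mW, reported {reported_total:.2f} mW")

# Share of total power used by topology switches in the specific topology.
specific_parts, specific_total = rows["ReNoC specific"]
switch_share = 100.0 * specific_parts[1] / specific_total
print(f"topology switch share: {switch_share:.1f}%")
```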

## Characterization, detailed

| Module          | Area (mm²) | Energy per packet (pJ) | Leakage (µW) | Idle power (µW) |
|-----------------|------------|------------------------|--------------|-----------------|
| Link, 1mm       | -          | 21                     |              | -               |
| 5x5 Router      | 0.061      | 32                     | 8.6          | 136             |
| Topology Switch | 0.007      | 0.6/0.8                | 0.7          | -               |
| 4x4 Router      | 0.047      | 31                     | 6.7          | 109             |
| Topology Switch | 0.005      | 0.6/1.1                | 0.6          | -               |
| 3x3 Router      | 0.032      | 30                     | 4.7          | 82              |
| Topology Switch | 0.003      | 0.6/1.3                | 0.3          | -               |

### Router



### Router Breakdown

| Module           | Area (µm²) | Energy per packet (pJ) | Leakage (µW) | Idle power (µW) |
|------------------|------------|------------------------|--------------|-----------------|
| Input Port       | 8900       | 21.1                   | 1.2          | 18.8            |
| Virtual Channel  | 4300       | 16.4                   | 0.6          | 8.7             |
| Output Port      | 1350       | 5.7                    | 0.15         | 6.3             |
| 5x5 Switch       | 3800       | 2.6                    | 0.4          | -               |
| VC Allocator     | 5100       | 1.6                    | 0.8          | 11.3            |
| Switch Allocator | 900        | 0.8                    | 0.13         | -               |
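
The breakdown can be cross-checked against the 0.061 mm² quoted for the 5x5 router elsewhere in this deck. The sketch below assumes that the area column is in µm², that the router has five input and five output ports, and that the virtual channel area is already included in the input port figure. None of these assumptions is stated on the slide.

```python
# Area per module instance, read as µm² (assumption).
AREA_UM2 = {
    "input_port": 8900,   # assumed to include its two virtual channels
    "output_port": 1350,
    "switch_5x5": 3800,
    "vc_allocator": 5100,
    "switch_allocator": 900,
}
PORTS = 5  # a 5x5 router: five input ports and five output ports

total_um2 = (
    PORTS * AREA_UM2["input_port"]
    + PORTS * AREA_UM2["output_port"]
    + AREA_UM2["switch_5x5"]
    + AREA_UM2["vc_allocator"]
    + AREA_UM2["switch_allocator"]
)
total_mm2 = total_um2 / 1e6  # 1 mm² = 1,000,000 µm²

print(f"estimated router area: {total_um2} µm² = {total_mm2:.3f} mm²")
```

Under these assumptions the sum agrees with the quoted router area to three decimals, which supports reading the column as µm².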