Massoud Pedram

Stephen and Etta Varra Professor

Department of EE-Systems

University of Southern California


Current Projects

Design of Modular Multiplication

Development of efficient Montgomery/Barrett modular multipliers that reduce computation latency or save hardware area. The work includes a new algorithm that computes the quotient and the intermediate result in parallel, and techniques that speed up evaluation of expressions such as (A+B)*C.
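For orientation, the following is a minimal software sketch of textbook Montgomery multiplication. It is not the parallelized algorithm developed in this project. The function names and the choice R = 2**k with k equal to the bit length of the modulus are assumptions made for illustration.

```python
def montgomery_params(n: int):
    """Return (k, n_prime, r2) for an odd modulus n, with R = 2**k > n."""
    if n % 2 == 0 or n < 3:
        raise ValueError("modulus must be odd and at least 3")
    k = n.bit_length()
    r = 1 << k
    n_prime = (-pow(n, -1, r)) % r   # n * n_prime == -1 (mod R)
    r2 = (r * r) % n                 # used to map operands into Montgomery form
    return k, n_prime, r2


def redc(t: int, n: int, k: int, n_prime: int) -> int:
    """Montgomery reduction: return t * R^-1 mod n for 0 <= t < R * n."""
    mask = (1 << k) - 1
    q = ((t & mask) * n_prime) & mask   # quotient: makes t + q*n divisible by R
    u = (t + q * n) >> k                # exact division by R is a shift
    return u - n if u >= n else u


def mont_modmul(a: int, b: int, n: int) -> int:
    """Compute (a * b) mod n using Montgomery multiplication."""
    k, n_prime, r2 = montgomery_params(n)
    a_m = redc((a % n) * r2, n, k, n_prime)   # a * R mod n
    b_m = redc((b % n) * r2, n, k, n_prime)   # b * R mod n
    p_m = redc(a_m * b_m, n, k, n_prime)      # a * b * R mod n
    return redc(p_m, n, k, n_prime)           # leave Montgomery form
```

In this textbook form the quotient q must be known before the intermediate result u can be formed, which is the kind of dependency a hardware design has to shorten to reduce latency.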

Related work:

B. Zhang, Z. Cheng and M. Pedram, “High-Radix Design of a Scalable Montgomery Modular Multiplier with Low Latency,” in IEEE Transactions on Computers

Realizing DNNs with Fixed-Function Combinational Logic

Algorithm-Architecture Co-Design for Energy-Efficient and Reliable Machine Learning Models

Sponsor: National Science Foundation (NSF)

The broader scope of this research includes: a) Energy-efficient architecture and algorithm co-design for DNN training to yield compressed models, b) Efficient model compression to retain its robustness, c) Model compression of brain-inspired deep SNNs.

Related work:

Souvik Kundu, Mahdi Nazemi, Massoud Pedram, Keith M. Chugg, and Peter A. Beerel, “Pre-defined Sparsity for Low-Complexity Convolutional Neural Networks,” in IEEE Transactions on Computers, 2020.

Souvik Kundu, Mahdi Nazemi, Peter Beerel, Massoud Pedram, “DNR: A Tunable Robust Pruning Framework Through Dynamic Network Rewiring of DNN,” in Proc. of ASP-DAC, 2021.

Souvik Kundu, Gourav Datta, Massoud Pedram, Peter Beerel, “Spike-Thrift: Towards Energy-Efficient Deep SNNs by Limiting Spiking Activity via Attention Guided Compression,” in WACV, 2021.

Energy-Efficient, Low-Latency Realization of Neural Networks Through Boolean Logic Minimization

Sponsor: TBD

To cope with the computational and storage complexity of deep neural networks, this project focuses on a training method that enables a radically different realization of deep neural networks through Boolean logic minimization. This realization completely removes the energy-hungry step of accessing memory to obtain model parameters, consumes about two orders of magnitude fewer computing resources than realizations that use floating-point operations, and has substantially lower latency.
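The idea of replacing stored parameters with logic can be illustrated on a single neuron. The sketch below assumes a neuron with binary (0/1) inputs and a step activation. It enumerates the neuron's truth table once, after which inference needs only the resulting Boolean function and no weights. The function names and the example weights are hypothetical, and the logic minimization step itself is omitted.

```python
from itertools import product


def neuron(inputs, weights, bias):
    """Reference neuron: step activation over a weighted sum of 0/1 inputs."""
    return int(sum(w * x for w, x in zip(weights, inputs)) + bias >= 0)


def to_minterms(weights, bias):
    """Enumerate all input patterns and keep those for which the neuron fires.

    The returned set is the on-set of the neuron's Boolean function, which is
    the form a two-level logic minimizer would take as input.
    """
    n = len(weights)
    return frozenset(
        bits for bits in product((0, 1), repeat=n) if neuron(bits, weights, bias)
    )


def eval_logic(minterms, inputs):
    """Evaluate the neuron as pure logic: no weights or bias are read here."""
    return int(tuple(inputs) in minterms)


# Hypothetical example parameters, chosen only for illustration.
WEIGHTS = [2, -1, 1]
BIAS = -1
ON_SET = to_minterms(WEIGHTS, BIAS)
```

Exhaustive enumeration grows exponentially with the number of inputs, so it is practical only for neurons with small fan-in.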

Related work:

M. Nazemi, A. Fayyazi, A. Esmaili, A. Khare, S. N. Shahsavani, and M. Pedram, “NullaNet Tiny: Ultra-low-latency DNN Inference Through Fixed-function Combinational Logic,” Proc. of The 29th IEEE Int'l Symp. On Field-Programmable Custom Computing Machines, May 2021.

M. Nazemi, G. Pasandi, and M. Pedram, “Energy-efficient, low-latency realization of neural networks through Boolean logic minimization,” Asia and South Pacific Design Automation Conference (ASP-DAC), 2019. (Best Paper Award)

Coldflux: Tools For SFQ Logic Design

Placement and Clock Network Synthesis for Single Flux Quantum (SFQ) Logic Circuits

Sponsor: Intelligence Advanced Research Projects Activity (IARPA)

In this project, new algorithms for placement and clock network synthesis of large-scale SFQ circuits are developed and implemented. The goal is to maximize circuit performance, measured as maximum clock frequency, while accounting for the special characteristics of SFQ technology.

Related work:

SN Shahsavani, TR Lin, A Shafaei, CJ Fourie, M Pedram, “An Integrated Cell Placement and Interconnect Synthesis Tool for Large-scale SFQ Logic Circuits,” IEEE Transactions on Applied Superconductivity 27 (4), 1-8.

SN Shahsavani, A Shafaei, M Pedram, “A placement algorithm for superconducting logic circuits based on cell grouping and super-cell placement,” Design, Automation & Test in Europe Conference & Exhibition (DATE), 2018.

Minimizing the Longest Routing Wires of Large-Scale Flux Circuits

Sponsor: Intelligence Advanced Research Projects Activity (IARPA)

Development of a routing tool for single flux quantum (SFQ) circuits. The tool finds wire paths for all nets of an SFQ circuit while satisfying design rule checks. It is further optimized to identify the critical nets, those with the longest routed wire length, and to re-route these nets for a shorter wire length. The re-route step uses a maze routing algorithm, which searches for the lowest-cost wire path for each individual net.
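As a rough illustration of maze routing, the sketch below runs a breadth-first wave expansion (in the style of Lee's algorithm) on a small grid. It assumes a single routing layer, unit cost per grid step, and a hypothetical grid encoding in which 1 marks a blocked cell. The cost model and design rule handling of the actual tool are not reproduced here.

```python
from collections import deque


def maze_route(grid, src, dst):
    """Return a shortest list of (row, col) cells from src to dst, or None.

    grid[r][c] == 1 marks a blocked cell (an obstacle or an already routed
    wire); 0 marks a free cell. Every step costs one unit, so breadth-first
    search finds a lowest-cost path under that assumption.
    """
    rows, cols = len(grid), len(grid[0])
    prev = {src: None}          # back-pointers, also serves as the visited set
    queue = deque([src])
    while queue:
        cell = queue.popleft()
        if cell == dst:
            # Walk the back-pointers to recover the path.
            path = []
            while cell is not None:
                path.append(cell)
                cell = prev[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] == 0 and (nr, nc) not in prev:
                prev[(nr, nc)] = cell
                queue.append((nr, nc))
    return None                 # dst is unreachable
```

With non-uniform costs, for example a penalty for bends or congested regions, the queue would be replaced by a priority queue as in Dijkstra's algorithm.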

Related work:

Efficient Synthesis and Realization of Single Flux Quantum (SFQ) Logic Circuits

Sponsor: Intelligence Advanced Research Projects Activity (IARPA)

This project develops computer-aided design (CAD) software tools to support design automation of superconducting single flux quantum circuits. The main focus is on logic-level and behavioral-level synthesis. The work includes designing algorithms, implementing them in C/C++ and Python, and verifying the generated circuits through circuit simulation.

Related work:

G. Pasandi and M. Pedram, “PBMap: A Path Balancing Technology Mapping Algorithm for Single Flux Quantum Logic Circuits,” in IEEE Trans. on Applied Superconductivity, Vol. 29, Issue 4, June 2019.

Optimization of Single Flux Quantum (SFQ) Logic Cells

Sponsor: Intelligence Advanced Research Projects Activity (IARPA)

Due to fabrication uncertainties, it is important to have cells that can tolerate variations. For this project, we introduce a hybrid optimization algorithm for improving critical parameter margins of the cells. With Monte Carlo simulations, we show that increased critical margins improve the parametric yield.
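The link between margins and parametric yield can be illustrated with a toy Monte Carlo experiment. The sketch below assumes that each cell parameter deviates from its nominal value by an independent Gaussian amount and that a cell works when every parameter stays within its margin. This pass/fail model and all names are assumptions made for illustration. It stands in for the circuit simulations a real flow would use, and it does not implement the hybrid optimization algorithm.

```python
import random


def estimate_yield(margins, sigma, trials=5000, seed=0):
    """Estimate the fraction of sampled cells whose parameters all stay in margin.

    margins: per-parameter tolerance as a fraction of nominal (0.2 means +/-20%).
    sigma:   relative standard deviation of the fabrication spread (assumed Gaussian).
    """
    rng = random.Random(seed)   # fixed seed keeps the estimate reproducible
    passed = 0
    for _ in range(trials):
        # Draw one relative deviation per parameter and check it against its margin.
        if all(abs(rng.gauss(0.0, sigma)) <= m for m in margins):
            passed += 1
    return passed / trials


# Three parameters with a 10% spread: narrow versus widened margins.
narrow = estimate_yield([0.1, 0.1, 0.1], sigma=0.1)
wide = estimate_yield([0.3, 0.3, 0.3], sigma=0.1)
```

Under these assumptions the estimate for the widened margins comes out higher than for the narrow ones, which mirrors the trend described above.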

Related work:

Advanced Cell Design, Characterization and Re-configurable Circuits for Single Flux Quantum Technology

Sponsor: Intelligence Advanced Research Projects Activity (IARPA)

Superconducting electronics lacks a three-terminal device comparable to the MOSFET in CMOS circuits, which makes it difficult to conceive a superconducting FPGA, a platform that would provide significantly cheaper solutions for various applications. Our work focuses on designing FPGAs for superconducting circuits using magnetic Josephson junctions and energy-efficient RSFQ biasing.

Related work:

N. K. Katam and M. Pedram, "Logic Optimization, Complex Cell Design, and Retiming of Single Flux Quantum Circuits," in IEEE Transactions on Applied Superconductivity, vol. 28, no. 7, pp. 1-9, Oct. 2018, Art no. 1301409.

N. K. Katam, O. A. Mukhanov and M. Pedram, "Superconducting Magnetic Field Programmable Gate Array," in IEEE Transactions on Applied Superconductivity, vol. 28, no. 2, pp. 1-12, March 2018, Art no. 1300212.

N. K. Katam and M. Pedram, "Timing Characterization for Static Timing Analysis of Single Flux Quantum Circuits," in IEEE Transactions on Applied Superconductivity. doi: 10.1109/TASC.2019.2891166

Intelligent Arithmetic Circuit Recognition

Sponsor: Intelligence Advanced Research Projects Activity (IARPA)

We address the problem of deriving a functional description of a circuit from an unstructured netlist by leveraging deep learning and circuit representations based on convolutional neural networks (CNNs). We are motivated by the state-of-the-art performance of machine learning (ML) techniques, based on both convolutional and deep neural networks, in solving challenging problems such as classification, pattern recognition, language processing, and decision making in a variety of applications, from business to social work, medicine, and engineering.

Related work:

A. Fayyazi, S. Shababi, P. Nuzzo, S. Nazarian, and M. Pedram, "Deep Learning-Based Circuit Recognition Using Sparse Mapping and Level-Dependent Decaying Sum Circuit Representations," to appear in DATE 2019.

Verification Techniques for Single Flux Quantum (SFQ) Circuits

Sponsor: Intelligence Advanced Research Projects Activity (IARPA)

The objective of this project is to develop post-synthesis verification techniques for SFQ circuits. As part of this work, we developed a logical equivalence checking (LEC) approach that checks the equivalence of a post-synthesis gate-level netlist of a target SFQ circuit against an initial Boolean network representation of the same circuit. In addition to LEC, we are working on a semi-formal verification framework for SFQ circuits. The SFQ logic-focused framework follows the best-practice verification methodology of the Universal Verification Methodology (UVM) standard and is easily portable to other SFQ circuit designs.

Related work:

Alvin Wong, Kevin Su, Hang Sun, Arash Fayyazi, Massoud Pedram, Shahin Nazarian, “Semi-formal Verification Framework and Benchmark for Single Flux Quantum Technology,” in ISQED 2019.

SFQ Circuit Design

Sponsor: Intelligence Advanced Research Projects Activity (IARPA)

Description: As Moore's law slows down for CMOS, future chips demand novel technologies with higher operating speed, better power efficiency, and better scalability. SFQ, which uses a single flux quantum as the unit of computation, is one of the most promising candidates, offering high speed and low dynamic power. This project aims to build an RSFQ cell library, develop novel circuit and system structures for the SFQ logic family, and improve the static power performance of SFQ circuits.

Related work:

H. Cong, N. K. Katam and M. Pedram, "Design of an SFQ Full Adder as a Single-Stage Gate," 2019 IEEE International Superconductive Electronics Conference (ISEC), Riverside, CA, USA, 2019, pp. 1-3.

Timing Analysis for Superconducting Single-Flux-Quantum Circuits

Sponsor: Intelligence Advanced Research Projects Activity (IARPA)

Development of a timing analysis tool that provides timing information for a given circuit after synthesis, placement, or routing. The tool implements static timing analysis and statistical static timing analysis, prunes unimportant paths, reports setup and hold time violations, and provides the timing information needed to ensure timing closure during placement and routing.
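The core step of static timing analysis can be sketched as a longest-path computation over the gate graph. The example below is a generic illustration and does not model SFQ-specific timing or the statistical analysis described above. The netlist encoding, the delay values, and the function names are hypothetical.

```python
from graphlib import TopologicalSorter


def arrival_times(fanin, delay):
    """Compute the latest signal arrival time at the output of every node.

    fanin: maps each node to the list of nodes that drive it.
    delay: maps each node to its own propagation delay.
    """
    arrival = {}
    # static_order() yields every node after all of its predecessors.
    for node in TopologicalSorter(fanin).static_order():
        preds = fanin.get(node, [])
        latest_input = max((arrival[p] for p in preds), default=0.0)
        arrival[node] = latest_input + delay[node]
    return arrival


def setup_violations(arrival, endpoints, period, setup):
    """Return the endpoints whose data arrives too late for the setup window."""
    return [n for n in endpoints if arrival[n] + setup > period]


# Hypothetical four-node netlist: inputs a and b, gates g1 and g2.
FANIN = {"a": [], "b": [], "g1": ["a", "b"], "g2": ["g1", "b"]}
DELAY = {"a": 0.0, "b": 0.0, "g1": 5.0, "g2": 3.0}
ARRIVAL = arrival_times(FANIN, DELAY)
```

A hold check follows the same pattern with the earliest arrival time (a minimum instead of a maximum) compared against the hold requirement.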

Related work: B. Zhang, M. Li and M. Pedram, "qSSTA: A Statistical Static Timing Analysis Tool for Superconducting Single-Flux-Quantum Circuits," in IEEE Transactions on Applied Superconductivity, vol. 30, no. 7, pp. 1-12, Oct. 2020.

SFQ based ALU design

Sponsor: Intelligence Advanced Research Projects Activity (IARPA)

Description: We plan to exploit the fact that each gate in an SFQ circuit is clocked, using this gate-level pipelined nature of SFQ logic cells to design an ALU with better throughput (qBSA). We also plan to carry this research forward to evaluate the performance of the ALU for data-dependent operations.

Related work: Souvik Kundu, Gourav Datta, Peter A. Beerel, Massoud Pedram, "qBSA: Logic design of a 32-bit block-skewed RSFQ arithmetic logic unit", in IEEE ISEC 2019.

Approximate Computing: A New Approach to Design Energy Efficient Circuits for Error Resilient Applications

The error resiliency of applications such as media processing has given designers a new technique called approximate computing (AC), which trades exactness of computation for improved efficiency. In functional approximation, which is our focus here, a simpler function than the one in the original design is implemented. It can be generated manually by the designer (in the case of approximate arithmetic blocks) or automatically (when an arbitrary function is given). We designed an approximate non-iterative divider that is highly accurate and energy efficient.

Related work:

M. Vaeztourshizi, M. Kamal, A. Afzali-Kusha, M. Pedram, “An Energy-Efficient, Yet Highly-Accurate, Approximate Non-Iterative Divider,” ISLPED (2018).