Hi, I’m Donghyeon Joo,

researching Deep Learning Acceleration and Optimization

From low-level systems to higher-level compilers to DL architectures, I currently pursue every opportunity throughout the compute stack.

current status:

  • I am currently a first-year Ph.D. student at the University of Maryland, College Park. (Homepage update incoming!)
  • Undergraduate in Electrical Engineering @Korea Univ. (~Feb. 2023)
  • Research Intern @Korea Univ. Compiler & Microarchitecture Lab

Text-to-Image

DALL-E Mini

Stable Diffusion

Coming Soon


Deep Voice

LSTM-GAN

Lyrics to Melody

Voice Modification

Coming Soon


Style Transfer

Feedforward-CNN

Gram matrix

Coming Soon


Paper Reviews

to keep track of recent advances

Coming Soon

Architecture & Systems

NVIDIA Accelerator

Accelerator full stack

from RTL to Compilers

Porting to Xilinx FPGA

Coming Soon


Undergrad Thesis

Multithread Matmul

OpenMP

VTune Profiler


Circuit Optimization

Verilog

Synopsys DC

RTL planning


RISC-V

Verilog

GCC-LLVM

MAD instructions


Paper Reviews

so many possibilities!

What I’ve been up to – Recent Posts

  • LUTNet and Logic Shrinkage

    E. Wang of Imperial College and M. Abdelfattah of Cornell. LUTNet (2019) abstract: DNNs contain significant redundancy – weights and activations can be quantized down to binary values without degrading model accuracy. Network binarisation on FPGAs replaces resource-hungry multipliers with lightweight XNOR gates (in sync with my VLSI undergrad course; a small XNOR-popcount sketch follows this list). As FPGA’s building block K-LUT (K-input Look…

    Read more


  • Sparse and Irregular Tensor Computation

    Is sparse matrix computation of concern to DL as well? – Definitely, yes (a small CSR sketch follows this list). From Survey of Accelerator Arch for DNNs (2020): “a large proportion of NN connections can be pruned to zero with or without minimum accuracy loss”; “Many corresponding computing architectures have also been proposed”. From the Cambricon (2018) Sparsity in Neural Network section: “sparsity as…

    Read more


  • Some Random Terms of ECE and CS

    2023/02/03 While reading some architecture papers, I ran into some terminology I had come across during undergraduate coursework. Netlist: literally a “list of nets”, a description of the connectivity of an electronic circuit. (A net, or network, is a collection of two or more interconnected components.) In its simplest form, a netlist consists of a list…

    Read more


  • On Transformer (Attention) Acceleration

    Software techniques are a different topic to discuss later – methods like pruning and knowledge distillation. The majority of operations are 1) Self-Attention and 2) Feedforward (a naive attention sketch follows this list). Simple, more underlying optimizations would apply. However, what about exploiting the characteristics of the Transformer network? “Hardware Accelerator for Multi-Head Attention and Position-Wise Feed-Forward in the Transformer” – Nanjing Univ. Utilized…

    Read more


  • INVITED: New Directions in Distributed Deep Learning: Bringing the Network at Forefront of IoT Design

    My Preface: As deep learning models get larger, I was wondering how the edge would deal with the ever-growing model size. Introduction: Three challenges to large-scale adoption of DL at the edge: sending private data to the cloud exposes security risks. On-device inference and training is one way to avoid compromising data privacy. The current trend…

    Read more
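
A minimal sketch of the XNOR-popcount primitive behind the binarized networks in the LUTNet post: once weights and activations are constrained to {-1, +1} and bit-packed, a dot product reduces to an XNOR followed by a population count. The packing convention, function names, and sizes here are my own illustration, not taken from the paper.

```cpp
// XNOR-popcount dot product for binary weights/activations.
// Values are {-1, +1}, packed one bit per value (bit = 1 encodes +1, bit = 0 encodes -1).
#include <bit>
#include <cstdint>
#include <iostream>

int binary_dot(uint64_t w, uint64_t a, int n_bits) {
    // XNOR marks the positions where weight and activation agree ((+1)(+1) or (-1)(-1)).
    uint64_t agree = ~(w ^ a);
    if (n_bits < 64) agree &= (uint64_t{1} << n_bits) - 1;  // mask off unused bits
    int matches = std::popcount(agree);                     // C++20
    // dot = (#agreements) - (#disagreements) = 2 * matches - n_bits
    return 2 * matches - n_bits;
}

int main() {
    uint64_t w = 0b1011;  // +1 +1 -1 +1 (LSB first)
    uint64_t a = 0b1001;  // +1 -1 -1 +1
    std::cout << binary_dot(w, a, 4) << "\n";  // 3 agreements, 1 disagreement -> 2
    return 0;
}
```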
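A small sketch of the sparse, irregular computation the pruning survey in the sparse-tensor post refers to: once most weights are pruned to zero, only the nonzeros are stored (here in CSR form) and the matrix-vector product walks them with data-dependent accesses. The matrix, names, and values below are made up for illustration.

```cpp
// Sparse matrix-vector product over a CSR (compressed sparse row) matrix.
#include <iostream>
#include <vector>

struct CsrMatrix {
    int rows;
    std::vector<int>   row_ptr;  // size rows + 1; start of each row in col_idx/vals
    std::vector<int>   col_idx;  // column index of each stored nonzero
    std::vector<float> vals;     // nonzero values (pruned weights are simply not stored)
};

std::vector<float> spmv(const CsrMatrix& m, const std::vector<float>& x) {
    std::vector<float> y(m.rows, 0.0f);
    for (int r = 0; r < m.rows; ++r)
        for (int i = m.row_ptr[r]; i < m.row_ptr[r + 1]; ++i)
            y[r] += m.vals[i] * x[m.col_idx[i]];  // irregular, data-dependent access to x
    return y;
}

int main() {
    // 3x3 matrix with most weights pruned to zero:
    // [ 2 0 0 ]
    // [ 0 0 1 ]
    // [ 0 3 0 ]
    CsrMatrix m{3, {0, 1, 2, 3}, {0, 2, 1}, {2.0f, 1.0f, 3.0f}};
    std::vector<float> x{1.0f, 2.0f, 3.0f};
    for (float v : spmv(m, x)) std::cout << v << " ";  // prints: 2 3 6
    std::cout << "\n";
    return 0;
}
```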
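Finally, a naive single-head version of the Self-Attention operation discussed in the Transformer-acceleration post, softmax(QK^T / sqrt(d)) V, just to make the dominant workload concrete. No batching, masking, or optimization; the dimensions and data are illustrative.

```cpp
// Naive single-head scaled dot-product attention: softmax(Q K^T / sqrt(d)) V.
#include <algorithm>
#include <cmath>
#include <iostream>
#include <vector>

using Mat = std::vector<std::vector<float>>;  // row-major, seq_len x dim

Mat attention(const Mat& Q, const Mat& K, const Mat& V) {
    const int n = static_cast<int>(Q.size());     // sequence length
    const int d = static_cast<int>(Q[0].size());  // head dimension
    Mat out(n, std::vector<float>(V[0].size(), 0.0f));
    for (int i = 0; i < n; ++i) {
        // scores_j = q_i . k_j / sqrt(d)
        std::vector<float> scores(n);
        float maxs = -1e30f;
        for (int j = 0; j < n; ++j) {
            float s = 0.0f;
            for (int k = 0; k < d; ++k) s += Q[i][k] * K[j][k];
            scores[j] = s / std::sqrt(static_cast<float>(d));
            maxs = std::max(maxs, scores[j]);
        }
        // Row-wise softmax (subtract the max for numerical stability).
        float denom = 0.0f;
        for (float& s : scores) { s = std::exp(s - maxs); denom += s; }
        // Output row i is the attention-weighted sum of the value rows.
        for (int j = 0; j < n; ++j)
            for (size_t k = 0; k < V[0].size(); ++k)
                out[i][k] += (scores[j] / denom) * V[j][k];
    }
    return out;
}

int main() {
    Mat Q{{1, 0}, {0, 1}}, K{{1, 0}, {0, 1}}, V{{1, 2}, {3, 4}};
    // Each output row is a softmax-weighted mix of V's rows, biased toward the matching key.
    for (const auto& row : attention(Q, K, V)) {
        for (float v : row) std::cout << v << " ";
        std::cout << "\n";
    }
    return 0;
}
```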


Timeline

  • 2017~18

    Style Transfer

  • 2019~20

    Military Service

  • Spring 2021

    Computer Architecture

    Deep Voice

  • Fall 2021

    Compiler

    Natural Language Processing

    Operating Systems

  • Winter 2021 ~ Summer 2022

    Undergrad Thesis: ELECTRA Profiling and MatMul

    VLSI Design

  • Summer 2022

    Qualcomm IT Tour

    Text-to-Image (CCP)

    FPGA (Compiler Lab Intern)

    2022 Summer Goals

  • Winter 2022

    Applying to Grad Schools

    SOPs / LORs

    Faculty Contacts

    Finishing up Undergrad

    NVDLA on FPGA