Hi, I’m Donghyeon Joo,
researching Deep Learning Acceleration and Optimization
From low-level systems through compilers to DL architectures, I am pursuing optimization opportunities across the entire compute stack.
Current status:
- First-year Ph.D. student at the University of Maryland, College Park (homepage update incoming!)
- B.S. in Electrical Engineering @ Korea Univ. (until Feb. 2023)
- Research Intern @ Korea Univ. Compiler & Microarchitecture Lab
Deep Learning
Text-to-Image
DALL-E Mini
Stable Diffusion
Coming Soon
Deep Voice
LSTM-GAN
Lyrics to Melody
Voice Modification
Coming Soon
Style Transfer
Feedforward-CNN
Gram matrix
Coming Soon
Paper Reviews
to keep track of recent advances
Coming Soon
Architecture & Systems
NVIDIA Accelerator
Accelerator full stack
from RTL to Compilers
Porting to Xilinx FPGA
Coming Soon
Undergrad Thesis
Multithread Matmul
OpenMP
Vtune Profiler
Circuit Optimization
Verilog
Synopsys DC
RTL planning
RISC-V
Verilog
GCC-LLVM
MAD instructions
Paper Reviews
so many possibilities!
What I’ve been up to – Recent Posts
LUTNet and Logic Shrinkage
E. Wang (Imperial College) and M. Abdelfattah (Cornell), LUTNet (2019), abstract: DNNs contain significant redundancy – weights and activations can be quantized down to binary values without degrading model accuracy. Network binarisation on FPGAs replaces resource-hungry multipliers with lightweight XNOR gates (in sync with my VLSI undergrad course). As the FPGA's building block, the K-LUT (K-input Look…
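The XNOR substitution the abstract mentions can be sketched in a few lines. This is my own illustration of the general XNOR-popcount trick, not LUTNet's actual kernel; the `binary_dot` helper and its bit encoding are assumptions for the example.

```python
# Sketch: binarised networks encode weights/activations in {-1, +1} as
# single bits (1 -> +1, 0 -> -1), so a dot product becomes XNOR + popcount
# instead of multiply-accumulate. Illustration only, not LUTNet's kernel.

def binary_dot(a_bits: int, w_bits: int, n: int) -> int:
    """Dot product of two n-element {-1, +1} vectors packed as n-bit ints."""
    xnor = ~(a_bits ^ w_bits) & ((1 << n) - 1)  # bit set where signs match
    matches = bin(xnor).count("1")              # popcount
    return 2 * matches - n                      # matches - mismatches

# 0b1011 vs 0b1101: two bit positions match, two differ -> dot product 0
print(binary_dot(0b1011, 0b1101, 4))  # -> 0
```

On an FPGA the XNOR and popcount map onto LUTs directly, which is why binarisation frees up the DSP multipliers.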
Sparse and Irregular Tensor Computation
Are sparse matrices of concern to DL as well? – Definitely, yes. From Survey of Accelerator Arch for DNNs (2020): “a large proportion of NN connections can be pruned to zero with or without minimum accuracy loss” and “Many corresponding computing architectures have also been proposed”. From Cambricon (2018), Sparsity in Neural Network section: “sparsity as…
Some Random Terms of ECE and CS
2023/02/03 While reading some architecture papers, I came across terminology from my undergraduate coursework. Netlist: literally a “list of nets”, a description of the connectivity of an electronic circuit. (A net, or network, is a collection of two or more interconnected components.) In its simplest form, a netlist consists of a list…
On Transformer (Attention) Acceleration
Software techniques are a different topic to discuss later – methods like pruning and knowledge distillation. The majority of operations are 1) self-attention and 2) feedforward. Simple, a more underlying optimization would be: However, what about exploiting the characteristics of the Transformer network? “Hardware Accelerator for Multi-Head Attention and Position-Wise Feed-Forward in the Transformer” – Nanjing Univ. Utilized…
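For reference, the self-attention operation that dominates this compute can be sketched as plain scaled dot-product attention in NumPy. Shapes and names here are illustrative assumptions of mine, not taken from the Nanjing accelerator paper.

```python
# Sketch: scaled dot-product self-attention, the first of the two dominant
# Transformer operations. The (seq, seq) score matrix is the quadratic part
# that attention accelerators target.
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    Q, K, V = X @ Wq, X @ Wk, X @ Wv       # linear projections
    scores = Q @ K.T / np.sqrt(Q.shape[-1])  # (seq, seq) similarity scores
    return softmax(scores) @ V             # weighted sum of values

rng = np.random.default_rng(0)
seq, d = 4, 8
X = rng.standard_normal((seq, d))
Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)
print(out.shape)  # -> (4, 8)
```

Nearly everything here is a matrix multiply, which is why a "more underlying" optimization is simply a better MatMul engine, while attention-specific accelerators instead exploit the structure of the score/softmax stage.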
INVITED: New Directions in Distributed Deep Learning: Bringing the Network at Forefront of IoT Design
My preface: As deep learning models get larger, I was wondering how the edge would deal with the ever-growing model size. Introduction: Three challenges to large-scale adoption of DL at the edge: sending private data to the cloud exposes security risks; on-device inference and training is a way to avoid compromising data privacy. The current trend…
Timeline
2017~18
Style Transfer
2019~20
Military Service
Spring 2021
Computer Architecture
Deep Voice
Fall 2021
Compiler
Natural Language Processing
Operating Systems
Winter 2021 ~ Summer 2022
Undergrad Thesis: ELECTRA Profiling and MatMul
VLSI Design
Winter 2022
Applying to Grad Schools
SOPs / LORs
Faculty Contacts
Finishing up Undergrad
NVDLA on FPGA