Hamed Shirzad



About Me

I am a computer science Ph.D. student at the University of British Columbia (UBC), supervised by Prof. Danica Sutherland (2021–present). My main research interest lies in machine learning on graphs. I explore diverse graph tasks, ranging from supervised learning to generative models and the evaluation methods used for them. Recently, I have focused on graph transformers. I enjoy exploring the theory behind methods and using it as inspiration to develop improved models; I find both the theoretical and practical aspects of these methods interesting.

I completed my M.Sc. in computer science at Simon Fraser University, under Prof. Greg Mori, where my focus was on generative models for graphs. During this time, I used tree decomposition to enhance sequential graph generation. You can find my Master's thesis here.

I did my B.Sc. in Computer/Software Engineering and Mathematics (double major) at Sharif University of Technology. During my M.Sc. and B.Sc., I had the privilege of doing internships at the Empirical Inference Department of the Max Planck Institute for Intelligent Systems, Borealis AI, and the Autodesk AI Lab.

Research Interests: machine learning on graphs, graph transformers, graph generative models, and evaluation methods for generative models.



H Shirzad*, A Velingker*, B Venkatachalam*, DJ Sutherland, AK Sinop. “Exphormer: Sparse Transformers for Graphs”, ICML 2023. [arXiv], [GitHub], [blog post]

Summary: Graph transformers offer a promising architecture for various graph learning tasks. However, scaling them to large graphs while maintaining competitive accuracy is challenging. Our paper introduces Exphormer, a framework with a sparse attention mechanism based on virtual nodes and expander graphs. These yield graph transformers whose complexity is linear in the graph size, maintain desirable theoretical properties, and enhance models within the GraphGPS framework. We demonstrate Exphormer's competitiveness across diverse datasets, outperforming prior models and scaling to larger graphs.
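To make the attention pattern concrete, here is a minimal sketch of how such a sparse pattern can be assembled: local graph edges, a few random permutations as a cheap stand-in for a proper expander construction (the real Exphormer uses guaranteed expander graphs), and a virtual node connected to every node. The function name and the random-permutation shortcut are illustrative, not from the paper's code.

```python
import random

def exphormer_style_attention_edges(n, graph_edges, expander_degree=3, seed=0):
    """Illustrative sketch of an Exphormer-like sparse attention pattern.

    Combines local graph edges, self-loops, random permutation edges
    (a naive stand-in for an expander), and a virtual node with index n
    connected to all nodes. The total edge count is O(n + |E|), which is
    what makes the attention cost linear rather than quadratic.
    """
    rng = random.Random(seed)
    attn = set(graph_edges)          # local edges from the input graph
    for v in range(n):
        attn.add((v, v))             # self-loops
    for _ in range(expander_degree):
        perm = list(range(n))
        rng.shuffle(perm)            # one random matching per round
        for u, v in zip(range(n), perm):
            if u != v:
                attn.add((u, v))
                attn.add((v, u))
    virtual = n                      # extra global "virtual" node
    for v in range(n):
        attn.add((virtual, v))
        attn.add((v, virtual))
    return attn
```

A transformer layer would then compute attention only over these pairs instead of all n² pairs, which is the source of the linear scaling.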

H Shirzad, K Hassani, DJ Sutherland. “Evaluating Graph Generative Models with Contrastively Learned Features”, NeurIPS 2022. [arXiv], [GitHub]

Summary: Many graph generative models have been proposed, necessitating effective evaluation methods. Our work suggests using contrastively trained GNN representations for more reliable metrics. We show that neither traditional approaches nor GNN-based approaches dominate the other; however, we demonstrate that Graph Substructure Networks can combine the two approaches into theoretically and practically stronger evaluation metrics.
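As a rough illustration of feature-based evaluation, the following sketch computes a squared maximum mean discrepancy (MMD) between two sets of graph embeddings (e.g., contrastively trained GNN features) with an RBF kernel. This is a generic metric sketch, not the paper's exact evaluation code, and the `gamma` bandwidth is an assumed parameter.

```python
import math

def rbf(x, y, gamma=1.0):
    """RBF kernel between two equal-length feature vectors."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def mmd2(X, Y, gamma=1.0):
    """Biased (V-statistic) estimate of squared MMD between two
    sets of embeddings: small values mean the generated graphs'
    features are distributed like the reference graphs' features."""
    kxx = sum(rbf(a, b, gamma) for a in X for b in X) / len(X) ** 2
    kyy = sum(rbf(a, b, gamma) for a in Y for b in Y) / len(Y) ** 2
    kxy = sum(rbf(a, b, gamma) for a in X for b in Y) / (len(X) * len(Y))
    return kxx + kyy - 2 * kxy
```

With identical feature sets the estimate is zero, and it grows as the two embedding distributions diverge; the quality of the metric then hinges on how informative the learned features are.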

H Shirzad, H Hajimirsadeghi, AH Abdi, G Mori. “TD-Gen: Graph Generation Using Tree Decomposition”, AISTATS 2022. [arXiv]. Code is available in the supplementary ZIP here.

Summary: TD-GEN is a graph generation framework based on tree decomposition. The framework includes a permutation-invariant tree generation model, which forms the backbone of graph generation. Tree nodes are supernodes, each representing a cluster of nodes in the graph. Graph nodes and edges are generated incrementally inside the clusters by traversing the tree supernodes. We also discuss the shortcomings of standard evaluation criteria that use statistical properties of the generated graphs as performance measures, and propose instead to compare models by average likelihood, conditioned on the permutation of the nodes.


H Shirzad, R Deng, H Zhao, F Tung. “Conditional Diffusion Models as Self-supervised Learning Backbone for Irregular Time Series”, Learning from Time Series for Health Workshop at ICLR 2024.

Summary: Training a conditional diffusion model resembles training a masked autoencoder. We propose a customized conditional diffusion model as a self-supervised learning backbone for irregular time series data. The model uses a learnable time embedding and a cross-dimensional attention mechanism to handle complex temporal dynamics. It is suitable for conditional generation tasks and learns hidden states useful for discriminative tasks; empirically, these hidden states achieve excellent results on downstream tasks.

H Shirzad, B Venkatachalam, A Velingker, DJ Sutherland, D Woodruff. “Low-Width Approximations and Sparsification for Scaling Graph Transformers”, GLFrontiers Workshop at NeurIPS 2023.

Summary: We first train an Exphormer model with a small hidden dimension (4 or 8) on the graph dataset, use its attention scores to identify the most active connections, sparsify the graph accordingly, and then train the final Exphormer model on the sparse graph. This approach is much more memory-efficient than the original Exphormer model.
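The sparsification step might look roughly like the following sketch, which keeps, for each node, the incoming edges with the highest attention scores from the low-width model. The function name and the top-k-per-node rule are illustrative assumptions, not the paper's implementation.

```python
from collections import defaultdict

def sparsify_by_attention(edges, attn_score, keep_per_node=2):
    """Keep, for each target node, the incoming edges with the highest
    attention score from a cheap low-width model (attn_score maps
    (u, v) edges to floats; missing edges default to 0.0).

    The pruned edge set is then used to train the full-width model.
    """
    incoming = defaultdict(list)
    for u, v in edges:
        incoming[v].append((attn_score.get((u, v), 0.0), u))
    kept = set()
    for v, scored in incoming.items():
        scored.sort(reverse=True)                 # highest scores first
        kept.update((u, v) for _, u in scored[:keep_per_node])
    return kept
```

Because the low-width model is tiny, its training and the scoring pass cost little memory, while the final model only ever sees the pruned edge set.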

CD Weilbach, W Harvey, H Shirzad, F Wood. “Scaling Graphically Structured Diffusion Models”, SPIGM Workshop at ICML 2023. [Paper]