About the Blog

I use this blog to share my reflections and assorted notes. The main site is aimed at a broader audience, where I post spontaneous musings and short, informal notes with appropriate tags. For precision, some posts are written in my native language, Chinese, but I try to use English, especially for technical notes. You can browse them through the Home or Archive tabs in the sidebar.

I also maintain a separate notes site for more specialized and organized notes on academic topics from courses, lectures, and discussions. I chose a Tufte theme for that site because the side-note format is particularly helpful for explaining intricate theoretical concepts. You can reach it via the Notes tab in the sidebar.


Training a Neural ODE with three different loss types

The recently popular flow-matching models build on another interesting family of models, the Neural ODE, also known as the continuous normalizing flow. While the main idea behind flow matching is to find a practical and affordable way to train a neural ODE, the original adjoint sensitivity method is intellectually interesting in its own right and full of meaningful details. So, in this blog, I'll review the derivations behind the adjoint method before diving into the flow-matching objective in the next one. In the end, both are good candidates for protocols that make observables from MD trajectories differentiable.
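As a concrete anchor before the derivations, here is a minimal sketch of the training loop this setup implies, using the torchdiffeq implementation of the adjoint backward pass. The vector field, toy data, and squared-error loss are placeholders for illustration only, not the observables discussed in the post.

```python
# Minimal sketch: train a neural ODE with the adjoint sensitivity method
# via torchdiffeq. Dynamics, data, and loss are toy placeholders.
import torch
import torch.nn as nn
from torchdiffeq import odeint_adjoint as odeint  # adjoint-based backward pass


class VectorField(nn.Module):
    """Learnable dynamics f(t, z) defining dz/dt."""

    def __init__(self, dim=2, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, hidden), nn.Tanh(), nn.Linear(hidden, dim)
        )

    def forward(self, t, z):
        return self.net(z)


func = VectorField()
opt = torch.optim.Adam(func.parameters(), lr=1e-3)
t = torch.linspace(0.0, 1.0, 2)      # integrate from t=0 to t=1

z0 = torch.randn(128, 2)             # toy initial states
target = torch.zeros(128, 2)         # toy terminal targets

for step in range(100):
    zT = odeint(func, z0, t)[-1]     # forward ODE solve, keep final state
    loss = ((zT - target) ** 2).mean()  # any differentiable loss on z(T)
    opt.zero_grad()
    loss.backward()                  # gradients computed by solving the adjoint ODE
    opt.step()
```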

Implicit Reparameterization Gradients

This note delves into a paper recommended by Kevin, which focuses on the challenge of obtaining low-variance gradients for continuous random variables, particularly those pesky distributions we often encounter (yes, the Rice distribution). Key takeaway: you can get unbiased pathwise gradient estimators for continuous distributions with numerically tractable CDFs, such as the gamma distribution, truncated distributions, and mixtures.
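For a sense of what the implicit gradient looks like in practice, here is a small numerical sketch of the identity dz/dθ = -(∂F/∂θ)/(∂F/∂z). I use a Normal distribution only because its CDF is tractable in PyTorch and the result can be checked by hand against the explicit reparameterization z = μ + σε; the paper's point is that the same trick extends to gamma, truncated, and mixture distributions.

```python
# Implicit reparameterization sketch: differentiate through a sample z
# using only the CDF, via dz/dtheta = -(dF/dtheta) / (dF/dz).
import torch

mu = torch.tensor(1.5, requires_grad=True)
sigma = torch.tensor(0.8, requires_grad=True)

# Draw a sample; no gradient flows through the draw itself.
with torch.no_grad():
    z = torch.distributions.Normal(mu, sigma).sample()

# Differentiate the CDF F(z; mu, sigma) w.r.t. the sample and the parameters.
z_var = z.clone().requires_grad_(True)
F = torch.distributions.Normal(mu, sigma).cdf(z_var)
dF_dz, dF_dmu, dF_dsigma = torch.autograd.grad(F, (z_var, mu, sigma))

# Implicit pathwise gradients of the sample w.r.t. the parameters.
dz_dmu = -dF_dmu / dF_dz        # equals 1 for the Normal
dz_dsigma = -dF_dsigma / dF_dz  # equals (z - mu) / sigma, i.e. epsilon

print(dz_dmu.item(), dz_dsigma.item())
```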

An obscure reason for GPU memory leaks in PyTorch

A short debugging note on why I kept getting the "CUDA out of memory" error in my code. The main takeaway: don't use in-place operations in your computation graph unless necessary, and if you are applying them to non-leaf tensors, change them even if they seem necessary. I tested on PyTorch 1.13 and 2.0, with CUDA 11.6 and 11.7.
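To make the takeaway concrete, here is a minimal sketch of the pattern in question: an in-place op applied to a non-leaf tensor inside the graph, next to the out-of-place rewrite. The model and shapes are invented for illustration, not the exact code from the debugging session.

```python
# In-place ops on non-leaf tensors vs. the safer out-of-place versions.
import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"
layer = nn.Linear(512, 512).to(device)
x = torch.randn(64, 512, device=device)

h = layer(x)  # non-leaf tensor: it is part of the autograd graph

# Problematic pattern: modifying a non-leaf tensor in place while the graph
# still references it. For some ops autograd raises an error outright; in my
# case it eventually showed up as CUDA running out of memory.
# h += 1.0
# h.relu_()

# Safer: out-of-place versions create new tensors and leave the graph intact.
h = h + 1.0
h = torch.relu(h)

loss = h.sum()
loss.backward()
```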

Configure a macOS with an M1 Chip from Scratch

A walk-through note on how to set up my familiar working environment on a brand-new macOS system with an M1 chip, including the Git token, Homebrew, the Terminal color theme, Oh-my-zsh plugins, and conda. Compared to the previous post for an Intel chip, the difference mainly lies in the Homebrew PATH. I also replaced miniconda with mambaforge for Python environment management.

An Early Implementation of the Attention Mechanism

We have witnessed the popularity and rapid development of the attention mechanism in the deep learning community in recent years. It serves as a pivotal component of most state-of-the-art models for NLP tasks and continues to be a fast-evolving research topic in computer vision. It also appears as an omnipresent component in recent AI-related scientific breakthroughs such as AlphaFold 2. That is why we (Kevin and I) decided to start a journal club to read and discuss seminal papers about how attention was introduced and further developed. We hope these discussions will give us more intuition about this fancy name, so that we can apply it to problems we are interested in with more confidence.

This blog is a note from the first discussion, on the paper Bahdanau et al. (2014), Neural machine translation by jointly learning to align and translate [1]. As an early (or the first) implementation of the attention mechanism for the translation task, it helped a lot, at least for me, in understanding what attention is, although the attention here differs a little from that in the later Transformer model.
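Below is a minimal sketch of the additive alignment model the paper introduces, as I would write it today: score(s, h_j) = vᵀ tanh(W_s s + W_h h_j), a softmax over source positions, and a context vector as the weighted sum of encoder annotations. The dimensions and variable names are my own choices for illustration, not the paper's exact configuration.

```python
# Additive (Bahdanau-style) attention: alignment scores, softmax weights,
# and a context vector over encoder annotations.
import torch
import torch.nn as nn


class AdditiveAttention(nn.Module):
    def __init__(self, dec_dim, enc_dim, attn_dim):
        super().__init__()
        self.W_s = nn.Linear(dec_dim, attn_dim, bias=False)
        self.W_h = nn.Linear(enc_dim, attn_dim, bias=False)
        self.v = nn.Linear(attn_dim, 1, bias=False)

    def forward(self, s_prev, enc_states):
        # s_prev: (batch, dec_dim), previous decoder hidden state
        # enc_states: (batch, src_len, enc_dim), encoder annotations h_1..h_T
        scores = self.v(torch.tanh(
            self.W_s(s_prev).unsqueeze(1) + self.W_h(enc_states)
        )).squeeze(-1)                         # (batch, src_len)
        alpha = torch.softmax(scores, dim=-1)  # alignment weights
        context = torch.bmm(alpha.unsqueeze(1), enc_states).squeeze(1)
        return context, alpha


attn = AdditiveAttention(dec_dim=128, enc_dim=256, attn_dim=64)
context, alpha = attn(torch.randn(4, 128), torch.randn(4, 7, 256))
print(context.shape, alpha.shape)  # torch.Size([4, 256]) torch.Size([4, 7])
```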

  1. Bahdanau, Dzmitry, Kyunghyun Cho, and Yoshua Bengio. “Neural machine translation by jointly learning to align and translate.” arXiv preprint arXiv:1409.0473 (2014).