I am an AI Research Scientist at Meta AI London, where I work on video diffusion models, generative rendering, and talking avatars.
I received my Ph.D. in Computer Science from the University of Surrey, supervised by Prof. Tao Xiang and Prof. Yi-Zhe Song, and worked closely with Dr. Xiatian (Eddy) Zhu.
I completed my Bachelor's degree in Computer Science (Artificial Intelligence) at the University of Malaya, where I was a lab member of CISIP under Prof. Chan Chee Seng.
Contact: kamwoh [at] gmail.com
> Research Journey
Training LMs for Vector Graphics
Applied structured spatial generation to a new modality. VecGlypher fine-tunes multimodal LLMs to directly emit SVG path commands — no raster intermediate — via a two-stage curriculum (syntax mastery, then style refinement), producing editable vector glyphs from text or image exemplars.
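To make "directly emit SVG path commands" concrete, here is a minimal sketch of the target output format; the `wrap_glyph` helper and the path data are illustrative assumptions, not VecGlypher's actual output.

```python
# Hypothetical sketch: the kind of output a vector-glyph model targets --
# a glyph expressed purely as SVG path commands, with no raster step.

def wrap_glyph(path_data: str, size: int = 100) -> str:
    """Wrap raw SVG path commands in a minimal standalone SVG document."""
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'viewBox="0 0 {size} {size}">'
        f'<path d="{path_data}" fill="black"/></svg>'
    )

# A crude letter "T" using only moveto (M), lineto (L) and closepath (Z).
t_glyph = "M 10 10 L 90 10 L 90 25 L 57 25 L 57 90 L 43 90 L 43 25 L 10 25 Z"
print(wrap_glyph(t_glyph))
```

Because the glyph is a path, not pixels, editing it is just editing the command string — which is what makes this modality attractive for a language model.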
Generative Neural Rendering
Moved beyond per-object NeRF distillation to a unified rendering paradigm. Kaleido is a sequence-to-sequence transformer that directly synthesises novel views — no explicit 3D representations, no per-scene optimisation — pre-trained on video, then fine-tuned for zero-shot novel view synthesis.
Learning 3D from Unposed 2D
Lifted PartCraft's part-aware generation into 3D. Chirpy3D fine-tunes multi-view diffusion models with LoRA, learns a continuous part latent space, and distils part-controlled views into NeRF — generating fine-grained 3D objects from unposed 2D images alone.
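The LoRA step above can be sketched generically — this is the standard low-rank update, not Chirpy3D's code: the base weight `W` stays frozen and only a low-rank product `B @ A` is learned, which is cheap enough to adapt a large multi-view diffusion model on a small image set.

```python
import numpy as np

def lora_forward(x, W, A, B, alpha=1.0):
    """x: (batch, d_in); W: (d_out, d_in) frozen; A: (r, d_in); B: (d_out, r)."""
    return x @ W.T + alpha * (x @ A.T @ B.T)  # base path + low-rank update

rng = np.random.default_rng(0)
d_in, d_out, r = 16, 8, 2
W = rng.normal(size=(d_out, d_in))      # frozen pretrained weight
A = rng.normal(size=(r, d_in)) * 0.01   # small random init
B = np.zeros((d_out, r))                # zero init: starts as the base model
x = rng.normal(size=(4, d_in))
print(np.allclose(lora_forward(x, W, A, B), x @ W.T))  # True at initialisation
```

The zero-initialised `B` is the standard trick: fine-tuning starts exactly at the pretrained model and only gradually departs from it.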
Controlling Diffusion Models
ConceptHash showed that parts can be encoded separately — but can we also generate by parts? PartCraft answers yes: entropy-based attention losses steer cross-attention so each part token controls its own spatial region, enabling compositional image generation from part-level descriptions.
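The intuition behind an entropy-based attention loss can be sketched as follows — a simplified stand-in, not PartCraft's exact formulation: each part token's cross-attention map is treated as a distribution over spatial locations, and lowering its entropy forces the token to concentrate on a compact region.

```python
import numpy as np

def attention_entropy_loss(attn: np.ndarray, eps: float = 1e-8) -> float:
    """attn: (num_part_tokens, num_locations), rows are attention weights."""
    p = attn / (attn.sum(axis=1, keepdims=True) + eps)  # normalise per token
    entropy = -(p * np.log(p + eps)).sum(axis=1)        # per-token entropy
    return float(entropy.mean())                        # minimise to sharpen

# A focused attention map incurs a lower loss than a diffuse one.
focused = np.array([[0.9, 0.05, 0.03, 0.02]])
uniform = np.array([[0.25, 0.25, 0.25, 0.25]])
print(attention_entropy_loss(focused) < attention_entropy_loss(uniform))  # True
```

Minimising this term during fine-tuning is what lets each part token claim its own spatial region instead of bleeding across the whole image.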
Tackling Core Hashing Problems
Addressed fundamental hashing challenges: SDC solved similarity collapse in unsupervised hashing via Wasserstein-calibrated distributions; FIRe provided a unified benchmark for deep hashing methods; ConceptHash made codes interpretable: each sub-code attends to a discoverable visual part, turning hash bits into semantic concepts.
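The Wasserstein-calibration idea can be illustrated with a toy sketch — an assumption-laden simplification of SDC's objective, not its implementation: for 1-D empirical distributions, the W1 distance reduces to the mean absolute gap between sorted samples, and minimising it pushes a collapsed similarity distribution toward a spread-out reference.

```python
import numpy as np

def wasserstein_1d(a: np.ndarray, b: np.ndarray) -> float:
    """W1 between two equal-size 1-D empirical distributions."""
    return float(np.mean(np.abs(np.sort(a) - np.sort(b))))

rng = np.random.default_rng(0)
collapsed = rng.normal(0.0, 0.05, 1000)  # similarities bunched together
target = rng.uniform(-1.0, 1.0, 1000)    # spread-out reference distribution
# A large W1 here quantifies how collapsed the similarity distribution is;
# using it as a training penalty pulls the distribution toward the target.
print(round(wasserstein_1d(collapsed, target), 3))
```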
Deep Hashing for Image Retrieval
The orthogonal binary codes used in watermarking map directly to hashing — both need compact codes with maximum Hamming separation. DPN introduced central similarity with orthogonal class centres; OrthoHash distilled it to a single cosine objective, proving the codebook alone drives retrieval performance.
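The orthogonal-centre idea is easy to see in code — a simplified sketch of the shared core of DPN/OrthoHash, with the training loop omitted: rows of a Hadamard matrix give mutually orthogonal binary class centres, a cosine objective pulls features toward their centre, and the hash code is just the sign of the feature.

```python
import numpy as np

def hadamard(n: int) -> np.ndarray:
    """Sylvester construction; n must be a power of two."""
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H

def cosine_to_centre(feat: np.ndarray, centre: np.ndarray) -> float:
    """The single objective: maximise cosine similarity to the class centre."""
    return float(feat @ centre / (np.linalg.norm(feat) * np.linalg.norm(centre)))

H = hadamard(8)                      # 8-bit codes, up to 8 orthogonal centres
print(int(H[1] @ H[2]))              # 0: distinct rows are orthogonal
feat = H[1] + 0.1 * np.random.default_rng(0).normal(size=8)  # noisy feature
code = np.sign(feat)                 # binarise: the retrieval hash code
print(np.array_equal(code, H[1]))    # True: code recovers the class centre
```

Orthogonal centres guarantee maximal Hamming separation between classes by construction — which is why the codebook alone carries so much of the retrieval performance.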
Watermarking DNN Outputs
Extended ownership protection from model weights to model outputs. IPR-GAN embeds both white-box signatures in GAN weights and black-box watermarks in generated images, enabling dual verification without sacrificing visual quality.
Watermarking DNN Weights
Started by studying how to protect DNN intellectual property. DeepIPR embeds learnable "passport" layers into normalization parameters — the model only produces correct outputs with the right passport, defeating ambiguity attacks without degrading performance.
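A passport layer can be sketched in a few lines — a deliberately simplified illustration in the spirit of DeepIPR, not its actual formulation: the scale and bias of a normalisation layer are not free parameters but are derived from a secret passport, so a forged passport yields the wrong affine transform and performance collapses.

```python
import numpy as np

def passport_norm(x, weight, passport_gamma, passport_beta, eps=1e-5):
    """x: (batch, dim) activations; passports: (dim,) secret key vectors."""
    gamma = weight @ passport_gamma            # scale derived from the passport
    beta = weight @ passport_beta              # bias derived from the passport
    x_hat = (x - x.mean(0)) / np.sqrt(x.var(0) + eps)
    return gamma * x_hat + beta

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 4))
p_gamma, p_beta = rng.normal(size=4), rng.normal(size=4)
x = rng.normal(size=(8, 4))
right = passport_norm(x, W, p_gamma, p_beta)
wrong = passport_norm(x, W, rng.normal(size=4), rng.normal(size=4))
print(np.allclose(right, wrong))  # False: a forged passport changes the output
```

Because a forger cannot name a plausible substitute passport that reproduces the model's behaviour, this construction defeats ambiguity attacks on the ownership claim.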