PhD research checkpoints — Andrej Janchevski (EPFL, 2025)
PyTorch checkpoints for the three research methods presented in the thesis
Scalable Methods for Knowledge Graph Reasoning and Generation
(infoscience.epfl.ch).
The repository mirrors the on-disk layout the demo backend expects, so a single
huggingface_hub.snapshot_download(repo_id="Bani57/checkpoints", local_dir=...)
drops every file into its final location with no extra wiring.
The interactive demos that consume these weights are deployed at https://bani57-website.hf.space; source at https://huggingface.co/spaces/Bani57/website.
Methods and weights
COINs — knowledge graph reasoning (thesis §3.1)
Community-Informed Graph Embeddings. Six embedding scoring families (TransE, DistMult, ComplEx, RotatE, Q2B, KBGAT) trained on three KGs. The method partitions each KG into Leiden communities and learns separate community-local and global embeddings, which are combined at scoring time.
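As a purely illustrative sketch (not the thesis implementation): TransE scores a triple by how well h + r ≈ t, and a community-informed variant could mix a community-local score with a global one. The mixing rule and the alpha weight below are assumptions for illustration only; the actual COINs combination is defined in thesis §3.1.

```python
def l2_norm(v):
    return sum(x * x for x in v) ** 0.5

def transe_score(h, r, t):
    # TransE: plausible triples satisfy h + r ≈ t, scored as -||h + r - t||.
    return -l2_norm([hi + ri - ti for hi, ri, ti in zip(h, r, t)])

def community_informed_score(local_triple, global_triple, alpha=0.5):
    # Hypothetical convex mixture of a community-local and a global score;
    # the real COINs scoring rule is given in the thesis.
    return alpha * transe_score(*local_triple) + (1 - alpha) * transe_score(*global_triple)

h, r, t = [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]
perfect = transe_score(h, r, t)  # exact translation h + r = t scores 0.0
```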
COINs-KGGeneration/graph_completion/checkpoints/{dataset}_{algorithm}.tar
— 18 files, ~2.6 GB.
Datasets: freebase (FB15k-237), wordnet (WN18RR), nell (NELL-995).
Algorithms: transe, distmult, complex, rotate, q2b, kbgat.
COINs-KGGeneration/graph_completion/results/{dataset}/transe_model.tar
— 3 files, ~185 MB.
TransE pre-init checkpoints used to bootstrap the KBGAT embedder.
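The two filename templates above expand to 18 + 3 concrete paths. A quick sanity check of the expected layout, using the dataset and algorithm keys listed in this section (illustrative, not part of the release):

```python
from itertools import product

DATASETS = ["freebase", "wordnet", "nell"]
ALGORITHMS = ["transe", "distmult", "complex", "rotate", "q2b", "kbgat"]

# Expand {dataset}_{algorithm}.tar into the 18 COINs checkpoint paths.
coins_ckpts = [
    f"COINs-KGGeneration/graph_completion/checkpoints/{d}_{a}.tar"
    for d, a in product(DATASETS, ALGORITHMS)
]

# One TransE pre-init checkpoint per dataset.
transe_inits = [
    f"COINs-KGGeneration/graph_completion/results/{d}/transe_model.tar"
    for d in DATASETS
]

assert len(coins_ckpts) == 18 and len(transe_inits) == 3
```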
MultiProxAn — graph generation (thesis §4.3)
Discrete denoising diffusion model with the MultiProx outer Gibbs loop for multi-chain refinement. Generates molecular graphs (QM9) and synthetic community graphs (comm20).
MultiProxAn/checkpoints/{dataset}{,_c}.ckpt
— 4 files, ~380 MB.
Discrete ({dataset}.ckpt) and continuous ({dataset}_c.ckpt) variants.
KG anomaly correction (thesis §4.4)
DiGress-style diffusion conditioned on the COINs embedder for the same
dataset. Either samples a fresh subgraph (generate) or denoises a
user-supplied subgraph (correct).
COINs-KGGeneration/graph_generation/checkpoints/{dataset}{,_correct}.ckpt
— 6 files, ~2.7 GB.
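The {dataset}{,_c} and {dataset}{,_correct} patterns use shell-style brace alternation: one file without the suffix and one with it, per dataset. Expanded in Python for illustration (the lowercase dataset keys qm9 and comm20 are assumed from the MultiProxAn description above):

```python
def expand(root, datasets, suffixes):
    # Expand a shell-style {,suffix} alternation into concrete .ckpt paths.
    return [f"{root}/{d}{s}.ckpt" for d in datasets for s in suffixes]

multiproxan = expand("MultiProxAn/checkpoints", ["qm9", "comm20"], ["", "_c"])
correction = expand(
    "COINs-KGGeneration/graph_generation/checkpoints",
    ["freebase", "wordnet", "nell"],
    ["", "_correct"],
)

assert len(multiproxan) == 4 and len(correction) == 6
```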
Usage
The deployed website downloads the entire repository into its
CHECKPOINTS_ROOT at container startup:
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="Bani57/checkpoints",
    repo_type="model",
    local_dir="src/research",  # mirrors the on-disk layout
    local_dir_use_symlinks=False,
)
For accelerated downloads, install hf_transfer and set
HF_HUB_ENABLE_HF_TRANSFER=1. Total payload ≈ 5.8 GB.
The weights are loaded by ModelRegistry in the website backend; lazy per-request loading keeps the working set small.
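The backend's ModelRegistry is not included in this repository; a minimal sketch of the lazy-loading idea (hypothetical names, no relation to the actual implementation) might look like:

```python
from typing import Any, Callable, Dict

class LazyRegistry:
    """Hypothetical sketch: load each checkpoint only on first request."""

    def __init__(self) -> None:
        self._loaders: Dict[str, Callable[[], Any]] = {}
        self._models: Dict[str, Any] = {}

    def register(self, name: str, loader: Callable[[], Any]) -> None:
        # Registration is cheap: the loader runs only when first requested.
        self._loaders[name] = loader

    def get(self, name: str) -> Any:
        if name not in self._models:  # load on first access, then cache
            self._models[name] = self._loaders[name]()
        return self._models[name]

registry = LazyRegistry()
registry.register("coins_freebase_transe", lambda: object())  # stand-in loader
```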
Training
The COINs and MultiProxAn checkpoints were trained on EPFL's GPU cluster during 2021–2025 as part of the doctoral research programme. Training hyperparameters live in the research code's YAML configs.
Intended use
These checkpoints are released to power the interactive thesis demos linked above. They are research artefacts; downstream production use is neither tested nor supported.
Citation
@phdthesis{janchevski_scalable_2025,
  author = {Andrej Janchevski},
  title  = {Scalable Methods for Knowledge Graph Reasoning and Generation},
  school = {{EPFL}},
  year   = {2025},
  url    = {https://infoscience.epfl.ch/entities/publication/87acf391-feef-43a0-b665-7f2f0bc70b2c},
}
License
MIT for the released weights and source. The research methods retain their original publication terms; see the thesis.