PhD research checkpoints — Andrej Janchevski (EPFL, 2025)

PyTorch checkpoint dump for the three research methods presented in the thesis Scalable Methods for Knowledge Graph Reasoning and Generation (infoscience.epfl.ch). The repository mirrors the on-disk layout the demo backend expects, so a single huggingface_hub.snapshot_download(repo_id="Bani57/checkpoints", local_dir=...) drops every file into its final location with no extra wiring.

The interactive demos that consume these weights are deployed at https://bani57-website.hf.space; source at https://huggingface.co/spaces/Bani57/website.

Methods and weights

COINs — knowledge graph reasoning (thesis §3.1)

Community-Informed Graph Embeddings. Six embedding scoring families (TransE, DistMult, ComplEx, RotatE, Q2B, KBGAT), each trained on three KGs. COINs partitions each KG into Leiden communities and learns separate community-local and global embeddings, which are combined at scoring time.

COINs-KGGeneration/graph_completion/checkpoints/{dataset}_{algorithm}.tar — 18 files, ~2.6 GB. Datasets: freebase (FB15k-237), wordnet (WN18RR), nell (NELL-995). Algorithms: transe, distmult, complex, rotate, q2b, kbgat.
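The 18-file grid is the cross product of the three datasets and six algorithms; a minimal sketch that enumerates the expected paths (the path template comes from the listing above, the helper itself is illustrative):

```python
from itertools import product

DATASETS = ("freebase", "wordnet", "nell")
ALGORITHMS = ("transe", "distmult", "complex", "rotate", "q2b", "kbgat")

def coins_checkpoint_paths(root="COINs-KGGeneration/graph_completion/checkpoints"):
    """Build the expected on-disk path for every dataset/algorithm pair."""
    return [f"{root}/{d}_{a}.tar" for d, a in product(DATASETS, ALGORITHMS)]

paths = coins_checkpoint_paths()
assert len(paths) == 18  # 3 datasets x 6 algorithms
```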

COINs-KGGeneration/graph_completion/results/{dataset}/transe_model.tar — 3 files, ~185 MB. TransE pre-init checkpoints used to bootstrap the KBGAT embedder.

MultiProxAn — graph generation (thesis §4.3)

Discrete denoising diffusion model with the MultiProx outer Gibbs loop for multi-chain refinement. Generates molecular graphs (QM9) and synthetic community graphs (comm20).

MultiProxAn/checkpoints/{dataset}{,_c}.ckpt — 4 files, ~380 MB. Discrete ({dataset}.ckpt) and continuous ({dataset}_c.ckpt) variants.
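The `_c` suffix is the only switch between the two variants, so path resolution reduces to a one-liner; a hypothetical helper (the function name and signature are illustrative, only the naming scheme is from the listing above):

```python
def multiproxan_checkpoint(dataset: str, continuous: bool = False,
                           root: str = "MultiProxAn/checkpoints") -> str:
    """Resolve a MultiProxAn checkpoint path; the `_c` suffix marks the
    continuous-diffusion variant per the naming scheme above."""
    suffix = "_c" if continuous else ""
    return f"{root}/{dataset}{suffix}.ckpt"
```

For example, `multiproxan_checkpoint("qm9", continuous=True)` resolves to `MultiProxAn/checkpoints/qm9_c.ckpt`.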

KG anomaly correction (thesis §4.4)

DiGress-style diffusion conditioned on the COINs embedder for the same dataset. Either samples a fresh subgraph (generate) or denoises a user-supplied subgraph (correct).

COINs-KGGeneration/graph_generation/checkpoints/{dataset}{,_correct}.ckpt — 6 files, ~2.7 GB.

Usage

The deployed website downloads the entire repository into its CHECKPOINTS_ROOT at container startup:

from huggingface_hub import snapshot_download
snapshot_download(
    repo_id="Bani57/checkpoints",
    repo_type="model",
    local_dir="src/research",       # mirrors the on-disk layout
    local_dir_use_symlinks=False,
)

For accelerated downloads, install hf_transfer and set HF_HUB_ENABLE_HF_TRANSFER=1. Total payload ≈ 5.8 GB.
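From Python, the same switch can be flipped in-process; the only requirement is that the variable is set before huggingface_hub is imported (a sketch, assuming hf_transfer is already pip-installed):

```python
import os

# Opt in to the Rust-based hf_transfer downloader.
# Must run before `import huggingface_hub` so the setting is picked up.
os.environ["HF_HUB_ENABLE_HF_TRANSFER"] = "1"
```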

The weights are loaded by ModelRegistry in the website backend; lazy per-request loading keeps the working set small.
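The lazy-loading pattern can be sketched as below; `ModelRegistry`'s real interface lives in the website backend source, so the class name, cache, and loader callables here are illustrative assumptions only:

```python
from typing import Any, Callable, Dict

class LazyRegistry:
    """Illustrative stand-in for the backend's ModelRegistry: a checkpoint
    is loaded on first request and cached, so unused models never enter
    memory and repeat requests hit the cache."""

    def __init__(self) -> None:
        self._loaders: Dict[str, Callable[[], Any]] = {}
        self._cache: Dict[str, Any] = {}

    def register(self, name: str, loader: Callable[[], Any]) -> None:
        self._loaders[name] = loader

    def get(self, name: str) -> Any:
        if name not in self._cache:  # load lazily, exactly once
            self._cache[name] = self._loaders[name]()
        return self._cache[name]

registry = LazyRegistry()
# A real loader would wrap e.g. torch.load on one of the .tar/.ckpt files.
registry.register("coins_freebase_transe", lambda: "loaded-weights")
```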

Training

The COINs and MultiProxAn checkpoints were trained on EPFL's GPU cluster during 2021–2025 as part of the doctoral research programme. Training hyperparameters live in the research code's YAML configs.

Intended use

These checkpoints are released to power the interactive thesis demos linked above. They are research artefacts; downstream production use is neither tested nor supported.

Citation

@phdthesis{janchevski_scalable_2025,
  author  = {Andrej Janchevski},
  title   = {Scalable Methods for Knowledge Graph Reasoning and Generation},
  school  = {{EPFL}},
  year    = {2025},
  url     = {https://infoscience.epfl.ch/entities/publication/87acf391-feef-43a0-b665-7f2f0bc70b2c},
}

License

MIT for the released weights and source. The research methods retain their original publication terms; see the thesis.
