Model Attribution
SiMologics is built on top of state-of-the-art open-source antibody AI models. We acknowledge and thank the research teams behind each model.
AntiBERTy
AntiBERTy is a BERT-based antibody language model trained on 558 million unpaired antibody sequences from the Observed Antibody Space (OAS) database. It produces 512-dimensional per-residue and per-sequence embeddings that encode evolutionary, structural, and functional information. SiMologics uses it for sequence embedding, species and chain-type classification, masked residue prediction, and log-likelihood scoring.
Citation: Ruffolo, J. A., Gray, J. J., & Sulam, J. (2021). Deciphering antibody affinity maturation with language models and weakly supervised learning. arXiv:2112.07782.
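Masked residue prediction and log-likelihood scoring are closely related: one common way to score a sequence with a masked language model is the pseudo-log-likelihood, which masks each position in turn and averages the log-probability assigned to the true residue. The sketch below illustrates that idea only. The predict_masked callable, the uniform_model stand-in, and the choice to normalise by length are assumptions made for illustration, not the AntiBERTy API or the scoring SiMologics actually performs.

```python
import math
from typing import Callable, Dict

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Hypothetical interface: given a sequence and the index of the masked
# position, return a probability for each amino acid at that position.
MaskedPredictor = Callable[[str, int], Dict[str, float]]


def pseudo_log_likelihood(sequence: str, predict_masked: MaskedPredictor) -> float:
    """Average log-probability of each true residue when it alone is masked."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    total = 0.0
    for position, residue in enumerate(sequence):
        probs = predict_masked(sequence, position)
        total += math.log(probs[residue])
    # Length normalisation is an assumption; a summed score is also common.
    return total / len(sequence)


def uniform_model(sequence: str, position: int) -> Dict[str, float]:
    """Toy stand-in for a real model: every amino acid is equally likely."""
    return {aa: 1.0 / len(AMINO_ACIDS) for aa in AMINO_ACIDS}


score = pseudo_log_likelihood("EVQLVESGG", uniform_model)
print(round(score, 4))  # -2.9957, i.e. log(1/20)
```

A trained model would assign higher scores (closer to zero) to sequences that resemble its training data, which is what makes the score useful for ranking candidates.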
ProGen2-OAS
ProGen2 is a family of protein language models trained on hundreds of millions of protein sequences using a causal (autoregressive) transformer architecture. The OAS fine-tuned variant is specialised for antibody variable region generation. SiMologics uses it to extend user-provided seed sequences and generate novel antibody sequences via temperature-controlled nucleus sampling.
Citation: Nijkamp, E., et al. (2023). ProGen2: Exploring the boundaries of protein language models. Cell Systems, 14(11), 968-978.
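Temperature-controlled nucleus (top-p) sampling can be sketched in a few lines. At each generation step, the logits are divided by the temperature, converted to probabilities, and truncated to the smallest set of tokens whose cumulative probability reaches top_p; the next token is drawn from that set. The function name nucleus_sample and its parameters are illustrative assumptions, not part of ProGen2 or SiMologics.

```python
import math
import random
from typing import List, Optional


def nucleus_sample(
    logits: List[float],
    temperature: float = 1.0,
    top_p: float = 0.9,
    rng: Optional[random.Random] = None,
) -> int:
    """Return the index of one token sampled with temperature and top-p filtering."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    if not 0 < top_p <= 1:
        raise ValueError("top_p must be in (0, 1]")
    rng = rng or random.Random()

    # Temperature scaling followed by a numerically stable softmax.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    weights = [math.exp(x - peak) for x in scaled]
    total = sum(weights)
    probs = [w / total for w in weights]

    # Keep the smallest set of most probable tokens whose mass reaches top_p.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    nucleus: List[int] = []
    mass = 0.0
    for i in ranked:
        nucleus.append(i)
        mass += probs[i]
        if mass >= top_p:
            break

    # Sample inside the nucleus; random.choices renormalises the weights.
    return rng.choices(nucleus, weights=[probs[i] for i in nucleus], k=1)[0]


# Toy example: four candidate tokens, seeded for reproducibility.
token = nucleus_sample([2.0, 1.0, 0.5, -1.0], temperature=0.8, top_p=0.9,
                       rng=random.Random(0))
```

Lower temperatures and smaller top_p values concentrate sampling on the most probable residues, which gives more conservative sequences; higher values increase diversity at the cost of plausibility.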
IgCraft
IgCraft is an antibody-specific generative model that supports unconditional antibody generation, region inpainting (redesigning selected CDR and framework regions given IMGT-formatted input), inverse folding (predicting sequence from a PDB structure), and CDR grafting (transplanting donor CDR sequences onto an acceptor scaffold).
Citation: No formal publication is listed for IgCraft (internal and/or preprint). See the GitHub repository for the latest citation guidance.
Source: https://github.com/oxpig/IgCraft
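To make the CDR grafting idea concrete, the sketch below shows its simplest possible reading: the acceptor's framework regions are kept and its CDRs are swapped for the donor's. This is plain string manipulation for illustration, not IgCraft's implementation, which is a generative model. The region names, the graft_cdrs function, and the placeholder sequences are all assumptions, and the input is assumed to be already split into regions (for example, after IMGT numbering).

```python
from typing import Dict

REGION_ORDER = ("FR1", "CDR1", "FR2", "CDR2", "FR3", "CDR3", "FR4")


def graft_cdrs(acceptor: Dict[str, str], donor_cdrs: Dict[str, str]) -> str:
    """Return the acceptor sequence with its CDRs replaced by the donor's."""
    missing = [name for name in REGION_ORDER if name not in acceptor]
    if missing:
        raise ValueError(f"acceptor is missing regions: {missing}")

    grafted = dict(acceptor)
    for name in REGION_ORDER:
        # Only CDR regions are transplanted; frameworks stay untouched.
        if name.startswith("CDR") and name in donor_cdrs:
            grafted[name] = donor_cdrs[name]
    return "".join(grafted[name] for name in REGION_ORDER)


# Placeholder strings, not real antibody sequences.
acceptor = {"FR1": "AAA", "CDR1": "CC", "FR2": "DDD", "CDR2": "EE",
            "FR3": "FFF", "CDR3": "GG", "FR4": "HHH"}
donor = {"CDR1": "KK", "CDR2": "LL", "CDR3": "MMMM"}

print(graft_cdrs(acceptor, donor))  # AAAKKDDDLLFFFMMMMHHH
```

In practice, a direct transplant like this often needs follow-up framework adjustments to preserve binding, which is one reason a learned model is used instead.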
BioPhi (Sapiens)
BioPhi incorporates Sapiens, a BERT-based model trained on human antibody repertoire data to score the humanness of each residue in a sequence. SiMologics uses it for two tasks: Humanise (suggest mutations to increase humanness while preserving CDRs) and Score (compute per-residue humanness without modifying the sequence).
Citation: Prihoda, D., et al. (2022). BioPhi: A platform for antibody design, humanization, and humanness evaluation based on natural antibody repertoires and deep learning. mAbs, 14(1).
Source: https://github.com/Merck/BioPhi
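The difference between the Score and Humanise tasks can be illustrated with a toy example. Score reads off the probability a model assigns to each observed residue; Humanise replaces residues outside the CDRs with the model's most probable residue when that residue scores higher. The predict_probs callable, the toy_model stand-in, and the zero-based cdr_positions set are assumptions for illustration; this is not the BioPhi or Sapiens API.

```python
from typing import Callable, Dict, List, Set

# Hypothetical interface: probability of each amino acid at one position.
Predictor = Callable[[str, int], Dict[str, float]]


def score_humanness(sequence: str, predict_probs: Predictor) -> List[float]:
    """Per-residue score: model probability of the residue actually present."""
    return [predict_probs(sequence, i).get(aa, 0.0)
            for i, aa in enumerate(sequence)]


def humanise(sequence: str, predict_probs: Predictor,
             cdr_positions: Set[int]) -> str:
    """Mutate non-CDR positions to the most probable residue; keep CDRs as-is."""
    result = []
    for i, aa in enumerate(sequence):
        if i in cdr_positions:
            result.append(aa)  # CDR residues are preserved
            continue
        probs = predict_probs(sequence, i)
        best = max(probs, key=probs.get)
        # Only mutate when the suggestion is strictly more probable.
        result.append(best if probs[best] > probs.get(aa, 0.0) else aa)
    return "".join(result)


def toy_model(sequence: str, position: int) -> Dict[str, float]:
    """Toy stand-in for a trained model: always prefers alanine."""
    return {"A": 0.7, "G": 0.2, "K": 0.1}


print(score_humanness("GKGK", toy_model))                 # [0.2, 0.1, 0.2, 0.1]
print(humanise("GKGK", toy_model, cdr_positions={1, 2}))  # AKGA
```

Score leaves the input unchanged and only reports values, while Humanise returns a new sequence whose CDR positions are identical to the input.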
