Drugs targeting the central nervous system must meet stringent criteria for both efficacy and safety, including their ability to penetrate the blood-brain barrier (BBB). This model predicts the likelihood of small-molecule drugs crossing the BBB, a critical factor in CNS drug development. The molecules are represented using SMILES (Simplified Molecular Input Line Entry System) strings.

The model is a fine-tuned version of IBM's biomedical foundation model, ibm/biomed.omics.bl.sm.ma-ted-458m [1], which was trained on over 2 billion biological samples across multiple modalities, including proteins, small molecules, and single-cell gene expression data.

The fine-tuning was performed using the MoleculeNet BBBP dataset [2]. For benchmarking, we employed predefined training, validation, and testing splits provided by MolFormer [3], sourced from the dataset referenced in [4].

Model Summary

Usage

Using ibm/biomed.omics.bl.sm.ma-ted-458m.moleculenet_bbbp requires installing the biomed-multi-alignment package from https://github.com/BiomedSciAI/biomed-multi-alignment:

pip install git+https://github.com/BiomedSciAI/biomed-multi-alignment.git

A simple example of using ibm/biomed.omics.bl.sm.ma-ted-458m.moleculenet_bbbp:

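from mammal.examples.molnet.molnet_infer import load_model, task_infer

# SMILES string of the molecule to evaluate (here: dichloromethane)
smiles_seq = "C(Cl)Cl"

# Load the resources for the BBBP task on the CPU
task_dict = load_model(task_name="BBBP", device="cpu")
# Run the BBB-penetration prediction for the single molecule
result = task_infer(task_dict=task_dict, smiles_seq=smiles_seq)
print(f"The prediction for {smiles_seq=} is {result}")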

See our detailed example at: https://github.com/BiomedSciAI/biomed-multi-alignment
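
To score several molecules, the single-molecule call above can be wrapped in a small loop. The following is a minimal sketch, not part of the library: predict_many and its infer_one parameter are hypothetical names introduced here for illustration, and the structure of each result is left opaque because it is whatever task_infer returns.

```python
from typing import Any, Callable, Dict, Iterable


def predict_many(
    smiles_list: Iterable[str],
    infer_one: Callable[[str], Any],
) -> Dict[str, Any]:
    """Apply infer_one to each distinct SMILES string, keeping first-seen order.

    predict_many is a hypothetical helper, not part of biomed-multi-alignment.
    infer_one is any callable that maps one SMILES string to a prediction.
    """
    results: Dict[str, Any] = {}
    for smiles in smiles_list:
        smiles = smiles.strip()
        # Skip empty entries and molecules that were already scored.
        if not smiles or smiles in results:
            continue
        results[smiles] = infer_one(smiles)
    return results


# Usage with the model from the example above (requires the mammal package):
#   from mammal.examples.molnet.molnet_infer import load_model, task_infer
#   task_dict = load_model(task_name="BBBP", device="cpu")
#   results = predict_many(
#       ["C(Cl)Cl", "CCO"],
#       lambda s: task_infer(task_dict=task_dict, smiles_seq=s),
#   )
```

Passing the inference function as an argument means the model is loaded once and reused for every molecule, and the loop can be exercised with a stub function without downloading the model.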

Citation

If you find our work useful, please consider starring the repo and citing our paper:

@misc{shoshan2024mammalmolecularaligned,
      title={MAMMAL -- Molecular Aligned Multi-Modal Architecture and Language}, 
      author={Yoel Shoshan and Moshiko Raboh and Michal Ozery-Flato and Vadim Ratner and Alex Golts and Jeffrey K. Weber and Ella Barkan and Simona Rabinovici-Cohen and Sagi Polaczek and Ido Amos and Ben Shapira and Liam Hazan and Matan Ninio and Sivan Ravid and Michael M. Danziger and Joseph A. Morrone and Parthasarathy Suryanarayanan and Michal Rosen-Zvi and Efrat Hexter},
      year={2024},
      eprint={2410.22367},
      archivePrefix={arXiv},
      primaryClass={q-bio.QM},
      url={https://arxiv.org/abs/2410.22367}, 
}