InvitroSPI and a large database of proteasome-generated spliced and non-spliced peptides

Hanna P. Roetschke, Guillermo Rodriguez-Hernandez, John A. Cormican, Xiaoping Yang, Steven Lynham, Michele Mishto*, Juliane Liepe*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

1 Citation (Scopus)


Noncanonical epitopes presented by Human Leucocyte Antigen class I (HLA-I) complexes to CD8+ T cells have attracted attention in research on novel immunotherapies against cancer, infection and autoimmunity. Proteasomes, which are the main producers of HLA-I-bound antigenic peptides, can catalyze both peptide hydrolysis and peptide splicing. Predicting proteasome-generated spliced peptides still requires a large, reliable database of non-spliced and spliced peptides produced by these proteases. Here, we present an extended database of proteasome-generated spliced and non-spliced peptides, obtained by analyzing in vitro digestions of 80 unique synthetic polypeptide substrates measured on different mass spectrometers. Peptides were identified using the invitroSPI method, which was validated through in silico and in vitro strategies. The peptide product database contains 16,631 unique peptide products (5,493 non-spliced, 6,453 cis-spliced and 4,685 trans-spliced peptide products) and a variety of substrate sequences, making it a valuable source for predictors of proteasome-catalyzed peptide hydrolysis and splicing. Potential artefacts and skewed results due to different identification and analysis strategies are discussed.

Original language: English
Article number: 18
Journal: Scientific Data
Issue number: 1
Publication status: Published - 10 Dec 2023

