Quick Start Guide

So I hear your week has been hectic. No worries. In this tutorial, we will walk through the basic usage of pMTnet Omni Document with minimal configuration. If you are truly swamped, we recommend our online tool.

Note

Make sure your data file, which we assume is located at ./df.csv, is structured somewhat like the following:

For more detailed instructions on the data format, please check out Input File Format.

CLI (Command Line Interface)

With the CLI, you only need one line of code.

python -m pMTnet_Omni_Document --file_path ./df.csv --output_folder_path ./
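If you would rather launch the same command from a Python script, for example to process several files in a loop, one option is to wrap it with the standard subprocess module. The following is only a sketch: the helper names build_command and run_cli are hypothetical and not part of pMTnet Omni Document, and only the two flags shown above are used.

```python
import subprocess
import sys
from pathlib import Path


def build_command(file_path, output_folder_path):
    """Assemble the CLI call shown above as an argument list."""
    return [
        sys.executable, "-m", "pMTnet_Omni_Document",
        "--file_path", str(file_path),
        "--output_folder_path", str(output_folder_path),
    ]


def run_cli(file_path, output_folder_path):
    """Run the CLI for one input file (hypothetical helper).

    Raises FileNotFoundError if the input file is missing and
    subprocess.CalledProcessError if the command exits with an error.
    """
    if not Path(file_path).is_file():
        raise FileNotFoundError(f"Input file not found: {file_path}")
    subprocess.run(build_command(file_path, output_folder_path), check=True)
```

Calling run_cli("./df.csv", "./") would then be equivalent to the one-line command above, assuming the package is installed in the same Python environment that runs the script.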

Interactive Python

Read the file

# Import necessary functions
from pMTnet_Omni_Document.data_curation import read_file

# Read the file
df, mhc_seq_dict = read_file(file_path="./df.csv",
                            save_results=True,
                            sep=",")

In the output, you will see two files:

./df_curated.csv contains all the curated data. You will also see some extra columns in this file.

./mhc_seq_dict.json is a JSON file containing a dictionary. The keys are the MHC sequences and the values are their corresponding ESM embeddings.

Note

If the mhca_use_seq and/or mhcb_use_seq columns contain only False, then the JSON file will simply contain an empty dictionary.
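To sanity-check the two output files, a few lines of standard-library Python are enough. The sketch below makes several assumptions: the function name summarize_outputs is hypothetical, the curated CSV is comma-separated, and the mhca_use_seq and mhcb_use_seq columns store booleans as the text True or False. Adjust it to match what your files actually contain.

```python
import csv
import json


def summarize_outputs(curated_csv_path, mhc_json_path):
    """Summarize the curated CSV and the MHC JSON file (assumed layout)."""
    with open(curated_csv_path, newline="") as handle:
        rows = list(csv.DictReader(handle))

    with open(mhc_json_path) as handle:
        mhc_seq_dict = json.load(handle)

    def any_true(column):
        # Assumption: booleans are stored as the text "True" / "False".
        # A missing column or empty cell is treated as False.
        return any((row.get(column) or "").strip().lower() == "true"
                   for row in rows)

    return {
        "n_rows": len(rows),
        "uses_mhca_seq": any_true("mhca_use_seq"),
        "uses_mhcb_seq": any_true("mhcb_use_seq"),
        "n_mhc_sequences": len(mhc_seq_dict),
    }
```

For example, summarize_outputs("./df_curated.csv", "./mhc_seq_dict.json") returns a small dictionary of counts and flags. If both flags are False, an n_mhc_sequences of 0 is the expected result rather than a sign of failure.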