Quick Start Guide
Had a hectic week? No worries. This tutorial walks through the basic usage of pMTnet Omni Document with minimal configuration. If you are truly swamped, we recommend our online tool.
Note
Make sure your data file, which we assume is located at ./df.csv, follows the expected structure. For detailed instructions on the data format, please check out Input File Format.
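Before running anything, it can help to list the column names in the data file and compare them by eye against Input File Format. The following is a minimal sketch using only the Python standard library; `preview_columns` is a hypothetical helper, not part of pMTnet Omni Document, and it assumes the first row of the file is a header.

```python
import csv


def preview_columns(file_path, sep=","):
    """Return the header row of a delimited text file as a list of column names.

    Hypothetical helper for illustration; it does not validate the columns,
    it only shows what is there so you can compare against the documentation.
    """
    with open(file_path, newline="") as handle:
        # An empty file yields an empty list instead of raising StopIteration.
        return next(csv.reader(handle, delimiter=sep), [])
```

Calling `preview_columns("./df.csv")` returns the header as a list of strings, which you can then check against the documented format.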
CLI (Command Line Interface)
With the CLI, you only need a single command.
python -m pMTnet_Omni_Document --file_path ./df.csv --output_folder_path ./
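If you want to launch the same command from a Python script (for example, inside a larger pipeline), one option is to wrap it with `subprocess`. The sketch below only uses the two flags shown above; `build_command` and `run_cli` are hypothetical helper names introduced for illustration, not part of the package.

```python
import subprocess
import sys
from pathlib import Path


def build_command(file_path, output_folder_path):
    """Assemble the CLI call shown above as an argument list."""
    return [
        sys.executable, "-m", "pMTnet_Omni_Document",
        "--file_path", str(file_path),
        "--output_folder_path", str(output_folder_path),
    ]


def run_cli(file_path, output_folder_path):
    """Run the CLI; raises CalledProcessError if it exits with a non-zero code."""
    # Fail early with a clear message if the input file is missing.
    if not Path(file_path).is_file():
        raise FileNotFoundError(f"Input file not found: {file_path}")
    return subprocess.run(build_command(file_path, output_folder_path), check=True)
```

With the package installed, `run_cli("./df.csv", "./")` is then equivalent to typing the command by hand. Using `sys.executable` keeps the call in the same Python environment as the calling script.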
Interactive Python
Read the file
# Import necessary functions
from pMTnet_Omni_Document.data_curation import read_file
# Read the file
df, mhc_seq_dict = read_file(file_path="./df.csv",
                             save_results=True,
                             sep=",")
In the output, you will see two files:
./df_curated.csv contains all the curated data. You will also see some extra columns in this file.
Note
If the mhca_use_seq and/or mhcb_use_seq columns contain only False, the json file (./mhc_seq_dict.json) will simply contain an empty dictionary.
./mhc_seq_dict.json is a json file containing a dictionary. The keys are MHC sequences and the values are their corresponding ESM embeddings.
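To sanity-check the two output files, you could load them back and summarize what they contain. This is a minimal sketch using only the standard library; `summarize_outputs` and `column_all_false` are hypothetical helpers, and the code assumes the boolean columns were written to the CSV as the text "True" or "False".

```python
import csv
import json


def column_all_false(rows, column):
    """Return True if every value in `column` reads as False.

    Assumption: booleans are stored as the text "True"/"False" in the CSV.
    """
    return all(row[column].strip().lower() == "false" for row in rows)


def summarize_outputs(csv_path, json_path):
    """Report the row count, the number of MHC sequences, and the use_seq flags."""
    with open(csv_path, newline="") as handle:
        rows = list(csv.DictReader(handle))
    with open(json_path) as handle:
        mhc_seq_dict = json.load(handle)
    return {
        "rows": len(rows),
        "mhc_sequences": len(mhc_seq_dict),  # number of keys in the dictionary
        "mhca_all_false": column_all_false(rows, "mhca_use_seq"),
        "mhcb_all_false": column_all_false(rows, "mhcb_use_seq"),
    }
```

For example, `summarize_outputs("./df_curated.csv", "./mhc_seq_dict.json")` lets you see at a glance whether an empty dictionary coincides with use_seq columns that contain only False.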