This repository contains all the data and code needed to generate the figures and support the conclusions in the following works:
This study used Python 3.7.7 and the following packages:
Several custom Python scripts are included in this repository. The Jupyter notebooks herein import them for flow cytometry data analysis:
flow_data_munging.py
import flow_data_munging as munge_flow
global_parameters_flow_analysis.py
import global_parameters_flow_analysis as flow_params
flow_statistical_analysis.py
import flow_statistical_analysis as flow_stat
plot_flow_data.py
import plot_flow_data as flowplot
Where to find datasets (raw and processed), code, and figures. For more information about how datasets were generated, see the Methods sections of the written works described above. The Adobe Illustrator file in this repository (.ai extension) contains the figure layouts with figures produced using the data and code described below.
This dataset is available at NCBI under GEO accession GSE233573.
C2C12-Nkd_vs_WT_plated_ligand_assay_scatterplotting.ipynb: Run this notebook to generate RNA-seq scatterplot figures. C2C12-Nkd_vs_WT_plated_ligand_assay_scatterplotting.ipynb
.Processing_Dataset3_2021_CHO_reporter_vs_cis_ligand_48well-Step1.ipynb
and Processing_Dataset3_2021_CHO_reporter_vs_cis_ligand_48well-Step2.ipynb
)--see the local README.txt file for more info. Raw data subsets are divided into the following subfolders:
Each of the below qRT-PCR subfolders contains the raw data as Excel spreadsheets, a Jupyter notebook (.ipynb file extension) with the same name as the dataset subfolder, and the figure(s) generated by that Jupyter notebook (.svg file extension).
Each subfolder listed below contains gel images corresponding to the Western blot and gel electrophoresis data reported in the paper. Some gel images include additional gels unrelated to this experiment; in such cases, the images are provided in both uncropped versions and cropped versions with annotations. For more experimental details, see the article's relevant Methods section.
Flow cytometry data analysis steps are described in the proper sequential order with example commands below. A subset of the processing commands was used to process each of the datasets, depending on the cell types, culture schemes, and treatments used in the experiment that generated each dataset.
Cells were gated in a 2D plane of forward scatter (FSC) and side scatter (SSC) to select intact, singlet cells. [All data.]
df = munge_flow.apply_fluorescence_gate(df, flow_params.SSC_FSC_gate, 'SSC_v_FSC')
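The internals of apply_fluorescence_gate are not described in this README. As a rough illustration only, a 2D gate can be thought of as a polygon in the SSC-vs-FSC plane, with a cell kept when its point falls inside the polygon. The sketch below uses a plain ray-casting test; the function names, the column names ('FSC', 'SSC'), and the gate vertices are all hypothetical and are not taken from flow_data_munging.py or global_parameters_flow_analysis.py.

```python
import pandas as pd


def point_in_polygon(x, y, vertices):
    """Ray-casting test: True if (x, y) lies inside the polygon."""
    inside = False
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        # Does a horizontal ray from (x, y) cross the edge (x1, y1)-(x2, y2)?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def apply_polygon_gate(df, x_col, y_col, vertices):
    """Return only the rows whose (x_col, y_col) point falls inside the gate."""
    keep = [point_in_polygon(x, y, vertices)
            for x, y in zip(df[x_col], df[y_col])]
    mask = pd.Series(keep, index=df.index)
    return df[mask].reset_index(drop=True)


# Hypothetical events and gate vertices (illustrative numbers only).
cells = pd.DataFrame({"FSC": [50.0, 150.0, 400.0, 250.0],
                      "SSC": [50.0, 120.0, 300.0, 900.0]})
gate = [(100, 100), (500, 100), (500, 500), (100, 500)]
gated = apply_polygon_gate(cells, "FSC", "SSC", gate)
```

In this toy example, the first event (debris-like, low FSC and SSC) and the last event (very high SSC) fall outside the gate and are dropped.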
Cells were gated in a 2D plane of mTurq2 (PB450, A.U.) vs. SSC to separate out the +mTurq2 receiver cells from -mTurq2 senders or ‘blank’ parental cells. [All samples with receiver cells.]
df_r = munge_flow.apply_fluorescence_gate(df, flow_params.receiver_gate, 'receivers')
df_s = munge_flow.apply_fluorescence_gate(df, flow_params.sender_gate, 'senders')
Plasmid-transfected cells were gated in the APC700 channel to select cells expressing the cotransfection marker IFP2. [Plasmid-transfected samples only.]
df_transfected = munge_flow.apply_fluorescence_gate(df_r, flow_params.transfected_gate, 'transfected')
Note: df_transfected would be used in all subsequent steps instead of df_r, if applicable.
Receiver cells coexpressing ligand were gated into six consecutive bins of arbitrary mCherry (ECD) fluorescence units. [All samples with receiver cells coexpressing ligands.]
df_r = munge_flow.apply_multiple_1D_gates(df_r, 'mCh', flow_params.mCh_gate_bounds, keep_ungated=True)
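For readers unfamiliar with 1D binning, the idea can be sketched with pandas.cut. This is not the repository's implementation: the bin edges, the bin labels, and the 'ungated' label are hypothetical, and the actual structure of flow_params.mCh_gate_bounds is not documented here. The sketch assumes that keep_ungated=True means cells outside every bin are retained rather than dropped.

```python
import pandas as pd

# Hypothetical bin edges (A.U.); six consecutive bins need seven edges.
mch_edges = [100, 300, 1000, 3000, 10000, 30000, 100000]
bin_labels = ["mCh_bin%d" % i for i in range(1, 7)]

cells = pd.DataFrame({"mCh": [50, 150, 2500, 99999, 250000]})

# pd.cut assigns each cell to the half-open interval [low, high) it falls in;
# cells outside every bin get NaN.
binned = pd.cut(cells["mCh"], bins=mch_edges, labels=bin_labels, right=False)

# Keep out-of-range cells, but mark them with a hypothetical 'ungated' label.
cells["mCh_bin"] = binned.astype(object).where(binned.notna(), "ungated")
```

Filtering out the rows labeled 'ungated' afterwards would correspond to discarding cells that fall outside all six bins.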
Compensation was applied to subtract mTurq2 signal leaking into the FITC channel. [All samples with receiver cells.]
df_r = munge_flow.compensate(df_r, flow_params.mCit_mTurq_comp)
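A common form of compensation is linear: the corrected FITC signal equals the measured FITC signal minus a spillover coefficient times the PB450 signal. The sketch below assumes that form. The function name, the slope value, and the optional intercept are hypothetical, and the actual contents of flow_params.mCit_mTurq_comp may differ.

```python
import pandas as pd


def compensate_linear(df, target_col, source_col, slope, intercept=0.0):
    """Subtract the estimated bleed-through of source_col from target_col."""
    out = df.copy()
    out[target_col] = out[target_col] - (slope * out[source_col] + intercept)
    return out


# Hypothetical single-cell values (A.U.) and a made-up spillover slope.
cells = pd.DataFrame({"mCitrine": [120.0, 560.0],
                      "mTurq": [1000.0, 3000.0]})
compensated = compensate_linear(cells, "mCitrine", "mTurq", slope=0.02)
```

In practice, a coefficient like this is typically estimated from cells that express only the spilling fluorophore (here, mTurq2 without reporter activity).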
If applicable, reporter activity (mCitrine fluorescence; FITC, A.U.) was normalized to cotranslational receptor expression by dividing the mCitrine signal by the mTurq2 signal (PB450, A.U.). The resulting mCitrine/mTurq2 ratio is the “signaling activity” (reporter activity per unit receptor). [All samples with receiver cells.]
df_r = munge_flow.single_cell_fluor_norm(df_r, numerator_col='mCitrine', denominator_col='mTurq')
If applicable, cotranslational cis-ligand expression was normalized to cotranslational receptor expression by dividing the mCherry signal by the mTurq2 signal (PB450, A.U.). The resulting mCherry/mTurq2 ratio (cis-ligand expression per unit receptor) controls for slight variations in receptor expression when quantitatively comparing ligands’ cis-inhibition efficiencies (Figure 6). [All samples with receiver cells coexpressing ligands.]
df_r = munge_flow.single_cell_fluor_norm(df_r, numerator_col='mCh', denominator_col='mTurq')
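Both normalizations above are per-cell ratios. A minimal sketch of that operation follows; the helper name ratio_column is hypothetical, and the output column naming ('<numerator>/<denominator>') is an assumption inferred from the 'mCitrine/mTurq' column used later in this README.

```python
import pandas as pd


def ratio_column(df, numerator_col, denominator_col):
    """Add a per-cell ratio column named '<numerator>/<denominator>'."""
    out = df.copy()
    ratio_name = numerator_col + "/" + denominator_col
    out[ratio_name] = out[numerator_col] / out[denominator_col]
    return out


# Hypothetical single-cell values (A.U.).
cells = pd.DataFrame({"mCitrine": [200.0, 900.0],
                      "mCh": [50.0, 600.0],
                      "mTurq": [1000.0, 3000.0]})
cells = ratio_column(cells, "mCitrine", "mTurq")  # signaling activity
cells = ratio_column(cells, "mCh", "mTurq")       # cis-ligand per unit receptor
```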
Average bulk measurements for each sample were obtained by computing the mean signal across single-cell data for a given sample (and mCherry bin, if applicable). Cells treated with different 4-epi-tetracycline (4-epi-Tc) levels were pooled as technical replicates after mCherry binning. A minimum of 100 cells were required during averaging; mCherry bins with too few cells did not generate a bulk data point. [All samples.]
# Define fluorescence values to average in calculation of bulk fluorescence levels.
y_columns = ['mCitrine','mTurq','mCh', 'mCitrine/mTurq']
# Define x_categories: a list of data attribute names and values to separate
# the samples by, prior to averaging across all single cells in those unique samples.
# The composition of x_categories depends on the relevant data attributes for
# each given dataset. Here are two examples:
x_categories = [['date', pd.Series.unique(df_r['date'])],
                ['gate', pd.Series.unique(df_r['gate'])],
                ['sample', pd.Series.unique(df_r['sample'])]] # Without mCh binning
x_categories = [['date', pd.Series.unique(df_r['date'])],
                ['gate', pd.Series.unique(df_r['gate'])],
                ['receptor', pd.Series.unique(df_r['receptor'])],
                ['cis-ligand', pd.Series.unique(df_r['cis-ligand'])],
                ['sender', pd.Series.unique(df_r['sender'])],
                ['biorep', pd.Series.unique(df_r['biorep'])]] # With mCh binning
# Average the specified fluorescence values across single cells for each unique sample.
df_bulk = munge_flow.summarize_multiple_yvals(df_r, y_columns, x_categories,
                                              stats=['median', 'mean'], min_cell_count=100)
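Conceptually, this step is a group-wise summary with a minimum cell count. The sketch below shows one way to express that with a pandas groupby. It is not the code in flow_data_munging.py: the function name summarize, the 'cell_count' column, and the toy data are hypothetical, and the threshold is lowered to 3 so the tiny example can show a group being dropped.

```python
import pandas as pd


def summarize(df, y_columns, group_cols, min_cell_count=100):
    """Mean and median of y_columns per group, skipping small groups."""
    grouped = df.groupby(group_cols)
    stats = grouped[y_columns].agg(["mean", "median"])
    # Flatten ('mCitrine/mTurq', 'mean') into 'mCitrine/mTurq_mean'.
    stats.columns = ["%s_%s" % (col, stat) for col, stat in stats.columns]
    stats["cell_count"] = grouped.size()
    # Groups with too few cells do not produce a bulk data point.
    stats = stats[stats["cell_count"] >= min_cell_count]
    return stats.reset_index()


# Hypothetical single-cell data: sample A has 3 cells, sample B has 2.
cells = pd.DataFrame({
    "sample": ["A"] * 3 + ["B"] * 2,
    "gate": ["receivers"] * 5,
    "mCitrine/mTurq": [0.1, 0.2, 0.6, 1.0, 3.0],
})
bulk = summarize(cells, ["mCitrine/mTurq"], ["sample", "gate"],
                 min_cell_count=3)
```

Here only sample A survives the cell-count filter, so the result has a single row.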
Background subtraction was performed by subtracting “leaky” reporter activity of the receiver (with minimal cis-ligand, if applicable) in coculture with “blank” senders (CHO-K1 wt or C2C12-Nkd parental cells, according to the receiver cell type). You must first add a column to the dataframe that designates the 'background' samples with signal to subtract from the corresponding non-background samples.
# Define fluorescence values to background subtract.
y_columns = ['mCitrine/mTurq_mean',]
# Define x_categories: a list of data attribute names and values to separate
# the samples by, prior to background subtraction. Here, the order of attributes
# in x_categories does matter: the last item in the list must specify the
# column (here, 'control') and the label (here, 'bsub') that designate the
# background samples to subtract. The 'bsub' label must come first in that
# item's list of labels.
# The below example is relatively simple - for some datasets, several more data
# attributes should be included in the list.
x_categories = [['receiver', ['Notch1', 'Notch2']],
                ['biorep', [1, 2, 3, 4]],
                ['control', ['bsub', '']]]
# In the above example, including 'biorep' in x_categories ensures that
# background fluorescence is subtracted individually for each biorep. If you
# instead exclude 'biorep' from x_categories, the function below will subtract
# the average value across all bioreps.
# Perform the background subtraction (creates new column).
df_bulk = munge_flow.background_subtract(df_bulk, y_columns, x_categories)
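To make the effect of including or excluding 'biorep' concrete, here is a small sketch of group-wise background subtraction. It is an illustration under stated assumptions, not the repository's function: the name subtract_background_sketch, its arguments, and the toy values are hypothetical. Only the 'control' column, the 'bsub' label, and the '_bsub' column suffix follow names that appear in this README.

```python
import pandas as pd


def subtract_background_sketch(df, y_col, group_cols,
                               control_col="control", bg_label="bsub"):
    """Subtract each group's mean background value from all rows in the group."""
    background = (df[df[control_col] == bg_label]
                  .groupby(group_cols)[y_col].mean()
                  .rename("_background")
                  .reset_index())
    out = df.merge(background, on=group_cols, how="left")
    out[y_col + "_bsub"] = out[y_col] - out["_background"]
    return out.drop(columns="_background")


# Hypothetical bulk data: one background and one test sample per biorep.
bulk = pd.DataFrame({
    "receiver": ["Notch1"] * 4,
    "biorep": [1, 1, 2, 2],
    "control": ["bsub", "", "bsub", ""],
    "mCitrine/mTurq_mean": [0.1, 0.6, 0.2, 0.9],
})
y = "mCitrine/mTurq_mean"

# Background subtracted separately within each biorep.
per_biorep = subtract_background_sketch(bulk, y, ["receiver", "biorep"])
# Background averaged across bioreps before subtraction.
pooled = subtract_background_sketch(bulk, y, ["receiver"])
```

With per-biorep grouping, the two test samples become 0.5 and 0.7. With pooled grouping, the shared background is 0.15, so they become 0.45 and 0.75.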
Y-axis normalization was performed as described in each figure caption, but most often by dividing background-subtracted fluorescence values by fluorescence in some condition with 'maximal' signaling activity. The process of y-axis normalization is similar to background subtraction: you must first add a column to the dataframe that designates the samples with the signal you'll divide other samples' signal by.
# Define fluorescence values to normalize.
y_columns = ['mCitrine/mTurq_mean_bsub',]
# Define x_categories: a list of data attribute names and values to separate
# the samples by, prior to y-axis normalization. Here, the order of attributes
# in x_categories does matter: the last item in the list must specify the
# column (here, 'control') and the label (here, 'Yeq1') that designate the
# samples to divide other samples by. The first value in the list of control
# labels must be the control label designating the samples to divide others by.
# The below example is relatively simple - for some datasets, several more data
# attributes should be included in the list.
x_categories = [['receiver', ['Notch1', 'Notch2']],
                ['control', ['Yeq1', 'bsub', '']]]
# When 'biorep' or 'date' is excluded from x_categories, the function below
# will divide by the average 'Yeq1' fluorescence value across all bioreps
# (this is the recommended approach for y-axis normalization for most data
# in this study).
# Perform y-axis normalization (creates new column).
df_bulk = munge_flow.compute_fold_change(df_bulk, y_columns, x_categories)
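The normalization can be pictured as dividing every sample by the mean of the reference ('Yeq1') samples in its group. The sketch below assumes exactly that and pools the reference across bioreps, as recommended above. The function name fold_change_sketch, the '_norm' output suffix, and the toy values are hypothetical; the repository's compute_fold_change may name its output column differently.

```python
import pandas as pd


def fold_change_sketch(df, y_col, group_cols,
                       control_col="control", ref_label="Yeq1"):
    """Divide each row by the mean reference value of its group."""
    reference = (df[df[control_col] == ref_label]
                 .groupby(group_cols)[y_col].mean()
                 .rename("_reference")
                 .reset_index())
    out = df.merge(reference, on=group_cols, how="left")
    out[y_col + "_norm"] = out[y_col] / out["_reference"]
    return out.drop(columns="_reference")


# Hypothetical background-subtracted bulk data.
bulk = pd.DataFrame({
    "receiver": ["Notch1", "Notch1", "Notch1", "Notch2", "Notch2"],
    "biorep": [1, 2, 1, 1, 1],
    "control": ["Yeq1", "Yeq1", "", "Yeq1", ""],
    "mCitrine/mTurq_mean_bsub": [2.0, 4.0, 1.5, 10.0, 5.0],
})
y = "mCitrine/mTurq_mean_bsub"

# 'biorep' is left out of the grouping, so the Notch1 reference is the
# average of both bioreps: (2.0 + 4.0) / 2 = 3.0.
normalized = fold_change_sketch(bulk, y, ["receiver"])
```

In this example both non-reference samples end up at 0.5 of their receiver's reference signal.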
Pre-processing steps were performed first, in order. Processing steps were performed next but are largely parallel to each other. Where applicable, sequential processing steps are denoted with 'A' (first) and 'B' (second).
Pre-processing
These notebooks export files to the Raw_Data folders only.
Pre-processing_step1_convert_fcs_to_csv.ipynb: Import the original flow cytometry .fcs files and export .csv files. This notebook looks for a folder containing .fcs files and, in the same location, an .xls file listing the sample labels that correspond to the .fcs files. It exports a single .csv file with the data from all .fcs files in the specified folder. FCS file import and sample labeling are slow and can take multiple days for some datasets. To use this notebook, you must update the paths and folder names.
Pre-processing_step2_combine_csvs_for_each_experiment.ipynb: Combine flow cytometry data from multiple .csv files (if applicable), add a column specifying the date the data were collected, and fix errors in sample labeling (if applicable). The date is either scraped from the filename or mapped from an Excel file.
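As a rough sketch of the filename-scraping route, the example below combines several .csv sources and tags each row with a date parsed from the source name. The filename patterns (YYYYMMDD or YYYY-MM-DD), the helper names, and the column names are hypothetical; the notebook's actual parsing rules are not described in this README. In-memory text buffers stand in for files so the example runs on its own.

```python
import io
import re

import pandas as pd


def date_from_filename(filename):
    """Return the first YYYYMMDD or YYYY-MM-DD date as ISO text, else None."""
    match = re.search(r"(\d{4})-?(\d{2})-?(\d{2})", filename)
    return "-".join(match.groups()) if match else None


def combine_csvs(named_sources):
    """Concatenate CSV sources, adding a 'date' column from each name."""
    frames = []
    for name, source in named_sources:
        frame = pd.read_csv(source)
        frame["date"] = date_from_filename(name)
        frames.append(frame)
    return pd.concat(frames, ignore_index=True)


# Hypothetical files; real use would pass file paths instead of buffers.
files = [
    ("20210304_plate1.csv", io.StringIO("sample,mCitrine\nA,100\nB,200\n")),
    ("2021-03-11_plate1.csv", io.StringIO("sample,mCitrine\nA,150\n")),
]
combined = combine_csvs(files)
```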
Processing
These notebooks export files to the ProcessedData folders only. The name of each Processing notebook file is 'Processing', followed by the name of the folder that it exports data files and figures to.