Abstract
As part of the Reproducibility Project: Cancer Biology, we published a Registered Report (Blum et al., 2015) that described how we intended to replicate selected experiments from the paper "Transcriptional amplification in tumor cells with elevated c-Myc" (Lin et al., 2012). Here we report the results. We found overexpression of c-Myc increased total levels of RNA in P493-6 Burkitt’s lymphoma cells; however, while the effect was in the same direction as the original study (Figure 3E; Lin et al., 2012), statistical significance and the size of the effect varied between the original study and the two different lots of serum tested in this replication. Digital gene expression analysis for a set of genes was also performed on P493-6 cells before and after c-Myc overexpression. Transcripts from genes that were active before c-Myc induction increased in expression following c-Myc overexpression, similar to the original study (Figure 3F; Lin et al., 2012). Transcripts from genes that were silent before c-Myc induction also increased in expression following c-Myc overexpression, whereas the original study concluded elevated c-Myc had no effect on silent genes (Figure 3F; Lin et al., 2012). Treating the data as paired, we found a statistically significant increase in gene expression for both active and silent genes upon c-Myc induction, with the change in gene expression greater for active genes compared to silent genes. Finally, we report meta-analyses for each result.
chooseCRANmirror(graphics=FALSE, ind=1) #selects a CRAN mirror
#Writes a manifest to local folder which includes all packages necessary to run each script called in the r markdown
cat('
library(httr)
library(tidyr)
library(reshape2)
library(pander)
library(car)
library(lsmeans)
library(coin)
library(MBESS)
library(metafor)
library(rjson)
library(psychometric)
',
file = "manifest.R")
#Creates a .checkpoint folder (in tempdir for this example)
dir.create(file.path(tempdir(), ".checkpoint"), recursive = TRUE, showWarnings = FALSE)
options(install.packages.compile.from.source = "no")
#Creates a checkpoint which allows for installation of packages as they existed on CRAN at the snapshot date of 2017-10-19
#Installs checkpoint only if it is not already available
if (!requireNamespace("checkpoint", quietly = TRUE)) {
  install.packages("checkpoint")
}
#loads checkpoint
library(checkpoint)
checkpoint("2017-10-19", checkpointLocation = tempdir())
#Checkpoint in markdown code found at: https://github.com/RevolutionAnalytics/checkpoint/blob/master/vignettes/archive/using-checkpoint-with-knitr.Rmd
##Replication Study: Transcriptional amplification in tumor cells with elevated c-Myc##
L. Michelle Lewis1, Meredith C. Edwards1, Zachary R. Meyers1, C. Conover Talbot Jr.2, Haiping Hao2, David Blum1, Reproducibility Project: Cancer Biology†*
† The RP:CB core team consists of Elizabeth Iorns (Science Exchange, Palo Alto, California), Rachel Tsui (Science Exchange, Palo Alto, California), Alexandria Denis (Center for Open Science, Charlottesville, Virginia), Nicole Perfito (Science Exchange, Palo Alto, California), and Timothy M. Errington (Center for Open Science, Charlottesville, Virginia).
1 University of Georgia, Bioexpression and Fermentation Facility, Athens, Georgia, United States
2 Johns Hopkins University, Deep Sequencing and Microarray Core Facility, Baltimore, Maryland, United States
* Correspondence to Nicole Perfito ([email protected]) and Timothy M. Errington ([email protected])
Competing Interests
RP:CB: EI, RT, NP: Employed by and hold shares in Science Exchange Inc.
LML, MCE, ZRM, DB: Bioexpression and Fermentation Facility, University of Georgia is a Science Exchange associated lab.
CCT, HH: Deep Sequencing and Microarray Core Facility, Johns Hopkins University is a Science Exchange associated lab.
The other authors declare no conflicts of interest exist.
Funding
The Reproducibility Project: Cancer Biology is funded by the Laura and John Arnold Foundation, provided to the Center for Open Science in collaboration with Science Exchange. The funder had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Private Link for additional content related to the experimentation (raw files, methods notes, scripts, etc)
https://osf.io/mokeb/?view_only=756a4e87b872460d8d4ed25eae4d5150
Suggested browser to view: Chrome
To view any osf.io link in the manuscript add this extension to the end of it in your browser: "?view_only=756a4e87b872460d8d4ed25eae4d5150"
List of private links for experiments reported:
Conditional expression of c-Myc in P493-6 cells and total RNA levels (Figure 1):
https://osf.io/tfd57/?view_only=756a4e87b872460d8d4ed25eae4d5150
Digital gene expression following c-Myc overexpression (Figure 2):
https://osf.io/fn2y4/?view_only=756a4e87b872460d8d4ed25eae4d5150
Meta-analyses (Figure 3):
https://osf.io/5yscz/?view_only=756a4e87b872460d8d4ed25eae4d5150
Introduction
The Reproducibility Project: Cancer Biology (RP:CB) is a collaboration between the Center for Open Science and Science Exchange that seeks to address concerns about reproducibility in scientific research by conducting replications of selected experiments from a number of high-profile papers in the field of cancer biology (Errington et al., 2014). For each of these papers a Registered Report detailing the proposed experimental designs and protocols for the replications was peer reviewed and published prior to data collection. The present paper is a Replication Study that reports the results of the replication experiments detailed in the Registered Report (Blum et al., 2015) for a 2012 paper by Lin et al., and uses a number of approaches to compare the outcomes of the original experiments and the replications.
In 2012, Lin et al. reported that the c-Myc transcription factor, a potent oncogene that is overexpressed in a large percentage of cancers, globally amplifies the expression of actively transcribed genes, as opposed to regulating specific target genes. Using the P493-6 cell line, a model for MYC activation in Burkitt’s lymphoma, total levels of RNA per cell were reported to increase when c-Myc was highly expressed compared to conditions where c-Myc expression was low. Additionally, active genes in cells with low c-Myc levels were reported to increase in expression upon c-Myc induction, in contrast to genes that were silent under low c-Myc conditions, which did not change.
The Registered Report for the 2012 paper by Lin et al. described the experiments to be replicated (Figures 1B and 3E-F), and summarized the current evidence for these findings (Blum et al., 2015). Since that publication there have been additional studies investigating the ability of c-Myc to influence the global gene expression output of cells. Similar to Lin et al., other studies have reported c-Myc dependent amplification of cellular RNA (Hart et al., 2015; Hsu et al., 2015; Nie et al., 2012; Sabò et al., 2014), although this observation was not reported in all biological systems (Fagnocchi et al., 2016; Sabò et al., 2014; Walz et al., 2014). It has been suggested c-Myc regulates specific genes that indirectly lead to RNA amplification (Sabò et al., 2014; Sabò and Amati, 2014; Walz et al., 2014). This has also been suggested of MYCN (Duffy et al., 2014). The reported differences could be a result of the intrinsic variation between cell lines in maintaining the transcriptome (Trakhtenberg et al., 2016). Indeed, a recent study reported that distinct transcriptional regulation can be accounted for by differences in promoter affinity under different c-Myc expression levels (Lorenzin et al., 2016).
The outcome measures reported in this Replication Study will be aggregated with those from the other Replication Studies to create a dataset that will be examined to provide evidence about reproducibility of cancer biology research, and to identify factors that influence reproducibility more generally.
Results and Discussion
Conditional expression of c-Myc in the B-cell line P493-6
To test the effects of increased levels of c-Myc on gene expression we used the same human P493-6 B cell line of Burkitt’s lymphoma that contains a conditional tetracycline-repressive MYC transgene (Pajic et al., 2000; Schuhmacher et al., 1999) as the original study. We performed Western blot analysis to confirm c-Myc expression could be reduced to very low levels and then reactivated after removal of tetracycline. This is comparable to what was reported in Figure 1B of Lin et al., 2012 and described in Protocol 1 in the Registered Report (Blum et al., 2015). Since proliferation of P493-6 cells depends on c-Myc expression and the presence of serum (Pajic et al., 2000; Schuhmacher et al., 1999), with serum reported to stimulate a majority of genes independent of c-Myc (Schlosser et al., 2005), we maintained these cells in separate lots of serum to assess whether the results differed. For cells maintained in both lots of serum, treatment with tetracycline resulted in a strong decrease in c-Myc protein levels (Figure 1A). After removal of tetracycline, c-Myc levels increased over time, approaching the levels observed in tetracycline-free conditions.
Total RNA levels following c-Myc overexpression
We sought to independently replicate whether increased levels of c-Myc resulted in increased absolute levels of RNA. This experiment is similar to what was reported in Figure 3E of Lin et al., 2012 and used the same extraction method for total RNA quantification, which was described in Protocol 2 in the Registered Report (Blum et al., 2015). Total RNA was isolated from P493-6 cells 0, 1, and 24 hr after tetracycline release and the amount of RNA per 1,000 cells was quantified (Figure 1B). We found that under conditions where c-Myc expression was low (0 hr), there was a mean of round(mean(subset(data2, Lot==1 & Time==0)$value),2)
length(subset(data2, Lot==1 & Time==0)$value)
formatC(sd(subset(data2, Lot==1 & Time==0)$value),2,format="f")
round(mean(subset(data2, Lot==1 & Time==24)$value),2)
length(subset(data2, Lot==1 & Time==24)$value)
round(sd(subset(data2, Lot==1 & Time==24)$value),2)
round(mean(subset(data2, Lot==1 & Time==24)$value)/mean(subset(data2, Lot==1 & Time==0)$value),2)
contrast1$df
round(contrast1$t.ratio,2)
sub('^(-)?0[.]','\\1.',round(contrast1$p.value,3))
round(mean(subset(data2, Lot==2 & Time==0)$value),2)
length(subset(data2, Lot==2 & Time==0)$value)
round(sd(subset(data2, Lot==2 & Time==0)$value),2)
round(mean(subset(data2, Lot==2 & Time==24)$value),2)
length(subset(data2, Lot==2 & Time==24)$value)
round(sd(subset(data2, Lot==2 & Time==24)$value),2)
round(mean(subset(data2, Lot==2 & Time==24)$value)/mean(subset(data2, Lot==2 & Time==0)$value),2)
contrast2$df
round(contrast2$t.ratio,2)
sub('^(-)?0[.]','\\1.',round(contrast2$p.value,4))
round(mean(zero),2)
round(mean(twentyfour),2)
round(mean(twentyfour)/mean(zero),2)
Digital gene expression following c-Myc overexpression
To test whether c-Myc expression amplifies the existing gene expression program, digital gene expression analysis using the NanoString nCounter platform was performed on a set of genes from multiple functional categories. This experiment is similar to what was reported in Figure 3F and Table S1 of Lin et al., 2012 and described in Protocols 3-4 in the Registered Report (Blum et al., 2015). We quantified mRNA levels/cell of prettyNum(length(unique(comb.means$Accession)), big.mark=",")
prettyNum(length(intersect(comb.means$Accession, o.comb.means$Accession)), big.mark=",")
prettyNum(length(unique(o.comb.means$Accession)), big.mark=",")
length(active_0hr_l1)
formatC(median(active_0hr_l1),2,format="f")
length(silent_0hr_l1)
round(median(silent_0hr_l1),3)
round(length(which((active_1hr_l1-active_0hr_l1)>0))/length(active_0hr_l1)*100)
round(length(which((active_24hr_l1-active_0hr_l1)>0))/length(active_0hr_l1)*100)
round(length(which((active_24hr_l1-active_1hr_l1)>0))/length(active_0hr_l1)*100)
round(median(active_1hr_l1)/median(active_0hr_l1),2)
formatC(median(active_24hr_l1)/median(active_0hr_l1),2,format="f")
round(median(active_24hr_l1)/median(active_1hr_l1),2)
round(length(which((silent_1hr_l1-silent_0hr_l1)>0))/length(silent_0hr_l1)*100)
round(length(which((silent_24hr_l1-silent_0hr_l1)>0))/length(silent_0hr_l1)*100)
round(length(which((silent_24hr_l1-silent_1hr_l1)>0))/length(silent_0hr_l1)*100)
round(median(silent_1hr_l1)/median(silent_0hr_l1),2)
round(median(silent_24hr_l1)/median(silent_0hr_l1),2)
abs(round((median(silent_24hr_l1)-median(silent_1hr_l1))/(median(silent_24hr_l1)),2))
length(active_0hr)
round(median(active_0hr),2)
length(silent_0hr)
formatC(median(silent_0hr),2,format="f")
round(length(which((active_1hr-active_0hr)>0))/length(active_0hr)*100)
round(length(which((active_24hr-active_0hr)>0))/length(active_0hr)*100)
round(length(which((active_24hr-active_1hr)>0))/length(active_0hr)*100)
round(median(active_1hr)/median(active_0hr),2)
round(median(active_24hr)/median(active_0hr),2)
round(median(active_24hr)/median(active_1hr),2)
round(length(which((silent_1hr-silent_0hr)>0))/length(silent_0hr)*100)
round(length(which((silent_24hr-silent_0hr)>0))/length(silent_0hr)*100)
round(length(which((silent_24hr-silent_1hr)>0))/length(silent_0hr)*100)
prettyNum(nrow(common), big.mark=",")
round(ovl1_active_percent, digits = 1)
ovl1_active
total_common_active
round(ovl2_active_percent, digits=1)
ovl2_active
total_common_active
round(ovl1_silent_percent, digits=1)
ovl1_silent
total_common_silent
round(ovl2_silent_percent, digits=1)
ovl2_silent
total_common_silent
To test whether active genes, as well as silent genes, increased expression during c-Myc induction we performed the confirmatory analysis as outlined in the Registered Report (Blum et al., 2015). This analysis differed from what was reported in the original study by analyzing the data as paired instead of unpaired. As suggested during peer review of the Registered Report, this is because expression of the same gene, analyzed across different conditions, is not independent (Blum et al., 2015). We performed a Wilcoxon signed-rank test on active genes comparing expression at 0 hr to 1 hr, 0 hr to 24 hr, and 1 hr to 24 hr, which were statistically significant for cells grown in both lots of serum (Table 1). The same comparisons were performed on silent genes, which were also statistically significant, with the exception of the silent gene comparison of 1 hr to 24 hr for serum lot one. Considering this was not the test reported in the original study, we conducted these paired analyses on the original data to provide a direct comparison. For both active and silent genes c-Myc induction resulted in statistically significant increases in expression, with the exception of the silent gene comparison from 0 hr to 1 hr (Table 1). This is in contrast to the results of the unpaired tests that were reported in the original study where active genes were reported to have a statistically significant increase in expression and silent genes were reported as not statistically significant for all comparisons. We conducted an exploratory unpaired analysis on the replication data for comparison, which resulted in statistically significant differences among the active gene comparisons as well as half of the silent gene comparisons (Table 2).
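The practical difference between the paired and unpaired analyses can be illustrated with a minimal base R sketch. The expression values are simulated rather than taken from the study, and the object names (`expr_0hr`, `expr_24hr`) are hypothetical. Note also that base R's `wilcox.test` discards zero differences instead of ranking them as in Pratt's method, so this is an illustration of the principle and not the exact procedure behind Table 1.

```r
# Simulated per-gene expression (hypothetical values, not study data).
set.seed(1)
n_genes <- 40

# Baseline expression varies widely from gene to gene.
expr_0hr <- rlnorm(n_genes, meanlog = 2, sdlog = 1.5)

# Every gene increases modestly (10% to 60%) relative to its own baseline.
expr_24hr <- expr_0hr * runif(n_genes, min = 1.1, max = 1.6)

# Paired analysis: Wilcoxon signed-rank test on within-gene differences.
paired <- wilcox.test(expr_24hr, expr_0hr, paired = TRUE)

# Unpaired analysis: Wilcoxon rank sum test treating time points as independent samples.
unpaired <- wilcox.test(expr_24hr, expr_0hr, paired = FALSE)

cat("paired p   =", signif(paired$p.value, 3), "\n")
cat("unpaired p =", signif(unpaired$p.value, 3), "\n")
```

Because each gene serves as its own control in the paired test, a consistent within-gene increase is detected even when the spread between genes is much larger than the increase itself.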
Importantly, though, the question of whether the change in expression among active genes is different than silent genes has not been tested. This would require a separate test on their difference (Gelman and Stern, 2006; Nieuwenhuis et al., 2011). To test whether active genes increased in expression during c-Myc induction more than silent genes, we performed an exploratory analysis on the difference in expression of active genes during c-Myc induction (e.g. from 0 hr to 24 hr) compared to the difference in expression of silent genes over that same period (e.g. from 0 hr to 24 hr). For both the original and replication data, there was a statistically significant increase in expression of active genes compared to silent genes (Table 3). This suggests that active genes and silent genes do not have similar rates of expression upon c-Myc induction. To summarize, for this experiment we found results that were in the same direction as the original study and suggest that while both active and silent genes increased in expression upon c-Myc induction, the rate of increase was different.
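A test on the difference between the two gene classes, as opposed to two separate within-class tests, might be sketched as follows. The data are simulated and all object names are hypothetical; the sketch only assumes that the comparison is a Wilcoxon rank sum test on per-gene changes, as described in the statistical analysis section.

```r
# Simulated expression for two gene classes (hypothetical values, not study data).
set.seed(2)
active_0hr  <- rlnorm(40, meanlog = 3, sdlog = 0.5)
active_24hr <- active_0hr * runif(40, min = 1.5, max = 2.5)
silent_0hr  <- rlnorm(30, meanlog = -2, sdlog = 0.5)
silent_24hr <- silent_0hr * runif(30, min = 1.0, max = 1.5)

# Per-gene change in expression over the induction period (0 hr to 24 hr).
active_change <- active_24hr - active_0hr
silent_change <- silent_24hr - silent_0hr

# Rank sum test comparing the changes themselves between the two classes.
change_test <- wilcox.test(active_change, silent_change, paired = FALSE)

cat("median change, active genes:", signif(median(active_change), 3), "\n")
cat("median change, silent genes:", signif(median(silent_change), 3), "\n")
cat("p =", signif(change_test$p.value, 3), "\n")
```

Both classes increase in this simulation, so separate within-class tests would each be significant; only the test on the changes addresses whether the increase differs between classes.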
The original study and this replication attempt used the same criteria to characterize a gene as silent or active, but there are many negative consequences of dichotomizing continuous variables, such as information loss, especially with a small gene set (Altman, 2006; Cohen, 1983). Papers published after the original study took an unbiased view by collecting comprehensive RNA-sequencing data to assess if the transcriptional effects of c-Myc were direct or indirect, concluding c-Myc activates and represses transcription of discrete gene sets, which in turn leads to induced RNA amplification (Sabò et al., 2014; Walz et al., 2014). Furthermore, Sabò and colleagues also used NanoString technology to quantify a subset of the differentially expressed genes identified by RNA-seq and observed similar results that revealed upward shifts in gene expression upon c-Myc induction (Sabò et al., 2014). However, instead of dichotomizing genes as active or silent, gene expression data were presented as continuous. Similarly, we presented the digital gene expression data generated during this replication attempt as continuous, which illustrates a general pattern of overall increased gene expression following c-Myc induction (Figure 2 - figure supplement 2). Importantly, though, these results are limited to the prettyNum(rep_n, big.mark=",")
Meta-analyses of original and replicated effects
We performed a meta-analysis using a random-effects model to combine each of the effects described above as pre-specified in the confirmatory analysis plan (Blum et al., 2015). To provide a standardized measure of the effect, a common effect size was calculated for each effect from the original and replication studies. Cohen’s d is the standardized difference between two means using the pooled sample standard deviation. The effect size r is a standardized measure of the strength and direction of the association between two variables, in this case time during c-Myc induction and gene expression. The estimate of the effect size of one study, as well as the associated uncertainty (i.e. confidence interval), compared to the effect size of the other study provides another approach to compare the original and replication results (Errington et al., 2014; Valentine et al., 2011). Importantly, the width of the confidence interval for each study is a reflection of not only the confidence level (e.g. 95%), but also variability of the sample (e.g. SD) and sample size.
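As a worked illustration of the quantities described above, the following base R sketch computes Cohen's d from summary statistics and pools two effect sizes with a random-effects model. It rests on several assumptions: the summary statistics are invented, the function names `cohens_d` and `pool_random_effects` are hypothetical, the sampling variance of d uses a common large-sample approximation, and the between-study variance is estimated with the DerSimonian-Laird method, which may differ from the estimator used by the metafor package in the actual analysis.

```r
# Cohen's d from summary statistics, using the pooled sample standard deviation.
cohens_d <- function(m1, sd1, n1, m2, sd2, n2) {
  sd_pooled <- sqrt(((n1 - 1) * sd1^2 + (n2 - 1) * sd2^2) / (n1 + n2 - 2))
  d <- (m1 - m2) / sd_pooled
  # Large-sample approximation to the sampling variance of d.
  v <- (n1 + n2) / (n1 * n2) + d^2 / (2 * (n1 + n2))
  c(d = d, var = v)
}

# Random-effects pooling with the DerSimonian-Laird estimate of tau^2.
pool_random_effects <- function(yi, vi, level = 0.95) {
  k <- length(yi)
  wi <- 1 / vi                                   # fixed-effect weights
  mu_fixed <- sum(wi * yi) / sum(wi)
  q_stat <- sum(wi * (yi - mu_fixed)^2)          # Cochran's Q
  c_term <- sum(wi) - sum(wi^2) / sum(wi)
  tau2 <- max(0, (q_stat - (k - 1)) / c_term)    # between-study variance
  wi_star <- 1 / (vi + tau2)                     # random-effects weights
  estimate <- sum(wi_star * yi) / sum(wi_star)
  se <- sqrt(1 / sum(wi_star))
  z_crit <- qnorm(1 - (1 - level) / 2)
  list(
    estimate = estimate,
    se = se,
    tau2 = tau2,
    ci.lb = estimate - z_crit * se,
    ci.ub = estimate + z_crit * se,
    pval = 2 * pnorm(-abs(estimate / se))
  )
}

# Hypothetical summary statistics for two studies (not the reported values).
study_a <- cohens_d(m1 = 12, sd1 = 2, n1 = 3, m2 = 8, sd2 = 2, n2 = 3)
study_b <- cohens_d(m1 = 11, sd1 = 3, n1 = 4, m2 = 9, sd2 = 3, n2 = 4)

pooled <- pool_random_effects(
  yi = c(study_a[["d"]], study_b[["d"]]),
  vi = c(study_a[["var"]], study_b[["var"]])
)
print(pooled)
```

With small samples the variance of d is large, so the confidence interval for each study, and for the pooled estimate, is correspondingly wide.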
The comparison of total RNA levels at low levels of c-Myc (0 hr) compared to high levels of c-Myc (24 hr) resulted in d = round(exp_orig_d,2)
round(exp_orig_ci[[1]],2)
round(exp_orig_ci[[3]],2)
round(exp_lot1_d,2)
round(exp_lot1_ci[[1]],2)
round(exp_lot1_ci[[3]],2)
round(exp_lot2_d,2)
formatC(exp_lot2_ci[[1]],2,format="f")
round(exp_lot2_ci[[3]],2)
round(exp_meta$b[1],2)
round(exp_meta$ci.lb,2)
round(exp_meta$ci.ub,2)
sub('^(-)?0[.]','\\1.',round(exp_meta$pval,4))
There were six comparisons of the gene expression data, three for active genes and three for silent genes (Figure 3B). These calculations were performed analyzing the data as paired, for reasons discussed above and as prespecified in the Registered Report (Blum et al., 2015). For active genes, the meta-analyses comparing expression at 0 hr to 1 hr, 0 hr to 24 hr, and 1 hr to 24 hr were statistically significant (p = scinot(a.meta.0v1$pval)$coeff
scinot(a.meta.0v1$pval)$exp
scinot(a.meta.0v24$pval)$coeff
scinot(a.meta.0v24$pval)$exp
sub('^(-)?0[.]','\\1.',round(a.meta.1v24$pval,4))
sub('^(-)?0[.]','\\1.',round(s.meta.0v1$pval,3))
sub('^(-)?0[.]','\\1.',round(s.meta.1v24$pval,4))
scinot(s.meta.0v24$pval)$coeff
scinot(s.meta.0v24$pval)$exp
This direct replication provides an opportunity to understand the present evidence of these effects. Any known differences, including reagents and protocol differences, were identified prior to conducting the experimental work and described in the Registered Report (Blum et al., 2015). However, this is limited to what was obtainable from the original paper and through communication with the original authors, which means there might be particular features of the original experimental protocol that could be critical, but unidentified. So while some aspects, such as the cell line, induction time course, and the method used to measure gene expression were maintained, others were changed at the time of study design (Blum et al., 2015) that could affect results, such as the analytical approach (Silberzahn et al., 2017) and serum lot (Leek et al., 2010). Furthermore, other aspects were unknown or not easily controlled for. These include variables such as cell line genetic drift (Hughes et al., 2007; Kleensang et al., 2016) or changes in cellular volume that can impact overall transcript abundance (Padovan-Merhar et al., 2015). Whether these or other factors influence the outcomes of this study is open to hypothesizing and further investigation, which is facilitated by direct replications and transparent reporting.
Materials and methods
As described in the Registered Report (Blum et al., 2015), we attempted a replication of the experiments reported in Figures 1B and 3E-F of Lin et al., 2012. A detailed description of all protocols can be found in the Registered Report (Blum et al., 2015). Additional detailed experimental notes, data, and analysis are available on the Open Science Framework (OSF) (RRID:SCR_003238) (https://osf.io/mokeb/; Lewis et al., 2017). This includes the R Markdown file (https://osf.io/vdrsh/) that was used to compose this manuscript, which is a reproducible document linking the results in the article directly to the data and code that produced them (Hartgerink, 2017).
Cell culture
P493-6 cells (shared by Young lab, Whitehead Institute for Biomedical Research, RRID: CVCL_6783) were maintained in RPMI-1640 supplemented with 1% Ala-Gln and 10% tetracycline-free FBS (Clontech, Mountain View, CA, cat# 631105, lot# 1: A15003, lot# 2: A15032). Cells were grown at 37°C in a humidified atmosphere at 5% CO2. Quality control data for the cell line are available at https://osf.io/e6ftz/. This includes results confirming the cell line was free of mycoplasma contamination (DDC Medical, Fairfield, Ohio). Additionally, STR DNA profiling of the cell line was performed (DDC Medical, Fairfield, Ohio).
For repression of the conditional pmyc-tet construct in P493-6 cells, 0.1 µg/ml tetracycline (Sigma-Aldrich, St. Louis, MO, T7660) was added to the culture medium and cells were incubated for 72 hr. Under these conditions, P493-6 cells did not proliferate due to a dependency on the expression of MYC (Schuhmacher et al., 1999). For MYC re-induction, cells were washed three times with growth medium and grown in tetracycline-free culture conditions.
Western blot
P493-6 cells were harvested at the indicated times and total cell lysates were prepared by pelleting ~1x10^7 cells (determined with a C-chip disposable hemocytometer) at 4°C at 1,200 rpm for 5 min using a refrigerated centrifuge (Eppendorf, Westbury, NY, model# 5810R). After cell pellets were washed once with ice-cold 1X PBS, pellets were resuspended in RIPA lysis buffer containing 2X SIGMAFAST Protease inhibitors and 2X Phosphatase inhibitor cocktails 2 and 3. Protein concentrations were determined using the Bradford assay according to the manufacturer’s instructions. Sample buffer was added to protein lysates and 50 µg of protein along with protein ladder was resolved by SDS-PAGE and transferred to PVDF membrane as described in the Registered Report (Blum et al., 2015). The membrane was blocked with 5% w/v nonfat dry milk in 1X TBS with 0.2% Tween-20 (TBST). Membranes were probed with rabbit anti-c-Myc clone Y69 (Epitomics, Burlingame, CA, cat# 1472-1; RRID:AB_731658); 1:5,000 dilution in 5% w/v nonfat dry milk/TBST and mouse anti-β-actin clone AC-15 (Sigma-Aldrich, cat# A5441; RRID:AB_476744); 1:10,000 dilution in 5% w/v nonfat dry milk/TBST. Each incubation was followed by washes with TBST and the appropriate secondary antibody: HRP-conjugated donkey anti-rabbit (Sigma-Aldrich, cat# GERPN2124); 1:10,000 dilution in 5% w/v nonfat dry milk/TBST or HRP-conjugated sheep anti-mouse (Sigma-Aldrich, cat# GERPN2124); 1:10,000 dilution in 5% w/v nonfat dry milk/TBST. Membranes were washed with TBST and incubated with ECL Prime Chemiluminescent reagent (Sigma-Aldrich, cat# GERPN2232) according to the manufacturer’s instructions. Western blot images were acquired with G:BOX iChem XT and GeneSnap software (RRID:SCR_014249), version 7.12.02 (Syngene, Frederick, Maryland) and quantified using ImageJ software (RRID:SCR_003070), version 1.50i (Schneider et al., 2012). All images taken are available at https://osf.io/ujg7t/.
RNA quantification
P493-6 cells were harvested at the indicated times and total RNA extraction was performed by pelleting ~1x10^7 cells (exact number determined with a C-chip disposable hemocytometer) and homogenizing the sample in 1 ml Tri Reagent (Sigma-Aldrich, cat# T9424) according to the manufacturer’s instructions. For each sample 10% v/v miRNA Homogenate Additive was added, vortexed, and incubated on ice for 10 min. For each 1 ml of Tri Reagent, 100 µl of bromochloropropane was added, vortexed for 15-30 sec, incubated for 5 min at RT, then centrifuged at 12,000xg for 10 min at 4°C. The aqueous phase was recovered and total RNA isolation was performed using the miRVana miRNA extraction kit (Ambion, Waltham, MA, cat# AM1561) according to the manufacturer’s instructions. Recovered RNA was eluted in 100 µl nuclease-free water. Total RNA concentrations and purity (data available at https://osf.io/jh5r4/) were measured using a NanoDrop ND-1000 (Thermo Fisher Scientific, Waltham, Massachusetts) with the NanoDrop Operating Software, version 3.3, and converted to ng per 1,000 cells.
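The conversion to ng per 1,000 cells can be sketched in R as follows. The 100 µl elution volume comes from the protocol above; the concentration and cell count in the example are invented, and the function name `rna_per_1000_cells` is hypothetical. The sketch assumes the entire eluate derives from the counted cells, which may not match every detail of the actual calculation.

```r
# Convert a NanoDrop concentration to ng of total RNA per 1,000 cells.
# conc_ng_per_ul:    measured RNA concentration (ng/ul)
# elution_volume_ul: volume the RNA was eluted in (ul)
# n_cells:           number of cells that went into the extraction
rna_per_1000_cells <- function(conc_ng_per_ul, elution_volume_ul, n_cells) {
  total_rna_ng <- conc_ng_per_ul * elution_volume_ul  # total yield in ng
  total_rna_ng / n_cells * 1000                       # scale to 1,000 cells
}

# Hypothetical sample: 250 ng/ul eluted in 100 ul from 1e7 counted cells.
example_yield <- rna_per_1000_cells(
  conc_ng_per_ul = 250,
  elution_volume_ul = 100,
  n_cells = 1e7
)
print(example_yield)
```

Using the exact hemocytometer count for `n_cells`, and not the nominal ~1x10^7, keeps counting variation between samples out of the per-cell estimate.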
RNA extraction and NanoString nCounter digital gene expression assay
P493-6 cells were harvested at the indicated times and 1x10^6 cells were collected (number determined with a C-chip disposable hemocytometer) and lysed directly in 100 µl Buffer RLT (Qiagen, Hilden, Germany, cat# 79216) supplemented with β-mercaptoethanol to yield a concentration of 10,000 cells per µl. This was performed four independent times. Multiple 4 µl aliquots were stored and shipped at -80°C with temperature monitored during shipping to avoid freeze/thaw cycles. Lysates were processed following the Cell Lysate Protocol (nCounter Gene Expression Assay Manual, NanoString Technologies, Seattle, Washington) according to the manufacturer’s instructions and as described in the Registered Report (Blum et al., 2015). Three nCounter Reporter CodeSets (nCounter GX Human Immunology Kit, nCounter GX Human Kinase Kit, nCounter Custom CodeSet) encompassing prettyNum(length(unique(comb.means$Accession)), big.mark=",")
Statistical analysis
Statistical analysis was performed with R software (RRID:SCR_001905), version 3.3.2 (R Core Team, 2017). All data, csv files, and analysis scripts are available on the OSF (https://osf.io/mokeb/). Confirmatory statistical analysis was pre-registered (https://osf.io/nj8wb/) before the experimental work began as outlined in the Registered Report (Blum et al., 2015). Proposed analysis of gene expression data was conducted by the Wilcoxon signed-rank test using the method proposed by Pratt to handle zero differences (Pratt, 1959), with additional exploratory analysis performed using the Wilcoxon rank sum test as reported in the original study and a Wilcoxon rank sum test on the difference in expression of active genes during c-Myc induction (e.g. from 0 hr to 24 hr) compared to the difference in expression of silent genes over that same period (e.g. from 0 hr to 24 hr). Data were checked to ensure assumptions of statistical tests were met. When described in the results, the Bonferroni correction, to account for multiple testing, was applied to the alpha error by dividing the uncorrected value (.05) by the number of tests performed. Although the Bonferroni method is conservative, it was accounted for in the power calculations to ensure sample size was sufficient. In cases where the number of groups was 3 and the sample sizes were evenly distributed among the groups, Fisher's LSD test was performed resulting in an a priori significance threshold of .05. A meta-analysis of a common original and replication effect size was performed with a random effects model and the metafor package (Viechtbauer, 2010) (available at: https://osf.io/5yscz/). The sample sizes reported in Table 1 and Figure 3 for the gene analysis comparisons are based on the sample size used in the Wilcoxon signed-rank test, which removes samples with zero differences after ranking (Pratt, 1959).
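The Bonferroni adjustment described above amounts to a single division. A minimal base R sketch, using invented p-values for three tests, shows the corrected threshold alongside the equivalent approach of adjusting the p-values with `p.adjust`.

```r
# Uncorrected alpha and a set of hypothetical p-values (not study results).
alpha <- 0.05
p_values <- c(0.004, 0.020, 0.030)
n_tests <- length(p_values)

# Bonferroni-corrected threshold: uncorrected alpha divided by the number of tests.
alpha_bonferroni <- alpha / n_tests
significant <- p_values < alpha_bonferroni

# Equivalent formulation: inflate the p-values and compare to the uncorrected alpha.
p_adjusted <- p.adjust(p_values, method = "bonferroni")
significant_alt <- p_adjusted < alpha

print(data.frame(p_values, p_adjusted, significant))
```

Both formulations give the same decisions; reporting the corrected threshold leaves the p-values themselves unaltered.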
The raw original study data were shared by the original authors with the summary data published in the Registered Report (Blum et al., 2015) and were used in the power calculations to determine the sample size for this study.
Deviations from Registered Report
The number of flasks, and thus cells, was increased when tetracycline was added to P493-6 cells to account for the cells not proliferating during this period (i.e. there were two Flask B’s as described in the Registered Report, which were pooled prior to seeding). The proposed statistical analysis for the western blot analysis (Protocol 1) described in the Registered Report was not performed since the levels of normalized c-Myc at time 0 hr were at the limit of detection. The number of genes analyzed in the original study, and thus listed in the Registered Report, was reported incorrectly as 1,388 instead of prettyNum(length(unique(o.comb.means$Accession)), big.mark=",")
Acknowledgements
The Reproducibility Project: Cancer Biology would like to thank the original authors, particularly Charles Lin (Baylor College of Medicine), for sharing critical reagents and data, specifically the P493-6 cells. We would also like to thank Courtney Soderberg at the Center for Open Science for assistance with statistical analyses and the following companies for generously donating reagents to the Reproducibility Project: Cancer Biology: American Type Culture Collection (ATCC), Applied Biological Materials, BioLegend, Charles River Laboratories, Corning Incorporated, DDC Medical, EMD Millipore, Harlan Laboratories, LI-COR Biosciences, Mirus Bio, Novus Biologicals, Sigma-Aldrich, and System Biosciences (SBI).
References
Altman, D.G., 2006. The cost of dichotomising continuous variables. BMJ 332, 1080–1080. doi:10.1136/bmj.332.7549.1080
Biggs, R., Macmillan, R.L., 1948. The Error of the Red Cell Count. Journal of Clinical Pathology 1, 288–291. doi:10.1136/jcp.1.5.288
Blum, D., Hao, H., McCarthy, M., Reproducibility Project: Cancer Biology, 2015. Registered report: Transcriptional amplification in tumor cells with elevated c-Myc. eLife 4. doi:10.7554/eLife.04024
Cohen, J., 1983. The Cost of Dichotomization. Applied Psychological Measurement 7, 249–253. doi:10.1177/014662168300700301
Duffy, D.J., Krstic, A., Halasz, M., Schwarzl, T., Fey, D., Iljin, K., Prakash Mehta, J., Killick, K., Whilde, J., Turriziani, B., Haapa-Paananen, S., Fey, V., Fischer, M., Westermann, F., Henrich, K.-O., Bannert, S., Higgins, D.G., Kolch, W., 2014. Integrative omics reveals MYCN as a global suppressor of cellular signalling and enables network-based therapeutic target discovery in neuroblastoma. Oncotarget. doi:10.18632/oncotarget.6568
Errington, T.M., Iorns, E., Gunn, W., Tan, F.E., Lomax, J., Nosek, B.A., 2014. An open investigation of the reproducibility of cancer biology research. eLife 3. doi:10.7554/eLife.04333
Fagnocchi, L., Cherubini, A., Hatsuda, H., Fasciani, A., Mazzoleni, S., Poli, V., Berno, V., Rossi, R.L., Reinbold, R., Endele, M., Schroeder, T., Rocchigiani, M., Szkarłat, Ż., Oliviero, S., Dalton, S., Zippo, A., 2016. A Myc-driven self-reinforcing regulatory network maintains mouse embryonic stem cell identity. Nature Communications 7, 11903. doi:10.1038/ncomms11903
Gelman, A., Stern, H., 2006. The Difference Between “Significant” and “Not Significant” is not Itself Statistically Significant. The American Statistician 60, 328–331. doi:10.1198/000313006X152649
Hart, J.R., Roberts, T.C., Weinberg, M.S., Morris, K.V., Vogt, P.K., 2015. MYC regulates the non-coding transcriptome. Oncotarget 5, 12543–12554. doi:10.18632/oncotarget.3033
Hartgerink, C.H.J., 2017. Composing reproducible manuscripts using R Markdown. eLife. https://elifesciences.org/labs/cad57bcf/composing-reproducible-manuscripts-using-r-markdown
Hiorns, L.R., Bradshaw, T.D., Skelton, L.A., Yu, Q., Kelland, L.R., Leyland-Jones, B., 2004. Variation in RNA expression and genomic DNA content acquired during cell culture. British Journal of Cancer 90, 476–482. doi:10.1038/sj.bjc.6601405
Hsu, T.Y.-T., Simon, L.M., Neill, N.J., Marcotte, R., Sayad, A., Bland, C.S., Echeverria, G.V., Sun, T., Kurley, S.J., Tyagi, S., Karlin, K.L., Dominguez-Vidaña, R., Hartman, J.D., Renwick, A., Scorsone, K., Bernardi, R.J., Skinner, S.O., Jain, A., Orellana, M., Lagisetti, C., Golding, I., Jung, S.Y., Neilson, J.R., Zhang, X.H.-F., Cooper, T.A., Webb, T.R., Neel, B.G., Shaw, C.A., Westbrook, T.F., 2015. The spliceosome is a therapeutic vulnerability in MYC-driven cancer. Nature 525, 384–388. doi:10.1038/nature14985
Hughes, P., Marshall, D., Reid, Y., Parkes, H., Gelber, C., 2007. The costs of using unauthenticated, over-passaged cell lines: how much more data do we need? BioTechniques 43, 575–586. doi:10.2144/000112598
Kleensang, A., Vantangoli, M.M., Odwin-DaCosta, S., Andersen, M.E., Boekelheide, K., Bouhifd, M., Fornace, A.J., Li, H.-H., Livi, C.B., Madnick, S., Maertens, A., Rosenberg, M., Yager, J.D., Zhao, L., Hartung, T., 2016. Genetic variability in a frozen batch of MCF-7 cells invisible in routine authentication affecting cell function. Scientific Reports 6, 28994. doi:10.1038/srep28994
Leek, J.T., Scharpf, R.B., Bravo, H.C., Simcha, D., Langmead, B., Johnson, W.E., Geman, D., Baggerly, K., Irizarry, R.A., 2010. Tackling the widespread and critical impact of batch effects in high-throughput data. Nature Reviews Genetics 11, 733–739. doi:10.1038/nrg2825
Lewis, L.M., Edwards, M.C., Meyers, Z.R., Talbot, C.C. Jr., Hao, H., Blum, D., Iorns, E., Tsui, R., Denis, A., Perfito, N., Errington, T.M., 2017. Study 48: Replication of Lin et al., 2012 (Cell). Open Science Framework. doi:10.17605/OSF.IO/MOKEB
Lin, C.Y., Lovén, J., Rahl, P.B., Paranal, R.M., Burge, C.B., Bradner, J.E., Lee, T.I., Young, R.A., 2012. Transcriptional Amplification in Tumor Cells with Elevated c-Myc. Cell 151, 56–67. doi:10.1016/j.cell.2012.08.026
Lorenzin, F., Benary, U., Baluapuri, A., Walz, S., Jung, L.A., von Eyss, B., Kisker, C., Wolf, J., Eilers, M., Wolf, E., 2016. Different promoter affinities account for specificity in MYC-dependent gene regulation. eLife 5. doi:10.7554/eLife.15161
Nie, Z., Hu, G., Wei, G., Cui, K., Yamane, A., Resch, W., Wang, R., Green, D.R., Tessarollo, L., Casellas, R., Zhao, K., Levens, D., 2012. c-Myc Is a Universal Amplifier of Expressed Genes in Lymphocytes and Embryonic Stem Cells. Cell 151, 68–79. doi:10.1016/j.cell.2012.08.033
Nielson, L., Smyth, G., Greenfield, P., 1991. Hemacytometer Cell Count Distributions: Implications of Non-Poisson Behavior. Biotechnology Progress 7, 560–563. doi:10.1021/bp00012a600
Nieuwenhuis, S., Forstmann, B.U., Wagenmakers, E.-J., 2011. Erroneous analyses of interactions in neuroscience: a problem of significance. Nature Neuroscience 14, 1105–1107. doi:10.1038/nn.2886
Padovan-Merhar, O., Nair, G.P., Biaesch, A.G., Mayer, A., Scarfone, S., Foley, S.W., Wu, A.R., Churchman, L.S., Singh, A., Raj, A., 2015. Single Mammalian Cells Compensate for Differences in Cellular Volume and DNA Copy Number through Independent Global Transcriptional Mechanisms. Molecular Cell 58, 339–352. doi:10.1016/j.molcel.2015.03.005
Pajic, A., Spitkovsky, D., Christoph, B., Kempkes, B., Schuhmacher, M., Staege, M.S., Brielmeier, M., Ellwart, J., Kohlhuber, F., Bornkamm, G.W., Polack, A., Eick, D., 2000. Cell cycle activation by c-myc in a Burkitt lymphoma model cell line. International Journal of Cancer 87, 787–793.
Pratt, J.W., 1959. Remarks on Zeros and Ties in the Wilcoxon Signed Rank Procedures. Journal of the American Statistical Association 54, 655. doi:10.2307/2282543
R Core Team, 2017. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria.
Sabò, A., Amati, B., 2014. Genome Recognition by MYC. Cold Spring Harbor Perspectives in Medicine 4, a014191–a014191. doi:10.1101/cshperspect.a014191
Sabò, A., Kress, T.R., Pelizzola, M., de Pretis, S., Gorski, M.M., Tesi, A., Morelli, M.J., Bora, P., Doni, M., Verrecchia, A., Tonelli, C., Fagà, G., Bianchi, V., Ronchi, A., Low, D., Müller, H., Guccione, E., Campaner, S., Amati, B., 2014. Selective transcriptional regulation by Myc in cellular growth control and lymphomagenesis. Nature 511, 488–492. doi:10.1038/nature13537
Schlosser, I., Hölzel, M., Hoffmann, R., Burtscher, H., Kohlhuber, F., Schuhmacher, M., Chapman, R., Weidle, U.H., Eick, D., 2005. Dissection of transcriptional programmes in response to serum and c-Myc in a human B-cell line. Oncogene 24, 520–524. doi:10.1038/sj.onc.1208198
Schneider, C.A., Rasband, W.S., Eliceiri, K.W., 2012. NIH Image to ImageJ: 25 years of image analysis. Nature Methods 9, 671–675.
Schuhmacher, M., Staege, M.S., Pajic, A., Polack, A., Weidle, U.H., Bornkamm, G.W., Eick, D., Kohlhuber, F., 1999. Control of cell growth by c-Myc in the absence of cell division. Current Biology 9, 1255–1258. doi:10.1016/S0960-9822(99)80507-7
Silberzahn, R., Uhlmann, E.L., Martin, D.P., Anselmi, P., Aust, F., Awtrey, E.C., Bahník, Š., Bai, F., Bannard, C., Bonnier, E., Carlsson, R., Cheung, F., Christensen, G., Clay, R., Craig, M.A., Dalla Rosa, A., Dam, L., Evans, M.H., Flores Cervantes, I., Fong, N., Gamez-Djokic, M., Glenz, A., Gordon-McKeon, S., Heaton, T., Eriksson, K.H., Heene, M., Hofelich Mohr, A., Högden, F., Hui, K., Johannesson, M., Kalodimos, J., Kaszubowski, E., Kennedy, D.M., Lei, R., Lindsay, T.A., Liverani, S., Madan, C.R., Molden, D.C., Molleman, E., Morey, R.D., Mulder, L.B., Nijstad, B.A., Pope, B., Pope, N.G., Prenoveau, J.M., Rink, F., Robusto, E., Roderique, H., Sandberg, A., Schlueter, E., Schönbrodt, F.D., Sherman, M.F., Sommer, S.A., Sotak, K.L., Spain, S.M., Spörlein, C., Stafford, T., Stefanutti, L., Täuber, S., Ullrich, J., Vianello, M., Wagenmakers, E.-J., Witkowiak, M., Yoon, S., Nosek, B.A., 2017. Many analysts, one dataset: Making transparent how variations in analytical choices affect results. https://osf.io/preprints/psyarxiv/qkwst/ doi:10.17605/OSF.IO/QKWST
Trakhtenberg, E.F., Pho, N., Holton, K.M., Chittenden, T.W., Goldberg, J.L., Dong, L., 2016. Cell types differ in global coordination of splicing and proportion of highly expressed genes. Scientific Reports 6. doi:10.1038/srep32249
Valentine, J.C., Biglan, A., Boruch, R.F., Castro, F.G., Collins, L.M., Flay, B.R., Kellam, S., Mościcki, E.K., Schinke, S.P., 2011. Replication in Prevention Science. Prevention Science 12, 103–117. doi:10.1007/s11121-011-0217-6
Viechtbauer, W., 2010. Conducting Meta-Analyses in R with the metafor Package. Journal of Statistical Software 36. doi:10.18637/jss.v036.i03
Walz, S., Lorenzin, F., Morton, J., Wiese, K.E., von Eyss, B., Herold, S., Rycak, L., Dumay-Odelot, H., Karim, S., Bartkuhn, M., Roels, F., Wüstefeld, T., Fischer, M., Teichmann, M., Zender, L., Wei, C.-L., Sansom, O., Wolf, E., Eilers, M., 2014. Activation and repression by oncogenic MYC shape tumour-specific gene expression profiles. Nature 511, 483–487. doi:10.1038/nature13473
Figure Legends
Figure 1. Induction of c-Myc in P493-6 cells and impact on total RNA levels.
P493-6 cells were grown in the
presence of tetracycline (Tet) for 72 hr and switched into Tet-free growth medium to
induce c-Myc expression. Cells were cultured in two separate lots of serum. (A) Representative
Western blot using an anti-c-Myc antibody (top panels) or an anti-β-Actin antibody (bottom
panel). Two exposures of the anti-c-Myc antibody are presented to facilitate detection of
c-Myc. (B)
Quantification of total RNA levels (ng of total RNA per 1,000 cells) for cells at 0, 1,
and 24 hr after release from Tet. Means reported and error bars represent s.e.m. from
length(subset(data2, Lot==1 & Time==0)$value)
summary(fit1)[[1]][["Df"]][1]
summary(fit1)[[1]][["Df"]][2]
round(summary(fit1)[[1]][["F value"]][1], digits = 2)
sub('^(-)?0[.]','\\1.', round(summary(fit1)[[1]][["Pr(>F)"]][1], digits = 3))
contrast1$df
round(contrast1$t.ratio,2)
sub('^(-)?0[.]','\\1.',round(contrast1$p.value,3))
summary(fit2)[[1]][["Df"]][1]
summary(fit2)[[1]][["Df"]][2]
round(summary(fit2)[[1]][["F value"]][1], digits = 2)
sub('^(-)?0[.]','\\1.', round(summary(fit2)[[1]][["Pr(>F)"]][1], digits = 5))
contrast2$df
round(contrast2$t.ratio,2)
sub('^(-)?0[.]','\\1.',round(contrast2$p.value,4))
Figure 2. Digital gene expression analysis.
P493-6 cells, grown in the
presence of tetracycline (Tet) for 72 hr to repress the conditional pmyc-tet construct, were
switched into Tet-free growth medium to induce c-Myc expression. Cells were cultured in
two separate lots of serum. Transcripts/cell estimates from NanoString nCounter gene
expression assays (prettyNum(length(unique(comb.means$Accession)), big.mark=",")
length(active_0hr_l1)
length(silent_0hr_l1)
length(active_0hr_l2)
length(silent_0hr_l2)
Figure 2 - figure supplement 1. Logarithmic expression of genes.
This is the same experiment as in Figure 2. (A-B, E-F) Gene expression data plotted on a log2 transformed scale for active (A, E) and silent (B, F) genes at 0, 1, and 24 hr after release from Tet for both lots of serum. (C-D, G-H) Box and whisker plots showing gene expression changes (log2 ratio) between the indicated times for active (C, G) and silent (D, H) genes. Median represented as the line through the box and whiskers representing values within 1.5 IQR of the first and third quartiles. Additional details for this experiment can be found at https://osf.io/fn2y4/.
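As a minimal sketch of the quantities plotted in panels C-D and G-H, the following R snippet computes log2 ratios between two time points and the corresponding box and whisker limits. The data frame `expr` and its values are hypothetical and are not taken from the study; `boxplot.stats()` is base R and works from hinges, which approximate the first and third quartiles.

```r
# Hypothetical transcripts/cell estimates for five genes at 0 hr and 24 hr
# (illustrative values only, not data from this study).
expr <- data.frame(
  gene = c("g1", "g2", "g3", "g4", "g5"),
  t0   = c(2, 4, 8, 16, 1),
  t24  = c(4, 8, 8, 64, 4)
)

# Log2 ratio of expression at 24 hr relative to 0 hr.
expr$log2_ratio <- log2(expr$t24 / expr$t0)

# boxplot.stats() returns the lower whisker, lower hinge, median, upper hinge,
# and upper whisker; coef = 1.5 keeps whiskers within 1.5 IQR of the hinges.
box <- boxplot.stats(expr$log2_ratio, coef = 1.5)
print(box$stats)
```

Values beyond the whiskers, if any, are returned in `box$out`.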
Figure 2 - figure supplement 2. Comparison of gene expression data as continuous.
This is the same experiment as in Figure 2. (A-C, E-G) Scatter plots of log2 transformed gene expression data for all genes analyzed at the indicated times on the y and x axes for both lots of serum. Active genes are blue, silent genes are red, and genes that are neither active nor silent (expression was more than 0.5 transcript/cell and less than 1 transcript/cell at time 0 hr) are white. (D, H) Box and whisker plots showing gene expression changes (log2 ratio) between the indicated times for all genes analyzed for both lots of serum. Median represented as the line through the box and whiskers representing values within 1.5 IQR of the first and third quartiles. Additional details for this experiment can be found at https://osf.io/fn2y4/.
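The gene categories used for the colors in this legend can be sketched as follows. This assumes, from the thresholds stated above, that a gene is active at 1 transcript/cell or more and silent at 0.5 transcript/cell or less at 0 hr. The vector `t0` and its values are hypothetical.

```r
# Hypothetical transcripts/cell estimates at 0 hr (illustrative values only).
t0 <- c(geneA = 3.2, geneB = 0.1, geneC = 0.7, geneD = 1.0, geneE = 0.5)

# Assumed thresholds: genes strictly between 0.5 and 1 transcript/cell
# are neither active nor silent.
status <- ifelse(t0 >= 1, "active",
          ifelse(t0 <= 0.5, "silent", "neither"))

# Count genes per category.
print(table(status))
```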
library(pander)
pander(table1)
These confirmatory statistical
tests relate to the data presented in Figure 2. Wilcoxon signed-rank tests, which treat the
data as paired, were conducted for the original study (Lin et al., 2012) and this
replication attempt (RP:CB). Uncorrected p values are reported with an a priori significance
threshold of sub('^(-)?0[.]','\\1.',round(0.05/3, digits = 4))
library(pander)
pander(table2)
These exploratory statistical tests relate to the data presented in Figure 2. Wilcoxon rank sum tests, which treat the data as unpaired, were conducted for the original study (Lin et al., 2012) and this replication attempt (RP:CB). Uncorrected p values are reported. Sample sizes reported are based on treating genes as unpaired between conditions. Additional details for this experiment can be found at https://osf.io/fn2y4/.
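The difference between the paired and unpaired treatments of the data can be illustrated with base R's `wilcox.test()`. This is a minimal sketch on hypothetical values, not the analysis script for this study. The vectors `t0` and `t24` are assumed to hold expression estimates for the same genes in the same order.

```r
# Hypothetical transcripts/cell for the same six genes at 0 hr and 24 hr
# (illustrative values only, chosen to avoid ties).
t0  <- c(1.2, 3.4, 2.2, 5.0, 0.9, 4.0)
t24 <- c(2.0, 4.1, 3.5, 5.9, 1.5, 5.1)

# Paired: Wilcoxon signed-rank test on the per-gene differences.
paired <- wilcox.test(t24, t0, paired = TRUE)

# Unpaired: Wilcoxon rank sum test, ignoring which gene each value came from.
unpaired <- wilcox.test(t24, t0, paired = FALSE)

# Bonferroni-adjusted significance threshold for three comparisons.
alpha <- 0.05 / 3

print(c(paired = paired$p.value, unpaired = unpaired$p.value, alpha = alpha))
```

Because every gene in this toy example increases by a similar amount, the paired test yields a smaller p value than the unpaired test, which must also contend with the spread of expression across genes.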
pander(table3)
These exploratory statistical tests relate to the data presented in Figure 2. Wilcoxon rank sum tests were conducted for the original study (Lin et al., 2012) and this replication attempt (RP:CB) on the difference in expression of active genes during c-Myc induction (e.g. from 0 hr to 24 hr) compared to the difference in expression of silent genes over that same period (e.g. from 0 hr to 24 hr). Uncorrected p values are reported. Sample sizes reported are based on number of active and silent genes used in the tests. Additional details for this experiment can be found at https://osf.io/fn2y4/.
Figure 3. Meta-analyses of each effect.
Effect size and 95% confidence
interval are presented for Lin et al., 2012, this replication study (RP:CB), and a random
effects meta-analysis of those two effects. Cohen’s d is the standardized difference
between the two measurements, with a larger positive value indicating total RNA levels are
increased at 24 hr compared to 0 hr. The effect size r is a standardized measure of the
correlation (strength and direction) of the association between gene expression and c-Myc
induction, with a larger positive value indicating gene expression increased during the
course of c-Myc induction. Sample sizes used in Lin et al., 2012 and this replication
attempt are reported under the study name. (A) Total RNA levels in P493-6 cells 0
hr compared to 24 hr after release from tetracycline (meta-analysis p = sub('^(-)?0[.]','\\1.',round(exp_meta$pval,4))
scinot(a.meta.0v1$pval)$coeff
scinot(a.meta.0v1$pval)$exp
scinot(a.meta.0v24$pval)$coeff
scinot(a.meta.0v24$pval)$exp
sub('^(-)?0[.]','\\1.',round(a.meta.1v24$pval,4))
sub('^(-)?0[.]','\\1.',round(s.meta.0v1$pval,3))
scinot(s.meta.0v24$pval)$coeff
scinot(s.meta.0v24$pval)$exp
sub('^(-)?0[.]','\\1.',round(s.meta.1v24$pval,4))
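To illustrate how a random-effects meta-analysis combines two effect sizes, the sketch below implements the DerSimonian-Laird estimator in base R. This is an assumption for illustration only: the estimator used for the published analysis may differ, and the effect sizes and variances shown are hypothetical.

```r
# Hypothetical effect sizes (e.g. Cohen's d) and sampling variances for
# two studies (illustrative values only).
yi <- c(original = 1.8, replication = 0.6)
vi <- c(original = 0.50, replication = 0.20)

# Fixed-effect (inverse-variance) weights and Cochran's Q.
w        <- 1 / vi
mu_fixed <- sum(w * yi) / sum(w)
Q        <- sum(w * (yi - mu_fixed)^2)
k        <- length(yi)

# DerSimonian-Laird estimate of between-study variance, truncated at zero.
tau2 <- max(0, (Q - (k - 1)) / (sum(w) - sum(w^2) / sum(w)))

# Random-effects weights, pooled estimate, 95% confidence interval, p value.
w_re  <- 1 / (vi + tau2)
mu_re <- sum(w_re * yi) / sum(w_re)
se_re <- sqrt(1 / sum(w_re))
ci    <- mu_re + c(-1, 1) * qnorm(0.975) * se_re
pval  <- 2 * pnorm(-abs(mu_re / se_re))

print(c(estimate = mu_re, lower = ci[1], upper = ci[2], p = pval))
```

With only two studies the between-study variance is estimated imprecisely, so the pooled interval should be read with caution.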