Difference between revisions of "Using Bioconductor To Analyse Beadarray Data"

From Bridges Lab Protocols

Revision as of 17:30, 21 August 2009


Software Requirements

  • R, get from [CRAN]
  • Bioconductor, get from [Bioconductor]
  • Bioconductor packages. Install as needed:
    • beadarray
    • limma
source("http://www.bioconductor.org/biocLite.R")
biocLite("PACKAGE")
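The biocLite script above was the standard installer when this protocol was written. Current Bioconductor releases use the BiocManager package instead. The following is a minimal sketch, assuming a recent R installation; `missing_packages` is a hypothetical helper name introduced here for illustration, and the install lines are left commented out so nothing is downloaded unintentionally.

```r
# Packages named in the list above
wanted <- c("beadarray", "limma")

# Hypothetical helper: return the packages in 'pkgs' that are not installed
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

to_install <- missing_packages(wanted)
print(to_install)

# Uncomment to install with BiocManager (the successor to biocLite):
# if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
# BiocManager::install(to_install)
```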

Loading Data

  • At a minimum you need the Probe Profile data (normally a txt file).
  • For all R procedures, first change to your working directory, then create a new script and save every line you execute in that script file.
  • Load the beadarray library, then indicate the dataFile (required), the sampleSheet (normally an xls or csv file) and the control set (Control Probe, normally a txt file):
library(beadarray)
data = "FinalReport_SampleProbe.txt"
controls = "ControlProbe.txt"
samplesheet = "Proj_54_12Aug09_WGGEX_SS_name.csv"
BSData = readBeadSummaryData(dataFile = data, qcFile= controls, sampleSheet=samplesheet)
  • You may need to change the ProbeID or controlID argument of readBeadSummaryData to match the name of the Illumina probe identifier column in the sample probe or control probe file.
  • This loads the data into the BSData object (an ExpressionSetIllumina object rather than a plain data frame). Phenotype data can be accessed with pData(BSData) and expression data with exprs(BSData).
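To find out which identifier column name a file actually uses, it can help to look at its column-name row before loading. The sketch below is one way to do that in base R. `peek_columns` is a hypothetical helper (not part of beadarray), the default of 8 skipped lines is an assumption to check against your own export, and the demonstration file and its column names are made up so the example runs on its own.

```r
# Hypothetical helper (not part of beadarray): return the column names of a
# delimited text export. 'skip' is the number of lines before the column-name
# row; 8 is an assumption and should be checked against your own file.
peek_columns <- function(path, skip = 8, sep = "\t") {
  lines <- readLines(path, n = skip + 1)
  strsplit(lines[skip + 1], sep, fixed = TRUE)[[1]]
}

# Demonstration on a small made-up file so the sketch runs on its own
demo_file <- tempfile(fileext = ".txt")
writeLines(
  c(paste("header line", 1:8),
    paste(c("PROBE_ID", "SYMBOL", "Sample1.AVG_Signal"), collapse = "\t")),
  demo_file
)
cols <- peek_columns(demo_file)
print(cols)

# For a real export: peek_columns("FinalReport_SampleProbe.txt")
```

If the identifier column turns out to be called something other than the default, that name is what would be supplied when reading the data.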

Data Normalisation

  • Microarray data is typically quantile normalised and log2 transformed:
BSData.quantile = normaliseIllumina(BSData, method="quantile", transform="log2")
  • To examine the effects of normalisation on the dataset use boxplots:
boxplot(as.data.frame(log2(exprs(BSData))),las=2,outline=FALSE, ylab="Intensity (Log2 Scale)")
boxplot(as.data.frame(exprs(BSData.quantile)),las=2,outline=FALSE, ylab="Intensity (Log2 Scale)")
  • Save these boxplots as postscript files.
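One way to save a boxplot as a postscript file is to open a postscript graphics device, draw the plot, and close the device. The sketch below uses a simulated intensity matrix in place of exprs(BSData) so that it runs without any array data. The matrix, the array names and the output file name are all made up for illustration.

```r
# Simulated raw intensities standing in for exprs(BSData): 100 probes x 6 arrays
set.seed(1)
raw <- matrix(2^rnorm(600, mean = 8), ncol = 6,
              dimnames = list(NULL, paste0("array", 1:6)))

# Hypothetical output location; replace with a path in your working directory
out_file <- file.path(tempdir(), "boxplot_raw.ps")

# Open the postscript device, draw the boxplot, then close the device
postscript(out_file, horizontal = FALSE)
boxplot(as.data.frame(log2(raw)), las = 2, outline = FALSE,
        ylab = "Intensity (Log2 Scale)")
dev.off()
```

The same three steps (open device, plot, dev.off) apply to the normalised boxplot, using a second file name.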

Clustering Analysis

  • This analysis calculates a Euclidean distance matrix between samples, then performs hierarchical clustering on that matrix to show how the replicates group. Ideally, similar treatments will cluster together.
d = dist(t(exprs(BSData.quantile)))
plot(hclust(d))
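
To illustrate what "similar treatments cluster together" looks like, the following sketch runs the same dist and hclust steps on simulated data. The sample names, group sizes and effect size are assumptions chosen for the example, not values from a real experiment. It also uses cutree to assign each sample to one of two groups, which gives a quick check without reading the dendrogram.

```r
# Simulated log2 expression: 200 probes, 3 control and 3 treated arrays
set.seed(42)
n_probes <- 200
base_profile <- rnorm(n_probes, mean = 8)
# Assumed treatment effect: the first 20 probes are shifted by 3 log2 units
shift <- c(rep(3, 20), rep(0, n_probes - 20))

noisy <- function(profile) profile + rnorm(n_probes, sd = 0.2)
expr <- cbind(
  control_1 = noisy(base_profile),
  control_2 = noisy(base_profile),
  control_3 = noisy(base_profile),
  treated_1 = noisy(base_profile + shift),
  treated_2 = noisy(base_profile + shift),
  treated_3 = noisy(base_profile + shift)
)

# dist() works on rows, so transpose to compare samples (Euclidean by default)
d <- dist(t(expr))
hc <- hclust(d)   # complete linkage by default
plot(hc)

# Cut the tree into two groups and list the group of each sample
groups <- cutree(hc, k = 2)
print(groups)
```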