Statistical Analysis
This guide assumes you are using either Excel or R for the analysis. The easiest way to get data into R is to enter it in Excel, save it as a CSV file, and then import it into R with this command:
dataset <- read.csv("filename.csv") #generates a table called dataset with your values
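After importing, it can help to check that the columns arrived as expected. The sketch below is only an illustration: it writes a small made-up table to a temporary file (the file path, the column names group and values, and the numbers are all assumptions for the example), reads it back, and converts the grouping column to a factor, which recent versions of R no longer do automatically on import.

```r
# Write a small hypothetical table to a temporary CSV file so the example is self-contained
csv.path <- tempfile(fileext = ".csv")
write.csv(data.frame(group  = c("WT", "WT", "KO", "KO"),
                     values = c(5.1, 4.8, 6.2, 6.8)),
          csv.path, row.names = FALSE)

dataset <- read.csv(csv.path)            # import the CSV as a data frame
str(dataset)                             # show column names and types
dataset$group <- factor(dataset$group)   # treat group as a categorical variable
table(dataset$group)                     # count the rows in each group
```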
Single Comparisons
Don't forget to adjust these p-values for multiple comparisons if you are doing more than one test.
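In R, one way to make this adjustment is the built-in p.adjust function. The following is a minimal sketch with three made-up p-values; which correction method is appropriate depends on your experiment, so the two methods shown are examples rather than a recommendation.

```r
# Hypothetical raw p-values from three separate tests (illustrative numbers only)
p.values <- c(0.01, 0.04, 0.03)

# Bonferroni: multiplies each p-value by the number of tests (capped at 1)
p.bonferroni <- p.adjust(p.values, method = "bonferroni")

# Benjamini-Hochberg: controls the false discovery rate, less conservative
p.bh <- p.adjust(p.values, method = "BH")

print(p.bonferroni)
print(p.bh)
```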
If you have 2 groups you want to compare
Use a Student's T-Test
- Using Excel, for unpaired samples. Unless you are comparing paired samples (i.e. left leg insulin, right leg control), always use this command. It runs a heteroscedastic unpaired test, which means the two groups are allowed to have unequal variances. The third argument (2) requests a two-tailed test and the fourth argument (3) selects the unequal-variance two-sample test. For more information see http://office.microsoft.com/en-us/excel-help/ttest-HP005209325.aspx
=TTEST(GROUPRANGE1, GROUPRANGE2, 2 ,3)
- Using R (for more details see http://www.statmethods.net/stats/ttest.html and http://stat.ethz.ch/R-manual/R-patched/library/stats/html/t.test.html):
If you have two lists of numbers that are not in a table, you can test them directly:
t.test(group1, group2) #this compares two vectors of numbers
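For the paired case mentioned above (for example left leg insulin, right leg control), t.test accepts a paired argument. This is a small sketch with invented measurements; the vector names control and insulin are assumptions for the example, and the values must be in the same order so that each position refers to the same animal.

```r
# Hypothetical paired measurements: the same five animals, one leg per condition
control <- c(1.0, 1.2, 0.9, 1.1, 1.3)
insulin <- c(1.6, 1.9, 1.4, 1.8, 2.0)

# paired = TRUE tests whether the mean within-animal difference is zero
paired.result <- t.test(insulin, control, paired = TRUE)
print(paired.result$p.value)
```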
If you have a table named dataset with columns named values and group, where the group column contains 2 different values (for example WT and KO), use the formula form below. If you have more than 2 values in the group column then you need to go to #If you are testing one variable with more than two groups(One Way ANOVA).
t.test(values ~ group, data=dataset) #this compares the values column between the two levels of the group column. It will not work if there are more than 2 groups
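As a sketch of the table-based test from start to finish, the example below builds a small made-up dataset (the numbers are invented for illustration) and then pulls the p-value and group means out of the result object instead of reading them off the printed output.

```r
# Hypothetical example data: six measurements in each of two groups
dataset <- data.frame(
  group  = rep(c("WT", "KO"), each = 6),
  values = c(5.1, 4.8, 5.6, 5.0, 5.3, 4.9,
             6.2, 6.8, 6.1, 6.5, 7.0, 6.4)
)

# By default t.test does not assume equal variances (Welch test)
result <- t.test(values ~ group, data = dataset)
print(result$p.value)    # the p-value of the comparison
print(result$estimate)   # the mean of each group
```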
If you have one group you want to compare to a number
For example, you might want to test whether the mean of a series of numbers is >1
t.test(group1, mu=1, alternative="greater") #this tests the alternative hypothesis that the mean of the numbers in group1 is > 1
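A minimal worked version of this one-sample test, using invented numbers for group1, might look like this. Note that the one-sided alternative only asks whether the mean is above 1, so values well below 1 would not be reported as significant.

```r
# Hypothetical measurements, for example fold changes relative to a control
group1 <- c(1.2, 1.5, 0.9, 1.4, 1.3, 1.6)

# One-sided test of whether the mean of group1 is greater than 1
one.sample <- t.test(group1, mu = 1, alternative = "greater")
print(one.sample$p.value)
```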
Multiple Comparisons
If you are testing one variable with more than two groups(One Way ANOVA)
Use this when you are comparing three or more groups with each other, not when you are comparing 2 groups to a control. For example this might be Normal Diet, High Fat Diet, High Protein Diet. Note that if you do this with just two groups, the result should be the same as a t-test that assumes equal variances.
- Using R, providing data is formatted in a dataframe named dataset with columns group and values (see http://stat.ethz.ch/R-manual/R-patched/library/stats/html/aov.html). The first step is to do an ANOVA, then, depending on whether the result of this comparison is significant, move on to post-hoc tests such as TukeyHSD:
fit.aov <- aov(values ~ group, data=dataset) #generates an object named fit.aov
summary(fit.aov) #tests for significance of the ANOVA. If the p-value is greater than your alpha (usually 0.05) stop and declare no significant difference. If it is < 0.05 go on to the next test.
TukeyHSD(fit.aov) #this does a Tukey HSD test
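The decision step described above can also be written out in code. This is a sketch on invented data for three diets; the group labels and numbers are assumptions for the example. The group column is stored as a factor, since the post-hoc test needs a categorical variable.

```r
# Hypothetical data: four measurements for each of three diets
dataset <- data.frame(
  group  = factor(rep(c("Normal", "HighFat", "HighProtein"), each = 4)),
  values = c(20, 22, 21, 23,
             30, 31, 29, 33,
             21, 24, 22, 23)
)

fit.aov <- aov(values ~ group, data = dataset)

# Extract the overall ANOVA p-value from the summary table
anova.p <- summary(fit.aov)[[1]][["Pr(>F)"]][1]

# Only run the post-hoc test if the overall ANOVA is significant
if (anova.p < 0.05) {
  print(TukeyHSD(fit.aov))
} else {
  print("No significant difference between groups")
}
```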
If you are testing two variables simultaneously
For example this could be the effects of diet and genotype. It does not matter how many levels each variable has. If one of the variables is not a factor (and is instead a continuous variable like age) then look below for #Correlations.
- Using R, providing data is formatted in a dataframe named dataset with columns genotype, diet and values (see AOV). The first step is to do an ANOVA, then, depending on the results, move on to post-hoc tests such as TukeyHSD or separate your dataset:
fit.aov <- aov(values ~ genotype*diet, data=dataset) #generates an object named fit.aov
summary(fit.aov) #tests for significance of the ANOVA.
At this stage you will get an output such as this:
              Df Sum Sq Mean Sq F value   Pr(>F)
genotype       1  25.23   25.23  43.942 0.000164 ***
diet           1 141.45  141.45 246.363 2.71e-07 ***
genotype:diet  1   1.92    1.92   3.344 0.104853
Residuals      8   4.59    0.57
- First look at the genotype:diet row. If this p-value is <0.05 then you have a significant interaction between genotype and diet. If this is the case move on to #Interaction to separate out your groups. If this value is >0.05 then there is no significant interaction, so check whether the p-value for either of your variables is significant. If it is (and there is no interaction) then go ahead to #Main Effect. In the above example there is no interaction, but there are two main effects.
Main Effect
If there is no interaction, but there is a significant effect for one or both variables, then you can go on to look at post-hoc tests such as TukeyHSD
TukeyHSD(fit.aov)
This will generate all possible pairwise comparisons between your groups
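The two-way workflow can be sketched from start to finish as follows. The data are invented so that genotype and diet each have an effect but do not interact; the level names and numbers are assumptions for the example. The which argument of TukeyHSD restricts the output to the main effects rather than every combination of cells.

```r
# Hypothetical 2 x 2 design with three measurements per cell
dataset <- data.frame(
  genotype = factor(rep(c("WT", "KO"), each = 6)),
  diet     = factor(rep(rep(c("Normal", "HighFat"), each = 3), times = 2)),
  values   = c(10.0, 11.0, 10.5,   # WT, Normal
               17.0, 18.0, 17.5,   # WT, HighFat
               13.0, 14.0, 13.5,   # KO, Normal
               20.0, 21.5, 20.5)   # KO, HighFat
)

fit.aov <- aov(values ~ genotype * diet, data = dataset)
tab <- summary(fit.aov)[[1]]

# Rows of the table are genotype, diet, genotype:diet, Residuals (in that order)
interaction.p <- tab[["Pr(>F)"]][3]

if (interaction.p > 0.05) {
  # No significant interaction: post-hoc comparisons for each main effect
  main.effects <- TukeyHSD(fit.aov, which = c("genotype", "diet"))
  print(main.effects)
}
```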
Interaction
If there is an interaction, you will need to separate out your groups and compare them separately. For example this will subset out just "WT" genotypes and analyse those.
wt.dataset <- subset(dataset, genotype=="WT") #keeps only the rows where genotype is WT
wt.fit <- aov(values ~ diet, data=wt.dataset) #fits the ANOVA on the subset, not on the full dataset
summary(wt.fit) #at this point you can go on to a TukeyHSD if you have >2 diet values and a significant ANOVA
TukeyHSD(wt.fit)
This will tell you, separately from the interaction, whether each pairwise comparison is significant. You will have to repeat this by re-running subset for each genotype and diet value as needed. Alternatively, you can account for the effect of one variable on the significance of the other. For example, if diet and genotype interact, you may want to know what the effect of diet is, controlling for the effect of genotype. This is done like this:
TukeyHSD(fit.aov, "diet") #This does pairwise comparisons of diet, while accounting for the effect of genotype
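Rather than repeating the subset step by hand for every genotype, the per-genotype ANOVAs can be run in one pass. This is a sketch on invented data in which diet matters in KO but not in WT; the level names and numbers are assumptions for the example. Because this runs one test per genotype, the resulting p-values are adjusted for multiple comparisons at the end.

```r
# Hypothetical 2 x 2 design with an interaction: diet only has an effect in KO
dataset <- data.frame(
  genotype = factor(rep(c("WT", "KO"), each = 6)),
  diet     = factor(rep(rep(c("Normal", "HighFat"), each = 3), times = 2)),
  values   = c(10.0, 11.0, 10.5,   # WT, Normal
               11.0, 10.5, 11.5,   # WT, HighFat
               13.0, 14.0, 13.5,   # KO, Normal
               20.0, 21.0, 20.5)   # KO, HighFat
)

# Split the data by genotype and run a one-way ANOVA of diet within each piece
p.by.genotype <- sapply(split(dataset, dataset$genotype), function(d) {
  summary(aov(values ~ diet, data = d))[[1]][["Pr(>F)"]][1]
})

# One test per genotype, so adjust the p-values for multiple comparisons
p.adjusted <- p.adjust(p.by.genotype, method = "bonferroni")
print(p.adjusted)
```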
Correlations
Use this when both variables are continuous, rather than one of them being discrete
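This section does not name a specific test, so the following is only one possible approach and not necessarily the intended one: the built-in cor.test function tests whether two continuous variables are correlated (Pearson correlation by default), and lm fits a straight line through them. The variable names and numbers are invented for illustration.

```r
# Hypothetical data: age in weeks and a measured value for eight animals
age    <- c(8, 10, 12, 14, 16, 18, 20, 22)
values <- c(20.1, 21.0, 22.4, 22.9, 24.2, 25.1, 25.8, 27.0)

# Test whether the correlation between the two variables differs from zero
cor.result <- cor.test(age, values)
print(cor.result$estimate)   # the correlation coefficient
print(cor.result$p.value)    # the p-value for the correlation

# Fit a linear regression of values on age
fit.lm <- lm(values ~ age)
print(summary(fit.lm))
```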