# R Notes: Descriptive Statistics

library(foreign)

library(pastecs)  # provides stat.desc, which these notes use

【1】summary

summary(ma["weight"])

summary(ma["height"])

vars <- c("weight", "height")  # named vars rather than var to avoid confusion with the var() function

summary(ma[vars])

summary(ma[c("weight","height")])  # equivalent to summary(ma[3:4]), or to summary(ma[-1:-2]) when ma has exactly four columns
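The three selection styles can be checked against each other on a small example. This is a minimal sketch: the `ma` data frame below is a hypothetical stand-in (its columns and values are invented for illustration), since the notes do not show how the real `ma` is loaded.

```r
# Hypothetical stand-in for `ma`; the real data set is not shown in these notes
ma <- data.frame(
  id     = 1:6,
  group  = c("A", "A", "A", "B", "B", "B"),
  weight = c(60.5, 72.0, 55.3, 80.1, 66.4, 59.9),
  height = c(165, 178, 160, 182, 171, 163)
)

by_name     <- summary(ma[c("weight", "height")])  # select columns by name
by_position <- summary(ma[3:4])                    # select columns by position
by_exclude  <- summary(ma[-1:-2])                  # -1:-2 is c(-1, -2): drop columns 1 and 2

print(by_name)
print(identical(by_name, by_position))
print(identical(by_name, by_exclude))
```

Selecting by name is usually the more robust choice, because positional and negative indices silently pick different columns if the column order changes.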

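by(ma[3:4], ma$group, function(x) stat.desc(x, norm=TRUE))  # for columns 3 and 4 of data frame ma, split the rows by group and run stat.desc (from the pastecs package) on each group, printing the basic descriptive statistics together with the normal-distribution statistics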
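To see what `by()` does on its own, the sketch below applies a base R function to each group. `colMeans` is used here only as a stand-in for `stat.desc` so that the example runs without pastecs, and the `ma` data frame is again a hypothetical one made up for illustration.

```r
# Hypothetical stand-in for `ma`; values are invented
ma <- data.frame(
  id     = 1:4,
  group  = c("A", "A", "B", "B"),
  weight = c(60, 70, 80, 90),
  height = c(160, 170, 180, 190)
)

# by() splits the rows of ma[3:4] by ma$group and calls the function
# once per group, passing that group's rows as a data frame
group_means <- by(ma[3:4], ma$group, function(x) colMeans(x))

print(group_means)
```

Swapping `colMeans(x)` for `stat.desc(x, norm=TRUE)` gives the call used in these notes, with one block of statistics per group.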

stat.desc{pastecs}: Descriptive statistics on a data frame or time series. Computes a table giving various descriptive statistics about the series in a data frame or in a single/multiple time series.

Usage: stat.desc(x, basic=TRUE, desc=TRUE, norm=FALSE, p=0.95)
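A direct call with the defaults written out might look like the sketch below. The data frame is hypothetical, and the call is wrapped in a `requireNamespace` check (an addition for this example, not part of the original notes) so the snippet still runs where pastecs is not installed.

```r
# Hypothetical data; column names follow the weight/height example in these notes
ma <- data.frame(
  weight = c(60.5, 72.0, 55.3, 80.1, 66.4, 59.9),
  height = c(165, 178, 160, 182, 171, 163)
)

if (requireNamespace("pastecs", quietly = TRUE)) {
  # One column per variable, one row per statistic
  res <- pastecs::stat.desc(ma, basic = TRUE, desc = TRUE, norm = FALSE, p = 0.95)
  print(round(res, 3))
} else {
  res <- NULL
  message("pastecs is not installed; try install.packages('pastecs')")
}
```

Individual values can then be read by row and column name, for example `res["mean", "weight"]`.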

x: a data frame or a time series

basic: whether to return basic statistics (TRUE by default). These are: the number of values (nbr.val), the number of null values (nbr.null), the number of missing values (nbr.na), the minimal value (min), the maximal value (max), the range (range, that is, max-min) and the sum of all non-missing values (sum)
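The basic statistics are simple enough to reproduce by hand in base R, which helps clarify what each name means. The sketch below assumes that "null values" means values equal to zero (as opposed to missing values); that reading is an assumption about pastecs, and the vector `x` is made-up data.

```r
# Made-up data containing two zeros and one missing value
x <- c(4.2, 0, 7.5, NA, 3.1, 0, 9.8)

basic <- c(
  nbr.val  = sum(!is.na(x)),              # non-missing values
  nbr.null = sum(x == 0, na.rm = TRUE),   # values equal to zero (assumed meaning of "null")
  nbr.na   = sum(is.na(x)),               # missing values
  min      = min(x, na.rm = TRUE),
  max      = max(x, na.rm = TRUE),
  range    = max(x, na.rm = TRUE) - min(x, na.rm = TRUE),
  sum      = sum(x, na.rm = TRUE)         # sum of non-missing values
)

print(basic)
```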

desc: whether to return various descriptive statistics (TRUE by default). These are: the median (median), the mean (mean), the standard error of the mean (SE.mean), the confidence interval of the mean (CI.mean) at the p level, the variance (var), the standard deviation (std.dev) and the coefficient of variation (coef.var), defined as the standard deviation divided by the mean
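The descriptive statistics can also be sketched in base R. Two assumptions are made here and should be checked against the pastecs documentation: that CI.mean is reported as the half-width of the interval (to be added to and subtracted from the mean), and that it is based on a Student t quantile with n - 1 degrees of freedom. The data are invented.

```r
x <- c(60.5, 72.0, 55.3, 80.1, 66.4, 59.9)  # made-up sample
p <- 0.95                                   # confidence level, the stat.desc default
n <- length(x)

se_mean <- sd(x) / sqrt(n)                          # standard error of the mean
ci_half <- qt(0.5 + p / 2, df = n - 1) * se_mean    # assumed: half-width of the CI

desc <- c(
  median   = median(x),
  mean     = mean(x),
  SE.mean  = se_mean,
  CI.mean  = ci_half,
  var      = var(x),
  std.dev  = sd(x),
  coef.var = sd(x) / mean(x)   # standard deviation divided by the mean
)

print(round(desc, 4))
```

Under these assumptions the interval itself would be `mean(x) - ci_half` to `mean(x) + ci_half`.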

norm: whether to return normal distribution statistics (FALSE by default). These are: the skewness coefficient g1 (skewness), its significance criterion (skew.2SE, that is, g1 divided by twice its standard error; if skew.2SE exceeds 1 in absolute value, skewness is significantly different from zero), the kurtosis coefficient g2 (kurtosis), its significance criterion (kurt.2SE, same remark as for skew.2SE), the statistic of a Shapiro-Wilk test of normality (normtest.W) and its associated probability (normtest.p)
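The Shapiro-Wilk part of this output is available in base R through `shapiro.test`, so normtest.W and normtest.p can be approximated without pastecs. The skewness line below uses one common moment-based estimator; it is an assumption that this matches the g1 computed by pastecs, so treat it as a rough illustration only. The sample is simulated.

```r
set.seed(42)
x <- rnorm(50, mean = 170, sd = 8)   # simulated, roughly normal sample
n <- length(x)

# One common moment-based skewness estimator (may differ in detail from pastecs)
g1 <- sum((x - mean(x))^3) / (n * sd(x)^3)

# Shapiro-Wilk test of normality from base R (stats package)
sw <- shapiro.test(x)

norm_stats <- c(
  skewness   = g1,
  normtest.W = unname(sw$statistic),
  normtest.p = sw$p.value
)

print(round(norm_stats, 4))
```

A small normtest.p (for example below 0.05) suggests the data depart from a normal distribution, while a W close to 1 is what normally distributed data tend to produce.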

p: the probability level to use to calculate the confidence interval on the mean (CI.mean). By default, p=0.95
