# Medical Information Processing Practice: Assignment 8, Academic Year 2011

## Assignment

The data frame UN, included in the car library, contains United Nations member-state data on two variables: gross domestic product (variable name gdp) and infant mortality rate (variable name infant.mortality). We drew a scatterplot with gdp on the horizontal axis and infant.mortality on the vertical axis, together with the 50% and 95% concentration ellipses. We then calculated Pearson's product-moment correlation coefficient and Spearman's rank correlation coefficient, and tested the null hypothesis that each coefficient is 0 at a significance level of 5%. The code, the results, and their interpretation are shown below.

(Please write your registration number and name, and fill in boxes A to D.)

The code is shown below.

require(car) # load and activate car package
UNs <- subset(UN, complete.cases(UN)) # exclude cases with missing values
dataEllipse(UNs$____, UNs$____, levels=c(0.5, 0.95)) # draw scatterplot with 50% and 95% concentration ellipses
cor.test(UNs$____, UNs$____) # estimate Pearson's r and test the zero-correlation hypothesis
cor.test(UNs$____, UNs$____, method="____") # estimate Spearman's rho and test the zero-correlation hypothesis
# (cf.) As shown below, a logarithmic transformation makes the relationship closer to linear.
# Lgdp <- log10(UNs$gdp)
# Linfant.mortality <- log10(UNs$infant.mortality)
# dataEllipse(Lgdp, Linfant.mortality, levels=c(0.5, 0.95)) # concentration ellipses
# cor.test(Lgdp, Linfant.mortality) # Pearson's correlation
# cor.test(Lgdp, Linfant.mortality, method="spearman") # Spearman's rank correlation
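To make the idea of a concentration ellipse more concrete, the following is a minimal sketch in base R of how such an ellipse can be built from the sample mean and covariance matrix. It uses synthetic data rather than the UN data, and it assumes a chi-square radius under a bivariate-normal approximation; this is an illustrative choice and not necessarily the exact rule that `dataEllipse` applies. The helper names `ellipse_points` and `inside_fraction` are hypothetical.

```r
# Sketch: concentration ellipse from mean and covariance (synthetic data).
set.seed(1)
n <- 200
x <- rnorm(n)
y <- -0.6 * x + rnorm(n, sd = 0.8)
xy <- cbind(x, y)

# Hypothetical helper: points on the ellipse that has constant
# Mahalanobis distance from the sample mean.
ellipse_points <- function(xy, level, n_points = 200) {
  center <- colMeans(xy)
  S <- cov(xy)
  radius <- sqrt(qchisq(level, df = 2))  # assumption: chi-square radius
  theta <- seq(0, 2 * pi, length.out = n_points)
  circle <- cbind(cos(theta), sin(theta))
  # chol(S) returns an upper-triangular R with t(R) %*% R equal to S,
  # so mapping the unit circle through R gives the ellipse shape.
  pts <- radius * circle %*% chol(S)
  sweep(pts, 2, center, "+")
}

# Hypothetical helper: share of observations falling inside the ellipse.
inside_fraction <- function(xy, level) {
  d2 <- mahalanobis(xy, colMeans(xy), cov(xy))
  mean(d2 <= qchisq(level, df = 2))
}

plot(x, y)
lines(ellipse_points(xy, 0.50))
lines(ellipse_points(xy, 0.95))
inside_fraction(xy, 0.50)
inside_fraction(xy, 0.95)
```

For roughly bivariate-normal data, the two fractions should come out near 0.5 and 0.95. For strongly skewed data such as raw GDP the ellipse can describe the point cloud poorly, which is one reason to look at the log-transformed plot as well.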

After the graph is drawn, Pearson's correlation coefficient and Spearman's rank correlation coefficient are calculated, and the null hypothesis that each coefficient equals 0 is tested. The point estimates were -0.511 (Pearson) and -0.807 (Spearman), and both p-values were far smaller than 0.05. Therefore, we can conclude that there (1. was | 2. was not) a statistically significant correlation at the 5% level.
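The gap between the two coefficients is what one would expect when the relationship is monotone but not linear. The sketch below illustrates this with synthetic data only (the variables `gdp_like` and `mort_like` are hypothetical and are not the UN values): Pearson's r on the raw scale is weakened by the curvature, whereas Spearman's rho depends only on ranks and so is unchanged by a log transformation.

```r
# Sketch: Pearson vs. Spearman for a monotone, nonlinear relationship.
# Synthetic data; the parameters are arbitrary assumptions.
set.seed(42)
n <- 200
gdp_like  <- exp(rnorm(n, mean = 7, sd = 1))   # right-skewed predictor
mort_like <- exp(8 - 0.6 * log(gdp_like) + rnorm(n, sd = 0.4))

# Raw scale: the relationship is curved, so Pearson's r understates it.
r_raw   <- cor.test(gdp_like, mort_like)
rho_raw <- cor.test(gdp_like, mort_like, method = "spearman")

# Log scale: the relationship is linear by construction.
r_log   <- cor.test(log10(gdp_like), log10(mort_like))
rho_log <- cor.test(log10(gdp_like), log10(mort_like), method = "spearman")

# Point estimates and p-values side by side.
data.frame(
  scale    = c("raw", "raw", "log10", "log10"),
  method   = c("pearson", "spearman", "pearson", "spearman"),
  estimate = unname(c(r_raw$estimate, rho_raw$estimate,
                      r_log$estimate, rho_log$estimate)),
  p.value  = c(r_raw$p.value, rho_raw$p.value,
               r_log$p.value, rho_log$p.value)
)
```

In this sketch the Spearman estimate is identical on both scales, while the Pearson estimate moves closer to it after the transformation. A p-value below 0.05 leads to rejecting the null hypothesis of zero correlation at the 5% level.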