Biostatistics 2016
Class time: Tuesdays 10:00-12:00; Thursdays 10:00-12:00 (even-numbered weeks only)
Location: Room 404, Teaching Building 2 (二教)
Announcements:
Announcement, March 2, 2016:
The following sessions will be taught by Prof. Theis Lange:
March 29th, Tuesday, on power analysis etc.
April 12th, Tuesday, on Survival Analysis
April 14th, Thursday, on Survival Analysis
April 19th, Tuesday, on Survival Analysis
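The enrollment cap for undergraduates has been raised to 40. Students who wanted to take the course but did not get a place should enroll as soon as possible.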
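Announcement, April 26, 2016:
All project proposal presentations are planned to be completed on April 28; any that cannot be finished will be carried over to the next class. The presentation order will be decided by a combination of volunteering and roll call. Anyone absent when called will receive no grade for the proposal presentation.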
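Announcement, May 15, 2016:
On May 17 and 18 the Center for Statistical Science is hosting a conference on high-dimensional statistics in the era of big data. Many students are likely to be interested, so the May 17 class is canceled and everyone is encouraged to attend the conference. Please pass this on to your classmates. Conference information is available here.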
Course Introduction
Reference books:
Generalized Additive Models: An Introduction with R, by Simon Wood
Elements of Statistical Learning, by Jerome H. Friedman, Robert Tibshirani, and Trevor Hastie
Advanced Data Analysis from an Elementary Point of View, by Cosma Rohilla Shalizi
Mixed Effects Models and Extensions in Ecology with R, by Alain F. Zuur, Anatoly A. Saveliev, Elena N. Ieno, and Graham M. Smith
An Introduction to Statistical Learning with Applications in R, by Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani
R resources
The R project
Bioconductor
Rstudio
Cookbook for R
R Graph Gallery
Swirl
DataCamp
Some useful online resources
Marshall Hampton's Class on Bioinformatics
Bioinformatics and Functional Genomics
Computational Genomics: A Case Studies Approach
Introductory Biology (MIT open course)
Lecture slides
Lecture1
Lecture2 / code
Lecture3 / code1, code2
Lecture4,5 / Reading Material / Code / cnvData
Lecture6&7 / code
Lecture8
Lecture9 / code
Lecture10 / code
Lecture11 / leukemia data
Lecture12
Lecture 13
About the data in Lecture 13:
The Stanford heart transplant data ships with R's survival package. Load it with:
library(survival)
data(heart)
jasa # this is the data to use
A fuller description of the data is available here:
//stat.ethz.ch/R-manual/R-patched/library/survival/html/heart.html
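To check that the data loaded as expected, the sketch below inspects a few columns of jasa and fits an overall Kaplan-Meier curve. It assumes the survival package is installed. Treating futime as the follow-up time and fustat as the death indicator follows the package's description of the data set; the choice of a Kaplan-Meier fit is only an illustration and is not necessarily the analysis used in the lecture.

```r
library(survival)

# jasa becomes available once the survival package is attached
str(jasa[, c("futime", "fustat", "transplant", "age")])

# Overall Kaplan-Meier estimate: futime is the follow-up time,
# fustat is the status at end of follow-up (1 = dead, 0 = alive)
km_fit <- survfit(Surv(futime, fustat) ~ 1, data = jasa)
print(km_fit)

# Survival curve with its default confidence band
plot(km_fit, xlab = "Follow-up time", ylab = "Survival probability")
```

Note that transplant status changes during follow-up, so comparing curves by simply splitting on the transplant column would need more care than this sketch shows.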
Lecture 14 / code
Lecture 15 / code
Lecture 16&17 / code
Lecture 18
Homework
All homework should be in PDF format and should be emailed
to the TA (cheung1990 AT 126 DOT com). If a homework involves coding,
you should also send your code to the TA. The code should be
easy for the TA to run; for the TA's convenience, please include
a short document explaining how to run it.
Homework 1 (Due: March 17)
Read the paper by Cleveland and McGill.
1. Summarize the paper. Based on the paper, make some general
recommendations for making a plot.
2. Give at least one example that you have encountered (in scientific
papers, social media, or elsewhere) where you can redesign the plot
to make it more accurate. You need to provide the source of the example
so that others can easily find it.
Homework 2 (Due: April 5)
1. Exercise 9 of Chapter 10 in the book An Introduction to Statistical Learning with Applications in R.
Note that the data USArrests is part of the base R distribution. You may use data(USArrests) in R to load the data.
2. Exercise 11 of Chapter 10 in the book An Introduction to Statistical Learning with Applications in R.
3. The problems in this file.
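For problem 1, a minimal sketch of loading and inspecting USArrests is given below. The standardization step is an assumption added for illustration: the four variables are on different scales, and whether to standardize them is a choice you should justify in your own answer.

```r
# USArrests ships with base R (the datasets package)
data(USArrests)

# 50 states, 4 numeric variables: Murder, Assault, UrbanPop, Rape
str(USArrests)
summary(USArrests)

# Illustrative assumption: center each variable and scale it to
# standard deviation one so no single variable dominates distances
arrests_scaled <- scale(USArrests)
round(apply(arrests_scaled, 2, sd), 2)
```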
Homework 3 (Due: May 2)
The problems in this file. The data is here.
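Final project
The final project may be on a topic of your own choosing or on one of the two candidate topics given below. Groups are encouraged to have three members where possible; a group with two or more undergraduates may have up to four members.
Students from the School of Mathematical Sciences are encouraged to form groups with students from other departments.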
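For a self-chosen topic, each group must report in advance (on April 28) on the problem it plans to work on (background, significance, and a preliminary research plan). Self-chosen projects will be graded on the significance of the problem and on how well the work is completed.
For the two candidate topics, the second is easier than the first, so groups choosing the second topic will be penalized: if a group's preliminary score on the second topic is X, its actual score on the final project will be only 0.9X. Groups choosing either candidate topic must also present their preliminary research plan on April 28 or May 3.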
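Each group must present its results on June 7 or June 9. The final version of the project must be sent to the instructor (ruibinxi AT hotmail.com) and the TA by 23:00 on June 19. The email subject must be "2016生物统计大作业+本组成员名单" (that is, "2016生物统计大作业" followed by the list of group members). The project must state each member's contribution clearly; members who contributed little will have their final score reduced.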
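Both presentations (the proposal and the final results) are graded; a group that does not present, or that is absent at its presentation time, receives no score for that item. Each proposal presentation is 5 minutes plus 2 minutes for questions. Each final presentation is 10 minutes plus 2 minutes for questions.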
Each group should send its member list and chosen topic to the TA before April 27.
Candidate topic 1 (Replication Timing)
Reading material for topic 1
Candidate topic 2 (Parkinsons) / data