Syllabus Reference

Last updated: 2020/09/09  
University of Tsukuba, Curriculum Organization Support System (EN)

01CF115 Data Analysis

2.0 Credits, Years 1 and 2, Fall AB, Mon 5, 6
Yukino Baba, Sho Tsugawa, Yohei Akimoto

Course Overview

This course covers major data analysis techniques, from the basics to the state of the art, that are used to evaluate research results. Lectures are accompanied by exercises in the R language.

Remarks

Identical to 01CH738.
Online (Asynchronous)

Course Type

Lectures

Relation to Degree Program Competences

Knowledge Utilization Skills, Management Skills, Research Skills, Expert Knowledge

Course Objectives (Learning Outcomes)

The course aims to prepare students to interpret, analyze, and make predictions from the various kinds of data encountered in research.

Course Keywords

data analysis, statistics

Class Schedule

This course consists of lectures and R-language drills on a range of data analysis techniques. It covers basic theory, standard techniques, and recently developed advanced methods.

1. Introduction
- Review of basic probability theory: probability, random events, random variables, probability distributions, probability density functions
- Introduction to R: installation, language, calculation, data structures, input/output, packages
  
2. Estimation
- Estimation of density functions: maximum likelihood estimation (MLE), Bayesian estimation, MLE of mixture distributions (EM algorithm), nonparametric estimation
- Interval estimation and confidence level
  
3. Principal Component Analysis (PCA)
- Covariance (correlation) matrix and principal components, nonlinear (kernel-based) PCA
  
4. Correlation Analysis and Regression
- Correlation coefficient
- Simple and multiple regression
  
5. Bayesian Data Analysis
- Bayesian inference and MCMC
- Probabilistic programming language: Stan
  
6. Probability Distributions
- Discrete distributions
- Continuous distributions
  
7. Practical Models
- Hierarchical models
- Models with discrete parameters
  
8. Network Analysis
- Data with network structure
- Network visualization
- Metrics used in network analysis
  
9. Clustering
- Clustering based on distances among data points
- Network clustering: clustering based on relationships among data points
- Evaluation of clustering
  
10. Data Ranking and Evaluation
- Node ranking based on the topological structure of networks
- Evaluation techniques for data ranking used in information retrieval
  

Course Prerequisites

Undergraduate-level probability and statistics.

Grading Philosophy

The grade is based on the total score of the term papers assigned by the lecturers.

Course Hours Breakdown and Out-of-Class Learning

Each week, the first half of the class is devoted to the lecture and the second half to drills using R. Please bring a laptop computer that can run an R environment.

Textbooks, References, and Supplementary Materials

Course materials will be available on the course page on Manaba (https://manaba.tsukuba.ac.jp).

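1. Rで学ぶデータサイエンス [Data Science Learned with R] series, Kyoritsu Shuppan
2. Max Kuhn & Kjell Johnson, Applied Predictive Modeling, Springer, 2013
3. Kentaro Matsuura, StanとRでベイズ統計モデリング [Bayesian Statistical Modeling with Stan and R], Kyoritsu Shuppan
4. Shinya Baba, RとStanではじめる ベイズ統計モデリングによるデータ分析入門 [Introduction to Data Analysis by Bayesian Statistical Modeling, Starting with R and Stan], Kodansha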

Office Hours and Contact Information

Please contact the instructors by email.

Yukino Baba 100002372
Sho Tsugawa 23051794 http://www.mibel.cs.tsukuba.ac.jp/~s-tugawa/
Yohei Akimoto 100002461

Other (Behavioral expectations and points to note for students during coursework)

Relation to Other Courses

Teaching Fellow and/or Teaching Assistant

A teaching assistant will support this course.