Data Scientist: The Sexiest Job of the 21st Century
Using wearables data to monitor and prevent health problems
Improving diagnostic accuracy and efficiency
Turning patient care into precision medicine
Advancing pharmaceutical research to find cures for cancer and Ebola
Optimizing clinic performance through actionable insights
Taking the risk out of prescription medicine
Reducing hospital readmissions to cut healthcare costs
Health Care Management and Strategy
Medical Informatics and Decision Management
Health IT Project Management
Population and Community Health Analytics
Business Intelligence & The Internet of Medical Things (IoMT)
Research Analytics & Predictive Analytics
Health Innovation and Entrepreneurship / Capstone
Harvard Univ.
2.6.1 Course Requirements for the Health Data Science SM60 Degree
The degree requirements include a 20-credit, ordinally graded core curriculum consisting of:
BST 222 Basics of Statistical Inference (Fall, 5 credits)
BST 260 Introduction to Data Science (Fall, 5 credits)
BST 261 Data Science II (Spring 2, 2.5 credits)
BST 262 Computing for Big Data (Fall 2, 2.5 credits)
BST 263 Applied Machine Learning (Spring, 5 credits)
An additional five credits must be taken in computer science from the following list:
BST 234 Introduction to Data Structures and Algorithms (5 credits)
BST 281 Genomic Data Manipulation (5 credits)
APMTH 120 Applied Linear Algebra and Big Data (5 credits)
BMI 713 Computational Statistics for Biomedical Science (5 credits)
CS 105 Privacy and Technology (5 credits)
CS 124 Data Structures and Algorithms (5 credits)
CS 164 Software Engineering Computer Science (5 credits)
CS 165 Data Systems (5 credits)
CS 171 Visualization (5 credits)
CS 187 Computational Linguistics (5 credits)
STAT 171 Introduction to Stochastic Processes (5 credits)
Twenty-five additional credits must be taken. Courses that would satisfy these requirements may come from the following list of elective courses:
BST 210 Applied Regression Analysis (5 credits)
BST 216 Introduction to Quantitative Methods for Monitoring and Evaluation (2.5 credits)
BST 223 Applied Survival Analysis (5 credits)
BST 226 Applied Longitudinal Analysis (5 credits)
BST 228 Applied Bayesian Analysis (5 credits)
BST 254 Sec 3 Measurement Error and Misclassification (2.5 credits)
BST 267 Introduction to Social and Biological Networks (2.5 credits)
BST 280 Introductory Genomics & Bioinformatics for Health Research (2.5 credits)
BST 282 Introduction to Computational Biology and Bioinformatics (5 credits)
BST 283 Cancer Genome Analysis (5 credits)
EPI 202 Elements of Epidemiologic Research: Methods 2 (2.5 credits)
EPI 203 Study Design in Epidemiologic Research (2.5 credits)
EPI 204 Analysis of Case-Control and Cohort Studies (2.5 credits)
EPI 233 Research Synthesis & Meta-Analysis (2.5 credits)
EPI 271 Propensity Score Analysis (1.25 credits)
EPI 286 Database Analytics in Pharmacoepidemiology (2.5 credits)
EPI 288 Data Mining and Prediction (2.5 credits)
EPI 293 Analysis of Genetic Association Studies (2.5 credits)
ID 271 Advanced Regression for Environmental Epidemiology (2.5 credits)
RDS 280 Decision Analysis for Health and Medical Practices (2.5 credits)
RDS 282 Economic Evaluation of Health Policy and Program Management (2.5 credits)
RDS 285 Decision Analysis Methods in Public Health and Medicine (2.5 credits)
APMTH 207 Advanced Scientific Computing: Stochastic Methods for Data Analysis, Inference and Optimization (5 credits)
APMTH 221 Advanced Optimization (5 credits)
BMI 701 Introduction to Biomedical Informatics (5 credits)
BMI 702 Foundation of Biomedical Informatics II (2.5 credits)
BMI 703 Precision Medicine I: Genomic Medicine (2.5 credits)
BMI 705 Precision Medicine II: Integrating Clinical and Genomic Data (2.5 credits)
BMI 706 Data Visualization for Biomedical Applications (2.5 credits)
CI 722.0 Clinical Data Science: Comparative Effectiveness Research I (2.5 credits)
ME 530M.1 Clinical Informatics (5 credits)
STAT 260 Design and Analysis of Sample Surveys (5 credits)
Other courses may also be acceptable. EPI 201 (see section 2.4.3) will count as one of the ordinally graded courses toward the required 55 credits. Students are advised to consult with the Executive Director about any substitutions.
Core courses
BST 222 Basics of Statistical Inference (5 credits)
BST 260 Introduction to Data Science (5 credits)
BST 261 Data Science II (2.5 credits)
BST 262 Computing for Big Data (2.5 credits)
BST 263 Applied Machine Learning (5 credits)
Epidemiology Requirement
Computing Requirement
BST 234 Introduction to Data Structures and Algorithms (5 credits)
BST 281 Genomic Data Manipulation (5 credits)
BMI 713 Computational Statistics for Biomedical Science (5 credits)
CS 105 Privacy and Technology (5 credits)
CS 164 Software Engineering Computer Science (5 credits)
CS 165 Data Systems (5 credits)
CS 171 Visualization (5 credits)
CS 187 Computational Linguistics (5 credits)
STAT 171 Introduction to Stochastic Processes (5 credits)
Project-Based Research Course
Elective Courses
BST 210 Applied Regression Analysis (5 credits)
BST 223 Applied Survival Analysis (5 credits)
BST 226 Applied Longitudinal Analysis (5 credits)
BST 228 Applied Bayesian Analysis (5 credits)
BST 267 Introduction to Social and Biological Networks (2.5 credits)
BST 270 Reproducible Data Science (2.5 credits)
BST 282 Introduction to Computational Biology and Bioinformatics (5 credits)
BST 283 Cancer Genome Analysis (5 credits)
EPI 202 Elements of Epidemiologic Research: Methods 2 (2.5 credits)
EPI 203 Study Design in Epidemiologic Research (2.5 credits)
EPI 204 Analysis of Case-Control and Cohort Studies (2.5 credits)
EPI 271 Propensity Score Analysis (1.25 credits)
EPI 288 Data Mining and Prediction (2.5 credits)
ID 271 Advanced Regression for Environmental Epidemiology (2.5 credits)
BMI 701 Introduction to Biomedical Informatics (5 credits)
BMI 702 Foundation of Biomedical Informatics II (2.5 credits)
BMI 703 Precision Medicine I: Genomic Medicine (2.5 credits)
BMI 705 Precision Medicine II: Integrating Clinical and Genomic Data (2.5 credits)
BMI 706 Data Visualization for Biomedical Applications (2.5 credits)
CI 722.0 Clinical Data Science: Comparative Effectiveness Research I (2.5 credits)
ME 530M.1 Clinical Informatics (5 credits)
temp
Cloud Computing
Machine Learning
Text Mining - Analysis
Edutainment & Media
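Data Research Methodology
Applied Data Programming
Analytics Project
Databases
Social Media Planning
Social Media and Humans
Learning Science
Object-Oriented Programming
Social Media Analytics
Data and New Media
Media Analytics Project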
Data-Driven Game Design
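Creativity and Data
Applied Data Programming
Algorithms
Game Analytics
Analytics Project
Object-Oriented Programming
Data Science and UX
Serious Game Development and Data Analysis
Data Science and UX
Building and Using the Internet of Things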
Data Mining & Comp Data Sci
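Operating Systems
Object-Oriented Programming
Algorithms
Databases
Applied Data Programming
Data Mining
Advanced Statistics and Regression Analysis
Linear Algebra
Text Mining and Applications
Data Visualization
Computer Vision and Image Processing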
Fintech
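Linear Algebra
Principles of Economics
Organizational Behavior
Marketing Management
Differential Equations
Mathematical Analysis for Finance
Advanced Statistics and Regression Analysis
Computational Finance
Probability and Measure
Fintech Project
Behavioral Finance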
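Minor program
Mathematics 1
Mathematical Simulation 1
Probability and Applications
Computer Programming
Data Structures
Data Science Theory
Statistics
Statistical Programming
Applied Data Programming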
Curriculum Design
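The major sections below describe the curricula of each university's DS program. Broadly speaking, such programs in the US seem to consist mainly of "Math and Stat" and "Comp Sci" courses. For a media department, the curriculum should include content tied to domain (or expert) knowledge of "media" use, but there are few such examples. What follows is a summary, based on the other programs, of what could be offered to students.
The bold parts below outline the stages of the curriculum (summarized as Foundation, Advanced, Synthesis).
The first level under each stage (the parts beginning with a number) is the field (content industry, financial engineering, and so on).
The second level is the sub-field (for example, data journalism under content industry).
The third level is the title of a related course or training (data journalism, for example, requires courses such as data mining, data visualization, . . .).
It would be good if each person contributed to the second and third levels. Anything that later seems unnecessary can be deleted and tidied up, so please feel free to add related courses or content.
How to edit? (Exchanging content by email makes it hard to compile and to see the overall picture, so editing this page directly seems preferable. If that is difficult, email your content to the others and hkim will incorporate it.)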
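The outline described above (stage, field, sub-field, course) can be sketched as a nested mapping. This is only an illustrative sketch: the variable `curriculum` and the helper `add_course` are hypothetical names, and the sample entries are taken from the data journalism example given in the notes.

```python
# Hypothetical representation of the outline: stage -> field -> sub-field -> courses.
# The entries mirror the data journalism example in the notes; nothing else is implied.
curriculum = {
    "Advanced": {
        "Content industry": {
            "Data journalism": ["Data mining", "Data visualization"],
        },
        "Financial engineering": {},
    },
}


def add_course(tree, stage, field, subfield, course):
    """Add a course under stage/field/sub-field, creating missing levels as needed."""
    courses = (
        tree.setdefault(stage, {})
        .setdefault(field, {})
        .setdefault(subfield, [])
    )
    if course not in courses:  # avoid duplicates when several people contribute
        courses.append(course)
    return courses


# A contributor adds a third-level entry (a course) to an existing sub-field.
add_course(curriculum, "Advanced", "Content industry", "Data journalism",
           "Social media analysis")
```

Keeping courses in a list under each sub-field makes it easy to prune entries later, which matches the "add freely now, tidy up later" approach suggested above.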
Foundation
Fundamentals
Mathematics
Statistics
Computing
Databases
Fundamentals of (social) psychology, business, and economics
Big data ethics
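Advanced
Data systems (knowledge of data management systems such as Hadoop; somewhat removed from a media department)
Financial engineering
Content-industry services
Data journalism
Data mining
Social media analysis
Data visualization
Advertising (based on big data analysis)
Data mining
Information retrieval and web search engines
Data visualization
Web services
Web analytics & SEO
Information retrieval and web search engines
Cloud computing
Machine learning
Video and music services
Video and music recommendation services based on data analysis
Cloud computing
Machine learning
Social network analysis
Games
Edutainment
Emerging fields
Healthcare services
Drug discovery
Military services
Investment services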
Synthesis
Project-linked work
Capstone design
Industry-academia collaboration
Industry-academia internships . . .
—-
Berkeley
USC
Curriculum:
Total Units: 32
You must take the following required courses (12 units):
CSCI 570 - Analysis of Algorithms (4)
CSCI 585 - Database Systems (4)
CSCI 561 - Foundations of Artificial Intelligence (4)
Group Electives (3 courses, with a minimum of 1 course from each of the two groups, 9-12 units):
Group 1 - Data Systems:
CSCI 548 - Information Integration on the Web (4)
CSCI 572 - Information Retrieval and Web Search Engines (4)
CSCI 586 - Database Systems Interoperability (3)
CSCI 587 - Geospatial Information Management (4)
CSCI 653 - High Performance Computing and Simulation (4)
CSCI 685 - Advanced Topics in Database Systems (4)
Group 2 - Data Analysis:
CSCI 567 - Machine Learning (4)
CSCI 573 - Probabilistic Reasoning (3)
CSCI 686 - Advanced Big Data Analytics (4)
ISE 520 - Optimization: Theory and Algorithms (3)
MATH 467 - Theory and Computational Methods for Optimization (4)
MATH 574 - Applied Matrix Analysis (3)
Additional Electives (8-11 units):
Any 500 or 600 level course in CSCI
MATH 458 - Numerical Methods (4)
MATH 501 - Numerical Analysis and Computation (3)
MATH 502ab - Numerical Analysis (3-3)
MATH 505a - Applied Probability (3)
MATH 601 - Optimization Theory and Techniques (3)
MATH 650 - Seminar in Statistical Consulting (3)
CSCI 598 - Engineering Writing and Communication (1) AND*
ENGR 596 - Engineering Internship (1, max 3)
CSCI 590 - Directed Research (1-4, max 4)
CSCI 591 - Computer Science Research Colloquium (1, max 2)
*CSCI 598 must be taken BEFORE a student can be approved for ENGR 596.
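The unit arithmetic in the USC listing above (12 required units, 9-12 group elective units, 8-11 additional elective units, 32 total) can be checked with a short sketch. The category names and the `check_plan` helper are hypothetical, and treating 32 as a minimum total is an assumption; the listing only states "Total Units: 32".

```python
# Unit ranges copied from the listing above; category keys are illustrative names.
TOTAL_UNITS = 32  # assumed here to be the minimum total for the degree
UNIT_RANGES = {
    "required": (12, 12),
    "group_electives": (9, 12),
    "additional_electives": (8, 11),
}


def check_plan(plan):
    """Return a list of problems for a plan mapping category -> units taken."""
    problems = []
    for category, (low, high) in UNIT_RANGES.items():
        units = plan.get(category, 0)
        if not low <= units <= high:
            problems.append(f"{category}: {units} units, expected {low}-{high}")
    total = sum(plan.get(category, 0) for category in UNIT_RANGES)
    if total < TOTAL_UNITS:
        problems.append(f"total: {total} units, expected at least {TOTAL_UNITS}")
    return problems


# Three 3-unit group electives leave 11 units to fill with additional electives.
print(check_plan({"required": 12, "group_electives": 9, "additional_electives": 11}))
```

The two elective ranges are complementary: the low end of one paired with the high end of the other gives exactly 32 units (12 + 9 + 11, or 12 + 12 + 8).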
Curriculum - NYU Center for Data Science
Indiana Univ. Bloomington
http://www.soic.indiana.edu/graduate/degrees/data-science/ms-data-science/ms-requirements.html
Sample residential curriculums
Not every course may be offered yet, and a selection of these courses is currently offered online.
Example decision-maker curriculum
Year 1 Semester 1:
I590: Topics in Informatics: Big Data Applications and Analytics
I590: Topics in Informatics: Management, Access, and Use of Big and Complex Data
STAT S520 Introduction to Statistics
Year 1 Semester 2:
B661: Database Theory and System Design
Z637: Information Visualization
B669: Scientific Data Management and Preservation
Year 1 Summer:
Z605: Internship in Library and Information Science
Year 2 Semester 1:
Z604: Data Curation
I525: Organizational Informatics and Economics of Security
I590: Topics in Informatics: Big Data Open Source Software and Projects
Example technical curriculum
Year 1 Semester 1:
B503: Analysis of Algorithms
B561: Advanced Database Concepts
S520: Introduction to Statistics
Year 1 Semester 2:
B649: Cloud Computing
Z534: Search
B555: Machine Learning
Year 1 Summer:
Z605: Internship in Library and Information Science
Year 2 Semester 1:
B565: Data Mining
I520: Security For Networked Systems
Z637: Information Visualization
Example: Computational and Analytic Data Science Track
B649: Cloud Computing
I590: Topics: Projects on Big Data Software
I590: Topics: Data Science for Drug Discovery
I590: Topics: Perspectives in Data Science
Z636: Data Semantics
Z637: Information Visualization
Univ. of Virginia
Course requirements for the MSDS program
Summer Term (approximately 6 weeks, starting mid-July):
Fall Term:
STAT 6021: Linear Models for Data Science
CS 5012: Foundations of Computer Science
SYS 6018: Data Mining
DS 6001: Practice and Application of Data Science
DS 6002a: Ethics of Big Data
DS 6003a: Capstone Project
Spring Term:
SYS 6016: Machine Learning
DS 6002b: Ethics of Big Data
DS 6003b: Capstone Project
Elective 1
Elective 2
Selection of elective courses is done in consultation with the program director. There are a variety of possible electives available, including the following:
CS 6501: Special Topics in Computer Science
CE 6400: Traffic Operations
STAT 6130: Applied Multivariate Statistics
STAT 5390: Exploratory Data Analysis
SYS 6001: Introduction to Systems Engineering
SYS 6003: Optimization I
SYS 6005: Stochastic Systems I
CS 6750: Database Systems
STAT 5170: Applied Time Series
STAT 5340: Bootstrap and Other Resampling Methods
MATH 5110: Stochastic Processes
Minnesota
Statistics Track: 6 credits
Algorithmics Track: 6 credits
Infrastructure Track: 6 credits
Elective Credits: 6
Capstone Credits: 6
Colloquium Credits: 6
Two-semester design
CSCI 5523 - Introduction to Data Mining (3)
CSCI 5707 - Principles of Database Systems (3)
STAT 5302 - Applied Regression Analysis (3)
Capstone Project (First Half) (3)
CSCI 5451 - Introduction to Parallel Computing: Architectures, Algorithms, and Programming (3)
EE 5239 - Introduction to Nonlinear Optimization (3)
STAT 5401 - Applied Multivariate Methods (3)
Elective (3)
Capstone Project (Second Half) (3)
The Open Source Data Science Curriculum
http://datasciencemasters.org
Foundation
Intro to Data Science (UW / Coursera)
Data Science (Harvard Video Archive & Course)
Data Science with Open Source Tools (Book, $27)
Math
Linear Algebra & Programming
Statistics
Differential Equations & Calculus
Problem Solving (Problem-Solving Heuristics “How To Solve It”)
Computing
Algorithms
Distributed Computing Paradigms
Databases
Data Mining
Machine Learning
Probabilistic Modeling
Deep Learning (Neural Networks)
Social Network & Graph Analysis
Natural Language Processing
Analysis
Data Design
Visualization
Data Journalism
Python (Libraries)
Data Structures & Analysis Packages
Machine Learning Packages
Networks Packages
Statistical Packages
Natural Language Processing & Understanding
Live Data Packages
Visualization Packages
iPython Data Science Notebooks
Data Science as a Profession
Capstone Project
Udacity