CIS 732: Machine Learning and Pattern Recognition
Fall 2003
Hours: 3 hours (extended CIS732 course project option
available)
Prerequisite: CIS 501 or equivalent coursework in data structures
and algorithms; CIS 301 (set theory/logic) and 575 (algorithm design), Math 510
(discrete math), Stat 410 (intro probability) strongly recommended
Textbook: Machine Learning,
T. M. Mitchell. McGraw-Hill, 1997.
ISBN: 0070428077
Time and Venue: Mon, Wed, Fri 13:30
– 14:20, Room 127 Nichols Hall (N127)
Instructor: William H. Hsu, Department of Computing and
Information Sciences
Office: N213 Office phone: (785) 532-6350 x29 Home phone: (785) 539-7180
URL: http://www.cis.ksu.edu/~bhsu
E-mail: cis732ta@www.kddresearch.org
Office hours: 10:30-11:30 Mon, Wed,
Fri;
by appointment, Friday after
16:00
Class web page: http://www.kddresearch.org
/Courses/Fall-2003/CIS732/
An introductory course in
machine learning for development of intelligent knowledge based systems. The first half of the course will focus on
basic taxonomies and theories of learning, algorithms for concept learning,
statistical learning, knowledge representation, pattern recognition, and
reasoning under uncertainty. The second
half of the course will survey some basic topics in combining multiple models,
learning from time series, learning to reason, and selected applications in
knowledge discovery and data mining, especially in bioinformatics.
The course will include several written and programming
assignments and a term project option for graduate students. Ancillary readings will be assigned;
students will write a brief synopsis and review for one of these papers every
other lecture.
Homework: 3 out of 4 programming and written assignments
(15%)
Examinations: 1 in-class closed-book midterm (20%), 1 in-class
open-book final (25%)
Computer
language(s): C/C++, Java, or student
choice (upon instructor approval)
Project: research project and report for all students (25%)
Paper
reviews: 10 out of 13 paper reviews
(10%)
Class
participation: class and online discussions, asking and answering questions (5%)
·
Artificial Intelligence: A
Modern Approach, 2nd edition, S. J. Russell and P. Norvig. Prentice-Hall, 2003. ISBN: 0137903952
·
Readings in Machine Learning, J. W. Shavlik and T. G.
Dietterich, eds. Morgan Kaufmann, 1990.
ISBN: 1558601430
·
Readings in Computer Inference
and Knowledge Acquisition, B. G. Buchanan and D. C. Wilkins, eds.
Morgan Kaufmann, 1993. ISBN:
1558601635
·
Genetic Algorithms in Search,
Optimization, and Machine Learning, D. E. Goldberg.
Addison-Wesley, 1989. ISBN:
0201157675
·
Genetic Programming: On The Programming of Computers by Means of Natural
Selection, J.
Koza. MIT Press, 1992. ISBN: 0262111705
Lecture |
Date |
Topic |
(Primary)
Source |
0 |
2003 Aug 20 |
Administrivia; overview of learning |
TMM Chapter 1 |
1 |
2003 Aug 22 |
Concept learning, version spaces |
TMM 2 |
2 |
2003 Aug 25 |
Inductive bias |
TMM 2 |
3 |
2003 Aug 27 |
Decision trees; using MLC++ |
TMM 3; RN 18 |
4 |
2003 Aug 29 |
Decision trees |
TMM 3 |
5 |
2003 Sep 03 |
Overfitting and Occam’s Razor |
TMM 3 |
6 |
2003 Sep 05 |
Perceptrons |
TMM 4 |
7 |
2003 Sep 08 |
Winnow |
TMM 4 |
8 |
2003 Sep 10 |
Multi-layer perceptrons |
TMM 4; handouts |
9 |
2003 Sep 12 |
Estimation and confidence intervals |
TMM 5 |
10 |
2003 Sep 15 |
Bayesian learning: MAP, ML |
TMM 6 |
11 |
2003 Sep 17 |
Bayesian learning: MDL, BOC, Gibbs |
TMM 6 |
12 |
2003 Sep 19 |
Naïve Bayes; prob. learning over text |
TMM 6; handouts |
13 |
2003 Sep 22 |
Graphical models: inference basics |
TMM 6; RN 14-15 |
14 |
2003 Sep 26 |
Graphical models: learning
from data 1 |
TMM 6; paper |
15 |
2003 Sep 29 |
Graphical models: learning from data 2 |
TMM 6 |
16 |
2003 Oct 01 |
Unsupervised learning: AutoClass,
SOM |
TMM 6 |
17 |
2003 Oct 03 |
Expectation-Maximization (EM) |
TMM 6 |
18 |
2003 Oct 06 |
COLT 1: PAC learning |
TMM 2, 7.1-3; handouts |
19 |
2003 Oct 08 |
COLT 2: PAC, VC dimension, |
TMM 7.4.1-3 |
20 |
2003 Oct 10 |
COLT 3: Error bounds |
TMM 7.5.1-3 |
21 |
2003 Oct 15 |
Midterm Review |
TMM 1-7; RN 14-15, 18 |
22 |
2003 Oct 17 |
Midterm Exam
|
– |
23 |
2003 Oct 20 |
Reinforcement learning I |
TMM 13; RN 20 |
24 |
2003 Oct 22 |
Reinforcement learning II |
TMM 13; papers |
25 |
2003 Oct 24 |
Dynamic models I: value/policy iteration |
RN 16 |
26 |
2003 Oct 27 |
Dynamic models II: DBNs |
RN 17 |
27 |
2003 Oct 29 |
Markov Decision Processes |
RN 17 |
28 |
2003 Oct 31 |
Combining classifiers (WM, bagging) |
TMM 7 |
29 |
2003 Nov 03 |
Boosting |
TMM 9; papers |
30 |
2003 Nov 05 |
Introduction to GEC |
TMM 9 |
31 |
2003 Nov 07 |
Genetic algorithms |
DEG |
32 |
2003 Nov 10 |
Genetic programming 1 |
JK; paper |
33 |
2003 Nov 12 |
Genetic programming 2 |
JK; paper |
34 |
2003 Nov 14 |
IBL, k-nearest neighbor |
TMM 8.1-4 |
35 |
2003 Nov 17 |
Rule learning and extraction |
Paper |
36 |
2003 Nov 19 |
Inductive logic programming |
TMM 10; RN 21 |
37 |
2003 Nov 21 |
Support vector machines |
New |
38 |
2003 Nov 24 |
Applications 1: Decision Support |
– |
39 |
2003 Nov 26 |
Applications 2: Commercial KDD |
– |
40 |
2003 Dec 01 |
Applications 3: Bioinformatics |
– |
41 |
2003 Dec 03 |
Applications 4: KDD Cup |
– |
42 |
2003 Dec 05 |
Applications 5: Machine Translation |
– |
43 |
2003 Dec 08 |
Applications 6: Advanced Topics |
– |
44 |
2003 Dec 10 |
Final review |
TMM 1-10, 13; RN 14-21 |
|
TBD |
FINAL EXAM
|
TMM 1-10, 13 |
TMM: Machine Learning, T. M. Mitchell
RN: Artificial Intelligence: A Modern Approach, S. J. Russell and P.
Norvig
DEG: Genetic Algorithms in Search, Optimization, and Machine Learning,
D. E. Goldberg
JK: Genetic Programming, J. Koza
Lightly-shaded entries denote
the (Friday) due date of a written problem set.
Heavily-shaded entries
denote the (Friday) due date of a machine problem (programming assignment).
Green-shaded entries
denote the (Monday) due date of a paper review.