CIS 732 (Topics in Computer Science)

Machine Learning and Pattern Recognition

Fall, 2001

 

Credit: 3 or 4 hours (the additional hour requires a class project or term paper)

Prerequisite: CIS 300 (Algorithms and Data Structures) or equivalent; basic course in probability and statistics recommended

Textbook: Machine Learning, T. M. Mitchell.  McGraw-Hill, 1997. ISBN: 0070428077

Instructor:

William H. Hsu, Department of Computing and Information Sciences

Office: 213 Nichols Hall                                   E-mail: bhsu@cis.ksu.edu

Office phone: 532-6350                                   Home phone: 539-7180

Office hours: after class; 1-3pm Monday, 9-11am Wednesday; by appointment

 

 

Course Objectives

 

                This is an introductory course in machine learning for the development of intelligent, knowledge-based systems.  It is intended for students interested in the computational theory of learning, pattern recognition, artificial intelligence, and applications such as database mining, classification, knowledge acquisition and learning for expert system development, and plan generation and recognition.  Interested undergraduates who have had basic courses in software development and algorithm design, and who feel comfortable with basic probability, may also find this class useful.

 

The first half of the course will focus on basic taxonomies and theories of learning, algorithms for concept learning, statistical learning, knowledge representation, and reasoning under uncertainty.  The second half of the course will survey fundamental topics in combining multiple models, learning for plan generation, decision support, knowledge discovery and data mining, control and optimization, and learning to reason.

 

The course will include written and programming assignments (5 total) and a term project option for graduate students.  Ancillary readings will be assigned; students will write a brief review and commentary for one of these papers every other lecture.

 

 

Class Resources

 

Web pages

·         Official class page: http://ringil.cis.ksu.edu/Courses/Fall-2001/CIS732

·         Instructor’s home page: http://www.cis.ksu.edu/~bhsu

 

Note: It is the student’s responsibility to be aware of class announcements and materials posted on the official class page, so please check it frequently.

 

Course notes

Required readings, along with reference manuals and tutorials for software used in the course, will be available for purchase (in 2 packets) from the Engineering Copy Center.

Class newsgroup

·         Tentative title: ksu.class.cis732whh

·         Announcements from the web page will be re-posted here

·         Primary purpose: for class discussions (among students and with instructor)

 

Mailing list

·         Tentative title: CIS732WHH-L@cis.ksu.edu

·         Optional (no course-critical announcements)

·         For research announcements, related job opportunities, etc.

·         Sign up if interested

 

Selected readings (recommended books, on reserve in CIS Library):

·         Artificial Intelligence: A Modern Approach, S. J. Russell and P. Norvig.  Prentice Hall, 1995.      ISBN: 0131038052

·         Readings in Machine Learning, J. W. Shavlik and T. G. Dietterich, eds.  Morgan Kaufmann, 1990. ISBN: 1558601430

 

Additional bibliography (excerpted in course notes and handouts):

·         Readings in Knowledge Acquisition and Learning, B. G. Buchanan and D. C. Wilkins, eds.  Morgan Kaufmann, 1993.  ISBN: 1558601635

·         Readings in Uncertain Reasoning, G. Shafer and J. Pearl, eds.  Morgan Kaufmann, 1990.  ISBN: 1558601252

·         Genetic Algorithms in Search, Optimization, and Machine Learning, D. E. Goldberg.  Addison-Wesley, 1989.  ISBN: 0201157675

·         Neural Networks for Pattern Recognition, C. M. Bishop.  Oxford University Press, 1995.  ISBN: 0198538499

·         Genetic Programming: On The Programming of Computers by Means of Natural Selection, J. Koza.  MIT Press, 1992.  ISBN: 0262111705

 

 

Homework Assignments and Course Project

 

                Homework assignments will be given out 1 to 4 weeks apart (2½ weeks on average), for a total of 5.  Your lowest score will be dropped (see below).  Two of these homeworks will be entirely written, two will be entirely programming-based, and one will require you to run (and possibly modify) an existing program on sample data and analyze the results.

 

                For programming assignments, you may use the high-level programming language of your choice (C++ and Java are strongly preferred; consult the instructor if you intend to use any other language).  You must, however, use a development environment that is available in the CIS department.  Consult the class web page for approved compilers.

 

                Graduate students and advanced undergraduates interested in working on a class project may elect an additional 1 hour of credit and either turn in a term paper or work on an independent implementation or research project.  You may sign up for this option any time before September 23, 2001 (talk to me during office hours or send e-mail).  Suggested project topics and guidelines will be posted on the course web page.  Examples include: a small data mining experiment; an in-depth comparison of two or three techniques studied in the course; or improving an existing learning algorithm or analyzing it formally.

 

 

No-Cheating Policy

 

Cheating consists of misrepresenting another’s work as your own.  It includes not only copying of test answers, but also plagiarism of another person’s written material.  While you are encouraged to discuss class material, homework problems, and projects with your classmates, the work you turn in must be entirely your own.  For homework assignments, this means that if you work together with a fellow student, you should still produce the final, written material from your own notes and individual work, rather than from common notes that you produced together.  You should follow similar guidelines for programming assignments and individual projects; while reuse of previously developed source code may be permitted in these cases (provided you acknowledge the authors appropriately), you must not directly use code developed by fellow students.  Please consult the new University honor code (http://www.ksu.edu/honor) for further guidelines on ethical conduct, and understand the regulations and penalties for violating it.

 

The code that you are permitted to use on certain assignments may be restricted beyond what the plagiarism standards above require.  When in doubt about whether you may use a particular program on a written or programming assignment, consult the instructor first.  My objective is to help you learn as much as possible from the assignments; sometimes this means that I want you to use existing code, and sometimes I will prefer that you develop it yourself, to better understand the techniques.

 

Grading

 

                Credit for the course will be distributed as follows:

 

Component                                     Quantity   Low Scores Dropped   Points Each   Value
Paper Reviews and Commentaries                   13              3                 15         15%
Homework (Written/Programming Assignments)        5              1                125         50%
Midterm Exam                                      1              0                150         15%
Final Exam                                        1              0                200         20%

(All components together total 1000 points.)

 

                Homework and exams may contain extra credit problems.
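
To make the arithmetic above concrete, here is a minimal sketch in Python of how the drop-lowest policy combines into the 1000-point total.  It is illustrative only; the function name and all sample scores below are made up, not actual data.

    # Illustrative sketch; all sample scores are hypothetical.
    def component_total(scores, num_dropped):
        """Sum the scores that remain after dropping the lowest num_dropped."""
        return sum(sorted(scores)[num_dropped:])

    reviews = [12, 14, 15, 10, 13, 15, 11, 14, 15, 13, 12, 15, 14]  # 13 reviews, 15 pts each
    homeworks = [110, 95, 120, 125, 100]                            # 5 homeworks, 125 pts each
    midterm, final = 130, 170                                       # out of 150 and 200

    total = (component_total(reviews, num_dropped=3)      # 3 lowest reviews dropped (max 150)
             + component_total(homeworks, num_dropped=1)  # lowest homework dropped (max 500)
             + midterm + final)                           # course total, out of 1000
    print(total)  # 895 for these sample scores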

 

Late policy: Homeworks are due on Thursdays; you may request an extension to the following Tuesday, provided you ask by the due date (though I recommend that you not take this option).  10% credit will be deducted for each day an assignment is late past that Tuesday.
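
As a sketch of how that deduction works (assuming, as one reading of the policy, that 10% of the assignment's raw score is deducted per day past the extended Tuesday deadline; the function below is illustrative, not an official calculator):

    # Illustrative sketch; assumes the 10%/day deduction applies to the raw score.
    def late_score(raw_score, days_past_tuesday):
        """Score after the late penalty; never below zero."""
        penalty = 0.10 * raw_score * max(0, days_past_tuesday)
        return max(0.0, raw_score - penalty)

    print(late_score(100, 0))  # turned in by the Tuesday extension: 100.0
    print(late_score(100, 2))  # two days past Tuesday: 80.0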

 

                Letter grades will be assigned based on the distribution of raw scores (“curved”).  Undergraduate and graduate students will be graded on the same curve.  Earning 85% of the possible points, however, guarantees an A; 70%, a B; 55%, a C.  Actual scales may be more generous than this if called for, but are not expected to be.

 

                If you elect the additional 1-hour project option (for 4 hours of credit), your project grade will count for 25% of your course grade, and the remaining components will be scaled proportionately (75% allocated to the components above, 25% to the project).
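
As a sketch combining the guaranteed thresholds with the 4-credit weighting (illustrative only; it computes the guaranteed floor, and the actual curve may be more generous):

    # Illustrative sketch; computes only the guaranteed minimum grade.
    def guaranteed_letter(fraction):
        """Map a fraction of total points to the guaranteed grade floor."""
        if fraction >= 0.85:
            return "A"
        if fraction >= 0.70:
            return "B"
        if fraction >= 0.55:
            return "C"
        return "determined by the curve"

    def four_hour_fraction(base, project):
        """4-credit option: 75% regular components, 25% project."""
        return 0.75 * base + 0.25 * project

    print(guaranteed_letter(four_hour_fraction(0.88, 0.80)))  # 0.86 -> A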

 

 

Paper Reviews and Commentaries

 

                An important part of learning about intelligent systems and machine learning, whether for research or development applications, is understanding the state of the field and the repercussions of important results.  The readings in this course are designed not only to give you a set of tutorials and references for machine learning tools and techniques, but also to present the subject as a unified whole and to encourage you to think more deeply about the practical and theoretical issues.

 

                Toward this end, I have selected 13 papers from your two course notes packets; the first 6 are in the first packet and the last 7 are in the second.  At the last lecture of each week, from the second week through the fourteenth (two weeks before the final exam), you should bring a short review of, and commentary on, the assigned paper.  This commentary need be no longer than 1 page (though you may go up to 2 pages if you feel you have something meaningful to add).

 

This review is an important part of the course, because it can:

 

·         help you to review and rehearse material from lecture

·         bring to light questions that you may have about the material

·         improve your ability to articulate what you have learned

·         help guide the lecture/discussion

·         help you to think about working on projects (implementations or research) in this field

 

                Here are some guidelines on writing the reviews:

 

1.        Try to be brief and concise.

2.        Concentrate on pointing out the paper’s main strengths and flaws, both in content (key points, accuracy, impact/implications, deficiencies) and in presentation (organization, clarity/density, interest).  Try not to merely summarize the paper.

3.        Some questions to address (typically a couple per paper):

·         Is the paper of sufficiently broad interest?  What do you think its intended audience is?  To what audience do you think the paper is significant?

·         What makes the paper significant or insignificant?

·         How could the presentation be improved to better point out important implications?

·         Is the paper technically sound?  How so, or in what areas is it not entirely sound?

·         What novel ideas can we pick up (about the topics covered in lecture) from the paper?

4.        Comment on how the paper (or the topic) affects your own work.  How is it relevant (or irrelevant) to you?

5.        How might the research be improved in light of how the field has progressed since it was published?  Some of these papers were catalysts for research in their areas, so it is sometimes infeasible to second-guess their authors; but comment on what could be done better today.

 

The first 6 papers for which you will write short reviews are:

 

[Va84] L. G. Valiant.  A Theory of the Learnable.  Communications of the ACM, 27(11):1134-1142, 1984; reprinted in Readings in Knowledge Acquisition and Learning, B. G. Buchanan and D. C. Wilkins, editors.  Morgan Kaufmann, San Mateo, CA, 1993.

 

[Mi80] T. M. Mitchell.  The Need for Biases in Learning Generalizations.  Technical Report CBM-TR-117, Department of Computer Science, Rutgers University, New Brunswick, NJ, 1980; reprinted in Readings in Machine Learning, J. W. Shavlik and T. G. Dietterich, editors, pp. 184-191.  Morgan Kaufmann, San Mateo, CA, 1990.

 

[MUB83] T. M. Mitchell, P. E. Utgoff, and R. Banerji.  Learning by Experimentation: Acquiring and Refining Problem-Solving Heuristics.  In Machine Learning: An Artificial Intelligence Approach, R. S. Michalski, J. G. Carbonell, and T. M. Mitchell, editors, pp. 163-190.  Morgan Kaufmann, San Mateo, CA, 1983; reprinted in Readings in Knowledge Acquisition and Learning, B. G. Buchanan and D. C. Wilkins, editors.  Morgan Kaufmann, San Mateo, CA, 1993.

 

[Qu95] J. R. Quinlan.  MDL and Categorical Theories (Continued).  In Proceedings of the Twelfth International Conference on Machine Learning (ICML-95).  Morgan Kaufmann, San Mateo, CA, 1995.

 

[Ro99] D. Roth.  Learning in Natural Language.  In Proceedings of the 1999 International Joint Conference on Artificial Intelligence (IJCAI-99), Stockholm, Sweden, 1999.

 

[PV91] J. Pearl and T. S. Verma.  A Theory of Inferred Causation.  In Proceedings of the Second International Conference on Principles of Knowledge Representation and Reasoning (KR-91), J. F. Allen, R. Fikes, and E. Sandewall, editors, pp. 441-452.  Morgan Kaufmann, San Mateo, CA, 1991.

 

Class Calendar

 

Lecture   Date           Topic                                     Source
   0      August 24      Administrivia; overview of learning       TMM Chapter 1
   1      August 26      Concept learning, version spaces          TMM 2
   2      August 31      Inductive bias, PAC learning              TMM 2, 7.1-3; notes
   3      September 2    PAC, VC dimension, error bounds           TMM 7.4.1-3, 7.5.1-3
   4      September 7    Decision trees; using MLC++               TMM 3; RN 18
   5      September 9    Decision trees, overfitting, Occam        TMM 3
   6      September 14   Perceptrons, Winnow                       TMM 4
   7      September 16   Multi-layer perceptrons, backprop         TMM 4; CB; notes
   8      September 21   Estimation and confidence intervals       TMM 5
   9      September 23   Bayesian learning: MAP, max likelihood    TMM 6
  10      September 28   Bayesian learning: MDL, BOC, Gibbs        TMM 6
  11      September 30   Naïve Bayes; prob. learning over text     TMM 6; notes
  12      October 5      Bayesian networks                         TMM 6; RN 14-15
  13      October 7      Bayesian networks                         TMM 6; paper
  14      October 12     Bayesian networks; midterm review         TMM 1-7; RN 14-15, 18
  15      October 14     Midterm Exam                              (Paper)
  16      October 19     EM, unsupervised learning                 TMM 6
  17      October 21     Time series and stochastic processes      Notes
  18      October 26     Policy learning; MDPs                     RN 16-17
  19      October 28     Reinforcement learning I                  TMM 13; RN 20; papers
  20      November 2     Reinforcement learning II                 TMM 13
  21      November 4     Neural computation                        Papers; RN 19
  22      November 9     Combining classifiers (WM, bagging)       TMM 7
  23      November 11    Boosting                                  TMM 9; papers
  24      November 16    Introduction to genetic algorithms        TMM 9; DEG
  25      November 18    Genetic programming                       TMM 9; JK; papers
  26      November 23    IBL, k-nearest neighbor, RBFs             TMM 8.1-4
  27      November 30    Rule learning and extraction              TMM 10; paper
  28      December 2     Inductive logic programming               TMM 10; RN 21
  29      December 7     Data mining/KDD: application survey       Notes (no paper)
  30      December 9     Final review                              TMM 1-10, 13; RN 14-21
  31      December 14    FINAL EXAM

 

 

TMM: Machine Learning, T. M. Mitchell

RN: Artificial Intelligence: A Modern Approach, S. J. Russell and P. Norvig

DEG: Genetic Algorithms in Search, Optimization, and Machine Learning, D. E. Goldberg

CB: Neural Networks for Pattern Recognition, C. M. Bishop

JK: Genetic Programming: On The Programming of Computers by Means of Natural Selection, J. Koza

 

Note: paper reviews are due at the last lecture of each week, from the second week through the fourteenth (see Paper Reviews and Commentaries above).

 

Tentative schedule for homeworks:

1.        Assigned Tuesday, August 31; due Thursday, September 16

2.        Assigned Tuesday, September 21; due Thursday, October 21

3.        Assigned Tuesday, October 26; due Thursday, November 4

4.        Assigned Tuesday, November 9; due Thursday, November 18

5.        Assigned Tuesday, November 23; due Thursday, December 7