Syllabus

 

Lecture  Date         Topic                                  (Primary) Source
0        18 Jan 2002  Administrivia; overview of topics      RN Chapters 1-9, 14, 18
1        23 Jan 2002  AI topics review                       RN Chapters 1-9, 14, 18
2        28 Jan 2002  Introduction to machine learning       TMM 2-5, 7; RN 18
3        30 Jan 2002  Data mining basics                     WF 1-2
4        04 Feb 2002  G: Intro to Genetic Algorithms         TMM 9, Goldberg
5        06 Feb 2002  B: Bayesian Networks                   TMM 6, Goldberg
6        11 Feb 2002  N: Artificial Neural Networks (ANNs)   TMM 6, Goldberg
7        13 Feb 2002  K: DM/KDD/decision support overview    WF 3-4
8        18 Feb 2002  Discussion K1: 1 of 12                 WHH
9        20 Feb 2002  Presentation K1: 1 of 12               Roby Joehanes
10       25 Feb 2002  Discussion A1: 2 of 12                 WHH
11       27 Feb 2002  Presentation A1: 2 of 12               Siddharth Chandak
12       04 Mar 2002  Discussion B1: 3 of 12                 WHH
13       06 Mar 2002  Presentation B1: 3 of 12               WHH
14       11 Mar 2002  Discussion G1: 4 of 12                 WHH
15       13 Mar 2002  Presentation G1: 4 of 12               WHH
16       25 Mar 2002  Midterm review                         RN Chapters 1-9, 14, 18-21
17       27 Mar 2002  Midterm exam                           RN Chapters 1-9, 14, 18-21
18       01 Apr 2002  Discussion/presentation K2: 5 of 12    WHH
19       03 Apr 2002  Discussion/presentation A2: 6 of 12    WHH
20       08 Apr 2002  Discussion/presentation B2: 7 of 12    Yousheng Chang
21       10 Apr 2002  Discussion/presentation G2: 8 of 12    Indira Mohanty
22       15 Apr 2002  Discussion/presentation K3: 9 of 12    Vinod Chandana
23       17 Apr 2002  Discussion/presentation A3: 10 of 12   WHH
24       22 Apr 2002  Discussion/presentation B3: 11 of 12   WHH
25       24 Apr 2002  Discussion/presentation G3: 12 of 12   Sreenivas Babu
26       29 Apr 2002  Conclusion                             TBD
27       01 May 2002  Project presentations I; projects due  N/A
28       06 May 2002  Project presentations II               N/A
29       08 May 2002  Project presentations III;             N/A
30       10 May 2002  NO CLASS; REVIEWS DUE                  N/A

 

WF: Data Mining, I. H. Witten and E. Frank

RN: Artificial Intelligence: A Modern Approach, S. J. Russell and P. Norvig

TMM: Machine Learning, T. M. Mitchell

 

Course Requirements

 

Homework: 3 of 4 programming and written assignments (15%)

Papers: 8 (out of 12) written (1-page) reviews of research papers (8%); project presentations (10%)

Class participation: in-class discussion, quiz (2%)

Examinations: 1 in-class midterm (25%), no final exam

Computer language(s): C/C++, Java, or student choice (upon instructor approval)

Project: term programming project for all students (40%); additional term paper or project extension (4 credit hour) option for graduate students and advanced undergraduates

 

 

Class Resources

 

Web pages

·         Official class page: http://www.kddresearch.org/Courses/Spring-2002/CIS830

·         Instructor’s home page: http://www.cis.ksu.edu/~bhsu

 

Note: It is the student’s responsibility to be aware of class announcements and materials posted on the official class page, so please check it frequently.

 

Course notes

Required readings, along with reference manuals and tutorials for software used in the course, will be available for purchase (in 2 packets) from the Engineering Copy Center in 14 Seaton Hall.

 

Class web board

·         URL: http://groups.yahoo.com/group/ksu-cis830-spring2002/

·         Primary purpose: for class discussions (among students and with instructor)

 

 

Note: Postings on the web board will tend to get a more rapid response from the instructor than e-mail, and they are sometimes of benefit to fellow students as well.

 

Homework Assignments and Course Project

 

                Homework assignments will be given out 2 to 3 weeks apart, for a total of 4.  Your lowest score will be dropped (see below).  One of these homeworks will be programming-based; one will require you to run (and possibly modify) an existing library or KDD package using a specification or sample data, and analyze the results; and two will be written.

 

                Type (do not hand-write) homeworks; handwritten solutions are worth only 0.8 of the credit.

 

                For programming assignments and the course project, you are permitted to use your choice of a high-level programming language (C++ and Java are strongly preferred; consult the instructor if you intend to use any other programming language).  You must, however, use a development environment that is available to the CIS department.  Consult the class web page for approved compilers.

 

                Graduate students and advanced undergraduates interested in working on a class project may elect an additional 1 hour of credit as a section of CIS 798 (Special Topics in Computer Science) and either turn in a term paper or work on an extension of the course project or a small-scale independent study project.  You may sign up for this option any time before February 14, 2002 (talk to me during office hours or send e-mail).  Suggested project topics and guidelines will be posted on the course web page.  Examples include: improving a known supervised learning algorithm; developing an algorithm for combining classifiers; a KDD application of DBMS or OLAP, in support of an MSE concentration in database systems; a project on analysis of time series or document databases (e.g., text or source code); an in-depth comparison of two KDD techniques studied in the course; or improving an existing model or analyzing it formally.

No-Cheating Policy

 

Cheating consists of misrepresenting another’s work or knowledge as your own.  It includes not only copying of test answers, but also plagiarism of another person’s written material.  While you are encouraged to discuss class material, homework problems, and projects with your classmates, the work you turn in must be entirely your own.  For homework assignments, this means that if you work together with a fellow student, you should still produce the final, written material from your own notes and individual work, rather than from common notes that you produced together.  You should follow similar guidelines for programming assignments and individual projects; while reuse of previously developed source code may be permitted in these cases (provided you acknowledge the authors appropriately), you must not directly use code developed by fellow students.  Please consult the University honor code (http://www.ksu.edu/honor) for further guidelines on ethical conduct, and understand the regulations and the penalties for violating them.

 

The code that you are permitted to use on certain assignments may be limited beyond what plagiarism standards alone require.  When in doubt about whether you may use a particular program on a written or programming assignment, consult the instructor first.  My objective is to help you learn as much as possible from the assignments; sometimes this means that I want you to use existing code, and sometimes I will prefer that you develop it yourself, to better understand the techniques.

 

Grading

 

                Credit for the course will be distributed as follows:

Component                                   Quantity  Low Scores Dropped  Points Each (Out of 1000)  Value
Homework (Written/Programming Assignments)  4         1                   50                         15%
Paper Reviews and Commentaries              12        8                   10                         10%
Presentation                                1         0                   150                        15%
Midterm Exam (Take-Home, Open-Book)         1         0                   250                        25%
Course Project                              1         0                   500                        50%

 

                Homework and exams may contain extra credit problems.

 

Late policy: Homeworks are due at 5:00pm on Fridays.  If you need an extension to the following Monday, you must request it by the due date (but I recommend you do not take this option).  10% credit will be deducted for each day the assignment is late past 5:00pm that Monday.  There will be no additional extensions!
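
As an illustration only (not an official grading tool), the late penalty can be sketched in Python.  The function name late_credit_percent is hypothetical, and the sketch makes two assumptions that the policy above does not state: each started 24-hour period past 5:00pm Monday counts as one full day, and credit never drops below 0%.

```python
import math


def late_credit_percent(hours_past_monday_deadline: float) -> int:
    """Percent of credit remaining for a homework submitted after the
    extended deadline (5:00pm on the Monday after the Friday due date).

    Assumptions (not stated in the policy): a partial day counts as a
    full day late, and the remaining credit is floored at 0.
    """
    if hours_past_monday_deadline <= 0:
        return 100  # on time (with the Monday extension): full credit
    days_late = math.ceil(hours_past_monday_deadline / 24)
    return max(0, 100 - 10 * days_late)  # 10% deducted per day late
```

Under these assumptions, a homework turned in at 6:00pm on Monday (1 hour late) would keep 90% of its credit, and one turned in at 6:00pm on Tuesday (25 hours late) would keep 80%.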

 

                Letter grades will be assigned based on the distribution of raw scores (“curved”). Undergraduate and graduate students will be graded on the same curve.  Acquiring 85% of the possible points, however, guarantees an A; 70%, a B; 55%, a C.  Actual scales may be more generous than this if called for, but are not expected to be.
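
The guaranteed cutoffs above can be checked with a short Python sketch.  The function name guaranteed_grade is hypothetical, and the default of 1000 possible points is an assumption taken from the grading table; the sketch returns only the grade that is guaranteed, since the actual letter grade depends on the curve and may be higher.

```python
from typing import Optional


def guaranteed_grade(points: int, possible: int = 1000) -> Optional[str]:
    """Return the letter grade guaranteed by the fixed cutoffs
    (85% -> A, 70% -> B, 55% -> C), or None if no grade is guaranteed
    and the result depends entirely on the curve.
    """
    if possible <= 0:
        raise ValueError("possible must be positive")
    for cutoff_percent, letter in ((85, "A"), (70, "B"), (55, "C")):
        # Integer comparison avoids floating-point rounding at the cutoffs.
        if points * 100 >= cutoff_percent * possible:
            return letter
    return None
```

For example, guaranteed_grade(850) gives "A" and guaranteed_grade(849) gives "B"; a score below 550 returns None, meaning the grade is set by the curve alone.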

 

                If you elect to take an additional CIS 798 project option (for 1 hour of credit), your grade for CIS 830 will still be assigned based only on the above components.  The additional project component will be graded separately (as CIS 798) and weighted proportionately.

 

Paper Reviews and Commentaries

 

                An important part of learning about machine learning and data mining systems, whether for research or development applications, is understanding the state of the field and the repercussions of important results.  The readings in this course are designed not only to give you a set of tutorials and references for machine learning tools and techniques, but also to demonstrate the subject as a unified whole, and to encourage you to think more deeply about the practical and theoretical issues.

 

                Toward this end, I have selected 4 papers out of those in your (2) course notes packets.  The first 2 of these are in the first packet and the last 2 are in the second.  Before you come to lecture on the dates indicated on the class calendar, you should submit (by e-mail to the instructor) a short review of, and commentary on, the assigned paper.  This commentary need be no longer than 2 pages (though you can go up to 3 pages if you feel you have something meaningful to add).

 

This review is an important part of the course, because it can:

 

·          help you to review and rehearse material from lecture

·          bring to light questions that you may have about the material

·          improve your ability to articulate what you have learned

·          help guide the lecture/discussion

·          help you to think about working on projects (implementations or research) in this field

 

                Here are some guidelines on writing the reviews:

 

1.       Try to be brief and concise.

2.       Concentrate on pointing out the paper’s main strengths and flaws, both in content (key points, accuracy, impact/implications, deficiencies) and in presentation (organization, clarity/density, interest).  Try not to merely summarize the paper.

3.       Some questions to address (typically a couple in each paper):

·         Is the paper of sufficiently broad interest?  What do you think its intended audience is?  To what audience do you think the paper is significant?

·         What makes the paper significant or insignificant?

·         How could the presentation be improved to better point out important implications?

·         Is the paper technically sound?  How so, or in what areas is it not entirely sound?

·         What novel ideas can we pick up (about the topics covered in lecture) from the paper?

4.       Comment on how the paper (or the topic) affects your own work.  How is it relevant (or irrelevant) to you?

5.       How might the research be improved in light of how the field has progressed since it was published?  Some of these papers were catalysts for research in their areas, so it is sometimes infeasible to second-guess their authors; but comment on what could be done better today.

 

Paper reviews are late (worth 0 credit) after midnight of the day of the lecture when they are due (i.e., you must submit them before 12:00am Tuesday, Thursday, or Saturday).

 

Do not plagiarize.  It is relatively easy to detect plagiarism of material from the paper itself, related references, and paper reviews of classmates!  Again, refer to http://www.ksu.edu/honor for regulations and further guidelines on academic honesty.