Lecture | Date | Topic | (Primary) Source
0 | 18 Jan 2002 | Administrivia; overview of topics | RN Chapters 1-9, 14, 18
1 | 23 Jan 2002 | AI topics review | RN Chapters 1-9, 14, 18
2 | 28 Jan 2002 | Introduction to machine learning | TMM 2-5, 7; RN 18
3 | 30 Jan 2002 | Data mining basics | WF 1-2
4 | 04 Feb 2002 | G: Intro to Genetic Algorithms | TMM 9, Goldberg
5 | 06 Feb 2002 | B: Bayesian Networks | TMM 6, Goldberg
6 | 11 Feb 2002 | N: Artificial Neural Networks (ANNs) | TMM 6, Goldberg
7 | 13 Feb 2002 | K: DM/KDD/decision support overview | WF 3-4
8 | 18 Feb 2002 | Discussion K1: 1 of 12 | WHH
9 | 20 Feb 2002 | Presentation K1: 1 of 12 | Roby Joehanes
10 | 25 Feb 2002 | Discussion A1: 2 of 12 | WHH
11 | 27 Feb 2002 | Presentation A1: 2 of 12 | Siddharth Chandak
12 | 04 Mar 2002 | Discussion B1: 3 of 12 | WHH
13 | 06 Mar 2002 | Presentation B1: 3 of 12 | WHH
14 | 11 Mar 2002 | Discussion G1: 4 of 12 | WHH
15 | 13 Mar 2002 | Presentation G1: 4 of 12 | WHH
16 | 25 Mar 2002 | Midterm review | RN Chapters 1-9, 14, 18-21
17 | 27 Mar 2002 | Midterm exam | RN Chapters 1-9, 14, 18-21
18 | 01 Apr 2002 | Discussion/presentation K2: 5 of 12 | WHH
19 | 03 Apr 2002 | Discussion/presentation A2: 6 of 12 | WHH
20 | 08 Apr 2002 | Discussion/presentation B2: 7 of 12 | Yousheng Chang
21 | 10 Apr 2002 | Discussion/presentation G2: 8 of 12 | Indira Mohanty
22 | 15 Apr 2002 | Discussion/presentation K3: 9 of 12 | Vinod Chandana
23 | 17 Apr 2002 | Discussion/presentation A3: 10 of 12 | WHH
24 | 22 Apr 2002 | Discussion/presentation B3: 11 of 12 | WHH
25 | 24 Apr 2002 | Discussion/presentation G3: 12 of 12 | Sreenivas Babu
26 | 29 Apr 2002 | Conclusion | TBD
27 | 01 May 2002 | Project presentations I; projects due | N/A
28 | 06 May 2002 | Project presentations II | N/A
29 | 08 May 2002 | Project presentations III | N/A
30 | 10 May 2002 | NO CLASS; REVIEWS DUE | N/A
WF: Data Mining, I. H. Witten and E. Frank
RN: Artificial Intelligence: A Modern Approach, S. J. Russell and P. Norvig
TMM: Machine Learning, T. M. Mitchell
Homework: 3 of 4 programming and written assignments (15%)
Papers: 8 (out of 12) written (1-page) reviews of research papers (8%); project presentations (10%)
Class participation: in-class discussion, quiz (2%)
Examinations: 1 in-class midterm (25%), no final exam
Computer language(s): C/C++, Java, or student choice (upon instructor approval)
Project: term programming project for all students (40%); additional term paper or project extension (4 credit hour) option for graduate students and advanced undergraduates
· Official class page: http://www.kddresearch.org/Courses/Spring-2002/CIS830
· Instructor’s home page: http://www.cis.ksu.edu/~bhsu
Note: It is the student’s responsibility to be aware of class
announcements and materials posted on the official class page, so please check
it frequently.
Required
readings, along with reference manuals and tutorials for software used in the
course, will be available for purchase (in 2 packets) from the Engineering Copy
Center in 14 Seaton Hall.
· URL: http://groups.yahoo.com/group/ksu-cis830-spring2002/
· Primary purpose: for class discussions (among students and with instructor)
Note: Postings on the web board tend to get a more rapid response from the instructor than e-mail, and they are sometimes of benefit to fellow students as well.
Homework
assignments will be given out 2 to 3 weeks apart, for a total of 4. Your lowest score will be dropped (see
below). One of these homeworks will be
programming-based; one will require you to run (and possibly modify) an
existing library or KDD package using a specification or sample data, and
analyze the results; and two will be written.
Type your homework solutions (do not hand-write them); handwritten solutions receive only 0.8 of the credit they would otherwise earn.
For
programming assignments and the course project, you are permitted to use your
choice of a high-level programming language (C++ and Java are strongly preferred;
consult the instructor if you intend to use any other programming
language). You must, however, use a
development environment that is available to the CIS department. Consult the class web page for approved
compilers.
Graduate students and advanced undergraduates interested in working on a class
project may elect an additional 1 hour of credit as a section of CIS 798 (Special
Topics in Computer Science) and either turn in a term paper or work on an
extension of the course project or a small-scale independent study
project. You may sign up for this
option any time before February 14, 2002 (talk to me during office hours or
send e-mail). Suggested project topics and
guidelines will be posted on the course web page. Examples include: improving a known supervised learning
algorithm; developing an algorithm for combining classifiers; a KDD application
of DBMS or OLAP, in support of an MSE concentration in database systems; a
project on analysis of time series or document databases (e.g., text or source
code); an in-depth comparison of two KDD techniques studied in the course; or
improving an existing model or analyzing it formally.
Cheating consists of misrepresenting another’s work
or knowledge as your own. It includes
not only copying of test answers, but plagiarism of another person’s written
material. While you are encouraged to discuss class material,
homework problems, and projects with your classmates, the work you turn in must be entirely your own. For homework assignments, this means that if
you work together with a fellow student, you should still produce the final,
written material from your own notes and individual work, rather than from common notes that you produced together. You should follow similar guidelines for
programming assignments and individual projects; while reuse of previously
developed source code may be permitted in these cases (provided you
acknowledge the authors appropriately), you must not directly use code developed by fellow students. Please consult the University honor code (http://www.ksu.edu/honor) for further
guidelines on ethical conduct, and understand the regulations and penalties for
violating them.
The code that you are permitted to use on certain assignments may be limited beyond what plagiarism standards alone require. When in doubt about whether you may use a particular program on a written or programming assignment, consult the instructor first. My objective is to help you learn as much as possible from the assignments; sometimes this means that I want you to use existing code, and sometimes I will prefer that you develop it yourself to better understand the techniques.
Credit
for the course will be distributed as follows:
Component | Quantity | Low Scores Dropped | Points Each (Out of 1000) | Value
Homework (Written/Programming Assignments) | 4 | 1 | 50 | 15%
Paper Reviews and Commentaries | 12 | 8 | 10 | 10%
Presentation | 1 | 0 | 150 | 15%
Midterm Exam (Take-Home, Open-Book) | 1 | 0 | 250 | 25%
Course Project | 1 | 0 | 500 | 50%
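As a rough illustration of the Homework row, the sketch below assumes that each of the 4 assignments is scored out of 50 points and that the single lowest score is simply discarded before summing. The function name and the sample scores are hypothetical and are not part of the course materials.

```python
def homework_points(scores, dropped=1):
    """Sum homework scores after discarding the `dropped` lowest ones.

    Assumption: every assignment is worth the same number of points
    (50 in the table above), so dropping the lowest raw score is fair.
    """
    if dropped >= len(scores):
        raise ValueError("cannot drop every score")
    kept = sorted(scores)[dropped:]  # ascending order; skip the lowest
    return sum(kept)


# Hypothetical scores for the 4 assignments (each out of 50).
sample_scores = [42, 50, 35, 47]
print(homework_points(sample_scores))  # 35 is dropped: 42 + 50 + 47
```

With three counted assignments at 50 points each, the homework maximum is 150 of the 1000 course points, which matches the 15% in the Value column.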
Homework
and exams may contain extra credit
problems.
Late policy: Homeworks are due at 5:00pm on Fridays. If you need an extension to
the following Monday, you must request it by the due date (but I recommend that you
do not take this option). 10% credit will be deducted for each day the
assignment is late past 5:00pm that Monday.
There will be no additional extensions!
Letter
grades will be assigned based on the distribution of raw scores (“curved”).
Undergraduate and graduate students will be graded on the same curve. Acquiring 85% of the possible points,
however, guarantees an A; 70%, a B; 55%, a C.
Actual scales may be more generous than this if called for, but are not
expected to be.
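The guaranteed cutoffs above can be sketched as a small lookup. This is only an illustration of the minimum guarantee: the actual letter grade comes from the curve and may be higher. The function name is hypothetical, and the 1000-point total is taken from the "Out of 1000" column heading of the credit table.

```python
# Guaranteed minimum letter grades, as (percent cutoff, letter) pairs.
GUARANTEED_CUTOFFS = [(85.0, "A"), (70.0, "B"), (55.0, "C")]


def guaranteed_grade(points_earned, points_possible=1000):
    """Return the letter grade guaranteed by the fixed cutoffs.

    Returns None below 55%, where the syllabus states no guarantee
    and the grade depends entirely on the curve.
    """
    percent = 100.0 * points_earned / points_possible
    for cutoff, letter in GUARANTEED_CUTOFFS:
        if percent >= cutoff:
            return letter
    return None


print(guaranteed_grade(872))  # at least an A
print(guaranteed_grade(540))  # None: no guarantee, curve decides
```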
If you elect to take an additional CIS 798 project option (for 1 hour of credit), your grade for CIS 830 will still be assigned based only on the above components. The additional project component will be graded separately (as CIS 798) and weighted proportionately.
An
important part of learning about machine learning and KDD systems,
whether for research or development applications, is understanding the state of
the field and the repercussions of important results. The readings in this course are designed not only to give you a
set of tutorials and references for machine learning tools and techniques, but
also to demonstrate the subject as a unified whole, and to encourage you to think
more deeply about the practical and theoretical issues.
Toward
this end, I have selected 4 papers out of those in your (2) course notes
packets. The first 2 of these are in
the first packet and the last 2 are in the second. Before you come to lecture on the dates indicated on the class
calendar, you should submit (by e-mail to the instructor) a short review of, and commentary on, the
assigned paper. This commentary need be no longer than 2
pages (though you can go up to 3 pages if you feel you have something
meaningful to add).
This review is an important part
of the course, because it can:
· help you to review and rehearse material from lecture
· bring to light questions that you may have about the material
· improve your ability to articulate what you have learned
· help guide the lecture/discussion
· help you to think about working on projects (implementations or research) in this field
Here
are some guidelines on writing the reviews:
1. Try to be brief and concise.
2. Concentrate on pointing out the paper’s main strengths and flaws, both in content (key points, accuracy, impact/implications, deficiencies) and in presentation (organization, clarity/density, interest). Try not to merely summarize the paper.
3. Some questions to address (typically a couple in each paper):
   · Is the paper of sufficiently broad interest? What do you think its intended audience is? To what audience do you think the paper is significant?
   · What makes the paper significant or insignificant?
   · How could the presentation be improved to better point out important implications?
   · Is the paper technically sound? How so, or in what areas is it not entirely sound?
   · What novel ideas can we pick up (about the topics covered in lecture) from the paper?
4. Comment on how the paper (or the topic) affects your own work. How is it relevant (or irrelevant) to you?
5. How might the research be improved in light of how the field has progressed since it was published? Some of these papers were catalysts for research in their areas, so it is sometimes infeasible to second-guess their authors; but comment on what could be done better today.
Paper reviews are late (worth 0
credit) after midnight of the day of the lecture when they are due (i.e., you
must submit them before 12:00am Tuesday, Thursday, or Saturday).
Do not plagiarize. It is relatively easy to detect plagiarism
of material from the paper itself, related references, and paper reviews of
classmates! Again, refer to http://www.ksu.edu/honor for regulations
and further guidelines on academic honesty.