CIS 798 (Topics in Computer Science)
Topics in Intelligent Systems and Machine Learning
Fall, 1999
Homework Assignment 2
Part 3 (35 points): Perceptrons and Winnow
You should finish this part by Friday, October 15, 1999.
In this machine problem, you will implement two simple but important learning algorithms and compare them experimentally on a synthetic data set. The algorithms are online, one-pass versions of Perceptron and Winnow, as described in Chapter 4 of Mitchell and in Lecture 6.
You will run 3 experiments. In each, you will use both algorithms to learn a different target function (all generated using the same program) and present the corresponding learning curves.
Target function and data
The instance space is {0, 1}^500 - that is, there are 500 binary (boolean-valued) attributes, or features, in the domain. The 3 target concepts will all be k-of-m functions, each time with a different k.
Definition: Let f: {0, 1}^n → {0, 1} be a k-of-m function. f is defined as follows: Let x ∈ {0, 1}^n, and let S be a fixed subset of the attributes (out of the n possible) with |S| = m. Then f(x) = 1 if and only if at least k of the m attributes in S have the value 1.
Note that if m = n, then we have the well-known k-of-n concept (called m-of-n in class and in most of the literature). This is the case where all the attributes are relevant. We are going to experiment with the case where only the first half of the attributes (m = 250) are relevant. This function can be represented using the hypothesis language provided by Perceptron and Winnow.
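To make the definition concrete, here is a minimal sketch of a k-of-m evaluator in Python (the function name k_of_m and the representation of S as a collection of attribute indices are illustrative choices, not part of the assignment):

def k_of_m(x, S, k):
    """Evaluate a k-of-m target function on a binary instance x.

    x: sequence of 0/1 attribute values (length n)
    S: the fixed set of m relevant attribute indices
    k: threshold; f(x) = 1 iff at least k relevant attributes are 1
    """
    return 1 if sum(x[i] for i in S) >= k else 0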
You will run 3 experiments, with different values of k: k = 10, k = 50, and k = 100.
Data generation
First, you will generate the data for the experiments yourself. For each target function f10, f50, and f100, generate 1100 data points: the first 1000 will serve as training examples and the remaining 100 as test examples (see Experiments below).
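One plausible generator is sketched below. It assumes, hypothetically, that each attribute is drawn independently and uniformly from {0, 1} and that S is the first 250 attributes; the instance distribution and seed handling here are assumptions, so follow whatever generation procedure is specified in class. (Note that under this uniform assumption almost all examples are positive for k = 10, 50, and 100, so the class may well prescribe a different distribution.)

import random

N_ATTRS = 500      # n: total number of boolean attributes
M_RELEVANT = 250   # m: the first half of the attributes are relevant

def generate_data(k, n_points=1100, seed=0):
    """Generate n_points labeled examples for the k-of-m target.

    Assumes (hypothetically) that attributes are i.i.d. uniform over
    {0, 1} and that S is the first M_RELEVANT attributes.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n_points):
        x = [rng.randint(0, 1) for _ in range(N_ATTRS)]
        y = 1 if sum(x[:M_RELEVANT]) >= k else 0   # at-least-k-of-m label
        data.append((x, y))
    return data

# 1000 training examples followed by 100 test examples.
examples = generate_data(k=10)
train, test = examples[:1000], examples[1000:]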
Implementation
This should be the easy part! The data generation may actually be more complicated.
The variants of Perceptron and Winnow that you are implementing are especially simple: make a single online pass over the training data, updating after each example, and train only once per example. Note that Perceptron and Winnow are essentially the same algorithm, differing only in the update rule and constants - e.g., Winnow starts with a weight vector of (1, 1, …, 1), as documented in Lecture 6.
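For reference, here is a minimal sketch of the two one-pass learners, using the standard mistake-driven formulations. The particular constants (learning rate 1 for Perceptron, promotion/demotion factor 2 and threshold n for Winnow) are assumptions; defer to Lecture 6 for the exact values.

def train_perceptron(train, n, lr=1.0):
    """One online pass of Perceptron. Weights start at 0; the
    learning rate and zero threshold are assumed constants."""
    w = [0.0] * n
    theta = 0.0
    for x, y in train:
        y_hat = 1 if sum(wi * xi for wi, xi in zip(w, x)) >= theta else 0
        if y_hat != y:                       # mistake-driven update
            delta = lr * (y - y_hat)         # +lr on false negative, -lr on false positive
            w = [wi + delta * xi for wi, xi in zip(w, x)]
    return w, theta

def train_winnow(train, n, alpha=2.0):
    """One online pass of Winnow. Weights start at (1, 1, ..., 1) per
    the handout; alpha = 2 and threshold = n are assumed constants."""
    w = [1.0] * n
    theta = float(n)
    for x, y in train:
        y_hat = 1 if sum(wi * xi for wi, xi in zip(w, x)) >= theta else 0
        if y == 1 and y_hat == 0:            # promotion: scale up active weights
            w = [wi * alpha if xi else wi for wi, xi in zip(w, x)]
        elif y == 0 and y_hat == 1:          # demotion: scale down active weights
            w = [wi / alpha if xi else wi for wi, xi in zip(w, x)]
    return w, theta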
Experiments
For each of the 3 target functions, train the 2 algorithms on the first 50 to 1000 examples in increments of 50 (i.e., 50, 100, 150, …, 950, 1000), and, in each case, test on the 100 test examples you generated. The easiest way to do this is to load the data set into memory and write an outer loop that calls each algorithm on the first t training examples, for each value of t above.
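That outer loop might look like the following sketch, which reuses the hypothetical train_perceptron and train_winnow functions and the train/test split from the earlier sketches:

def accuracy(w, theta, test):
    """Fraction of test examples the linear threshold unit gets right."""
    correct = 0
    for x, y in test:
        y_hat = 1 if sum(wi * xi for wi, xi in zip(w, x)) >= theta else 0
        correct += (y_hat == y)
    return correct / len(test)

# One learning curve per algorithm: accuracy vs. number of training examples.
for t in range(50, 1001, 50):
    w_p, th_p = train_perceptron(train[:t], N_ATTRS)
    w_w, th_w = train_winnow(train[:t], N_ATTRS)
    print(t, accuracy(w_p, th_p, test), accuracy(w_w, th_w, test))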
You are encouraged to experiment with parameters to find the most effective ones, but be consistent (within and across data sets) when collecting the final results.
What to submit