The Perceptron convergence theorem is often stated in the following way. If $R := \max_{(x,y) \in S} \|x\|_2$, and $u^\star \in \mathbb{R}^d$ satisfies $\|u^\star\|_2 = 1$ and $y \langle u^\star, x \rangle \geq \gamma$ for all $(x, y) \in S$ for some $\gamma > 0$, then Perceptron halts after at most $(R/\gamma)^2$ iterations.
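For intuition, here is a minimal sketch of the Perceptron algorithm whose mistake count the theorem bounds. This is an illustrative aid, not part of the assignment; the function name, the `max_iters` cap, and the data layout (rows of `X` as feature vectors, labels in $\{-1, +1\}$) are assumptions, not something fixed by the question.

```python
import numpy as np

def perceptron(X, y, max_iters=10_000):
    """Classic Perceptron for a homogeneous (through-the-origin) separator.

    X : (n, d) array of feature vectors; y : (n,) array of labels in {-1, +1}.
    Returns the learned weight vector and the number of mistakes (updates).
    """
    w = np.zeros(X.shape[1])
    mistakes = 0
    for _ in range(max_iters):
        updated = False
        for x_i, y_i in zip(X, y):
            if y_i * np.dot(w, x_i) <= 0:  # misclassified (or on the boundary)
                w += y_i * x_i             # Perceptron update
                mistakes += 1
                updated = True
        if not updated:                    # all points correct: halt
            break
    return w, mistakes

# If some unit vector u_star attains margin gamma = min_i y_i * <u_star, x_i> > 0,
# the theorem above guarantees mistakes <= (R / gamma)**2, with R = max_i ||x_i||_2.
```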
Explain why this theorem is the same as what was presented in lecture. Specifically, let $w^\star$ be the vector from the Perceptron convergence theorem as given in the lecture, with length $\|w^\star\|_2$ as small as possible, and let $u^\star$ be the vector from the present version of the Perceptron convergence theorem such that $\gamma$ is as large as possible. What is the relationship between $w^\star$ and $u^\star$, and between $\|w^\star\|_2$ and $\gamma$? What is the shortest distance from a data point $x$ in $S$ to the (homogeneous) hyperplane with normal vector $w^\star$? Give succinct (but precise) explanations for your answers.
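As a way to sanity-check your answer numerically (again illustrative, not part of the assignment, and the helper name is made up): for any candidate weight vector, one can compute the induced unit vector, its margin over $S$, and each point's unsigned distance to the homogeneous hyperplane, using the standard fact that the distance from $x$ to $\{z : \langle w, z \rangle = 0\}$ is $|\langle w, x \rangle| / \|w\|_2$.

```python
import numpy as np

def margin_and_distances(w_star, X, y):
    """Given a candidate w_star, compute the unit vector u = w_star/||w_star||_2,
    the margin gamma = min_i y_i * <u, x_i> over the data, and each point's
    unsigned distance |<u, x_i>| to the hyperplane {z : <w_star, z> = 0}."""
    u = w_star / np.linalg.norm(w_star)
    gamma = np.min(y * (X @ u))
    distances = np.abs(X @ u)
    return u, gamma, distances
```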