1. Give an example of a dataset consisting of four data vectors for which there exist two different optimal (minimum Sum of Squared Errors, SSE) 2-means (k = 2) clusterings.
a. Calculate the optimal SSE value for your example.
b. In general, what must a dataset look like geometrically for it to have more than one optimal solution?
c. What determines the number of optimal solutions?
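One way to check a candidate answer to question 1 is to enumerate every split of the four vectors into two non-empty clusters and compare the SSE values. The sketch below does this by brute force. The function names `sse` and `optimal_two_partitions`, and the unit-square dataset, are illustrative assumptions and not part of the original question; the square is only one possible candidate.

```python
from itertools import combinations

def sse(cluster):
    """Sum of squared distances from each point to the cluster centroid."""
    dim = len(cluster[0])
    centroid = [sum(p[d] for p in cluster) / len(cluster) for d in range(dim)]
    return sum((p[d] - centroid[d]) ** 2 for p in cluster for d in range(dim))

def optimal_two_partitions(points):
    """Return (best SSE, partitions achieving it) over all splits into 2 non-empty clusters."""
    n = len(points)
    results = []
    # Keep point 0 in the first cluster so each unordered partition is counted once.
    for size in range(0, n - 1):
        for rest in combinations(range(1, n), size):
            first = (0,) + rest
            second = tuple(i for i in range(n) if i not in first)
            total = sse([points[i] for i in first]) + sse([points[i] for i in second])
            results.append((total, first, second))
    best = min(total for total, _, _ in results)
    ties = [(a, b) for total, a, b in results if abs(total - best) < 1e-9]
    return best, ties

# Assumed candidate dataset: the four corners of a unit square.
square = [(0, 0), (1, 0), (0, 1), (1, 1)]
best, ties = optimal_two_partitions(square)
print(best)  # minimum SSE over all 7 partitions
print(ties)  # index pairs of every partition that attains it
```

For this candidate the horizontal and vertical splits tie at the minimum, while the diagonal split and the one-versus-three splits score worse. Swapping in a different `points` list shows how the number of tied partitions changes with the symmetry of the dataset.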
2. a. Given k clusters and their respective cluster sizes s_1, s_2, ..., s_k, what is the probability that two data vectors drawn at random (with replacement) from the clustered dataset belong to the same cluster?
b. Now assume you are given only this probability (you do not know k or the sizes s_i) together with the fact that the clusters are equally sized. Can you find k?
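A small numerical sketch can help sanity-check an answer to question 2. It assumes the two vectors are drawn independently and uniformly from all n = s_1 + ... + s_k vectors, so that both land in cluster i with probability (s_i / n)^2. The function names are hypothetical, and `k_from_probability` is only valid under the equal-size assumption stated in part b.

```python
from fractions import Fraction

def same_cluster_probability(sizes):
    """P(two independent uniform draws with replacement fall in the same cluster)."""
    n = sum(sizes)
    # Each cluster contributes (s_i / n)^2; exact fractions avoid rounding error.
    return sum(Fraction(s, n) ** 2 for s in sizes)

def k_from_probability(p):
    """Recover k, assuming all clusters have equal size (then p = k * (1/k)^2 = 1/k)."""
    return 1 / Fraction(p)

print(same_cluster_probability([2, 3, 5]))                           # unequal sizes
print(k_from_probability(same_cluster_probability([4, 4, 4, 4, 4])))  # equal sizes
```

With unequal sizes the probability alone does not pin down k, which is why part b needs the equal-size condition.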