1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237
| %% Machine Learning Online Class % Exercise 8 | Anomaly Detection and Collaborative Filtering % % Instructions % ------------ % % This file contains code that helps you get started on the % exercise. You will need to complete the following functions: % % estimateGaussian.m % selectThreshold.m % cofiCostFunc.m % % For this exercise, you will not need to change any code in this file, % or any other files other than those mentioned above. %
%% =============== Part 1: Loading movie ratings dataset ================ % You will start by loading the movie ratings dataset to understand the % structure of the data. % fprintf('Loading movie ratings dataset.\n\n');
% Load data load ('ex8_movies.mat');
% Y is a 1682x943 matrix, containing ratings (1-5) of 1682 movies on % 943 users % % R is a 1682x943 matrix, where R(i,j) = 1 if and only if user j gave a % rating to movie i
% From the matrix, we can compute statistics like average rating. fprintf('Average rating for movie 1 (Toy Story): %f / 5\n\n', ... mean(Y(1, R(1, :))));
% We can "visualize" the ratings matrix by plotting it with imagesc imagesc(Y); ylabel('Movies'); xlabel('Users');
fprintf('\nProgram paused. Press enter to continue.\n'); pause;
%% ============ Part 2: Collaborative Filtering Cost Function =========== % You will now implement the cost function for collaborative filtering. % To help you debug your cost function, we have included set of weights % that we trained on that. Specifically, you should complete the code in % cofiCostFunc.m to return J.
% Load pre-trained weights (X, Theta, num_users, num_movies, num_features) load ('ex8_movieParams.mat');
% Reduce the data set size so that this runs faster num_users = 4; num_movies = 5; num_features = 3; X = X(1:num_movies, 1:num_features); Theta = Theta(1:num_users, 1:num_features); Y = Y(1:num_movies, 1:num_users); R = R(1:num_movies, 1:num_users);
% Evaluate cost function J = cofiCostFunc([X(:) ; Theta(:)], Y, R, num_users, num_movies, ... num_features, 0); fprintf(['Cost at loaded parameters: %f '... '\n(this value should be about 22.22)\n'], J);
fprintf('\nProgram paused. Press enter to continue.\n'); pause;
%% ============== Part 3: Collaborative Filtering Gradient ============== % Once your cost function matches up with ours, you should now implement % the collaborative filtering gradient function. Specifically, you should % complete the code in cofiCostFunc.m to return the grad argument. % fprintf('\nChecking Gradients (without regularization) ... \n');
% Check gradients by running checkNNGradients checkCostFunction;
fprintf('\nProgram paused. Press enter to continue.\n'); pause;
%% ========= Part 4: Collaborative Filtering Cost Regularization ======== % Now, you should implement regularization for the cost function for % collaborative filtering. You can implement it by adding the cost of % regularization to the original cost computation. %
% Evaluate cost function J = cofiCostFunc([X(:) ; Theta(:)], Y, R, num_users, num_movies, ... num_features, 1.5); fprintf(['Cost at loaded parameters (lambda = 1.5): %f '... '\n(this value should be about 31.34)\n'], J);
fprintf('\nProgram paused. Press enter to continue.\n'); pause;
%% ======= Part 5: Collaborative Filtering Gradient Regularization ====== % Once your cost matches up with ours, you should proceed to implement % regularization for the gradient. %
% fprintf('\nChecking Gradients (with regularization) ... \n');
% Check gradients by running checkNNGradients checkCostFunction(1.5);
fprintf('\nProgram paused. Press enter to continue.\n'); pause;
%% ============== Part 6: Entering ratings for a new user =============== % Before we will train the collaborative filtering model, we will first % add ratings that correspond to a new user that we just observed. This % part of the code will also allow you to put in your own ratings for the % movies in our dataset! % movieList = loadMovieList();
% Initialize my ratings my_ratings = zeros(1682, 1);
% Check the file movie_idx.txt for id of each movie in our dataset % For example, Toy Story (1995) has ID 1, so to rate it "4", you can set my_ratings(1) = 4;
% Or suppose did not enjoy Silence of the Lambs (1991), you can set my_ratings(98) = 2;
% We have selected a few movies we liked / did not like and the ratings we % gave are as follows: my_ratings(7) = 3; my_ratings(12)= 5; my_ratings(54) = 4; my_ratings(64)= 5; my_ratings(66)= 3; my_ratings(69) = 5; my_ratings(183) = 4; my_ratings(226) = 5; my_ratings(355)= 5;
fprintf('\n\nNew user ratings:\n'); for i = 1:length(my_ratings) if my_ratings(i) > 0 fprintf('Rated %d for %s\n', my_ratings(i), ... movieList{i}); end end
fprintf('\nProgram paused. Press enter to continue.\n'); pause;
%% ================== Part 7: Learning Movie Ratings ==================== % Now, you will train the collaborative filtering model on a movie rating % dataset of 1682 movies and 943 users %
fprintf('\nTraining collaborative filtering...\n');
% Load data load('ex8_movies.mat');
% Y is a 1682x943 matrix, containing ratings (1-5) of 1682 movies by % 943 users % % R is a 1682x943 matrix, where R(i,j) = 1 if and only if user j gave a % rating to movie i
% Add our own ratings to the data matrix Y = [my_ratings Y]; R = [(my_ratings ~= 0) R];
% Normalize Ratings [Ynorm, Ymean] = normalizeRatings(Y, R);
% Useful Values num_users = size(Y, 2); num_movies = size(Y, 1); num_features = 10;
% Set Initial Parameters (Theta, X) X = randn(num_movies, num_features); Theta = randn(num_users, num_features);
initial_parameters = [X(:); Theta(:)];
% Set options for fmincg options = optimset('GradObj', 'on', 'MaxIter', 100);
% Set Regularization lambda = 10; theta = fmincg (@(t)(cofiCostFunc(t, Ynorm, R, num_users, num_movies, ... num_features, lambda)), ... initial_parameters, options);
% Unfold the returned theta back into U and W X = reshape(theta(1:num_movies*num_features), num_movies, num_features); Theta = reshape(theta(num_movies*num_features+1:end), ... num_users, num_features);
fprintf('Recommender system learning completed.\n');
fprintf('\nProgram paused. Press enter to continue.\n'); pause;
%% ================== Part 8: Recommendation for you ==================== % After training the model, you can now make recommendations by computing % the predictions matrix. %
p = X * Theta'; my_predictions = p(:,1) + Ymean;
movieList = loadMovieList();
[r, ix] = sort(my_predictions, 'descend'); fprintf('\nTop recommendations for you:\n'); for i=1:10 j = ix(i); fprintf('Predicting rating %.1f for movie %s\n', my_predictions(j), ... movieList{j}); end
fprintf('\n\nOriginal ratings provided:\n'); for i = 1:length(my_ratings) if my_ratings(i) > 0 fprintf('Rated %d for %s\n', my_ratings(i), ... movieList{i}); end end
|