====== Differences ====== This shows you the differences between two versions of the page.
Both sides previous revision Previous revision Next revision | Previous revision | ||
gibson:teaching:fall-2014:math445:lab5 [2014/11/12 19:01] gibson |
gibson:teaching:fall-2014:math445:lab5 [2015/03/09 13:39] (current) gibson |
||
---|---|---|---|
Line 50: | Line 50: | ||
</code> | </code> | ||
- | or equivalently | ||
- | |||
- | <code> | ||
- | N=100; | ||
- | p=(G^N)*p; | ||
- | </code> | ||
After that is complete the elements of ''p'' give the probabilities that a random web surfer will end up at the web pages listed in ''U'', and you can rank the pages in ''U'' according to their probabilities in ''p''. Look up the ''sort'' command in the Help menu and find a way to print the list of websites in order of their rank. Turn in the list of the top 10, the fraction of the time a random web surfer would spend in each one of those 10, and your code to complete the Lab. | After that is complete the elements of ''p'' give the probabilities that a random web surfer will end up at the web pages listed in ''U'', and you can rank the pages in ''U'' according to their probabilities in ''p''. Look up the ''sort'' command in the Help menu and find a way to print the list of websites in order of their rank. Turn in the list of the top 10, the fraction of the time a random web surfer would spend in each one of those 10, and your code to complete the Lab. | ||
Line 73: | Line 67: | ||
- | **Problem 4:** Produce a plot of execution time T versus the number of web page M for three | + | **Problem 4:** This lab has mentioned three different algorithms for calculating the page rank. For this problem, you will compare the execution time T of these three algorithms as a function of M, the number of pages in your web of links. The end result will be a plot of execution time versus M, with three lines, one for each algorithm. |
- | different algorithms for computing the page rank. | + | |
- | + | ||
- | This lab has mentioned three different algorithms for calculating the page rank. For this problem, you will compare the execution time T of these three algorithms as a function of M, the number of pages in your web of links. The end result will be a plot of execution time versus M, with three lines, one for each algorithm. | + | |
It will be easiest to do the comparisons if you convert your page rank script to a Matlab function and then write three different algorithms as three different versions of that function. | It will be easiest to do the comparisons if you convert your page rank script to a Matlab function and then write three different algorithms as three different versions of that function. | ||
Line 93: | Line 84: | ||
function [i,p] = pagerank_fullpower(L); | function [i,p] = pagerank_fullpower(L); | ||
- | that takes a matrix of links L as an input and returns the indices i of webpages sorted by | + | that takes a matrix of links L as an input and returns the indices i of webpages sorted by their page ranks, and the page ranks p sorted the same way (so that p(1) >= p(2) etc). |
- | their page ranks, and the page ranks p sorted the same way (so that p(1) >= p(2) etc). | + | |
- | To be sure that the function operates with full matrices, | + | To be sure that the function operates with full matrices, make this the first line of the function body |
L = full(L); | L = full(L); | ||
Line 127: | Line 117: | ||
function [i,p] = pagerank_sparseiterate(L); | function [i,p] = pagerank_sparseiterate(L); | ||
- | that converts the input L to a sparse matrix in the first line | + | that converts the input L to a sparse matrix in the first line with |
L = sparse(L) | L = sparse(L) | ||
Line 146: | Line 136: | ||
p = one_alpha_T*p + alpha_ones; | p = one_alpha_T*p + alpha_ones; | ||
end | end | ||
+ | |||
+ | Note that the sparse algorithm avoids the matrix of 1's that is present in the other two algorithms, because that matrix is necessarily full. | ||
+ | |||
+ | |||
+ | Now write a script that tests and times these algorithms on link matrices L of sizes M = 32, 64, 128, 256, ...., up to whatever power of two your computer can handle. It's smart to develop and test your code on a smaller set of M, perhaps just M = 32, 64, 128. As a starting point, here's the code we developed in class to measure execution time of full versus sparse matrices. | ||
+ | |||
+ | <code matlab> | ||
+ | |||
+ | % script to compare full versus sparse matrix multiplication | ||
+ | |||
+ | for n=1:12 | ||
+ | | ||
+ | % set up random M x M Ax=b problem | ||
+ | m = 2^n; | ||
+ | A = randomL(m); | ||
+ | x = randn(m,1); | ||
+ | |||
+ | % measure execution time of sparse matrix multiplication | ||
+ | tic; | ||
+ | b = A*x; | ||
+ | Tsparse(n) = toc; | ||
+ | |||
+ | % measure execution time of full matrix multiplication | ||
+ | A = full(A); | ||
+ | tic; | ||
+ | b = A*x; | ||
+ | Tfull(n) = toc; | ||
+ | |||
+ | M(n) = m; | ||
+ | end | ||
+ | </code> | ||
+ | |||
+ | You can use this Matlab function to produce synthetic data for testing any given value of M | ||
+ | |||
+ | <code matlab> | ||
+ | function L = randomL(M); | ||
+ | % return random link matrix for testing pagerank algorithms | ||
+ | |||
+ | maxlinks = 10; % maximum of number of links per page | ||
+ | L = sparse(M,M); | ||
+ | | ||
+ | for j=1:M | ||
+ | i = randi(M, maxlinks, 1); % get vector of random numbers between 1 and M | ||
+ | for n = 1:maxlinks | ||
+ | L(i(n),j) = 1; % create links from page j to the random pages i | ||
+ | end | ||
+ | end | ||
+ | | ||
+ | end | ||
+ | </code> | ||
- | Now write a script that tests and times these algorithms on link matrices L of sizes M = 32, 64, 128, 256, 512, and 1024. Produce a plot of execution time T versus M for each of your three pagerank functions using Matlab's ''tic'' and ''toc''. From the plot, estimate the explicit function that relates T to M for each of the three algorithms. | + | Produce a plot of execution time T versus M for each of your three pagerank functions using Matlab's ''tic'' and ''toc''. From the plot, estimate the explicit function that relates T to M for each of the three algorithms. |
Which algorithm is fastest? Why? How long do you estimate each algorithm would take to rank M=10^6 web pages? Provide your answer in easily understandable units of time (days, weeks, years, etc.) | Which algorithm is fastest? Why? How long do you estimate each algorithm would take to rank M=10^6 web pages? Provide your answer in easily understandable units of time (days, weeks, years, etc.) |