channelflow.org

====== Differences ======

This shows you the differences between two versions of the page.

--- gibson:teaching:fall-2012:math445:lab8 [2012/11/01 09:40]
gibson created
+++ gibson:teaching:fall-2012:math445:lab8 [2012/11/06 06:24] (current)
gibson [Math 445 Lab 8: Presidential election]
@@ Line 1: / Line 1: @@
-====== Math 445 Lab 8; Presidential election ======
-Your job is to predict the outcome of today's Presidential election, given the last-minute polling data.
+====== Math 445 Lab 8: Presidential election ======
-Specifically, given a list of states, their electoral votes, the composite polling percentages for each
+Your job is to predict the outcome of today's Presidential election
-candidate, and the margins of error those polling percentages, you are to run a large number (''Nelections'')
+given the last-minute polling data, using Monte Carlo simulation.
-of simulations of the election. For each state, award each of the two candidates the specified composite
-polling percent plus
-a random number in the range between -margin and +margin. Then award that state's electoral votes to the candidate
-with the larger percentage of votes. Add up all the electoral votes for each candidate, and award the nth election
+Specifically, given a list of states, their electoral votes, the composite
+polling percentages for each candidate, and the margins of error of those
+polling percentages, you are to run a large number of simulations of the
+election and determine the likelihood that either candidate will win based
+on the results of those simulations. For each state, start by assigning the
+specified composite polling percentages to the two candidates. Then add to
+each candidate's percentage a different random number in the range between
+''-margin'' and ''+margin''. Compare the two percentages and award that state's
+electoral votes to the candidate with the larger percentage of votes. Do this
+for all fifty states (plus DC), add up all the electoral votes for each candidate,
+and award the ''n''th election to the candidate with the majority of electoral votes.
+Run a large number of such simulated elections, keeping track of the number of
+electoral votes for each candidate in each election. Make a histogram that
+shows the statistical distribution of total electoral votes for one of the
+candidates, using bins of width 10 between 0 and 540 (0-9.99 for bin 1, 10-19.99 for
+bin 2, etc.). If you can figure out how, color the bins corresponding to Romney
+wins red and the bins corresponding to Obama wins blue, or else just draw a vertical
+line at the magic number of 270 electoral votes needed to win the election outright.
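+A minimal sketch of one way to set up the simulation loop described above. It runs on made-up data (ten hypothetical states with identical polling numbers) rather than the real table, and the variable names ''Nelections'', ''Ovotes'', and ''Rvotes'' are illustrative assumptions, not a required interface.

```matlab
% Hypothetical test data: ten states with identical polling numbers.
% Columns follow the lab's data table: O, R, M, EV.
Nstates = 10;
P = repmat([51.0 49.0 3.0 10], Nstates, 1);

O  = P(:,1);    % Obama polling percentage
R  = P(:,2);    % Romney polling percentage
M  = P(:,3);    % margin of error
EV = P(:,4);    % electoral votes

Nelections = 100;                 % small number of trials while testing
Ovotes = zeros(Nelections, 1);    % Obama's electoral votes, one entry per election
Rvotes = zeros(Nelections, 1);    % Romney's electoral votes, one entry per election

for n = 1:Nelections
    % a different uniform random number in [-M, +M] for each candidate and state
    Osim = O + M .* (2*rand(Nstates, 1) - 1);
    Rsim = R + M .* (2*rand(Nstates, 1) - 1);

    % award each state's electoral votes to the candidate with the larger percentage
    % (an exact tie within a state essentially never happens; it would go to Romney here)
    Owins = Osim > Rsim;
    Ovotes(n) = sum(EV(Owins));
    Rvotes(n) = sum(EV(~Owins));
end

fprintf('Obama won %d of %d simulated elections\n', sum(Ovotes > Rvotes), Nelections);
```

+The only for-loop runs over the trials; the per-state work is done with vector operations and logical indexing.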
+===== Questions =====
+Then answer the following questions:
+  - Who is most likely to win the presidential election?
+  - What is the probability that the most likely winner will actually win?
+  - What is the most likely range of electoral votes for the winner? (among the bins of width 10 specified above)
+  - What is the likelihood of a 269-269 electoral vote tie?
+Turn in print-outs of your code, your histogram, and your answers to the above questions.
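+As a rough illustration of how the questions above might be answered from a vector of simulated totals, here is a sketch that uses ten made-up electoral vote counts. The numbers are hypothetical and are not results from the polling data, and the bin-center trick assumes the totals are integers.

```matlab
% Hypothetical electoral vote totals for Obama from ten simulated elections
% (made-up numbers for illustration, not results from the polling data).
Ovotes = [303 281 269 332 290 265 303 310 305 347];
Rvotes = 538 - Ovotes;            % two-candidate race with 538 electoral votes

pObama  = mean(Ovotes >= 270);    % fraction of elections Obama wins outright
pRomney = mean(Rvotes >= 270);    % fraction of elections Romney wins outright
pTie    = mean(Ovotes == 269);    % fraction of 269-269 ties

% Electoral votes are integers, so bin centers 4.5, 14.5, ... group them
% as 0-9, 10-19, ..., 530-539 with no values landing on a bin edge.
centers = 4.5:10:534.5;
counts  = hist(Ovotes, centers);
[cmax, k] = max(counts);          % most populated bin (the first one, if several tie)
lo = centers(k) - 4.5;            % lower end of that bin

fprintf('P(Obama) = %.2f, P(Romney) = %.2f, P(tie) = %.2f\n', pObama, pRomney, pTie);
fprintf('Most likely bin for Obama: %d-%d electoral votes\n', lo, lo + 9);
```

+With these made-up totals the estimates are 0.80, 0.10, and 0.10, and the most populated bin is 300-309. With real simulation output the same lines apply to a much longer ''Ovotes'' vector.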
+===== Tips =====
+  * Start with a small number of simulated elections (say 100) and then increase to a large number (say 10,000) when you're confident your code is working correctly.
+  * You can also develop your code using simulated data, for example, just ten states all with the same polling numbers and a very small margin of error.
+  * Try to use as few for-loops as possible. If you are really on fire, you can do it with just one for-loop that loops over the number of trials.
+  * Changing the colors of histogram bins in Matlab is not as easy as one might hope. You'll need to take data returned from the **hist** function and replot it with the **bar** command. See http://www.mathworks.com/matlabcentral/newsreader/view_thread/290534 for an example of how to do this.
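+The last tip can be sketched roughly as follows: take the counts returned by **hist** and replot them with two **bar** calls, one per color. The electoral vote totals here are made-up random numbers for illustration only, and the bin centers are an assumption that relies on the totals being integers.

```matlab
% Hypothetical electoral vote totals for Obama (made-up, for illustration only).
Ovotes = round(290 + 30*randn(1000, 1));
Ovotes = min(max(Ovotes, 0), 538);     % keep totals in the valid range

% Integer-friendly bin centers: bins cover 0-9, 10-19, ..., 530-539.
centers = 4.5:10:534.5;
counts  = hist(Ovotes, centers);
counts  = counts(:)';                  % make sure counts is a row, like centers

% Bins below 270 are Romney wins (the 260-269 bin also holds any 269-269 ties).
romney = centers < 270;

% Draw the two groups separately; bars outside a group have zero height.
bar(centers, counts .* romney,    1, 'FaceColor', 'r');
hold on
bar(centers, counts .* (~romney), 1, 'FaceColor', 'b');
plot([270 270], [0 max(counts)], 'k--');   % 270 votes needed to win outright
hold off
xlabel('Obama electoral votes');
ylabel('number of simulated elections');
title('Simulated electoral vote distribution');
```

+Plotting both groups against the full ''centers'' vector keeps the bar widths identical in the two colors.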
+===== Broader questions =====
+Some further questions you might also address:
+  * The margins of error reported in the table are really 95% confidence intervals, corresponding to about two standard deviations of a Gaussian distribution. Modify your code so that the random number added to each percentage is drawn from a Gaussian distribution with a standard deviation of one-half the margin of error. Does this significantly change your results?
+  * Does doubling or halving the margins of error significantly change your results?
+  * How many elections do you need to simulate in order to get reliable answers?
+  * The lab as written assumes a two-party presidential election. Should we include third-party candidates? Why or why not? How would you revise your code to include a third party? Would it change the results significantly?
+  * We are trusting that the polling data form an accurate estimate of the actual votes cast, to within the margins of error. The data reported below was obtained from [[http://fivethirtyeight.blogs.nytimes.com/]], and is claimed by its compiler to be an unbiased and statistically reliable estimate, though there is a fair amount of controversy about this, split along party and ideological lines. Do you think the given polling data is fair and accurate? Is there a reason to suspect it is or is not?
+  * Do you believe your own election prediction? Why or why not?
+Relevant Matlab commands: **rand**, **randn**, **sum**, **hist**, and **bar**, plus standard plotting commands such as **xlabel**, **ylabel**, **title**.
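+For the first of the broader questions, one possible way to compare uniform and Gaussian noise is sketched below. It runs on made-up data (ten hypothetical states with identical polling numbers), and the names ''kind'' and ''pObama'' are illustrative assumptions.

```matlab
% Hypothetical test data: ten identical states, columns O, R, M, EV.
Nstates = 10;
P = repmat([50.5 49.5 3.0 10], Nstates, 1);
O = P(:,1);  R = P(:,2);  M = P(:,3);  EV = P(:,4);

Nelections = 2000;
pObama = zeros(1, 2);     % win fraction for uniform (1) and Gaussian (2) noise

for kind = 1:2
    Ovotes = zeros(Nelections, 1);
    for n = 1:Nelections
        if kind == 1
            % uniform noise in [-M, +M]
            dO = M .* (2*rand(Nstates, 1) - 1);
            dR = M .* (2*rand(Nstates, 1) - 1);
        else
            % Gaussian noise with standard deviation M/2
            dO = (M/2) .* randn(Nstates, 1);
            dR = (M/2) .* randn(Nstates, 1);
        end
        Ovotes(n) = sum(EV(O + dO > R + dR));
    end
    % Obama wins when he holds more than half of all electoral votes
    pObama(kind) = mean(Ovotes > sum(EV)/2);
end

fprintf('P(Obama wins), uniform noise:  %.3f\n', pObama(1));
fprintf('P(Obama wins), Gaussian noise: %.3f\n', pObama(2));
```

+Rerunning the script a few times gives a feel for how much of any difference between the two numbers is just sampling noise.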
+===== Background =====
+Nate Silver, a sports statistician, pioneered the use of Monte Carlo methods
+in election prediction during the 2008 elections ([[http://fivethirtyeight.blogs.nytimes.com/]], [[http://en.wikipedia.org/wiki/FiveThirtyEight]]). In the 2008 elections, his model predicted 49 of 50 states correctly for the Presidential race (missing Indiana, which went to Obama by 1%) and all 35 Senate races correctly. Note that this lab does not cover the subtlest and most difficult aspect of election prediction: producing good composite poll numbers and margins of error from large numbers of pollsters using different methods, sample sizes, and polling dates. There is quite a bit of controversy in the current election over Mr. Silver's methods and his assessment that Obama has a 91% chance of winning the election. See, for example,
+  * [[http://cosmiclog.nbcnews.com/_news/2012/10/30/14809227-political-forecasts-stir-up-a-storm?lite]]
+  * [[http://www.dailykos.com/story/2012/11/01/1153661/-Nate-Silver-s-Math-Based-Math]]
+  * [[http://2012.talkingpointsmemo.com/2012/11/nate-silver-colbert-report-pundits.php?ref=fpnewsfeed|Nate Silver on the Colbert Report]]
+  * google:"Nate Silver controversy"
+===== Data =====
+Here's some current polling data, taken from [[http://fivethirtyeight.blogs.nytimes.com]] on 2012-11-01. You can load this into Matlab as a matrix ''P'' by cutting and pasting the data into a text file ''P.asc'' and running ''load P.asc'' within Matlab. If you don't believe this polling data, feel free to use something you trust more.
+<code>
+% Composite Presidential election polling numbers
+% from http://fivethirtyeight.blogs.nytimes.com
+% 2012-11-06 1am
+%
+%  O == Obama percentage
+%  R == Romney percentage
+%  M == margin of error
+% EV == electoral votes
+%
+% O    R    M    EV      state
+.8  62.7  3.8   9   %  AL
+.8  59.7  6.0   3   %  AK
+.2  53.0  3.3  11   %  AZ
+.7  59.7  3.8   6   %  AR
+.2  40.5  2.9  55   %  CA
+.9  48.2  3.0   9   %  CO
+.7  42.4  3.3   7   %  CT
+.6  39.7  5.5   3   %  DE
+.1   6.3  3.2   3   %  DC
+.9  49.7  2.7  29   %  FL
+.5  54.1  2.7  16   %  GA
+.5  32.6  3.9   4   %  HA
+.2  66.1  4.4   4   %  ID
+.9  39.5  3.0  20   %  IL
+.3  53.9  3.0  11   %  IN
+.2  47.8  3.2   6   %  IA
+.0  61.0  6.1   6   %  KA
+.4  58.7  4.5   8   %  KY
+.4  59.8  3.5   8   %  LA
+.1  42.7  3.7   4   %  ME
+.0  38.0  3.0  10   %  MD
+.1  39.8  3.7  11   %  MA
+.1  45.8  2.7  16   %  MI
+.8  45.0  2.9  10   %  MN
+.4  60.1  5.3   6   %  MS
+.6  53.6  2.8  10   %  MO
+.3  53.1  3.9   3   %  MT
+.5  58.8  3.3   5   %  NE
+.9  47.2  2.9   6   %  NV
+.5  47.8  3.4   4   %  NH
+.6  43.4  3.3  14   %  NJ
+.2  44.6  3.6   5   %  NM
+.5  36.9  2.8  29   %  NY
+.9  50.5  2.6  15   %  NC
+.1  56.5  3.9   3   %  ND
+.4  47.6  2.7  18   %  OH
+.9  65.8  3.8   7   %  OK
+.7  44.0  3.6   7   %  OR
+.6  46.5  2.6  20   %  PA
+.9  36.3  4.3   4   %  RI
+.3  56.0  4.6   9   %  SC
+.6  56.1  4.2   3   %  SD
+.4  57.7  3.9  11   %  TN
+.3  58.1  3.1  38   %  TX
+.8  70.5  4.1   6   %  UT
+.3  32.5  4.8   3   %  VT
+.8  48.6  2.5  13   %  VA
+.2  42.5  3.5  12   %  WA
+.4  57.4  4.7   5   %  WV
+.5  46.8  2.9  10   %  WI
+.9  67.6  6.0   3   %  WY
+</code>
