====== Differences ====== This shows you the differences between two versions of the page.

gibson:teaching:fall-2012:math445:lab8 [2012/11/01 09:40]
gibson created
gibson:teaching:fall-2012:math445:lab8 [2012/11/06 06:24] (current)
gibson [Math 445 Lab 8: Presidential election]
Line 1: Line 1:
-====== Math 445 Lab 8; Presidential election ======
 +====== Math 445 Lab 8: Presidential election ======
  
-Your job is to predict the outcome of today's Presidential election, given the last-minute polling data.
 +Your job is to predict the outcome of today's Presidential election
 +given the last-minute polling data, using Monte Carlo simulation.
  
-Specifically, given a list of states, their electoral votes, the composite polling percentages for each
-candidate, and the margins of error those polling percentages, you are to run a large number (''Nelections'')
-of simulations of the election. For each state, award each of the two candidates the specified composite
-polling percent plus
-a random number in the range between -margin and +margin. Then award that state's electoral votes to the candidate
-with the larger percentage of votes. Add up all the electoral votes for each candidate, and award the nth election
  
 +Specifically,​ given a list of states, their electoral votes, the composite ​
 +polling percentages for each candidate, and the margins of error of those
 +polling percentages,​ you are to run a large number of simulations of the 
 +election and determine the likelihood that either candidate will win based 
 +on the results of those simulations. For each state, start by assigning the 
 +specified composite polling percentages to the two candidates. Then add to
 +each candidate'​s percentage a different random number in the range between ​
 +''​-margin''​ and ''​+margin''​. Compare the two percentages and award that state'​s ​
 +electoral votes to the candidate with the larger percentage of votes. Do this 
 +for all fifty states (plus DC), add up all the electoral votes for each candidate, ​
 +and award the ''​n''​th election to the candidate with the majority of electoral votes. ​
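
The procedure for a single simulated election might be sketched as follows. This is only a minimal illustration, not the expected solution: it uses three rows copied from the Data table (FL, OH, VA) rather than all 51, and the variable names (''O'', ''R'', ''M'', ''EV'', ''Osim'', ''Rsim'') are assumptions chosen for this example.

```matlab
% Three rows copied from the Data table (FL, OH, VA); columns are O, R, M, EV
P = [49.9 49.7 2.7 29;
     51.4 47.6 2.7 18;
     50.8 48.6 2.5 13];

O  = P(:,1);   % Obama composite percentage per state
R  = P(:,2);   % Romney composite percentage per state
M  = P(:,3);   % margin of error per state
EV = P(:,4);   % electoral votes per state
Nstates = length(EV);

% A different uniform random number in [-M, +M] for each candidate and state
Osim = O + M .* (2*rand(Nstates,1) - 1);
Rsim = R + M .* (2*rand(Nstates,1) - 1);

% Award each state's electoral votes to the candidate with the larger share
% (an exact tie in percentages has probability zero; here it would go to Romney)
Owins = Osim > Rsim;
EVobama  = sum(EV(Owins));
EVromney = sum(EV(~Owins));
```

Since ''rand'' returns values in (0,1), the expression ''2*rand - 1'' lies in (-1,1), and multiplying by ''M'' scales it to the margin of error for each state.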
  
 +Run a large number of such simulated elections, keeping track of the number of 
 +electoral votes for each candidate in each election. Make a histogram that 
 +shows the statistical distribution of total electoral votes for one of the 
 +candidates, using bins of width 10 between 0 and 540 (0-9.99 for bin 1, 10-19.99 for
 +bin 2, etc). If you can figure out how, color the bins corresponding to Romney ​
 +wins red and the bins corresponding to Obama wins blue, or else just draw a vertical ​
 +line at the magic number of 270 electoral votes needed to win the election outright.
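
One possible way to build such a histogram is sketched below. The vector ''EVobama'' here is stand-in data invented for the illustration (not real simulation output), and the choice of bin centers at 4.5, 14.5, ... is an assumption that relies on electoral vote totals being integers, so that each bin holds exactly the values 0-9, 10-19, and so on.

```matlab
% Stand-in data (hypothetical): replace with your own simulated Obama totals
Nelections = 1000;
EVobama = round(300 + 25*randn(Nelections,1));
EVobama = min(max(EVobama, 0), 538);      % keep totals in the valid range

% Integer totals, so centers at 4.5, 14.5, ... give bins 0-9, 10-19, ..., 530-539
centers = 4.5:10:534.5;
counts  = hist(EVobama, centers);

% Bins from 270-279 upward are Obama wins; lower bins are Romney wins
% (the 260-269 bin also contains the 269-269 tie)
obamaBins = centers > 270;

h = bar(centers(~obamaBins), counts(~obamaBins), 1);
set(h, 'FaceColor', 'r');
hold on
h = bar(centers(obamaBins), counts(obamaBins), 1);
set(h, 'FaceColor', 'b');
plot([270 270], [0 max(counts)], 'k--');  % 270 electoral votes needed to win
hold off
xlabel('Obama electoral votes');
ylabel('number of simulated elections');
title('Distribution of simulated electoral vote totals');
```

Calling ''hist'' with an output argument returns the bin counts instead of plotting, which is what makes the replotting with ''bar'' possible.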
  
 +===== Questions =====
 +
 +Then answer the following questions:
 +
 +  - Who is most likely to win the presidential election?
 +  - What is the probability that the most likely winner will actually win?
 +  - What is the most likely range of electoral votes for the winner? (among the bins of width 10 specified above)
 +  - What is the likelihood of a 269-269 electoral vote tie? 
 +
 +Turn in print-outs of your code, your histogram, and your answers to the above questions.
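
Once the electoral vote totals from all simulated elections are stored in a vector, the questions above reduce to counting. The sketch below uses a tiny made-up vector of ten totals purely to show the bookkeeping; the names ''pObama'', ''pRomney'', and ''pTie'' are assumptions of this example, and a real run would use the totals from your own simulation.

```matlab
% Made-up Obama totals from ten hypothetical simulated elections
EVobama = [303 290 269 332 265 281 303 309 294 275];

% Fraction of elections in each outcome (538 electoral votes in total)
pObama  = mean(EVobama >= 270);   % Obama wins outright
pTie    = mean(EVobama == 269);   % 269-269 tie
pRomney = mean(EVobama <  269);   % Romney wins outright

% Most likely bin of width 10, using bins 0-9, 10-19, ..., 530-539
centers = 4.5:10:534.5;
counts  = hist(EVobama, centers);
[~, k]  = max(counts);
lo = 10*(k-1);                    % lower end of the most populated bin
hi = lo + 9;                      % upper end of the most populated bin

fprintf('P(Obama) = %.2f, P(Romney) = %.2f, P(tie) = %.2f\n', pObama, pRomney, pTie);
fprintf('Most likely range for Obama: %d-%d electoral votes\n', lo, hi);
```

Taking the ''mean'' of a logical vector gives the fraction of entries that are true, which is the Monte Carlo estimate of the corresponding probability.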
 +
 +
 +===== Tips =====
 +
 +
 +  * Start with a small number of simulated elections (say 100) and then increase to a large number (say 10,000) when you're confident your code is working correctly. ​
 +  * You can also develop your code using simulated data, for example, just ten states all with the same polling numbers and a very small margin of error.
 +  * Try to use as few for-loops as possible. If you are really on fire, you can do it with just one for-loop that loops over the number of trials. ​
 +  * Changing the colors of histogram bins in Matlab is not as easy as one might hope. You'll need to take data returned from the **hist** function and replot it with the **bar** command. See http://​www.mathworks.com/​matlabcentral/​newsreader/​view_thread/​290534 for an example of how to do this.
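
The first three tips can be combined into a small test harness, sketched here under stated assumptions: ten synthetic states with identical polling numbers, a margin of error small enough that the outcome is certain, and a single for-loop over trials. All numbers in this block are invented test data, not polling data.

```matlab
% Synthetic test data: ten identical states, Obama ahead by 2 points
Nstates = 10;
O  = 51.0 * ones(Nstates,1);
R  = 49.0 * ones(Nstates,1);
M  =  0.5 * ones(Nstates,1);   % margin too small to flip any state
EV = 10   * ones(Nstates,1);

Nelections = 100;              % start small, increase once the code works
EVobama = zeros(Nelections,1);

% One loop over trials; all states are handled at once with vector operations
for n = 1:Nelections
    Osim = O + M .* (2*rand(Nstates,1) - 1);
    Rsim = R + M .* (2*rand(Nstates,1) - 1);
    EVobama(n) = sum(EV(Osim > Rsim));
end
EVromney = sum(EV) - EVobama;
```

With these inputs Obama's simulated percentage never drops below 50.5 and Romney's never rises above 49.5, so every entry of ''EVobama'' should equal 100. Any other result points to a bug.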
 +
 +===== Broader questions =====
 +
 +Some further questions you might also address:
 +
 +  * The margins of error reported in the table are really 95% confidence intervals, corresponding to about two standard deviations of a Gaussian distribution. Modify your code so that the random number added to each percentage is drawn from a Gaussian distribution with a standard deviation of one-half the margin of error. Does this significantly change your results?
 +  * Does doubling or halving the margins of error significantly change your results?
 +  * How many elections do you need to simulate in order to get reliable answers?
 +  * The lab as written assumes a two-party presidential election. Should we include third-party candidates? Why or why not? How would you revise your code to include a third party? Would it change the results significantly?​
 +  * We are trusting that the polling data form an accurate estimate of the actual votes cast, to within the margins of error. The data reported below were obtained from [[http://fivethirtyeight.blogs.nytimes.com/]] and are claimed by their compiler to be an unbiased and statistically reliable estimate, though there is a fair amount of controversy about this, split along party and ideological lines. Do you think the given polling data are fair and accurate? Is there a reason to suspect they are or are not?
 +  * Do you believe your own election prediction? Why or why not?
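
For the first of these questions, the change from uniform to Gaussian noise amounts to swapping ''rand'' for ''randn'' and rescaling. The sketch below compares the two noise models on a single state, using Florida's row from the Data table; restricting to one state and the names ''pUniform'' and ''pGauss'' are choices made only for this illustration.

```matlab
% Florida's row from the Data table: O, R, M
O = 49.9;  R = 49.7;  M = 2.7;
Nelections = 10000;

% Uniform noise in [-M, +M], as in the basic lab
Ou = O + M * (2*rand(Nelections,1) - 1);
Ru = R + M * (2*rand(Nelections,1) - 1);
pUniform = mean(Ou > Ru);

% Gaussian noise with standard deviation M/2, so that M is two standard deviations
Og = O + (M/2) * randn(Nelections,1);
Rg = R + (M/2) * randn(Nelections,1);
pGauss = mean(Og > Rg);

fprintf('P(Obama wins FL): uniform %.3f, Gaussian %.3f\n', pUniform, pGauss);
```

Rerunning with different values of ''Nelections'' gives a rough feel for how much the estimates fluctuate from run to run, which bears on the question of how many elections are needed.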
 +
 +
 +Relevant Matlab commands: **rand**, **randn**, **sum**, **hist**, and **bar**, plus standard plotting commands such as **xlabel**, **ylabel**, **title**.
 +
 +===== Background =====
 +
 +Nate Silver, a sports statistician,​ pioneered the use of Monte Carlo methods ​
 +in election prediction during the 2008 elections ([[http://fivethirtyeight.blogs.nytimes.com/]], [[http://en.wikipedia.org/wiki/FiveThirtyEight]]). In the 2008 elections, his model predicted 49 of 50 states correctly for the Presidential race (missing Indiana, which went to Obama by 1%) and all 35 Senate races correctly. Note that this lab does not cover the subtlest and most difficult aspect of election prediction: producing good composite poll numbers and margins of error from large numbers of pollsters using different methods, sample sizes, and polling dates. There is quite a bit of controversy in the current election over Mr. Silver's methods and his assessment that Obama has a 91% chance of winning the election. See, for example,
 +
 +  * [[http://cosmiclog.nbcnews.com/_news/2012/10/30/14809227-political-forecasts-stir-up-a-storm?lite]]
 +  * [[http://​www.dailykos.com/​story/​2012/​11/​01/​1153661/​-Nate-Silver-s-Math-Based-Math]]
 +  * [[http://2012.talkingpointsmemo.com/2012/11/nate-silver-colbert-report-pundits.php?ref=fpnewsfeed|Nate Silver on the Colbert Report]]
 +  * A Google search for "Nate Silver controversy"
 +
 +===== Data =====
 +
 +Here's some current polling data, taken from [[http://​fivethirtyeight.blogs.nytimes.com]] on 2012-11-01. You can load this into Matlab as a matrix ''​P''​ by cutting and pasting the data into a text file ''​P.asc''​ and running ''​load P.asc''​ within Matlab. If you don't believe this polling data, feel free to use something you trust more. 
 +<​code>​
 +% Composite Presidential election polling numbers
 +% from http://​fivethirtyeight.blogs.nytimes.com
 +% 2012-11-06 1am
 +%
 +%  O == Obama percentage ​
 +%  R == Romney percentage
 +%  M == margin of error
 +% EV == electoral votes
 +%
 +% O    R    M    EV      state
 +36.8  62.7  3.8   ​9 ​  ​% ​ AL
 +38.8  59.7  6.0   ​3 ​  ​% ​ AK
 +46.2  53.0  3.3  11   ​% ​ AZ
 +38.7  59.7  3.8   ​6 ​  ​% ​ AR
 +58.2  40.5  2.9  55   ​% ​ CA
 +50.9  48.2  3.0   ​9 ​  ​% ​ CO
 +56.7  42.4  3.3   ​7 ​  ​% ​ CT
 +59.6  39.7  5.5   ​3 ​  ​% ​ DE
 +93.1   ​6.3 ​ 3.2   ​3 ​  ​% ​ DC  ​
 +49.9  49.7  2.7  29   ​% ​ FL
 +45.5  54.1  2.7  16   ​% ​ GA
 +66.5  32.6  3.9   4   %  HI
 +32.2  66.1  4.4   ​4 ​  ​% ​ ID
 +59.9  39.5  3.0  20   ​% ​ IL
 +45.3  53.9  3.0  11   ​% ​ IN
 +51.2  47.8  3.2   ​6 ​  ​% ​ IA
 +38.0  61.0  6.1   6   %  KS
 +40.4  58.7  4.5   ​8 ​  ​% ​ KY
 +39.4  59.8  3.5   ​8 ​  ​% ​ LA
 +56.1  42.7  3.7   ​4 ​  ​% ​ ME
 +61.0  38.0  3.0  10   ​% ​ MD
 +59.1  39.8  3.7  11   ​% ​ MA
 +53.1  45.8  2.7  16   ​% ​ MI
 +53.8  45.0  2.9  10   ​% ​ MN
 +39.4  60.1  5.3   ​6 ​  ​% ​ MS
 +45.6  53.6  2.8  10   ​% ​ MO
 +45.3  53.1  3.9   ​3 ​  ​% ​ MT
 +40.5  58.8  3.3   ​5 ​  ​% ​ NE
 +51.9  47.2  2.9   ​6 ​  ​% ​ NV
 +51.5  47.8  3.4   ​4 ​  ​% ​ NH
 +55.6  43.4  3.3  14   ​% ​ NJ
 +54.2  44.6  3.6   ​5 ​  ​% ​ NM
 +62.5  36.9  2.8  29   ​% ​ NY
 +48.9  50.5  2.6  15   ​% ​ NC  ​
 +42.1  56.5  3.9   ​3 ​  ​% ​ ND
 +51.4  47.6  2.7  18   ​% ​ OH
 +33.9  65.8  3.8   ​7 ​  ​% ​ OK
 +53.7  44.0  3.6   ​7 ​  ​% ​ OR
 +52.6  46.5  2.6  20   ​% ​ PA
 +61.9  36.3  4.3   ​4 ​  ​% ​ RI
 +43.3  56.0  4.6   ​9 ​  ​% ​ SC
 +42.6  56.1  4.2   ​3 ​  ​% ​ SD
 +41.4  57.7  3.9  11   ​% ​ TN
 +41.3  58.1  3.1  38   ​% ​ TX
 +27.8  70.5  4.1   ​6 ​  ​% ​ UT
 +66.3  32.5  4.8   ​3 ​  ​% ​ VT
 +50.8  48.6  2.5  13   ​% ​ VA
 +56.2  42.5  3.5  12   ​% ​ WA
 +41.4  57.4  4.7   ​5 ​  ​% ​ WV
 +52.5  46.8  2.9  10   ​% ​ WI
 +30.9  67.6  6.0   ​3 ​  ​% ​ WY
 +</​code>​
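
Loading the table and splitting it into columns might look like the sketch below. To stay self-contained it first writes a three-state excerpt of the table to a file named ''P_excerpt.asc'' (a name invented for this example); in the lab you would load your own ''P.asc'' instead. It assumes, as the instructions above imply, that ''load'' skips the ''%'' comments in an ASCII data file.

```matlab
% Write a three-state excerpt of the table (AL, AK, AZ) to a text file
fid = fopen('P_excerpt.asc', 'w');
fprintf(fid, '%% O    R    M    EV      state\n');
fprintf(fid, '36.8  62.7  3.8   9   %%  AL\n');
fprintf(fid, '38.8  59.7  6.0   3   %%  AK\n');
fprintf(fid, '46.2  53.0  3.3  11   %%  AZ\n');
fclose(fid);

% Load the numeric columns; the text after each percent sign is ignored
P = load('P_excerpt.asc');

O  = P(:,1);   % Obama percentage
R  = P(:,2);   % Romney percentage
M  = P(:,3);   % margin of error
EV = P(:,4);   % electoral votes

fprintf('%d states loaded, %d electoral votes in total\n', size(P,1), sum(EV));
delete('P_excerpt.asc');   % clean up the example file
```

For the full table, ''size(P,1)'' should be 51 and ''sum(EV)'' should be 538, which is a quick check that nothing was lost in the cut and paste.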
gibson/teaching/fall-2012/math445/lab8.1351788057.txt.gz · Last modified: 2012/11/01 09:40 by gibson