====== Differences ======
This shows you the differences between two versions of the page.
gibson:teaching:fall-2012:math445:lab8 [2012/11/02 12:06] gibson |
gibson:teaching:fall-2012:math445:lab8 [2012/11/06 06:24] (current) gibson [Math 445 Lab 8: Presidential election] |
||
---|---|---|---|
Line 20: | Line 20: | ||
electoral votes for each candidate in each election. Make a histogram that | electoral votes for each candidate in each election. Make a histogram that | ||
shows the statistical distribution of total electoral votes for one of the | shows the statistical distribution of total electoral votes for one of the | ||
- | candidates, using bins of width 5 between 0 and 539 (0-4 for bin 1, 5-9 for | + | candidates, using bins of width 10 between 0 and 540 (0-9.99 for bin 1, 10-19.99 for |
- | bin 2, etc). Color the bins corresponding to Romney wins red and the bins | + | bin 2, etc). If you can figure out how, color the bins corresponding to Romney |
- | corresponding to Obama wins blue. | + | wins red and the bins corresponding to Obama wins blue, or else just draw a vertical |
+ | line at the magic number of 270 electoral votes needed to win the election outright. | ||
+ | |||
+ | ===== Questions ===== | ||
Then answer the following questions | Then answer the following questions | ||
Line 28: | Line 31: | ||
- Who is most likely to win the presidential election? | - Who is most likely to win the presidential election? | ||
- What is the probability that the most likely winner will actually win? | - What is the probability that the most likely winner will actually win? | ||
- | - What is the most likely range of electoral votes for the winner? (among the bins of width 5 specified above) | + | - What is the most likely range of electoral votes for the winner? (among the bins of width 10 specified above) |
- | | + | - What is the likelihood of a 269-269 electoral vote tie? |
Turn in print-outs of your codes, your histogram, and your answers to the above questions. | Turn in print-outs of your codes, your histogram, and your answers to the above questions. | ||
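Once a simulation has produced one electoral-vote total per trial, the questions above reduce to counting. The following is a minimal sketch only: the vector ''EVobama'' is a hypothetical name, and its ten values are made up for illustration rather than taken from any real simulation.

```matlab
% Made-up electoral-vote totals for Obama in ten trials (illustration only).
EVobama = [303 281 269 255 310 294 269 332 300 307]';

pObama  = mean(EVobama >= 270);   % fraction of trials Obama wins outright
pRomney = mean(EVobama <= 268);   % Romney wins when Obama has 268 or fewer
pTie    = mean(EVobama == 269);   % fraction of trials ending in a 269-269 tie

% Most common width-10 bin: bin k holds totals 10*k through 10*k+9.
k = mode(floor(EVobama/10));

fprintf('P(Obama) = %.2f, P(Romney) = %.2f, P(tie) = %.2f\n', pObama, pRomney, pTie);
fprintf('Most common bin: %d to %d electoral votes\n', 10*k, 10*k+9);
```

With real simulation output the same lines apply unchanged; only the contents of ''EVobama'' differ. The bin here is computed from Obama's totals, so it answers the "most likely range" question directly only when Obama is the likely winner.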
- | Tips: | ||
- | * Start with a small number of simulated elections, say 100, and then increase to a large number (10,000 or 100,000) when you're confident your code is working correctly. | + | ===== Tips ===== |
- | * You can also develop your code using simulated data, for example, just ten states with 50-50 chances for each candidate and zero or a very small margin of error. | + | |
+ | |||
+ | * Start with a small number of simulated elections (say 100) and then increase to a large number (say 10,000) when you're confident your code is working correctly. | ||
+ | * You can also develop your code using simulated data, for example, just ten states all with the same polling numbers and a very small margin of error. | ||
+ | * Try to use as few for-loops as possible. If you are really on fire, you can do it with just one for-loop that loops over the number of trials. | ||
+ | * Changing the colors of histogram bins in Matlab is not as easy as one might hope. You'll need to take data returned from the **hist** function and replot it with the **bar** command. See http://www.mathworks.com/matlabcentral/newsreader/view_thread/290534 for an example of how to do this. | ||
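The tips above can be combined into one short script. This is a sketch under stated assumptions, not a reference solution: the ten-state data set is made up, the variable names are hypothetical, and the noise model (each state's lead perturbed by a normal random number with standard deviation M) is an assumption about how the margin of error is meant to be used.

```matlab
% Toy data, NOT the lab's polling numbers: ten made-up states that all have
% the same poll numbers and a small margin of error, 19 electoral votes total.
% Columns: O (Obama %), R (Romney %), M (margin of error), EV (electoral votes).
EV = [1 1 1 2 2 2 2 2 3 3]';
P  = [50.2*ones(10,1), 49.8*ones(10,1), 1.0*ones(10,1), EV];

N       = 1000;               % number of simulated elections
total   = sum(P(:,4));        % electoral votes at stake (19 here)
EVobama = zeros(N,1);         % Obama's electoral votes in each trial

for n = 1:N                   % the only for-loop: one pass per trial
    % Assumed noise model: perturb each state's Obama-minus-Romney lead by a
    % normal random number with standard deviation M.
    lead = P(:,1) - P(:,2) + P(:,3).*randn(size(P,1),1);
    EVobama(n) = sum(P(lead > 0, 4));   % add up the states Obama carries
end

% Bins {0,1}, {2,3}, ..., {18,19}. Half-integer centers keep the integer
% totals away from the bin edges.
centers = 0.5:2:total;
counts  = hist(EVobama, centers);
counts  = counts(:)';         % force a row vector to match centers

% With 19 votes Obama wins with 10 or more, which is exactly the set of bins
% whose center lies above total/2.
obamaWins = centers > total/2;

% Replot the hist output with bar, one color per group of bins.
bar(centers, counts .* obamaWins,    1, 'b');
hold on
bar(centers, counts .* (~obamaWins), 1, 'r');
hold off
xlabel('Obama electoral votes');
ylabel('number of simulated elections');
title('Toy election simulation');
```

For the full data set the same idea applies with centers ''4.5:10:534.5'', which gives bins 0-9, 10-19, and so on; the 270-vote threshold then falls on the left edge of the 270-279 bin, so ''centers > 270'' selects the Obama-win bins.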
+ | |||
+ | ===== Broader questions ===== | ||
Some further questions you might also address | Some further questions you might also address | ||
Line 43: | Line 53: | ||
* How many elections do you need to simulate in order to get reliable answers? | * How many elections do you need to simulate in order to get reliable answers? | ||
* The lab as written assumes a two-party presidential election. Should we include third-party candidates? Why or why not? How would you revise your code to include a third party? Would it change the results significantly? | * The lab as written assumes a two-party presidential election. Should we include third-party candidates? Why or why not? How would you revise your code to include a third party? Would it change the results significantly? | ||
- | * We are trusting that the polling data form an accurate estimate of the actual votes cast, to within the margins of error. Is this a valid assumption? Why or why not? | + | * We are trusting that the polling data form an accurate estimate of the actual votes cast, to within the margins of error. The data reported below was obtained from [[http://fivethirtyeight.blogs.nytimes.com/]], and is claimed by its compiler to be an unbiased and statistically reliable estimate, though there is a fair amount of controversy about this, split along party and ideological lines. Do you think the given polling data is fair and accurate? Is there a reason to suspect it is or is not? |
* Do you believe your own election prediction? Why or why not? | * Do you believe your own election prediction? Why or why not? | ||
- | Relevant matlab commands; **rand**, **randn**, **sum**, **hist**, plus standard plotting commands such as **xlabel**, **ylabel**, **title**. | + | Relevant Matlab commands: **rand**, **randn**, **sum**, **hist**, and **bar**, plus standard plotting commands such as **xlabel**, **ylabel**, **title**. |
+ | |||
+ | ===== Background ===== | ||
Nate Silver, a sports statistician, pioneered the use of Monte Carlo methods | Nate Silver, a sports statistician, pioneered the use of Monte Carlo methods | ||
- | in election prediction during the 2008 elections ([[http://fivethirtyeight.blogs.nytimes.com/]], [[http://en.wikipedia.org/wiki/FiveThirtyEight]]). In the 2008 elections, His model predicted 49 of 50 states correctly for the Presidential race (missing Indiana, which went to Obama by 1%) and all 35 Senate races correctly. Note that this lab does not cover the subtlest and most difficult aspect of election prediction: producing good composite poll numbers and margins of error from large numbers of pollsters using different methods, sample sizes, and polling dates. There is quite a bit of controversy in the current election over Mr. Silver's methods and his assessment that Obama has an 80% chance of winning the election. See, for example, [[http://cosmiclog.nbcnews.com/_news/2012/10/30/14809227-political-forecasts-stir-up-a-storm?lite]], | + | in election prediction during the 2008 elections ([[http://fivethirtyeight.blogs.nytimes.com/]], [[http://en.wikipedia.org/wiki/FiveThirtyEight]]). In the 2008 elections, his model predicted 49 of 50 states correctly for the Presidential race (missing Indiana, which went to Obama by 1%) and all 35 Senate races correctly. Note that this lab does not cover the subtlest and most difficult aspect of election prediction: producing good composite poll numbers and margins of error from large numbers of pollsters using different methods, sample sizes, and polling dates. There is quite a bit of controversy in the current election over Mr. Silver's methods and his assessment that Obama has a 91% chance of winning the election. See, for example, |
- | [[http://www.dailykos.com/story/2012/11/01/1153661/-Nate-Silver-s-Math-Based-Math]], or google "Nate Silver controversy". | + | |
+ | * [[http://cosmiclog.nbcnews.com/_news/2012/10/30/14809227-political-forecasts-stir-up-a-storm?lite]], | ||
+ | * [[http://www.dailykos.com/story/2012/11/01/1153661/-Nate-Silver-s-Math-Based-Math]] | ||
+ | * [[http://2012.talkingpointsmemo.com/2012/11/nate-silver-colbert-report-pundits.php?ref=fpnewsfeed|Nate Silver on the Colbert Report]] | ||
+ | * google:"Nate Silver controversy"| | ||
+ | ===== Data ===== | ||
Here's some current polling data, taken from [[http://fivethirtyeight.blogs.nytimes.com]] on 2012-11-01. You can load this into Matlab as a matrix ''P'' by cutting and pasting the data into a text file ''P.asc'' and running ''load P.asc'' within Matlab. If you don't believe this polling data, feel free to use something you trust more. | Here's some current polling data, taken from [[http://fivethirtyeight.blogs.nytimes.com]] on 2012-11-01. You can load this into Matlab as a matrix ''P'' by cutting and pasting the data into a text file ''P.asc'' and running ''load P.asc'' within Matlab. If you don't believe this polling data, feel free to use something you trust more. | ||
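The loading step can be tried on a small stand-in file before using the full data. A minimal sketch, assuming a hypothetical three-row file ''Ptoy.asc'' with made-up numbers and placeholder state codes in the same "O R M EV % state" layout:

```matlab
% Write a three-row stand-in file with made-up numbers (NOT the lab's data).
fid = fopen('Ptoy.asc', 'w');
fprintf(fid, '%% O R M EV state\n');
fprintf(fid, '52.0 47.0 3.0 10 %% AA\n');
fprintf(fid, '48.0 51.0 3.5 12 %% BB\n');
fprintf(fid, '50.5 49.0 2.5  8 %% CC\n');
fclose(fid);

% load names the matrix after the file: Ptoy here, P for the lab's P.asc.
% Text after a percent sign is treated as a comment and skipped.
load Ptoy.asc

O  = Ptoy(:,1);   % Obama percentage
R  = Ptoy(:,2);   % Romney percentage
M  = Ptoy(:,3);   % margin of error
EV = Ptoy(:,4);   % electoral votes

fprintf('%d rows loaded, %d electoral votes in total\n', size(Ptoy,1), sum(EV));
```

The same check on the real file is a quick way to catch cut-and-paste mistakes: the full table should load as 51 rows (50 states plus DC) whose electoral votes sum to 538.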
Line 58: | Line 74: | ||
% Composite Presidential election polling numbers | % Composite Presidential election polling numbers | ||
% from http://fivethirtyeight.blogs.nytimes.com | % from http://fivethirtyeight.blogs.nytimes.com | ||
- | % 2012-11-01 | + | % 2012-11-06 1am |
% | % | ||
% O == Obama percentage | % O == Obama percentage | ||
Line 66: | Line 82: | ||
% | % | ||
% O R M EV state | % O R M EV state | ||
- | 36.5 62.9 3.9 9 % AL | + | 36.8 62.7 3.8 9 % AL |
- | 38.1 60.1 6.3 3 % AK | + | 38.8 59.7 6.0 3 % AK |
- | 45.6 53.4 4.1 11 % AZ | + | 46.2 53.0 3.3 11 % AZ |
- | 37.9 60.3 4.1 6 % AR | + | 38.7 59.7 3.8 6 % AR |
- | 58.2 40.5 3.4 55 % CA | + | 58.2 40.5 2.9 55 % CA |
- | 50.1 48.9 3.5 9 % CO | + | 50.9 48.2 3.0 9 % CO |
- | 55.9 43.3 3.7 7 % CT | + | 56.7 42.4 3.3 7 % CT |
- | 59.4 39.8 5.7 3 % DE | + | 59.6 39.7 5.5 3 % DE |
- | 92.7 6.7 3.2 3 % DC | + | 93.1 6.3 3.2 3 % DC |
- | 49.4 50.1 3.2 29 % FL | + | 49.9 49.7 2.7 29 % FL |
- | 44.9 54.6 3.3 16 % GA | + | 45.5 54.1 2.7 16 % GA |
- | 67.0 31.9 4.9 4 % HA | + | 66.5 32.6 3.9 4 % HI |
- | 31.4 66.6 4.7 4 % ID | + | 32.2 66.1 4.4 4 % ID |
- | 59.3 40.1 3.4 20 % IL | + | 59.9 39.5 3.0 20 % IL |
- | 43.8 55.4 3.5 11 % IN | + | 45.3 53.9 3.0 11 % IN |
- | 51.0 48.0 3.8 6 % IA | + | 51.2 47.8 3.2 6 % IA |
- | 37.4 61.5 6.3 6 % KA | + | 38.0 61.0 6.1 6 % KS |
- | 39.9 59.2 4.7 8 % KY | + | 40.4 58.7 4.5 8 % KY |
- | 40.6 58.5 6.1 8 % LA | + | 39.4 59.8 3.5 8 % LA |
- | 56.1 42.5 4.5 4 % ME | + | 56.1 42.7 3.7 4 % ME |
- | 60.4 38.4 3.4 10 % MD | + | 61.0 38.0 3.0 10 % MD |
- | 58.9 39.7 4.4 11 % MA | + | 59.1 39.8 3.7 11 % MA |
- | 52.9 45.7 3.3 16 % MI | + | 53.1 45.8 2.7 16 % MI |
- | 52.8 45.9 3.3 10 % MN | + | 53.8 45.0 2.9 10 % MN |
- | 39.1 60.3 5.4 6 % MS | + | 39.4 60.1 5.3 6 % MS |
- | 45.1 54.1 3.3 10 % MO | + | 45.6 53.6 2.8 10 % MO |
- | 44.1 53.8 4.6 3 % MT | + | 45.3 53.1 3.9 3 % MT |
- | 39.2 60.1 3.6 5 % NE | + | 40.5 58.8 3.3 5 % NE |
- | 51.3 47.8 3.4 6 % NV | + | 51.9 47.2 2.9 6 % NV |
- | 51.0 48.2 4.1 4 % NH | + | 51.5 47.8 3.4 4 % NH |
- | 55.0 44.0 3.7 14 % NJ | + | 55.6 43.4 3.3 14 % NJ |
- | 53.9 45.0 4.0 5 % NM | + | 54.2 44.6 3.6 5 % NM |
- | 62.4 36.9 3.2 29 % NY | + | 62.5 36.9 2.8 29 % NY |
- | 48.4 51.0 3.0 15 % NC | + | 48.9 50.5 2.6 15 % NC |
- | 41.5 57.0 4.3 3 % ND | + | 42.1 56.5 3.9 3 % ND |
- | 50.7 48.1 3.2 18 % OH | + | 51.4 47.6 2.7 18 % OH |
- | 33.5 66.2 4.0 7 % OK | + | 33.9 65.8 3.8 7 % OK |
- | 52.7 44.8 4.0 7 % OR | + | 53.7 44.0 3.6 7 % OR |
- | 52.0 46.9 3.0 20 % PA | + | 52.6 46.5 2.6 20 % PA |
- | 62.2 36.4 4.9 4 % RI | + | 61.9 36.3 4.3 4 % RI |
- | 43.0 56.3 4.7 9 % SC | + | 43.3 56.0 4.6 9 % SC |
- | 41.2 57.5 4.9 3 % SD | + | 42.6 56.1 4.2 3 % SD |
- | 39.8 59.2 4.6 11 % TN | + | 41.4 57.7 3.9 11 % TN |
- | 41.7 57.8 3.6 38 % TX | + | 41.3 58.1 3.1 38 % TX |
- | 26.3 71.6 4.4 6 % UT | + | 27.8 70.5 4.1 6 % UT |
- | 65.6 33.0 5.2 3 % VT | + | 66.3 32.5 4.8 3 % VT |
- | 50.1 49.2 2.9 13 % VA | + | 50.8 48.6 2.5 13 % VA |
- | 55.6 43.1 3.9 12 % WA | + | 56.2 42.5 3.5 12 % WA |
- | 40.6 58.1 5.2 5 % WV | + | 41.4 57.4 4.7 5 % WV |
- | 51.6 47.6 3.4 10 % WI | + | 52.5 46.8 2.9 10 % WI |
- | 30.4 68.0 6.3 3 % WY | + | 30.9 67.6 6.0 3 % WY |
</code> | </code> |