class: center, middle, inverse, title-slide

.title[
# PSY 503: Foundations of Statistics in Psych Science
]
.subtitle[
## Basics of Probability
]
.author[
### Jason Geller, Ph.D. (he/him/his)
]
.institute[
### Princeton University
]
.date[
### 2022-09-26
]

---

# Knowledge Check

<div style='position: relative; padding-bottom: 56.25%; padding-top: 35px; height: 0; overflow: hidden;'><iframe sandbox='allow-scripts allow-same-origin allow-presentation' allowfullscreen='true' allowtransparency='true' frameborder='0' height='315' src='https://www.mentimeter.com/app/presentation/a545260831446a899d989a08b7445b38/34954ecab6bd/embed' style='position: absolute; top: 0; left: 0; width: 100%; height: 100%;' width='420'></iframe></div>

---

# Last Class

- Measurement is hard, but so important

- Make sure you understand different types of reliability:
  - Test-retest
  - Internal
  - Inter-rater

- Make sure you understand different types of validity:
  - Construct
  - Face
  - Convergent
  - Divergent

---

# Today

- What is Probability?

- Different ways of thinking about probability

- Rules of probability

---

# Probability Warm-up

1. What is the probability of drawing the ace of spades from a fair deck of cards?

2. What is the probability of drawing an ace of any suit?

3. You are going to roll a die twice. What is the chance you roll double 1s?

4. What is the chance that a live specimen of the New Jersey Devil will be found?

5. Who is more likely to be a victim of a street robbery, a young man or an old lady?

6. Yesterday the weather forecaster said that there was a 30% chance of rain today, and it rained today. Was she right or wrong?

---

# What is Probability Theory?

<img src="prob.JPG" width="80%" height="20%" style="display: block; margin: auto;" />

---

# What is Probability Theory?
<br>
<br>

Probability is the study of __random processes__

- Probability is used to characterize uncertainty/randomness

<img src="probability-line.svg" width="50%" style="display: block; margin: auto;" />

---

# Random Processes: Intuition

.pull-left[

- Let's flip a fair coin

```r
set.seed(973)

# Simulate x flips of a fair coin and label each flip
coinflips <- function(x) {
  flip <- rbinom(x, 1, 0.5)                    # 0/1 draws, each with probability 0.5
  flip <- ifelse(flip == 1, "Tails", "Heads")  # recode 1 as Tails, 0 as Heads
  return(flip)
}
```

1. Can you tell me what the outcome will be?

2. If we were to flip a fair coin many, many times, would you be able to tell the proportion of times that we would obtain heads?

]

--

.pull-right[
<br>
<br>

- If the answer to the first question is "NO" AND

- The answer to the second question is "YES" THEN

- You are dealing with a random process

]

---

# Definition

<br>
<br>

> Random processes are __mechanisms__ that produce outcomes... from __a world/set of possible__ outcomes... with some degree of __uncertainty__ but with __regularity__.

---

# Probability Terminology

- __Experiment__ or __Trial__:
  - Any activity that produces or observes an outcome

- __Sample space:__ `\(\Omega\)`
  - The set of all possible outcomes

- __Outcome:__ `\(\omega\)`
  - Possible realization of the random process
      - e.g., heads

- __Event:__ `\(A\)`, `\(B\)`, `\(C\)`, etc.
  - A given outcome or set of outcomes

- __Probability__: Proportion of outcomes favoring an event

---

# Examples of Random Processes

- Random assignment of `\(N\)` individuals to an experimental condition

--

<br>

- Random draw of a sample of `\(n\)` individuals from a population of `\(N\)` individuals

--

<br>

- Rolling a die

---

# Illustration: Random Assignment

- We randomly assign an individual to Treatment (T) vs. Control (C)

  - Sample space?

--

- We could express `\(\Omega\)` in the following ways:

  - `\(\Omega = \{\mathrm{Treatment, \: Control}\}\)`
  - `\(\Omega = \{\mathrm{T, \: C}\}\)`

--

- What if we assigned two individuals to Treatment (T) vs.
Control (C)

--

  - `\(\Omega = \{\mathrm{TT, \:TC, \:CT, \:CC}\}\)`

---

# Events

- An _event_ is a subset of the sample space `\(\Omega\)` and corresponds to the realization of one or more outcomes `\(\omega\)`

- Let `\(\Omega = \{\mathrm{TT, \:TC, \:CT, \:CC}\}\)`

- We could let `\(A\)` be the __event__ that both individuals are assigned to the same experimental condition

- We could write:
  - `\(A = \{TT, \: CC\}\)`

- Another example?

---

# Notations

<img src="images/prob_notation.png" width="100%" style="display: block; margin: auto;" />

---

# Practice with Events

- We randomly assign 8 participants to T vs. C

- Possible outcome:

--

  - `\(\omega = \mathrm{TTTTCCTC}\)`

--

- Sample space:

--

  - Set of all possible strings of length 8 of T's and C's

---

# Practice with Events

- Let's __randomly__ generate a possible outcome `\(\omega_j\)` in R

```r
# Draw 8 assignments, each independently from {T, C}
sample(c("T", "C"), size = 8, replace = TRUE)
```

- In the background, does R draw from this sample space?

--

  - NO: Keep in mind that R draws an outcome `\(\omega_j\)` from `\(\Omega = \{T, C\}\)` 8 times in a row with replacement

---

# Probability Warm-up

- What is the probability of drawing the ace of spades from a fair deck of cards?

```r
ace <- 1/52  # one ace of spades among 52 cards
ace
```

```
## [1] 0.01923077
```

- What is the probability of drawing an ace of any suit?

```r
ace <- 4/52  # four aces among 52 cards
ace
```

```
## [1] 0.07692308
```

---

- You are going to roll a die twice. What is the chance you roll double 1s?

```r
dice1s <- 1/6 * 1/6  # two independent rolls, each a 1 with probability 1/6
dice1s
```

```
## [1] 0.02777778
```

---

- What is the chance that a live specimen of the New Jersey Devil will be found?

  - 0%

- Who is more likely to be a victim of a street robbery, a young man or an old lady?

  - old lady

- Yesterday the weather forecaster said that there was a 30% chance of rain today, and it rained today. Was she right or wrong?
  - Depends

---
class: center, inverse
background-image: url("weather.png")

---

# Different Ways of Thinking About Probability

- Classic/Naive

  - **All outcomes are equally likely**

Let `\(A\)` be an event in a finite sample space `\(\Omega\)`. The _naive probability_ of `\(A\)` is

`\begin{equation} P(A) = \frac{|A|}{|\Omega|} \end{equation}`

in which `\(|A|\)` is the number of possible outcomes `\(\omega\)` that satisfy `\(A\)`, and `\(|\Omega|\)` is the total number of possible outcomes `\(\omega\)` within `\(\Omega\)`.

---

# Dice Rolls

<img src="basics_of_probability_theory_files/figure-html/unnamed-chunk-9-1.png" width="100%" height="100%" style="display: block; margin: auto;" />

---

# Wait, why is this naive?

- Requires `\(\Omega\)` to be finite

- Requires each possible outcome `\(\omega\)` to have the same weight

  - This can be misleading!

---

# Wait, why is this naive?

Is the assumption of equal probability realistic?

--

- `\(d_1\)`: Watching a horror movie
- `\(d_0\)`: Watching a neutral movie
- `\(Y\)`: Fear response measured

  - Is there an equal probability of attrition in this study?

--

- Online data collection

---

# Different Ways of Thinking About Probability

- Frequentist view

  - **Past Performance**

  - Relative frequency -> Proportion of times an event occurred out of all occasions it could have occurred

<center>

`\(P(A) = \frac{f}{N}\)`

</center>

- Where `\(f\)` = frequency of the outcome and `\(N\)` = total number of occasions

- Over the long run (many repetitions), what is the probability of event X?

---

# Different Ways of Thinking About Probability

- Empirical probability

  - Should we cross the bridge?
<img src="bridge.png" width="25%" style="display: block; margin: auto;" />

$$
P(\text{death}) = \frac{\text{number of deaths}}{\text{total}}
$$

---

# Coin Flips

<iframe src="https://seeing-theory.brown.edu/basic-probability/index.html" width="100%" height="400px" data-external="1"></iframe>

---

# Globe Toss

---

# Different Ways of Thinking About Probability

- Bayesian (Personal belief)

  - In what realistic setting would we actually perform the same experiment infinitely many times?

  - Many probability questions concern the outcome of a single trial rather than hypothetical repeated trials, and decision makers with the same information may differ

---
class: center, inverse
background-image: url("fightclub.jpeg")

# Rules of Probability

---

# Probability Rules

- Probabilities take values between 0 and 1 (inclusive)

- For some event `\(A\)`:

`$$0 \leq P(A) \leq 1$$`

- Probability cannot be negative

- Probability cannot be greater than 1

---

# Probability Rule #2

- Since `\(\Omega\)` is the entire sample space,

`$$P(\Omega) = 1$$`

- e.g., if everyone belongs to exactly one of three political parties, then P(R) + P(D) + P(I) = 1

---

# Probability Rule #3 (Subtraction)

.pull-left[

- Complement

  - By definition

$$ P(A) + P(A^c) = 1$$

  - This implies

$$ P(A^c) = 1 - P(A)$$

]

.pull-right[

<img src="complement-venndiagram.svg" width="100%" style="display: block; margin: auto;" />

]

---

# Probability Rule #4 (Addition)

- Addition Rule: If A and B are two events in a probability experiment, then the probability that at least one of the events will occur is:

.pull-left[

- Mutually Exclusive

P(A or B) = P(A) + P(B)

<img src="mutually-exclusive-venndiagram.svg" width="50%" style="display: block; margin: auto;" />

]

.pull-right[

- Non-Mutually Exclusive

P(A or B) = P(A) + P(B) - P(A and B)

<img src="addition-rule-independent-venndiagram.webp" width="50%" style="display: block; margin: auto;" />

]

---

# Practice

<template id="e00cf577-dc58-470c-8798-74f35af411bf"><style> .tabwid table{ border-spacing:0px
!important; border-collapse:collapse; line-height:1; margin-left:auto; margin-right:auto; border-width: 0; display: table; margin-top: 1.275em; margin-bottom: 1.275em; border-color: transparent; } .tabwid_left table{ margin-left:0; } .tabwid_right table{ margin-right:0; } .tabwid td { padding: 0; } .tabwid a { text-decoration: none; } .tabwid thead { background-color: transparent; } .tabwid tfoot { background-color: transparent; } .tabwid table tr { background-color: transparent; } .katex-display { margin: 0 0 !important; } </style><div class="tabwid"><style>.cl-916dba28{}.cl-9169a046{font-family:'Helvetica';font-size:11pt;font-weight:normal;font-style:normal;text-decoration:none;color:rgba(0, 0, 0, 1.00);background-color:transparent;}.cl-9169b216{margin:0;text-align:left;border-bottom: 0 solid rgba(0, 0, 0, 1.00);border-top: 0 solid rgba(0, 0, 0, 1.00);border-left: 0 solid rgba(0, 0, 0, 1.00);border-right: 0 solid rgba(0, 0, 0, 1.00);padding-bottom:5pt;padding-top:5pt;padding-left:5pt;padding-right:5pt;line-height: 1;background-color:transparent;}.cl-9169b220{margin:0;text-align:right;border-bottom: 0 solid rgba(0, 0, 0, 1.00);border-top: 0 solid rgba(0, 0, 0, 1.00);border-left: 0 solid rgba(0, 0, 0, 1.00);border-right: 0 solid rgba(0, 0, 0, 1.00);padding-bottom:5pt;padding-top:5pt;padding-left:5pt;padding-right:5pt;line-height: 1;background-color:transparent;}.cl-9169d7b4{width:54pt;background-color:transparent;vertical-align: middle;border-bottom: 0 solid rgba(0, 0, 0, 1.00);border-top: 0 solid rgba(0, 0, 0, 1.00);border-left: 0 solid rgba(0, 0, 0, 1.00);border-right: 0 solid rgba(0, 0, 0, 1.00);margin-bottom:0;margin-top:0;margin-left:0;margin-right:0;}.cl-9169d7be{width:54pt;background-color:transparent;vertical-align: middle;border-bottom: 0 solid rgba(0, 0, 0, 1.00);border-top: 0 solid rgba(0, 0, 0, 1.00);border-left: 0 solid rgba(0, 0, 0, 1.00);border-right: 0 solid rgba(0, 0, 0, 
1.00);margin-bottom:0;margin-top:0;margin-left:0;margin-right:0;}.cl-9169d7c8{width:54pt;background-color:transparent;vertical-align: middle;border-bottom: 2pt solid rgba(102, 102, 102, 1.00);border-top: 0 solid rgba(0, 0, 0, 1.00);border-left: 0 solid rgba(0, 0, 0, 1.00);border-right: 0 solid rgba(0, 0, 0, 1.00);margin-bottom:0;margin-top:0;margin-left:0;margin-right:0;}.cl-9169d7c9{width:54pt;background-color:transparent;vertical-align: middle;border-bottom: 2pt solid rgba(102, 102, 102, 1.00);border-top: 0 solid rgba(0, 0, 0, 1.00);border-left: 0 solid rgba(0, 0, 0, 1.00);border-right: 0 solid rgba(0, 0, 0, 1.00);margin-bottom:0;margin-top:0;margin-left:0;margin-right:0;}.cl-9169d7d2{width:54pt;background-color:transparent;vertical-align: middle;border-bottom: 2pt solid rgba(102, 102, 102, 1.00);border-top: 2pt solid rgba(102, 102, 102, 1.00);border-left: 0 solid rgba(0, 0, 0, 1.00);border-right: 0 solid rgba(0, 0, 0, 1.00);margin-bottom:0;margin-top:0;margin-left:0;margin-right:0;}.cl-9169d7d3{width:54pt;background-color:transparent;vertical-align: middle;border-bottom: 2pt solid rgba(102, 102, 102, 1.00);border-top: 2pt solid rgba(102, 102, 102, 1.00);border-left: 0 solid rgba(0, 0, 0, 1.00);border-right: 0 solid rgba(0, 0, 0, 1.00);margin-bottom:0;margin-top:0;margin-left:0;margin-right:0;}</style><table class='cl-916dba28'><thead><tr style="overflow-wrap:break-word;"><td class="cl-9169d7d2"><p class="cl-9169b216"><span class="cl-9169a046">Color</span></p></td><td class="cl-9169d7d3"><p class="cl-9169b220"><span class="cl-9169a046">Count</span></p></td></tr></thead><tbody><tr style="overflow-wrap:break-word;"><td class="cl-9169d7b4"><p class="cl-9169b216"><span class="cl-9169a046">Brown</span></p></td><td class="cl-9169d7be"><p class="cl-9169b220"><span class="cl-9169a046">13</span></p></td></tr><tr style="overflow-wrap:break-word;"><td class="cl-9169d7b4"><p class="cl-9169b216"><span class="cl-9169a046">Red</span></p></td><td class="cl-9169d7be"><p 
class="cl-9169b220"><span class="cl-9169a046">13</span></p></td></tr><tr style="overflow-wrap:break-word;"><td class="cl-9169d7b4"><p class="cl-9169b216"><span class="cl-9169a046">Yellow</span></p></td><td class="cl-9169d7be"><p class="cl-9169b220"><span class="cl-9169a046">14</span></p></td></tr><tr style="overflow-wrap:break-word;"><td class="cl-9169d7b4"><p class="cl-9169b216"><span class="cl-9169a046">Green</span></p></td><td class="cl-9169d7be"><p class="cl-9169b220"><span class="cl-9169a046">16</span></p></td></tr><tr style="overflow-wrap:break-word;"><td class="cl-9169d7b4"><p class="cl-9169b216"><span class="cl-9169a046">Orange</span></p></td><td class="cl-9169d7be"><p class="cl-9169b220"><span class="cl-9169a046">20</span></p></td></tr><tr style="overflow-wrap:break-word;"><td class="cl-9169d7c8"><p class="cl-9169b216"><span class="cl-9169a046">Blue</span></p></td><td class="cl-9169d7c9"><p class="cl-9169b220"><span class="cl-9169a046">24</span></p></td></tr></tbody></table></div></template> <div class="flextable-shadow-host" id="2d1d94dc-29d2-4686-bd6f-3256e0044abf"></div> <script> var dest = document.getElementById("2d1d94dc-29d2-4686-bd6f-3256e0044abf"); var template = document.getElementById("e00cf577-dc58-470c-8798-74f35af411bf"); var caption = template.content.querySelector("caption"); if(caption) { caption.style.cssText = "display:block;text-align:center;"; var newcapt = document.createElement("p"); newcapt.appendChild(caption) dest.parentNode.insertBefore(newcapt, dest.previousSibling); } var fantome = dest.attachShadow({mode: 'open'}); var templateContent = template.content; fantome.appendChild(templateContent); </script> *p*(blue or green) --- <template id="faf91536-13e6-46c8-9751-4f84c07f5555"><style> .tabwid table{ border-spacing:0px !important; border-collapse:collapse; line-height:1; margin-left:auto; margin-right:auto; border-width: 0; display: table; margin-top: 1.275em; margin-bottom: 1.275em; border-color: transparent; } .tabwid_left 
table{ margin-left:0; } .tabwid_right table{ margin-right:0; } .tabwid td { padding: 0; } .tabwid a { text-decoration: none; } .tabwid thead { background-color: transparent; } .tabwid tfoot { background-color: transparent; } .tabwid table tr { background-color: transparent; } .katex-display { margin: 0 0 !important; } </style><div class="tabwid"><style>.cl-9177d06c{}.cl-91747cb4{font-family:'Helvetica';font-size:11pt;font-weight:normal;font-style:normal;text-decoration:none;color:rgba(0, 0, 0, 1.00);background-color:transparent;}.cl-9174865a{margin:0;text-align:left;border-bottom: 0 solid rgba(0, 0, 0, 1.00);border-top: 0 solid rgba(0, 0, 0, 1.00);border-left: 0 solid rgba(0, 0, 0, 1.00);border-right: 0 solid rgba(0, 0, 0, 1.00);padding-bottom:5pt;padding-top:5pt;padding-left:5pt;padding-right:5pt;line-height: 1;background-color:transparent;}.cl-9174865b{margin:0;text-align:right;border-bottom: 0 solid rgba(0, 0, 0, 1.00);border-top: 0 solid rgba(0, 0, 0, 1.00);border-left: 0 solid rgba(0, 0, 0, 1.00);border-right: 0 solid rgba(0, 0, 0, 1.00);padding-bottom:5pt;padding-top:5pt;padding-left:5pt;padding-right:5pt;line-height: 1;background-color:transparent;}.cl-9174a55e{width:54pt;background-color:transparent;vertical-align: middle;border-bottom: 0 solid rgba(0, 0, 0, 1.00);border-top: 0 solid rgba(0, 0, 0, 1.00);border-left: 0 solid rgba(0, 0, 0, 1.00);border-right: 0 solid rgba(0, 0, 0, 1.00);margin-bottom:0;margin-top:0;margin-left:0;margin-right:0;}.cl-9174a568{width:54pt;background-color:transparent;vertical-align: middle;border-bottom: 0 solid rgba(0, 0, 0, 1.00);border-top: 0 solid rgba(0, 0, 0, 1.00);border-left: 0 solid rgba(0, 0, 0, 1.00);border-right: 0 solid rgba(0, 0, 0, 1.00);margin-bottom:0;margin-top:0;margin-left:0;margin-right:0;}.cl-9174a569{width:54pt;background-color:transparent;vertical-align: middle;border-bottom: 2pt solid rgba(102, 102, 102, 1.00);border-top: 0 solid rgba(0, 0, 0, 1.00);border-left: 0 solid rgba(0, 0, 0, 1.00);border-right: 0 
solid rgba(0, 0, 0, 1.00);margin-bottom:0;margin-top:0;margin-left:0;margin-right:0;}.cl-9174a572{width:54pt;background-color:transparent;vertical-align: middle;border-bottom: 2pt solid rgba(102, 102, 102, 1.00);border-top: 0 solid rgba(0, 0, 0, 1.00);border-left: 0 solid rgba(0, 0, 0, 1.00);border-right: 0 solid rgba(0, 0, 0, 1.00);margin-bottom:0;margin-top:0;margin-left:0;margin-right:0;}.cl-9174a573{width:54pt;background-color:transparent;vertical-align: middle;border-bottom: 2pt solid rgba(102, 102, 102, 1.00);border-top: 2pt solid rgba(102, 102, 102, 1.00);border-left: 0 solid rgba(0, 0, 0, 1.00);border-right: 0 solid rgba(0, 0, 0, 1.00);margin-bottom:0;margin-top:0;margin-left:0;margin-right:0;}.cl-9174a574{width:54pt;background-color:transparent;vertical-align: middle;border-bottom: 2pt solid rgba(102, 102, 102, 1.00);border-top: 2pt solid rgba(102, 102, 102, 1.00);border-left: 0 solid rgba(0, 0, 0, 1.00);border-right: 0 solid rgba(0, 0, 0, 1.00);margin-bottom:0;margin-top:0;margin-left:0;margin-right:0;}</style><table class='cl-9177d06c'><thead><tr style="overflow-wrap:break-word;"><td class="cl-9174a573"><p class="cl-9174865a"><span class="cl-91747cb4">Color</span></p></td><td class="cl-9174a574"><p class="cl-9174865b"><span class="cl-91747cb4">Count</span></p></td></tr></thead><tbody><tr style="overflow-wrap:break-word;"><td class="cl-9174a55e"><p class="cl-9174865a"><span class="cl-91747cb4">Brown</span></p></td><td class="cl-9174a568"><p class="cl-9174865b"><span class="cl-91747cb4">13</span></p></td></tr><tr style="overflow-wrap:break-word;"><td class="cl-9174a55e"><p class="cl-9174865a"><span class="cl-91747cb4">Red</span></p></td><td class="cl-9174a568"><p class="cl-9174865b"><span class="cl-91747cb4">13</span></p></td></tr><tr style="overflow-wrap:break-word;"><td class="cl-9174a55e"><p class="cl-9174865a"><span class="cl-91747cb4">Yellow</span></p></td><td class="cl-9174a568"><p class="cl-9174865b"><span 
class="cl-91747cb4">14</span></p></td></tr><tr style="overflow-wrap:break-word;"><td class="cl-9174a55e"><p class="cl-9174865a"><span class="cl-91747cb4">Green</span></p></td><td class="cl-9174a568"><p class="cl-9174865b"><span class="cl-91747cb4">16</span></p></td></tr><tr style="overflow-wrap:break-word;"><td class="cl-9174a55e"><p class="cl-9174865a"><span class="cl-91747cb4">Orange</span></p></td><td class="cl-9174a568"><p class="cl-9174865b"><span class="cl-91747cb4">20</span></p></td></tr><tr style="overflow-wrap:break-word;"><td class="cl-9174a569"><p class="cl-9174865a"><span class="cl-91747cb4">Blue</span></p></td><td class="cl-9174a572"><p class="cl-9174865b"><span class="cl-91747cb4">24</span></p></td></tr></tbody></table></div></template> <div class="flextable-shadow-host" id="b2e5a541-5276-49a7-99ff-1b2c2c559aab"></div> <script> var dest = document.getElementById("b2e5a541-5276-49a7-99ff-1b2c2c559aab"); var template = document.getElementById("faf91536-13e6-46c8-9751-4f84c07f5555"); var caption = template.content.querySelector("caption"); if(caption) { caption.style.cssText = "display:block;text-align:center;"; var newcapt = document.createElement("p"); newcapt.appendChild(caption) dest.parentNode.insertBefore(newcapt, dest.previousSibling); } var fantome = dest.attachShadow({mode: 'open'}); var templateContent = template.content; fantome.appendChild(templateContent); </script> *p*(blue or green) ``` ## [1] 0.4 ``` --- # Union > The union of two sets encompasses any element that exists in either one or both of them. We can represent this visually as a venn diagram as shown. 
<img src="union-venndiagram.svg" width="50%" style="display: block; margin: auto;" />

---

# Intersection

> The intersection between two sets encompasses any element that exists in BOTH sets and is often written out as `\(A \cap B\)`

<img src="intersection-venndiagram.svg" width="50%" style="display: block; margin: auto;" />

- Joint probability

---

# Multiplication Rule

- The multiplication rule is used to find the probability of two events, *A* and *B*, both occurring.

Dependent:

`\begin{equation} P(A \text{ and } B) = P(A)*P(B|A) \end{equation}`

Independent:

`\begin{equation} P(A \text{ and } B) = P(A)*P(B) \end{equation}`

---

# Independent Events

- `\(A\)` and `\(B\)` are independent if the occurrence of `\(A\)` does not influence the occurrence of `\(B\)`, and if the occurrence of `\(B\)` does not influence the occurrence of `\(A\)`.

If two events `\(A\)` and `\(B\)` are independent, knowing that `\(A\)` occurred does not change the chances that `\(B\)` occurred. We have:

`\begin{equation} P(A|B) = P(A) \end{equation}`

`\begin{equation} P(B|A) = P(B) \end{equation}`

---

# M&Ms

<template id="82d08733-3822-48d4-a365-55857d81c557"><style> .tabwid table{ border-spacing:0px !important; border-collapse:collapse; line-height:1; margin-left:auto; margin-right:auto; border-width: 0; display: table; margin-top: 1.275em; margin-bottom: 1.275em; border-color: transparent; } .tabwid_left table{ margin-left:0; } .tabwid_right table{ margin-right:0; } .tabwid td { padding: 0; } .tabwid a { text-decoration: none; } .tabwid thead { background-color: transparent; } .tabwid tfoot { background-color: transparent; } .tabwid table tr { background-color: transparent; } .katex-display { margin: 0 0 !important; } </style><div class="tabwid"><style>.cl-91837048{}.cl-91801330{font-family:'Helvetica';font-size:11pt;font-weight:normal;font-style:normal;text-decoration:none;color:rgba(0, 0, 0, 1.00);background-color:transparent;}.cl-91801ccc{margin:0;text-align:left;border-bottom: 0 solid rgba(0, 0, 0,
1.00);border-top: 0 solid rgba(0, 0, 0, 1.00);border-left: 0 solid rgba(0, 0, 0, 1.00);border-right: 0 solid rgba(0, 0, 0, 1.00);padding-bottom:5pt;padding-top:5pt;padding-left:5pt;padding-right:5pt;line-height: 1;background-color:transparent;}.cl-91801ccd{margin:0;text-align:right;border-bottom: 0 solid rgba(0, 0, 0, 1.00);border-top: 0 solid rgba(0, 0, 0, 1.00);border-left: 0 solid rgba(0, 0, 0, 1.00);border-right: 0 solid rgba(0, 0, 0, 1.00);padding-bottom:5pt;padding-top:5pt;padding-left:5pt;padding-right:5pt;line-height: 1;background-color:transparent;}.cl-91803b4e{width:54pt;background-color:transparent;vertical-align: middle;border-bottom: 0 solid rgba(0, 0, 0, 1.00);border-top: 0 solid rgba(0, 0, 0, 1.00);border-left: 0 solid rgba(0, 0, 0, 1.00);border-right: 0 solid rgba(0, 0, 0, 1.00);margin-bottom:0;margin-top:0;margin-left:0;margin-right:0;}.cl-91803b4f{width:54pt;background-color:transparent;vertical-align: middle;border-bottom: 0 solid rgba(0, 0, 0, 1.00);border-top: 0 solid rgba(0, 0, 0, 1.00);border-left: 0 solid rgba(0, 0, 0, 1.00);border-right: 0 solid rgba(0, 0, 0, 1.00);margin-bottom:0;margin-top:0;margin-left:0;margin-right:0;}.cl-91803b58{width:54pt;background-color:transparent;vertical-align: middle;border-bottom: 2pt solid rgba(102, 102, 102, 1.00);border-top: 0 solid rgba(0, 0, 0, 1.00);border-left: 0 solid rgba(0, 0, 0, 1.00);border-right: 0 solid rgba(0, 0, 0, 1.00);margin-bottom:0;margin-top:0;margin-left:0;margin-right:0;}.cl-91803b59{width:54pt;background-color:transparent;vertical-align: middle;border-bottom: 2pt solid rgba(102, 102, 102, 1.00);border-top: 0 solid rgba(0, 0, 0, 1.00);border-left: 0 solid rgba(0, 0, 0, 1.00);border-right: 0 solid rgba(0, 0, 0, 1.00);margin-bottom:0;margin-top:0;margin-left:0;margin-right:0;}.cl-91803b62{width:54pt;background-color:transparent;vertical-align: middle;border-bottom: 2pt solid rgba(102, 102, 102, 1.00);border-top: 2pt solid rgba(102, 102, 102, 1.00);border-left: 0 solid rgba(0, 0, 0, 
1.00);border-right: 0 solid rgba(0, 0, 0, 1.00);margin-bottom:0;margin-top:0;margin-left:0;margin-right:0;}.cl-91803b63{width:54pt;background-color:transparent;vertical-align: middle;border-bottom: 2pt solid rgba(102, 102, 102, 1.00);border-top: 2pt solid rgba(102, 102, 102, 1.00);border-left: 0 solid rgba(0, 0, 0, 1.00);border-right: 0 solid rgba(0, 0, 0, 1.00);margin-bottom:0;margin-top:0;margin-left:0;margin-right:0;}</style><table class='cl-91837048'><thead><tr style="overflow-wrap:break-word;"><td class="cl-91803b62"><p class="cl-91801ccc"><span class="cl-91801330">Color</span></p></td><td class="cl-91803b63"><p class="cl-91801ccd"><span class="cl-91801330">Count</span></p></td></tr></thead><tbody><tr style="overflow-wrap:break-word;"><td class="cl-91803b4e"><p class="cl-91801ccc"><span class="cl-91801330">Brown</span></p></td><td class="cl-91803b4f"><p class="cl-91801ccd"><span class="cl-91801330">13</span></p></td></tr><tr style="overflow-wrap:break-word;"><td class="cl-91803b4e"><p class="cl-91801ccc"><span class="cl-91801330">Red</span></p></td><td class="cl-91803b4f"><p class="cl-91801ccd"><span class="cl-91801330">13</span></p></td></tr><tr style="overflow-wrap:break-word;"><td class="cl-91803b4e"><p class="cl-91801ccc"><span class="cl-91801330">Yellow</span></p></td><td class="cl-91803b4f"><p class="cl-91801ccd"><span class="cl-91801330">14</span></p></td></tr><tr style="overflow-wrap:break-word;"><td class="cl-91803b4e"><p class="cl-91801ccc"><span class="cl-91801330">Green</span></p></td><td class="cl-91803b4f"><p class="cl-91801ccd"><span class="cl-91801330">16</span></p></td></tr><tr style="overflow-wrap:break-word;"><td class="cl-91803b4e"><p class="cl-91801ccc"><span class="cl-91801330">Orange</span></p></td><td class="cl-91803b4f"><p class="cl-91801ccd"><span class="cl-91801330">20</span></p></td></tr><tr style="overflow-wrap:break-word;"><td class="cl-91803b58"><p class="cl-91801ccc"><span class="cl-91801330">Blue</span></p></td><td 
class="cl-91803b59"><p class="cl-91801ccd"><span class="cl-91801330">24</span></p></td></tr></tbody></table></div></template> <div class="flextable-shadow-host" id="d373de80-b59b-4698-885d-8cb1b30f0fa8"></div> <script> var dest = document.getElementById("d373de80-b59b-4698-885d-8cb1b30f0fa8"); var template = document.getElementById("82d08733-3822-48d4-a365-55857d81c557"); var caption = template.content.querySelector("caption"); if(caption) { caption.style.cssText = "display:block;text-align:center;"; var newcapt = document.createElement("p"); newcapt.appendChild(caption) dest.parentNode.insertBefore(newcapt, dest.previousSibling); } var fantome = dest.attachShadow({mode: 'open'}); var templateContent = template.content; fantome.appendChild(templateContent); </script> What is the *p*(blue and blue)? ```r 24/100*24/100 ``` ``` ## [1] 0.0576 ``` --- # Knowledge Check <div style='position: relative; padding-bottom: 56.25%; padding-top: 35px; height: 0; overflow: hidden;'><iframe sandbox='allow-scripts allow-same-origin allow-presentation' allowfullscreen='true' allowtransparency='true' frameborder='0' height='315' src='https://www.mentimeter.com/app/presentation/b31b2cd73fb1be1d91a21e800d2acdf0/973e8e360a76/embed' style='position: absolute; top: 0; left: 0; width: 100%; height: 100%;' width='420'></iframe></div> --- # Practice with grant proposal You are about to send a grant proposal to an organization. While you read about the grant, you realize that your grant proposal will be sent to 5 different referees, who can be either social or cognitive psychologists. Imagine that for each grant proposal, the committee flips a coin five times and assigns the proposal to a social psychologist every time the flip returns heads, and to a cognitive psychologist every time the flip returns tails. Assume an infinite pool of social and cognitive psychologists. What are the chances that your grant proposal is assigned to 5 cognitive psychologists? 
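
After attempting the question by hand, one way to sanity-check an answer is to simulate the committee's procedure. The sketch below is only an illustration under the slide's stated assumptions (a fair coin, five independent flips per proposal); the function name `all_cognitive`, the seed, and the number of simulated proposals are arbitrary choices and not part of the original materials.

```r
set.seed(503)  # arbitrary seed so the simulation is reproducible

# Simulate one proposal: TRUE if all five referees are cognitive psychologists.
# Each flip is coded 1 for tails (cognitive) and 0 for heads (social).
all_cognitive <- function(n_referees = 5) {
  flips <- rbinom(n_referees, size = 1, prob = 0.5)
  sum(flips) == n_referees
}

# Repeat for many hypothetical proposals and take the proportion
# in which every referee turned out to be a cognitive psychologist
n_sims <- 100000
sim_prop <- mean(replicate(n_sims, all_cognitive()))
sim_prop
```

With this many simulated proposals, `sim_prop` should land close to the value obtained from the multiplication rule for independent events.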
---

# Practice with grant proposal

Let `\(C_i\)` be the event that the `\(i\)`-th referee assigned to your grant proposal is a cognitive psychologist. Since the five coin flips are independent of each other, we have:

`$$\begin{split} P(C_1 \: \cap C_2\: \cap C_3\: \cap C_4\: \cap C_5) &= P(C_1) \times P(C_2) \times P(C_3) \times P(C_4) \times P(C_5) \\ &= (\frac{1}{2})^5 \\ &= \frac{1}{32} \\ \end{split}$$`

---

# Today

- More fun with probability

  - Conditional probability

  - Bayes' Rule

- Probability and Statistics

  - Probability density function (PDF)

  - Cumulative distribution function (CDF)

- Computing conditional probabilities from data

---

# Conditional Probability

- The likelihood of an event or outcome occurring, given that another event or outcome has occurred

$$P(B|A) = \frac{P(A\: \cap \: B)}{P(A)} $$

- P(B|A) -> Conditional probability

- P(A ∩ B) -> Joint probability

- P(A) -> Marginal probability

---

# Conditional Probability

- __Marginal probability__: Probability of a single event occurring, regardless of other events

- __Joint probability__: Intersection (overlap) of A and B

- __Conditional probability__: Likelihood that an outcome randomly sampled from the subset with A also has B (i.e., conditional is opposed to marginal)

  - We would say "B given A" or "B conditional on A"

---

# Conditional Probability Practice

A math teacher gave her class two tests. 25% of the class passed both tests and 42% of the class passed the first test. What percent of those who passed the first test also passed the second test?

<center>

`\(p(\text{second} \mid \text{first})\)`

</center>

--

- P(A ∩ B): .25

- P(A): .42

`\(p(\text{second} \mid \text{first})\)` = .25 / .42 ≈ .6

---

# Conditional Probability Practice

I just got accepted to graduate school. The acceptance rate is 30%. Not everyone who is accepted gets funding (only 13% of applicants are both accepted and funded). What is the probability I receive funding given that I was accepted?
<center>

`\(p(\text{funding} \mid \text{accepted})\)`

</center>

--

- P(A ∩ B): .13

- P(A): .3

--

<center>

`\(p(\text{funding} \mid \text{accepted})\)` = .13 / .3 ≈ .43

</center>

---

# Bayes' Rule

- Reversing a conditional probability allows us to find `\(P(A|B)\)` if we know `\(P(B|A)\)`:

- Bayes' rule:

`\begin{equation} P(A|B) = \frac{P(B|A)P(A)}{P(B)} \end{equation}`

`\begin{equation} P(A|B) = \frac{P(B|A)*P(A)}{P(B|A)*P(A) + P(B|\neg A)*P(\neg A)} \end{equation}`

---

# Bayes' Rule

- Allows us to update the probability of an event `\(A\)` based on the occurrence of another event `\(B\)`

- `\(P(A)\)` is called the _prior probability_

- `\(P(A|B)\)` is called the _posterior probability_

---

# Monty Hall Problem

<center>

<iframe width="600" height="400" src="https://www.youtube.com/embed/AD6eJlbFa2I" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>

</center>

---

# Monty Hall

<img src="images/monty.png" width="70%" style="display: block; margin: auto;" />

- The winning strategy is to switch, but how is this possible?
- Our intuition tells us our chance of winning the car increases from 1/3 to 1/2 when there are only two doors to choose from

- In reality, our chance of winning the car remains 1/3 if we stick with our original choice, but increases to 2/3 if we switch

---

# Monty Hall Simulations

```r
monty <- function() {
  prize <- sample(x = 1:3, size = 1)   # door hiding the car
  choice <- sample(x = 1:3, size = 1)  # contestant's first pick
  # Door the host opens: neither the contestant's pick nor the prize.
  # Index by position, because sample(x = 3, ...) would draw from 1:3.
  closed <- setdiff(1:3, c(choice, prize))
  opened <- closed[sample.int(length(closed), size = 1)]
  # Switching wins exactly when the first pick missed the prize
  return(ifelse(prize != choice, yes = "Switch", no = "Stick"))
}
monty
```

```
## function() {
##   prize <- sample(x = 1:3, size = 1)   # door hiding the car
##   choice <- sample(x = 1:3, size = 1)  # contestant's first pick
##   # Door the host opens: neither the contestant's pick nor the prize.
##   # Index by position, because sample(x = 3, ...) would draw from 1:3.
##   closed <- setdiff(1:3, c(choice, prize))
##   opened <- closed[sample.int(length(closed), size = 1)]
##   # Switching wins exactly when the first pick missed the prize
##   return(ifelse(prize != choice, yes = "Switch", no = "Stick"))
## }
```

---

```r
# Play the game 100,000 times and record the winning strategy each time
run <- rep(NA_character_, 100000)
for (i in seq_along(run)) {
  run[i] <- monty()
}
prop.table(table(run))  # proportion of games won by each strategy
```

```
## run
##   Stick  Switch 
## 0.33393 0.66607
```

```r
## strategy
##   Stick  Switch 
## 0.33147 0.66853
```

---

# Illustration: Bayes' Rule

Doctors recommend getting a PSA test after age 50 to screen for prostate cancer

- If you tested positive for prostate cancer, what is the chance you actually have it?

  1. 80% of the people who have prostate cancer test positive `\((\text{sensitivity} = P(\text{positive test| disease}))\)`

  2. 70% of the people who do not have cancer test negative `\((\text{specificity} = P(\text{negative test|no disease}))\)`

  3. 5.8% of individuals over 60 have prostate cancer

---

# Illustration: Prostate Cancer

<br>
<br>
<br>

<center>

`\(P(\text{cancer|test}) = \frac{P(\text{test|cancer})*P(\text{cancer})}{P(\text{test|cancer})*P(\text{cancer}) + P(\text{test|}\neg\text{cancer})*P(\neg\text{cancer})}\)`

<br>
<br>

--

= `\(\frac{0.8*0.058}{0.8*0.058 +0.3*0.942 }=\)`

--

0.14

</center>

---

# Class Activity
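Before starting the activity, it may help to see Bayes' rule from the earlier slides wrapped in a small R function. This is only a minimal sketch: `bayes_posterior()` and its argument names are hypothetical and not part of the original slides, and it is checked here against the numbers used in the prostate cancer calculation.

```r
# Hypothetical helper (an assumption, not from the slides):
# posterior P(A | positive test) from the prior P(A),
# the sensitivity P(positive | A) and the specificity P(negative | not A).
bayes_posterior <- function(prior, sensitivity, specificity) {
  true_pos  <- sensitivity * prior              # P(test | A) * P(A)
  false_pos <- (1 - specificity) * (1 - prior)  # P(test | not A) * P(not A)
  true_pos / (true_pos + false_pos)
}

# Numbers from the prostate cancer calculation: 0.8, 0.058, and 1 - 0.3
psa <- bayes_posterior(prior = 0.058, sensitivity = 0.8, specificity = 0.7)
round(psa, 2)
```

The same function can be reused for the activity by swapping in the prior, sensitivity, and specificity given in the prompt.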
Suppose there is a disease outbreak in an enclosed population. It is turning folks into zombies.

- Your friend tested positive. How likely is it that they are a zombie?

1. 99% of the people who are zombies test positive `\((\text{sensitivity} = P(\text{positive test| zombie}))\)`
2. 85% of the people who are not zombies test negative `\((\text{specificity} = P(\text{negative test|not zombie}))\)`
3. 15% of individuals are zombies

---

# Bayes' Rule: Zombie Outbreak

<br> <br> <br>

<center>

`\(P(\text{zombie|test}) = \frac{P(\text{test|zombie})*P(\text{zombie})}{P(\text{test|zombie})*P(\text{zombie}) + P(\text{test|}\neg\text{zombie})*P(\neg\text{zombie})}\)`

--

<br> <br>

= `\(\frac{0.99*0.15}{0.99*0.15 +0.15*0.85 } = 0.54\)`

<center>

---

# Lessons from Bayes' rule

- Based on the results of this test, the probability that your friend actually is a zombie is .54
- That's a 54% chance of being a zombie
- What would you do?
- Bayes' rule often yields counter-intuitive results!
- Importance of base rates

---

# Base-rate neglect

<img src="images/base.png" width="70%" style="display: block; margin: auto;" />

---

class: center middle

# Probability Theory vs. Statistical Inference

---

# Probability Theory

- For any given random phenomenon, probability theory is a set of tools that assume prior knowledge of:
  - The sample space
  - The probability of a set of events defined on that sample space
- Allows you to find the probability of any other possible event from that sample space

---

# Problem

- We usually don't know the probability model
- OK, we could estimate the probability of every outcome in the sample space by observing many, many repetitions
- BUT most random phenomena cannot be repeated again and again
- We generally need to infer the probability of each possible outcome using information on a few realizations of the random phenomenon of interest

---

# Probability and Statistics

- By knowing your population makeup, you have a better idea of the probability of obtaining certain samples.
- Probability links population with samples.
- Inferential statistics rely on this connection when they use sample data as the basis for making conclusions about populations.

<img src="unnamed.png" width="100%" style="display: block; margin: auto;" />

---

# Probability Distributions

.pull-left[

- Probability density function (PDF)
- Indicates the relative likelihood (density) of observing a measurement with a specific value

`$$f(x|\mu,\sigma) = \frac{1}{\sigma\sqrt{2\pi}}e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$`

- e.g., Where does an IQ of 140 lie?

]

.pull-right[

<br> <br>

<img src="basics_of_probability_theory_files/figure-html/unnamed-chunk-28-1.png" width="100%" style="display: block; margin: auto;" />

]

???
discrete = finite; continuous = infinite

---

# Probability Distributions

- Cumulative distribution function (CDF)
- `\(P(X \le x)\)`: the probability that `\(X\)` is less than or equal to `\(x\)`
- E.g., is IQ less than or equal to a given value?

<img src="basics_of_probability_theory_files/figure-html/unnamed-chunk-29-1.png" width="80%" style="display: block; margin: auto;" />

---

# R

- PDF

```r
dnorm(x,        # X-axis values (grid)
      mean = 0, # Number or vector giving the mean(s)
      sd = 1    # Number or vector giving the standard deviation(s)
)
```

- CDF

```r
pnorm(q, mean = 0, sd = 1) # q: value(s) at which to evaluate P(X <= q)
```

---

class: middle center

# In-Class Analysis

---

# Data

- Florida voter registration data

```r
library(here)
library(tidyverse) # provides %>% and glimpse()
voter <- read.csv(here::here("static", "slides", "05-Probability", "data", "florida-voters.csv"))
voter <- na.omit(voter) # drop rows with missing values
voter %>% glimpse()
```

```
## Rows: 9,113
## Columns: 6
## $ surname <chr> "PIEDRA", "LYNCH", "LATHROP", "HUMMEL", "CHRISTISON", "HOMAN",…
## $ county <int> 115, 115, 115, 115, 115, 115, 115, 1, 1, 115, 115, 115, 115, 1…
## $ VTD <int> 66, 13, 80, 8, 55, 84, 48, 41, 39, 26, 45, 11, 48, 88, 25, 82,…
## $ age <int> 58, 51, 54, 77, 49, 77, 34, 56, 60, 44, 45, 80, 83, 55, 33, 63…
## $ gender <chr> "f", "m", "m", "f", "m", "f", "f", "f", "m", "m", "f", "m", "f…
## $ race <chr> "white", "white", "white", "white", "white", "white", "white",…
```

---

# Data: Setup

```r
library(kableExtra)
head(voter) %>%
  kable(align = "cccccc") %>%
  kable_material_dark()
```

<table class=" lightable-material-dark" style='font-family: "Source Sans Pro", helvetica, sans-serif; margin-left: auto; margin-right: auto;'> <thead> <tr> <th style="text-align:left;"> </th> <th style="text-align:center;"> surname </th> <th style="text-align:center;"> county </th> <th style="text-align:center;"> VTD </th> <th style="text-align:center;"> age </th> <th style="text-align:center;"> gender </th> <th style="text-align:center;"> race </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> 1 </td> <td style="text-align:center;"> PIEDRA </td> <td
style="text-align:center;"> 115 </td> <td style="text-align:center;"> 66 </td> <td style="text-align:center;"> 58 </td> <td style="text-align:center;"> f </td> <td style="text-align:center;"> white </td> </tr> <tr> <td style="text-align:left;"> 2 </td> <td style="text-align:center;"> LYNCH </td> <td style="text-align:center;"> 115 </td> <td style="text-align:center;"> 13 </td> <td style="text-align:center;"> 51 </td> <td style="text-align:center;"> m </td> <td style="text-align:center;"> white </td> </tr> <tr> <td style="text-align:left;"> 4 </td> <td style="text-align:center;"> LATHROP </td> <td style="text-align:center;"> 115 </td> <td style="text-align:center;"> 80 </td> <td style="text-align:center;"> 54 </td> <td style="text-align:center;"> m </td> <td style="text-align:center;"> white </td> </tr> <tr> <td style="text-align:left;"> 5 </td> <td style="text-align:center;"> HUMMEL </td> <td style="text-align:center;"> 115 </td> <td style="text-align:center;"> 8 </td> <td style="text-align:center;"> 77 </td> <td style="text-align:center;"> f </td> <td style="text-align:center;"> white </td> </tr> <tr> <td style="text-align:left;"> 6 </td> <td style="text-align:center;"> CHRISTISON </td> <td style="text-align:center;"> 115 </td> <td style="text-align:center;"> 55 </td> <td style="text-align:center;"> 49 </td> <td style="text-align:center;"> m </td> <td style="text-align:center;"> white </td> </tr> <tr> <td style="text-align:left;"> 7 </td> <td style="text-align:center;"> HOMAN </td> <td style="text-align:center;"> 115 </td> <td style="text-align:center;"> 84 </td> <td style="text-align:center;"> 77 </td> <td style="text-align:center;"> f </td> <td style="text-align:center;"> white </td> </tr> </tbody> </table>

---

# Marginal Probabilities

- What are these again?
- The probability of an event irrespective of the outcomes of other variables

--

```r
marg.race <- voter %>%
  count(race) %>%
  mutate(prop = prop.table(n))
```

---

<table class=" lightable-material-dark" style='font-family: "Source Sans Pro", helvetica, sans-serif; margin-left: auto; margin-right: auto;'> <thead> <tr> <th style="text-align:center;"> race </th> <th style="text-align:center;"> n </th> <th style="text-align:center;"> prop </th> </tr> </thead> <tbody> <tr> <td style="text-align:center;"> asian </td> <td style="text-align:center;"> 175 </td> <td style="text-align:center;"> 0.0192033 </td> </tr> <tr> <td style="text-align:center;"> black </td> <td style="text-align:center;"> 1194 </td> <td style="text-align:center;"> 0.1310216 </td> </tr> <tr> <td style="text-align:center;"> hispanic </td> <td style="text-align:center;"> 1192 </td> <td style="text-align:center;"> 0.1308022 </td> </tr> <tr> <td style="text-align:center;"> native </td> <td style="text-align:center;"> 29 </td> <td style="text-align:center;"> 0.0031823 </td> </tr> <tr> <td style="text-align:center;"> other </td> <td style="text-align:center;"> 310 </td> <td style="text-align:center;"> 0.0340173 </td> </tr> <tr> <td style="text-align:center;"> white </td> <td style="text-align:center;"> 6213 </td> <td style="text-align:center;"> 0.6817733 </td> </tr> </tbody> </table>

---

# Gender

```r
marg.gender <- voter %>%
  group_by(gender) %>%
  summarise(n = n()) %>%
  mutate(freq = n / sum(n))
marg.gender %>%
  kable(align = "cccccc") %>%
  kable_material_dark()
```

<table class=" lightable-material-dark" style='font-family: "Source Sans Pro", helvetica, sans-serif; margin-left: auto; margin-right: auto;'> <thead> <tr> <th style="text-align:center;"> gender </th> <th style="text-align:center;"> n </th> <th style="text-align:center;"> freq </th> </tr> </thead> <tbody> <tr> <td style="text-align:center;"> f </td> <td style="text-align:center;"> 4883 </td> <td style="text-align:center;"> 0.5358279 </td> </tr> <tr> <td style="text-align:center;">
m </td> <td style="text-align:center;"> 4230 </td> <td style="text-align:center;"> 0.4641721 </td> </tr> </tbody> </table> --- # Conditional Probability $$ P(black|male) = $$ --- <table class=" lightable-material-dark" style='font-family: "Source Sans Pro", helvetica, sans-serif; margin-left: auto; margin-right: auto;'> <thead> <tr> <th style="text-align:center;"> race </th> <th style="text-align:center;"> gender </th> <th style="text-align:center;"> n </th> <th style="text-align:center;"> prop </th> </tr> </thead> <tbody> <tr> <td style="text-align:center;"> asian </td> <td style="text-align:center;"> m </td> <td style="text-align:center;"> 92 </td> <td style="text-align:center;"> 0.0217494 </td> </tr> <tr> <td style="text-align:center;"> black </td> <td style="text-align:center;"> m </td> <td style="text-align:center;"> 516 </td> <td style="text-align:center;"> 0.1219858 </td> </tr> <tr> <td style="text-align:center;"> hispanic </td> <td style="text-align:center;"> m </td> <td style="text-align:center;"> 526 </td> <td style="text-align:center;"> 0.1243499 </td> </tr> <tr> <td style="text-align:center;"> native </td> <td style="text-align:center;"> m </td> <td style="text-align:center;"> 12 </td> <td style="text-align:center;"> 0.0028369 </td> </tr> <tr> <td style="text-align:center;"> other </td> <td style="text-align:center;"> m </td> <td style="text-align:center;"> 152 </td> <td style="text-align:center;"> 0.0359338 </td> </tr> <tr> <td style="text-align:center;"> white </td> <td style="text-align:center;"> m </td> <td style="text-align:center;"> 2932 </td> <td style="text-align:center;"> 0.6931442 </td> </tr> </tbody> </table> --- # Joint Probability $$ P(black \cap male) $$ ```r library(janitor) joint <- voter %>% select(race, gender) %>% group_by(race, gender) %>% count(race, gender) %>% ungroup() %>% mutate(total=sum(n), prop=n/total) ``` --- <table class=" lightable-material-dark" style='font-family: "Source Sans Pro", helvetica, sans-serif; margin-left: 
auto; margin-right: auto;'> <thead> <tr> <th style="text-align:center;"> race </th> <th style="text-align:center;"> gender </th> <th style="text-align:center;"> n </th> <th style="text-align:center;"> total </th> <th style="text-align:center;"> prop </th> </tr> </thead> <tbody> <tr> <td style="text-align:center;"> asian </td> <td style="text-align:center;"> f </td> <td style="text-align:center;"> 83 </td> <td style="text-align:center;"> 9113 </td> <td style="text-align:center;"> 0.0091079 </td> </tr> <tr> <td style="text-align:center;"> asian </td> <td style="text-align:center;"> m </td> <td style="text-align:center;"> 92 </td> <td style="text-align:center;"> 9113 </td> <td style="text-align:center;"> 0.0100955 </td> </tr> <tr> <td style="text-align:center;"> black </td> <td style="text-align:center;"> f </td> <td style="text-align:center;"> 678 </td> <td style="text-align:center;"> 9113 </td> <td style="text-align:center;"> 0.0743992 </td> </tr> <tr> <td style="text-align:center;"> black </td> <td style="text-align:center;"> m </td> <td style="text-align:center;"> 516 </td> <td style="text-align:center;"> 9113 </td> <td style="text-align:center;"> 0.0566224 </td> </tr> <tr> <td style="text-align:center;"> hispanic </td> <td style="text-align:center;"> f </td> <td style="text-align:center;"> 666 </td> <td style="text-align:center;"> 9113 </td> <td style="text-align:center;"> 0.0730824 </td> </tr> <tr> <td style="text-align:center;"> hispanic </td> <td style="text-align:center;"> m </td> <td style="text-align:center;"> 526 </td> <td style="text-align:center;"> 9113 </td> <td style="text-align:center;"> 0.0577197 </td> </tr> <tr> <td style="text-align:center;"> native </td> <td style="text-align:center;"> f </td> <td style="text-align:center;"> 17 </td> <td style="text-align:center;"> 9113 </td> <td style="text-align:center;"> 0.0018655 </td> </tr> <tr> <td style="text-align:center;"> native </td> <td style="text-align:center;"> m </td> <td 
style="text-align:center;"> 12 </td> <td style="text-align:center;"> 9113 </td> <td style="text-align:center;"> 0.0013168 </td> </tr> <tr> <td style="text-align:center;"> other </td> <td style="text-align:center;"> f </td> <td style="text-align:center;"> 158 </td> <td style="text-align:center;"> 9113 </td> <td style="text-align:center;"> 0.0173379 </td> </tr> <tr> <td style="text-align:center;"> other </td> <td style="text-align:center;"> m </td> <td style="text-align:center;"> 152 </td> <td style="text-align:center;"> 9113 </td> <td style="text-align:center;"> 0.0166795 </td> </tr> <tr> <td style="text-align:center;"> white </td> <td style="text-align:center;"> f </td> <td style="text-align:center;"> 3281 </td> <td style="text-align:center;"> 9113 </td> <td style="text-align:center;"> 0.3600351 </td> </tr> <tr> <td style="text-align:center;"> white </td> <td style="text-align:center;"> m </td> <td style="text-align:center;"> 2932 </td> <td style="text-align:center;"> 9113 </td> <td style="text-align:center;"> 0.3217382 </td> </tr> </tbody> </table>

---

# Data: Independence

$$ P(black \cap male) $$

`$$p(black \cap male) = p(black) \times p(male)$$`

```r
# Are race and gender independent? Recall that two events are
# independent if and only if, for example,
# P(black and male) = P(black) * P(male).
marg.race <- voter %>%
  group_by(race) %>%
  tabyl(race)
marg.gender <- voter %>%
  group_by(gender) %>%
  tabyl(gender)
```

---

<table class=" lightable-material-dark" style='font-family: "Source Sans Pro", helvetica, sans-serif; margin-left: auto; margin-right: auto;'> <thead> <tr> <th style="text-align:left;"> race </th> <th style="text-align:right;"> n </th> <th style="text-align:right;"> percent </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> asian </td> <td style="text-align:right;"> 175 </td> <td style="text-align:right;"> 0.0192033 </td> </tr> <tr> <td style="text-align:left;"> black </td> <td style="text-align:right;"> 1194 </td> <td style="text-align:right;"> 0.1310216 </td> </tr> <tr> <td style="text-align:left;"> hispanic </td> <td style="text-align:right;"> 1192 </td> <td style="text-align:right;"> 0.1308022 </td> </tr> <tr> <td style="text-align:left;"> native </td> <td style="text-align:right;"> 29 </td> <td style="text-align:right;"> 0.0031823 </td> </tr> <tr> <td style="text-align:left;"> other </td> <td style="text-align:right;"> 310 </td> <td style="text-align:right;"> 0.0340173 </td> </tr> <tr> <td style="text-align:left;"> white </td> <td style="text-align:right;"> 6213 </td> <td style="text-align:right;"> 0.6817733 </td> </tr> </tbody> </table>

---

```r
marg.gender %>%
  kable() %>%
  kable_material_dark()
```

<table class=" lightable-material-dark" style='font-family: "Source Sans Pro", helvetica, sans-serif; margin-left: auto; margin-right: auto;'> <thead> <tr> <th style="text-align:left;"> gender </th> <th style="text-align:right;"> n </th> <th style="text-align:right;"> percent </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> f </td> <td style="text-align:right;"> 4883 </td> <td style="text-align:right;"> 0.5358279 </td> </tr> <tr> <td style="text-align:left;"> m </td> <td style="text-align:right;"> 4230 </td> <td style="text-align:right;"> 0.4641721
</td> </tr> </tbody> </table> ```r 0.13*0.464 ``` ``` ## [1] 0.06032 ``` --- ```r voter %>% select(race, gender) %>% group_by(race, gender) %>% count(race, gender) %>% ungroup() %>% mutate(total=sum(n), prop=n/total) %>% kable(align = "cccccc") %>% kable_material_dark() ``` <table class=" lightable-material-dark" style='font-family: "Source Sans Pro", helvetica, sans-serif; margin-left: auto; margin-right: auto;'> <thead> <tr> <th style="text-align:center;"> race </th> <th style="text-align:center;"> gender </th> <th style="text-align:center;"> n </th> <th style="text-align:center;"> total </th> <th style="text-align:center;"> prop </th> </tr> </thead> <tbody> <tr> <td style="text-align:center;"> asian </td> <td style="text-align:center;"> f </td> <td style="text-align:center;"> 83 </td> <td style="text-align:center;"> 9113 </td> <td style="text-align:center;"> 0.0091079 </td> </tr> <tr> <td style="text-align:center;"> asian </td> <td style="text-align:center;"> m </td> <td style="text-align:center;"> 92 </td> <td style="text-align:center;"> 9113 </td> <td style="text-align:center;"> 0.0100955 </td> </tr> <tr> <td style="text-align:center;"> black </td> <td style="text-align:center;"> f </td> <td style="text-align:center;"> 678 </td> <td style="text-align:center;"> 9113 </td> <td style="text-align:center;"> 0.0743992 </td> </tr> <tr> <td style="text-align:center;"> black </td> <td style="text-align:center;"> m </td> <td style="text-align:center;"> 516 </td> <td style="text-align:center;"> 9113 </td> <td style="text-align:center;"> 0.0566224 </td> </tr> <tr> <td style="text-align:center;"> hispanic </td> <td style="text-align:center;"> f </td> <td style="text-align:center;"> 666 </td> <td style="text-align:center;"> 9113 </td> <td style="text-align:center;"> 0.0730824 </td> </tr> <tr> <td style="text-align:center;"> hispanic </td> <td style="text-align:center;"> m </td> <td style="text-align:center;"> 526 </td> <td style="text-align:center;"> 9113 </td> <td 
style="text-align:center;"> 0.0577197 </td> </tr> <tr> <td style="text-align:center;"> native </td> <td style="text-align:center;"> f </td> <td style="text-align:center;"> 17 </td> <td style="text-align:center;"> 9113 </td> <td style="text-align:center;"> 0.0018655 </td> </tr> <tr> <td style="text-align:center;"> native </td> <td style="text-align:center;"> m </td> <td style="text-align:center;"> 12 </td> <td style="text-align:center;"> 9113 </td> <td style="text-align:center;"> 0.0013168 </td> </tr> <tr> <td style="text-align:center;"> other </td> <td style="text-align:center;"> f </td> <td style="text-align:center;"> 158 </td> <td style="text-align:center;"> 9113 </td> <td style="text-align:center;"> 0.0173379 </td> </tr> <tr> <td style="text-align:center;"> other </td> <td style="text-align:center;"> m </td> <td style="text-align:center;"> 152 </td> <td style="text-align:center;"> 9113 </td> <td style="text-align:center;"> 0.0166795 </td> </tr> <tr> <td style="text-align:center;"> white </td> <td style="text-align:center;"> f </td> <td style="text-align:center;"> 3281 </td> <td style="text-align:center;"> 9113 </td> <td style="text-align:center;"> 0.3600351 </td> </tr> <tr> <td style="text-align:center;"> white </td> <td style="text-align:center;"> m </td> <td style="text-align:center;"> 2932 </td> <td style="text-align:center;"> 9113 </td> <td style="text-align:center;"> 0.3217382 </td> </tr> </tbody> </table> --- # Your Turn 1. 
What is the conditional probability `\(P(black|female)\)`?

```r
library(tidyverse)
# Load into `voter`, the name used on the other slides
voter <- read_csv("https://raw.githubusercontent.com/jgeller112/psy503-psych_stats/master/static/slides/05-Probability/data/florida-voters.csv")
voter <- na.omit(voter) # same cleaning step as on the Data slide
```

---

```r
cond_racegender <- voter %>%
  dplyr::filter(gender == "f") %>%      # condition on gender = female
  dplyr::count(race, gender) %>%
  dplyr::mutate(prop = n / sum(n)) %>%  # proportions within women only
  kable(align = "cccccc") %>%
  kable_material_dark()
cond_racegender
```

<table class=" lightable-material-dark" style='font-family: "Source Sans Pro", helvetica, sans-serif; margin-left: auto; margin-right: auto;'> <thead> <tr> <th style="text-align:center;"> race </th> <th style="text-align:center;"> gender </th> <th style="text-align:center;"> n </th> <th style="text-align:center;"> prop </th> </tr> </thead> <tbody> <tr> <td style="text-align:center;"> asian </td> <td style="text-align:center;"> f </td> <td style="text-align:center;"> 83 </td> <td style="text-align:center;"> 0.0169977 </td> </tr> <tr> <td style="text-align:center;"> black </td> <td style="text-align:center;"> f </td> <td style="text-align:center;"> 678 </td> <td style="text-align:center;"> 0.1388491 </td> </tr> <tr> <td style="text-align:center;"> hispanic </td> <td style="text-align:center;"> f </td> <td style="text-align:center;"> 666 </td> <td style="text-align:center;"> 0.1363916 </td> </tr> <tr> <td style="text-align:center;"> native </td> <td style="text-align:center;"> f </td> <td style="text-align:center;"> 17 </td> <td style="text-align:center;"> 0.0034815 </td> </tr> <tr> <td style="text-align:center;"> other </td> <td style="text-align:center;"> f </td> <td style="text-align:center;"> 158 </td> <td style="text-align:center;"> 0.0323572 </td> </tr> <tr> <td style="text-align:center;"> white </td> <td style="text-align:center;"> f </td> <td style="text-align:center;"> 3281 </td> <td style="text-align:center;"> 0.6719230 </td> </tr> </tbody> </table>
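
---

The `black`/`f` row in the table above can also be recovered from the definition of conditional probability, `\(P(A|B) = P(A \cap B) / P(B)\)`. The following is a minimal sketch that assumes nothing beyond the counts printed in the earlier joint and marginal tables; the variable names are illustrative and not from the original analysis.

```r
# Counts copied from the joint and marginal tables on the earlier slides
n_total        <- 9113  # all voters after na.omit()
n_black_female <- 678   # joint count: black and female
n_female       <- 4883  # marginal count: female
n_black        <- 1194  # marginal count: black

p_joint  <- n_black_female / n_total  # P(black and female)
p_female <- n_female / n_total        # P(female)
p_black  <- n_black / n_total         # P(black)

# Conditional probability: P(black | female) = P(black and female) / P(female)
p_cond <- p_joint / p_female

round(c(joint = p_joint, female = p_female, black = p_black, conditional = p_cond), 4)
```

If race and gender were independent, `p_cond` would equal `p_black`; here the two differ slightly (about .139 versus .131).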