Differences

This shows you the differences between two versions of the page.

--- b:head_first_statistics:using_discrete_probability_distributions [2019/10/14 03:44] – [Fat Dan changed his prices] hkimscil
+++ b:head_first_statistics:using_discrete_probability_distributions [2020/10/05 10:38] – hkimscil
@@ Line 1: / Line 1: @@
 ====== using discrete probability distributions ======
-{{:b:head_first_statistics:pasted:20190930-104212.png?400|}}
+{{:b:head_first_statistics:pasted:20190930-104212.png?600|}}
 <WRAP clear />
@@ Line 277: / Line 277: @@
 > </code>
-<WRAP box help>
 <WRAP col2>
-Q: So expectation is a lot like the
+Q: So expectation is a lot like the mean. Is there anything for probability distributions that's like the median or mode?
-mean. Is there anything for probability
+A: You can work out the most likely probability, which would be a bit like the mode, but you won't normally have to do this. When it comes to probability distributions, the measure that statisticians are most interested in is the expectation.
-distributions that’s like the median or
-mode?
-A: You can work out the most likely
-probability, which would be a bit like the
-mode, but you won’t normally have to do this.
-When it comes to probability distributions,
-the measure that statisticians are most
-interested in is the expectation.
-Q: Shouldn’t the expectation be one of
+Q: Shouldn't the expectation be one of the values that X can take?
-the values that X can take?
+A: It doesn't have to be. Just as the mean of a set of values isn't necessarily the same as one of the values, the expectation of a probability distribution isn't necessarily one of the values X can take.
-A: It doesn’t have to be. Just as the mean
-of a set of values isn’t necessarily the same
-as one of the values, the expectation of a
-probability distribution isn’t necessarily one
-of the values X can take.
-Q: Are the variance and standard
+Q: Are the variance and standard deviation the same as we had before when we were dealing with values?
-deviation the same as we had before
+A: They're the same, except that this time we're dealing with probability distributions. The variance and standard deviation of a set of values are ways of measuring how far values are spread out from the mean. The variance and standard deviation of a probability distribution measure how the probabilities of particular values are dispersed.
-when we were dealing with values?
-A: They’re the same, except that this time
-we’re dealing with probability distributions.
-The variance and standard deviation of a
-set of values are ways of measuring how
-far values are spread out from the mean.
-The variance and standard deviation of
-a probability distribution measure how
-the probabilities of particular values are
-dispersed.
-Q: I find the concept of E(X - μ)2
+Q: I find the concept of E(X - μ)<sup>2</sup> confusing. Is it the same as finding E(X - μ) and then squaring the end result?
-confusing. Is it the same as finding
+A: No, these are two different calculations. E(X - μ)<sup>2</sup> means that you find the square of X - μ for each value of X, and then find the expectation of all the results. If you calculate E(X - μ) and then square the result, you'll get a completely different answer. Technically speaking, you're working out E%%(%%(X - μ)<sup>2</sup>%%)%%, but it's not often written that way.
-E(X - μ) and then squaring the end result?
-A: No, these are two different calculations.
-E(X - μ)2 means that you find the square of
-X - μ for each value of X, and then find the
-expectation of all the results. If you calculate
-E(X - μ) and then square the result, you’ll get
-a completely different answer.
-Technically speaking, you’re working out
-E((X - μ)2), but it’s not often written that way.
-Q: So what’s the difference between a
+Q: So what's the difference between a slot machine with a low variance and one with a high variance?
-slot machine with a low variance and one
+A: A slot machine with a high variance means that there's a lot more variability in your overall winnings. The amount you could win overall is less predictable. In general, the smaller the variance is, the closer your average winnings per game are likely to be to the expectation. If you play on a slot machine with a larger variance, your overall winnings will be less reliable.
-with a high variance?
-A: A slot machine with a high variance
-means that there’s a lot more variability in
-your overall winnings. The amount you could
-win overall is less predictable.
-In general, the smaller the variance is, the
-closer your average winnings per game are
-likely to be to the expectation. If you play on
-a slot machine with a larger variance, your
-overall winnings will be less reliable.
-</WRAP>
 </WRAP>
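The distinction between E%%(%%(X - μ)<sup>2</sup>%%)%% and (E(X - μ))<sup>2</sup> can be checked numerically. The following is a minimal sketch, not part of the original page; it borrows the Restaurant A values from the example further down, and the variable names are illustrative.

```r
# Values and probabilities borrowed from the Restaurant A example on this page
x <- c(20, 30, 40, 45)
p <- c(.3, .4, .2, .1)

mu <- sum(x * p)                      # E(X)

# E((X - mu)^2): square each deviation first, then take the expectation
e_of_square <- sum((x - mu)^2 * p)

# (E(X - mu))^2: take the expectation of the deviation first, then square it
square_of_e <- sum((x - mu) * p)^2

e_of_square   # this is the variance
square_of_e   # zero up to floating-point rounding, because E(X - mu) = 0
```

Squaring after taking the expectation always gives zero, because the deviations from the mean cancel out. That is why the variance squares each deviation before averaging.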
 Pool puzzle
-<WRAP box>
 \begin{eqnarray*}
 X & = & (\text{original win}) - (\text{original cost}) \\
@@ Line 350: / Line 306: @@
 & = & 5 * X + 3  \\
 \end{eqnarray*}
-</WRAP>
-<WRAP box>
 E(X) = -.77 and E(Y) = -.85. What is 5 * E(X) + 3?
@@ Line 359: / Line 312: @@
 $ 5 * E(X) + 3 = -0.85 $
 $ E(Y) = 5 * E(X) + 3 $
-</WRAP>
-<WRAP box>
 $ 5 * Var(X) = 13.4855 $
 $ 5^2 * Var(X) = 67.4275 $
 $ Var(Y) = 5^2 * Var(X) $
-</WRAP>
 \begin{eqnarray*}
@@ Line 428: / Line 379: @@
 > </code>
 x2e will spend more.
+====== e.g. ======
+Sam likes to eat out at two restaurants. Restaurant A is generally more expensive than restaurant B, but the food quality is generally much better.
+Below are two probability distributions showing how much Sam tends to spend at each restaurant. As a general rule, what is the difference in price between the two restaurants, and what is the variance of that difference?
+| Restaurant A:   |||||
+| x  | 20  | 30  | 40  | 45  |
+| P(X = x)  | 0.3  | 0.4  | 0.2  | 0.1  |
+| Restaurant B:   ||||
+| y  | 10  | 15  | 18  |
+| P(Y = y)  | 0.2  | 0.6  | 0.2  |
+<code>
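+## Restaurant A: amounts spent and their probabilities
+x3 <- c(20,30,40,45)
+x3p <- c(.3,.4,.2,.1)
+## Restaurant B: amounts spent and their probabilities
+x4 <- c(10,15,18)
+x4p <- c(.2,.6,.2)
+## expectations E(X) and E(Y)
+x3e <- sum(x3*x3p)
+x4e <- sum(x4*x4p)
+x3e
+x4e
+## expected difference in price: E(X - Y) = E(X) - E(Y)
+x3e-x4e
+## variances Var(X) and Var(Y)
+x3var <- sum(((x3-x3e)^2)*x3p)
+x4var <- sum(((x4-x4e)^2)*x4p)
+x3var
+x4var
+## variance of the difference, for independent X and Y:
+## Var(X - Y) = Var(X) + Var(Y)
+x3var+x4var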
+</code>
+<code>
+> x3 <- c(20,30,40,45)
+> x3p <- c(.3,.4,.2,.1)
+> x4 <- c(10,15,18)
+> x4p <- c(.2,.6,.2)
+>
+> x3e <- sum(x3*x3p)
+> x4e <- sum(x4*x4p)
+>
+> x3e
+[1] 30.5
+> x4e
+[1] 14.6
+> ## difference in price between the two
+> x3e-x4e
+[1] 15.9
+>
+>
+> x3var <- sum(((x3-x3e)^2)*x3p)
+> x4var <- sum(((x4-x4e)^2)*x4p)
+>
+> x3var
+[1] 72.25
+> x4var
+[1] 6.64
+> ## variance of the difference, for independent X and Y:
+> ## Var(X - Y) = Var(X) + Var(Y)
+> x3var+x4var
+[1] 78.89
+</code>
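Adding the two variances relies on X and Y being independent. One way to check the result is to build the distribution of the difference D = X - Y directly and compute its expectation and variance from scratch. The following is a sketch under that independence assumption; the names d, dp, de and dvar are illustrative and do not appear in the original page.

```r
# Same distributions as in the restaurant example
x3  <- c(20, 30, 40, 45)
x3p <- c(.3, .4, .2, .1)
x4  <- c(10, 15, 18)
x4p <- c(.2, .6, .2)

# Assumption: X and Y are independent, so P(X = x, Y = y) = P(X = x) * P(Y = y)
d  <- as.vector(outer(x3, x4, "-"))   # every possible value of X - Y
dp <- as.vector(outer(x3p, x4p))      # the matching joint probabilities

de   <- sum(d * dp)                   # E(X - Y)
dvar <- sum((d - de)^2 * dp)          # Var(X - Y)

de
dvar
```

Both values should agree with x3e - x4e and x3var + x4var computed above. Repeated values in d do no harm, because each one is weighted by its own joint probability.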
+====== e.g. ======
+| E(aX + b) | $aE(X) + b$  |
+| Var(aX + b) | $a^{2}Var(X)$  |
+| E(X) | $\sum x \cdot P(X=x)$  |
+| E(f(X)) | $\sum f(x) \cdot P(X=x)$  |
+| Var(aX - bY) | $a^{2}Var(X) + b^{2}Var(Y)$ for independent X and Y (see the derivation below) |
+| Var(X) | $E(X-\mu)^{2} = E(X^{2})-\mu^{2}$  |
+| E(aX - bY) | $aE(X)-bE(Y)$  |
+| E(X1 + X2 + X3) | $3E(X)$  |
+| Var(X1 + X2 + X3) | $3Var(X)$ for independent observations |
+| E(X<sup>2</sup>) | $\sum x^{2} \cdot P(X=x)$  |
+| Var(aX - b) | $a^{2}Var(X)$  |
+\begin{eqnarray*}
+Var(aX - bY) & = & Var(aX + (-b)Y) \\
+& = & Var(aX) + Var((-b)Y) \\
+& = & a^{2}Var(X) + (-b)^{2}Var(Y) \\
+& = & a^{2}Var(X) + b^{2}Var(Y)
+\end{eqnarray*}
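Several of the shortcuts in the table can be confirmed numerically. The following is a minimal sketch that reuses the Restaurant A distribution; the constants a = 5 and b = 3 echo the pool puzzle (Y = 5 * X + 3) and are otherwise arbitrary, and the variable names are illustrative.

```r
# Restaurant A distribution from the example above
x <- c(20, 30, 40, 45)
p <- c(.3, .4, .2, .1)

a <- 5   # illustrative constants, echoing Y = 5 * X + 3
b <- 3

ex   <- sum(x * p)              # E(X)
varx <- sum((x - ex)^2 * p)     # Var(X)

# Transform every value; the probabilities stay the same
y    <- a * x + b
ey   <- sum(y * p)              # E(aX + b), computed directly
vary <- sum((y - ey)^2 * p)     # Var(aX + b), computed directly

c(ey, a * ex + b)               # E(aX + b)   vs  aE(X) + b
c(vary, a^2 * varx)             # Var(aX + b) vs  a^2 Var(X)
c(varx, sum(x^2 * p) - ex^2)    # Var(X)      vs  E(X^2) - mu^2
```

Each pair should print two matching numbers. The added constant b shifts the expectation but leaves the variance unchanged.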
+see also [[:why n-1]]