Binomial Distributions
Conditions for a Binomial Distribution: [Memorize this!]
There are n trials or repetitions
There are 2 outcomes for each trial, S or F
The P(Success), P, for each trial is constant. Note: Often, the probability is expressed as a "Proportion", e.g. the true proportion of defectives in a batch of TV–sets is 8% => P = 8% = 0.08 [Here, somewhat amusingly, Success ~ Defective TV–set...but that's OK!]
The trials are identical and independent.
X is a r.v. denoting number of successes, x, in n trials [Note: there are n – x failures in n trials]
Notation: X ~ B(n, P)
Formulas:
P(X = x successes) = BinomPdf(n, P, x)
= nCx S^x F^(n – x) ~ nCx P^x Q^(n – x), where Q = P(Failure) = 1 – P
P(X ≤ k successes) = BinomCdf(n, P, k)
= P(X = 0) + P(X = 1) + P(X = 2) + ... + P(X = k)
= Σ [from x = 0 to x = k] nCx S^x F^(n – x) ~ Σ [from x = 0 to x = k] nCx P^x (1 – P)^(n – x)
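As a rough cross-check of the two formulas, here is a minimal Python sketch. The names binom_pdf and binom_cdf are illustrative stand-ins for the calculator's BinomPdf and BinomCdf commands (they are not the TI implementation), and the fair-coin numbers are an assumed example, not part of these notes.

```python
from math import comb

def binom_pdf(n, p, x):
    """P(X = x) for X ~ B(n, p): nCx * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def binom_cdf(n, p, k):
    """P(X <= k): add up P(X = 0), P(X = 1), ..., P(X = k)."""
    return sum(binom_pdf(n, p, x) for x in range(k + 1))

# Assumed example: 4 tosses of a fair coin, Success ~ Heads.
print(binom_pdf(4, 0.5, 2))  # P(exactly 2 heads) = 6/16 = 0.375
print(binom_cdf(4, 0.5, 2))  # P(at most 2 heads) = 11/16 = 0.6875
```

Note that the cdf always includes its endpoint k, which is why strict inequalities have to be rewritten before using it.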
Caution: For P(X < x), P(X > x) and P(X ≥ x) Qs: the TI-83s and 84s can perform only ≤ operations for BinomCdf, so you must "transform" the Q into something usable. If uncertain, draw a number line and use common sense / logic!
Examples to help you with interpretations:
P(5 successes) = P(X = 5) = BinomPdf(n, P, 5)
P(at most 10 successes) = P(X ≤ 10) = P(X = 0) + P(X = 1) + ... + P(X = 10) = BinomCdf(n, P, 10)
P(fewer than 7 successes) = P(X < 7) = P(X = 0) + ... + P(X = 6) = P(X ≤ 6) = BinomCdf(n, P, 6)
P(more than 11 successes) = P(X > 11) = 1 – P(X ≤ 11) = 1 – BinomCdf(n, P, 11)
P(at least 8 successes) = P(X ≥ 8) = 1 – P(X ≤ 7) = 1 – BinomCdf(n, P, 7)
More Examples:
more than 5 ~ P(X > 5) = 1 – P(X ≤ 5) = 1 – BinomCdf(n, P, 5)
no(t) more than 5 ~ P(X ≤ 5) = BinomCdf(n, P, 5)
greater than 4 ~ P(X > 4) = P(X ≥ 5) = 1 – P(X ≤ 4) = 1 – BinomCdf(n, P, 4)
no(t) fewer than 5 ~ P(X ≥ 5) = 1 – P(X ≤ 4) = 1 – BinomCdf(n, P, 4)
less than 5 ~ P(X < 5) = P(X ≤ 4) = BinomCdf(n, P, 4)
no(t) less than 5 ~ P(X ≥ 5) = 1 – P(X ≤ 4) = 1 – BinomCdf(n, P, 4)
exceed 5 ~ P(X > 5) = 1 – P(X ≤ 5) = 1 – BinomCdf(n, P, 5)
at least 5 ~ P(X ≥ 5) = 1 – P(X ≤ 4) = 1 – BinomCdf(n, P, 4)
at most 5 ~ P(X ≤ 5) = BinomCdf(n, P, 5)
exactly 5 ~ P(X = 5) = BinomPdf(n, P, 5)
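The wording-to-inequality translations in this list can be sketched as one small Python lookup. This is only an illustration: the function prob and its phrase strings are hypothetical names chosen here, and binom_cdf is a hand-written stand-in for the calculator's BinomCdf.

```python
from math import comb

def binom_cdf(n, p, k):
    """P(X <= k) for X ~ B(n, p); an empty sum (k < 0) gives 0."""
    return sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(k + 1))

def prob(phrase, k, n, p):
    """Rewrite a wording about k successes in terms of P(X <= something)."""
    if phrase in ("at most", "not more than"):
        return binom_cdf(n, p, k)              # P(X <= k)
    if phrase in ("fewer than", "less than"):
        return binom_cdf(n, p, k - 1)          # P(X < k) = P(X <= k - 1)
    if phrase in ("more than", "greater than", "exceed"):
        return 1 - binom_cdf(n, p, k)          # P(X > k) = 1 - P(X <= k)
    if phrase in ("at least", "not fewer than", "not less than"):
        return 1 - binom_cdf(n, p, k - 1)      # P(X >= k) = 1 - P(X <= k - 1)
    if phrase == "exactly":
        return binom_cdf(n, p, k) - binom_cdf(n, p, k - 1)  # P(X = k)
    raise ValueError(f"unknown phrase: {phrase}")

# Assumed example: 10 trials with P = 0.5.
print(prob("at least", 5, 10, 0.5))
print(prob("fewer than", 5, 10, 0.5))
```

The two printed values are complementary events, so they should add up to 1.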
For the TI-83 and 84 calculators:
Subtract from 1 only when the Q relates to > or ≥.
Use Summation and nCx only when the Q relates to an inequality (<, ≤, > or ≥).
Use numbers for n and x in nCx when the Q relates to =.
For Rare–Events, choose the Inequality so that you don't cross the Mean, E(X) = nP.
Use BinomPdf for P(X = a) situations.
Use BinomCdf for P(X ≤ a) situations.
Calculator Clarification! Only when the Q deals with expressions such as P(X > a) or P(X ≥ b) do we rewrite it as 1 – P(X ≤ c). For expressions such as P(X < a) or P(X ≤ b), there is no need to subtract from 1...because the calculator processes ≤ probabilities!
Mean and s.d. of a Binomial Distribution [Memorize this!]
Mean: E(X) = nP
Standard Deviation: σ(X) = √(nPQ)
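To see that these shortcuts agree with the general definitions of mean and s.d. for a discrete r.v., here is a short Python sketch. The function names are hypothetical, and n = 12, P = 0.41 are borrowed from the John Smith example purely as test values.

```python
from math import comb, sqrt

def binom_mean_sd(n, p):
    """Return (nP, sqrt(nPQ)) for X ~ B(n, P), with Q = 1 - P."""
    return n * p, sqrt(n * p * (1 - p))

def mean_sd_from_definition(n, p):
    """Same quantities computed directly from the distribution of X."""
    probs = [comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)]
    mean = sum(x * pr for x, pr in enumerate(probs))
    var = sum((x - mean) ** 2 * pr for x, pr in enumerate(probs))
    return mean, sqrt(var)

print(binom_mean_sd(12, 0.41))
print(mean_sd_from_definition(12, 0.41))
```

Both lines should print the same pair of numbers, up to floating-point rounding.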
Example 1
Interpret the following situations in Probability Notation and rewrite – if necessary – to make them calculator–ready, i.e. describe using nCx formulas; do not, however, perform any calculations!
a) P(at least 3 successes out of 10 trials), P(S) = 0.1
b) P(at most 7 successes out of 12 trials), P(S) = 0.25
c) P(fewer than 6 successes out of 8 trials), P(S) = 0.83
d) P(more than 3 successes out of 9 trials), P(S) = 0.35
e) P(exactly 11 failures out of 15 trials), P(S) = 0.19
f) P(at least 9 failures out of 25 trials), P(S) = 0.75
g) P(not more than 4 failures out of 6 trials), P(S) = 0.42
Solution. If X is a r.v. that denotes # of successes, and Y, the # of failures:
a) P(X ≥ 3) = 1 – P(X ≤ 2) = 1 – ∑ [x = 0 to 2] 10Cx (0.1)^x (0.9)^(10 – x)
b) P(X ≤ 7) = ∑ [x = 0 to 7] 12Cx (0.25)^x (0.75)^(12 – x)
c) P(X < 6) = P(X ≤ 5) = ∑ [x = 0 to 5] 8Cx (0.83)^x (0.17)^(8 – x)
d) P(X > 3) = 1 – P(X ≤ 3) = 1 – ∑ [x = 0 to 3] 9Cx (0.35)^x (0.65)^(9 – x)
e) Q = 0.81 and P(Y = 11) = 15C11 (0.81)^11 (0.19)^4
or P(X = 4) = 15C4 (0.19)^4 (0.81)^11
f) Q = 0.25 and P(Y ≥ 9) = 1 – P(Y ≤ 8) = 1 – ∑ [y = 0 to 8] 25Cy (0.25)^y (0.75)^(25 – y)
or P(X ≤ 16) = ∑ [x = 0 to 16] 25Cx (0.75)^x (0.25)^(25 – x)
g) Q = 0.58 and P(Y ≤ 4) = ∑ [y = 0 to 4] 6Cy (0.58)^y (0.42)^(6 – y)
or P(X ≥ 2) = 1 – P(X ≤ 1) = 1 – ∑ [x = 0 to 1] 6Cx (0.42)^x (0.58)^(6 – x)
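Although the exercise asks for notation only, the "failures" and "successes" formulations in parts e) to g) describe the same event, so they should give the same number. A minimal Python sketch to confirm this, with binom_pdf and binom_cdf as assumed stand-ins for the calculator commands:

```python
from math import comb

def binom_pdf(n, p, x):
    """P(X = x) = nCx * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def binom_cdf(n, p, k):
    """P(X <= k) as a sum of nCx terms."""
    return sum(binom_pdf(n, p, x) for x in range(k + 1))

# e) exactly 11 failures out of 15  <=>  exactly 4 successes
e_failures = binom_pdf(15, 0.81, 11)
e_successes = binom_pdf(15, 0.19, 4)

# f) at least 9 failures out of 25  <=>  at most 16 successes
f_failures = 1 - binom_cdf(25, 0.25, 8)
f_successes = binom_cdf(25, 0.75, 16)

# g) not more than 4 failures out of 6  <=>  at least 2 successes
g_failures = binom_cdf(6, 0.58, 4)
g_successes = 1 - binom_cdf(6, 0.42, 1)

for label, a, b in [("e", e_failures, e_successes),
                    ("f", f_failures, f_successes),
                    ("g", g_failures, g_successes)]:
    print(label, round(a, 6), round(b, 6))
```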
Example 2
Suppose we're interested in finding out about the support a candidate John Smith has, and we randomly interview 12 voters. Assume his overall approval rating to be 41%.
a) Describe how the 4 conditions of a Binomial situation are met in context.
b) Define a suitable Binomial r.v., X.
c) What is the probability exactly 6 chaps approve of John Smith?
d) What is the probability more than 8 chaps approve of John Smith?
e) What is the probability fewer than 3 chaps approve of John Smith?
f) What is the probability not more than 5 chaps approve of John Smith?
g) What is the probability of getting at least 1 John Smith supporter?
h) In a sample of 12 individuals, how many would you expect to be John Smith supporters? What is the s.d. of the number of supporters?
i) Suppose voters are repeatedly asked for their preferences. What is the probability that the 1st John Smith voter shall be the 5th one chosen?
j) Suppose voters are asked for their preferences, one after the other. What is the probability that the 4th John Smith voter shall be the 15th one chosen?
Solution.
a)
1. There are 2 outcomes: an individual is a John Smith supporter or not;
2. There is a fixed number of "trials", n = 12;
3. The probability of success, i.e. being a John Smith supporter, is constant, P = 0.41;
4. The outcomes [supporter / The Devil] are independent of each other since there are at least 120 individuals in the population [N ≥ 10n = 10·12]
b) X is a r.v. denoting the Number of John Smith supporters in a sample of 12; X ~ B(12, 0.41)
c) P(X = 6) = 12C6 (0.41)^6 (0.59)^6 = 18.51% [Use the BinomPdf command because of the EQUAL TO sign…]
d) P(X > 8) = 1 – P(X ≤ 8) [Use a Number Line to see why!]
= 1 – ∑ [x = 0 to 8] 12Cx (0.41)^x (0.59)^(12 – x) [Use the BinomCdf command because of the LESS THAN OR EQUAL TO sign…]
≈ 1.83%
e) P(X < 3) = P(X ≤ 2) [Use a Number Line to see why!]
= ∑ [x = 0 to 2] 12Cx (0.41)^x (0.59)^(12 – x) [Use the BinomCdf command because of the LESS THAN OR EQUAL TO sign…]
≈ 7.33%
f) P(X ≤ 5)
= ∑ [x = 0 to 5] 12Cx (0.41)^x (0.59)^(12 – x) [Use the BinomCdf command because of the LESS THAN OR EQUAL TO sign…]
= 63.84%
g) P(X ≥ 1) = 1 – P(X = 0) [Use a Number Line to see why!]
= 1 – 12C0 (0.41)^0 (0.59)^12 [Use the BinomPdf command because of the EQUAL TO sign…]
= 99.82%
h) E(X) = nP = 12(0.41) = 4.92
σ(X) = √(nPQ) = √(12(0.41)(0.59)) = 1.7038
i) This is not a Binomial event since the number of trials is not fixed…this is just a simple Probability Rules situation. If A ~ event that a voter is a John Smith supporter, required: P(A’, A’, A’, A’, A) = P(A’)^4 P(A), assuming independence of voting preferences, = (1 – 0.41)^4 (0.41) ≈ 4.97%
j) If the 4th John Smith voter shall be the 15th one chosen, then we must have 3 John Smith voters amongst the 1st 14…and the last (15th) fellow must be a John Smith voter!
The 1st of these is a Binomial event: X is a r.v. denoting the Number of John Smith supporters in a sample of 14; X ~ B(14, 0.41)...and the 2nd probability is simply 0.41!
Required: P(X = 3) × 0.41 [Use the BinomPdf command because of the EQUAL TO sign…]
= [14C3 (0.41)^3 (0.59)^11] × 0.41
= 3.1%
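Parts i) and j) follow the same pattern: a Binomial event on the earlier voters, times one more success. A minimal Python sketch of that reasoning follows; prob_rth_success_on_trial is a hypothetical helper name introduced here, not a calculator command.

```python
from math import comb

def binom_pdf(n, p, x):
    """P(X = x) = nCx * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def prob_rth_success_on_trial(r, k, p):
    """P(the r-th success lands exactly on trial k).

    Needs r - 1 successes among the first k - 1 trials (a Binomial event),
    followed by a success on trial k itself.
    """
    return binom_pdf(k - 1, p, r - 1) * p

part_i = prob_rth_success_on_trial(1, 5, 0.41)    # 1st supporter is the 5th voter
part_j = prob_rth_success_on_trial(4, 15, 0.41)   # 4th supporter is the 15th voter
print(round(part_i, 4), round(part_j, 4))
```

With r = 1 the Binomial factor reduces to P(no supporters among the first k – 1 voters), which is exactly the P(A’)^(k – 1) P(A) product used in part i).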
Example 3
Suppose 26% of drivers in a city are driving without seat–belts. If we randomly select 15 drivers on a weekend,
a) In phrases [only], describe how the 4 conditions of a Binomial situation are met in context.
b) Define a suitable Binomial r.v., X.
c) What is the probability exactly 3 drivers are driving without seat–belts?
d) What is the probability more than 6 drivers are driving without seat–belts?
e) What is the probability fewer than 5 drivers would be driving without seat–belts?
f) What is the probability not more than 5 drivers are driving without seat–belts?
g) What is the probability of getting at least 2 drivers that are driving without seat–belts?
h) How many drivers would you expect to be driving without seat–belts? What is the s.d. of the number of drivers?
i) Suppose drivers are observed for their seat–belt compliance. What is the probability that the 1st driver not wearing his seat–belt is the 4th one chosen?
j) Suppose drivers are observed for their seat–belt compliance, one after the other. What is the probability that the 6th driver not wearing a seat–belt is the 10th one chosen?
Solution.
a)
2 outcomes: driver not wearing a seat–belt / wearing one;
P(not wearing a seat–belt) = 0.26 = constant;
whether one driver wears a seat–belt is independent of the other drivers;
fixed number of trials, n = 15
b) X ~ r.v. denoting # of drivers not wearing a seat–belt amongst 15: X ~ Bin(15, 0.26)
c) P(X = 3) = 21.56% [Show nCx notation.]
d) P(X > 6) = 1 – P(X ≤ 6) [Use a Number Line to see why...] ≈ 6.84% [Show nCx notation.]
e) P(X < 5) = P(X ≤ 4) [Use a Number Line to see why...] = 65.31% [Show nCx notation.]
f) P(X ≤ 5) = 82.87% [Show nCx notation.]
g) P(X ≥ 2) = 1 – P(X ≤ 1) [Use a Number Line to see why...] ≈ 93.15% [Show nCx notation.]
h) E(X) = nP = 15(0.26) = 3.9, σ(X) = √(nPQ) = √(15(0.26)(0.74)) = 1.6988
i) This is not a Binomial event since the number of trials is not fixed…this is just a simple Probability Rules situation. If A ~ event that a driver is not wearing a seat–belt, required: P(A’, A’, A’, A) = P(A’)^3 P(A), assuming independence of seat–belt wearing behaviour, = (1 – 0.26)^3 (0.26) ≈ 10.54%
j) If the 6th driver not wearing a seat–belt shall be the 10th one chosen, then we must have 5 non–seat–belt drivers amongst the 1st 9!…and the last (10th) fellow must be a non–seat–belt driver!
The 1st of these is a Binomial event: X is a r.v. denoting the Number of drivers not wearing seat–belts in a sample of 9; X ~ B(9, 0.26)...and the 2nd probability is simply 0.26!
Required: P(X = 5) × 0.26 [Use the BinomPdf command because of the EQUAL TO sign…]
= [9C5 (0.26)^5 (0.74)^4] × 0.26
≈ 1.17%
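For the "Show nCx notation" prompts in this solution, a small Python sketch can write out the terms and evaluate them at the same time. The helpers ncx_term and show are hypothetical names introduced for illustration; the caret notation for powers is an assumption about how the terms are typed.

```python
from math import comb

def ncx_term(n, p, x):
    """One term in nCx notation, e.g. '15C3 (0.26)^3 (0.74)^12'."""
    q = round(1 - p, 10)  # tidy Q = 1 - P for display
    return f"{n}C{x} ({p})^{x} ({q})^{n - x}"

def show(n, p, lo, hi):
    """Return P(lo <= X <= hi) written in nCx notation, plus its value."""
    xs = range(lo, hi + 1)
    text = " + ".join(ncx_term(n, p, x) for x in xs)
    value = sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in xs)
    return text, value

# Part c): P(X = 3) for X ~ Bin(15, 0.26)
text_c, value_c = show(15, 0.26, 3, 3)
print(text_c, "=", round(value_c, 4))

# Part g): P(X >= 2) = 1 - P(X <= 1)
text_g, value_g = show(15, 0.26, 0, 1)
print("1 - [" + text_g + "] =", round(1 - value_g, 4))
```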
Example 4
Suppose we're interested in finding out about the support a candidate John Smith has, and we randomly interview 12 individuals. Assume his overall approval rating to be 41%.
a) What is the probability that at least 7 chaps approve of John Smith?
b) What is the probability that more than 6 chaps disapprove of John Smith?
c) What is the probability that fewer than 3 chaps disapprove of John Smith?
Solution.
a) Let X be a r.v. denoting the Number of John Smith supporters in a sample of 12; X ~ B(12, 0.41)
P(X ≥ 7) = 1 – P(X ≤ 6)
= 1 – ∑ [x = 0 to 6] 12Cx (0.41)^x (0.59)^(12 – x)
≈ 17.65%
b) Method I [a little complicated!]
More than 6 chaps disapproving of John Smith
~ 7, 8...11, 12 chaps disapproving
~ 5, 4, ...1, 0 chaps approving of John Smith [because we have 12 chaps in all!].
P(X ≤ 5) = ∑ [x = 0 to 5] 12Cx (0.41)^x (0.59)^(12 – x)
= 63.84%
Method II [very elegant!!]
Let Y be a r.v. denoting # of individuals who disapprove of John Smith
Y ~ B(12, 0.59)
Required: P(Y > 6) = 1 – P(Y ≤ 6)
= 1 – ∑ [y = 0 to 6] 12Cy (0.59)^y (0.41)^(12 – y)
= 63.84%
c) P(Y < 3) = P(Y ≤ 2)
= ∑ [y = 0 to 2] 12Cy (0.59)^y (0.41)^(12 – y)
= 0.35%
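Methods I and II in part b) should land on the same probability, since "more than 6 disapprove" and "at most 5 approve" are the same event among 12 chaps. A quick Python sketch of that check, with binom_cdf as an assumed stand-in for the calculator's BinomCdf:

```python
from math import comb

def binom_cdf(n, p, k):
    """P(X <= k) for X ~ B(n, p), summed term by term."""
    return sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(k + 1))

# Method I: count approvers, X ~ B(12, 0.41), and find P(X <= 5).
method_1 = binom_cdf(12, 0.41, 5)

# Method II: count disapprovers, Y ~ B(12, 0.59), and find P(Y > 6).
method_2 = 1 - binom_cdf(12, 0.59, 6)

# Part c): fewer than 3 disapprove, P(Y <= 2).
part_c = binom_cdf(12, 0.59, 2)

print(round(method_1, 4), round(method_2, 4), round(part_c, 4))
```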
Example 5
Historically, the pass rate for AP Calculus AB has been about 59.1%. Suppose we randomly select 150 AP Calculus AB students.
a) Describe how the 4 conditions of a Binomial situation are met in context.
b) Define a suitable Binomial r.v., X and state its distribution.
c) What is the probability 94 students pass the AB Exam?
d) What is the probability more than 70 students shall fail the AB Exam?
e) What is the probability fewer than 80 students shall pass the AB Exam?
f) What is the probability at least 35 students shall fail the AB Exam?
g) What is the probability that between 86 and 110 students shall pass the AB Exam? Tip! Draw a Number Line to determine how BinomCdf shall be used cleverly…
h) How many would you expect to pass the AB Exam? With what s.d.?
i) How unusual would it be for 70 students out of 150 to pass the AB Exam? If this were found to be true, what might you conclude about the pass rate? Write a detailed 3–sentence conclusion. Solve using the Normal Approximation. Tip! Find the distribution of X.
j) Suppose students are repeatedly asked for their performance [pass / fail the Calculus AB Exam]. What is the probability that the 1st student that passed the Exam is the 6th one chosen?
k) Suppose students are repeatedly asked for their performance [pass / fail the Calculus AB Exam]. What is the probability that the 10th student that passed is the 17th one chosen?
Solution.
a)
2 outcomes: student passes / fails the AB Exam;
P(passing AB Exam) = 0.591 = constant;
outcomes are independent since there are at least 1500 students [N ≥ 10n = 10·150] taking the AB Exam;
fixed number of trials, n = 150 students
b) X ~ r.v. denoting # of students who pass the AB Exam amongst 150: X ~ Bin(150, 0.591)
c) P(X = 94) = 150C94 (0.591)^94 (0.409)^56 = 4.51%
d) If Y is a r.v. denoting the Number of students who Fail the AB Exam, then Y ~ Bin(150, 0.409).
P(Y > 70) = P(Y ≥ 71) = 1 – P(Y ≤ 70)
= 1 – ∑ [y = 0 to 70] 150Cy (0.409)^y (0.591)^(150 – y)
= 6.5%
e) P(X < 80) = P(X ≤ 79) = ∑ [x = 0 to 79] 150Cx (0.591)^x (1 – 0.591)^(150 – x)
= 6.5%
f) P(Y ≥ 35) = 1 – P(Y ≤ 34)
= 1 – ∑ [y = 0 to 34] 150Cy (0.409)^y (0.591)^(150 – y)
= 99.99%
g) P(86 ≤ X ≤ 110) [Can you see why?! Draw a number line and shade 86 – 110]
= P(X ≤ 110) – P(X ≤ 85)
= ∑ [x = 0 to 110] 150Cx (0.591)^x (1 – 0.591)^(150 – x) – ∑ [x = 0 to 85] 150Cx (0.591)^x (1 – 0.591)^(150 – x)
= 70.06%
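The number-line argument in part g) amounts to subtracting two cumulative probabilities. A minimal Python sketch, where prob_between is a hypothetical helper name and binom_cdf is an assumed stand-in for the calculator's BinomCdf:

```python
from math import comb

def binom_cdf(n, p, k):
    """P(X <= k) for X ~ B(n, p)."""
    return sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(k + 1))

def prob_between(n, p, a, b):
    """P(a <= X <= b) = P(X <= b) - P(X <= a - 1), endpoints included."""
    return binom_cdf(n, p, b) - binom_cdf(n, p, a - 1)

# Part g): between 86 and 110 passes out of 150, P = 0.591.
part_g = prob_between(150, 0.591, 86, 110)
print(round(part_g, 4))
```

The lower cut-off is a – 1 = 85, not 86, because 86 itself must stay inside the shaded region.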
h) E(X) = nP = 150·0.591 = 88.65
σ(X) = √(nPQ) = √(150·0.591·(1 – 0.591)) = 6.02144.
i) P(X ≤ 70) [since 70 lies below E(X) = 88.65]
Centre: E(X) = nP = 150·0.591 = 88.65
Spread: σ(X) = √(nPQ) = √(150·0.591·(1 – 0.591)) = 6.02144.
Shape: Since nP = 88.65 > 5, and n(1 – P) = 150 – 88.65 = 61.35 > 5, X ≈ N(88.65, 6.02144)
P(X ≤ 70) = 0.0009766
Note: Precisely, P(X ≤ 70) = ∑ [x = 0 to 70] 150Cx (0.591)^x (1 – 0.591)^(150 – x) = 0.001405.
MASTER THIS FORMULATION: Our P-value of 0.0009766 or 0.09766% indicates that if indeed the Calculus AB pass rate is 59.1%, we'd get a result at least as extreme as that observed, i.e. 70 or fewer students of 150 passing the Exam, only 0.09766% of the time. Since P-value = 0.09766% < α = 5%, we find the results statistically significant, and not attributable to natural sampling variations. We conclude that we did find evidence that the pass-rate is lower than 59.1%!
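The Normal Approximation used for the P-value in part i) can be compared against the exact Binomial sum with a short Python sketch. The helper normal_cdf is a hypothetical name built on the standard error function; no continuity correction is applied, matching the calculation in the solution.

```python
from math import comb, erf, sqrt

def normal_cdf(x, mu, sigma):
    """P(N(mu, sigma) <= x), via the error function."""
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

n, p = 150, 0.591
mu = n * p                       # Centre: nP
sigma = sqrt(n * p * (1 - p))    # Spread: sqrt(nPQ)

approx = normal_cdf(70, mu, sigma)  # Normal Approximation of P(X <= 70)
exact = sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(71))

print(round(mu, 2), round(sigma, 5))
print(approx, exact)
```

Both tail probabilities are far below α = 5%, so the conclusion does not depend on which of the two is used.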
j) If A ~ event that a student passes the AB Exam, required: P(A’, A’, A’, A’, A’, A) = P(A’)^5 P(A), assuming independence of scores, = (1 – 0.591)^5 (0.591) = Finish.
k) If the 10th student that passed is the 17th one chosen, then we must have 9 students that passed amongst the 1st 16 students!…and the last (17th) fellow must have passed, too!
The 1st of these is a Binomial event: X ~ B(16, 0.591)...and the 2nd probability is simply 0.591!
Required, P = [16C9 (0.591)^9 (0.409)^7] × 0.591 = Finish.