# Chapter3.ReasoningfromSampletoPopulation.pptx

Reasoning from Sample to Population

Chapter 3

© 2019 McGraw-Hill Education. All rights reserved. Authorized only for instructor use in the classroom. No reproduction or distribution without the prior written consent of McGraw-Hill Education

Learning Objectives

Calculate standard summary statistics for a given data sample.

Explain the reasoning inherent in a confidence level.

Construct a confidence interval.

Explain the reasoning inherent in a hypothesis test.

Execute a hypothesis test.

Outline the roles of deductive and inductive reasoning in making active predictions.

Population parameter: a numerical expression that summarizes some feature of the population

Objective degree of support using inductive and deductive reasoning

Construction of a confidence interval

Hypothesis testing

Distributions and Sample Statistics

Random variable

A variable that can take on multiple values, with any given realization of the variable being due to chance (or randomness)

Deterministic variable

A variable whose value can be predicted with certainty

Distributions of Random Variables

Random Variables

Distributions of Random Variables

DISCRETE

COUNTABLE NUMBER OF VALUES (e.g., 5, 9, 19, 27…)

CONTINUOUS

UNCOUNTABLY INFINITE NUMBER OF VALUES (E.G., ALL THE NUMBERS, TO ANY DECIMAL PLACE, BETWEEN 0 AND 1)

Distributions of Random Variables

The probabilities of individual outcomes for a discrete random variable are represented by a probability function

EXAMPLE

OF 10 PEOPLE:

3 OF THEM ARE 25 YEARS OLD,

4 OF THEM ARE 30 YEARS OLD,

2 OF THEM ARE 40 YEARS OLD,

1 OF THEM IS 45 YEARS OLD.

PROBABILITY THAT A SINGLE DRAW WILL BE:

25 YEARS OLD IS 3/10 = 0.3

30 YEARS OLD IS 4/10 = 0.4,

40 YEARS OLD IS 2/10 = 0.2,

45 YEARS OLD IS 1/10 = 0.1.

DISCRETE RANDOM VARIABLE

POPULATION
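
As a minimal sketch of the probability function in this example, the Python below rebuilds the ten-person population and divides each age's count by the population size. The variable names (`population`, `prob`, `total`) are illustrative and not part of the original slides.

```python
from collections import Counter
from fractions import Fraction

# The ten ages from the example population on the slide.
population = [25] * 3 + [30] * 4 + [40] * 2 + [45] * 1

# Probability function: P(age = a) = (count of a) / (population size).
counts = Counter(population)
prob = {age: Fraction(n, len(population)) for age, n in counts.items()}

for age in sorted(prob):
    print(age, float(prob[age]))

# The probabilities of all possible outcomes must sum to 1.
total = sum(prob.values())
print("total:", total)
```

Exact fractions are used here so that the probabilities sum to exactly 1 with no floating-point rounding.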

Graphical Representation for a Discrete Random Variable (Age)

Distributions of Random Variables

The probabilities of individual outcomes for a continuous random variable are represented by a probability density function (pdf)

A special type of continuous random variable is called a normal random variable, which has a “bell-shaped” pdf

Graphical Representation for a Normal Random Variable

Distributions and Sample Statistics

For a normal random variable, and any other continuous random variable, the pdf allows us to calculate the probabilities that the random variable falls in various ranges.

The probability that a random variable falls between two numbers A and B is the area under the pdf curve between A and B
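
For a normal random variable, the area under the pdf between A and B equals the difference of the cumulative distribution function (cdf) at B and at A. The sketch below uses Python's standard-library `statistics.NormalDist`; the helper name `prob_between` and the default mean and standard deviation are assumptions chosen for illustration.

```python
from statistics import NormalDist

def prob_between(a, b, mu=0.0, sigma=1.0):
    """P(a < X < b) for X normal with mean mu and standard deviation sigma."""
    dist = NormalDist(mu, sigma)
    # Area under the pdf between a and b = cdf(b) - cdf(a).
    return dist.cdf(b) - dist.cdf(a)

# Probability that a standard normal variable falls within 1.96 of its mean.
p = prob_between(-1.96, 1.96)
print(round(p, 4))
```

With A = -1.96 and B = 1.96 the result is approximately 0.95, the figure used later when building 95% confidence intervals.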

Probability that a Random Variable Falls Between Two Numbers

Distributions of Random Variables

Expected Value or Population Mean

The summation of each possible realization of Xi multiplied by the probability of that realization.

Variance

A common measure for the spread of the distribution; defined by $E[(X_i - E[X_i])^2]$.

Standard Deviation

The square root of the variance.
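
To make these three definitions concrete, the sketch below applies them to the age distribution from the earlier discrete example (probabilities 0.3, 0.4, 0.2, and 0.1 for ages 25, 30, 40, and 45). The variable names are illustrative only.

```python
from fractions import Fraction
from math import sqrt

# Probability function for age, taken from the discrete example.
prob = {
    25: Fraction(3, 10),
    30: Fraction(4, 10),
    40: Fraction(2, 10),
    45: Fraction(1, 10),
}

# Expected value: each realization times its probability, summed.
expected = sum(x * p for x, p in prob.items())

# Variance: expected squared deviation from the expected value.
variance = sum((x - expected) ** 2 * p for x, p in prob.items())

# Standard deviation: square root of the variance.
std_dev = sqrt(variance)

print(expected, variance, round(std_dev, 3))
```

For this population the expected value is 32 and the variance is 46, so the standard deviation is the square root of 46, roughly 6.78.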

Data Samples and Sample Statistics

Sample Size of N

A collection of N realizations of $X_i$: $\{X_1, X_2, \ldots, X_N\}$

Sample Statistics

Single measures of some feature of a data sample

Sample Mean

A common measure of the center of a sample

Sample Variance

Common measure of the spread of a sample

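For a sample of size N of the random variable $X_i$, the sample variance is $S^2 = \frac{1}{N-1}\sum_{i=1}^{N}(X_i - \bar{X})^2$, where $\bar{X}$ is the sample mean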

Sample Standard Deviation

The square root of the sample variance

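For a sample of size N of the random variable $X_i$, the sample standard deviation is $S = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N}(X_i - \bar{X})^2}$, where $\bar{X}$ is the sample mean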

Data and Sample Statistics
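
The sample statistics above can be sketched in a few lines of Python. The five ages below are hypothetical values invented for illustration, and the code assumes the usual N - 1 denominator for the sample variance (which is also what the standard-library `statistics.variance` uses).

```python
import statistics
from math import sqrt

# Hypothetical sample of N = 5 observed ages (not data from the slides).
sample = [23.0, 31.0, 35.0, 42.0, 29.0]
n = len(sample)

# Sample mean: the sum of the observations divided by N.
mean = sum(sample) / n

# Sample variance with the N - 1 denominator (an assumption stated above).
variance = sum((x - mean) ** 2 for x in sample) / (n - 1)

# Sample standard deviation: square root of the sample variance.
std_dev = sqrt(variance)

print(mean, variance, round(std_dev, 3))

# The standard library gives the same results.
lib_mean = statistics.mean(sample)
lib_variance = statistics.variance(sample)
lib_std_dev = statistics.stdev(sample)
```

Writing the formulas out by hand next to the library calls is a simple way to confirm which denominator a given tool uses.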

Confidence Interval

Suppose a firm wants to know the average age of its customers

It collects data from 872 of its customers, so its sample size is N = 872

$Age_i$ = a random variable defined as the age of a single customer

$age_i$ = the observed age of customer i in the sample

Confidence Interval

Estimator

A calculation using sample data that is used to provide information about a population parameter

Random sample

A sample where every member of the population has an equal chance of being selected

Confidence Interval

Deductive argument: If we have a random sample, the sample mean is a “reasonable guess” for the population mean

Inductive argument: Then the population mean is the same as the sample mean

How sure are we that the population mean in our example is the same as the sample mean?

Confidence interval: a range of values such that there is a specified probability that they contain a population parameter

Confidence Interval

How do we build confidence intervals and determine their objective degree of support?

Independent

The distribution of one random variable does not depend on the realization of another

Independent and identically distributed (i.i.d.)

The distribution of one random variable does not depend on the realization of another, and each has an identical distribution.

Confidence Interval

Unbiased estimator

An estimator whose mean is equal to the population parameter it is used to estimate

Population standard deviation

The square root of the population variance

Population variance

The variance of a random variable over the entire population

Data and Sample Statistics

To construct a confidence interval for the population mean and know its objective degree of support, we must know something about the standard deviation and the type of distribution of the sample mean

The assumption that a data sample is a random sample implies the standard deviation of the sample mean is $\sigma/\sqrt{N}$, where $\sigma$ is the population standard deviation

The spread of the sample mean gets smaller as the sample size increases
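
One way to see this result is a small simulation: draw many random samples of the same size, compute each sample mean, and compare the spread of those means with the population standard deviation divided by the square root of N. The population values, sample size, and number of draws below are hypothetical choices, and a normal population is assumed only for convenience.

```python
import random
import statistics
from math import sqrt

MU, SIGMA = 40.0, 12.0   # hypothetical population mean and standard deviation
N = 36                   # size of each random sample
DRAWS = 2000             # number of repeated samples

rng = random.Random(0)   # fixed seed so the run is repeatable

# Draw many i.i.d. samples of size N and record each sample mean.
sample_means = [
    statistics.fmean([rng.gauss(MU, SIGMA) for _ in range(N)])
    for _ in range(DRAWS)
]

simulated_sd = statistics.stdev(sample_means)
theoretical_sd = SIGMA / sqrt(N)   # population sd divided by sqrt(N)

print(f"simulated:   {simulated_sd:.3f}")
print(f"theoretical: {theoretical_sd:.3f}")
```

Increasing `N` in this sketch shrinks both numbers, which is the sense in which the spread of the sample mean falls as the sample size grows.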

Data and Sample Statistics

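Assuming a random sample with reasonably large N (> 30) implies that the sample mean is normally distributed with a mean of $\mu$ and a standard deviation of $\sigma/\sqrt{N}$

This can be written as: $\bar{X} \sim N(\mu, \sigma/\sqrt{N})$, where the second argument is the standard deviation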

Probability Sample Mean within 1.96 Standard Deviations of Population Mean
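
Because the sample mean falls within 1.96 standard deviations of the population mean about 95% of the time, a 95% confidence interval can be sketched as the sample mean plus or minus 1.96 times the standard deviation of the sample mean. In the Python below, the function name `confidence_interval` is hypothetical, the sample standard deviation is assumed to stand in for the unknown population standard deviation, and the sample mean and standard deviation are made-up placeholders; only N = 872 comes from the customer-age example.

```python
from math import sqrt
from statistics import NormalDist

def confidence_interval(sample_mean, sample_sd, n, level=0.95):
    """Return (lower, upper) for the population mean, normal approximation."""
    # Two-sided critical value: about 1.96 when level = 0.95.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    # Estimated standard deviation of the sample mean: sample_sd / sqrt(n).
    margin = z * sample_sd / sqrt(n)
    return sample_mean - margin, sample_mean + margin

# N = 872 is from the example; 43.0 and 12.0 are hypothetical placeholders.
lower, upper = confidence_interval(sample_mean=43.0, sample_sd=12.0, n=872)
print(f"95% confidence interval: ({lower:.2f}, {upper:.2f})")
```

Passing `level=0.90` or `level=0.99` gives the other commonly used intervals; a higher level widens the interval.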

Hypothesis Testing

A hypothesis test is the process of using sample data to assess the credibility of a hypothesis about a population

Making an assessment

Reject the hypothesis

Fail to reject the hypothesis

Hypothesis Testing

Null hypothesis

The hypothesis to be tested using a data sample

Written as H0: µ = K, where K is the hypothesized value for the population mean

The objective is to determine whether the null hypothesis is credible given the data we observe.

If a sample of size N is a random sample, N is “large” (> 30), and $\mu = K$, then $\bar{X} \sim N(K, \sigma/\sqrt{N})$, where the second argument is the standard deviation

Probability Sample Mean within 1.96 Standard Deviations of Hypothesized Population Mean

Hypothesis Testing

Steps in Hypothesis Testing:

State the null hypothesis

Collect the data sample and calculate the sample mean

Decide whether or not to reject the deduced distribution for the sample mean

Degree of support

Measure how many standard deviations the sample mean is from the hypothesized population mean

$Z = \frac{\bar{X} - K}{\sigma/\sqrt{N}}$

Hypothesis Testing

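To calculate Z, take the difference between the sample mean and the hypothesized population mean ($\bar{X} - K$)

Then take that difference and divide it by the standard deviation of the sample mean ($\sigma/\sqrt{N}$)

The t-stat replaces the unknown population standard deviation $\sigma$ with the sample standard deviation $S$: it is the difference between the sample mean and the hypothesized population mean ($\bar{X} - K$) divided by the estimated standard deviation of the sample mean ($S/\sqrt{N}$), or $t = \frac{\bar{X} - K}{S/\sqrt{N}}$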
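
The t-stat calculation can be sketched as a short Python function. The function name `t_stat` and the input values are hypothetical, and the code assumes the difference is divided by the sample standard deviation over the square root of N (the estimated standard deviation of the sample mean); only N = 872 comes from the customer-age example.

```python
from math import sqrt

def t_stat(sample_mean, hypothesized_mean, sample_sd, n):
    """Number of estimated standard deviations between sample mean and K."""
    difference = sample_mean - hypothesized_mean
    # Estimated standard deviation of the sample mean.
    sd_of_mean = sample_sd / sqrt(n)
    return difference / sd_of_mean

# Hypothetical inputs: sample mean 43, null hypothesis K = 42, sample sd 12.
t = t_stat(sample_mean=43.0, hypothesized_mean=42.0, sample_sd=12.0, n=872)
print(round(t, 3))
```

A positive value means the sample mean lies above the hypothesized mean; the sign flips when it lies below.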

Hypothesis Testing

Test statistic

Any single value derived from a sample that can be used to perform a hypothesis test

p-value

The probability of attaining a test statistic at least as extreme as the one that was observed

Graphical Illustration of a P-Value

Hypothesis Testing

The t-stat is an observed value from a t-distribution, a distribution that resembles a normal distribution and is centered at zero

In Excel, a p-value can be calculated using the formula:

2 × (1 − NORM.S.DIST(|t|, TRUE)), where |t| is the absolute value of the t-stat

If the observed t-stat is very unlikely (has a low p-value), then reject the deduced distribution; otherwise, fail to reject it

If the p-value is less than the cutoff, reject; otherwise, fail to reject
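
A rough Python counterpart to the spreadsheet calculation uses the standard normal cdf from the standard library. The function name `p_value` is hypothetical, and, as with the spreadsheet formula, the normal distribution is assumed as an approximation to the t-distribution, which is reasonable for large N.

```python
from statistics import NormalDist

def p_value(t):
    """Two-sided p-value for a test statistic, normal approximation."""
    # Probability of a value at least as extreme as |t| in either tail.
    return 2 * (1 - NormalDist().cdf(abs(t)))

for t in (0.5, 1.96, 2.58):
    print(t, round(p_value(t), 4))
```

A t-stat of 1.96 gives a p-value of about 0.05, which matches the 1.96-standard-deviation band used for 95% confidence.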

Hypothesis Testing

Cutoffs using p-values directly correspond to the degree of support you choose for your inductive argument

If your chosen degree of support is D%, then the cutoff is $(100 - D)\%$, or $1 - D/100$

With a 95% degree of support, for example, this rule will reject a correct distribution 5% of the time, because 5% of the time you will observe a p-value less than 0.05 even though the deduced distribution for the sample mean is correct

Hypothesis Testing

The standard degrees of confidence used are 90%, 95%, and 99%; the corresponding cutoffs using p-values are 0.10, 0.05, and 0.01

Reject the distribution if the p-value is less than 0.10; fail to reject otherwise. This generates a degree of support of 90%.

Reject the distribution if the p-value is less than 0.05; fail to reject otherwise. This generates a degree of support of 95%.

Reject the distribution if the p-value is less than 0.01; fail to reject otherwise. This generates a degree of support of 99%.
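
The three rules above share one pattern: convert the chosen degree of support D% into a cutoff of 1 - D/100 and compare the p-value against it. The sketch below expresses that pattern in Python; the function name `decide` and the example p-values are hypothetical.

```python
def decide(p_value, support_pct):
    """Apply the rejection rule for a chosen degree of support (in percent)."""
    cutoff = 1 - support_pct / 100   # e.g. 95% support -> cutoff near 0.05
    return "reject" if p_value < cutoff else "fail to reject"

# The same hypothetical p-value can lead to different decisions.
for support in (90, 95, 99):
    print(support, decide(0.03, support))
```

A p-value of 0.03 leads to rejection at 90% and 95% support but not at 99%, which shows why the degree of support should be chosen before looking at the data.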

The Interplay Between Deductive and Inductive Reasoning in Active Predictions

The underlying reasoning for active predictions:

Forming the prediction uses deductive reasoning

Assume the causal relationship, which then implies the prediction

Estimating the causal relationship uses deductive and inductive reasoning

Deductive reasoning: Make assumptions that imply causality between X and Y and the distribution of an estimator for the magnitude of this causality in the population

Inductive reasoning: Using an observed data sample, build a confidence interval and/or determine whether to reject a null hypothesis for the magnitude of the population-level causality
