Use the attached document “Correlation and Regression” to complete the assignment.
This assignment uses a rubric. Please review the rubric prior to beginning the assignment to become familiar with the expectations for successful completion.
PSY-520 Graduate Statistics
Topic 3 – Benchmark – Correlation and Regression Project
Directions: Use the following information to complete the questions below.
Use the following data points that have a linear relationship:
Substance Abuse and Suicide: Percent of the Total U.S. Population
Year | Substance Use (X VARIABLE) | Suicides (Y VARIABLE) |
1999 | 6.82 | 0.000105 |
2000 | 6.78 | 0.000104 |
2001 | 7.00 | 0.000107 |
2002 | 7.12 | 0.00011 |
2003 | 7.09 | 0.000108 |
2004 | 7.05 | 0.00011 |
2005 | 7.17 | 0.000109 |
2006 | 7.24 | 0.00011 |
2007 | 7.28 | 0.000113 |
2008 | 7.36 | 0.000116 |
2009 | 7.80 | 0.000118 |
2010 | 7.81 | 0.000121 |
2011 | 7.88 | 0.000123 |
2012 | 7.87 | 0.000125 |
2013 | 8.07 | 0.000126 |
2014 | 8.12 | 0.00013 |
2015 | 8.06 | 0.000133 |
2016 | 8.17 | 0.000134 |
2017 | 8.18 | 0.00014 |
2018 | 8.23 | 0.000142 |
In 500-750 words, address the following:
Identify the correlation coefficient for each of the possible pairings of variables. Describe the relationship in terms of strength (weak/strong) and direction (positive/negative).
Find a linear model of the relationship between the three variables of interest. Identify the predictor variables and the criterion variable.
Provide an output of the SPSS results and interpret the results using correct APA style. Be sure to include the following in your interpretation:
· Cause and effect concepts
· Independent/dependent variable relationships?
· Why is it important to do random sampling?
· What is regression fallacy? How may it apply to the relationships discovered?
USEFUL NOTES FOR:
Correlation and Regression Project
Introduction
If you’re like me, then you’re probably familiar with the term “correlation coefficient” as it relates to statistics. A correlation coefficient measures the strength and direction of the linear association between two variables; the closer its value is to -1 or 1, the stronger the relationship, and a value near 0 indicates little or no linear relationship. For example, if the fuel gauge on my car read a little lower for every extra day the car sat parked, then days parked and fuel level would be strongly negatively correlated. But what does this mean in practice? To answer that question, let’s look at some examples.
Calculate the mean, median, and mode for each variable.
Mean, median, and mode are all measures of central tendency. The mean is the sum of all values divided by the number of values; for example, if you have 30 students in your class and they each write five reports on a topic they studied in class, then the class wrote 150 reports in total and the mean is 150/30 = 5 reports per student. The median is the middle value once the values are sorted, so half of the values lie at or below it and half lie at or above it; with an even number of values, it is the average of the two middle values. In our example every student wrote five reports, so the median is also 5. And finally there is the mode: this is simply the value that occurs most often in the data, which here is again 5.
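As a rough cross-check on SPSS, the three measures can be sketched in Python with the standard library. The block below uses the Suicides column from the assignment table; the variable names are illustrative choices, not part of the assignment.

```python
import statistics

# Suicides column from the assignment table
# (percent of the total U.S. population, 1999-2018).
suicides = [
    0.000105, 0.000104, 0.000107, 0.00011, 0.000108,
    0.00011, 0.000109, 0.00011, 0.000113, 0.000116,
    0.000118, 0.000121, 0.000123, 0.000125, 0.000126,
    0.00013, 0.000133, 0.000134, 0.00014, 0.000142,
]

# Mean: sum of the values divided by how many values there are.
mean_value = statistics.mean(suicides)

# Median: middle of the sorted values; with 20 values it is the
# average of the 10th and 11th sorted values.
median_value = statistics.median(suicides)

# Mode: the most frequent value. multimode returns a list so that
# ties (or data with no repeated values) are visible.
modes = statistics.multimode(suicides)

print(f"mean   = {mean_value:.7f}")
print(f"median = {median_value:.7f}")
print(f"mode(s) = {modes}")
```

The same three calls can be repeated for the Substance Use column. Every value in that column is distinct, so multimode would return all 20 values, which is a sign that the mode is not informative for that variable.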
Find the range for each variable.
The range is the difference between the largest and smallest values in your dataset.
The range is one of the simplest measures of spread. It depends only on the two most extreme values in your data, such as the largest and smallest amounts a certain company paid you last month, so a single unusual value can change it considerably.
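A minimal sketch of the range calculation for the two measured variables in the assignment table follows. The helper name value_range is a hypothetical choice made for this illustration.

```python
def value_range(values):
    """Return the largest value minus the smallest value."""
    return max(values) - min(values)


# Columns copied from the assignment table (1999-2018).
substance_use = [
    6.82, 6.78, 7.00, 7.12, 7.09, 7.05, 7.17, 7.24, 7.28, 7.36,
    7.80, 7.81, 7.88, 7.87, 8.07, 8.12, 8.06, 8.17, 8.18, 8.23,
]
suicides = [
    0.000105, 0.000104, 0.000107, 0.00011, 0.000108,
    0.00011, 0.000109, 0.00011, 0.000113, 0.000116,
    0.000118, 0.000121, 0.000123, 0.000125, 0.000126,
    0.00013, 0.000133, 0.000134, 0.00014, 0.000142,
]

print(f"Substance Use range: {value_range(substance_use):.2f}")
print(f"Suicides range:      {value_range(suicides):.6f}")
```

Because the two columns are on very different scales, their ranges are not directly comparable; each one only describes the spread of its own variable.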
Find the variance and standard deviation for each variable.
Variance and standard deviation are two ways to measure dispersion. The variance is the average of the squared differences from the mean (for a sample, the sum of squared differences is usually divided by n - 1 rather than n), while the standard deviation is simply the square root of the variance.
Variance and standard deviation are both measures of spread; that is, they tell you how far the values typically fall from the mean. The more spread out your data set is (the higher its variance), the harder it can be to see patterns in it, and unusually extreme values (outliers) inflate both measures.
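The definitions above can be sketched step by step in Python for the Substance Use column. This is an illustration under the assumption that the 20 yearly values are treated as a sample, so the n - 1 version is the one to compare with statistical software; the variable names are hypothetical.

```python
import math
import statistics

# Substance Use column from the assignment table (1999-2018).
substance_use = [
    6.82, 6.78, 7.00, 7.12, 7.09, 7.05, 7.17, 7.24, 7.28, 7.36,
    7.80, 7.81, 7.88, 7.87, 8.07, 8.12, 8.06, 8.17, 8.18, 8.23,
]

n = len(substance_use)
mean_value = sum(substance_use) / n

# Squared difference of each value from the mean.
squared_diffs = [(x - mean_value) ** 2 for x in substance_use]

# Population variance divides by n; sample variance divides by n - 1.
population_variance = sum(squared_diffs) / n
sample_variance = sum(squared_diffs) / (n - 1)

# Standard deviation is the square root of the variance.
population_sd = math.sqrt(population_variance)
sample_sd = math.sqrt(sample_variance)

print(f"population variance = {population_variance:.6f}")
print(f"sample variance     = {sample_variance:.6f}")
print(f"sample SD           = {sample_sd:.4f}")

# The standard library gives the same numbers directly.
print(f"statistics.stdev    = {statistics.stdev(substance_use):.4f}")
```

The manual steps are shown only to make the formula visible; in practice statistics.variance and statistics.stdev (sample versions) or statistics.pvariance and statistics.pstdev (population versions) do the same work.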
Graph all variables on a scatterplot. Pick 2 variables at random and find their correlation coefficient.
Use the scatterplot to visualize the relationship between two variables.
Find their correlation coefficient, which measures the strength and direction of the linear relationship between two variables (in other words, how well one variable can be predicted from the other with a straight line).
The correlation coefficient ranges from -1 to 1. A positive value means that one variable tends to increase as the other increases; for example, height and weight are positively correlated, because taller people tend to weigh more. A negative value means that one variable tends to decrease as the other increases, so there is an inverse relationship between them, such as when someone's weight goes down while their muscle mass goes up over the same period. A value near 0 means there is little or no linear relationship. Keep in mind that a correlation by itself does not show that one variable causes the other.
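To make the coefficient concrete, here is a minimal sketch that computes Pearson's r for each pairing of the three columns in the assignment table. The function name pearson_r and the dictionary layout are assumptions made for this illustration, and the result is meant as a cross-check on SPSS output, not a replacement for it.

```python
import math
from itertools import combinations


def pearson_r(xs, ys):
    """Pearson correlation: co-variation of x and y divided by the
    square root of the product of their individual variations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Columns copied from the assignment table (1999-2018).
data = {
    "Year": list(range(1999, 2019)),
    "Substance Use": [
        6.82, 6.78, 7.00, 7.12, 7.09, 7.05, 7.17, 7.24, 7.28, 7.36,
        7.80, 7.81, 7.88, 7.87, 8.07, 8.12, 8.06, 8.17, 8.18, 8.23,
    ],
    "Suicides": [
        0.000105, 0.000104, 0.000107, 0.00011, 0.000108,
        0.00011, 0.000109, 0.00011, 0.000113, 0.000116,
        0.000118, 0.000121, 0.000123, 0.000125, 0.000126,
        0.00013, 0.000133, 0.000134, 0.00014, 0.000142,
    ],
}

# One coefficient per unordered pair of variables.
results = {}
for name_a, name_b in combinations(data, 2):
    results[(name_a, name_b)] = pearson_r(data[name_a], data[name_b])
    print(f"{name_a} vs {name_b}: r = {results[(name_a, name_b)]:.3f}")
```

All three columns rise over the period, so each pairing should come out strongly positive. A high r here still says nothing about cause and effect, since both measured variables also trend upward with time.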
Do correlations change when you remove outliers?
The answer is: it depends. Correlations can change when you remove outliers, and the change can go in either direction.
How much a correlation changes depends on where the outlier sits relative to the rest of the data. An outlier that lies far from the overall trend pulls the correlation toward zero, so removing it makes the relationship look stronger. An outlier that lies far from the other points but in line with the trend inflates the correlation, so removing it makes the relationship look weaker. For example, in a hypothetical survey of income and happiness, a single respondent with a very high income and a very high happiness score could make the correlation look strong even if the two variables were only weakly related among everyone else; dropping that one respondent would then lower the correlation noticeably. Before removing any outlier, check whether it is a data-entry error or a genuine observation, and report the results both with and without it.
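The effect of a single outlier can be illustrated with two small made-up data sets. All numbers below are hypothetical and chosen only to show the two directions of change; pearson_r is an illustrative helper, not a library function.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Case 1 (hypothetical): a nearly perfect line plus one point far off
# the trend. The outlier hides a strong relationship.
trend_x = [1, 2, 3, 4, 5]
trend_y = [2.1, 3.9, 6.2, 7.8, 10.1]
r_trend_clean = pearson_r(trend_x, trend_y)
r_trend_outlier = pearson_r(trend_x + [6], trend_y + [0.5])

# Case 2 (hypothetical): a loose cloud plus one distant point that
# happens to line up. The outlier manufactures a strong relationship.
cloud_x = [1, 2, 3, 4, 5]
cloud_y = [3, 1, 4, 2, 3]
r_cloud_clean = pearson_r(cloud_x, cloud_y)
r_cloud_outlier = pearson_r(cloud_x + [20], cloud_y + [20])

print(f"Case 1: with outlier {r_trend_outlier:.2f}, without {r_trend_clean:.2f}")
print(f"Case 2: with outlier {r_cloud_outlier:.2f}, without {r_cloud_clean:.2f}")
```

In the first case removing the outlier raises the correlation, and in the second case removing it lowers the correlation, which is why a scatterplot should be checked before any point is dropped.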
Don’t let numbers intimidate you!
When it comes to statistics, don’t let numbers intimidate you! They are a useful tool that can help us make better decisions. In fact, statistics are useful in everyday life as well as the classroom.
You might be surprised by how much you already know about statistics—and how much more there is for you to learn!
Conclusion
Don’t let numbers intimidate you! If you understand the correlations between variables, it will be easier to select the best variables for your project. Don’t let the amount left to learn discourage you from studying more statistics; there are many other interesting topics that we haven’t covered here. Hopefully these basics will open new doors in your understanding of correlation and regression analysis.