Step 1: Find or collect a Dataset
For this project, you must find published, existing data or collect your own. Possible sources include almanacs, magazines, journal articles, textbooks, web resources, athletic teams, newspapers, reference materials, campus organizations, professors with experimental data, electronic data repositories and the sports pages. Alternatively, you may collect your own data from fellow students, neighbours or friends.
The dataset you select must have at least 25 cases. It also must have at least two categorical variables and at least two quantitative variables. Choose or collect a dataset that interests you!
Step 2: Analyse Your Data!
See the description below of what analysis should be included. Use technology to automate calculations and graphs.
Step 3: Write Your Report
Cut and paste all relevant computer output with your analysis. Be sure to include both computer output and your discussion of that output in every case. As you discuss each analysis, be sure to interpret what you are finding in the context of your particular data situation. Include all of the following.
How did you find or collect your data? (If you found the data, give a clear reference. If you collected the data, describe clearly the data collection process you used.) What are the cases? What are the variables? What population do you believe the sample might generalize to? Is the sample data from an experiment or an observational study?
Use one of these or come up with your own idea or find your own source. There are many sites reporting frequency counts from survey results.
- For students: frequency of smoking (never, occasionally, frequently), gender, age and number of years smoking.
- For students: academic division (business, accounting, TESOL, ...), whether the student has a Mac, a PC or neither, age and number of trimesters completed.
- Whether a person plans to vote in the next election (yes or no), political party affiliation, age and number of years affiliated with the party.
Statistical analysis of qualitative and quantitative variables is useful for checking and comparing claims or hypotheses. In this report we check several hypotheses about qualitative and quantitative data using descriptive statistics, inferential statistics (hypothesis tests) and graphical analysis of the variables in the data set. Descriptive statistics give a general idea of the nature of the data for each variable. Hypothesis tests let us reach a conclusion about a claim at a given level of significance. Graphical analysis makes the patterns in the data easier to understand. The hypotheses examined in this report are as follows.
H1: There is no significant difference in the average monthly income of males and females.
H2: There is no significant difference in the average monthly income of persons with different education levels.
H3: There is no significant difference in the average monthly expenditure of males and females.
H4: There is no significant difference in the average monthly expenditure of persons with different education levels.
H5: The two categorical variables gender and education are independent of each other.
Statistical Analysis
In this section we present the descriptive statistics, inferential statistics and graphical analysis of the data on monthly income and expenditure. The study has two qualitative variables, gender and education, and two quantitative variables, monthly income and monthly expenditure.
We begin with the frequency distribution of gender in the data set, which is given below:
Gender
| Gender | Frequency | Percent | Valid Percent | Cumulative Percent |
|---|---|---|---|---|
| Male | 28 | 56.0 | 56.0 | 56.0 |
| Female | 22 | 44.0 | 44.0 | 100.0 |
| Total | 50 | 100.0 | 100.0 | |
The data set contains 28 males and 22 females. The frequency distribution for the variable education is given below:
Education
| Education | Frequency | Percent | Valid Percent | Cumulative Percent |
|---|---|---|---|---|
| High School/Under-graduate | 15 | 30.0 | 30.0 | 30.0 |
| Graduate | 16 | 32.0 | 32.0 | 62.0 |
| Post-graduate | 19 | 38.0 | 38.0 | 100.0 |
| Total | 50 | 100.0 | 100.0 | |
The data set contains 19 post-graduate persons, 16 graduate persons and 15 persons with a high school or under-graduate education.
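The percent and cumulative percent columns in the two frequency tables can be reproduced from the counts alone. The following is a minimal sketch in Python, assuming only the counts reported above; the function name `frequency_table` is a hypothetical helper introduced for illustration and is not part of the original analysis.

```python
def frequency_table(counts):
    """Return rows of (category, frequency, percent, cumulative percent).

    `counts` maps each category to its frequency, in display order.
    """
    total = sum(counts.values())
    rows = []
    cumulative = 0.0
    for category, freq in counts.items():
        percent = 100.0 * freq / total
        cumulative += percent
        rows.append((category, freq, round(percent, 1), round(cumulative, 1)))
    return rows


# Counts taken from the frequency tables in the report.
gender_counts = {"Male": 28, "Female": 22}
education_counts = {
    "High School/Under-graduate": 15,
    "Graduate": 16,
    "Post-graduate": 19,
}

gender_rows = frequency_table(gender_counts)
education_rows = frequency_table(education_counts)

for row in gender_rows + education_rows:
    print(row)
```

Because there are no missing values in this data set, the "Percent" and "Valid Percent" columns coincide, so the sketch computes a single percent column.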
Next, we look at the descriptive statistics for the two quantitative variables, monthly income and monthly expenditure. They are summarised below:
Descriptive Statistics
| Variable | N | Minimum | Maximum | Mean | Std. Deviation |
|---|---|---|---|---|---|
| Monthly income in AU$ | 50 | 4096.00 | 7897.00 | 5940.5800 | 952.45903 |
| Monthly expenditure in AU$ | 50 | 2718.00 | 6536.00 | 4612.1800 | 980.83593 |
| Valid N (listwise) | 50 | | | | |
The average monthly income of the persons in the data set is $5,940.58 with a standard deviation of $952.46. The minimum monthly income is $4,096 and the maximum is $7,897. The average monthly expenditure is $4,612.18 with a standard deviation of $980.84. The minimum monthly expenditure is $2,718 and the maximum is $6,536. The box plot for the monthly income of the persons is given below:
The box plots for the monthly incomes of males and females are given below:
From the box plots, the typical income for males appears greater than that for females, and there is more variation in the monthly income of females. Note that a box plot displays the median rather than the mean, and the group means reported in the next table are very close to each other. The descriptive statistics for the monthly income of males and females are given below:
Group Statistics
| Variable | Gender | N | Mean | Std. Deviation | Std. Error Mean |
|---|---|---|---|---|---|
| Monthly income in AU$ | Male | 28 | 5929.9286 | 897.47563 | 169.60695 |
| | Female | 22 | 5954.1364 | 1039.62046 | 221.64783 |
From the above table, the average monthly income is approximately $5,930 for males and $5,954 for females.
Next, we use inferential statistics (hypothesis tests) to check the claims about the variables in the data set.
First, we check whether there is a significant difference in the average monthly income of males and females. For this hypothesis we use the two-sample t-test for population means at the 5% level of significance. The null and alternative hypotheses are:
Null hypothesis: H0: There is no significant difference in the average monthly income of males and females.
Alternative hypothesis: Ha: There is a significant difference in the average monthly income of males and females.
The t-test output is given below:
Independent Samples Test (t-test for Equality of Means)
| Monthly income in AU$ | t | df | Sig. (2-tailed) | 95% Confidence Interval of the Difference: Lower | Upper |
|---|---|---|---|---|---|
| Equal variances assumed | -.088 | 48 | .930 | -575.41678 | 527.00119 |
| Equal variances not assumed | -.087 | 41.679 | .931 | -587.57397 | 539.15838 |
For this test, the p-value is 0.930, which is greater than the level of significance (alpha = 0.05). The decision rule is to reject the null hypothesis if the p-value is less than alpha and not to reject it otherwise. Since the p-value is greater than 0.05, we do not reject the null hypothesis that there is no significant difference in the average monthly income of males and females.
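The t statistics and degrees of freedom in the table can be checked directly from the group statistics (N, mean and standard deviation per group). The sketch below assumes the standard pooled-variance formula for the "equal variances assumed" row and the Welch formula with Welch-Satterthwaite degrees of freedom for the "equal variances not assumed" row; the function name `two_sample_t` is a hypothetical helper, and the p-value is not computed because it needs the t distribution, which is not in the Python standard library.

```python
import math


def two_sample_t(n1, mean1, sd1, n2, mean2, sd2):
    """Two-sample t statistics from summary statistics.

    Returns (t_pooled, df_pooled, t_welch, df_welch).
    """
    diff = mean1 - mean2

    # Equal variances assumed: pool the two sample variances.
    df_pooled = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df_pooled
    t_pooled = diff / math.sqrt(pooled_var * (1 / n1 + 1 / n2))

    # Equal variances not assumed: Welch statistic and
    # Welch-Satterthwaite approximation for the degrees of freedom.
    v1 = sd1**2 / n1
    v2 = sd2**2 / n2
    t_welch = diff / math.sqrt(v1 + v2)
    df_welch = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

    return t_pooled, df_pooled, t_welch, df_welch


# Monthly income in AU$: males first, females second (from Group Statistics).
t_pooled, df_pooled, t_welch, df_welch = two_sample_t(
    28, 5929.9286, 897.47563,
    22, 5954.1364, 1039.62046,
)
print(round(t_pooled, 3), df_pooled, round(t_welch, 3), round(df_welch, 3))
```

If SciPy is available, `scipy.stats.ttest_ind_from_stats` accepts the same summary statistics and also returns the two-sided p-value.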
The box plots comparing the monthly income of the persons by education level are given below:
Next, we check whether there is a significant difference between the average monthly incomes of persons with different education levels. For this claim we use one-way analysis of variance (single-factor ANOVA). The null and alternative hypotheses are:
Null hypothesis: H0: There is no significant difference in the average monthly income of persons with different education levels.
Alternative hypothesis: Ha: There is a significant difference in the average monthly income of persons with different education levels.
We use the 5% level of significance for this test. The descriptive statistics for monthly income by education level are given below:
Descriptive Statistics: Monthly income in AU$
| Education | N | Mean | Std. Deviation | Minimum | Maximum |
|---|---|---|---|---|---|
| High School/Under-graduate | 15 | 4834.4 | 617.2999 | 4096 | 5936 |
| Graduate | 16 | 6020.063 | 434.9119 | 5436 | 6967 |
| Post-graduate | 19 | 6746.947 | 551.2485 | 6010 | 7897 |
| Total | 50 | 5940.58 | 952.459 | 4096 | 7897 |
From the above table, the sample mean monthly income differs across education levels, rising from about $4,834 for the high school/under-graduate group to about $6,747 for the post-graduate group. The ANOVA table for this test is given below:
ANOVA: Monthly income in AU$
| Source | Sum of Squares | df | Mean Square | F | Sig. |
|---|---|---|---|---|---|
| Between Groups | 3.081E7 | 2 | 1.540E7 | 53.075 | .000 |
| Within Groups | 1.364E7 | 47 | 290251.095 | | |
| Total | 4.445E7 | 49 | | | |
For this ANOVA test, the p-value is reported as .000 (that is, p < 0.001), which is less than the level of significance (alpha = 0.05), so we reject the null hypothesis that there is no significant difference in the average monthly income of persons with different education levels. There is sufficient evidence to conclude that the average monthly income is not the same across education levels.
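The sums of squares and the F statistic in the ANOVA table can be recovered from the group sizes, means and standard deviations alone. The following is a minimal sketch under that assumption; the function name `one_way_anova_from_summary` is a hypothetical helper, and small differences from the reported values come from rounding in the published summary statistics.

```python
def one_way_anova_from_summary(groups):
    """One-way ANOVA from per-group summary statistics.

    `groups` is a list of (n, mean, sd) tuples, one per group.
    Returns (ss_between, ss_within, df_between, df_within, f_stat).
    """
    total_n = sum(n for n, _, _ in groups)
    grand_mean = sum(n * mean for n, mean, _ in groups) / total_n

    # Between-groups variation: group means around the grand mean.
    ss_between = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in groups)
    # Within-groups variation: (n - 1) * variance summed over groups.
    ss_within = sum((n - 1) * sd**2 for n, _, sd in groups)

    df_between = len(groups) - 1
    df_within = total_n - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, df_between, df_within, f_stat


# Monthly income in AU$ by education level (n, mean, sd) from the report.
income_groups = [
    (15, 4834.4, 617.2999),    # High School/Under-graduate
    (16, 6020.063, 434.9119),  # Graduate
    (19, 6746.947, 551.2485),  # Post-graduate
]

ss_b, ss_w, df_b, df_w, f_stat = one_way_anova_from_summary(income_groups)
print(f"SSB={ss_b:.4g}  SSW={ss_w:.4g}  df=({df_b}, {df_w})  F={f_stat:.3f}")
```

The same function can be applied to the expenditure summary statistics later in the report by passing that table's group sizes, means and standard deviations.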
The box plot for overall monthly expenditure and the box plots for the monthly expenditure of males and females are given below:
Next, we check whether there is a significant difference in the average monthly expenditure of males and females. For this claim we use the two-sample t-test for population means. The null and alternative hypotheses are:
Null hypothesis: H0: There is no significant difference in the average monthly expenditure of males and females.
Alternative hypothesis: Ha: There is a significant difference in the average monthly expenditure of males and females.
The descriptive statistics for the monthly expenditure of males and females are given below:
Group Statistics
| Variable | Gender | N | Mean | Std. Deviation | Std. Error Mean |
|---|---|---|---|---|---|
| Monthly expenditure in AU$ | Male | 28 | 4580.5000 | 898.24164 | 169.75171 |
| | Female | 22 | 4652.5000 | 1097.43041 | 233.97295 |
We use the 5% level of significance (alpha = 0.05) for this test.
The two-sample t-test output is given below:
Independent Samples Test (t-test for Equality of Means)
| Monthly expenditure in AU$ | t | df | Sig. (2-tailed) | 95% Confidence Interval of the Difference: Lower | Upper |
|---|---|---|---|---|---|
| Equal variances assumed | -.255 | 48 | .800 | -639.29274 | 495.29274 |
| Equal variances not assumed | -.249 | 40.252 | .805 | -656.10967 | 512.10967 |
For this test, the p-value is 0.800, which is greater than the level of significance (alpha = 0.05), so we do not reject the null hypothesis that the average monthly expenditure of males and females is the same. There is insufficient evidence to conclude that the average monthly expenditure differs between males and females.
The box plots for the monthly expenditure of persons with different education levels are given below:
From these box plots, monthly expenditure appears to differ considerably between education levels. Whether the difference is statistically significant has to be confirmed by a formal test.
The descriptive statistics for monthly expenditure by education level are given below:
Descriptive Statistics: Monthly expenditure in AU$
| Education | N | Mean | Std. Deviation | Minimum | Maximum |
|---|---|---|---|---|---|
| High School/Under-graduate | 15 | 3475.8000 | 656.47328 | 2718.00 | 4768.00 |
| Graduate | 16 | 4660.6875 | 440.09419 | 4154.00 | 5572.00 |
| Post-graduate | 19 | 5468.4737 | 519.70583 | 4788.00 | 6536.00 |
| Total | 50 | 4612.1800 | 980.83593 | 2718.00 | 6536.00 |
Next, we check whether there is a significant difference in the average monthly expenditure of persons with different education levels. For this hypothesis we use one-way ANOVA at the 5% level of significance. The ANOVA table is given below:
ANOVA: Monthly expenditure in AU$
| Source | Sum of Squares | df | Mean Square | F | Sig. |
|---|---|---|---|---|---|
| Between Groups | 3.334E7 | 2 | 1.667E7 | 56.773 | .000 |
| Within Groups | 1.380E7 | 47 | 293624.225 | | |
| Total | 4.714E7 | 49 | | | |
For this ANOVA, the p-value is reported as .000 (that is, p < 0.001), which is less than the level of significance (alpha = 0.05). So, we reject the null hypothesis that there is no significant difference in the average monthly expenditure of persons with different education levels. There is sufficient evidence to conclude that the average monthly expenditure differs across education levels.
Finally, we check whether the gender and the education of a person are independent of each other. For this hypothesis we use the chi-square test for independence of two categorical variables. The null and alternative hypotheses are:
Null hypothesis: H0: The two categorical variables gender and education are independent of each other.
Alternative hypothesis: Ha: The two categorical variables gender and education are not independent of each other.
We use the 5% level of significance for this test.
The test statistic for this test is:
Chi-square = ∑ [(O – E)^2 / E]
where O denotes the observed frequencies, E denotes the expected frequencies, and the sum runs over all cells of the table.
The expected frequency E for each cell is calculated as:
E = (Row total × Column total) / Grand total
The observed frequency table, the expected frequency table and the results of the test are summarised below:
Observed Frequencies
| Gender | Under Graduate | Graduate | Post Graduate | Total |
|---|---|---|---|---|
| Male | 8 | 10 | 10 | 28 |
| Female | 7 | 6 | 9 | 22 |
| Total | 15 | 16 | 19 | 50 |
Expected Frequencies
| Gender | Under Graduate | Graduate | Post Graduate | Total |
|---|---|---|---|---|
| Male | 8.4 | 8.96 | 10.64 | 28 |
| Female | 6.6 | 7.04 | 8.36 | 22 |
| Total | 15 | 16 | 19 | 50 |
Data
| Item | Value |
|---|---|
| Level of Significance | 0.05 |
| Number of Rows | 2 |
| Number of Columns | 3 |
| Degrees of Freedom | 2 |

Results
| Item | Value |
|---|---|
| Critical Value | 5.991464547 |
| Chi-Square Test Statistic | 0.405132149 |
| p-Value | 0.816632522 |
| Decision | Do not reject the null hypothesis |
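For this test, the p-value (0.817) is greater than the level of significance (alpha = 0.05), so we do not reject the null hypothesis that the two categorical variables gender and education are independent of each other. There is insufficient evidence of an association between gender and education in this data set.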
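The expected frequencies, the test statistic, the critical value and the p-value can all be reproduced from the observed frequency table. The sketch below uses only the Python standard library and relies on one stated assumption: with exactly 2 degrees of freedom, as here, the chi-square upper-tail probability has the closed form exp(-x/2), so no statistics package is needed. The function name `chi_square_independence` is a hypothetical helper introduced for illustration.

```python
import math


def chi_square_independence(table):
    """Chi-square test of independence for a table of observed counts.

    Returns (expected, statistic, degrees_of_freedom).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    # E = row total * column total / grand total, for every cell.
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

    # Chi-square = sum over cells of (O - E)^2 / E.
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return expected, statistic, df


# Observed frequencies: rows are Male, Female;
# columns are Under Graduate, Graduate, Post Graduate.
observed = [[8, 10, 10], [7, 6, 9]]
alpha = 0.05

expected, statistic, df = chi_square_independence(observed)

# The closed forms below are valid ONLY for df = 2.
if df != 2:
    raise ValueError("closed-form p-value used here requires df = 2")
p_value = math.exp(-statistic / 2)
critical_value = -2 * math.log(alpha)

decision = "Reject H0" if p_value < alpha else "Do not reject H0"
print(f"chi-square={statistic:.6f}  df={df}  "
      f"critical={critical_value:.6f}  p={p_value:.6f}  {decision}")
```

For tables with other degrees of freedom, a library routine such as `scipy.stats.chi2_contingency` would be the usual choice instead of the closed form.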
My Assignment Help. (2021). Statistical Analysis For Categorical And Quantitative Variables - Essay. Retrieved from https://myassignmenthelp.com/free-samples/bus708-instruction-to-do-data-analysis-and-statistical/categorical-variable-and-a-quantitative-variable.html