RCH 8303, Quantitative Data Analysis

    Course Learning Outcomes for Unit III

    Upon completion of this unit, students should be able to:

    1. Perform statistical tests using software tools.
       1.1 Describe the procedures to summarize and display data.
       1.2 Report normality statistics.

    2. Explain results of statistical tests.
       2.1 Describe the process to determine whether data are normally distributed.
       2.2 Demonstrate the procedures necessary to create a histogram, bar chart, boxplot, Q-Q plot, and cross-tabulation test.

    3. Judge whether null hypotheses should be rejected or maintained.
       3.1 Discuss how graphing data can help determine conclusions about our data.
       3.2 Discuss differences between one-sided and two-sided hypotheses and when to use them.
       3.3 Explain how to rule out a rival hypothesis.
       3.4 Discuss what contingency tables are and what they are used for.

    Course/Unit Learning Outcomes     Learning Activity
    1.1, 1.2                          Unit Lesson; Chapter 5; Unit III Assignment 2
    2.1, 2.2, 3.1, 3.2, 3.3, 3.4      Unit Lesson; Unit III Assignment 1

    Required Unit Resources

    Chapter 5: Summarizing and Graphing Data

    Unit Lesson

    Introduction

    In Unit III, we turn our focus to how a researcher can display and understand data using visual methods. Visualizing data facilitates interpretation. The form or type of data the researcher needs to analyze determines which display methods can be used, and this unit demonstrates several of them. From a researcher’s point of view, it is very important to view the data: doing so lets the researcher observe patterns directly rather than relying only on numerical output. Readers of the material can also view the data and form their own opinions from it. This unit focuses on how to summarize and graph data; specifically, how to create a histogram, bar chart, boxplot, and Q-Q plot. In addition, instruction covers how to perform and report a cross-tabulation.

    Unit III Plan

    The Unit III Assignment will be in three parts.

    UNIT III STUDY GUIDE

    Summarizing and Graphing Data

    Part 1 of your assignment requires you to complete the Contingency Tables and Chi-Square Tests (ID 17630) module of the Collaborative Institutional Training Initiative (CITI) Program Essentials of Statistical Analysis (EOSA) located in Part 3. This module explains whether a contingency table can be analyzed using a chi-square test. It describes what contingency tables are and how they display the relationship between categorical variables, and then demonstrates how to analyze those relationships using chi-square tests.

    Part 2 requires you to construct a histogram, bar chart, boxplot, and Q-Q plot. The results will be compiled and submitted in a single Microsoft Word file.

    Part 3 requires you to perform a cross-tabulation, analyze the results, and report the result in APA format by submitting a single Microsoft Word file.

    What is a Histogram?

    Once data are collected, a researcher needs to be able to describe, summarize, and, potentially, detect patterns in the data they have recorded with meaningful numerical scales (McClave & Sincich, 2006). To do this, the researcher can use a histogram. A histogram shows the distribution of a continuous variable by relating intervals of its values to how often scores fall within them (Gall et al., 2003). Huck (2004) notes that histograms are used to indicate how many times a score appears in a data set. The Essentials of Statistical Analysis (EOSA) module Distribution and Probability (ID 17613), presented in Unit I, provides an example of how a histogram is used to display data. A histogram places the values of the variable on the x-axis (horizontal) and their frequencies on the y-axis (vertical). R Commander provides an Options tab to define labels for the x- and y-axes and to set the width of the display bins.

    If you are not comfortable using R and R Commander, you may use whatever statistical software program you choose. The answers you submit for your assignment must be correct regardless of the software you choose.

    R and R Commander make it quick and easy to construct a histogram. Using the data file “Duncan” provided by the text, one can construct a histogram of any of its numeric variables once the data file has been accessed. Make sure that when you start R you also load R Commander: type library(Rcmdr), or see Unit I for a refresher on how to gain access to R Commander. When R and R Commander have been loaded, selecting Graphs in the menu presents various options. Selecting Histogram allows us to use a data set to create a histogram (Figure 1).

    Figure 1 Creating a Histogram From R Commander Menu System

    As depicted in Figure 2, once Histogram is selected, a user has two types of options. First, a user must select the variable to be displayed from the Data tab. In this case, the income variable was selected.

    Figure 2 Histogram Variable Selection

    A user could click “OK,” and the histogram would be created. This histogram would illustrate all income data regardless of groups/categories of data.
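
The menu steps above also have a script equivalent. The following is a minimal sketch in base R, assuming the carData package (which distributes the Duncan data set) is installed. It uses the base hist() function rather than the command R Commander itself generates, and the title and axis labels are illustrative choices.

```r
# Assumption: the carData package, which ships the Duncan data set, is installed.
data("Duncan", package = "carData")

# Draw a frequency histogram of income; the labels below are illustrative.
h <- hist(Duncan$income,
          main = "Histogram of Income",
          xlab = "income",
          ylab = "frequency")

# hist() invisibly returns the bin boundaries and the counts it plotted.
print(h$breaks)
print(h$counts)
```

The printed counts are the bar heights, which can be useful when describing the histogram in a written report.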

    However, if a user wanted to create a histogram of income by groups, such as occupation type, the Groups tab could be accessed and the type variable selected (Figure 3).

    Figure 3 Selection of Type From the Groups tab

    Selecting Options (Figure 4) allows the researcher to label the x- and y-axes and define the width of the bins of the data display.

    Figure 4 Histogram Options Selection tab

    Selecting a histogram of income by type results in a display of three histograms, one for each occupation type (Figure 5).

    Figure 5 Histograms of Income by Occupation Type

    This approach may provide a researcher a better grasp of the data.
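
A grouped display with a chosen bin width can be sketched in base R as follows. This assumes the carData package is installed. The bin width of 10 and the stacked layout are illustrative choices, not the settings behind Figure 5.

```r
# Assumption: the carData package, which ships the Duncan data set, is installed.
data("Duncan", package = "carData")

# Use the same bin edges for every group so the panels are comparable.
# Income in Duncan is a percentage, so edges from 0 to 100 cover all values.
edges <- seq(0, 100, by = 10)

# One panel per level of type, stacked vertically.
old_par <- par(mfrow = c(nlevels(Duncan$type), 1))
for (g in levels(Duncan$type)) {
  hist(Duncan$income[Duncan$type == g],
       breaks = edges,
       main = paste("type =", g),
       xlab = "income",
       ylab = "frequency")
}
par(old_par)  # restore the previous plotting layout
```

Sharing the bin edges across panels keeps differences between groups from being an artifact of different binning.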

    What is a Bar Graph?

    Researchers need to create displays that are not misleading or oversimplified, but informative and visually appealing (Gall et al., 2003). A bar graph is normally used to display the distribution of a categorical variable. Refer to Introduction to Statistics (ID 17609) for a refresher on categorical variables. Gall et al. (2003) note that a bar graph shows the relationship between variables. In a bar graph, the horizontal axis represents the categories of a qualitative variable, whereas in a histogram the horizontal axis represents a quantitative variable (Huck, 2004).

    Using the data set Nations, navigate to the Graphs menu and select Bar graph (Figure 6).

    Figure 6 Bar Graph Dialog Box

    Next, select region (Figure 7) and click “OK.”

    Figure 7 Bar Graph Variable Selection Sub-Menu

    The resulting graph depicts the frequency of records by region (Figure 8).

    Figure 8 Bar Graph Output Display
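
The same kind of display can be sketched in base R by counting the records in each category and passing the counts to barplot(). The data frame below, nations_demo, is a hypothetical stand-in with made-up rows. It is not the Nations data set, and its region names are assumptions for illustration only.

```r
# Hypothetical stand-in for the Nations data set: one row per record.
nations_demo <- data.frame(
  region = factor(c("Africa", "Africa", "Americas", "Asia",
                    "Asia", "Asia", "Europe", "Europe", "Oceania"))
)

# Count the records in each category of the categorical variable.
region_counts <- table(nations_demo$region)
print(region_counts)

# Draw one bar per category, with bar height equal to the count.
barplot(region_counts,
        xlab = "region",
        ylab = "Frequency")
```

With the real data set loaded, replacing nations_demo with the active data frame should give the frequency of records by region.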

    What is a Boxplot?

    Visual representations of a data set are much more effective than numerical representations of data (Hartwig & Dearling, 1979). A boxplot displays the variability within a data set. McClave and Sincich (2006) note that boxplots are based on the quartiles of an existing data set and are very good for detecting outliers in data.

    McClave and Sincich (2006) remind the reader that a boxplot has four components: the median, the interquartile range, the minimum and maximum, and outliers (Figure 9).

    Figure 9 Explanation of a Boxplot

    The EOSA modules Central Tendency and Variability (ID 17611) and Normal Distribution and Z-Scores (ID 17615) presented in Unit I have examples of how a boxplot is used.

    Using the data set Prestige, navigate to the Graphs menu and select Boxplot (Figure 10).

    Figure 10 Box Plot Dialog Box

    Next, choose the income variable and click “OK” (Figure 11).

    Figure 11 Box Plot Variable Selection Sub-Menu

    Note that the boxplot identifies five outliers for further investigation (Figure 12).

    Figure 12 Box Plot Display With Visible Outliers
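
To see which records a boxplot flags, the plot can also be drawn from a script. The following is a minimal sketch in base R, assuming the carData package (which distributes the Prestige data set) is installed. It uses the base boxplot() function, which may label points differently from the R Commander output.

```r
# Assumption: the carData package, which ships the Prestige data set, is installed.
data("Prestige", package = "carData")

# Five-number summary: minimum, lower hinge, median, upper hinge, maximum.
print(fivenum(Prestige$income))

# Draw the boxplot. Points lying more than 1.5 box lengths beyond
# the box are drawn individually as potential outliers.
b <- boxplot(Prestige$income, ylab = "income")

# b$out holds the flagged values; look up the rows they belong to.
flagged <- Prestige[Prestige$income %in% b$out, "income", drop = FALSE]
print(flagged)
```

The row names of flagged are the occupations to investigate further.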

    Q-Q Plot

    Instead of relying only on a histogram to judge whether your data are normally distributed, you can use a quantile-quantile (Q-Q) plot to assess normality. A Q-Q plot allows a visual comparison of an actual distribution to a theoretical distribution (Mayor, 2015). Using the data set Prestige, navigate to the Graphs menu and select Quantile-comparison plot (Figure 13).

    Figure 13 Quantile-Comparison Plot (Q-Q Plot) Menu Selection

    Next, choose the income variable to display and select “OK” (Figure 14).

    Figure 14 Q-Q Plot Variable Selection Option

    As depicted in Figure 15, the Q-Q plot identifies data points that deviate not only from the theoretical normal distribution (straight line) but also from the 95% confidence interval (CI; dotted line). Finally, two outliers are identified by group.

    Figure 15 Q-Q Plot Display With Visible Outliers
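
A normal Q-Q plot can also be produced from a script and paired with a numerical normality statistic. The following is a minimal sketch in base R, assuming the carData package is installed. The base functions qqnorm() and qqline() do not draw a confidence envelope. The Shapiro-Wilk test is offered here as one common normality statistic, not necessarily the one the assignment expects.

```r
# Assumption: the carData package, which ships the Prestige data set, is installed.
data("Prestige", package = "carData")

# Plot the sample quantiles of income against theoretical normal quantiles.
qqnorm(Prestige$income, ylab = "income")

# Add a reference line through the first and third quartiles.
qqline(Prestige$income)

# Shapiro-Wilk test: the null hypothesis is that the data are normally distributed.
sw <- shapiro.test(Prestige$income)
print(sw)
```

Points that bend away from the reference line, together with a small p-value from the test, suggest a departure from normality.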

    What is Cross Tabulation?

    A cross-tabulation, also known as a contingency table, displays the joint frequencies of two variables; a chi-square test applied to the table is used to assess the association between them. View page 35 in your textbook. The researcher would normally use this table to examine relationships between categorical variables. The Contingency Tables and Chi-Square Tests (ID 17630) module of the CITI Program EOSA located in Part 3 will demonstrate this form of data analysis. Using the data set Nations, navigate to the Statistics menu, and select Contingency tables: Two-way table (Figure 16).

    Figure 16 Contingency Tables Dialog Box

    Choose the two variables (Figure 17).

    Figure 17 Cross-Tabulation Variable Selection Sub-Menu

    Before selecting “OK,” click on the Statistics tab (Figure 18), where you can choose how percentages are computed and which hypothesis tests are performed. Make sure the Chi-square test of independence box is checked. Now you are ready to select “OK.”

    Figure 18 Two-Way Table Statistics Selection Menu

    The resulting output reports the statistical test. There are three values: the chi-square statistic (χ²), the degrees of freedom (df), and the p-value (Figure 19).

    Figure 19 Cross-Tabulation Display With Pearson’s Chi-Squared Test Results

    To report the results of a Pearson chi-square test following the Publication Manual of the American Psychological Association (APA; 7th ed.), a researcher would state:

    A chi-square test of independence was performed to examine the relationship between Total Fertility Rate (TFR) and region. The relation between these variables was significant, χ²(596) = 692.33, p = .004.
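
The three reported values can also be obtained from a script. The following is a minimal sketch in base R using a small hypothetical data frame. The variables region and answer, and all counts, are made up for illustration and are not the Nations data.

```r
# Hypothetical data: two categorical variables, 100 made-up records.
survey <- data.frame(
  region = rep(c("North", "South"), times = c(50, 50)),
  answer = c(rep(c("yes", "no"), times = c(35, 15)),
             rep(c("yes", "no"), times = c(20, 30)))
)

# Cross-tabulate the two variables to form the contingency table.
tab <- table(survey$region, survey$answer)
print(tab)

# Pearson's chi-square test of independence.
# correct = FALSE turns off the Yates continuity correction for 2 x 2 tables.
res <- chisq.test(tab, correct = FALSE)

# Assemble the three reported values in order: statistic, df, p-value.
report <- sprintf("chi-square(%d) = %.2f, p = %.3f",
                  as.integer(res$parameter),
                  res$statistic,
                  res$p.value)
print(report)
```

When writing the result up, APA style drops the leading zero from the p-value, so the formatted string still needs that small adjustment by hand.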

    References

    Gall, M. D., Gall, J. P., & Borg, W. R. (2003). Educational research: An introduction (7th ed.). Allyn and Bacon.

    Hartwig, F., & Dearling, B. E. (1979). Quantitative applications in the social sciences: Exploratory data analysis. SAGE Publications.

    Huck, S. W. (2004). Reading statistics and research. Allyn and Bacon.

    Mayor, E. (2015). Learning predictive analytics with R. Packt Publishing.

    McClave, J. T., & Sincich, T. (2006). Statistics (10th ed.). Prentice Hall.

    Learning Activities (Nongraded)

    Nongraded Learning Activities are provided to aid students in their course of study. You do not have to submit them. If you have questions, contact your instructor for further guidance and information. When studying APA formatting, pay particular attention to the sections that pertain to formatting for research and statistics. Review these sections as needed.
