Author: Tsakani Stella Rikhotso

  • 9012-2-12 SayPro Lesson INTRODUCTION

    We often have to gather information to establish the trends and reality of situations.

    Data tables help us organise this information logically so that it can be used for its intended purpose. A data table is similar to a register or record of events or items: the information is given to us in rows and columns. A row is any horizontal collection of data, while a column is any vertical collection of data.

    Example

    Number of students

              African   Coloured   Asian   White
    Male         56        45        78      12
    Female       68        52        62      14
    Total       124        97       140      26
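
As a small illustration, the table above can be held as rows and columns in code and its Total row recomputed. This is only a minimal sketch in Python; the variable names (`columns`, `rows`, `column_totals`, `row_totals`) are illustrative assumptions and not part of the lesson.

```python
# Column headings of the table, in left-to-right order.
columns = ["African", "Coloured", "Asian", "White"]

# Each row is a horizontal collection of data, listed in column order.
rows = {
    "Male":   [56, 45, 78, 12],
    "Female": [68, 52, 62, 14],
}

# A column is a vertical collection of data: zip(*...) lines up the
# entries that sit in the same position of every row.
column_totals = [sum(counts) for counts in zip(*rows.values())]

# A row total adds the entries along one horizontal line.
row_totals = {label: sum(counts) for label, counts in rows.items()}

for name, total in zip(columns, column_totals):
    print(f"{name}: {total}")
print(row_totals)
```

The column totals computed this way should agree with the Total row of the table, which is a quick way to check a data table for copying errors.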

    2.2 MEASURES OF CENTRAL TENDENCY

    Measures of central tendency, or “location”, attempt to quantify what we mean by the “typical” or “average” score in a data set. The concept is extremely important and we encounter it frequently in daily life. For example, before purchasing a car we often want to know its average distance per litre of petrol. Before accepting a job, you might want to know the typical salary for people in that position so that you know whether or not you are going to be paid what you are worth. Or, if you are a smoker, you might often think about how many cigarettes you smoke “on average” per day. Statistics geared toward measuring central tendency all focus on this concept of “typical” or “average”. As we will see, we often ask questions in psychological science revolving around how groups differ from each other “on average”. Answers to such questions tell us a lot about the phenomenon or process we are studying.

    2.2.1 Mean, Median, Mode, and Range

    Mean, median, and mode are three kinds of “averages”. There are many “averages” in statistics, but these are, I think, the three most common, and are certainly the three you are most likely to encounter in your pre-statistics courses, if the topic comes up.

    The “mean” is the “average” you’re used to, where you add up all the numbers and then divide by the number of numbers. The “median” is the “middle” value in the list of numbers. To find the median, your numbers have to be listed in numerical order, so you may have to rewrite your list first. The “mode” is the value that occurs most often. If no number is repeated, then there is no mode for the list.

    The “range” is just the difference between the largest and smallest values.

    Example

    Find the mean, median, mode, and range for the following list of values:

    13, 18, 13, 14, 13, 16, 14, 21, 13

    The mean is the usual average, so:

    (13 + 18 + 13 + 14 + 13 + 16 + 14 + 21 + 13) ÷ 9 = 15

    Note that the mean isn’t a value from the original list. This is a common result. You should not assume that your mean will be one of your original numbers.

    The median is the middle value, so I’ll have to rewrite the list in order:

    13, 13, 13, 13, 14, 14, 16, 18, 21

    There are nine numbers in the list, so the middle one will be the (9 + 1) ÷ 2 = 10 ÷ 2 = 5th number:

    13, 13, 13, 13, [14], 14, 16, 18, 21

    So the median is 14.

    The mode is the number that is repeated more often than any other, so 13 is the mode.

    The largest value in the list is 21, and the smallest is 13, so the range is 21 – 13 = 8.

    • Mean: 15
    • Median: 14
    • Mode: 13
    • Range: 8

    Note: The formula for the place to find the median is “( [the number of data points] + 1) ÷ 2”, but you don’t have to use this formula. You can just count in from both ends of the list until you meet in the middle, if you prefer. Either way will work.
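
The four calculations in the first example can be checked with a short Python sketch. It uses the standard `statistics` and `collections` modules; the variable names are illustrative assumptions, and the sketch is one possible way of doing the arithmetic rather than a required method.

```python
from collections import Counter
from statistics import mean, median

values = [13, 18, 13, 14, 13, 16, 14, 21, 13]

# Mean: add up all the numbers, then divide by how many there are.
mean_value = mean(values)

# Median: the middle value once the list is in numerical order.
# statistics.median sorts the data itself.
median_value = median(values)

# Mode: the value that occurs most often.
# most_common(1) returns a list holding one (value, count) pair.
mode_value, mode_count = Counter(values).most_common(1)[0]

# Range: the difference between the largest and smallest values.
range_value = max(values) - min(values)

print(mean_value, median_value, mode_value, range_value)
```

Here 13 appears four times, more often than any other value, which is why it is reported as the mode.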

     

    Example 2

    Find the mean, median, mode, and range for the following list of values:

    1, 2, 4, 7

    The mean is the usual average: (1 + 2 + 4 + 7) ÷ 4 = 14 ÷ 4 = 3.5

    The median is the middle number. In this example, the numbers are already listed in numerical order, so I don’t have to rewrite the list. But there is no “middle” number, because there is an even number of numbers. In this case, the median is the mean (the usual average) of the middle two values: (2 + 4) ÷ 2 = 6 ÷ 2 = 3

    The mode is the number that is repeated most often, but all the numbers appear only once. Then there is no mode.

    The largest value is 7, the smallest is 1, and their difference is 6, so the range is 6.

    • Mean: 3.5
    • Median: 3
    • Mode: none
    • Range: 6

    The list values were whole numbers, but the mean was a decimal value. Getting a decimal value for the mean (or for the median, if you have an even number of data points) is perfectly okay; don’t round your answers to try to match the format of the other numbers.

     

     

     

    Example 3

    Find the mean, median, mode, and range for the following list of values:

    8, 9, 10, 10, 10, 11, 11, 11, 12, 13

    The mean is the usual average:

    (8 + 9 + 10 + 10 + 10 + 11 + 11 + 11 + 12 + 13) ÷ 10 = 105 ÷ 10 = 10.5

     

     

    The median is the middle value. In a list of ten values, that will be the (10 + 1) ÷ 2 = 5.5th value; that is, I’ll need to average the fifth and sixth numbers to find the median:

    (10 + 11) ÷ 2 = 21 ÷ 2 = 10.5

    The mode is the number repeated most often. This list has two values, 10 and 11, that are each repeated three times, so both are modes.

    The largest value is 13 and the smallest is 8, so the range is 13 – 8 = 5.

    • mean: 10.5
    • median: 10.5
    • modes: 10 and 11
    • range: 5

    While unusual, it can happen that two of the averages (the mean and the median, in this case) will have the same value.
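
Examples 2 and 3 bring out two special cases: a list with an even number of values, and a list with no mode or with more than one mode. The sketch below handles both. The function names `median_of` and `modes_of` are hypothetical, and the "no repeats means no mode" rule follows the convention used in this lesson (Python's own `statistics.multimode` would instead return every value for such a list).

```python
from collections import Counter

def median_of(values):
    """Median of a non-empty list: the middle value, or the mean of the middle two."""
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]
    # Even number of values: average the two values around the middle.
    return (ordered[middle - 1] + ordered[middle]) / 2

def modes_of(values):
    """All values tied for the highest count; an empty list if nothing repeats."""
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []  # lesson convention: no repeated value means no mode
    return sorted(value for value, count in counts.items() if count == highest)

example_2 = [1, 2, 4, 7]
example_3 = [8, 9, 10, 10, 10, 11, 11, 11, 12, 13]

print(median_of(example_2), modes_of(example_2))
print(median_of(example_3), modes_of(example_3))
```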

    2.2.2 Frequency table

    A frequency table is a table that shows the number of times each value or event occurred.

    Example:

    In a learnership class, the following scores were achieved for an assessment of a learning programme by the 15 learners in the class:

    56% 29% 65% 74% 42%
    38% 92% 43% 98% 23%
    64% 81% 66% 68% 69%
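
One way to turn these 15 scores into a frequency table is to group them into classes and count how many scores fall in each class. The sketch below assumes classes that are 10 percentage points wide (20-29%, 30-39%, and so on); the lesson does not prescribe a class width, so `CLASS_WIDTH` is an assumption you can change.

```python
from collections import Counter

scores = [56, 29, 65, 74, 42,
          38, 92, 43, 98, 23,
          64, 81, 66, 68, 69]

CLASS_WIDTH = 10  # assumed class width; not specified in the lesson

# Map each score to the lower bound of its class, e.g. 56 -> 50.
frequency = Counter((score // CLASS_WIDTH) * CLASS_WIDTH for score in scores)

# Print one line per class, lowest class first.
for lower in sorted(frequency):
    upper = lower + CLASS_WIDTH - 1
    print(f"{lower}-{upper}%: {frequency[lower]}")
```

The frequencies should add up to the number of learners, which is a useful check that no score was missed or counted twice.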

     

    Exercise:

    A student has earned the following grades on his tests: 87, 95, 76, and 88. He wants an average of 85 or better overall. What is the minimum grade he must get on the last test in order to achieve that average?

     

  • 9012-2-11 SayPro Lesson REPRESENT, ANALYSE AND INTERPRET DATA USING VARIOUS TECHNIQUES

    Specific Outcome

    Represent, analyse and interpret data using various techniques to investigate real-life and work problems

    Assessment Criteria

    On completion of this section you will be able to:

    • Graphical representations and numerical summaries are consistent with the data, are clear and appropriate to the situation and target audience. (SO 2, AC 1)
    • Different representations of aspects of the data are compared to take a position on the issue. (SO 2, AC 2)
    • Calculations and the use of statistics are correct and appropriate to the problem. (SO 2, AC 3)
    • Interpretations of statistics are justified and applied to answer questions about the problem. (SO 2, AC 4)
    • New questions that arise from the modelling of the data are discussed. (SO 2, AC 5)
  • 9012-2-10 SayPro Lesson READING AND PLOTTING GRAPHS OF REAL-LIFE PROBLEMS

    Example:

    You are at home getting ready to go out to your stamp collecting club. You leave your house and jog the 1000 m to the club, arriving 5 minutes later. You exchange stamps and chat for 1 hour, then leave for home. The return trip takes you 10 minutes. Plot a distance-time graph to represent your journey to and from the club.

    Solution:

    Drawing graphs of real-life problems

    1. Choose a suitable scale for each axis.
    2. Decide how many points to plot.
    3. Draw the graph with suitable accuracy.
    4. Provide a title and label the axes.

    The journey to and from my stamp collecting club
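
The points to plot for this journey can be worked out from the story before drawing anything. The sketch below lists them as (time, distance from home) pairs and computes the average speed of each straight segment. It assumes time is measured in minutes from leaving home and distance in metres; the names `points` and `segment_speeds` are illustrative.

```python
# (time in minutes, distance from home in metres) at each change in the journey
points = [
    (0, 0),       # leave home
    (5, 1000),    # arrive at the club after a 5-minute jog
    (65, 1000),   # stay at the club for 1 hour (60 minutes)
    (75, 0),      # arrive back home 10 minutes later
]

def segment_speeds(points):
    """Average speed, in metres per minute, for each segment between points."""
    speeds = []
    for (t0, d0), (t1, d1) in zip(points, points[1:]):
        speeds.append(abs(d1 - d0) / (t1 - t0))
    return speeds

print(segment_speeds(points))
```

Joining these four points with straight lines gives the distance-time graph. The flat middle segment has speed zero (the hour spent at the club), and the steeper first segment shows that the jog to the club was faster than the trip home.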

    Exercise 


    1. Identify the graph that matches each of the following stories:
       a. I had just left home when I realized I had forgotten my books so I went back to pick them up.
       b. Things went fine until I had a flat tire.
       c. I started out calmly, but sped up when I realized I was going to be late.
    2. The graph at the right represents the typical day of a teenager. Answer these questions:
       a. What percent of the day is spent watching TV?
       b. How many hours are spent sleeping?
       c. What activity takes up the least amount of time?
       d. What activity takes up a quarter of the day?
       e. What two activities take up 50% of the day?
       f. What two activities take up 25% of the day?
    3. Answer these questions about the graph at the right:
       a. How many sets of data are represented?
       b. On approximately what calendar date does the graph begin?
       c. In what month does the graph reach its highest point?
    4. Answer these questions about the graph on the right:
       a. How many total miles did the car travel?
       b. What was the average speed of the car for the trip?
       c. Describe the motion of the car between hours 5 and 12.
       d. What direction is represented by line CD?
       e. How many miles were traveled in the first two hours of the trip?
       f. Which line represents the fastest speed?
    5. Answer these questions about the graph at the right:
       a. What is the dependent variable on this graph?
       b. Does the price per bushel always increase with demand?
       c. What is the demand when the price is 5 per bushel?
    6. The bar graph at right represents the declared majors of freshmen enrolling at a university. Answer the following questions:
       a. What is the total freshman enrollment of the college?
       b. What percent of the students are majoring in physics?
       c. How many students are majoring in economics?
       d. How many more students major in polysci than in psych?
    7. This graph represents the number of A’s earned in a particular college algebra class. Answer the following questions:
       a. How many A’s were earned during the fall and spring of 1990?
       b. How many more A’s were earned in the fall of 1991 than in the spring of 1991?
       c. In which year were the most A’s earned?
       d. In which semester were the most A’s earned?
       e. In which semester and year were the fewest A’s earned?

  • 9012-2-9 SayPro Lesson SIMPLE RANDOM SAMPLING

    A simple random sample gives each member of the population an equal chance of being chosen.  It is not a haphazard sample as some people think!  One way of achieving a simple random sample is to number each element in the sampling frame (e.g. give everyone on the Electoral register a number) and then use random numbers to select the required sample.

    Random numbers can be obtained using your calculator, a spreadsheet, printed tables of random numbers, or by the more traditional methods of drawing slips of paper from a hat, tossing coins or rolling dice.
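
The numbering approach described above can be sketched in Python with the standard `random` module. The sketch assumes the sampling frame has already been numbered from 1 to `frame_size`; the function name `simple_random_sample` and the `seed` parameter are illustrative assumptions (a seed only makes the draw repeatable for checking).

```python
import random

def simple_random_sample(frame_size, sample_size, seed=None):
    """Pick `sample_size` different numbers from 1..frame_size.

    random.Random.sample draws without replacement, so every member of
    the frame has the same chance of being chosen and nobody is chosen twice.
    """
    rng = random.Random(seed)
    return rng.sample(range(1, frame_size + 1), sample_size)

# Example: choose 10 people from a numbered list of 500.
sample = simple_random_sample(frame_size=500, sample_size=10, seed=1)
print(sorted(sample))
```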

    The optimum sample is the one which maximises precision per unit cost, and by this criterion simple random sampling can often be bettered by other methods.

    Advantages

    Ideal for statistical purposes

    Disadvantages

    • Hard to achieve in practice
    • Requires an accurate list of the whole population
    • Expensive to conduct as those sampled may be scattered over a wide area

    1.3.1 RANDOM NUMBERS FROM A CALCULATOR OR SPREADSHEET

    Most electronic calculators have a RAN# function that produces a random decimal number between 0 and 1. The formula =RAND() in Excel achieves the same result, but to more decimal places. So how can you use these to select a random sample?

    Suppose you wanted to select a random lottery number between 1 and 49.  There are two approaches.

    Firstly, you could multiply the electronic random number by 49 to get a random number between 0 and 49, and then round this number up to the nearest whole number. For example, if the electronic random number is 0.497, multiplying by 49 gives 24.353, which you should round up to 25.

    Secondly, you could treat the electronic random number as a series of random digits and use the first two as your random number, ignoring any that are greater than 49.  For example, the electronic random number 0.632 has first two digits 63 and you ignore it, whereas 0.317 gives the random number 31.
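
Both approaches can be written out as a short Python sketch, with `random.random()` standing in for the calculator's RAN# key. The function names are hypothetical. Two details go slightly beyond the text and are assumptions: the digit pair 00 is ignored as well, because 0 is not a lottery number, and the first approach would return 0 in the very rare case that the random number is exactly 0.

```python
import math
import random

def by_scaling(u, highest=49):
    """First approach: multiply the random decimal by `highest` and round up."""
    return math.ceil(u * highest)

def by_leading_digits(u, highest=49):
    """Second approach: use the first two decimal digits as the number.

    Returns None when the pair must be ignored (greater than `highest`, or 00).
    """
    number = int(u * 100)  # 0.317 -> 31, 0.632 -> 63
    if 1 <= number <= highest:
        return number
    return None

def draw_by_leading_digits(rng=random):
    """Keep generating random decimals until one gives a usable number."""
    while True:
        number = by_leading_digits(rng.random())
        if number is not None:
            return number

print(by_scaling(0.497))          # the worked example from the text
print(by_leading_digits(0.632))   # ignored
print(by_leading_digits(0.317))
print(draw_by_leading_digits())
```

The second approach throws away roughly half of the random numbers it is given, so it needs more key presses on average, but it avoids any rounding.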

     

    1.3.2 RANDOM NUMBER TABLES

    Random number tables consist of a randomly generated series of digits (0-9).  To make them easy to read there is typically a space between every 4th digit and between every 10th row.  When reading from random number tables you can begin anywhere (choose a number at random) but having once started you should continue to read across the line or down a column and NOT jump about.

    Here is an extract from a table of random sampling numbers:

    3680    2231    8846    5418    0498    5245    7071    2597

    If we were doing market research and wanted to sample two houses from a street containing houses numbered 1 to 48, we would read off the digits in pairs:
    36    80    22    31    88    46    54    18    04    98    52    45    70    71    25    97
    and take the first two pairs that lie between 01 and 48, which gives house numbers 36 and 22.

    If we wanted to sample two houses from a much longer road with 140 houses in it, we would need to read the digits off in groups of three:
    368    022    318    846    541    804    985    245    707    125    97
    The only groups that lie between 001 and 140 are 022 and 125, so the houses to visit are numbers 22 and 125.
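
Reading a random number table in groups is mechanical enough to sketch in code. The function below is a hypothetical helper, not a standard routine: it reads the digits left to right in groups of a fixed size, skips any group that is not a valid house number, and stops once enough houses have been found. Skipping a number that has already been chosen is an added assumption, so that the same house is not visited twice.

```python
def sample_from_digits(digits, group_size, highest, how_many):
    """Read `digits` in groups of `group_size` and keep the first valid numbers."""
    chosen = []
    # Step through the string one complete group at a time;
    # an incomplete group at the end is never read.
    for start in range(0, len(digits) - group_size + 1, group_size):
        number = int(digits[start:start + group_size])
        # Keep numbers that name a real house and were not picked already.
        if 1 <= number <= highest and number not in chosen:
            chosen.append(number)
            if len(chosen) == how_many:
                break
    return chosen

# The extract from the table, with the spacing removed.
extract = "3680 2231 8846 5418 0498 5245 7071 2597".replace(" ", "")

print(sample_from_digits(extract, group_size=2, highest=48, how_many=2))
print(sample_from_digits(extract, group_size=3, highest=140, how_many=2))
```

With a street of 48 houses about half of the pairs are usable, but with 140 houses only 14% of the three-digit groups are, so a longer stretch of the table has to be read.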

    Houses in a road usually have numbers attached, which is convenient (except where there is no number 13). In many cases, however, one has first to give each member of the population a number. For a group of 10 people we could number them as:

    0 Appleyard 5 Francis
    1 Banyard 6 Gray
    2 Croft 7 Hibbert
    3 Durran 8 Jones
    4 Entwhistle 9 Lillywhite

    By numbering them from 0 to 9 you need only use single digits from the random number table. Reading the extract above one digit at a time gives 3, 6, 8, 0, 2, 2, 3, 1 and so on. The first digit is 3, so Durran is chosen.
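
The same idea can be sketched for the numbered list of people. Each digit from the table is used as a position in the list, and a repeated digit is skipped so that nobody is chosen twice (an assumption for samples larger than one). The function name `pick_people` is hypothetical.

```python
# Position in the list = the number given to each person (0 to 9).
people = ["Appleyard", "Banyard", "Croft", "Durran", "Entwhistle",
          "Francis", "Gray", "Hibbert", "Jones", "Lillywhite"]

digits = "36802231884654180498524570712597"

def pick_people(people, digits, how_many):
    """Use successive single digits to choose `how_many` different people."""
    chosen = []
    for digit in digits:
        name = people[int(digit)]
        if name not in chosen:   # skip anyone already chosen
            chosen.append(name)
        if len(chosen) == how_many:
            break
    return chosen

print(pick_people(people, digits, how_many=1))
print(pick_people(people, digits, how_many=3))
```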

     

  • 9012-2-8 SayPro Lesson QUANTITATIVE AND QUALITATIVE DATA COLLECTION METHODS

    Quantitative data collection methods rely on random sampling and structured data collection instruments that fit diverse experiences into predetermined response categories. They produce results that are easy to summarize, compare, and generalize.

    Quantitative research is concerned with testing hypotheses derived from theory and/or being able to estimate the size of a phenomenon of interest.  Depending on the research question, participants may be randomly assigned to different treatments.  If this is not feasible, the researcher may collect data on participant and situational characteristics in order to statistically control for their influence on the dependent, or outcome, variable. If the intent is to generalize from the research participants to a larger population, the researcher will employ probability sampling to select participants.

    Typical quantitative data gathering strategies include:

    • Experiments/clinical trials.
    • Observing and recording well-defined events (e.g., counting the number of patients waiting in emergency at specified times of the day).
    • Obtaining relevant data from management information systems.
    • Administering surveys with closed-ended questions (e.g., face-to-face and telephone interviews, questionnaires, etc.).

    1.2.1 Interviews

    In quantitative research (survey research), interviews are more structured than in qualitative research. In a structured interview, the researcher asks a standard set of questions and nothing more.

    Face-to-face interviews have the distinct advantage of enabling the researcher to establish rapport with potential participants and therefore gain their cooperation. These interviews yield the highest response rates in survey research. They also allow the researcher to clarify ambiguous answers and, when appropriate, seek follow-up information. Their disadvantages are that they are time consuming, expensive, and impractical when large samples are involved.

    Telephone interviews are less time consuming and less expensive, and the researcher has ready access to anyone on the planet who has a telephone. A disadvantage is that the response rate is not as high as for face-to-face interviews, although it is considerably higher than for mailed questionnaires. The sample may also be biased to the extent that people without phones are part of the population about whom the researcher wants to draw inferences.

    Computer Assisted Personal Interviewing (CAPI) is a form of personal interviewing in which, instead of completing a paper questionnaire, the interviewer brings along a laptop or hand-held computer and enters the information directly into the database. This method saves the time involved in processing the data, and saves the interviewer from carrying around hundreds of questionnaires. However, this type of data collection can be expensive to set up and requires that interviewers have computer and typing skills.

    1.2.2 Questionnaires

    1. Paper-pencil questionnaires can be sent to a large number of people and save the researcher time and money. People tend to be more truthful when responding to questionnaires, particularly on controversial issues, because their responses are anonymous. But questionnaires also have drawbacks: the majority of people who receive them do not return them, and those who do might not be representative of the originally selected sample.
    2. Web-based questionnaires: A new and inevitably growing methodology is the use of Internet-based research. Typically you receive an e-mail containing an address that you click on to reach a secure website, where you fill in a questionnaire. This type of research is often quicker. Some disadvantages of this method include the exclusion of people who do not have a computer or are unable to access one. Also, the validity of such surveys is in question, as people might be in a hurry to complete them and so might not give accurate responses.

    Questionnaires often make use of checklists and rating scales. These devices help simplify and quantify people’s behaviours and attitudes. A checklist is a list of behaviours, characteristics, or other entities that the researcher is looking for. Either the researcher or the survey participant simply checks whether or not each item on the list is observed, present or true. A rating scale is more useful when behaviour needs to be evaluated on a continuum. Rating scales are also known as Likert scales.

    Writing good survey questions

    Rules for writing good questions are given below:

    • Rule 1. Use correct spelling, punctuation and grammar.
    • Rule 2. Use specific questions. For example, “Did you read a newspaper yesterday?” instead of “Did you read a newspaper?”.
    • Rule 3. Use a short introduction to questions about behaviour. In this way you can not only refresh the respondent’s memory, but also explain what you mean by the concept you are using. For example, by “wines” you may mean not only red or white wine, but also liqueurs, cordials, sherries, table wines and sparkling wines.
    • Rule 4. Avoid the use of technical terms and jargon. An exception to this rule is questions written for a specific group of respondents who regularly use jargon, e.g., doctors, lawyers and researchers.
    • Rule 5. Avoid questions that do not have a single answer. For example, “Do you like to walk and to bike to school?” Somebody who likes to walk, but does not like to cycle, cannot answer this question properly.
    • Rule 6. Avoid negative phrasing, e.g., “Should the school not be improved?” This can lead to confusion and makes it harder to answer the question correctly.
    • Rule 7. Avoid words and expressions with multiple meanings, such as “any” and “just”.
    • Rule 8. Avoid stereotyping, offensive and emotionally loaded language.

    Response formats

    Usually, a survey consists of a number of questions that the respondent has to answer in a set format. A distinction is made between open-ended and closed-ended questions. An open-ended question asks the respondent to formulate his own answer, whereas a closed-ended question has the respondent pick an answer from a given number of options. The response options for a closed-ended question should be exhaustive and mutually exclusive. Four types of response scales for closed-ended questions are distinguished:

    • Dichotomous, where the respondent has two options
    • Nominal-polytomous, where the respondent has more than two unordered options
    • Ordinal-polytomous, where the respondent has more than two ordered options
    • (Bounded) continuous, where the respondent is presented with a continuous scale

    A respondent’s answer to an open-ended question is coded into a response scale afterwards.

    Qualitative data collection methods play an important role in impact evaluation by providing information useful for understanding the processes behind observed results and for assessing changes in people’s perceptions of their well-being. Furthermore, qualitative methods can be used to improve the quality of survey-based quantitative evaluations by helping to generate evaluation hypotheses, strengthening the design of survey questionnaires, and expanding or clarifying quantitative evaluation findings. These methods are characterized by the following attributes:

    • they tend to be open-ended and have less structured protocols (i.e., researchers may change the data collection strategy by adding, refining, or dropping techniques or informants)
    • they rely more heavily on interactive interviews; respondents may be interviewed several times to follow up on a particular issue, clarify concepts or check the reliability of data
    • they use triangulation to increase the credibility of their findings (i.e., researchers rely on multiple data collection methods to check the authenticity of their results)
    • their findings usually cannot be generalised to any specific population; rather, each case study produces a single piece of evidence that can be used to seek general patterns among different studies of the same issue

    Regardless of the kinds of data involved, data collection in a qualitative study takes a great deal of time. The researcher needs to record any potentially useful data thoroughly, accurately, and systematically, using field notes, sketches, audiotapes, photographs and other suitable means. The data collection methods must observe the ethical principles of research.

    The qualitative methods most commonly used in evaluation can be classified in three broad categories:

    • in-depth interview
    • observation methods
    • document review