
Common Types of Assessment Tools in Service-Learning Research

Several types of measurement procedures are common in research on service-learning: surveys and questionnaires, rating scales, interviews, focus groups, observational checklists, and rubrics for content analysis of student reflections. Instruments designed for course or program evaluation can sometimes be adapted for research. Each of these measurement tools is described below.

Surveys and Questionnaires

One of the most commonly used research tools is the survey (also called a questionnaire). Surveys may be conducted in several ways: face-to-face, by telephone, by email, on the internet, or on paper. Surveys frequently incorporate rating scales, discussed below.

Many surveys used in service-learning research are self-report measures, in which respondents report on their own attitudes, opinions, behaviors, behavioral intentions, feelings, or beliefs. Questions may address the occurrence of an event (e.g., "Were you nervous on your first day of service?"), its intensity (e.g., "How nervous were you?"), its frequency (e.g., "How often did you tutor at the service site?"), the degree of endorsement (e.g., "I was extremely nervous": "Strongly Agree," "Agree," "Disagree," "Strongly Disagree"), or the likelihood of a future event (e.g., "How likely is it that you will be nervous at your next visit?": "Very Likely" to "Very Unlikely"). Self-report surveys are useful for many research purposes because they obtain information directly from the respondent; however, they have several important drawbacks. One disadvantage is that they are subjective and may not coincide with ratings given by other sources of information (e.g., the instructor, an outside observer, another student, a staff member). Another is that they may be subject to social desirability bias, the tendency to give normative responses that present oneself in a good light.

Surveys and questionnaires may be composed of scales: intentionally designed, coherent measures of a construct (e.g., a trait or attribute) that combine multiple indicators. Although most questionnaires are composed of several scales, each individual scale is typically a multiple-item measure of only one construct. As such, a scale should display qualities consistent with unidimensionality, such as a single-factor structure and a high coefficient alpha. Bringle, Phillips, and Hudson (2004) discuss the characteristics of good scales and provide examples of scales that can be used in service-learning research.
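For reference, coefficient alpha (Cronbach's alpha) for a k-item scale can be computed from the item variances and the variance of the total score:

$$
\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_{i=1}^{k}\sigma^{2}_{Y_i}}{\sigma^{2}_{X}}\right)
$$

where $\sigma^{2}_{Y_i}$ is the variance of item $i$ and $\sigma^{2}_{X}$ is the variance of the total scale score. Values around .80 or above are conventionally taken as evidence of good internal consistency.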

Rating scales in surveys: Many surveys incorporate rating scales. Probably the most common format is the Likert-type response format, in which the study participant chooses an option from a list indicating his or her level of agreement or disagreement with a statement, as in the example below (a scoring sketch follows it):

I do not know what I would do without my cell phone.

  1. Strongly Disagree
  2. Disagree
  3. Agree
  4. Strongly Agree
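To turn such responses into a scale score, researchers typically assign each option a number and sum across items, reverse-coding any negatively keyed items so that higher totals always indicate more of the construct. A minimal sketch in Python; the items, keys, and responses are hypothetical:

```python
# Minimal sketch of scoring a Likert-type scale (4-point, no neutral).
# Item keys and responses are hypothetical, for illustration only.
SCALE_MAX = 4  # 1 = Strongly Disagree ... 4 = Strongly Agree

# True = reverse-keyed: agreement indicates LESS of the construct.
ITEM_KEYS = {"q1": False, "q2": True, "q3": False}

def scale_score(responses):
    """Sum item responses, reverse-coding negatively keyed items."""
    total = 0
    for item, value in responses.items():
        if ITEM_KEYS[item]:
            value = (SCALE_MAX + 1) - value  # maps 4 -> 1, 1 -> 4
        total += value
    return total

print(scale_score({"q1": 4, "q2": 1, "q3": 3}))  # 4 + 4 + 3 = 11
```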

Many variations on this type of rating item are possible, including the presence or absence of a neutral point and the number of response options (e.g., 4, 5, 6, or 7). A less commonly used format asks participants to give their opinion on some issue, experience, or product using a subjective numeric scale. For example:

Rate the quality of the movie Jaws on a scale of 1 to 10, with 1 being low and 10 being high.

Yet another format asks participants to rank statements in order of preference or agreement, as in the following:

Place a number next to each of the following to indicate your preference, with 1 being your first choice, 2 being your second choice, and so on.

  • ____ Working at a food pantry
  • ____ Tutoring children in an urban school
  • ____ Cleaning the grounds of a park
  • ____ Painting walls in a community center

Checklists in surveys: Checklists are another common element of surveys. Frequently these are seen with instructions such as "select one" or "check all that apply," as in the example below (a recoding sketch follows it):

Please indicate which activities you have participated in during the past year (check all that apply):

  • ___ Participating in community service through a course (service-learning)
  • ___ Volunteering for a service activity through campus, such as United Way Day of Caring
  • ___ Participating in a public debate, working on a political campaign, or assisting with voter registration
  • ___ Community involvement through a campus organization or club
  • ___ Community service or involvement as part of a financial aid package
  • ___ Service through another organization not connected to the university
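For analysis, each option in a "check all that apply" item is usually recoded as its own 0/1 indicator variable. A minimal sketch, with abbreviated, illustrative option names:

```python
# Recode one respondent's "check all that apply" answers into 0/1
# indicator variables, one per option. Option names are abbreviated,
# illustrative stand-ins for the list above.
OPTIONS = [
    "service_learning", "campus_volunteering", "political_activity",
    "club_involvement", "financial_aid_service", "other_service",
]

def to_indicators(checked):
    """Map the set of checked options to a dict of 0/1 indicators."""
    return {opt: int(opt in checked) for opt in OPTIONS}

respondent = {"service_learning", "club_involvement"}
print(to_indicators(respondent))
# {'service_learning': 1, 'campus_volunteering': 0, 'political_activity': 0,
#  'club_involvement': 1, 'financial_aid_service': 0, 'other_service': 0}
```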

A researcher may wish to have subjects indicate how often they have participated in particular activities, for example:

Please indicate how often you have participated in the following in the past year:

  • 1 = None/Never
  • 2 = Once each school year
  • 3 = Once or twice each semester
  • 4 = About once a month
  • 5 = Nearly every week or more
  • ___ Participating in community service through a course (service-learning)
  • ___ Volunteering for a service activity through campus, such as United Way Day of Caring
  • ___ Participating in a public debate, working on a political campaign, or assisting with voter registration
  • ___ Community involvement through a campus organization or club
  • ___ Community service or involvement as part of a financial aid package
  • ___ Service through another organization not connected to the university

Interviews and Focus Groups

An interview is another research tool that is especially useful at the exploration stage, or for qualitative research. Interviews can be conducted either in person or by telephone. They are similar to surveys but are often used to assess information in more depth than a survey allows. Interviewers usually follow a protocol for asking questions and recording responses. Interview questions may be open-ended (content determined by the interviewer) or structured (pre-determined content and order). Likewise, responses can be open-ended (the respondent is free to say anything) or close-ended (the interviewer chooses among pre-determined response categories). Interviews may also be recorded (audio or video) for later analysis; recorded interviews can be transcribed for data analysis by judges.

Focus groups are interviews conducted in a group format. One of their biggest advantages is that participants can interact and build on each other's comments (though this may be offset by uneven participation). Another advantage is the time saved by interviewing in groups rather than one-on-one. Shortcomings include that the group format may suppress information from some respondents and that analyzing the qualitative data may be as time-intensive as analyzing individual interviews. Participants also may not have time, or may not feel free, to make completely honest comments in front of others. Finally, focus groups may not be as useful as interviews for getting in-depth information about a particular individual's experiences.

Recordings of interviews and focus groups generally must be transcribed for data analysis, often in the form of content analysis (described in the section on reflection below). In general, interviews are more expensive to conduct than surveys and questionnaires: there are costs for training interviewers, bringing respondents and interviewers together, and the time spent in the interview itself. In addition, there is the risk that the interviewer's characteristics (e.g., gender, race, age) and paralinguistic cues will influence the respondent's answers. These shortcomings are attenuated or eliminated in written questionnaires.

Observation Rating Scales and Checklists

Sometimes a researcher may wish to have observations of behaviors made in a classroom or at a service site. This is especially useful for providing corroborative evidence to supplement information that students have supplied through surveys or reflections, or for checking information given by others (peers, community partners) at the service site.

One way to record observations is to keep a journal or log, which usually would be used to triangulate with other research data. To record observations in a more quantitative format, a researcher might use a rating scale or checklist. An observational rating scale instructs the observer to rate the frequency, quality, or another characteristic of the behavior being observed, such as:

Number of times the tutor established eye contact with student X:
      0      1-2      3-5      6 or more

Quality of nursing student's interactions with community health center staff:
      Low      Medium Low      Medium High      High

Behaviorally anchored rating scales detail the particular dimensions of action that a rater is to look for and require the rater to determine the absence or frequency of behaviors indicative of those dimensions.

In an observational checklist or inventory, the observer makes a checkmark on a list each time a behavior is observed, for example:

  • ✓ Tutor established eye contact with student
  •     Tutor smiled at student
  • ✓ Tutor touched student in appropriate manner
  • ✓ Tutor used language appropriate to the age and abilities of the student
  • Total number of check marks: 3

Document Review

A research project may require review of documents such as reflection products, course syllabi, faculty journals, meeting minutes, strategic plans, annual reports, or mission statements. These artifacts provide a rich source of information about programs and organizations. One limitation of documents and records is that they may be incomplete or inaccurate, or may vary in quality and quantity (Patton, 2002). Document review is usually associated with the qualitative approach to research, but depending on the research question, the researcher might use a rating scale, checklist, or rubric to summarize qualitative data from documents. Review of journals and reflections might also involve qualitative content analysis (described below). Gelmon, Holland, Driscoll, Spring, and Kerrigan (2001) provide examples of document review for service-learning faculty and institutional questions. Although their examples are designed for program evaluation purposes, they can be adapted for research purposes.

Reflection Products: Content Analysis and Rubrics

Among the most common tools used for service-learning assessment are student (or faculty) reflection products. Reflection can take many forms, such as case studies, journals, portfolios, papers, discussions, presentations, and interviews. (For a discussion of reflection activities and how to structure reflection to enhance student learning, see www.compact.org/disciplines/reflection/.) For research purposes, reflections are typically analyzed by one of two methods: content analysis or rubrics.

Content analysis is a standard social science methodology for studying the content of communication. "Generally…content analysis is used to refer to any qualitative data reduction and sense-making effort that takes a volume of qualitative material and attempts to identify core consistencies and meanings…often called patterns or themes" (Patton, 2002, p. 453). In service-learning research, much content analysis is informal in nature. In this technique, researchers develop a series of themes, categories, or coding frames. The process of discovering patterns, categories, or themes in the data is called inductive analysis or open coding; if a framework already exists, the process is called deductive analysis (Patton, 2002). Reflection products are coded against the categories, leading to conclusions about common themes, issues, processes, or ideas expressed, as well as student development along academic, social, or civic dimensions.

More formal content analysis is used when there are large amounts of data to be analyzed (see Eyler and Giles, 1999, for an example of theory-guided content analysis). Software programs such as NVivo, sometimes called computer-assisted qualitative data analysis software (CAQDAS), have been developed to assist in content analysis. These programs help researchers code narratives based on keywords, themes, key phrases, or other salient features. The most widely available programs work on text materials, but programs such as NVivo can also be used to analyze audio, video, and other media.
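At its simplest, deductive coding against a pre-defined frame can be approximated in a few lines of code. The sketch below only counts keyword matches per theme; the coding frame and journal text are hypothetical, and real CAQDAS tools are far more capable:

```python
import re

# Hypothetical coding frame for deductive analysis: theme -> keywords.
CODING_FRAME = {
    "civic engagement": ["community", "citizen", "volunteer"],
    "self-awareness": ["assumed", "bias", "stereotype"],
    "academic link": ["course", "lecture", "theory", "reading"],
}

def code_reflection(text):
    """Count each theme's keyword matches in one reflection product."""
    lowered = text.lower()
    return {
        theme: sum(len(re.findall(r"\b" + re.escape(kw) + r"\b", lowered))
                   for kw in keywords)
        for theme, keywords in CODING_FRAME.items()
    }

journal = ("Tutoring showed me how the theory from lecture applies in "
           "the community, and I questioned what I had assumed.")
print(code_reflection(journal))
# {'civic engagement': 1, 'self-awareness': 1, 'academic link': 2}
```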

Rubrics offer another way to analyze reflection products or other artifacts. A rubric is a scoring tool for subjective assessments that allows more standardized evaluation of products against specified criteria. Rubrics can be either holistic (one-dimensional) or analytic (providing ratings along several dimensions). Table 3 presents an example of an analytic rubric developed by the IUPUI Center for Service and Learning. Rubrics usually take the form of a matrix (see the data-structure sketch after the list below), with the following characteristics:

  • Traits or dimensions that serve as the basis for judging products
  • Definitions or examples to illustrate the traits or dimensions
  • A scale of values on which to rate the traits
  • Standards or examples for each performance level
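As a data structure, an analytic rubric is simply a matrix mapping each dimension to per-level descriptors. A minimal sketch, with dimension names and descriptors loosely paraphrasing Table 3 (illustrative only, not the full rubric):

```python
# An analytic rubric as a matrix: dimensions x performance levels (1-4).
# Dimension names and descriptors are illustrative, loosely following
# Table 3; they are not the full rubric.
RUBRIC = {
    "Clarity": {4: "clear and expressive", 3: "minor lapses",
                2: "frequent lapses", 1: "unclear throughout"},
    "Analysis": {4: "depth and breadth", 3: "lacks depth/breadth",
                 2: "attempted only", 1: "description only"},
}

def total_score(ratings):
    """Sum one rater's level (1-4) on each dimension into a total."""
    return sum(ratings[dim] for dim in RUBRIC)

print(total_score({"Clarity": 3, "Analysis": 4}))  # -> 7
```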

Researchers should have multiple people rate each reflection or artifact and should establish the inter-rater reliability of any rubric used in an investigation. Inter-rater reliability is the degree to which different observers or raters give consistent scores using the same instrument, rating scale, or rubric. Using multiple raters and knowing the inter-rater reliability help establish the credibility of the rubric and give the investigator confidence in the results and conclusions drawn from the research.
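One widely used chance-corrected index of agreement between two raters assigning categorical scores (such as rubric levels) is Cohen's kappa. A minimal sketch, with hypothetical ratings:

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters scoring the same artifacts."""
    n = len(rater_a)
    # Observed agreement: proportion of artifacts scored identically.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginals.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical rubric levels assigned by two raters to ten journals.
a = ["novice", "aware", "aware", "reflective", "novice",
     "aware", "reflective", "novice", "aware", "aware"]
b = ["novice", "aware", "reflective", "reflective", "novice",
     "aware", "aware", "novice", "aware", "aware"]
print(f"kappa = {cohens_kappa(a, b):.2f}")  # 0.68 here; 1.0 = perfect
```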

A good source for pre-made, editable rubrics is RubiStar (rubistar.4teachers.org). Other rubrics for evaluating student reflections in service-learning courses are available online.

Pre-existing Data Sources

Researchers also can conduct secondary data analysis on data collected by another researcher. Research using pre-existing data should be guided by theory, focus on clear research questions or hypotheses, and be consistent with the constraints of the data (sampling, subject population, measurement, design). Although this technique has not yet been used extensively in service-learning research, NSLC has begun to compile sets of data for secondary data analysis.
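As an illustration of working within a dataset's constraints, the sketch below loads a hypothetical campus survey extract and restricts it to the subpopulation a research question actually covers; the file name, column names, and codes are assumptions, not from any actual data release:

```python
import pandas as pd

# Hypothetical extract of campus-wide survey data; the file, columns,
# and codes below are illustrative assumptions only.
df = pd.read_csv("campus_survey_extract.csv")

# Respect the sampling constraints of the research question: e.g.,
# first-year students who reported course-based community service.
sample = df[(df["class_year"] == 1) & (df["service_learning"] == 1)]

print(len(sample), "respondents meet the sampling constraints")
print(sample["civic_attitudes_scale"].describe())
```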

Table 3. Sample scoring rubric for a student journal (Levels: Criteria)
(Developed by Stephen Jones, IUPUI Center for Service and Learning)

  • Reflective practitioner:
    • Clarity: The language is clear and expressive. The reader can create a mental picture of the situation being described. Abstract concepts are explained accurately. Explanation of concepts makes sense to an uninformed reader.
    • Relevance: The learning experience being reflected upon is relevant and meaningful to the student and course learning goals.
    • Analysis: The reflection moves beyond simple description of the experience to an analysis of how the experience contributed to student understanding of self, others, and/or course concepts. Analysis has both breadth (incorporation of multiple perspectives) and depth (premises and claims supported by evidence).
    • Interconnections: The reflection demonstrates connections between the experience and material from other courses, past experience, and/or personal goals.
    • Self-criticism: The reflection demonstrates ability of the student to question biases, stereotypes, preconceptions, and/or assumptions and define new modes of thinking as a result.
  • Aware practitioner:
    • Clarity: Minor, infrequent lapses in clarity and accuracy.
    • Relevance: The learning experience being reflected upon is relevant and meaningful to the student and course learning goals.
    • Analysis: The reflection demonstrates student's attempts to analyze the experience but analysis lacks depth and breadth.
    • Interconnections: The reflection demonstrates connections between the experience and material from other courses, past experience, and/or personal goals.
    • Self-criticism: The reflection demonstrates ability of the student to question biases, stereotypes, and preconceptions.
  • Reflection novice:
    • Clarity: There are frequent lapses in clarity and accuracy.
    • Relevance: Student makes attempts to demonstrate relevance, but the relevance is unclear to the reader.
    • Analysis: Student makes attempts at applying the learning experience to understanding of self, others, and/or course concepts but fails to demonstrate depth and breadth of analysis.
    • Interconnections: There is little to no attempt to demonstrate connections between the learning experience and previous personal and/or learning experiences.
    • Self-criticism: There is some attempt at self-criticism, but the self-reflection fails to demonstrate a new awareness of personal biases, etc.
  • Unacceptable:
    • Clarity: Language is unclear and confusing throughout. Concepts are either not discussed or are presented inaccurately.
    • Relevance: Most of the reflection is irrelevant to student and/or course learning goals.
    • Analysis: Reflection does not move beyond description of the learning experience(s).
    • Interconnections: No attempt to demonstrate connections to previous learning or experience.
    • Self-criticism: No attempt at self-criticism.


Other sources of data for secondary analysis can be found online.

Many colleges and universities have campus-wide data from surveys of students, faculty, and staff that can be used for comparison purposes. For example, the IUPUI page on Student, Staff and Faculty Surveys (www.planning.iupui.edu/95.html) includes results from the National Survey of Student Engagement (NSSE) and Faculty Survey of Student Engagement (FSSE).