Putting it all together
Two variations of using these data for evaluation are provided here.
The Four Category version lets review committees, chairs, or deans place faculty into one of four categories. It supports more discriminating evaluations for tenure and promotion or merit pay, and it suits institutions that have other incentives in place to encourage improvement of instruction.
The Two Category variation is a simplified tool to determine if a faculty member is either Performing as Expected or Needs Improvement.
This dichotomous approach may fit the needs of some chairs or deans who are making annual reviews, for instance, and just want to ensure a faculty member is performing and will be retained.
A balanced approach
The model is designed so that student feedback via the SRI is taken into consideration alongside other sources of evidence, giving a balanced view of faculty instructional effectiveness. SRI scores could be fairly low, placing an instructor in the lower comparison group, yet the instructor could still be placed in a higher evaluation category based on other criteria. For example, an instructor may have tried flipping a classroom for the first time with mixed results. In the evaluation, that instructor should be rewarded for trying, even though SRI scores may not have shown a positive reaction from students. Keep this example in mind when developing your own system, because a system that is too rigid may limit or completely stifle innovation in the classroom. In all cases, an instructor with very low SRI scores would need strong evidence of effort and improvement to receive a more favorable evaluation than one with high SRI scores. The result is a fair and comprehensive evaluation that considers multiple data points. Student feedback is part of the formula, but only one part of a multifaceted evaluation.
SRI Score: calculating an SRI score across time and courses
To determine which SRI category a faculty member is in for the selected period of time, gather the Converted Average Comparison Summary score for each course in the selected time period. For someone teaching four courses two semesters a year, this would be 24 scores for a typical three-year period. Then calculate the median score for this period of time (this simple spreadsheet is available for your use in calculating the median score). The median is the preferred measure since it is the midpoint of scores ranked from highest to lowest. Therefore, it is less likely to be affected by extreme scores. The mean, on the other hand, could be pulled in either direction by extreme scores. Since all instructors have the occasional course where things go awry, often in circumstances beyond their control, the median converted adjusted score is the best measure to summarize the converted average over a period of time.
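The difference between the median and the mean can be seen in a minimal Python sketch. The scores below are hypothetical, invented only for illustration, and the function name `representative_score` is not part of any IDEA tool.

```python
from statistics import mean, median


def representative_score(converted_scores):
    """Return the median converted score for the review period."""
    if not converted_scores:
        raise ValueError("at least one course score is required")
    return median(converted_scores)


# Hypothetical converted scores for eight courses.
# One course (21) went badly for reasons outside the instructor's control.
scores = [52, 55, 49, 51, 54, 50, 53, 21]

print(representative_score(scores))  # 51.5
print(mean(scores))                  # 48.125
```

The single low score pulls the mean down by more than three points, while the median stays close to the instructor's typical result.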
This median score, the instructor's representative Converted Average Comparison Summary score for the period being evaluated, determines which of five categories the instructor is placed in for the first part of the evaluation model: much lower, lower, similar, higher, or much higher than others in the comparison group. This chart shows which median score range is associated with which category.
| Converted Score | Percent Distribution |
| 37 or lower | Much lower 10% |
| 38 to 44 | Lower |
| 45 to 55 | Similar |
| 56 to 62 | Higher |
| 63 or above | Much higher 10% |
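The chart lookup can be sketched as a small Python function. This is an illustration, not an official IDEA tool: the function name `sri_category` is hypothetical, and the handling of fractional medians is an assumption. The chart lists whole-number ranges only, so a median such as 44.5 is assigned here to the band below the next lower bound.

```python
def sri_category(median_score):
    """Map a median converted score to one of the five comparison categories.

    Assumption: a fractional median that falls between two whole-number
    ranges (for example 44.5) stays in the lower of the two bands.
    """
    if median_score < 38:
        return "much lower"
    if median_score < 45:
        return "lower"
    if median_score < 56:
        return "similar"
    if median_score < 63:
        return "higher"
    return "much higher"


print(sri_category(51.5))  # similar
print(sri_category(37))    # much lower
```

An institution that prefers to round fractional medians before the lookup could do so before calling the function.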
Other considerations. The review period should include a minimum of six courses, with the goal of including as many courses as possible (see Technical Report 18). IDEA recommends adjusted scores over raw scores because adjusted scores account for factors beyond the control of instructors. See Adjusted Scores at a Glance for more about adjusted scores.
Weighting courses. At times, an instructor's work in one course may matter more to the review than their work in others. For instance, a department may have an important introductory course that is vital for recruiting students to the major, or one whose large enrollments are important to the department's bottom line. Teaching that large course is then a very important part of the instructor's load compared with the smaller, upper-level courses they teach. In such cases, you might consider ways of weighting some courses over others in your evaluations.
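One possible way to weight courses, offered only as an assumption and not as a method prescribed here, is a weighted median: each course's score counts in proportion to a weight the department assigns. The function name, the scores, and the weights below are all hypothetical.

```python
def weighted_median(scores, weights):
    """Return the lower weighted median of scores.

    Courses are sorted by score, and the result is the first score at
    which the running total of weights reaches half of the total weight.
    """
    if len(scores) != len(weights) or not scores:
        raise ValueError("scores and weights must be non-empty and equal in length")
    if any(w <= 0 for w in weights):
        raise ValueError("weights must be positive")

    half = sum(weights) / 2
    running = 0
    for score, weight in sorted(zip(scores, weights)):
        running += weight
        if running >= half:
            return score


# Hypothetical example: the third course is a large introductory course
# that the department chooses to count three times as heavily.
print(weighted_median([40, 50, 60], [1, 1, 3]))  # 60
print(weighted_median([40, 50, 60], [1, 1, 1]))  # 50
```

With equal weights and an even number of courses, this sketch returns the lower of the two middle scores instead of their average, so it can differ slightly from an ordinary median.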