Evaluation Criteria

Self-Evaluation

As part of this process, faculty complete a self-evaluation that addresses the three criteria (Instructional Effort, Student Engagement, and Design for Learning) as well as analysis of, and responses to, student feedback, as appropriate. This may include, for instance, explanations that some courses in the review period were taught for the first time or under extenuating circumstances that may have led to lower-than-desired SRI scores. It also includes a thoughtful response to student ratings in specific courses and a description of how the faculty member responded to the feedback. Faculty also provide evidence of changes to course materials, assignments, class activities, and the like made in response to student feedback, student performance on assessments, suggestions from mentors or others, and attempts to apply best practices in instruction. Effective self-evaluations include a reflective summary of overall improvements and lessons learned during the review period.

Peer Evaluation

This optional method of evaluation refers specifically to observation of classroom instruction conducted by fellow faculty and/or chairs and deans.

Peer observation of instruction should be designed to assess instructor behaviors that support student learning, including assessments of the three criteria of instructional effort, student engagement, and, to the degree possible, course design elements such as class learning activities. Numerous rubrics and guidelines have been developed for conducting peer evaluations and are available online. They typically include observations of faculty preparedness and organization, engagement of students, effective use of learning strategies, and effective communication techniques, among others, and often encourage a pre-class review of class goals and a conversation with the instructor. As with any data collection, more data are better: for evaluation purposes, more than one observation, preferably from more than one observer, will yield the most useful information.


Review of Evidence

The chair, dean, and/or P&T committee reviews the self-evaluation, summaries of peer evaluations, and the course materials provided (and may request copies of other materials such as exams and syllabi) to determine how well the instructor is meeting each of the three criteria. SRI data may also be reviewed to provide evidence for some criteria. Student comments, for example, may provide evidence of student engagement, and trends in SRI scores over time may provide evidence of improvement. Institutions may develop their own rubrics to use in evaluating the criteria.

Methods of Evaluation

Instructional Effort

Faculty are assessed on their instructional effort, including improvement over time in SRI scores, responsiveness to feedback from students and peers through adjustments to instruction, evidence of modifying instruction to improve student learning outcomes, evidence of trying new approaches, and whatever other evidence an institution believes is important to assess. In essence, this is a determination of whether or not an instructor is engaging in reflective practice, the thoughtful consideration of evidence to improve student learning. These pieces of evidence are synthesized into a single assessment of instructional effort. Instructors provide a self-assessment as part of this criterion that is used in the evaluation. The chair or evaluation committee then reviews the available evidence to determine the appropriate placement of the faculty member.

Student Engagement

Student engagement affects learning, retention, and student commitment to programs, and this criterion is an attempt to measure the quality of an instructor's contribution to engaging students in learning. Student engagement is a term often used to cover a wide array of concepts; it is used here to address those instructor efforts that engage students with course content, and hence provide motivation for learning, as well as the interpersonal connections between students, faculty, and the larger learning community. Content-related engagement includes effective use of teaching strategies that draw students into learning, including the instructor's own interactions with students about course content that motivate and challenge them. It also includes the direct interactions an instructor has with students that promote learning and student success. Is the instructor approachable and available to students during and outside of class, for instance?

Design for Learning

This criterion is an attempt to assess course design elements within the control of the instructor, including assessments, assignments, syllabi, and other elements important to a campus. Are courses purposefully designed to achieve the outcomes most desired in a course? Some institutions might look for evidence that courses are backward designed as part of this assessment. For online courses, this assessment might include evaluations of course organization, designing for instructor presence, and more.

Other measures considered in evaluation criteria could include advising ratings; departmental or institutional learning outcome assessments; or other criteria of importance to the unit. For the criteria of interest, campuses could create rubrics, or borrow existing ones, to further operationalize and standardize the assessment of each.


Putting it all together

Two variations of using these data for evaluation are provided here. 

The Four Category version allows review committees or chairs/deans to place faculty into one of four categories, supporting finer-grained evaluations for tenure and promotion and/or merit pay decisions, or for institutions that have other incentives in place to encourage improvement of instruction.

Four Category Evaluation Tool

The Two Category variation is a simplified tool to determine if a faculty member is either Performing as Expected or Needs Improvement.

Two Category Evaluation Tool

This dichotomous approach may fit the needs of chairs or deans who are conducting annual reviews, for instance, and simply want to ensure a faculty member is performing and will be retained.


A balanced approach

The model is designed so that student feedback via the SRI is taken into consideration, but other sources of evidence are weighed as well, so that a balanced view of faculty instructional effectiveness is obtained. SRI scores could be fairly low, placing an instructor in a lower comparison group, yet the instructor could be placed in a higher evaluation category based on other criteria. This might happen, for example, with the understanding that an instructor tried flipping their classroom for the first time with mixed results. In their evaluation, they should be rewarded for trying even though SRI scores may not have shown a positive reaction from students. It is important to keep this example in mind when developing your own system, as a system that is too rigid in its approach may limit or completely stifle innovation in the classroom. In all cases, an instructor with very low SRI scores would need strong evidence of effort and improvement to receive a more favorable evaluation than one with high SRI scores. The result, then, is a fair and comprehensive evaluation that considers multiple data points. Student feedback is part of the formula but only one part of a multifaceted evaluation.


SRI Score: calculating an SRI score across time and courses 

To determine which SRI category a faculty member falls into for the selected period of time, gather the Converted Average Comparison Summary score for each course taught in that period. For someone teaching four courses each of two semesters a year, this would be 24 scores for a typical three-year period. Then calculate the median score for the period (this simple spreadsheet is available for your use in calculating the median score). The median is the preferred measure because it is the midpoint of the scores ranked from highest to lowest and is therefore less likely to be affected by extreme scores; the mean, on the other hand, can be pulled in either direction by extreme values. Since all instructors have the occasional course where things go awry, often in circumstances beyond their control, the median converted adjusted score is the best measure to summarize the converted average over a period of time.
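
As a minimal illustration of this calculation (standing in for the spreadsheet mentioned above), the sketch below computes the median of a set of hypothetical Converted Average Comparison Summary scores; the score values are invented for the example.

```python
# Minimal sketch: median of Converted Average Comparison Summary scores.
# The scores below are hypothetical, one per course in the review period.
from statistics import median

converted_scores = [44, 51, 48, 39, 55, 47, 50, 46]

# The median is the midpoint of the ranked scores, so a single course
# that went awry does not drag the summary up or down the way a mean would.
print(median(converted_scores))  # 47.5
```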

This median score, the instructor's representative Converted Average Comparison Summary score for the period being evaluated, is then used to determine which of the five categories to place the instructor in for the first part of the evaluation model: much lower, lower, similar, higher, or much higher than others in the comparison group. The chart below shows which median score range is associated with which category (a brief code sketch of this mapping follows the chart).

Converted Score   Percent Distribution   Label
37 or lower       10%                    Much Lower
38 to 44          20%                    Lower
45 to 55          40%                    Similar
56 to 62          20%                    Higher
63 or above       10%                    Much Higher
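For illustration, the cutoffs in the chart can be expressed as a simple lookup. This sketch is our own, not part of the IDEA materials, and assumes the median score has already been calculated as described above.

```python
# Minimal sketch: map a median converted score to the five comparison
# categories using the cutoffs from the chart above.
def comparison_category(median_score):
    if median_score <= 37:
        return "Much Lower"    # bottom 10%
    elif median_score <= 44:
        return "Lower"         # next 20%
    elif median_score <= 55:
        return "Similar"       # middle 40%
    elif median_score <= 62:
        return "Higher"        # next 20%
    else:
        return "Much Higher"   # top 10%

print(comparison_category(47.5))  # "Similar"
```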

Other considerations. There should be a minimum of six courses in the review period, with the goal of including as many courses as possible (see Technical Report 18). IDEA recommends the use of adjusted scores over raw scores because adjusted scores account for factors beyond the instructor's control. See Adjusted Scores at a Glance for more about adjusted scores.

Weighting courses. There may be times when an instructor's work in a particular course is more important than in others. For instance, if an introductory course in your department is vital for recruiting students to your major, or if large enrollments in that course matter to the department's bottom line, then teaching this large course is a very important part of an instructor's load compared to the smaller, upper-level courses they teach. In such cases, you might consider ways of weighting some courses over others in your evaluations, as in the sketch below.
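
One way to operationalize this is a weighted median in which the key course counts more heavily. The weights and course scores below are hypothetical, and the weighted-median approach is our own illustration rather than part of the IDEA model.

```python
# Minimal sketch: a weighted median in which some courses count more.
def weighted_median(scores_and_weights):
    """Return the score at which half the total weight is reached."""
    items = sorted(scores_and_weights)              # sort by score
    total = sum(weight for _, weight in items)
    cumulative = 0.0
    for score, weight in items:
        cumulative += weight
        if cumulative >= total / 2:
            return score

# Hypothetical courses: the large introductory course counts double.
courses = [
    (48, 2.0),  # large introductory course, weight 2
    (55, 1.0),  # upper-level course
    (42, 1.0),  # upper-level course
]
print(weighted_median(courses))  # 48
```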
