Behavioral Formative Assessment: Direct Behavior Rating Item Construction

[vc_row][vc_column width=”2/3″][vc_custom_heading text=”Behavioral Formative Assessment:
Direct Behavior Rating
Item Construction” font_container=”tag:h2|font_size:42|text_align:center”][/vc_column][vc_column width=”1/3″][vc_column_text css=”.vc_custom_1460228953244{padding-top: 2em !important;}”]Meagan Medley, PhD – Nicholls State University

Kristin Johnson, PhD & Ayesha Kurshid, PhD – The Institute for Evidence-Based Reform (TIER)

Gary L. Cates, PhD – Illinois State University[/vc_column_text][/vc_column][/vc_row][vc_row][vc_column width=”2/3″][vc_empty_space][vc_custom_heading text=”Lit Review” font_container=”tag:h2|font_size:34|text_align:left”][vc_column_text]Direct Behavior Rating (DBR): An umbrella term to include multiple tools that use similar design and procedures (Chafouleas, 2011). Over the years, several themes have emerged to connect these tools and define each as part of the DBR category (e.g., Chafouleas, Riley-Tillman, & McDougal, 2002):

1. DBR includes one or more behaviors rated by someone who interacts frequently with the target student in a school setting. Behaviors can be defined narrowly or more globally as clusters.

2. The DBR observer (e.g., teacher) rates the behavior immediately, one or more times during an instructional day. Rating periods vary from 10 minutes to half or whole instructional days (e.g., Chafouleas, Sanetti, Kilgus, & Maggin, 2012).

3. The DBR data are communicated across stakeholders, either within classrooms or other settings (e.g., playground, bus) or between school and home.

DBR Research & Use: Studies on the DBR have spanned nearly 50 years, with well over 70 research articles examining the efficacy of the DBR as an intervention. The DBRC has been widely used across age ranges (preschool to secondary), settings (e.g., inpatient, home, and school), situations (Chafouleas, Riley-Tillman, Sassu, LaFrance, & Patwa, 2007), and behaviors (Vannest, Davis, Davis, Mason, & Burke, 2010). Only recently have researchers started to examine the DBR as an assessment tool for formative evaluation.

The DBR merges aspects of systematic direct observation and Likert-type rating scales (3 to 10 points) by allowing frequent rating-scale assessment of observed behavior within intervals. Like rating scales, DBRs use a summative evaluation instead of an actual count.

Psychometric Properties: Two studies arrived at their items (behaviors) and measurement systems differently. The sample sizes in these two studies, summarized in the list that follows, were extremely small, and the procedures differed greatly between them.

1. Item wording influenced the rating accuracy of DBR data for some, but not all, behavior targets. The findings suggested comparable rating accuracy across conditions for behaviors including compliance and disruption, but improved accuracy for ratings of academic engagement when items were worded positively rather than negatively (Riley-Tillman et al., 2009). Limitation: the raters were undergraduate students.

2. Negative wording for disruption and positive wording for academic performance were rated accurately using a 6-point scale based on Chafouleas and colleagues (2009) when measuring a DBR multiple-item scale (DBR-MIS; Volpe & Briesch, 2012). Limitation: the raters were 9 graduate students.

Summary & Purpose: DBR is a flexible method of assessment that can take a variety of forms, including single-item scales (e.g., Chafouleas, Christ, Riley-Tillman, Briesch, & Chanese, 2007) and multiple-item scales (e.g., Fabiano, Vujnovic, Naylor, Pariseau, & Robins, 2009). All of the studies report promising results for the DBR as an adequate, reliable, and defensible measure. However, no studies to date have examined the different methodologies with a larger, more diverse sample or compared the methodologies against one another.

Current Study: This study examined item construction of single-item and multiple-item scales using negative versus positive wording for academic performance. School psychologists and teachers rated items on 3 dimensions (criterion relatedness, treatment validity, and observability; Volpe & Briesch, 2012).

Research Questions:

1. What behaviors do teachers and school psychologists perceive as socially important, observable, and measurable indicators of active engagement?

2. Do school psychologists and teachers rate positively and negatively worded items the same on active engagement?

3. Do teachers at the elementary and secondary level rate items the same?[/vc_column_text][vc_empty_space][vc_empty_space][vc_single_image image=”547″][/vc_column][vc_column width=”1/3″][vc_empty_space][vc_empty_space][vc_empty_space][vc_custom_heading text=”Participants” font_container=”tag:h2|font_size:36|text_align:left”][vc_empty_space][vc_single_image image=”544″][vc_empty_space][vc_empty_space][vc_single_image image=”546″][/vc_column][/vc_row][vc_row][vc_column][/vc_column][/vc_row][vc_row][vc_column width=”1/2″][vc_custom_heading text=”Method” font_container=”tag:h2|font_size:36|text_align:left”][vc_column_text]Participants

Demographics are presented in table form. Participants were recruited via online solicitation using Survey Monkey. Southern states were primarily represented: approximately 75% of participants were from AL, MS, TN, LA, TX, SC, VA, KY, and FL. All other participants resided in 14 non-southern states.

Dependent Variables

Ratings on a 5-point Likert scale (1 = strongly disagree to 5 = strongly agree) for measurability, observability, and social importance of items related to active engagement. See table.

Independent Variables

  • Item Wording: Positive vs Negative
  • School Role: Teachers vs School Psychologists


Analyses

  1. Paired-samples t-test comparing item wordings
  2. Repeated-measures ANOVA comparing school roles
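
The paired-samples t-test listed above can be illustrated with a minimal sketch. It assumes each rater contributes one mean rating for positively worded items and one for negatively worded items. The ratings shown are hypothetical values made up for illustration, not data from this study, and the function name `paired_t` is likewise an assumption.

```python
import math
from statistics import mean, stdev


def paired_t(first, second):
    """Return (t statistic, degrees of freedom) for a paired-samples t-test."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    # Work on the within-rater differences between the two wordings.
    diffs = [a - b for a, b in zip(first, second)]
    sd = stdev(diffs)  # sample standard deviation (n - 1 denominator)
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = mean(diffs) / (sd / math.sqrt(len(diffs)))
    return t, len(diffs) - 1


# Hypothetical 5-point Likert ratings from six raters (illustration only).
positive_wording = [5, 4, 4, 5, 3, 4]
negative_wording = [4, 3, 4, 3, 3, 2]

t_stat, df = paired_t(positive_wording, negative_wording)
print(f"t({df}) = {t_stat:.3f}")
```

The t statistic would then be compared against a t distribution with the reported degrees of freedom to obtain a p-value; a statistics package would normally handle that step.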

[/vc_column_text][/vc_column][vc_column width=”1/2″][vc_custom_heading text=”Discussion” font_container=”tag:h2|font_size:36|text_align:left”][vc_column_text]Measurability

  • The effect of directionality (positive vs. negative wording) was significant.
  • However, teachers and school psychologists did not differ significantly in their ratings.
  • Overall, positively worded items were rated higher than negatively worded items, with similar ratings from school psychologists and teachers.


Observability

  • No significant findings.

Social Importance

  • There was no difference between the ratings of positively and negatively worded items.
  • School psychologists' overall ratings were higher than teachers' (regardless of directionality).
  • The directionality × position (school role) interaction was significant.
  • This study expands the literature base to include practitioners rather than graduate and undergraduate participants.

Limitations and Future Research

Limitations of the participant sample include:

  • lack of an urban sample,
  • minimal minority representation in the sample, and
  • limited previous use of the DBRC overall by the sample.

[/vc_column_text][/vc_column][/vc_row][vc_row][vc_column][vc_text_separator title=”Results” el_class=”h2″][/vc_column][/vc_row][vc_row][vc_column width=”1/2″][vc_single_image image=”548″][/vc_column][vc_column width=”1/2″][vc_single_image image=”548″][/vc_column][/vc_row][vc_row][vc_column][vc_btn title=”Download as PDF” size=”lg” align=”center” button_block=”true” link=”||”][vc_tta_accordion][vc_tta_section title=”References” tab_id=”1460230362939-fe2487c6-b826″][vc_column_text]Chafouleas, S. M. (2011). Direct behavior rating: A review of the issues and research in its development. Education and Treatment of Children, 34, 575–591.

Chafouleas, S. M., Christ, T., Riley-Tillman, T. C., Briesch, A. M., & Chanese, J. A. M. (2007). Generalizability and dependability of Direct Behavior Ratings to measure social behavior of preschoolers. School Psychology Review, 36, 63–79.

Chafouleas, S. M., Riley-Tillman, T. C., & Christ, T. J. (2009). Direct behavior rating (DBR): An emerging method for assessing social behavior within a tiered intervention system. Assessment for Effective Intervention, 34, 195–200.

Chafouleas, S. M., Riley-Tillman, T. C., & McDougal, J. (2002). Good, bad, or in-between: How does the daily behavior report card rate? Psychology in the Schools, 39, 157–169.

Chafouleas, S. M., Riley-Tillman, T. C., Sassu, K. A., LaFrance, M. J., & Patwa, S. S. (2007). The consistency of Daily Behavior Report Cards in monitoring interventions. Journal of Positive Behavior Interventions, 9, 30–37.

Fabiano, G. A., Vujnovic, R., Naylor, J., Pariseau, M., & Robins, M. L. (2009). Psychometric properties of Daily Behavior Report Cards used to provide progress monitoring for students with attention-deficit/hyperactivity disorder receiving special education. Assessment for Effective Intervention, 34, 231–241.

Riley-Tillman, T. C., Chafouleas, S. M., Christ, T., Briesch, A. M., & LeBel, T. J. (2009). The impact of item wording and behavioral specificity on the accuracy of Direct Behavior Ratings (DBRs). School Psychology Quarterly, 24, 1–12.

Riley-Tillman, T. C., Chafouleas, S. M., Briesch, A. M., & Eckert, T. L. (2008). Daily behavior report cards and systematic direct observation: An investigation of the acceptability, reported training and use, and decision reliability among school psychologists. Journal of Behavioral Education, 17, 313–327.

Volpe, R. J., & Briesch, A. M. (2012). Generalizability and dependability of single-item and multiple-item direct behavior rating scales for engagement and disruptive behavior. School Psychology Review, 41, 246–261.[/vc_column_text][/vc_tta_section][/vc_tta_accordion][/vc_column][/vc_row]