Law School Cafe

The Latest Change in the MBE

September 5th, 2016 / By Deborah J. Merritt

In the memo announcing results from the July 2016 MBE, Erica Moeser also notified law school deans about an upcoming change in the test. For many years the 200-question exam has included 190 scored items and 10 pre-test questions. Starting in February 2017, the numbers will shift to 175 scored items and 25 pre-test ones.

Pre-testing is an important feature of standardized exams. The administrator uses pre-test answers to gauge a question’s clarity, difficulty, and usefulness for future exams. When examinees answer those questions, they improve the design of future tests.

From the test-taker’s perspective, these pre-test questions are indistinguishable from scored ones. Like other test-makers, NCBE scatters its pre-test questions throughout the exam. Examinees answer each question without knowing whether it is a “real” item that will contribute to their score or a pre-test one that will not.

So what are the implications of NCBE’s increase in the number of pre-test items? The shift is relatively large, from 10 questions (5% of the exam) to 25 (12.5% of the exam). I have three concerns about this change: fair treatment of human research subjects, reliability of the exam, and the possible impact on bar passage rates. I’ll explore the first of these concerns here and turn to the others in subsequent posts.

(more…)

Surprise: MBE Scores Rise in 2016

August 31st, 2016 / By Deborah J. Merritt

Erica Moeser, President of the National Conference of Bar Examiners, sent a memo to law school deans today. The memo reported the welcome, but surprising, news that the national mean score on the MBE was higher in July 2016 than in July 2015. Last year, the national mean was just 139.9. This year, it’s 140.3.

That’s a small increase, but it’s nonetheless noteworthy. LSAT scores for entering law students have been falling for several years. The drop between fall 2012 and fall 2013 was quite noticeable: Seventy percent of ABA-accredited law schools experienced a drop in the 25th percentile score of their entering class. At 19 schools, that score fell 3 points. At another five, it was 4 points.

LSAT scores correlate with MBE scores, so many observers expected July 2016 MBE scores to be lower than those recorded in 2015. Moeser, for example, has repeatedly stressed the link between LSAT scores and MBE ones. She recently declared: “What would surprise me is if LSAT scores dropped and bar pass rates didn’t go down.”

Moeser just received that surprise: Students who began law school in fall 2013 had lower LSAT scores than those who began a year earlier. The former students, however, beat the latter on the MBE after graduation.

So What Happened?

Unpacking this news will take more time and data. Moeser mentions in her memo that the mean MBE score increased in 22 jurisdictions, fell in 26, and remained stable in two. Teasing apart the jurisdictions will provide insights. School-specific results will be even more informative in exploring why the overall score rose.

For now, I offer four hypotheses in descending order of likelihood (from my perspective):

(more…)

On the Bar Exam, My Graduates Are Your Graduates

May 12th, 2015 / By Deborah J. Merritt

It’s no secret that the qualifications of law students have declined since 2010. As applications fell, schools started dipping further into their applicant pools. LSAT scores offer one measure of this trend. Jerry Organ has summarized changes in those scores for the entering classes of 2010 through 2014. Based on Organ’s data, average LSAT scores for accredited law schools fell:

* 2.3 points at the 75th percentile
* 2.7 points at the median
* 3.4 points at the 25th percentile

Among other problems, this trend raises significant concerns about bar passage rates. Indeed, the President of the National Conference of Bar Examiners (NCBE) blamed the July 2014 drop in MBE scores on the fact that the Class of 2014 (which entered law school in 2011) was “less able” than earlier classes. I have suggested that the ExamSoft debacle contributed substantially to the score decline, but here I focus on the future. What will the drop in student quality mean for the bar exam?

Falling Bar Passage Rates

Most observers agree that bar passage rates are likely to fall over the coming years. Indeed, they may have already started that decline with the July 2014 and February 2015 exam administrations. I believe that the ExamSoft crisis and MBE content changes account for much of those slumps, but there is little doubt that bar passage rates will remain depressed and continue to fall.

A substantial part of the decline will stem from examinees with very low LSAT scores. Prior studies suggest that students with low scores (especially those with scores below 145) are at high risk of failing the bar. As the number of low-LSAT students increases at law schools, the number (and percentage) of bar failures probably will mount as well.

The impact, however, will not be limited just to those students. As I explained in a previous post, NCBE’s process of equating and scaling the MBE can drag down scores for all examinees when the group as a whole performs poorly. This occurs because the lower overall performance prompts NCBE to “scale down” MBE scores for all test-takers. Think of this as a kind of “reverse halo” effect, although it’s one that depends on mathematical formulas rather than subjective impressions.

State bar examiners, unfortunately, amplify the reverse-halo effect by the way in which they scale essay and MPT answers to MBE scores. I explain this process in a previous post. In brief, the MBE performance of each state’s examinees sets the curve for scoring other portions of the bar exam within that state. If Ohio’s 2015 examinees perform less well on the MBE than the 2013 group did, then the 2015 examinees will get lower essay and MPT scores as well.
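
A minimal sketch of how this kind of scaling can work, assuming the written scores are rescaled linearly so that they share the mean and standard deviation of the group's MBE scores. The function name `scale_to_mbe` and every number below are made up for illustration; they are not drawn from any state's data or from NCBE's actual procedure.

```python
from statistics import mean, pstdev

def scale_to_mbe(raw_essay, mbe_scaled):
    """Linearly rescale raw essay scores so that they share the mean and
    standard deviation of the group's MBE scaled scores (assumed method)."""
    e_mean, e_sd = mean(raw_essay), pstdev(raw_essay)
    m_mean, m_sd = mean(mbe_scaled), pstdev(mbe_scaled)
    return [m_mean + m_sd * (x - e_mean) / e_sd for x in raw_essay]

# Hypothetical data: identical raw essay performance in both years.
raw_essay = [40, 50, 60, 70, 80]
mbe_2013 = [130, 136, 142, 148, 154]  # hypothetical stronger MBE group
mbe_2015 = [126, 132, 138, 144, 150]  # hypothetical group, 4 points lower

scaled_2013 = scale_to_mbe(raw_essay, mbe_2013)
scaled_2015 = scale_to_mbe(raw_essay, mbe_2015)

# The same essay answers earn lower scaled scores in the weaker MBE year.
for old, new in zip(scaled_2013, scaled_2015):
    print(round(old, 1), round(new, 1))
```

In this toy case the essays are equally good in both years, yet every scaled essay score falls by the same four points that the group's MBE mean fell.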

The law schools that have admitted high-risk students, in sum, are not the only schools that will suffer lower bar passage rates. The processes of equating and scaling will depress scores for other examinees in the pool. The reductions may be small, but they will be enough to shift examinees near the passing score from one side to another. Test-takers who might have passed the bar in 2013 will not pass in 2015. In addition to taking a harder exam (i.e., a 7-subject MBE), these unfortunate examinees will suffer from the reverse-halo effect described above.

On the bar exam, the performance of my graduates affects outcomes for your graduates. If my graduates perform less well than in previous years, fewer of your graduates will pass: my graduates are your graduates in this sense. The growing number of low-LSAT students attending Thomas Cooley and other schools will also affect the fate of our graduates. On the bar exam, Cooley’s graduates are our graduates.

Won’t NCBE Fix This?

NCBE should address this problem, but it has shown no signs of doing so. The equating/scaling process used by NCBE assumes that test-takers retain roughly the same proficiency from year to year. That assumption undergirds the equating process. Psychometricians recognize that, as abilities shift, equating becomes less reliable.* The recent decline in LSAT scores suggests that the proficiency of bar examinees will change markedly over the next few years. Under those circumstances, NCBE should not attempt to equate and scale raw scores; doing so risks the type of reverse-halo effect I have described.

The problem is particularly acute with the bar exam because scaling occurs at several points in the process. As proficiency declines, equating and scaling of MBE performance will inappropriately depress those scores. Those scores, in turn, will lower scores on the essay and MPT portions of the exam. The combined effect of these missteps is likely to produce noticeable–and undeserved–declines in scores for examinees who are just as qualified as those who passed the bar in previous years.

Remember that I’m not referring here to graduates who perform well below the passing score. If you believe that the bar exam is a proper measure of entry-level competence, then those test-takers deserve to fail. The problem is that an increased number of unqualified examinees will drag down scores for more able test-takers. Some of those scores will drop enough to push qualified examinees below the passing line.

Looking Ahead

NCBE, unfortunately, has not been responsive on issues related to their equating and scaling processes. It seems unlikely that the organization will address the problem described here. There is no doubt, meanwhile, that entry-level qualifications of law students have declined. If bar passage rates fall, as they almost surely will, it will be easy to blame all of the decline on less able graduates.

This leaves three avenues for concerned educators and policymakers:

1. Continue to press for more transparency and oversight of NCBE. Testing requires confidentiality, but safeguards are essential to protect individual examinees and public trust of the process.

2. Take a tougher stand against law schools with low bar passage rates. As professionals, we already have an obligation to protect aspirants to our ranks. Self-interest adds a potent kick to that duty. As you view the qualifications of students matriculating at schools with low bar passage rates, remember: those matriculants will affect your school’s bar passage rate.

3. Push for alternative ways to measure attorney competence. New lawyers need to know basic doctrinal principles, and law schools should teach those principles. A closed-book, multiple-choice exam covering seven broad subject areas, however, is not a good measure of doctrinal knowledge. It is even worse when performance on that exam sets the curve for scores on other, more useful parts of the bar exam (such as the performance tests). And the situation is worse still when a single organization, with little oversight, controls scoring of that crucial multiple-choice exam.

I have some suggestions for how we might restructure the bar exam, but those ideas must wait for another post. For now, remember: On the bar exam, all graduates are your graduates.

* For a recent review of the literature on changing proficiencies, see Sonya Powers & Michael J. Kolen, Evaluating Equating Accuracy and Assumptions for Groups That Differ in Performance, 51 J. Educ. Measurement 39 (2014). A more reader-friendly overview is available in this online chapter (note particularly the statements on p. 274).

Equating, Scaling, and Civil Procedure

April 16th, 2015 / By Deborah J. Merritt

Still wondering about the February bar results? I continue that discussion here. As explained in my previous post, NCBE premiered its new Multistate Bar Exam (MBE) in February. That exam covers seven subjects, rather than the six tested on the MBE for more than four decades. Given the type of knowledge tested by the MBE, there is little doubt that the new exam is harder than the old one.

If you have any doubt about that fact, try this experiment: Tell any group of third-year students that the bar examiners have decided to offer them a choice. They may study for and take a version of the MBE covering the original six subjects, or they may choose a version that covers those subjects plus Civil Procedure. Which version do they choose?

After the students have eagerly indicated their preference for the six-subject test, you will have to apologize profusely to them. The examiners are not giving them a choice; they must take the harder seven-subject test.

But can you at least reassure the students that NCBE will account for this increased difficulty when it scales scores? After all, NCBE uses a process of equating and scaling scores that is designed to produce scores with a constant meaning over time. A scaled score of 136 in 2015 is supposed to represent the same level of achievement as a scaled score of 136 in 2012. Is that still true, despite the increased difficulty of the test?

Unfortunately, no. Equating works only for two versions of the same exam. As the word “equating” suggests, the process assumes that the exam drafters attempted to test the same knowledge on both versions of the exam. Equating can account for inadvertent fluctuations in difficulty that arise from constructing new questions that test the same knowledge. It cannot, however, account for changes in the content or scope of an exam.

This distinction is widely recognized in the testing literature–I cite numerous sources at the end of this post. It appears, however, that NCBE has attempted to “equate” the scores of the new MBE (with seven subjects) to older versions of the exam (with just six subjects). This treated the February 2015 examinees unfairly, leading to lower scores and pass rates.

To understand the problem, let’s first review the process of equating and scaling.

Equating

First, remember why NCBE equates exams. To avoid security breaches, NCBE must produce a different version of the MBE every February and July. Testing experts call these different versions “forms” of the test. For each of the MBE forms, the designers attempt to create questions that impose the same range of difficulty. Inevitably, however, some forms are harder than others. It would be unfair for examinees one year to get lower scores than examinees the next year, simply because they took a harder form of the test. Equating addresses this problem.

The process of equating begins with a set of “control” questions or “common items.” These are questions that appear on two forms of the same exam. The February 2015 MBE, for example, included a subset of questions that had also appeared on some earlier exam. For this discussion, let’s assume that there were 30 of these common items and 160 new questions that counted toward each examinee’s score. (Each MBE also includes 10 experimental questions that do not count toward the test-taker’s score but that help NCBE assess items for future use.)

When NCBE receives answer sheets from each version of the MBE, it is able to assess the examinees’ performance on the common items and new items. Let’s suppose that, on average, earlier examinees got 25 of the 30 common items correct. If the February 2015 test-takers averaged only 20 correct answers to those common items, NCBE would know that those test-takers were less able than previous examinees. That information would then help NCBE evaluate the February test-takers’ performance on the new test items. If the February examinees also performed poorly on those items, NCBE could conclude that the low scores were due to the test-takers’ abilities rather than to a particularly hard version of the test.

Conversely, if the February test-takers did very well on the new items–while faring poorly on the common ones–NCBE would conclude that the new items were easier than questions on earlier tests. The February examinees racked up points on those questions, not because they were better prepared than earlier test-takers, but because the questions were too easy.

The actual equating process is more complicated than this. NCBE, for example, can account for the difficulty of individual questions rather than just the overall difficulty of the common and new items. The heart of equating, however, lies in this use of “common items” to compare performance over time.
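
The reasoning in the last few paragraphs can be sketched in a few lines of Python. This is a toy illustration, not NCBE's method: the function name `split_shift` and the new-item percentages are hypothetical, and only the 25-of-30 and 20-of-30 figures come from the example above. It also assumes that proportions correct can simply be subtracted, which real equating does not do.

```python
from fractions import Fraction

def split_shift(old_common, new_common, old_unique, new_unique):
    """Split a change in proportion correct into two parts:
    group_shift: change on the common items (same questions, so it is
                 attributed to the examinees);
    form_shift:  whatever change on the non-common items is left over
                 (attributed to the questions; positive means easier)."""
    group_shift = new_common - old_common
    form_shift = (new_unique - old_unique) - group_shift
    return group_shift, form_shift

old_common = Fraction(25, 30)  # earlier examinees, from the example above
feb_common = Fraction(20, 30)  # February 2015 examinees, from the example
old_unique = Fraction(3, 4)    # hypothetical earlier result on other items

# Case A: February examinees also do poorly on the new items.
case_a = split_shift(old_common, feb_common, old_unique, Fraction(7, 12))

# Case B: February examinees do very well on the new items.
case_b = split_shift(old_common, feb_common, old_unique, Fraction(4, 5))

print(case_a)  # no leftover shift: the low scores reflect the examinees
print(case_b)  # positive leftover shift: the new items look easier
```

In Case A the whole drop is explained by the common items, so the form looks no harder than before. In Case B the strong showing on new items, set against the weak showing on common ones, points to easier questions.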

Scaling

Once NCBE has compared the most recent batch of exam-takers with earlier examinees, it converts the current raw scores to scaled ones. Think of the scaled scores as a rigid yardstick; these scores have the same meaning over time. Eighteen inches this year is the same as eighteen inches last year. In the same way, a scaled score of 136 has the same meaning this year as last year.

How does NCBE translate raw points to scaled scores? The translation depends upon the results of equating. If a group of test-takers performs well on the common items, but not so well on the new questions, the equating process suggests that the new questions were harder than the ones on previous versions of the test. NCBE will “scale up” the raw scores for this group of exam takers to make them comparable to scores earned on earlier versions of the test.

Conversely, if examinees perform well on new questions but poorly on the common items, the equating process will suggest that the new questions were easier than ones on previous versions of the test. NCBE will then scale down the raw scores for this group of examinees. In the end, the scaled scores will account for small differences in test difficulty across otherwise similar forms.

Changing the Test

Equating and scaling work well for test forms that are designed to be as similar as possible. The processes break down, however, when test content changes. You can see this by thinking about the data that NCBE had available for equating the February 2015 bar exam. It had a set of common items drawn from earlier tests; these would have covered the six original subjects. It also had answers to the new scored items (160 of the 190 scored questions, under the assumption above); these would have included both the original subjects and the new one (Civil Procedure).

With these data, NCBE could make two comparisons:

1. It could compare performance on the common items. It undoubtedly found that the February 2015 test-takers performed less well than previous test-takers on these items. That’s a predictable result of having a seventh subject to study. This year’s examinees spread their preparation among seven subjects rather than six. Their mastery of each subject was somewhat lower, and they would have performed less well on the common items testing those subjects.

2. NCBE could also compare performance on the new Civil Procedure items with performance on old and new items in other subjects. NCBE won’t release those comparisons, because it no longer discloses raw scores for subject areas. I predict, however, that performance on Civil Procedure items was the same as on Evidence, Property, or other subjects. Why? Because Civil Procedure is not intrinsically harder than these other subjects, and the examinees studied all seven subjects.

Neither of these comparisons, however, would address the key change in the MBE: Examinees had to prepare seven subjects rather than six. As my previous post suggested, this isn’t just a matter of taking all seven subjects in law school and remembering key concepts for the MBE. Because the MBE is a closed-book exam that requires recall of detailed rules, examinees devote 10 weeks of intense study to this exam. They don’t have more than 10 weeks, because they’re occupied with law school classes, extracurricular activities, and part-time jobs before mid-May or mid-December.

There’s only so much material you can cram into memory during ten weeks. If you try to memorize rules from seven subjects, rather than just six, some rules from each subject will fall by the wayside.

When Equating Doesn’t Work

Equating is not possible for a test like the new MBE, which has changed significantly in content and scope. The test places new demands on examinees, and equating cannot account for those demands. The testing literature is clear that, under these circumstances, equating produces misleading results. As Robert L. Brennan, a distinguished testing expert, wrote in a prominent guide: “When substantial changes in test specifications occur, either scores should be reported on a new scale or a clear statement should be provided to alert users that the scores are not directly comparable with those on earlier versions of the test.” (See p. 174 of Linking and Aligning Scores and Scales, cited more fully below.)

“Substantial changes” is one of those phrases that lawyers love to debate. The hypothetical described at the beginning of this post, however, seems like a common-sense way to identify a “substantial change.” If the vast majority of test-takers would prefer one version of a test over a second one, there is a substantial difference between the two.

As Brennan acknowledges in the chapter I quote above, test administrators dislike re-scaling an exam. Re-scaling is both costly and time-consuming. It can also discomfort test-takers and others who use those scores, because they are uncertain how to compare new scores to old ones. But when a test changes, as the MBE did, re-scaling should take the place of equating.

The second best option, as Brennan also notes, is to provide a “clear statement” to “alert users that the scores are not directly comparable with those on earlier versions of the test.” This is what NCBE should do. By claiming that it has equated the February 2015 results to earlier test results, and that the resulting scaled scores represent a uniform level of achievement, NCBE is failing to give test-takers, bar examiners, and the public the information they need to interpret these scores.

The February 2015 MBE was not the same as previous versions of the test, it cannot be properly equated to those tests, and the resulting scaled scores represent a different level of achievement. The lower scaled scores on the February 2015 MBE reflect, at least in part, a harder test. To the extent that the test-takers also differed from previous examinees, it is impossible to separate that variation from the difference in the tests themselves.

Conclusion

Equating was designed to detect small, unintended differences in test difficulty. It is not appropriate for comparing a revised test to previous versions of that test. In my next post on this issue, I will discuss further ramifications of the recent change in the MBE. Meanwhile, here is an annotated list of sources related to equating:

Michael T. Kane & Andrew Mroch, Equating the MBE, The Bar Examiner, Aug. 2005, at 22. This article, published in NCBE’s magazine, offers an overview of equating and scaling for the MBE.

Neil J. Dorans, et al., Linking and Aligning Scores and Scales (2007). This is one of the classic works on equating and scaling. Chapters 7-9 deal specifically with the problem of test changes. Although I’ve linked to the Amazon page, most university libraries should have this book. My library has the book in electronic form so that it can be read online.

Michael J. Kolen & Robert L. Brennan, Test Equating, Scaling, and Linking: Methods and Practices (3d ed. 2014). This is another standard reference work in the field. Once again, my library has a copy online; check for a similar ebook at your institution.

CCSSO, A Practitioner’s Introduction to Equating. This guide was prepared by the Council of Chief State School Officers to help teachers, principals, and superintendents understand the equating of high-stakes exams. It is written for educated lay people, rather than experts, so it offers a good introduction. The source is publicly available at the link.

The February 2015 Bar Exam

April 12th, 2015 / By Deborah J. Merritt

States have started to release results of the February 2015 bar exam, and Derek Muller has helpfully compiled the reports to date. Muller also uncovered the national mean scaled score for this February’s MBE, which was just 136.2. That’s a notable drop from last February’s mean of 138.0. It’s also lower than all but one of the means reported during the last decade; Muller has a nice graph of the scores.

The latest drop in MBE scores, unfortunately, was completely predictable–and not primarily because of a change in the test takers. I hope that Jerry Organ will provide further analysis of the latter possibility soon. Meanwhile, the expected drop in the February MBE scores can be summed up in five words: seven subjects instead of six. I don’t know how much the test-takers changed in February, but the test itself did.

MBE Subjects

For reasons I’ve explained in a previous post, the MBE is the central component of the bar exam. In addition to contributing a substantial amount to each test-taker’s score, the MBE is used to scale answers to both essay questions and the Multistate Performance Test (MPT). The scaling process amplifies any drop in MBE scores, leading to substantial drops in pass rates.

In February 2015, the MBE changed. For more than four decades, that test has covered six subjects: Contracts, Torts, Criminal Law and Procedure, Constitutional Law, Property, and Evidence. Starting with the February 2015 exam, the National Conference of Bar Examiners (NCBE) added a seventh subject, Civil Procedure.

Testing examinees’ knowledge of Civil Procedure is not itself problematic; law students study that subject along with the others tested on the exam. In fact, I suspect more students take a course in Civil Procedure than in Criminal Procedure. The difficulty is that it’s harder to memorize rules drawn from seven subjects than to learn the rules for six. For those who like math, that’s an increase of 16.7% in the body of knowledge tested.

Despite occasional claims to the contrary, the MBE requires lots of memorization. It’s not solely a test of memorization; the exam also tests issue spotting, application of law to fact, and other facets of legal reasoning. Test-takers, however, can’t display those reasoning abilities unless they remember the applicable rules: the MBE is a closed-book test.

There is no other context, in school or practice, where we expect lawyers to remember so many legal principles without reference to codes, cases, and other legal materials. Some law school exams are closed-book, but they cover a single subject that has just been studied for a semester. The “closed book” moments in practice are much fewer than many observers assume. I don’t know any trial lawyers who enter the courtroom without a copy of the rules of evidence and a personalized cribsheet reminding them of common objections and responses.

This critique of the bar exam is well known. I repeat it here only to stress the impact of expanding the MBE’s scope. February’s test takers answered the same number of multiple choice questions (190 that counted, plus 10 experimental ones) but they had to remember principles from seven fields of law rather than six.

There’s only so much that the brain can hold in memory–especially when the knowledge is abstract, rather than gained from years of real-client experience. I’ve watched many graduates prepare for the bar over the last decade: they sit in our law library or clinic, poring constantly over flash cards and subject outlines. Since states raised passing scores in the 1990s and early 2000s, examinees have had to memorize many more rules in order to answer enough questions correctly. From my observation, their memory banks were already full to overflowing.

Six to Seven Subjects

What happens, then, when the bar examiners add a seventh subject to an already challenging test? Correct answers will decline, not just in the new subject, but across all subjects. The February 2015 test-takers, I’m sure, studied just as hard as previous examinees. Indeed, they probably studied harder, because they knew that they would have to answer questions drawn from seven bodies of legal knowledge rather than six. But their memories could hold only so much information. Memorized rules of Civil Procedure took the place of some rules of Torts, Contracts, or Property.

Remember that the MBE tests only a fraction of the material that test-takers must learn. It’s not a matter of learning 190 legal principles to answer 190 questions. The universe of testable material is enormous. For Evidence, a subject that I teach, the subject matter outline lists 64 distinct topics. On average, I estimate that each of those topics requires knowledge of three distinct rules to answer questions correctly on the MBE–and that’s my most conservative estimate.

It’s not enough, for example, to know that there’s a hearsay exemption for some prior statements by a witness, and that the exemption allows the fact-finder to use a witness’s out-of-court statements for substantive purposes, rather than merely impeachment. That’s the type of general understanding I would expect a new lawyer to have about Evidence, permitting her to research an issue further if it arose in a case. The MBE, however, requires the test-taker to remember that a grand jury session counts as a “proceeding” for purposes of this exemption (see Q 19). That’s a sub-rule fairly far down the chain. In fact, I confess that I had to check my own book to refresh my recollection.

In any event, if Evidence requires mastering 200 sub-principles of this detail, and the same is true of the other five traditional MBE subjects, that was 1200 very specific rules to memorize and keep in memory–all while trying to apply those rules to new fact patterns. Adding a seventh subject upped the ante to 1400 or more detailed rules. How many things can one test-taker remember without checking a written source? There’s a reason why humanity invented writing, printing, and computers.
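
The arithmetic behind these estimates is easy to check. The short sketch below uses only figures stated in this post (64 Evidence topics, roughly three rules per topic, about 200 rules per subject); the variable names are mine, and the per-subject figure is a rough estimate rather than a measured count.

```python
TOPICS_IN_EVIDENCE = 64   # topics on the Evidence subject matter outline
RULES_PER_TOPIC = 3       # conservative estimate from the text
RULES_PER_SUBJECT = 200   # rounded figure used in the text

evidence_rules = TOPICS_IN_EVIDENCE * RULES_PER_TOPIC   # 192, roughly 200

old_load = 6 * RULES_PER_SUBJECT   # six traditional subjects
new_load = 7 * RULES_PER_SUBJECT   # with Civil Procedure added

increase_pct = round(100 * (new_load - old_load) / old_load, 1)

print(evidence_rules, old_load, new_load, increase_pct)
```

The one-sixth increase matches the 16.7% figure given earlier for the growth in the body of knowledge tested.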

But They Already Studied Civil Procedure

Even before February, all jurisdictions (to my knowledge) tested Civil Procedure on their essay exams. So wouldn’t examinees have already studied those Civ Pro principles? No, not in the same manner. Detailed, comprehensive memorization is more necessary for the MBE than for traditional essays.

An essay allows room to display issue spotting and legal reasoning, even if you get one of the sub-rules wrong. In the Evidence example given above, an examinee could display considerable knowledge by identifying the issue, noting the relevant hearsay exemption, and explaining the impact of admissibility (substantive use rather than simply impeachment). If the examinee didn’t remember the correct status of grand jury proceedings under this particular rule, she would lose some points. She wouldn’t, however, get the whole question wrong–as she would on a multiple-choice question.

Adding a new subject to the MBE hit test-takers where they were already hurting: the need to memorize a large number of rules and sub-rules. By expanding the universe of rules to be memorized, NCBE made the exam considerably harder.

Looking Ahead

In upcoming posts, I will explain why NCBE’s equating/scaling process couldn’t account for the increased difficulty of this exam. Indeed, equating and scaling may have made the impact worse. I’ll also explore what this means for the ExamSoft discussion and what (if anything) legal educators might do about the increased difficulty of the MBE. To start the discussion, however, it’s essential to recognize that enhanced level of difficulty.
