Re-evaluating GPT-4’s bar exam performance
2024-05-31 21:29:59.423282+02 by Dan Lyke 0 comments
Oh, look, OpenAI lied about the results of that ChatGPT-takes-the-bar-exam thing: Springer Link: Artificial Intelligence and Law: Re-evaluating GPT-4’s bar exam performance
Fourth, when examining only those who passed the exam (i.e. licensed or license-pending attorneys), GPT-4’s performance is estimated to drop to 48th percentile overall, and 15th percentile on essays. In addition to investigating the validity of the percentile claim, the paper also investigates the validity of GPT-4’s reported scaled UBE score of 298. The paper successfully replicates the MBE score, but highlights several methodological issues in the grading of the MPT + MEE components of the exam, which call into question the validity of the reported essay score.