ChatGPT passed a management exam, but still sucks at math

ChatGPT just passed a university-level management exam. Despite some surprising miscalculations, the artificial intelligence earned a passing grade.

Christian Terwiesch, a business professor at the Wharton School of the University of Pennsylvania, wanted to push ChatGPT, OpenAI’s chatbot, to its limits. To test the AI, he gave it an exam from the school’s Master of Business Administration (MBA) program, a graduate degree that opens up careers in marketing, finance, and human resource management.

Read also: “There is nothing revolutionary about it” – French AI pioneer Yann LeCun is not impressed with ChatGPT

ChatGPT’s excellent explanations

The business professor conscientiously put every exam question to ChatGPT… and the artificial intelligence did not fare too badly. According to the report published by Christian Terwiesch, the chatbot “does an amazing job at basic operations management and process analysis questions, including those that are based on case studies”.

“Not only are the answers correct, but the explanations are excellent,” explains Christian Terwiesch in his detailed report.

Drawing on a huge corpus of data, the artificial intelligence was able to generate appropriate answers, even to questions with complex wording. The chatbot is good at aggregating a set of information and producing a cohesive summary. Note that some questions run several dozen lines long and require in-depth knowledge. On some of them, ChatGPT even earned full marks thanks to detailed, well-structured answers.

Surprising miscalculations

At the same time, ChatGPT also made glaring miscalculations during the exam. Given the relevance of the AI’s other answers, the Wharton professor was struck by its “surprising mistakes in relatively simple calculations, at the level of 6th grade math”. At first glance, most of the mathematical operations look logical and plausible; on closer examination, errors of reasoning become apparent.

In addition, the chatbot was less competent at “more advanced process analysis questions”. Problems involving “process flows with multiple products” and “the variability of demand” gave ChatGPT trouble, as did questions with more complex causal effects.

As part of the experiment, the professor took the liberty of occasionally offering ChatGPT a hint, as he would during an oral examination with a student. Thanks to these clues, the AI quickly revised its work and corrected its approximations:

“In cases where it initially failed to match the problem with the correct solution method, Chat GPT3 was able to correct itself after receiving an appropriate hint from a human expert.”

Christian Terwiesch’s observations match our own opinion of ChatGPT. In its current iteration, OpenAI’s chatbot is a valuable assistant for speeding up certain tasks or getting unstuck on certain intellectual problems. However, it still falls short of replacing human intellect. The same goes for DesignerBot, the AI that creates PowerPoint presentations for you.

As the business professor points out, the AI improves quickly when given clues. Shortly after needing a nudge to solve one problem, ChatGPT managed to generate a perfect answer to a similar one without any hint at all. In just a few interactions, the chatbot seemed to have evolved:

“Either he’s able to learn from past exchanges, or I just got lucky.”

A successful exam

After analyzing the paper ChatGPT turned in, the professor gave it a B grade. With such a score, a student can skip the operations management course, which is normally mandatory at the school. Despite its errors, the chatbot demonstrated sufficient understanding of the material. It therefore passed the exam, though without achieving stellar results:

“We allowed students to waive this course if they could demonstrate content proficiency in a waiver exam. The Chat GPT3 performance reported above would have been sufficient to pass the waiver exam.”

The professor also had fun designing exam questions with ChatGPT. Here again, the chatbot proved very effective. It even slipped small touches of humor into the problem statements, much like a human would.

Christian Terwiesch nevertheless identified two shortcomings in those questions. For no apparent reason, ChatGPT saw fit to include superfluous, anecdotal data in some questions; these unnecessary details can confuse a student trying to work out the solution to a mathematical problem. In other cases, the AI left out critical information, making the problem impossible to answer. Once again, the exercise requires the supervision of human intelligence.

Source: Wharton