An experiment involving hundreds of office workers has found that artificial intelligence (AI) tools can greatly help but also hurt worker performance.
AI tools designed to operate at human levels have greatly expanded in popularity over the past year. These include OpenAI’s ChatGPT, Google’s Bard and Microsoft’s AI-powered search engine Bing.
Such tools, also known as chatbots or “generative AI,” are computer-powered systems. They are designed to interact smoothly with humans and perform high-level writing and creative work.
In recent months, these tools have demonstrated an ability to produce high-quality work. This has led some technology experts to warn that generative AI systems could end up replacing workers in many industries.
This year, researchers at Harvard Business School and other organizations carried out an experiment. It aimed to test how well AI tools could help workers perform their usual duties, or tasks. It involved more than 700 business advisors, called consultants, from Massachusetts-based Boston Consulting Group.
Harvard Business School recently published results from the experiment in a “working paper.” The main findings suggest that AI tools like ChatGPT can greatly improve worker performance.
For example, researchers found that, on average, workers who used OpenAI’s latest ChatGPT 4 tool completed 12 percent more tasks than non-ChatGPT users. Tasks carried out with help from the AI technology were completed 25 percent faster. And the team found the quality of work performed by consultants using ChatGPT 4 increased about 40 percent.
Work tasks used in the study covered four main areas: creativity, analytical thinking, writing and persuasiveness. The team gave examples of worker tasks in each of these areas.
One example for creativity was: “Propose at least 10 ideas for a new shoe targeting an underserved market or sport.” For writing, consultants were asked to “write a press release with marketing copy” for a new product. To show persuasiveness, workers were told to write a letter to employees that explained why a particular product would beat competitors.
Harvard Business School’s Fabrizio Dell’Acqua was the paper’s lead writer. He told technology website VentureBeat he thinks the results were especially important because they showed that AI tools can help even highly educated, experienced workers.
“The fact that we could boost the performance of these highly paid, highly skilled consultants, from top, elite MBA institutions…I would say that’s really impressive,” Dell’Acqua said.
However, the paper also noted areas where the performance of consultants using ChatGPT 4 dropped. The researchers said this was especially true with tasks the AI tool was not good at completing.
For tasks the AI was good at, the experiment showed it “significantly improved human performance,” the paper said. But for tasks ChatGPT 4 was not suited to, “humans relied too much on the AI, and were more likely to make mistakes.”
The researchers reported that consultants who used AI for tasks it was not well equipped for “were 19 percentage points less likely to produce correct solutions compared to those without AI.”
The experiment also showed how consultants used the AI tool differently to improve their work. The researchers said some workers purposely divided the tasks, handing some entirely to the AI tool and completing others entirely themselves. Other workers chose to use AI for all tasks, while “continually interacting with the technology.”
The team suggests one of the biggest barriers to companies using AI effectively is not knowing which tasks the technology can complete best. Finding this out will require businesses to carry out thoughtful research and training efforts to identify the right mix of AI and human work.
I’m Bryan Lynn.