November 23, 2024
GPT-4 Better at Financial Analysis than Human Analysts, Says Research

GPT-4 can outperform human analysts at predicting a company's future earnings from its financial statements, a new research paper claims. The paper, which has been published in a preprint journal, found in its tests that GPT-4 gave superior results compared to its human counterparts over the short term (ranging from one month to six months). It achieved 60.31 percent accuracy in its predictions, compared with 56.7 percent for human analysts. However, the paper did not suggest that the AI model could replace humans.

Research paper’s objective

Published in the preprint journal Social Science Research Network (SSRN), the 54-page paper titled “Financial Statement Analysis with Large Language Models” attempted to find out the role conventional artificial intelligence (AI) models can play in analysing the financial statements of an organisation and predicting its performance in the stock market in the near future.

Such analysis has long been considered complicated, as a wide range of factors can influence a company's performance. While some financial firms use artificial neural networks (ANNs) to assist analysts, large language models (LLMs) have not been used for this purpose. The researchers wanted to see whether a state-of-the-art (SOTA) LLM such as GPT-4 could be a valuable addition to the process.

What did the GPT-4 research paper find?

Researchers fed GPT-4 anonymised and standardised corporate financial statements (to prevent biases arising from the mention of a company’s name). Next, the researchers used two methods to test the capabilities of the LLM. The first was a simple prompt that directed the chatbot to analyse the statements and predict future earnings. The second was a “chain-of-thought” (CoT) prompt that instructed the AI model to mimic the approach of financial analysts.

The CoT method asked GPT-4 to identify notable trends, compute key financial ratios, and form expectations about future earnings. While the simple prompt did not fetch noteworthy results, the CoT prompt achieved 60.31 percent accuracy, which was higher than the average human analyst’s performance.
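To make the difference between the two prompting methods concrete, here is a minimal Python sketch of how such prompts might be assembled. The function names, the prompt wording, and the sample statement text are illustrative assumptions and are not taken from the paper; only the three CoT steps (trends, ratios, expectations) come from the description above.

```python
# Hypothetical sketch: the wording below is an assumption, not the paper's prompt.

# The three reasoning steps the article attributes to the CoT method.
COT_STEPS = [
    "Identify notable trends in the financial statement items.",
    "Compute key financial ratios.",
    "Form an expectation about future earnings and state your prediction.",
]


def build_simple_prompt(statements: str) -> str:
    """Baseline prompt: ask directly for an earnings prediction."""
    return (
        "Below are anonymised, standardised financial statements.\n\n"
        f"{statements}\n\n"
        "Analyse the statements and predict future earnings."
    )


def build_cot_prompt(statements: str) -> str:
    """Chain-of-thought prompt: walk the model through analyst-style steps."""
    numbered = "\n".join(
        f"{i}. {step}" for i, step in enumerate(COT_STEPS, start=1)
    )
    return (
        "Act as a financial analyst reviewing anonymised, standardised "
        "financial statements.\n\n"
        f"{statements}\n\n"
        "Work through the following steps in order:\n"
        f"{numbered}"
    )


if __name__ == "__main__":
    # Made-up figures with no company name, mirroring the anonymisation idea.
    sample = "Revenue: 100 (t-1), 120 (t)\nNet income: 10 (t-1), 15 (t)"
    print(build_cot_prompt(sample))
```

The resulting string would then be sent to the model through whichever client is in use; that step is left out here because the article does not describe it.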

GPT-4 vs human analyst
Photo Credit: Research paper: Financial Statement Analysis with Large Language Models

“We find that an LLM excels in a quantitative task that requires intuition and human-like reasoning. The ability to perform tasks across domains points towards the emergence of Artificial General Intelligence,” the paper stated.

However, the researchers were quick to point out that GPT-4 and human analysts are complementary, and that the former does not replace the latter. Specifically, the paper claimed that LLMs have an advantage in areas where humans tend to show bias and disagreement. Humans, in turn, add value when the analysis requires additional contextual information that is not likely to be available within the financial data.

