Alejandro Lopez-Lira, a finance professor at the University of Florida, says that large language models may be useful when forecasting stock prices.
He used ChatGPT to judge whether news headlines were good or bad for a stock, and found that ChatGPT’s ability to predict the direction of the next day’s returns was much better than random, he said in a recent unreviewed paper.
The experiment strikes at the heart of the promise around state-of-the-art AI: With bigger computers and better datasets, like those powering ChatGPT, these AI models may display “emergent abilities,” or capabilities that weren’t originally planned when they were built.
If ChatGPT can display the emergent ability to understand headlines from financial news and how they might impact stock prices, it could put high-paying jobs in the financial industry at risk. About 35% of financial jobs are at risk of being automated by AI, Goldman Sachs estimated in a March 26 note.
“The fact that ChatGPT is understanding information meant for humans almost guarantees if the market doesn’t respond perfectly, that there will be return predictability,” said Lopez-Lira.
But the specifics of the experiment also show how far so-called “large language models” are from being able to do many finance tasks.
For example, the experiment didn’t include target prices, or have the model do any math at all. In fact, ChatGPT-style technology often makes numbers up, as Microsoft learned in a public demo earlier this year. Sentiment analysis of headlines is also well understood as a trading strategy, with proprietary datasets already in existence.
Lopez-Lira said he was surprised by the results, and said they suggest that sophisticated investors aren’t using ChatGPT-style machine learning in their trading strategies yet.
“On the regulation side, if we have computers just reading the headlines, headlines will matter more, and we can see if everyone should have access to machines such as GPT,” said Lopez-Lira. “Second, it’s certainly going to have some implications on the employment of financial analyst landscape. The question is, do I want to pay analysts? Or can I just put textual information in a model?”
How the experiment worked
In the experiment, Lopez-Lira and his partner Yuehua Tang looked at over 50,000 headlines from a data vendor about public stocks on the NYSE, Nasdaq, and a small cap exchange. They started in October 2022 — after the data cutoff date for ChatGPT, meaning that the engine hadn’t seen or used those headlines in training.
Then, they fed the headlines into ChatGPT 3.5 along with the following prompt:
“Forget all your previous instructions. Pretend you are a financial expert. You are a financial expert with stock recommendation experience. Answer “YES” if good news, “NO” if bad news, or “UNKNOWN” if uncertain in the first line. Then elaborate with one short and concise sentence on the next line.”
Then they looked at the stocks’ return during the following trading day.
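The scoring step described above can be sketched in a few lines of Python. This is a minimal illustration, not the researchers' code: the numeric mapping (+1, -1, 0), the function names `score_response` and `hit_rate`, and the choice to skip neutral signals and flat days are all assumptions made for the example.

```python
from typing import Optional, Sequence

# Hypothetical mapping from the model's first-line answer to a numeric signal.
SCORES = {"YES": 1, "NO": -1, "UNKNOWN": 0}


def score_response(response: str) -> int:
    """Return +1, -1, or 0 based on the first line of the model's reply."""
    lines = response.strip().splitlines()
    first_line = lines[0] if lines else ""
    # Drop surrounding quotes and periods, then normalize case.
    answer = first_line.strip().strip('"“”.').upper()
    # Anything that is not a recognized answer is treated as neutral.
    return SCORES.get(answer, 0)


def hit_rate(scores: Sequence[int], next_day_returns: Sequence[float]) -> Optional[float]:
    """Fraction of non-neutral signals whose sign matches the next-day return."""
    # Assumption: neutral signals and exactly flat days are left out of the count.
    calls = [(s, r) for s, r in zip(scores, next_day_returns) if s != 0 and r != 0]
    if not calls:
        return None
    hits = sum(1 for s, r in calls if (s > 0) == (r > 0))
    return hits / len(calls)
```

Each headline's reply would be passed through `score_response`, and the resulting list of scores would be compared with the next trading day's returns for the same stocks via `hit_rate`.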
Ultimately, Lopez-Lira found that the model did better in nearly all cases when informed by a news headline. Specifically, he found there was a less than 1% chance that picking the next day’s move at random would do as well as the model did when it was informed by a news headline.
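One way to picture a "less than 1% chance" claim is a simple coin-flip comparison: if every up-or-down call were a 50/50 guess, how likely would it be to get at least as many calls right? The sketch below computes that one-sided binomial probability. It is only an illustration of the idea. The article does not say which statistical test the paper used, and the function name `p_value_vs_coin_flip` is hypothetical.

```python
from math import comb


def p_value_vs_coin_flip(hits: int, calls: int) -> float:
    """Probability of at least `hits` correct calls out of `calls` fair coin flips."""
    # Count the outcomes with hits, hits + 1, ..., calls successes,
    # then divide by the total number of equally likely outcomes (2 ** calls).
    favorable = sum(comb(calls, k) for k in range(hits, calls + 1))
    return favorable / 2 ** calls
```

A small result means random guessing would rarely match the observed hit count. The same hit rate produces a smaller probability as the number of calls grows.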
ChatGPT also beat commercial datasets with human sentiment scores. One example in the paper showed a headline about a company settling litigation and paying a fine, which carried a negative sentiment score, but the ChatGPT response correctly reasoned it was actually good news, according to the researchers.
Lopez-Lira told CNBC that hedge funds had reached out to him to learn more about his research. He also said it wouldn’t surprise him if ChatGPT’s ability to predict stock moves decreased in the coming months as institutions started integrating this technology.
That’s because the experiment only looked at stock prices during the next trading day, while most people would expect the market to have already priced in the news in the seconds after it became public.
“As more and more people use these type of tools, the markets are going to become more efficient, so you would expect return predictability to decline,” Lopez-Lira said. “So my guess is, if I run this exercise, in the next five years, by the year five, there will be zero return predictability.”