Third-Party Research & Methodology Only

This section shares summaries of third-party academic research and descriptions of quantitative models. The content represents the findings of the original researchers, not the opinions or recommendations of Foxholm Financial. Foxholm Financial does not publish hypothetical or backtested performance metrics on its quantitative research pages. All content is restricted to methodology, signal construction, factor logic, and risk architecture. SEC rules require that investment advisers not present misleading performance data, and our methodology-only approach reflects that standard and the firm's fiduciary obligations.

Sentiment Analysis / NLP Signals

Natural Language Processing | Alternative Data | Signal Generation
Robert Stowe, AAMS® Investment Advisor

Sentiment analysis applies natural language processing (NLP) to extract trading signals from text data: news articles, earnings call transcripts, regulatory filings, and social media posts. The core premise is that the tone and content of human language carry information about future asset prices that traditional numerical data does not capture.

Unlike traditional quantitative inputs (price, volume, earnings) that arrive as structured numbers, text data is unstructured. A company's quarterly earnings report contains numbers, but it also contains management's choice of words, the tone of analyst questions, and the framing of forward guidance. NLP techniques convert this unstructured text into quantitative scores that can be incorporated into systematic trading models.

Conceptual Framework

Sentiment analysis for financial markets rests on a behavioral finance observation: markets do not instantly incorporate all available information. New information arrives through text (news, filings, social media), and it takes time for investors to read, interpret, and act on it. A model that can systematically process text faster than human readers can potentially identify price-relevant information before it is fully reflected in market prices.

The academic foundation traces to research on how textual information affects asset prices. Tetlock (2007), in a study published in The Journal of Finance, demonstrated that the pessimism level in a Wall Street Journal column predicted next-day stock market returns, establishing that media tone carries measurable predictive content.

Data Sources

Sentiment models draw from several categories of text, each with distinct characteristics and signal properties:

  • News articles: Financial wire services and major publications produce thousands of articles daily. News sentiment captures market-moving events as they happen. The signal is typically short-lived because news is widely disseminated and quickly priced in.
  • Earnings call transcripts: Quarterly conference calls between company management and analysts contain both prepared remarks and spontaneous Q&A. Research shows that the tone of the Q&A section (where management answers unprepared questions) is more informative than the scripted opening statement. These signals tend to have a longer half-life because parsing an hour-long transcript takes human analysts time.
  • Regulatory filings: SEC filings (10-K annual reports, 10-Q quarterly reports, 8-K current reports) contain legally required disclosures. Changes in risk factor language, unusual wording in management discussion sections, and shifts in accounting footnotes can signal evolving business conditions before they appear in headline numbers.
  • Social media: Platforms like StockTwits, Reddit (particularly r/wallstreetbets), and X (formerly Twitter) generate high-frequency, noisy signals. Social media sentiment can capture retail investor positioning but requires aggressive filtering to separate signal from noise. The signal-to-noise ratio is substantially lower than institutional text sources.

NLP Methods

Three generations of NLP techniques are used in financial sentiment analysis, each with different complexity and accuracy tradeoffs:

  • Dictionary-based methods: The simplest approach counts positive and negative words in a document using a predefined word list. The most widely used financial dictionary is the Loughran-McDonald Master Dictionary, developed specifically for financial text. General-purpose sentiment dictionaries (like the Harvard General Inquirer) misclassify financial terms: words such as "liability," "risk," and "capital" can be tagged as negative by general-purpose lists even though they are usually neutral in financial text. The Loughran-McDonald dictionary corrects for this domain-specific vocabulary.
  • Machine learning classifiers: Supervised models trained on labeled financial text. A human annotator marks thousands of sentences as positive, negative, or neutral, and an algorithm (Naïve Bayes, support vector machine, or random forest) learns the patterns. These models capture context that word lists miss: "revenue exceeded expectations" and "revenue fell short of expectations" contain many of the same words but have opposite meanings. The cost is the need for labeled training data.
  • Transformer-based models: Pretrained language models such as BERT and FinBERT (a BERT variant adapted to financial text) represent the current state of the art. These models represent word meaning in context: "Apple" in a financial document refers to a company, not a fruit. Huang, Wang, and Yang (2023), who describe one such FinBERT model, report higher accuracy on financial sentiment classification tasks than both dictionary methods and traditional machine learning classifiers.
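
To make the second generation concrete, the following is a minimal sketch of a supervised classifier: a multinomial Naive Bayes model with add-one smoothing, written in plain Python. The six training sentences and the function names are hypothetical and chosen only for illustration; a real model would be trained on thousands of annotated sentences.

```python
import math
from collections import Counter

# Toy labeled sentences (hypothetical; real training sets are far larger).
TRAIN = [
    ("revenue exceeded expectations", "positive"),
    ("margins improved and guidance was raised", "positive"),
    ("strong demand exceeded our forecast", "positive"),
    ("revenue fell short of expectations", "negative"),
    ("margins declined and guidance was cut", "negative"),
    ("weak demand fell below our forecast", "negative"),
]

def train_naive_bayes(examples):
    """Count words per label and documents per label."""
    word_counts = {}
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.lower().split())
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, label_counts, vocab

def classify(text, model):
    """Return the label with the highest log-posterior (add-one smoothing)."""
    word_counts, label_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in label_counts.items():
        score = math.log(n_docs / total_docs)          # log prior
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in text.lower().split():
            if word in vocab:                          # skip words never seen in training
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

model = train_naive_bayes(TRAIN)
label_beat = classify("revenue exceeded expectations", model)
label_miss = classify("revenue fell short of expectations", model)
```

The two test sentences share "revenue" and "expectations", so the classifier separates them using the remaining words ("exceeded" versus "fell short of"), which is the kind of pattern a flat word list cannot learn from labels.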

Signal Construction Pipeline

A sentiment-based trading signal is constructed through a five-step pipeline, from raw text collection to a numerical score that can be integrated with other quantitative signals.

  • Step 1: Text Collection
  • Step 2: Preprocessing
  • Step 3: Sentiment Scoring
  • Step 4: Signal Aggregation
  • Step 5: Signal Integration

Text Collection and Preprocessing

Raw text is collected from data vendors, news APIs, or SEC EDGAR (the SEC's public filing database). Each document must be associated with a specific company and timestamped accurately. Timestamp precision matters because a sentiment signal derived from an article published after market close has different trading implications than one published during trading hours.
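
The timestamp point can be sketched as a small mapping from publication time to the first session that could act on the text. This is an illustration under stated assumptions: timestamps are already in exchange local time, the regular session runs 09:30 to 16:00, and holidays are ignored. The function names are hypothetical.

```python
from datetime import date, datetime, time, timedelta

MARKET_OPEN = time(9, 30)    # assumed regular-session open, exchange local time
MARKET_CLOSE = time(16, 0)   # assumed regular-session close

def next_weekday(d: date) -> date:
    """Advance to the next Monday-Friday date (holidays ignored in this sketch)."""
    d += timedelta(days=1)
    while d.weekday() >= 5:
        d += timedelta(days=1)
    return d

def first_tradable_session(published: datetime) -> tuple[date, str]:
    """Map a publication time to the first session that can act on it."""
    d = published.date()
    if d.weekday() >= 5:                   # weekend: wait for the next weekday open
        return next_weekday(d), "at_open"
    if published.time() < MARKET_OPEN:     # pre-market: same-day open
        return d, "at_open"
    if published.time() < MARKET_CLOSE:    # during the session: tradable intraday
        return d, "intraday"
    return next_weekday(d), "at_open"      # after the close: next session's open

# A Friday 16:30 article is first tradable at Monday's open.
session = first_tradable_session(datetime(2024, 3, 1, 16, 30))
```

A production pipeline would replace the weekday rule with an exchange calendar and convert vendor timestamps from UTC before applying the cutoffs.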

Preprocessing removes boilerplate text (legal disclaimers, standard headers), normalizes formatting, and segments documents into analyzable units. For earnings calls, this typically means separating the prepared remarks from the Q&A. For SEC filings, it means isolating specific sections (Management Discussion and Analysis, Risk Factors) from standardized tables and exhibits.
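
As one possible illustration of segmentation, the sketch below splits an earnings call transcript at a section marker. The marker text "Question-and-Answer Session" is an assumption; transcript vendors format this boundary differently, so a real pipeline would need vendor-specific rules.

```python
import re

# Hypothetical section marker on a line of its own; real transcripts vary by vendor.
QA_MARKER = re.compile(
    r"^\s*question[- ]and[- ]answer session\s*$",
    re.IGNORECASE | re.MULTILINE,
)

def split_transcript(text: str) -> dict[str, str]:
    """Separate prepared remarks from Q&A; Q&A is empty if no marker is found."""
    match = QA_MARKER.search(text)
    if match is None:
        return {"prepared_remarks": text.strip(), "qa": ""}
    return {
        "prepared_remarks": text[:match.start()].strip(),
        "qa": text[match.end():].strip(),
    }

sample = (
    "Operator: Welcome.\n"
    "CEO: We had a solid quarter.\n"
    "\n"
    "Question-and-Answer Session\n"
    "\n"
    "Analyst: What about margins?\n"
    "CFO: They were stable.\n"
)
parts = split_transcript(sample)
```

Returning an empty Q&A section when the marker is missing, rather than guessing, makes formatting changes visible downstream instead of silently mixing the two sections.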

Scoring and Aggregation

The chosen NLP method assigns a sentiment score to each text unit. For dictionary methods, this is often a ratio: (positive word count minus negative word count) divided by total word count. For machine learning models, it is a probability output (e.g., 0.82 probability of positive sentiment).
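
The dictionary ratio described above can be written in a few lines. The word lists here are tiny illustrative stand-ins, not the actual Loughran-McDonald dictionary, and the tokenizer is deliberately simple.

```python
import re

# Tiny illustrative word lists; NOT the actual Loughran-McDonald dictionary.
POSITIVE = {"improved", "strong", "growth", "exceeded"}
NEGATIVE = {"decline", "weak", "loss", "impairment"}

def dictionary_score(text: str) -> float:
    """(positive count - negative count) / total word count."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return (pos - neg) / len(words)

# 2 positive words, 1 negative word, 9 words in total.
score = dictionary_score("Strong growth in services offset a decline in hardware")
```

Dividing by total word count keeps long and short documents comparable, at the cost of diluting the score in documents with a lot of neutral text.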

Individual document scores are then aggregated into a company-level or asset-level signal. Common aggregation approaches include the simple average of recent scores, an exponentially weighted average that gives more weight to recent documents, or the change in sentiment relative to a rolling baseline. The change-based approach often produces stronger signals because it captures shifts in tone rather than absolute levels.
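
The three aggregation approaches can be compared side by side. This is a sketch only: the smoothing parameter `alpha`, the `recent` window, and the sample scores are assumed values, and the baseline here is simply the mean of the earlier documents.

```python
from statistics import fmean

def simple_average(scores):
    """Equal weight on every document score."""
    return fmean(scores)

def exp_weighted_average(scores, alpha=0.5):
    """Scores ordered oldest to newest; the newest gets the largest weight."""
    n = len(scores)
    weights = [(1 - alpha) ** (n - 1 - i) for i in range(n)]
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)

def change_vs_baseline(scores, recent=2):
    """Mean of the most recent scores minus the mean of the earlier ones."""
    if len(scores) <= recent:
        raise ValueError("need more scores than the recent window")
    return fmean(scores[-recent:]) - fmean(scores[:-recent])

# Hypothetical document scores for one company, oldest first.
doc_scores = [0.10, 0.12, 0.08, 0.30, 0.34]
avg = simple_average(doc_scores)
ewa = exp_weighted_average(doc_scores)
delta = change_vs_baseline(doc_scores)
```

In this sample the tone improves in the last two documents, so the exponentially weighted average sits above the simple average and the change measure is positive.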

Integration with Other Signals

Sentiment scores rarely serve as standalone trading signals. They are typically combined with traditional quantitative factors (momentum, value, quality) in a multi-factor framework. The combination works because sentiment captures a different type of information: while price-based momentum measures what investors are doing, text-based sentiment measures what investors and analysts are saying and thinking.

Research by Jegadeesh and Wu (2013) in the Journal of Financial Economics found that word-based sentiment has predictive content for stock returns that is distinct from and additive to traditional financial variables.

Risk Architecture

Sentiment-based strategies face a unique set of risks that differ from those of traditional quantitative models.

Signal Decay

Sentiment signals tend to decay quickly. News-based signals may lose most of their value within hours. Earnings call signals typically decay over days to weeks. As more market participants adopt NLP tools, the speed of signal decay increases because the information gets priced in faster. A sentiment model that worked in 2015 may be substantially less effective today due to this crowding effect.
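
One common way to model decay is an exponential half-life, where a document's weight halves after each half-life period. The text does not specify a functional form, so the exponential shape and the 4-hour half-life used in the example are assumptions for illustration.

```python
def decay_weight(age_hours: float, half_life_hours: float) -> float:
    """Weight halves every `half_life_hours` (assumed exponential decay)."""
    return 0.5 ** (age_hours / half_life_hours)

def decayed_signal(scored_docs, half_life_hours):
    """scored_docs: iterable of (score, age_hours). Returns the decay-weighted mean."""
    pairs = [(s, decay_weight(a, half_life_hours)) for s, a in scored_docs]
    total = sum(w for _, w in pairs)
    return sum(s * w for s, w in pairs) / total if total else 0.0

# A fresh positive article and a 4-hour-old negative one, with a 4-hour half-life.
signal = decayed_signal([(1.0, 0.0), (-1.0, 4.0)], half_life_hours=4.0)
```

Using a short half-life for news and a longer one for earnings calls would reflect the different decay speeds described above; the right values would have to be estimated from data.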

Model Risk

NLP models can misinterpret language in ways that are not immediately obvious. Sarcasm, negation, and domain-specific jargon create classification errors. A sentence like "the company's risk management is not as weak as competitors feared" contains a positive message but is structured with negative words ("risk," "weak," "feared") that simpler models may misclassify. Transformer-based models handle these cases better than dictionaries, but no model is immune to misinterpretation.
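
The failure mode is easy to reproduce with a naive word count. The negative-word list below is a toy list containing the three words named above, not a published dictionary.

```python
import re

# Toy negative-word list for illustration only (not a published dictionary).
NEGATIVE = {"risk", "weak", "feared"}

def negative_hits(text):
    """Return the words a naive counter would flag as negative."""
    words = re.findall(r"[a-z']+", text.lower())
    return [w for w in words if w in NEGATIVE], len(words)

sentence = "The company's risk management is not as weak as competitors feared"
hits, n_words = negative_hits(sentence)
naive_score = -len(hits) / n_words   # no positive words are present
```

The counter flags three of eleven words and returns a negative score, even though the sentence is reassuring. Handling "not as weak as" correctly requires a model that reads the words in context.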

Data Quality Risk

Text data sources can change without warning. News providers alter their formatting, SEC EDGAR updates its filing structure, and social media platforms change their APIs or content policies. Any of these changes can break a production sentiment pipeline or subtly alter the signal without triggering obvious errors. Continuous monitoring of data quality is essential but operationally demanding.
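
A basic monitoring check compares today's document count and average score against recent history. This is a sketch under assumptions: the two monitored fields, the z-score test, and the threshold of 3 are hypothetical choices, and a production system would track many more diagnostics.

```python
from statistics import fmean, pstdev

def drift_alerts(history, today, z_limit=3.0):
    """Flag fields whose value today is far from the trailing mean."""
    alerts = []
    for key in ("doc_count", "mean_score"):
        values = [day[key] for day in history]
        mu, sigma = fmean(values), pstdev(values)
        if sigma == 0:
            deviates = today[key] != mu
        else:
            deviates = abs(today[key] - mu) / sigma > z_limit
        if deviates:
            alerts.append(key)
    return alerts

# Hypothetical trailing five days for one news feed.
history = [
    {"doc_count": 100, "mean_score": 0.05},
    {"doc_count": 104, "mean_score": 0.04},
    {"doc_count": 98, "mean_score": 0.06},
    {"doc_count": 101, "mean_score": 0.05},
    {"doc_count": 97, "mean_score": 0.05},
]
# A parser break often shows up as a collapse in document count first.
alerts = drift_alerts(history, {"doc_count": 12, "mean_score": 0.05})
```

A drop in document count with an unchanged mean score is the pattern one might expect when a provider changes its format and most articles silently fail to parse.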

Known Limitations

  • Language is ambiguous: Even the most advanced NLP models make classification errors. Financial language is particularly challenging because the same words carry different meanings in different contexts. Accuracy rates for financial sentiment classification typically range from 75% to 85%, meaning 15% to 25% of documents are scored incorrectly.
  • Vocabulary drift in training data: Models trained on historical text may not generalize to new language patterns. The vocabulary of financial markets evolves: terms like "meme stocks" and "SPAC" were uncommon or nonexistent in earlier training data. Models must be periodically retrained to remain effective.
  • Manipulation risk: Social media sentiment is particularly vulnerable to manipulation. Coordinated posting campaigns can artificially inflate or deflate sentiment scores. News sentiment is less susceptible but not immune, as press releases are inherently biased toward positive framing.
  • Latency requirements: For news-based signals, milliseconds matter. A sentiment model that processes an article five seconds after publication may arrive too late for the signal to have value, particularly for liquid, large-cap stocks where algorithmic traders react almost instantly.
  • Backtesting challenges: Historical text data is harder to obtain and reconstruct than price data. Point-in-time text databases (where you see only what was available at each historical moment) are expensive and incomplete. Without them, backtests suffer from look-ahead bias.
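
The point-in-time requirement in the last item amounts to filtering on when each document became available, not on the period it describes. The sketch below assumes each record carries an `available_at` timestamp; the `Document` class and its field names are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass(frozen=True)
class Document:
    ticker: str
    available_at: datetime   # when the text could first have been read
    score: float

def visible_documents(docs, as_of):
    """Keep only documents that existed at the simulated decision time."""
    return [d for d in docs if d.available_at <= as_of]

docs = [
    Document("AAA", datetime(2021, 3, 1, 9, 0), 0.2),
    Document("AAA", datetime(2021, 3, 3, 18, 0), -0.4),
]
# A backtest decision made on 2 March must not see the 3 March document.
visible = visible_documents(docs, as_of=datetime(2021, 3, 2, 16, 0))
```

The filter is only as good as the timestamps: if a vendor backfills or revises articles without recording when each version appeared, look-ahead bias can remain even with this check in place.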

Practical Considerations

Data Costs and Infrastructure

Production-grade sentiment analysis requires significant data infrastructure. News feeds from providers like Refinitiv or Bloomberg cost tens of thousands of dollars annually. SEC filing data from EDGAR is free but requires substantial engineering to parse and maintain. Social media data access has become increasingly restricted and expensive as platforms monetize their APIs.

Processing requirements add to the cost. Running FinBERT or similar transformer models across thousands of documents daily requires GPU-accelerated computing infrastructure. Dictionary-based methods are computationally inexpensive but sacrifice accuracy. The tradeoff between model complexity and operational cost is a key design decision.

Combining Sentiment with Fundamental Signals

The most robust implementations use sentiment as one input among several rather than as a standalone signal. A multi-factor approach might weight sentiment at 10% to 20% of the overall signal, alongside momentum, value, and quality factors. This diversification across signal types reduces the impact of any single signal's decay or failure.
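
A weighted composite of standardized factor scores is one simple way to express this. In the sketch below, the 15% sentiment weight sits inside the 10% to 20% range mentioned above; the other weights, the tickers, and the raw values are hypothetical placeholders.

```python
from statistics import fmean, pstdev

# Sentiment at 15% sits inside the 10% to 20% range in the text;
# the other weights are hypothetical placeholders.
WEIGHTS = {"momentum": 0.30, "value": 0.30, "quality": 0.25, "sentiment": 0.15}

def zscores(raw):
    """Standardize one factor across tickers so factors are comparable."""
    vals = list(raw.values())
    mu, sigma = fmean(vals), pstdev(vals)
    return {t: ((v - mu) / sigma if sigma else 0.0) for t, v in raw.items()}

def composite(factors):
    """factors: {factor_name: {ticker: raw_value}} -> {ticker: weighted z-score}."""
    standardized = {name: zscores(raw) for name, raw in factors.items()}
    tickers = next(iter(factors.values())).keys()
    return {
        t: sum(w * standardized[name][t] for name, w in WEIGHTS.items())
        for t in tickers
    }

factors = {
    "momentum": {"AAA": 0.10, "BBB": 0.02, "CCC": -0.05},
    "value": {"AAA": 0.5, "BBB": 0.7, "CCC": 0.9},
    "quality": {"AAA": 1.2, "BBB": 1.0, "CCC": 0.8},
    "sentiment": {"AAA": 0.3, "BBB": 0.0, "CCC": -0.3},
}
scores = composite(factors)
```

Standardizing each factor before weighting prevents a factor with a large numeric range from dominating the composite regardless of its nominal weight.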

Sentiment signals tend to be most valuable at extremes. A company with extremely negative news sentiment combined with strong fundamental quality scores may represent a buying opportunity if the market has overreacted to short-term negative headlines. The interaction between sentiment and other factors often provides more information than either alone.

Universe Considerations

Sentiment analysis works best for stocks with substantial text coverage. Large-cap stocks covered by dozens of analysts and hundreds of news articles per month provide ample data for reliable sentiment scores. Small-cap and micro-cap stocks often have too little coverage for meaningful text analysis, creating a coverage bias in the signal. For thinly covered names, a single article can dominate the sentiment score, making the signal unreliable.
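
One simple guard against thin coverage is to withhold the signal when too few documents are available. The threshold of five documents and the sample data are assumptions for illustration, not recommended settings.

```python
from statistics import fmean

def coverage_filtered_signal(scores_by_ticker, min_docs=5):
    """Mean sentiment per ticker, or None when coverage is too thin to trust."""
    return {
        ticker: (fmean(scores) if len(scores) >= min_docs else None)
        for ticker, scores in scores_by_ticker.items()
    }

# Hypothetical month of document scores: one well-covered name, one thinly covered.
signals = coverage_filtered_signal({
    "LARGE": [0.10, 0.05, 0.12, 0.08, 0.11, 0.09],
    "MICRO": [-0.90],
})
```

Returning None rather than zero keeps "no reliable reading" distinct from "neutral sentiment", so downstream code can decide how to treat uncovered names.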

Disclaimer

This content is for educational and informational purposes only and does not constitute an offer to sell or a solicitation of an offer to buy any securities. Nothing herein constitutes investment advice or recommendations tailored to your individual situation. All investments involve risk, including the potential loss of principal. Past performance is no guarantee of future results. Information presented is believed to be factual and up-to-date, but Foxholm Financial does not guarantee its accuracy and it should not be regarded as a complete analysis of the subjects discussed. Before making investment decisions, consult with a qualified financial advisor who can evaluate your specific circumstances.
