We propose an approach to measuring the state of the economy via textual analysis of business news. From the full text content of 800,000 Wall Street Journal articles for 1984–2017, we estimate a topic model that summarizes business news as easily interpretable topical themes and quantifies the proportion of news attention allocated to each theme at each point in time. We then use our news attention estimates as inputs into statistical models of numerical economic time series. We demonstrate that these text-based inputs accurately track a wide range of economic activity measures and that they have incremental forecasting power for macroeconomic outcomes, above and beyond standard numerical predictors. Finally, we use our model to retrieve the news-based narratives that underlie shocks in numerical economic data.
Yale SOM; AQR Capital Management, LLC; National Bureau of Economic Research (NBER)
Washington University in St. Louis – John M. Olin Business School
University of Chicago – Booth School of Business
Monthly Topic Attention (Theta).csv reports WSJ attention allocation to news topics in each month from 1984 to 2017; these are the estimated monthly theta parameters from the model.
Word Weights By Topic (Phi).csv reports estimated word probabilities within each topic, which is the collection of estimated phi parameters from the model.
Scaled Word Weights By Topic (Phi-tilde).csv reports estimated word probabilities within each topic scaled by the average word probability across all topics, which is the collection of estimated phi-tilde parameters from the model. Re-scaling the phis helps identify the unique thematic content of each topic.
Scaled Word Weights By Topic (Phi-tilde).csv (63 MB)
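The re-scaling described above can be illustrated with a minimal sketch. It assumes phi is held as one row of word probabilities per topic and that phi-tilde divides each entry by that word's average probability across topics, as the file description states. The function name `scale_word_weights` and the toy numbers are hypothetical; this is not the authors' code and makes no claim about the layout of the CSV files.

```python
def scale_word_weights(phi):
    """Divide each word probability by that word's average across topics.

    phi: list of rows, one per topic; each row lists word probabilities
    over the same vocabulary (layout assumed for illustration).
    """
    n_topics = len(phi)
    n_words = len(phi[0])
    # Average probability of each word across all topics.
    avg = [sum(row[v] for row in phi) / n_topics for v in range(n_words)]
    # Words with zero average probability are left at zero.
    return [
        [row[v] / avg[v] if avg[v] > 0 else 0.0 for v in range(n_words)]
        for row in phi
    ]


# Hypothetical toy example: 2 topics, 3 words.
phi = [
    [0.5, 0.4, 0.1],
    [0.5, 0.1, 0.4],
]
phi_tilde = scale_word_weights(phi)
```

In this toy example the first word is equally likely in both topics, so its scaled weight is 1 in each, while the second word is pushed above 1 in the first topic and below 1 in the second. That is the sense in which re-scaling highlights what is distinctive about a topic.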
Estimated with hierarchical agglomerative clustering, the dendrogram illustrates how our 180 topics cluster into an intuitive hierarchy of increasingly broad metatopics. Click on topics at the far right to see topic detail.
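For readers unfamiliar with hierarchical agglomerative clustering, the sketch below shows the general idea on toy data: start with every topic in its own cluster and repeatedly merge the two closest clusters, recording each merge. The page does not state which distance or linkage rule was used, so Euclidean distance and average linkage are assumptions made purely for illustration, as are the function names and the four toy topic vectors.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def agglomerate(vectors):
    """Average-linkage agglomerative clustering (illustrative assumption).

    Returns the merge history as (left_members, right_members, distance)
    tuples in merge order; reading it bottom-up gives the dendrogram.
    """
    clusters = [[i] for i in range(len(vectors))]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Average linkage: mean pairwise distance between members.
                total = sum(
                    euclidean(vectors[a], vectors[b])
                    for a in clusters[i]
                    for b in clusters[j]
                )
                d = total / (len(clusters[i]) * len(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges


# Hypothetical word-weight vectors for four toy topics.
topics = [
    [0.60, 0.30, 0.05, 0.05],
    [0.55, 0.35, 0.05, 0.05],
    [0.05, 0.05, 0.30, 0.60],
    [0.05, 0.05, 0.38, 0.52],
]
merges = agglomerate(topics)
```

Here topics 0 and 1 merge first, then topics 2 and 3, and the final merge joins the two pairs into a single root. Each merge corresponds to one branch point of a dendrogram, with later merges representing broader groupings.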
Click on topic name(s) to see how attention to topics changes over time and to see a word cloud of key terms for each topic. Mouse over the time series plot to see headlines of articles closely associated with the topic each month.