Abstract

The first factor explaining stock price co-movements is the equity risk premium: Stocks co-move because they are exposed to equity risk. The most popular method investors use to classify stocks is by type of business. This is useful for comparison purposes, e.g., to assess the fundamental data of a company with respect to its peers, and it is also supposed to predict stock price co-movements. The underlying idea is that each industrial sector responds to macroeconomic factors, economic policies, and news in a different and unique way. This article focuses on the U.S. domestic equity market and tries to answer the following questions: Are industries still a predominant factor in explaining U.S. stock market co-movements? And how do they compare to data-driven categorizations? To answer these questions, the author derives a data-driven categorization of stocks, which relies exclusively on stock price time series, and compares it with the standard categorization.

Full text

The first factor explaining stock price co-movements is the equity risk premium: Stocks co-move because they are exposed to equity risk. But what is the second factor? The most popular method investors use to classify stocks is by type of business. This is useful for comparison purposes, e.g., to assess the fundamental data of a company with respect to its peers, and it is also supposed to predict stock price co-movements. The underlying idea is that each industrial sector responds to macroeconomic factors, economic policies, and news in a different and unique way. Many studies point out the importance of industries (Held [2009], Blitzer and Maitland [2009], Cavaglia et al. [2004] and Boillat et al. [2002]), suggesting that starting in the 1990s they replaced countries in developed international markets as a second factor explaining stock price co-movements. For the importance of countries before the 1990s, the interested reader can refer to Heston and Rouwenhorst [1995].

The economic explanation of this empirically observed phenomenon relies on the globalization and increased internationalization of large companies (Cavaglia et al. [2004] and Boillat et al. [2002]). It comes as no surprise that portfolio managers, risk managers, and asset allocators use industry classifications to drive investment decisions, to control portfolio risk, and to perform strategic asset allocation. The recent rise in assets allocated to sector exchange-traded funds (ETFs) (Lydon [2013]) clearly shows that market participants endorse industry groupings.

In this article, we focus on the U.S. domestic equity market and try to answer the following questions: Are industries still a predominant factor in explaining U.S. stock market co-movements? And how do they compare to data-driven categorizations? To answer these questions, we derive a data-driven categorization of stocks, which relies exclusively on stock price time series, and compare it with the standard categorization. The latter relies on the use of revenues as a key measure of company business activity, although earnings and market perception are often used as well. Previous studies analyzed the efficacy of industry classifications and compared different providers (Horrell and Meraz [2009] and Nadig and Crigger [2011]). Cluster analysis was used in Arnott [1980] to derive a multiple-factor risk model. More recently, it was used in Chan et al. [2007] to check the quality of industry classifications, reaching the conclusion that cluster analysis is as good as industry classification. The objective of this article is to extend those studies, analyzing the forecasting power of sectors and providing insight into the relation between cluster analysis and industry classification.

Anticipating our results, we find that data-driven categorizations of stocks show a consistent overlap with industries, suggesting that the latter are a predominant factor in explaining stock price co-movements. Surprisingly, not all sectors have the same forecasting power, and the cluster analysis appears to be able to identify exactly those sectors with higher predictive power. This sheds new light on the results reported by Chan et al. [2007]: Cluster analysis clearly endorses the use of industry classification. We conclude that although data-driven categorizations based on cluster analysis result in a somewhat better partitioning of the considered stocks, the difference is not statistically significant, and in the end the stability of industry classifications is a strong argument for their use from the investment and analysis perspective.

DATA AND METHODOLOGY

The dataset contains data for U.S. stocks from December 1999 to October 2013. The investable universe is given by the FTSE US index, which includes approximately 650 stocks of the largest U.S. companies in terms of free float.

Among available classification schemes, the Industry Classification Benchmark (ICB), supported by the ICB Database and maintained by FTSE International Limited (FTSE [2013]), and the Global Industry Classification Standard (GICS), developed by MSCI and Standard & Poor's, have been shown to provide superior industry classifications for fundamental analysis and evaluation purposes (Bhojraj et al. [2003] and Eberhart [2004]). Both are widely used among industry professionals.

With the exception of the consumer sectors, ICB and GICS classifications are very close. For the consumer sector, GICS takes a market-oriented approach, whereas ICB uses a production-oriented approach. Therefore, GICS differentiates consumer industries into discretionary and staples, whereas ICB divides into goods and services.

In the current study, we use the ICB classification as a proxy for the industry categorization. The ICB is organized in four hierarchical levels and currently comprises 10 industries, 19 supersectors (see Endnote 1), 41 sectors, and 114 subsectors. Our dataset has point-in-time data starting from 2006. As the classification changes over time, the use of point-in-time data reduces the risk of forward-looking bias. We exclude stocks with missing data, i.e., ICB classification or prices, and stocks that are the target of an acquisition, after its public announcement.

To assess whether industries are a predominant factor in explaining stock price co-movements, we analyze the results of a data-driven categorization of stocks. We use cluster analysis, which groups a set of objects in such a way that objects in the same group (called a cluster) are more similar to each other than to those in other groups. We use only stock price time series to detect similarities. The flowchart of the main algorithm, hierarchical clustering, is reported in Exhibit A1 of the Appendix. The robustness of the results has been checked using a K-means cluster algorithm. The interested reader can refer to Bramer [2013] for a global overview of data mining techniques or to Everitt et al. [2011] for a practical introduction to cluster analysis.
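
As an illustration of how a hierarchical clustering of stocks can be built from return series alone, the following minimal sketch may help. It does not reproduce the algorithm of Exhibit A1: the correlation-based distance, the average-linkage merge rule, and the simulated returns (function `simulate_returns`) are all assumptions made for the example.

```python
import numpy as np

def correlation_distance(returns):
    """returns: (T, n) array of daily returns; gives an (n, n) distance matrix."""
    corr = np.corrcoef(returns, rowvar=False)
    # Assumed distance: d = sqrt(2 * (1 - rho)); small for highly correlated stocks.
    return np.sqrt(np.clip(2.0 * (1.0 - corr), 0.0, None))

def average_linkage_clusters(dist, n_clusters):
    """Agglomerative clustering: merge the two closest clusters
    (average pairwise distance) until n_clusters remain."""
    clusters = [[i] for i in range(dist.shape[0])]
    while len(clusters) > n_clusters:
        best = (np.inf, 0, 1)
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = dist[np.ix_(clusters[a], clusters[b])].mean()
                if d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    labels = np.empty(dist.shape[0], dtype=int)
    for k, members in enumerate(clusters):
        labels[members] = k
    return labels

def simulate_returns(n_groups=3, per_group=4, n_days=252, seed=0):
    """Hypothetical data: a market factor plus one factor per group plus noise."""
    rng = np.random.default_rng(seed)
    market = rng.normal(0.0, 0.01, size=(n_days, 1))
    blocks = []
    for _ in range(n_groups):
        factor = rng.normal(0.0, 0.01, size=(n_days, 1))
        noise = rng.normal(0.0, 0.005, size=(n_days, per_group))
        blocks.append(market + factor + noise)
    return np.hstack(blocks)

returns = simulate_returns()
labels = average_linkage_clusters(correlation_distance(returns), n_clusters=3)
```

On the simulated data, stocks that share a group factor are much more correlated with each other than with the rest, so they should end up in the same cluster. The double loop is quadratic in the number of clusters at each merge and is meant for readability, not for a universe of 650 stocks.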

Every three months, we partition the investable universe into N clusters, using daily data over 252 trading days. For comparison purposes, N is chosen to match the number of industries.
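
The rolling schedule can be sketched as follows. This is only an illustration under two assumptions not stated in the text: three months are approximated by 63 trading days, and each estimation window is taken to be the 252 trading days ending at the rebalancing date. The function name `rolling_windows` is hypothetical.

```python
def rolling_windows(n_days, lookback=252, step=63):
    """Yield (start, end) index pairs into a daily return array.

    `end` is exclusive and marks the rebalancing date; the window
    covers the `lookback` trading days immediately before it.
    """
    for end in range(lookback, n_days + 1, step):
        yield end - lookback, end

# Example: 400 trading days of history give three rebalancing dates.
windows = list(rolling_windows(n_days=400))
```

Each pair can be used to slice the return matrix, e.g. `returns[start:end]`, before the clustering step is run on that slice.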

ARE INDUSTRIES A PREDOMINANT FACTOR IN EXPLAINING STOCK PRICE CO-MOVEMENTS?

If industries were the predominant factor explaining stock price co-movements, we would expect a large overlap between the data-driven categorization based on cluster analysis and industries. Using the methodology described in the previous section, we perform a rolling historical categorization of the investable universe and compute the exposure of each cluster to each ICB industry. Because we are considering 10 ICB industries, we use N=10 clusters. Our algorithm orders clusters randomly, so every month we reorder the clusters based on their similarity with the previous clusters. We expect our reordering algorithm to be less accurate for the noisier clusters.
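
One possible way to perform such a reordering is sketched below. The similarity measure (number of shared stocks) and the greedy one-to-one matching are assumptions chosen for illustration, since the text does not specify them; the sketch also assumes that both label vectors cover the same stocks in the same order.

```python
import numpy as np

def relabel_to_previous(prev_labels, new_labels, n_clusters):
    """Rename new clusters so that each one takes the label of the previous
    cluster it shares the most stocks with (greedy one-to-one matching)."""
    prev_labels = np.asarray(prev_labels)
    new_labels = np.asarray(new_labels)
    # overlap[q, p] = number of stocks in new cluster q and previous cluster p.
    overlap = np.zeros((n_clusters, n_clusters))
    for p, q in zip(prev_labels, new_labels):
        overlap[q, p] += 1
    mapping = {}
    for _ in range(n_clusters):
        q, p = np.unravel_index(np.argmax(overlap), overlap.shape)
        mapping[int(q)] = int(p)
        overlap[q, :] = -1.0  # new cluster q has been assigned
        overlap[:, p] = -1.0  # previous label p has been taken
    return np.array([mapping[int(q)] for q in new_labels])

prev = [0, 0, 0, 1, 1, 1, 2, 2, 2]
new = [2, 2, 2, 0, 0, 1, 1, 1, 1]   # same groups, arbitrary numbering
aligned = relabel_to_previous(prev, new, n_clusters=3)
# aligned is [0, 0, 0, 1, 1, 2, 2, 2, 2]
```

When two clusters overlap weakly with every previous cluster, the greedy choice becomes close to arbitrary, which is consistent with the expectation that reordering is less reliable for noisy clusters.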

Exhibit 1: Mapping Between Clusters and ICB Industries

Exhibit 1 reports the mapping between the clusters and the ICB industries with averaged historical percent weights, showing the strong connection between cluster and ICB industry categorizations. In particular, Utilities, Oil & Gas, Financials, Industrials, and Technology are predominantly grouped in distinct clusters. The only industries not detected are Basic Materials, which is split between Industrials and Oil & Gas, and Telecommunications. It is interesting to note the presence of a second cluster containing financial stocks. Although cluster Number 9 is strongly dominated by the Banks supersector, cluster Number 1 shows at times exposures to the Insurance, Real Estate, and Financial Services supersectors.
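
A mapping of this kind can be computed as a cross-tabulation of cluster and industry labels. The sketch below assumes equal weighting of stocks (the text does not say whether the percent weights are equal or capitalization weighted) and integer-coded labels; the function name is hypothetical.

```python
import numpy as np

def cluster_industry_weights(cluster_labels, industry_labels,
                             n_clusters, n_industries):
    """Percent of each cluster's stocks that belong to each industry.

    Rows are clusters, columns are industries; each non-empty row sums to 100.
    """
    cluster_labels = np.asarray(cluster_labels)
    industry_labels = np.asarray(industry_labels)
    counts = np.zeros((n_clusters, n_industries))
    # Count stocks for every (cluster, industry) pair.
    np.add.at(counts, (cluster_labels, industry_labels), 1)
    row_sums = counts.sum(axis=1, keepdims=True)
    # Avoid division by zero for empty clusters.
    return 100.0 * counts / np.where(row_sums == 0, 1, row_sums)

weights = cluster_industry_weights(
    cluster_labels=[0, 0, 1, 1, 1],
    industry_labels=[0, 1, 1, 1, 1],
    n_clusters=2,
    n_industries=2,
)
# weights[0] is [50, 50]; weights[1] is [0, 100]
```

Averaging such matrices over all rebalancing dates, after the clusters have been aligned across dates, would give averaged historical weights of the kind described above.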

These results are robust with respect to the time window length used to estimate the cluster composition and to the use of a different cluster algorithm, namely K-means.

CAN DATA-DRIVEN CLASSIFICATIONS DO BETTER?

We measure ex post intragroup and intergroup correlations and compare the results obtained by grouping by industries with those obtained by grouping according to clusters. As shown in Exhibit 2, intragroup correlation is very close for the ICB industry classification and for our data-driven categorization. Exhibit 2 also shows the rise in correlation experienced in stock markets since the latest financial crisis.

Exhibit 2: Intragroup Correlation for the Industry and Data-Driven Classification

Exhibit 3: Difference Between Intragroup and Intergroup Ex Post Correlation, Using Industry and Data-Driven Classification

Exhibit 3 shows the difference between intragroup and intergroup correlations, using industry and data-driven categorizations when correlations are computed on one year of daily data (Panel A) and three years of monthly data (Panel B). The higher the difference, the higher the explanatory power of the classification in differentiating stock price movements. Interestingly, the data-driven classification improves only marginally over the industry classification. This is mainly due to the partitioning into more homogeneous but smaller clusters: the average median cluster size is 39 securities, versus 46 for the industries. Moreover, the difference is not statistically significant. Finally, the attribution of stocks to clusters is very volatile, resulting in a less robust data-driven classification when compared with a very stable industry classification.
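
The measure itself can be sketched in a few lines. The version below pools all pairs of stocks, which is an assumption: the text does not say whether correlations are averaged pair by pair or first within each group. To obtain an ex post figure, the returns passed in would be those observed after the date at which the groups were formed.

```python
import numpy as np

def intra_inter_correlation(returns, labels):
    """Average pairwise correlation within groups, across groups,
    and their difference.

    returns: (T, n) array of returns; labels: length-n group labels.
    """
    corr = np.corrcoef(returns, rowvar=False)
    labels = np.asarray(labels)
    same_group = labels[:, None] == labels[None, :]
    # Keep each pair once and drop the diagonal.
    upper = np.triu(np.ones(corr.shape, dtype=bool), k=1)
    intra = corr[same_group & upper].mean()
    inter = corr[~same_group & upper].mean()
    return intra, inter, intra - inter

# Toy data: columns 0-1 move together, columns 2-3 move together,
# and the two pairs are uncorrelated with each other.
toy_returns = np.array([
    [1.0, 1.0, 1.0, 1.0],
    [-1.0, -1.0, 1.0, 1.0],
    [1.0, 1.0, -1.0, -1.0],
    [-1.0, -1.0, -1.0, -1.0],
])
intra, inter, diff = intra_inter_correlation(toy_returns, [0, 0, 1, 1])
# intra is 1, inter is 0, so the difference is 1 (up to rounding)
```

A grouping that captures no structure would give a difference close to zero, which is why a larger difference indicates a more informative classification.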

Exhibit 4: Difference Between Intragroup and Intergroup Correlations by ICB Industry

Exhibit 5: Main Results for the Second ICB Level

Exhibit 4 shows the difference between intragroup and intergroup correlations by ICB industry, demonstrating that historically, some sectors have contributed more than others to predicting stock price co-movements. For the data-driven categorization, we use the mapping reported in Exhibit 1. Values are averaged over the whole period.

The use of the industry classification at the second ICB level (supersectors) yields similar results: Data-driven classification is strongly connected to industry classification. Exhibit 5 shows the mapping obtained when using the second ICB level (Panel A) and the difference between intragroup and intergroup correlations by ICB supersector (Panel B). Values are averaged over the whole period. As expected, the difference between intragroup and intergroup correlations is higher when using supersectors: on average 0.090 for the industry classification and 0.097 for the data-driven classification, versus 0.077 and 0.084, respectively, when using industries.

CONCLUSION

We find that data-driven categorizations of large U.S. stocks show a consistent overlap with industries, supporting the idea that the latter are a predominant factor in explaining stock price co-movements. Interestingly, the forecasting power of clusters, defined as the capability to predict future stock correlation, is strongly connected with their ability to identify sectors. In the period under study, the data-driven categorization was not statistically better than industry classifications, whereas the stability of the latter is a strong argument for their use from the investment and analysis perspective. Industry categorizations make sense, and they have been a useful tool to group U.S. stocks: Our results clearly support the choice made by market participants to endorse industry groupings as a meaningful way to analyze the market, to make investment decisions, and to control portfolio risk.

ENDNOTE

1 As per announcements by ICB/FTSE, during the period under study, two amendments were made to the supersectors. The supersectors Equity Investment Instruments and Non-Equity Investment Instruments have been part of Financial Services since 2006, whereas Real Estate has been a supersector since 2007.

REFERENCES

Arnott, R.D. "Cluster Analysis and Stock Price Co-movement." Financial Analysts Journal, Vol. 36, No. 6 (1980), pp. 56-62.

Bhojraj, S., C.M.C. Lee, and D.K. Oler. "What's My Line? A Comparison of Industry Classification Schemes for Capital Market Research." Journal of Accounting Research, Vol. 41, No. 5 (2003), pp. 745-774.

Blitzer, D., and M. Maitland. "Sectors in the World of Global Investing." Journal of Indexes, Vol. 12, No. 5 (2009), pp. 34-39.

Boillat, P., N. Skowronski, and N. Tuchschmid. "Cluster Analysis: Application to Sector Indices and Empirical Validation." Financial Markets and Portfolio Management, Vol. 16, No. 4 (2002), pp. 467-486.

Bramer, M. Principles of Data Mining, 2nd ed. New York: Springer, 2013.

Cavaglia, S., J. Diermeier, V. Moroz, and S. De Zordo. "Investing in Global Equities." The Journal of Portfolio Management, Vol. 30, No. 3 (2004), pp. 88-94.

Chan, L.K.C., J. Lakonishok, and B. Swaminathan. "Industry Classifications and Return Co-movement." Financial Analysts Journal, Vol. 63, No. 6 (2007), pp. 56-70.

Eberhart, A.C. "Equity Valuation Using Multiples." The Journal of Investing, Vol. 13, No. 2 (2004), pp. 48-54.

Everitt, B.S., S. Landau, M. Leese, and D. Stahl. Cluster Analysis, 5th ed. London: Wiley Series in Probability and Statistics, 2011.

FTSE. "A Guide to Industry Classification Benchmark (Equity)." FTSE Guide, Version 1.6 (November 2013), pp. 1-9.

Held, J. "Why It Is (Still) All About Sectors." Journal of Indexes, Vol. 12, No. 5 (2009), pp. 10-17.

Heston, S.L., and K.G. Rouwenhorst. "Industry and Country Effects in International Stock Returns." The Journal of Portfolio Management, Vol. 21, No. 3 (1995), pp. 53-58.

Horrell, G., and R. Meraz. "Test-Driving Industry Classifications." Journal of Indexes, Vol. 12, No. 5 (2009), pp. 26-33.

Lydon, T. "Fidelity Sector ETFs Could Impact ETF Landscape." ETF Trends, October 25, 2013: www.etftrends.com/2013/10/fidelity-sector-etfs-could-impact-etf-landscape/.

Nadig, D., and L. Crigger. "Signal From Noise." Journal of Indexes, Vol. 14, No. 2 (2011), pp. 40-55.

DANIELE LAMPONI

is a portfolio manager and researcher in Geneva, Switzerland.

daniele.lamponi@gmail.com

Appendix

Copyright Euromoney Institutional Investor PLC Summer 2014