On characterizing scale effect of Chinese mutual funds via text mining
SIGNAL PROCESSING
DOI:
10.1016/j.sigpro.2015.05.018
Publication year:
JUL 2016
Abstract
This paper investigates the correlation between the scale and return of mutual funds in China by text mining a large volume of online financial reports. We crawl the webpages of all Chinese open-end mutual funds from a well-known financial website and parse them to obtain time-series data of fund scales and returns. We argue that, because fund scales follow a long-tail distribution, examining the correlation directly at the individual level is not appropriate; rather, it should be examined at the group level, with funds grouped by scale and different market conditions taken into consideration. To illustrate this, we start with a data-fitting test demonstrating that the tail of the fund-scale distribution is best fitted by a distribution between power-law and log-normal. Hence, categorizing mutual funds by equal scale intervals could lead to fund groups of substantially different sizes, and the subsequent results are thus prone to bias. We therefore introduce K-means clustering for fund categorization, which enables reliable examination of the correlation between fund scale and return. The empirical study unveils some interesting findings on the scale effect of funds under different market conditions. These findings highlight the uniqueness of emerging markets while providing guidelines for exploiting big data analytics in financial studies. (C) 2015 Elsevier B.V. All rights reserved.
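The grouping argument in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The data are synthetic (log-normal "scales" and unrelated Gaussian "returns"), the helper names `kmeans_1d` and `equal_width_labels` are hypothetical, and clustering on the logarithm of scale is an assumption made here for illustration, not something the abstract states. The sketch compares group sizes under equal-interval binning with those under a plain one-dimensional K-means, then reports a group-level mean return.

```python
import math
import random
from collections import Counter


def kmeans_1d(values, k, iters=100):
    """Plain Lloyd's algorithm on 1-D data; returns (labels, centers)."""
    ordered = sorted(values)
    # Deterministic initialization: centers at evenly spaced quantiles.
    centers = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assignment step: each value goes to its nearest center.
        new_labels = [min(range(k), key=lambda j: abs(v - centers[j])) for v in values]
        # Update step: each center becomes the mean of its members.
        new_centers = []
        for j in range(k):
            members = [v for v, lab in zip(values, new_labels) if lab == j]
            new_centers.append(sum(members) / len(members) if members else centers[j])
        if new_labels == labels and new_centers == centers:
            break
        labels, centers = new_labels, new_centers
    return labels, centers


def equal_width_labels(values, k):
    """Assign each value to one of k equal-width intervals over [min, max]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [min(int((v - lo) / width), k - 1) for v in values]


# Hypothetical data, not taken from the paper.
random.seed(0)
scales = [random.lognormvariate(0.0, 1.5) for _ in range(500)]
returns = [random.gauss(0.0, 1.0) for _ in scales]

K = 3
equal_sizes = Counter(equal_width_labels(scales, K))
# Assumption: cluster on log(scale) so the long tail does not dominate distances.
labels, centers = kmeans_1d([math.log(s) for s in scales], K)
kmeans_sizes = Counter(labels)

print("equal-interval group sizes:", sorted(equal_sizes.values(), reverse=True))
print("K-means group sizes:       ", sorted(kmeans_sizes.values(), reverse=True))

# Group-level view: mean return within each K-means group.
for g in sorted(kmeans_sizes):
    members = [r for r, lab in zip(returns, labels) if lab == g]
    print(f"group {g}: n={len(members)}, mean return={sum(members) / len(members):.3f}")
```

With long-tailed scales, equal-interval binning puts nearly all funds into the lowest interval, whereas the clustered groups are far closer in size, which is the bias the abstract warns about.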