On Clustering Financial Time Series - Beyond Correlation

Financial correlation matrices are noisy, and most of their coefficients are meaningless. Random matrix theory (RMT) suggests that their intrinsic dimension is much lower than O(N^2). Clustering can help reduce the dimension, and it can also exploit information other than mere correlation.
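To make the noise claim concrete: for a correlation matrix built from pure unit-variance noise, random matrix theory predicts that the eigenvalues fall between the Marchenko-Pastur edges 1 + N/T ± 2 sqrt(N/T). The sketch below evaluates those edges for the poster's N = 560 assets and T = 2500 days and checks numerically that the density integrates to one. The function names and the midpoint-rule grid size are illustrative choices, not taken from the poster.

```python
import math


def mp_edges(n_assets, n_obs):
    """Edges of the Marchenko-Pastur support for unit-variance noise."""
    q = n_assets / n_obs
    return 1 + q - 2 * math.sqrt(q), 1 + q + 2 * math.sqrt(q)


def mp_density(lam, n_assets, n_obs):
    """Marchenko-Pastur density at lam (zero outside the support)."""
    lam_min, lam_max = mp_edges(n_assets, n_obs)
    if not lam_min < lam < lam_max:
        return 0.0
    scale = (n_obs / n_assets) / (2 * math.pi)
    return scale * math.sqrt((lam_max - lam) * (lam - lam_min)) / lam


def integrate_density(n_assets, n_obs, steps=20000):
    """Midpoint-rule integral of the density over its support."""
    lam_min, lam_max = mp_edges(n_assets, n_obs)
    h = (lam_max - lam_min) / steps
    return h * sum(
        mp_density(lam_min + (i + 0.5) * h, n_assets, n_obs)
        for i in range(steps)
    )


N, T = 560, 2500  # values quoted in the poster
lam_min, lam_max = mp_edges(N, T)
total_mass = integrate_density(N, T)
print(f"lambda_min = {lam_min:.3f}, lambda_max = {lam_max:.3f}")
print(f"integral of the density = {total_mass:.4f}")
```

With these sizes the edges are roughly 0.28 and 2.17, so any eigenvalue of the empirical matrix well above 2.17 is a candidate for carrying information rather than noise.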


ON CLUSTERING FINANCIAL TIME SERIES
Gautier Marti, Philippe Donnat and Frank Nielsen

NOISY CORRELATION MATRICES

Let $X$ be the $N \times T$ matrix storing the standardized returns of $N = 560$ assets (credit default swaps) over a period of $T = 2500$ trading days. Then the empirical correlation matrix of the returns is

$$C = \frac{1}{T} X X^\top.$$

We can compute the empirical density of its eigenvalues,

$$\rho(\lambda) = \frac{1}{N} \frac{dn(\lambda)}{d\lambda},$$

where $n(\lambda)$ counts the number of eigenvalues of $C$ less than $\lambda$. From random matrix theory, the Marchenko-Pastur distribution gives the limit distribution as $N \to \infty$, $T \to \infty$ with $T/N$ fixed. It reads

$$\rho(\lambda) = \frac{T/N}{2\pi} \, \frac{\sqrt{(\lambda_{\max} - \lambda)(\lambda - \lambda_{\min})}}{\lambda},$$

where $\lambda_{\min}^{\max} = 1 + N/T \pm 2\sqrt{N/T}$ and $\lambda \in [\lambda_{\min}, \lambda_{\max}]$.

Figure 1: Marchenko-Pastur density vs. empirical density of the correlation matrix eigenvalues (horizontal axis: $\lambda$; vertical axis: $\rho(\lambda)$)

Notice that the Marchenko-Pastur density fits the empirical density well, meaning that most of the information contained in the empirical correlation matrix amounts to noise: only 26 eigenvalues are greater than $\lambda_{\max}$. The highest eigenvalue corresponds to the ‘market’; the 25 others can be associated with ‘industrial sectors’.

CLUSTERING TIME SERIES

Given a correlation matrix of the returns (Figure 2), one can re-order the assets using a hierarchical clustering algorithm to make the hierarchical correlation pattern apparent (Figure 3), and finally filter the noise according to the correlation pattern (Figure 4).

Figure 2: An empirical and noisy correlation matrix

Figure 3: The same noisy correlation matrix re-ordered by a hierarchical clustering algorithm

Figure 4: The resulting filtered correlation matrix

BEYOND CORRELATION

Sklar’s Theorem. For any random vector $X = (X_1, \ldots, X_N)$ having continuous marginal cumulative distribution functions $F_i$, its joint cumulative distribution $F$ is uniquely expressed as

$$F(X_1, \ldots, X_N) = C(F_1(X_1), \ldots, F_N(X_N)),$$

where $C$, the multivariate distribution of uniform marginals, is known as the copula of $X$.

Figure 5: ArcelorMittal and Société générale prices are projected on dependence ⊕ distribution space; notice their heavy-tailed exponential distribution.

Let $\theta \in [0, 1]$. Let $(X, Y) \in \mathcal{V}^2$. Let $G = (G_X, G_Y)$, where $G_X$ and $G_Y$ are the marginal cdfs of $X$ and $Y$ respectively. We define the following distance:

$$d_\theta^2(X, Y) = \theta \, d_1^2(G_X(X), G_Y(Y)) + (1 - \theta) \, d_0^2(G_X, G_Y),$$

where

$$d_1^2(G_X(X), G_Y(Y)) = 3 \, \mathbb{E}\left[ |G_X(X) - G_Y(Y)|^2 \right]$$

and

$$d_0^2(G_X, G_Y) = \frac{1}{2} \int_{\mathbb{R}} \left( \sqrt{\frac{dG_X}{d\lambda}} - \sqrt{\frac{dG_Y}{d\lambda}} \right)^2 d\lambda.$$

CLUSTERING RESULTS & STABILITY

Figure 6: (Top) The returns correlation structure appears more clearly using rank correlation; (Bottom) Clusters of returns distributions can be partly described by the returns volatility. The accompanying histogram, titled "Standard Deviations Histogram", plots the number of occurrences against the standard deviation in basis points.

Figure 7: Stability test on Odd/Even trading days subsampling: our approach (GNPR) yields more stable clusters with respect to this perturbation than standard approaches (using Pearson correlation or L2 distances).
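The distance d_θ defined above can be estimated from two return samples of equal length. The following is a minimal sketch, not the authors' GNPR implementation. It assumes that G_X(X) is approximated by normalized ranks, that d_0 is read as the Hellinger distance between the marginal densities, and that those densities are approximated by histograms on a shared grid. The helper names, the choice of 30 bins and the synthetic Gaussian data are all hypothetical.

```python
import math
import random


def empirical_cdf_values(sample):
    """Approximate G(x_t) by rank(x_t) / T (assumes no ties)."""
    order = sorted(range(len(sample)), key=lambda i: sample[i])
    ranks = [0] * len(sample)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return [r / len(sample) for r in ranks]


def d1_squared(x, y):
    """Dependence term: 3 * E[|G_X(X) - G_Y(Y)|^2], estimated from ranks."""
    u, v = empirical_cdf_values(x), empirical_cdf_values(y)
    return 3.0 * sum((a - b) ** 2 for a, b in zip(u, v)) / len(x)


def bin_probabilities(sample, lo, hi, bins):
    """Fraction of observations falling in each of `bins` equal-width bins."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for value in sample:
        k = min(int((value - lo) / width), bins - 1)  # clamp the right edge
        counts[k] += 1
    return [c / len(sample) for c in counts]


def d0_squared(x, y, bins=30):
    """Distribution term: squared Hellinger distance between histograms."""
    lo, hi = min(min(x), min(y)), max(max(x), max(y))
    p = bin_probabilities(x, lo, hi, bins)
    q = bin_probabilities(y, lo, hi, bins)
    return 0.5 * sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))


def d_theta_squared(x, y, theta=0.5, bins=30):
    """Convex combination of the dependence and distribution terms."""
    if not 0.0 <= theta <= 1.0:
        raise ValueError("theta must lie in [0, 1]")
    return theta * d1_squared(x, y) + (1.0 - theta) * d0_squared(x, y, bins)


# Synthetic illustration (not the poster's CDS data).
rng = random.Random(0)
common = [rng.gauss(0, 1) for _ in range(2000)]
x = [c + 0.3 * rng.gauss(0, 1) for c in common]
y = [5.0 * (c + 0.3 * rng.gauss(0, 1)) for c in common]  # dependent on x, wider
z = [rng.gauss(0, 1) for _ in range(2000)]               # independent of x

print("x vs y:", d1_squared(x, y), d0_squared(x, y))
print("x vs z:", d1_squared(x, z), d0_squared(x, z))
```

In this toy setup x and y move together but have very different scales, so their dependence term is small and their distribution term is large, while x and z show the opposite pattern. Since G_X(X) and G_Y(Y) are uniform on [0, 1], the dependence term equals (1 − ρ_S)/2, where ρ_S is Spearman's rank correlation, which connects it to the rank correlation mentioned in Figure 6.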
