The document discusses using Groovy for quantitative finance applications. It motivates Groovy due to its dynamic programming capabilities, integration with Java/JEE ecosystems, and developer community. It describes using a distributed cache with property listeners to manage invalidation across nodes. It also discusses exposing the cache through a Grails application for reporting. Finally, it argues that future of finance involves distributed computing with domain-specific languages.
GR8Conf 2009: Groovy in Fiance Case Study by Jonathan Felch
1. Groovy Finance:
Grid Computing
and Computational Finance
Jonathan Felch
2. Agenda
− Introduction: Speaker and Project
− Motivations for Groovy in Quantitative Finance
Business Problem
Math / Data / Business / Languages
Questions Regarding Groovy Performance
− Data Grid and Dynamic Programming
Master / Worker Model, MapReduce, and ForkJoin
Property Listeners, Dependency Graphs, Invalidation
− Grails: Exposing the cache to the world
− Revolution: Distributed Computing and DSLs
3. Introduction
Jonathan Felch
− NASDAQ Software Architect 1997-99
− Lehman Brothers Global e-Commerce Architect 1999-2000
− Venture Capital Associate @ GS / BCG JV 2000-2001
− Quantitative Systems @ Syntax Capital 2005-2006
− VP Quantitative Systems @ Credit Suisse 2006-2009
− Independent Work in Financial DSLs / Data Grids 2009+
4. Groovy Finance: A Case Study
Started at Syntax Capital Management in 2005
− Capital Structure Arbitrage
− Almost Every Asset Class
− Focus on Credit Markets, Macro and Single Name
Matured at an Investment Bank 2007-2009
− Groovy To The Rescue
− Grails and Integration
− Meta-Programming and A Distributed Cache
− Unexpected Upside: A Quant Finance DSL
5. Groovy Finance: Architecture
Industrial Strength Back-End In Java
Enterprise Software Integration
− Data Acquisition Focus / Logical Focus
SOAP, JMS, JDBC, CSV files, ZIP files, delimited files
Bloomberg, Markit Partners, OPRA, Ivy, Barra, Imagine
− Data Delivery Solutions
Excel Plug-In
Reports in HTML, PDF, CSV, JSON, XML, JMS
DSL Console with JSyntaxPane
6. Agenda
− Introduction: Speaker and Project
− Motivations for Groovy in Quantitative Finance
Business Problem
Math / Data / Business / Languages
Questions Regarding Groovy Performance
− Data Grid and Dynamic Programming
− Grails: Exposing the cache to the world
− Revolution: Distributed Computing and DSLs
7. What is Quant Finance ?
A quant designs and implements software and
mathematical models for the pricing of
derivatives, assessment of risk, or predicting
market movements
8. What's The Problem: The Math
Quant Finance Models are Wrong
− Even The Best Models Fail, Failure Is Expensive
Assumption of Quantitative Finance Are Wrong
− Market Are Social Phenomena, Not Random Walks
Quant Finance Models Change
− In Bad Times, They Change A Lot
Coding Almost everything in C++ takes forever
Coding Everything Else in VBA doesn't scale
9. What's The Problem: The Market
Market Structures Drive Financial Data
− Different Calendars, Different Measures
− Equities and Government Auctions are Transparent
Also options, some bonds, some preferred
− Exotic and Credit Markets are Illiquid, No
Transparency
Some of products are not really 'securities'
Identifiers are ridiculous, don't work, unclear
− ISIN, CUSIP, SEDOL, Tickers, ADRs, …
− Observation Dates and Effective Dates
10. What's The Problem: The Data
Lots of Data, Lots of Math, Lots of Products
− Credit Market Subset
1500 Companies / 2500 Curves / 10 Indices & Tranches
10,000 Liquid Bonds / 2,000 Liquid Converts / 2,000 Loans
1500 Liquid Equities / 169 point vol surface to start
− Derivatives and Quant strategies have many metrics
for each time series observation
Securities can't be compared on price
Relative values metrics are complex and there are many
11. What's The Problem: The Traders
Great Trades Come From Half-Baked Ideas
− Fully Baked Ideas Have Already Been Priced In
Traders Do Not Know What They Want
Whatever They Think They Want, They Wanted
It Yesterday
Whatever They Want Today, They Will Never
Use Again
12. Languages of Quant Finance
Commonly used languages of Quant Finance
− C++ (The Dominant Industrial Strength Language)
− VBA
− Matlab, SAS, STATA, S, and R
− C#
− Java (Most limited to Fixed Income and Web)
Up and Coming / Research Languages of
Interest to Quant Finance
− Fortress, Scala, Groovy, Python, F#, and Erlang
14. Why Groovy ?
Pith, Speedy Development Cycle
− Made for half baked idea
Dynamic Programming
− Meta-Programming, DSL Support
Java / JEE / Enterprise
− Massive Ecosystem, Performance Fall back
Community
− Grails, Griffon, lots of small projects
15. Other Tools That Helped Out
Google Collections and MapReduce
VisualNumeric's JSML / UJMP / Weka
Terracotta
JSR 166 / Fork-Join
R Language
16. Groovy Performance: Overview
JIT and Operator Overloading
− Number Crunching and Monte Carlo
− GroovyMatrix
Smart Programming
− Algorithms versus Language
− Grid Computing
− Caching
'Thought Experiments' before production code
− Groovy 'gets the idea right' faster than C++
18. Agenda
− Introduction: Speaker and Project
− Motivations for Groovy in Quantitative Finance
Business Problem
Math / Data / Business / Languages
Project Background 2005-2007
− Data Grid and Dynamic Programming
Master / Worker Model, MapReduce, and ForkJoin
Property Listeners, Dependency Graphs, Invalidation
− Grails: Exposing the cache to the world
− Revolution: Distributed Computing and DSLs
19. Databases versus Caching
Traditional Model: Hibernate
− Data Model = Database plus Cache of POJOs
All Objects of the name class share structure
No (Persistent) Dynamic Properties on 1st class objects
All first class objects (query-able) lived in the database
Our Model: All POJOs → Tuples or Nodes
− Tuples of same class may 'grow' existing structure
− Tuples do not all have to come from data
Questions about what does and does not belong in
database
Query Language = Gpath / Xpath + Hibernate
Includes dynamic properties and calculated values
20. Distributed Cache and
MetaProgramming I
Terracotta for the shared memory and
synchronization
− Integration point for Hibernate and Hibernate
Cache
− Integration point for Groovy Data Adapters
All First Class Objects are decomposed from
Java or Groovy objects to a 'Tuple'
− Not perfectly named, but a simple data structure
than implements Map and List
− Usable with XPATH
− Small Set of Primitives optimized for Terracotta
21. Distributed Cache and
Meta-Programming II
Everything is a Property
− Data and methods
− Behavior follows a mathematical model
− Property listeners manage invalidation
Missing Methods / Missing Properties
− Widely used calculations and method results stored
as property values so avoid redundant calculation
− Calculated values are never stored in the database
22. Distributed Cache and
Meta Programming III
Tuple Class
− Much like a Groovy Class
− Joins objects like associations / relations in
Hibernate
− Defines raw types / names / converters
− Defines property finders / chained finders /
methods
Missing Methods / Missing Properties
− 'Missing' Events Trigger Chainable Closures
− Missing Events invoker 'Finders' and 'Builders'
23. Distributed Cache and
Meta Programming IV
Do We Even Want A Database ??
− Sometimes Accessing Data Remotely Works Just
As Well
− Sometimes Pulling Data from Flat Files On
Demand works Just As Well
− Sometimes Calculating from Values from old inputs
makes more sense than persisting it (normal forms)
'Active' Cache As a Integration Space
− Miss in a cache request trigger a chain of closures
24. Agenda
− Speaker and Project
− Motivations for Groovy in Quantitative Finance
Business Problem
Math / Data / Business / Languages
Questions Regarding Groovy Performance
− Data Grid and Dynamic Programming
Master / Worker Model, MapReduce, and ForkJoin
Property Listeners, Dependency Graphs, Invalidation
− Grails: Exposing the cache to the world
− Revolution: Distributed Computing and DSLs
25. Grails As A Integration Hub
Controller requests arrive via JSON, XML, JMS
− R Language Client: JSON → Grails
− Excel Client: JSON → Grails Over HTTP
− Excel RTD: JSON → Grails over JMS
− SwingX Table (Real-Time) JSON → Grails via JMS
− SwingX Table JSON → Grails via HTTP
Affiliated Business Units:
− XML Web Services from dot Not
− Matlab dot Net
26. Cache Logic: Reporting
def r = builder.reportWith(Bond.class, ”bond.cdsBasis < 0”) {
attr {
expression = 'bond.ticker'
name = 'Tkr'
}
attr {
expression = 'bond.coupon'
name = 'Tkr'
}
attr {
expression = 'bond.maturity'
name = 'Tkr'
}
attr {
expression = 'bond.spot?.price?.last'
name = 'Px'
}
attr {
expression = 'bond.spot?.yield?.last'
name = 'Px'
}
attr {
expression = 'bond.spot?.zspread?.last'
name = 'Px'
}
attr {
expression = 'bond.spot?.cdsBasis?.last'
name = 'Px'
}
}
30. Agenda
− Speaker and Project
− Motivations for Groovy in Quantitative Finance
Business Problem
Math / Data / Business / Languages
Questions Regarding Groovy Performance
− Data Grid and Dynamic Programming
Master / Worker Model, MapReduce, and ForkJoin
Property Listeners, Dependency Graphs, Invalidation
− Grails: Exposing the cache to the world
− Revolution: Distributed Computing and DSLs
31. The Revolution I
Technology And Finance
IBM
PDP-11 SPARC EMAIL WEB XML GRID
PC
The Network Is The Computer
− We Can't Agree On Which End Of The Byte
Comes First (Big Endian / Little Endian)
− We Can't Agree On Character Set and Line
Delimiters (EBCIDEC, ASCII, Unicode)
− We Can't Agree How to Share Files
− We Can't Agree How To Share Code
32. The Future Is Multi Core
Performance And Scalability
− Calculations and Queries: Once or Many ?
− Weak HashMaps and Distributed Cache
− Actors and The Erlang / Scala Metaphor
−
Computational Versus Data Grid
− Should Data Move Around The Network or Code ?
− Joins Across Data Stores
− Joins Across Computation Efforts
33. The Future Is Multi Core
DASEL = DSL + Financial Ecosystem
One DSL or Many ?
− Report Generation
− Simulation
− Portfolio Generating Functions
Uses
− Analytic Trading
− Risk Analytics
− Opportunity Hunting / Discovery of Dislocations
34. Victory Means Never Having To Say ....
Moving Code Around The Network
− Moving code to where the data is
− Not moving data to where the code is
Security And Strength
− The Power Of The Virtualized Runtime
The Future Is Multi-Core
− Version II of Moore's Law and the End of C++
35. Demo: (Time Permitting)
Groovy Matrix
A High Performance Example of
Groovy Threads and a Math DSL