For many securities organizations, post-trade processing is expensive, cumbersome, and time-consuming. This is due in part to the massive volumes of data required to process a trade and the limited agility of the technology many organizations rely on today. To create efficiencies and move faster, many financial services organizations are working with AWS to implement post-trade solutions built with AWS storage services (S3 and Glacier) and big data capabilities (Athena, EMR, Redshift, and QuickSight). In this session, AWS will walk through a trade capture and regulatory reporting solution that uses these services. We will also provide guidance on obtaining data-driven insights (from pixels to pictures), bolstering encryption with AWS KMS, and maintaining transparency and control with Amazon CloudWatch and AWS CloudTrail (which also help meet SEC Rule 613, the rule requiring creation of a comprehensive consolidated audit trail).
3. Today’s regulatory reporting landscape
Financial institutions face challenges capturing, cleaning, organizing, and reporting for an
array of regulators and regulatory frameworks along with new expectations of fine-grained,
n-dimensional reporting with data lineage and governance controls.
Regulators and frameworks include: EMA, PRA, Treasury, FDIC, FFIEC, Basel, Dodd-Frank, NMS, MiFID II, BCBS 239, CCAR, ESMA, RDA, FR Y-9C.
4. Regulatory reporting challenges
What is required
• Platform with limitless storage and cost-effective archiving
• Tools for effective data management and transformation with security and controls
• Ability to scale up and down to meet market conditions
• Diversity of formats
• Volume of data
• Single record of truth
• Stringent SLAs (and fines)
• Evolving requirements
5. Legacy data architecture challenges
Causes of complexity
• No lineage back to source data
• Data stored in multiple disconnected data silos
• High level of operational complexity
• Competing resources for the same data infrastructure and DB servers
• Complex ETL processes
Effect: lack of agility
• Slow to onboard new data sources
• Slow to adapt to data format changes
• Slow to build new types of reports
• Slow to share data across teams and with regulators
6. Case Study: SEC Rule 613
The Consolidated Audit Trail will track orders throughout their lifecycle
and identify the broker-dealers handling them, thus allowing regulators to
more efficiently track activity in Eligible Securities throughout the U.S. markets.
The primary goal of Securities and Exchange Commission (SEC) Rule 613 is to
improve the ability of the SEC and the Self-Regulatory Organizations (SROs)
to oversee trading in the US securities markets.
Beyond existing framework
• Uniquely identify each customer
• Listed options, market making quotes, post-trade allocations
• Improved granularity and timestamps
Refer to SEC Rule 613, available at https://www.sec.gov/News/PressRelease/Detail/PressRelease/1365171483188, for more details.
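To make the Rule 613 requirements above concrete, here is an illustrative sketch of a simplified order-event record with the kinds of fields the rule calls for: a unique customer identifier, the broker-dealer handling the order, the lifecycle event type, and a fine-grained timestamp. The field names are our assumptions for illustration, not the actual CAT technical specification.

```python
from dataclasses import dataclass, asdict

# Illustrative only: field names are assumptions, not the CAT specification.
@dataclass(frozen=True)
class OrderEvent:
    event_type: str     # order lifecycle step, e.g. "NEW", "ROUTE", "CANCEL", "EXECUTE"
    order_id: str       # order identifier, unique within the reporting firm
    customer_id: str    # unique customer identifier required by Rule 613
    broker_dealer: str  # broker-dealer handling the order at this step
    symbol: str
    timestamp_ns: int   # epoch nanoseconds: improved timestamp granularity

event = OrderEvent("NEW", "ORD-1", "CUST-42", "BD-001", "AMZN",
                   1_500_000_000_000_000_000)
print(asdict(event)["customer_id"])  # → CUST-42
```

Because every lifecycle event carries the same customer and order identifiers, regulators can join events across broker-dealers to reconstruct an order's full history.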
8. Trade reporting sequence of events
[Architecture diagram: an on-prem application sends post-trade data in FIX format to the data lake; ETL processes the data and writes the processed output to an outbound store; the data warehouse loads from the store; visualization and ad hoc query tools run queries for analysis and insight; exchange data is also ingested.]
9. AWS big data platform
[Diagram: AWS services grouped into Collect, Orchestrate, Store, and Analyze stages: Amazon Kinesis, AWS Direct Connect, AWS Import/Export, AWS Snowball, AWS IoT, AWS Database Migration Service, AWS Data Pipeline, AWS Lambda, AWS Glue, Amazon SWF, Amazon SNS, Amazon S3, Amazon Glacier, Amazon DynamoDB, Amazon Aurora, Amazon EC2, Amazon EMR, Amazon Redshift, Amazon Athena, Amazon Kinesis Analytics, Amazon Machine Learning, Amazon QuickSight.]
10. AWS security controls for big data
Security
• AWS Identity and Access Management (IAM) policies
• Bucket policies
• Access Control Lists (ACLs)
• Private VPC endpoints to Amazon S3
• SSL endpoints
Encryption
• Server-side encryption (SSE-S3)
• S3 server-side encryption with provided keys (SSE-C, SSE-KMS)
• Client-side encryption
Compliance
• Bucket access logs
• Lifecycle management policies
• ACLs
• Versioning & MFA deletes
• Certifications: HIPAA, PCI, SOC 1/2/3, etc.
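A common way to combine the bucket-policy and server-side-encryption controls listed above is a bucket policy that rejects any upload lacking SSE-KMS. A minimal sketch (the bucket name is a placeholder, not from the deck):

```python
import json

# Sketch: an S3 bucket policy that denies PutObject requests that do not
# specify SSE-KMS server-side encryption. "example-post-trade-lake" is a
# placeholder bucket name.
deny_unencrypted_uploads = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyUnencryptedObjectUploads",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:PutObject",
            "Resource": "arn:aws:s3:::example-post-trade-lake/*",
            "Condition": {
                "StringNotEquals": {
                    "s3:x-amz-server-side-encryption": "aws:kms"
                }
            },
        }
    ],
}

print(json.dumps(deny_unencrypted_uploads, indent=2))
```

With this policy attached, encryption is enforced at the platform level rather than relying on every upstream application to remember to encrypt.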
11. Example CAT reporting architecture on AWS
[Architecture diagram: an internal on-prem application performs multi-part upload of encrypted data to an S3 data lake; transient EMR clusters run ETL to produce cleansed, formatted, split, and compressed output in an S3 data warehouse; transient EMR clusters perform event sequencing to produce the CAT output; Amazon Glacier provides WORM storage; AWS KMS manages keys (BYO key, with an optional on-prem HSM); Amazon CloudWatch alarms and AWS CloudTrail provide monitoring and auditing.]
12. Example trade reporting architecture on AWS
[Architecture diagram: as in the CAT architecture, an internal on-prem application performs multi-part upload of encrypted data to an S3 data lake; transient EMR clusters run ETL to produce cleansed, formatted, split, and compressed output in an S3 data warehouse; Amazon Glacier provides WORM storage; Amazon Athena and Amazon QuickSight query and visualize the output; AWS KMS (BYO key, optional on-prem HSM), Amazon CloudWatch alarms, and AWS CloudTrail provide key management, monitoring, and auditing.]
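The Glacier WORM-storage tier in the architectures above is typically driven by an S3 lifecycle configuration. A minimal sketch in the shape boto3's `put_bucket_lifecycle_configuration` accepts; the prefix, day counts, and retention period are illustrative assumptions, not values from the deck:

```python
# Sketch: an S3 lifecycle configuration that ages processed trade output into
# Glacier for long-term archival. Prefix and day counts are assumptions.
lifecycle_configuration = {
    "Rules": [
        {
            "ID": "archive-post-trade-output",
            "Filter": {"Prefix": "output/"},
            "Status": "Enabled",
            # Move objects to Glacier 90 days after creation.
            "Transitions": [{"Days": 90, "StorageClass": "GLACIER"}],
            # Expire after ~7 years, an illustrative retention period.
            "Expiration": {"Days": 2555},
        }
    ]
}
```

In practice this dict would be passed to `s3.put_bucket_lifecycle_configuration(Bucket=..., LifecycleConfiguration=lifecycle_configuration)`, so archiving happens automatically with no ETL code involved.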
13. Amazon Athena: cost example
12 queries
Each query scanned 739 GB (non-optimized data)
12 × 739 = 8,868 GB scanned = 8.868 TB
Cost: $5/TB
Total cost: 8.868 × 5 = $44.34
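The arithmetic above can be sketched as a small helper. This is a simplified model (it ignores Athena's per-query scan minimums); the $5/TB rate is the one quoted on the slide:

```python
# Sketch: reproduce the Athena cost arithmetic from the slide.
# Athena bills per terabyte of data scanned ($5/TB in this example).
ATHENA_PRICE_PER_TB = 5.00  # USD per TB scanned

def athena_query_cost(num_queries: int, gb_scanned_per_query: float) -> float:
    """Estimated cost in USD for a batch of Athena queries."""
    tb_scanned = num_queries * gb_scanned_per_query / 1000  # decimal TB
    return round(tb_scanned * ATHENA_PRICE_PER_TB, 2)

# The slide's example: 12 queries, each scanning 739 GB of non-optimized data.
print(athena_query_cost(12, 739))  # → 44.34
```

Because Athena bills on bytes scanned, converting the same data to a columnar format such as Parquet (as on the next slide) shrinks the scanned volume and the bill proportionally.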
14. Solution features overview
[Diagram: generate a billion+ FIX messages; land them in an S3 data lake (the source of truth); process the data with multiple tools, including an EMR cluster running Spark ETL jobs; write optimized formats (CSV, Parquet) to an S3 output area; analyze with Amazon Athena, BI tools, and Amazon QuickSight visualizations.]
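The FIX messages generated into the data lake above are tag=value pairs separated by the SOH control character, which ETL flattens into CSV/Parquet rows. A minimal sketch of building and flattening such a message; the tags shown are a small illustrative subset, not a complete FIX 4.4 execution report:

```python
# Sketch: build and flatten a minimal FIX-style message of the kind the
# solution ingests into the S3 data lake. Illustrative subset of tags only.
SOH = "\x01"  # FIX field delimiter

def fix_checksum(body: str) -> str:
    """Standard FIX checksum: byte sum of the message up to tag 10, mod 256."""
    return f"{sum(body.encode('ascii')) % 256:03d}"

def build_fix(fields):
    """Assemble tag=value pairs and append the trailing checksum field."""
    body = "".join(f"{tag}={value}{SOH}" for tag, value in fields)
    return body + f"10={fix_checksum(body)}{SOH}"

def fix_to_row(message):
    """Flatten a FIX message into a tag -> value dict (one CSV/Parquet row)."""
    pairs = (field.split("=", 1) for field in message.split(SOH) if field)
    return {int(tag): value for tag, value in pairs}

msg = build_fix([(8, "FIX.4.4"), (35, "8"), (55, "AMZN"), (44, "181.50")])
row = fix_to_row(msg)
print(row[55])  # → AMZN
```

At a billion+ messages, exactly this kind of per-message flattening is what the transient EMR/Spark clusters parallelize before writing Parquet to the output area.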
31. “For our market
surveillance systems, we
are looking at about 40%
[savings with AWS], but
the real benefits are the
business benefits: We
can do things that we
physically weren’t able to
do before, and that is
priceless.”
- Steve Randich, CIO
Case study: Re-architecting analytics on AWS
What FINRA needed
• Infrastructure for its market surveillance platform
• Support of analysis and storage of approximately 75
billion market events every day
Why they chose AWS
• Fulfillment of FINRA’s security requirements
• Ability to create a flexible platform using dynamic
clusters (Hadoop, Hive, and HBase), Amazon EMR,
and Amazon S3
Benefits realized
• Increased agility, speed, and cost savings
• Estimated savings of $10-20 million annually by using AWS
32. Market surveillance
FINRA uses Amazon EMR and Amazon S3 to process up to 75
billion trading events per day and securely store over 5
petabytes of data, attaining savings of $10-20 million per year.
33. • Nasdaq implements an S3 data lake + Amazon Redshift
data warehouse architecture
• Most recent two years of data is kept in the Amazon
Redshift data warehouse and snapshotted into S3 for
disaster recovery
• Data between two and five years old is kept in S3
• Presto on EMR is used to run ad hoc queries on data in S3
• Transitioned from an on-premises data warehouse to
Amazon Redshift and S3 data lake architecture
• Over 1,000 tables migrated
• Average daily ingest of over 7B rows
• Migrated off legacy data warehouse to AWS (start to
finish) in 7 man-months
• AWS costs were 43% of legacy budget for the same
data set (~1100 tables)
Case study: Moving mountains (of data) on AWS