Trading & Market Data Hub
A centralized platform that ingests 8+ dealer data feeds across Excel, PDF, email, and CSV formats, reducing analyst prep time by approximately 50%.
Overview
Trading desks receive market data from multiple dealers in inconsistent formats: Excel spreadsheets, PDFs, emails, and CSV files, each with its own column layout, spread convention, and rating scale. This platform centralizes ingestion, normalizes the data, and presents a unified view, eliminating hours of manual data preparation.
Technical Approach
graph TD
A[Dealer Feeds] --> B[Format Detection]
B --> C1[Excel Parser]
B --> C2[PDF Parser]
B --> C3[Email Parser]
B --> C4[CSV Parser]
C1 --> D[Normalization<br/>Engine]
C2 --> D
C3 --> D
C4 --> D
D --> E[Spread Convention<br/>Harmonization]
E --> F[Rating<br/>Harmonization]
F --> G[Time-Series<br/>Aggregation]
G --> H[React<br/>Dashboard]
Multi-Format Ingestion: The platform automatically detects the format of each incoming file and routes it to a specialized parser. Each parser handles format-specific challenges: merged cells in Excel, table extraction from PDFs, and attachment extraction from emails.
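The detect-and-route step can be sketched roughly as follows. This is a minimal illustration, not the platform's actual code: the function names (`detect_format`, `route`), the parser stubs, and the extension table are all hypothetical, and only the PDF magic-byte check (`%PDF-`) reflects a real file-format fact.

```python
from pathlib import Path
from typing import Callable, Dict

# Hypothetical parser stubs; real parsers would return structured rows.
def parse_excel(name: str) -> str:
    return f"excel:{name}"

def parse_pdf(name: str) -> str:
    return f"pdf:{name}"

def parse_email(name: str) -> str:
    return f"email:{name}"

def parse_csv(name: str) -> str:
    return f"csv:{name}"

# Assumed extension table; a real deployment may recognize more types.
EXTENSION_FORMATS: Dict[str, str] = {
    ".xlsx": "excel", ".xls": "excel",
    ".pdf": "pdf",
    ".eml": "email", ".msg": "email",
    ".csv": "csv",
}

PARSERS: Dict[str, Callable[[str], str]] = {
    "excel": parse_excel,
    "pdf": parse_pdf,
    "email": parse_email,
    "csv": parse_csv,
}

def detect_format(filename: str, head: bytes = b"") -> str:
    """Guess the feed format from leading bytes first, then the extension."""
    if head.startswith(b"%PDF-"):  # PDF files begin with this signature
        return "pdf"
    suffix = Path(filename).suffix.lower()
    if suffix not in EXTENSION_FORMATS:
        raise ValueError(f"unsupported feed format: {filename!r}")
    return EXTENSION_FORMATS[suffix]

def route(filename: str, head: bytes = b"") -> Callable[[str], str]:
    """Return the parser responsible for this file."""
    return PARSERS[detect_format(filename, head)]
```

Checking the leading bytes before the extension lets a mislabeled attachment (for example, a PDF saved without a suffix) still reach the right parser, while unknown types fail loudly instead of being silently skipped.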
Normalization Engine: Dealer-specific column mappings, spread conventions (OAS, DM, Z-spread), and rating scales (Moody’s, S&P, Fitch) are harmonized to a common schema. Configurable mapping files allow quick onboarding of new dealers.
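As a rough sketch of how a configurable mapping might drive normalization: the dealer names, column headers, and schema fields below are invented for illustration, and the common rating scale is assumed to be S&P/Fitch-style letter grades. The Moody's equivalences shown are the standard ones, but the table is partial. Spread values are only tagged with their convention here, not converted between OAS, DM, and Z-spread.

```python
from typing import Any, Dict

# Partial Moody's -> S&P/Fitch-style equivalence table (Aaa down to B3).
MOODYS_TO_COMMON: Dict[str, str] = {
    "Aaa": "AAA", "Aa1": "AA+", "Aa2": "AA", "Aa3": "AA-",
    "A1": "A+", "A2": "A", "A3": "A-",
    "Baa1": "BBB+", "Baa2": "BBB", "Baa3": "BBB-",
    "Ba1": "BB+", "Ba2": "BB", "Ba3": "BB-",
    "B1": "B+", "B2": "B", "B3": "B-",
}

# Hypothetical per-dealer mapping; in practice this could live in a config file.
DEALER_CONFIG: Dict[str, Dict[str, Any]] = {
    "dealer_a": {
        "columns": {"CUSIP": "security_id", "Sprd": "spread_bp", "Mdy": "rating"},
        "rating_agency": "moodys",
        "spread_type": "OAS",
    },
    "dealer_b": {
        "columns": {"Identifier": "security_id", "DM (bps)": "spread_bp",
                    "S&P Rating": "rating"},
        "rating_agency": "sp",
        "spread_type": "DM",
    },
}

def normalize_row(dealer: str, raw: Dict[str, str]) -> Dict[str, Any]:
    """Rename dealer columns to the common schema and harmonize the rating."""
    cfg = DEALER_CONFIG[dealer]
    row: Dict[str, Any] = {
        common: raw[source] for source, common in cfg["columns"].items()
    }
    if cfg["rating_agency"] == "moodys":
        row["rating"] = MOODYS_TO_COMMON[row["rating"]]
    row["spread_bp"] = float(row["spread_bp"])
    row["spread_type"] = cfg["spread_type"]  # tag the convention, no conversion
    row["dealer"] = dealer
    return row
```

Under this design, onboarding a new dealer amounts to adding one entry to `DEALER_CONFIG` rather than writing new parsing code.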
Time-Series Aggregation: Historical data points are aligned to a common timeline, enabling cross-dealer comparison and trend analysis. Analysts can track how dealer pricing evolves over time for specific securities.
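One way to picture the alignment step is a daily grid where each dealer's most recent quote is carried forward until a new one arrives. The sketch below uses plain Python rather than PySpark to stay self-contained; the `align_daily` function, the tuple layout, and the forward-fill policy are assumptions for illustration, not a description of the platform's actual aggregation logic.

```python
from datetime import date, timedelta
from typing import Dict, List, Optional, Tuple

Quote = Tuple[date, str, float]  # (quote date, dealer, spread in bp)

def align_daily(
    quotes: List[Quote], start: date, end: date
) -> Dict[date, Dict[str, Optional[float]]]:
    """Build a daily grid of spreads per dealer, forward-filling gaps.

    Days before a dealer's first quote in the window are left as None.
    Quotes dated before `start` are ignored.
    """
    dealers = sorted({dealer for _, dealer, _ in quotes})

    # Index quotes by day so each day can be looked up directly.
    by_day: Dict[date, Dict[str, float]] = {}
    for day, dealer, spread in quotes:
        by_day.setdefault(day, {})[dealer] = spread

    aligned: Dict[date, Dict[str, Optional[float]]] = {}
    last_seen: Dict[str, float] = {}
    day = start
    while day <= end:
        last_seen.update(by_day.get(day, {}))  # newest quote wins
        aligned[day] = {dealer: last_seen.get(dealer) for dealer in dealers}
        day += timedelta(days=1)
    return aligned
```

With every dealer on the same daily grid, comparing two dealers' pricing for a security on a given date becomes a simple lookup, even when they quoted on different days.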
Key Results
- Ingests data from 8+ dealer feeds across 4 file formats
- Reduces analyst data preparation time by approximately 50%
- Harmonizes 3 rating scales and multiple spread conventions automatically
- Maintains historical time series that enable cross-dealer pricing comparison
Technologies Used
Python, React, PySpark, openpyxl, Material-UI