Rutvik Joshi | Data Science Portfolio
New York Metropolitan Area
I am a Data Scientist at Guggenheim Investments, where I build ML/NLP pipelines and full-stack analytics applications for structured credit and fixed income markets. I work across the full data lifecycle, from raw document extraction and entity resolution to interactive dashboards that surface actionable insights for portfolio managers and traders.
Areas of Expertise
NLP & Large Language Models — Entity resolution at scale using fuzzy matching and LLM disambiguation, knowledge graph construction from unstructured financial news, and document intelligence pipelines.
Distributed Computing & Data Engineering — PySpark pipelines for portfolio analytics, multi-format data extraction (Excel, PDF, CSV, email) with lossless verification, and automated batch processing for high-volume financial datasets.
Full-Stack Applications — React 18 + TypeScript dashboards with 19+ analytical views, deal management platforms, and market data hubs that centralize multi-dealer feeds.
Financial Analytics — Statistical outlier detection (Z-score), portfolio concentration metrics (HHI), month-over-month/quarter-over-quarter delta analysis, and GAAP vs. market value comparison for structured credit products.
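To make two of the metrics named above concrete, here is a minimal sketch of portfolio concentration (HHI) and Z-score outlier flagging. It is an illustration only, not code from the production pipelines; the function names, the 0-to-1 HHI scale, and the default threshold of 3.0 are assumptions chosen for the example.

```python
from statistics import mean, pstdev


def hhi(exposures):
    """Herfindahl-Hirschman Index on a 0-1 scale: sum of squared portfolio shares.

    `exposures` are position sizes in any consistent unit (hypothetical input).
    """
    total = sum(exposures)
    if total <= 0:
        raise ValueError("exposures must sum to a positive number")
    return sum((x / total) ** 2 for x in exposures)


def zscore_outliers(values, threshold=3.0):
    """Return indices of values whose absolute Z-score exceeds `threshold`.

    Uses the population standard deviation; the threshold is an assumed default.
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # all values identical, so nothing can be an outlier
    return [i for i, v in enumerate(values) if abs((v - mu) / sigma) > threshold]
```

An HHI near 1 indicates a portfolio dominated by a single position, while a value near 1/n indicates n roughly equal positions.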
Technical Skills
| Domain | Technologies |
|---|---|
| ML & NLP | Python, PySpark, scikit-learn, PyTorch, TensorFlow, Hugging Face, GPT-4, Llama-2, FuzzyWuzzy |
| Data Engineering | PySpark, pandas, openpyxl, xlrd, extract_msg, gzip, Decimal |
| Frontend | React 18, TypeScript, Tailwind CSS, Material-UI, Recharts, MUI X-Charts |
| Visualization | Plotly, Matplotlib, Seaborn, NetworkX |
| Infrastructure | Git, Docker, Linux, MySQL, MongoDB, AWS |
Latest Posts
| Date | Post |
|---|---|
| Jan 20, 2024 | Osdag Internship 2024 |
| Nov 1, 2023 | Decision tree model |
| Oct 31, 2023 | ML Development Process |
| Oct 25, 2023 | Data Structures & Algorithms |
| Oct 15, 2023 | Hashing |