Rutvik Joshi | Data Science Portfolio

Data Scientist & ML Engineer


New York Metropolitan Area

Data Scientist at Guggenheim Investments, building ML/NLP pipelines and full-stack analytics applications for structured credit and fixed income markets. I work across the full data lifecycle, from raw document extraction and entity resolution to interactive dashboards that surface actionable insights for portfolio managers and traders.

Areas of Expertise

NLP & Large Language Models — Entity resolution at scale using fuzzy matching and LLM disambiguation, knowledge graph construction from unstructured financial news, and document intelligence pipelines.

Distributed Computing & Data Engineering — PySpark pipelines for portfolio analytics, multi-format data extraction (Excel, PDF, CSV, email) with lossless verification, and automated batch processing for high-volume financial datasets.

Full-Stack Applications — React 18 + TypeScript dashboards with 19+ analytical views, deal management platforms, and market data hubs that centralize multi-dealer feeds.

Financial Analytics — Statistical outlier detection (Z-score), portfolio concentration metrics (HHI), month-over-month/quarter-over-quarter delta analysis, and GAAP vs. market value comparison for structured credit products.
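As a rough illustration of two of the metrics named above, here is a minimal sketch of HHI and Z-score outlier detection in plain Python. The function names, the threshold of 3.0, and the sample numbers are assumptions made for this example; they are not taken from any production pipeline or real portfolio.

```python
from statistics import mean, pstdev


def hhi(weights):
    """Herfindahl-Hirschman Index on a 0..1 scale.

    Normalizes the inputs to shares and sums their squares:
    1/n for n equal positions, 1.0 for a single position.
    """
    total = sum(weights)
    return sum((w / total) ** 2 for w in weights)


def zscore_outliers(values, threshold=3.0):
    """Return the indices whose absolute Z-score exceeds `threshold`.

    Uses the population standard deviation; a constant series has
    no spread, so it yields no outliers.
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


# Hypothetical data: four equal positions, and a flat series with one spike.
equal_book = [25.0, 25.0, 25.0, 25.0]
prices = [10.0] * 20 + [100.0]

concentration = hhi(equal_book)
flagged = zscore_outliers(prices)
```

With equal weights the index falls to 1/n, so a higher value signals a more concentrated book. In a short series a single extreme point inflates the standard deviation it is measured against, so the threshold usually needs tuning to the sample size.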

Technical Skills

Domain | Technologies
ML & NLP | Python, PySpark, scikit-learn, PyTorch, TensorFlow, Hugging Face, GPT-4, Llama-2, FuzzyWuzzy
Data Engineering | PySpark, pandas, openpyxl, xlrd, extract_msg, gzip, Decimal
Frontend | React 18, TypeScript, Tailwind CSS, Material-UI, Recharts, MUI X-Charts
Visualization | Plotly, Matplotlib, Seaborn, NetworkX
Infrastructure | Git, Docker, Linux, MySQL, MongoDB, AWS

Latest Posts

Jan 20, 2024 Osdag Internship 2024
Nov 1, 2023 Decision tree model
Oct 31, 2023 ML Development Process
Oct 25, 2023 Data Structures & Algorithms
Oct 15, 2023 Hashing