// frameworks & tooling
Forty-three frameworks. Documented.
NLP, machine learning, Azure ML & cloud, MERN + API integration, Graph RAG. Click any framework to see what it does, how I use it, and the code I ship with it.
Pipelines that turn raw, messy human language into structured signal — embeddings, intent, entities and grounded answers.
spaCy
Industrial-strength NLP pipelines — tokenization, NER, dependency parsing, custom components.
→ structured extraction from PDFs & tickets
Hugging Face Transformers
200k+ pre-trained models. Fine-tune BERT, T5, Llama, Mistral on your own corpus.
→ domain-tuned classifiers & embeddings
LangChain
Orchestrates LLMs, tools, memory and retrievers into agentic workflows that actually ship.
→ multi-step AI agents & tool use
NLTK
Classic NLP toolkit — tokenizers, stemmers, corpora, the foundation behind every linguistics course.
→ research, prototyping, education
OpenAI & Anthropic
Frontier LLMs — GPT-4 / Claude — wired with prompt caching, tool use and structured outputs.
→ production reasoning & generation
BERT & RoBERTa
Bidirectional transformers for embeddings, classification and span-level question answering.
→ semantic search, sentiment, intent
Whisper & TTS
Speech-to-text and natural voice synthesis — true to "Vaaani" (वाणी = voice).
→ voice agents, multilingual transcription
Sentence-Transformers
Sentence-level embeddings for semantic similarity, clustering and dense retrieval.
→ vector search backbone
Dify
Open-source LLM app platform — visual workflows, RAG, agent definitions, all self-hostable.
→ no-code AI workflows for clients
OpenHands
Open-source autonomous AI software developer (formerly OpenDevin) that runs in a sandboxed Docker container.
→ agentic coding & refactor automation
From classical regression to billion-parameter transformers, picked per problem, not per hype cycle.
PyTorch
Dynamic graphs, research-grade DL. My default for fine-tuning, distillation and custom models.
→ fine-tunes, custom transformers
TensorFlow / Keras
Production deep learning, mobile (TFLite) and edge inference for Android apps.
→ on-device models, classifiers
scikit-learn
Classical ML done right — pipelines, cross-validation, ensembles, feature engineering.
→ tabular baselines, MVPs
XGBoost & LightGBM
Gradient boosting champions for tabular data — leaderboard winners that ship to prod.
→ churn, fraud, lead scoring
Pandas + NumPy
The data backbone — vectorized math, dataframes, joins, the daily bread of every ML project.
→ ETL, EDA, feature stores
JAX
Functional, JIT-compiled, autodiff. Used when raw throughput on TPU/GPU matters.
→ research-scale training
MLflow + Weights & Biases
Experiment tracking, model registry, reproducibility. So you trust what's running in prod.
→ MLOps, model lineage
Optuna & Ray Tune
Hyperparameter search at scale — Bayesian, Hyperband, distributed.
→ wringing the last 3% of accuracy
Enterprise-grade ML, hosted on Azure (or AWS / GCP) with auth, audit, region pinning and compliance baked in.
Azure Machine Learning
Managed training, AutoML, model registry, real-time endpoints — the workhorse for enterprise builds.
→ regulated industries (BFSI, health)
Azure OpenAI Service
GPT-4 class models inside your Azure tenancy — your data stays in your region, with SLAs.
→ private LLMs for enterprises
Azure Cognitive Services
Pre-built APIs for vision, speech, language and decision — wire them up in hours, not weeks.
→ OCR, translation, form parsing
Azure Functions
Serverless inference triggers — webhooks, queues, cron — pay-per-call AI workers.
→ event-driven AI pipelines
AWS SageMaker
Train, host and monitor models with autoscaling endpoints — when the customer's stack lives on AWS.
→ multi-tenant inference at scale
Google Vertex AI
End-to-end ML on GCP — Gemini integration, BigQuery ML, AutoML Tables.
→ data-warehouse-native AI
Databricks
Lakehouse + MLflow + Unity Catalog — when data, training and governance live in one place.
→ enterprise data + AI fusion
Docker + Kubernetes
Containerized AI workers with auto-restart, health checks and zero-downtime deploys.
→ portable production deploys
The web body around the AI brain: typed APIs, secure auth, realtime updates, and a clean React UI for non-technical operators.
MongoDB
Flexible document store — perfect for nested AI artifacts (chats, embeddings, metadata).
→ chat history, RAG corpus
Express.js
Fast, minimal Node API layer — middleware pipeline, REST + JSON, easy to test.
→ public + internal APIs
React + Next.js
Component-driven UI, SSR/ISR, server actions — modern React the way it ships in 2026.
→ dashboards, marketing sites
Node.js
Async runtime, event loop, streams — perfect glue between your AI workers and your customers.
→ webhooks, queues, schedulers
REST + GraphQL
Schema-first APIs — REST for simplicity, GraphQL for typed contracts and bandwidth control.
→ public APIs, mobile clients
JWT + OAuth 2.0
Auth done properly — refresh tokens, scopes, RBAC, SSO with Google / Microsoft / GitHub.
→ multi-tenant, role-based access
WebSockets + Webhooks
Realtime chat streams, typing indicators, event push from third-party SaaS.
→ live chat, agent updates
LibreChat
Open-source ChatGPT clone built on the MERN stack — multi-model (OpenAI, Anthropic, Mistral, Ollama), self-hostable.
→ branded internal chat for customer teams
Stripe · Twilio · SendGrid
Payments, SMS/WhatsApp, transactional email — battle-tested integrations across every build.
→ checkout, OTP, alerts, drips
Beyond plain vector search: knowledge graphs + retrieval give your LLM relationships, not just nearest neighbors. The unfair advantage for complex domains.
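
To make "relationships, not just nearest neighbors" concrete, here is a minimal sketch of graph-hop retrieval in plain Python. It is an illustration under stated assumptions, not the production pipeline: the triples, the names `TRIPLES`, `build_adjacency`, `collect_facts` and `facts_to_context`, and the two-hop default are all hypothetical, and a real build would keep the graph in a graph database rather than a dict.

```python
from collections import deque

# Hypothetical toy knowledge graph: (subject, relation, object) triples.
TRIPLES = [
    ("aspirin", "treats", "headache"),
    ("aspirin", "interacts_with", "warfarin"),
    ("warfarin", "treats", "thrombosis"),
    ("ibuprofen", "treats", "headache"),
]


def build_adjacency(triples):
    """Index each triple under both endpoints so edges can be walked either way."""
    adjacency = {}
    for subj, rel, obj in triples:
        adjacency.setdefault(subj, []).append((subj, rel, obj))
        adjacency.setdefault(obj, []).append((subj, rel, obj))
    return adjacency


def collect_facts(adjacency, seed, max_hops=2):
    """Breadth-first walk from a seed entity; return triples touching any
    entity that is fewer than max_hops edges away from the seed."""
    seen_entities = {seed}
    facts = []
    queue = deque([(seed, 0)])
    while queue:
        entity, depth = queue.popleft()
        if depth == max_hops:
            continue  # do not expand past the hop budget
        for triple in adjacency.get(entity, []):
            if triple not in facts:
                facts.append(triple)
            subj, _, obj = triple
            neighbor = obj if subj == entity else subj
            if neighbor not in seen_entities:
                seen_entities.add(neighbor)
                queue.append((neighbor, depth + 1))
    return facts


def facts_to_context(facts):
    """Render triples as plain sentences that could be placed in an LLM prompt."""
    return "\n".join(f"{s} {r.replace('_', ' ')} {o}" for s, r, o in facts)


if __name__ == "__main__":
    adjacency = build_adjacency(TRIPLES)
    print(facts_to_context(collect_facts(adjacency, "warfarin", max_hops=2)))
```

Starting from `warfarin`, the walk reaches the aspirin facts through the interaction edge but never pulls in the unrelated `ibuprofen` triple, even though both drugs share the word "headache". That is the kind of filtering a pure similarity search has a harder time guaranteeing.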
Graph RAG
The Vaaani specialty. Entities + relations form a graph; the LLM walks it instead of stumbling through vector noise. Higher precision, fewer hallucinations.
→ legal, medical, scientific QA
Neo4j
Native graph database — Cypher queries, APOC algorithms, GDS for centrality & community detection.
→ knowledge graph storage
LlamaIndex
Document indexing and hybrid retrieval — combine vectors, keywords, summaries and graph hops.
→ multi-doc question answering
Pinecone · Weaviate · Qdrant
Production vector databases — millisecond ANN search across millions of embeddings.
→ semantic search at scale
NetworkX + igraph
Graph algorithms — PageRank, community detection, shortest paths — to enrich the retrieval layer.
→ entity importance scoring
text-embedding-3 / Voyage
Best-in-class embedding models for high-fidelity semantic encoding.
→ retrieval quality & recall
Microsoft GraphRAG
Reference implementation of community-summary-based GraphRAG — adapted to your domain ontology.
→ enterprise knowledge bases
Reranking + Eval
Cohere rerank, Ragas, TruLens — measure retrieval precision so RAG actually works in prod.
→ continuous quality monitoring
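
"Measure retrieval precision" can start smaller than a full eval framework. The sketch below computes precision@k and mean reciprocal rank over labeled queries in plain Python. It is a hand-rolled illustration, not the Ragas or TruLens API: the function names, the document ids and the choice of `k=3` are assumptions made for the example.

```python
def precision_at_k(retrieved, relevant, k):
    """Fraction of the top-k retrieved ids that appear in the relevant set."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for doc_id in retrieved[:k] if doc_id in relevant)
    return hits / k


def reciprocal_rank(retrieved, relevant):
    """1 / rank of the first relevant id, or 0.0 if none was retrieved."""
    for rank, doc_id in enumerate(retrieved, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def evaluate(runs, k=3):
    """Average both metrics over a list of (retrieved_ids, relevant_id_set) pairs."""
    if not runs:
        raise ValueError("runs must not be empty")
    n = len(runs)
    return {
        "precision@k": sum(precision_at_k(r, rel, k) for r, rel in runs) / n,
        "mrr": sum(reciprocal_rank(r, rel) for r, rel in runs) / n,
    }


if __name__ == "__main__":
    # Hypothetical labeled queries: ranked retriever output and the ids a human marked relevant.
    runs = [
        (["d1", "d2", "d3"], {"d2", "d3"}),
        (["d4", "d5", "d6"], {"d9"}),
    ]
    print(evaluate(runs, k=3))
```

Tracking these two numbers before and after adding a reranker gives a quick read on whether the reranker is earning its latency.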