
Hello, I'm Pavan Pandya. I'm driven by curiosity and a passion for solving real-world problems through technology.
About me
My journey with technology started with a simple curiosity—one that grew into a lifelong passion. Movies like "The Internship" and "Iron Man" fueled my dream of creating something extraordinary, like JARVIS. Today, I’m pursuing a Master’s in Computer Science and building AI applications that push the boundaries of what's possible. My work is driven by a desire to solve complex puzzles, innovate, and use technology to make life better. Collaboration is key to my approach, and I’m always excited to connect with others who share my passion.
When I'm not coding, you’ll find me experimenting in the kitchen or taking long walks to clear my head. Those moments help me reconnect, find inspiration, and often lead to my best ideas.
My projects
Itinera - Full Stack AI Travel Planner Application
Developed a full-stack travel planning app that provides personalized recommendations and real-time data synchronization, enhancing the user experience with AI-driven features and secure cloud-based services.
- React
- Gemini AI
- TailwindCSS
- Firebase
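
The production app calls Gemini from its React front end, so the Python snippet below is only a rough sketch of the same prompt-driven itinerary idea; the model name, prompt wording, and helper function are assumptions, not the app's actual code.

```python
# Illustrative only: a prompt-driven itinerary call using the google-generativeai SDK.
import google.generativeai as genai

genai.configure(api_key="YOUR_GEMINI_API_KEY")  # placeholder key
model = genai.GenerativeModel("gemini-1.5-flash")  # assumed model name

def generate_itinerary(destination: str, days: int, budget: str) -> str:
    """Ask Gemini for a structured day-by-day travel plan."""
    prompt = (
        f"Create a {days}-day travel itinerary for {destination} on a {budget} budget. "
        "Return hotel suggestions and a day-by-day plan as JSON."
    )
    response = model.generate_content(prompt)
    return response.text

print(generate_itinerary("Tokyo", 3, "moderate"))
```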
Optimized NSFW Content Detection for Real-Time Moderation
Developed an NSFW content detection model using ResNet18, achieving 95.1% accuracy and optimizing model size and inference speed for real-time moderation on resource-constrained devices.
- PyTorch
- CNN
- Deep Learning
- Quantization
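
The sketch below shows the general shape of the classifier and one post-training quantization route (dynamic quantization of the linear head); the project itself may use a different quantization scheme and training setup.

```python
# Minimal sketch: ResNet18 binary classifier with post-training dynamic quantization,
# shrinking the model and speeding up CPU inference for resource-constrained devices.
import torch
import torch.nn as nn
from torchvision import models

# Start from an ImageNet-pretrained backbone and swap in a 2-class head (SFW / NSFW).
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, 2)
model.eval()

# Dynamic quantization converts the linear layer weights to int8.
quantized = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

dummy = torch.randn(1, 3, 224, 224)  # one normalized RGB frame
with torch.no_grad():
    probs = torch.softmax(quantized(dummy), dim=1)
print(probs)
```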
MediLink - Patient and Insurance Management System
I built a full-stack healthcare management platform enabling patients, doctors, and insurance providers to efficiently manage appointments, medical records, and insurance plans.
- Django
- Next.js
- Render
- PostgreSQL
- Docker
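
A hypothetical slice of the Django data model behind a platform like this is sketched below, as it would appear in an app's models.py; the actual MediLink models, fields, and relations may differ.

```python
# Hypothetical Django models illustrating the kind of schema behind MediLink.
from django.db import models


class Patient(models.Model):
    full_name = models.CharField(max_length=120)
    date_of_birth = models.DateField()


class InsurancePlan(models.Model):
    provider_name = models.CharField(max_length=120)
    plan_code = models.CharField(max_length=40, unique=True)
    monthly_premium = models.DecimalField(max_digits=8, decimal_places=2)


class Appointment(models.Model):
    patient = models.ForeignKey(Patient, on_delete=models.CASCADE, related_name="appointments")
    doctor_name = models.CharField(max_length=120)
    scheduled_for = models.DateTimeField()
    notes = models.TextField(blank=True)
```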
DocQnA – Intelligent PDF Querying LLM System
I led the creation of a PDF-based question-answering system using retrieval-augmented generation (RAG), integrating Apache Cassandra for data management and Streamlit for a user-friendly interface.
- Apache Cassandra
- Astra DB
- LangChain
- Streamlit
- Hugging Face
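
The snippet below is a rough sketch of the retrieval side of the RAG flow, assuming a LangChain-style vector store backed by Cassandra/Astra DB; exact import paths depend on the LangChain version, and the table name and sample chunks are placeholders.

```python
# Rough sketch of the retrieval step in a RAG pipeline over Cassandra/Astra DB.
import cassio
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import Cassandra

# Connect to Astra DB (serverless Cassandra) via cassio.
cassio.init(token="ASTRA_DB_APPLICATION_TOKEN", database_id="ASTRA_DB_ID")

embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
vector_store = Cassandra(embedding=embeddings, table_name="pdf_qa_chunks")

# Index extracted PDF text (chunked upstream with a text splitter).
chunks = ["First chunk of the PDF ...", "Second chunk of the PDF ..."]
vector_store.add_texts(chunks)

# Retrieve the chunks most relevant to a user question; these are then passed
# to the LLM as context to ground its answer.
for doc in vector_store.similarity_search("What does the report conclude?", k=3):
    print(doc.page_content)
```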
Sales & Customer Data Analysis Dashboard
I developed Tableau dashboards for sales and customer analysis, enhancing trend identification and customer segmentation with interactive, data-driven insights.
- Tableau
- ETL
- SQL
- Data Collection
Kidney Disease Classification
I improved kidney disease classification accuracy with a deep-learning model and streamlined deployment using AWS and Docker, backed by an efficient CI/CD pipeline.
- Python
- Deep Learning
- MLFlow
- DVC
- AWS
- Docker
- GitHub Actions
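
A small illustration of the experiment-tracking piece is shown below; the parameter names and metric values are placeholders for what the actual training runs log to MLflow, with DVC versioning the data and model artifacts.

```python
# Sketch of logging training parameters and metrics to MLflow.
import mlflow

mlflow.set_experiment("kidney-disease-classification")

with mlflow.start_run():
    mlflow.log_params({"epochs": 20, "batch_size": 32, "learning_rate": 1e-4})
    # ... train the CNN here; DVC tracks the dataset and model files ...
    mlflow.log_metric("val_accuracy", 0.94)  # placeholder value
```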
Unveiling Trends - A Cloud-Driven Data Engineering Project
I created a custom YouTube data scraper and built interactive QuickSight dashboards to analyze and visualize trending topics, supporting informed decision-making.
- AWS
- S3
- Glue
- Lambda
- Athena
- QuickSight
- Python
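
As an illustration of the analysis layer, the snippet below runs an Athena query over the curated data from Python; the database, table, and S3 result-bucket names are placeholders for whatever the Glue catalog defines.

```python
# Sketch of querying the curated YouTube data with Athena via boto3.
import boto3

athena = boto3.client("athena", region_name="us-east-1")

response = athena.start_query_execution(
    QueryString="""
        SELECT category, COUNT(*) AS trending_videos
        FROM youtube_trending_cleaned
        GROUP BY category
        ORDER BY trending_videos DESC
        LIMIT 10
    """,
    QueryExecutionContext={"Database": "youtube_analytics"},
    ResultConfiguration={"OutputLocation": "s3://my-athena-results/"},
)
print(response["QueryExecutionId"])
```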
Uber Data Analysis Pipeline using GCP
I developed a GCP data pipeline to analyze NYC taxi trip data, orchestrating extraction and transformation with Mage AI, loading the results into BigQuery, and surfacing insights through Looker visualizations.
- Python
- BigQuery
- Data Extraction
- Data Transformation
- MageAI
- Looker
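
The snippet below illustrates the kind of BigQuery analysis the pipeline feeds into Looker; the project, dataset, and table names are placeholders for the tables loaded by the Mage pipeline.

```python
# Illustrative BigQuery query from the analysis layer.
from google.cloud import bigquery

client = bigquery.Client()  # uses application-default credentials

sql = """
    SELECT payment_type, AVG(fare_amount) AS avg_fare, COUNT(*) AS trips
    FROM `my-project.uber_analysis.fact_trips`
    GROUP BY payment_type
    ORDER BY trips DESC
"""
for row in client.query(sql).result():
    print(row["payment_type"], round(row["avg_fare"], 2), row["trips"])
```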
My skills
- Python
- JavaScript
- C
- C++
- PHP
- HTML
- CSS
- Node.js
- Express.js
- React
- Flask
- Django
- LangChain
- RESTful APIs
- PyTorch
- TensorFlow
- Scikit-learn
- Pandas
- NumPy
- NLTK
- Matplotlib
- Seaborn
- Selenium
- Hugging Face
- Regex
- Tailwind CSS
- Streamlit
- SQL
- PostgreSQL
- MySQL
- SQLite
- NoSQL
- MongoDB
- Apache Cassandra
- DynamoDB
- Redis
- AWS
- Azure
- Docker
- GitHub Actions
- CI/CD
- Kubernetes
- Git
- Linux
- Elasticsearch
- Hadoop
- Apache Spark
- Postman
- Swagger
- Tableau
- Render
- Vercel
- Jira
- Agile
- Scrum
- SDLC
- Unit Testing
- Integration Testing
My experience
Marchup Inc.
Software Developer Intern
San Jose, CA
- Architected a scalable, real-time AI chatbot for student counseling using Flask microservices, Azure OpenAI (dual-mode static/generative LLM architecture), and asynchronous programming, achieving <1s response time and reducing LLM API costs through caching and prompt engineering.
- Engineered and optimized a data-driven user recommendation engine leveraging Elasticsearch, vector embeddings, and cosine similarity to deliver personalized user suggestions based on interests, location, and interaction history, resulting in an increase in user engagement (a simplified sketch of the similarity ranking follows this list).
- Led the migration from Lucene Search to AWS OpenSearch, architecting automated reindexing workflows with AWS Lambda and implementing monitoring pipelines via CloudWatch, improving search response times and ensuring zero data loss during the transition.
- Orchestrated containerized deployment using Docker and AWS ECS Fargate, implementing CI/CD pipelines, automated scaling, and comprehensive testing frameworks, enabling seamless, zero-downtime deployments, and maintaining service reliability with 99.9% uptime.
- Drove Agile development processes and cross-functional collaboration in a team of 5 developers; conducted daily stand-ups, evaluated frameworks, performed rigorous performance testing, and aligned technical solutions with business needs, resulting in high-quality product delivery.
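
The sketch below is a much-simplified illustration of the similarity ranking behind the recommendation engine: embed user profiles, then rank candidates by cosine similarity to the current user. In production the vectors are stored and searched in Elasticsearch; the profiles and embedding model here are stand-ins.

```python
# Simplified cosine-similarity ranking over user-profile embeddings.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

current_user = "graduate student in San Jose interested in machine learning and hiking"
candidates = {
    "user_a": "undergrad in San Jose who likes deep learning and trail running",
    "user_b": "marketing professional in Austin interested in cooking",
    "user_c": "data engineer in the Bay Area into hiking and photography",
}

query_vec = model.encode(current_user)
for name, profile in candidates.items():
    vec = model.encode(profile)
    cosine = float(np.dot(query_vec, vec) / (np.linalg.norm(query_vec) * np.linalg.norm(vec)))
    print(f"{name}: {cosine:.3f}")
```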
Dhirubhai Ambani Institute of Information and Communication Technology
Data Science Intern
Gandhinagar, India
- Extracted 1.5M+ tweets using the Twitter API and Selenium, identifying 157K inflation-related tweets through advanced topic modeling (BERTopic, LDA) and laying the groundwork for targeted economic trend analysis (a small topic-modeling sketch follows this list).
- Achieved 87% accuracy in sentiment trend classification by applying advanced text preprocessing and BERT embeddings, fine-tuning the model on 300 manually annotated tweets, and tuning hyperparameters.
- Presented key findings to researchers, illustrating the influence of tweet sentiment on people's buying patterns and inflation trends.
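
Below is a minimal sketch of the topic-modeling step, shown with the LDA variant because it runs on a handful of toy tweets; the project also used BERTopic on the full corpus, and the example tweets here are invented.

```python
# Minimal LDA sketch for separating inflation-related chatter from unrelated tweets.
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

tweets = [
    "Grocery prices keep climbing every month, inflation is brutal",
    "Filled up the tank today and gas prices are out of control",
    "Excited for the cricket match this weekend",
    "Rent went up again, everything costs more this year",
    "What a great concert last night, the band was amazing",
    "The cost of living crisis is hitting households hard",
]

vectorizer = CountVectorizer(stop_words="english")
doc_term = vectorizer.fit_transform(tweets)

lda = LatentDirichletAllocation(n_components=2, random_state=42)
lda.fit(doc_term)

# Print the top words per topic; keyword lists like these guide which topics
# count as "inflation-related" for downstream sentiment analysis.
terms = vectorizer.get_feature_names_out()
for idx, weights in enumerate(lda.components_):
    top = [terms[i] for i in weights.argsort()[::-1][:5]]
    print(f"Topic {idx}: {', '.join(top)}")
```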
Hate Speech and Offensive Content Identification
Data Science Intern
Ahmedabad, India
- Enhanced model performance by 15% by fine-tuning a BERT-based multilingual model for hate speech detection, using pseudo-labeling to leverage unlabeled data and few-shot learning to handle low-resource languages (a condensed pseudo-labeling sketch follows this list).
- Increased annotation throughput by 40% by developing a REST API backend that streamlined participant submissions and introduced dynamic dashboard filtering by task accuracy and categories, enabling faster and more efficient workflows.
- Ensured seamless platform functionality by performing unit and integration testing and enforcing adherence to a CI/CD pipeline, while improving user experience through UI enhancements and content updates on the front end.
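
The snippet below condenses the pseudo-labeling idea: run the current multilingual BERT checkpoint over unlabeled posts, keep only high-confidence predictions, and fold them back into the training data for another fine-tuning round. The model name, confidence threshold, and example posts are illustrative, and the Trainer-based fine-tuning step is omitted for brevity.

```python
# Condensed pseudo-labeling loop for multilingual hate speech detection.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

checkpoint = "bert-base-multilingual-cased"  # stand-in for the current fine-tuned model
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)
model.eval()

unlabeled_posts = [
    "Example post in a low-resource language ...",
    "Another unlabeled post ...",
]

pseudo_labeled = []
with torch.no_grad():
    for text in unlabeled_posts:
        inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)
        probs = torch.softmax(model(**inputs).logits, dim=-1).squeeze()
        confidence, label = torch.max(probs, dim=-1)
        if confidence.item() >= 0.9:  # keep only confident predictions
            pseudo_labeled.append({"text": text, "label": int(label)})

print(f"Added {len(pseudo_labeled)} pseudo-labeled examples to the training set")
```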
Contact me
Please contact me directly at pavanpandya.iu@gmail.com or through this form.