Machine learning engineer with expertise in building efficient pipelines and conducting in-depth data analysis. Focused on creating scalable solutions that turn raw data into valuable insights, and committed to improving model accuracy while streamlining processes for optimal performance across diverse projects and applications.
I am a Software Engineering graduate with a strong interest in the data space. As a freelance ML Engineer, I help clients design, develop, and deploy machine learning solutions that address complex business problems and generate value for their organizations. I am skilled in Tableau, Power BI, and Python tools for data insights.
Deep Learning
Edge Computing
Driver Drowsiness Detection is a system that monitors driver behavior using a camera mounted on the dashboard. It is an application of computer vision utilizing an edge device to monitor the driver.
Data Analysis
Data Visualization
This project, conducted in collaboration with KravitzLab at Washington University in St. Louis, involved experiments on MOUSERAT using the Pallidus MR1 to collect activity data. The study focused on temporal analysis to derive the desired insights.
Machine Learning
Data Analysis
An application of Deep Learning to forecast future stock volumes and prices.
Machine Learning
The project was completed as part of an internship at a research lab for Software Defined Optical Networks. The task was to predict the state of the control switches when light of certain wavelengths passes through a specialized photonic switch.
Data Analysis
Data Visualization
A Streamlit app that uses the Google Places API to analyze and visualize reviews of various business places across different locations.
NEAT-AI
The project aimed at building an AI agent that learns to play the famous Chrome Dino game.
Data Analysis
Data Visualization
The F.O.A.M dashboard is a tool that provides insights into past and current U.S. government contracts, helping contractors and businesses identify new opportunities by analyzing relevant data and competitor info.
Data Analysis
Data Dashboards
This project, in collaboration with Ammonite LLC, aimed to deliver real-time insights and analytics for improved decision-making. The solution helps optimize inventory management and trading activities for enhanced operational efficiency in the logistics sector.
Data Analysis
Dashboards
This project was developed for an eBay seller store, leveraging their sales data to generate meaningful insights.
NLP
Supervised Fine-Tuning
This project was completed while learning about LLMs in the course Generative AI with LLMs taught by DeepLearning.ai.
Deep Learning
MLOps
AWS
This project is designed for chest cancer classification using the Keras Model API and MLflow for experiment tracking.
MLOps
Data Analysis
This project is designed to predict house prices using an end-to-end machine learning (ML) pipeline. It incorporates MLOps principles to streamline the development, deployment, and maintenance of the model.
Working with Huda on our data visualization project was a breath of fresh air! 😊 Her attention to detail and professionalism exceeded all expectations, and her cooperation and deep understanding of our needs were second to none. Always the best and most efficient worker - we're already collaborating on more projects!
Very very professional, very dedicated to excellent work, very good understanding of data and translating it into meaningful dashboards
She got my attention with her first response, and even though I had started the work with another, I still decided to go with her. I was able to understand the plot and also make changes, and she was there to assist with all my questions. I just learned about Streamlit thanks to her. I highly recommend this freelancer.
Amazing performance and very high-quality delivery. Communication is great, engagement from her side is super and she does not stop until we get to the desired target. She masters streamlit and plotly which has been incredibly valuable for me. We will keep working together for sure :-)
This is by far the best person on Fiverr I have worked with! Extremely talented, and goes the extra mile to ensure you are happy with the order. 10/10 experience and I highly recommend you pick this person. In my case she took an Excel workbook and transferred it into Python. Couldn’t be more happy :-)
This was an amazing developer. She is great at communication, answering questions, and she delivered an outstanding dashboard for me!
She is very passionate about her work and does more than we expect. Very kind and humble, she has a full grasp of the concepts and responds very quickly. She also keeps helping even after the project is completed.
Went above and beyond to give me the result I wanted. Highly recommend!
It has been a pleasure to work with Hudiye. Very attentive to all questions. She takes her work seriously and diligently.
Driver Drowsiness Detection is a system that monitors driver behavior using a camera mounted on the dashboard. It is an application of computer vision utilizing an edge device to monitor the driver. A buzzer is used to alert the drowsy driver.
Despite the maturity of drowsiness detection technology, existing solutions are often too expensive or unreliable for widespread use. The growing number of road users highlights the need for an affordable and dependable system to reduce accidents caused by driver inattentiveness, thereby protecting both drivers and other road users.
The project involves developing a driver drowsiness detection system using an edge device with a camera and deploying a neural network model to monitor driver behavior. Key steps include:
The project utilizes a combination of hardware and software technologies:
Data for model training and testing is gathered from Kaggle and sources like UTA-RLDD. The gathered data is diverse and covers various scenarios. It is transformed to simulate an infrared effect and pre-processed to extract relevant features such as facial landmarks. Additional data is collected via accelerometer readings to detect accidents.
The model is built using TensorFlow and converted to TensorFlow Lite to optimize for edge device deployment. It employs convolutional neural networks (CNNs) for detecting driver drowsiness by analyzing facial features. OpenCV and Dlib libraries are used for image processing and facial landmark detection.
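As a rough illustration of this step, the sketch below builds a small Keras CNN and converts it to TensorFlow Lite for the edge device; the input size and layer sizes are illustrative assumptions, not the project's exact architecture.

```python
import tensorflow as tf

# Minimal CNN for binary drowsiness classification (alert vs. drowsy).
# Input shape and layer sizes are illustrative, not the project's exact design.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(96, 96, 1)),        # grayscale / IR-style frames
    tf.keras.layers.Conv2D(16, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),   # drowsiness probability
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
# model.fit(train_ds, validation_data=val_ds, epochs=10)  # train on the prepared dataset

# Convert the trained model to TensorFlow Lite for edge deployment.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]   # quantize for a smaller footprint
tflite_model = converter.convert()
with open("drowsiness_detector.tflite", "wb") as f:
    f.write(tflite_model)
```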
The system architecture consists of an edge device with a camera and accelerometer, running the drowsiness detection model. The architecture follows a procedural programming approach for the device and an MVC architecture for the web portal. The device continuously monitors driver behavior and triggers alerts if drowsiness is detected. The web portal allows users to download updates and report incidents.
Implementation includes integrating the hardware components, developing the machine learning model, and setting up the web portal. The system undergoes rigorous testing using Pytest to ensure reliability and performance. Real-world testing is conducted to validate the system's effectiveness in detecting driver drowsiness and preventing accidents. Continuous feedback and updates will be provided to enhance system safety and functionality.
This project, conducted in collaboration with KravitzLab at Washington University in St. Louis, involved experiments on MOUSERAT using the Pallidus MR1 to collect activity data. The study focused on temporal analysis to derive the desired insights.
The Temporal Analysis of MR1 Data project was conducted for KravitzLab, focusing on the activity tracking data of subjects over time. The analysis involved identifying temporal patterns, testing hypotheses, and comparing day versus night activity levels using a Streamlit-based Python app featuring interactive Plotly charts.
The project was developed using Python with libraries such as Pandas for data manipulation, Plotly for interactive visualizations, and Streamlit for building the web app. The environment was set up to handle and visualize large datasets efficiently.
The project's scope included:
Data was sourced from the Pallidus MR1 activity tracking system, provided by KravitzLab. The dataset contained timestamped activity readings for multiple subjects, recorded at regular intervals.
The analysis included:
Key hypotheses tested were:
Statistical tests such as t-tests and ANOVA were employed to validate these hypotheses.
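As an illustration of the day-versus-night comparison, here is a minimal sketch using pandas and SciPy; the column names and the day/night cutoff hours are assumptions, not the exact MR1 export format.

```python
import pandas as pd
from scipy import stats

# Illustrative column names ("timestamp", "activity") -- the real MR1 export may differ.
df = pd.read_csv("mr1_activity.csv", parse_dates=["timestamp"])

# Label each reading as day (07:00-19:00) or night; the cutoff hours are assumptions.
df["period"] = df["timestamp"].dt.hour.map(lambda h: "day" if 7 <= h < 19 else "night")

day = df.loc[df["period"] == "day", "activity"]
night = df.loc[df["period"] == "night", "activity"]

# Welch's t-test for a difference in mean activity between day and night.
t_stat, p_value = stats.ttest_ind(day, night, equal_var=False)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```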
The Stock Insights project aims to leverage machine learning to forecast future stock values, thereby assisting investors and businesses in making informed decisions. By analyzing historical stock data of top-performing companies on the Pakistan Stock Exchange (PSX), the future volume and opening prices of stocks are predicted.
Stock market forecasting is a challenging task due to the volatile and unpredictable nature of stock prices. This project addresses the need for a reliable and accurate forecasting method by using Long Short-Term Memory (LSTM) neural networks, which are well-suited for time series prediction.
The project focuses on forecasting two key stock indicators: volume and opening price. The process involves several steps:
The project is developed using the following tools and technologies:
Historical data of top-performing PSX companies is collected by scraping the PSX Data Portal. The data includes variables such as Open, High, Low, Close, and Volume.
The project utilizes LSTM neural networks, which are particularly effective for time series prediction. The data is preprocessed and transformed into the 3-D shape required by LSTM models. Two separate models are trained: one for volume forecasting and one for opening-price forecasting.
The performance of the models is evaluated using Mean Squared Error (MSE) on the test set. The loss curves and accuracy metrics for both volume and open price forecasts are analyzed to assess the models' effectiveness. Regularization techniques, such as L2 regularization and dropout, are employed to improve model performance and reduce overfitting.
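The sketch below shows one way such an LSTM forecaster can be assembled in Keras, including the 3-D windowing, dropout, and L2 regularization mentioned above; the lookback window and layer sizes are illustrative assumptions, not the project's exact configuration.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, regularizers

def make_windows(series: np.ndarray, lookback: int = 30):
    """Slice a 1-D series into (samples, timesteps, features) windows for the LSTM."""
    X, y = [], []
    for i in range(len(series) - lookback):
        X.append(series[i:i + lookback])
        y.append(series[i + lookback])
    X = np.array(X)[..., np.newaxis]   # 3-D shape: (samples, lookback, 1)
    return X, np.array(y)

# Illustrative architecture -- the real layer sizes and lookback may differ.
model = tf.keras.Sequential([
    layers.Input(shape=(30, 1)),
    layers.LSTM(64, return_sequences=True, kernel_regularizer=regularizers.l2(1e-4)),
    layers.Dropout(0.2),
    layers.LSTM(32),
    layers.Dense(1),                   # next-day volume or opening price
])
model.compile(optimizer="adam", loss="mse")
# X, y = make_windows(scaled_prices); model.fit(X, y, epochs=50, validation_split=0.2)
```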
Future predictions are plotted to provide a clear insight into the stock's potential direction. The project demonstrates that while LSTM models can capture the overall trend in stock prices, achieving precise day-to-day predictions remains challenging due to the inherent volatility of the stock market.
The project was completed as part of an internship at a research lab for Software Defined Optical Networks. The task was to predict the state of the control switches when light of certain wavelengths passes through a specialized photonic switch.
The project builds a model for photonics switching systems that employs the XGBoost algorithm with chain classifiers to optimize the control of optical switches. This approach aims to predict the precise control parameters required to achieve desired switching states, thereby improving the efficiency and reliability of photonics systems in high-speed communication networks.
Photonics switching systems require precise control to manage the routing of light signals in communication networks. Traditional control methods can be complex and inefficient. This project addresses the challenge by developing an ML-based inverse model that learns from data to predict the necessary control actions, thereby simplifying the control process and improving system performance.
The project scope includes data collection, model training, evaluation, and deployment. Key steps involve:
The project is developed using Python, with libraries such as XGBoost for model implementation and Scikit-learn for preprocessing and evaluation. Development was done in Jupyter notebook and VS Code, with version control managed through Git.
Data is sourced from simulations and real-world photonics switching systems, including control inputs, system states, and resulting behaviors. Data gathering involves recording extensive operational data, which is then preprocessed through normalization, noise reduction, and splitting into training, validation, and test sets to ensure robust model training.
The model for controlling photonics switching systems is treated as a multilabel binary classification task, with each output being binary (0 or 1). The Classifier Chain approach from the Scikit-learn library is utilized, with XGBoost as the base estimator, to address the multi-label nature of the control parameters, allowing sequential prediction of each control variable based on previous predictions.
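A minimal sketch of this setup with scikit-learn and XGBoost is shown below; the synthetic data stands in for the real optical measurements, while the hyperparameters match the table below.

```python
from sklearn.datasets import make_multilabel_classification
from sklearn.metrics import hamming_loss
from sklearn.model_selection import train_test_split
from sklearn.multioutput import ClassifierChain
from xgboost import XGBClassifier

# Synthetic stand-in for the optical measurements and binary switch states.
X, Y = make_multilabel_classification(n_samples=1000, n_features=8, n_classes=6, random_state=42)
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.2, random_state=42)

# XGBoost base estimator wrapped in a classifier chain for the multilabel targets.
base = XGBClassifier(n_estimators=140, max_depth=5, reg_lambda=1)
chain = ClassifierChain(base, order="random", random_state=42)
chain.fit(X_train, Y_train)

Y_pred = chain.predict(X_test)
print("Hamming loss:", hamming_loss(Y_test, Y_pred))
```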
| Hyperparameter | Description | Value |
| --- | --- | --- |
| n_estimators | Number of Trees | 140 |
| max_depth | Maximum Tree Depth | 5 |
| reg_lambda | L2 Regularization | 1 |
| Metric | Description | Score |
| --- | --- | --- |
| Hamming Loss | Fraction of incorrectly predicted labels | 0.0401 (4%) |
| Precision | Ratio of correctly predicted positive labels to total predicted labels | 95% |
| Recall | Ratio of correctly predicted positive labels to total positive labels | 96% |
| Macro F1-Measure | Harmonic mean of precision and recall | 95.6% |
BizReview Analyzer is a Streamlit web application designed to analyze business reviews and provide insights. It allows users to visualize business locations on a map, explore review details, and conduct market comparison analytics based on location and business type.
BizReview Analyzer is a web application built with Streamlit, aimed at delivering in-depth insights into business performance through the analysis of customer reviews. Users can visualize business locations on an interactive map, view detailed lists of businesses and reviews, and conduct market analysis based on geographic locations and business categories.
This project focuses on helping users make data-driven decisions by offering location-based business insights. It includes features such as mapping business locations, analyzing customer reviews, and comparing market performance across different regions and business types, making it useful for business owners and market analysts.
The project workflow consists of four main steps:
Integrating the Google Places API to fetch business locations and reviews.
Processing the retrieved data using Python requests and pandas for data handling.
Building visualizations using geopandas, folium, and plotly to create interactive maps and analytical charts.
Developing a Streamlit interface to display business data on a map, list reviews, and perform analytics.
The project is developed using Python 3 in a Streamlit environment. It requires access to the Google Places API for data gathering. Key frameworks include:
The application retrieves live data from the Google Places API. This includes business information such as location, reviews, and ratings. The data is filtered and presented based on user selections like business type, country, and city, enabling dynamic insights.
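The snippet below sketches this retrieval step against the Places Text Search and Place Details endpoints; the example query and the exact field selection are assumptions rather than the app's actual parameters.

```python
import requests
import pandas as pd

API_KEY = "YOUR_GOOGLE_PLACES_API_KEY"  # assumption: a valid Places API key is available

# Search for businesses of a given type in a given city (example query only).
search = requests.get(
    "https://maps.googleapis.com/maps/api/place/textsearch/json",
    params={"query": "coffee shops in Lahore", "key": API_KEY},
).json()

rows = []
for place in search.get("results", []):
    # Fetch name, rating, reviews, and coordinates via the Place Details endpoint.
    details = requests.get(
        "https://maps.googleapis.com/maps/api/place/details/json",
        params={"place_id": place["place_id"],
                "fields": "name,rating,reviews,geometry",
                "key": API_KEY},
    ).json().get("result", {})
    for review in details.get("reviews", []):
        rows.append({
            "business": details.get("name"),
            "rating": review.get("rating"),
            "text": review.get("text"),
            "lat": details["geometry"]["location"]["lat"],
            "lng": details["geometry"]["location"]["lng"],
        })

reviews_df = pd.DataFrame(rows)  # fed into the map and analytics views
```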
BizReview Analyzer features four core functionalities:
Visualizes business locations on an interactive map.
Displays detailed information about businesses and their reviews.
Provides statistical insights and visualizations of customer reviews.
Compares the performance of various businesses based on location and review data.
The primary stakeholders for the app include business owners, market analysts, and investors who seek to gain insights into business performance based on customer reviews and location data. It also serves data-driven decision-makers in industries like retail, hospitality, and services, helping them understand market trends, customer sentiment, and competitive positioning in specific geographic areas.
NEAT Chrome Dino Game is an AI-driven project where an agent is trained using the NEAT algorithm to play the Chrome Dino game autonomously.
NEAT Chrome Dino Game is an AI-powered Python application where the AI agent learns to play the popular Chrome Dino game. The AI model is trained using the NEAT (NeuroEvolution of Augmenting Topologies) algorithm, and after successful training, it can autonomously achieve high scores.
The project aims to develop an AI model that learns to play and improve at the Chrome Dino game using NEAT. The goal is for the agent to consistently score 1000+ points, demonstrating the success of the NEAT algorithm in reinforcement learning.
The project involved the following key steps:
Developed the Chrome Dino game from scratch using Pygame, applying OOP principles.
Integrated the NEAT algorithm to evolve an AI agent that learns to play the game.
Trained the AI model, saving the best-performing agent.
Tested and refined the agent to ensure it reaches a score of 1000+.
The project is done in Python 3 with the following key tools:
The game was built in Pygame, using object-oriented principles. Key classes include:
The game simulates the original Chrome Dino game, with randomized obstacles and a scoring system.
The NEAT algorithm was integrated to train an AI agent to control the Dino’s actions
based on game state observations. The NEAT library was used to evolve neural networks, optimizing the agent to survive longer and score higher.
The AI agent was trained over multiple generations using the NEAT algorithm. The fitness function was designed to reward the agent for surviving longer (scoring higher) and avoiding obstacles. After training, the best-performing agent was saved for future use.
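The sketch below outlines such a training loop with the neat-python library; the `DinoGame` class, its methods, and the state inputs are placeholders for the project's own Pygame code, not its actual identifiers.

```python
import neat

def eval_genomes(genomes, config):
    """Fitness loop: each genome controls a Dino; longer survival means higher fitness."""
    for genome_id, genome in genomes:
        net = neat.nn.FeedForwardNetwork.create(genome, config)
        genome.fitness = 0
        game = DinoGame()                     # placeholder for this project's Pygame game class
        while not game.game_over:
            # Inputs such as distance to the next obstacle and game speed are assumptions.
            output = net.activate(game.get_state())
            if output[0] > 0.5:
                game.jump()
            game.step()
            genome.fitness = game.score       # reward surviving longer / scoring higher

config = neat.Config(
    neat.DefaultGenome, neat.DefaultReproduction,
    neat.DefaultSpeciesSet, neat.DefaultStagnation,
    "neat_config.txt",                        # NEAT hyperparameters (population size, mutation rates, ...)
)
population = neat.Population(config)
population.add_reporter(neat.StdOutReporter(True))
winner = population.run(eval_genomes, 50)     # evolve for up to 50 generations
```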
The F.O.A.M dashboard is a tool that retrieves and consolidates past and current U.S. government contracts. It helps contractors and businesses identify new opportunities and craft winning proposals by analyzing relevant data and competitor successes.
The Future Opportunity Assessment Manager (F.O.A.M) is a dashboard designed to consolidate new government contract opportunities and past competitor successes into a single, accessible location. The dashboard aims to aid users in spotting relevant contracts and assist in crafting proposals using previously successful strategies. The project is developed using Python and various libraries, with a focus on data manipulation, analysis, and interactive web application development.
The scope of the F.O.A.M project includes developing a user-friendly dashboard for accessing and analyzing government contract data, providing filters and visualizations to identify relevant opportunities, and implementing features to display and analyze past competitor successes. Key steps involve:
The F.O.A.M dashboard is built using Python, chosen for its robust data manipulation and web application libraries. Key technologies include:
Data gathering for the F.O.A.M project involves
The F.O.A.M dashboard is designed to be user-friendly and interactive, allowing users to analyze government contract opportunities and competitor information using various filters. It features various sections, including current opportunities, competitor analysis, and forecast recompetes, each equipped with filters, KPIs and visualizations to aid in data interpretation.
This section allows users to explore and filter current government contract opportunities. It provides detailed insights and visualizations on different types of opportunities, agencies, and posting dates, helping users identify and prioritize potential contracts.
The Competitor Info section offers detailed information about past government contracts awarded to competitors. It includes key metrics and visualizations showing the number and value of past awards by recipient and agency, helping users understand the competitive landscape and strategies.
In the Forecast Recompetes section, users can analyze upcoming contract recompetes and expiring contracts. Visualizations display award amounts and timelines, aiding users in planning and preparing for future contract opportunities.
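As an illustration of how such a filtered view can be wired together in Streamlit and Plotly, here is a minimal sketch; the data file and column names are assumptions, not the dashboard's actual schema.

```python
import streamlit as st
import pandas as pd
import plotly.express as px

# Illustrative Current Opportunities view; column names are assumptions.
df = pd.read_csv("opportunities.csv", parse_dates=["posted_date"])

st.sidebar.header("Filters")
agencies = st.sidebar.multiselect("Agency", sorted(df["agency"].unique()))
opp_types = st.sidebar.multiselect("Opportunity type", sorted(df["type"].unique()))

filtered = df.copy()
if agencies:
    filtered = filtered[filtered["agency"].isin(agencies)]
if opp_types:
    filtered = filtered[filtered["type"].isin(opp_types)]

# KPI and a simple postings-over-time chart.
st.metric("Open opportunities", len(filtered))
fig = px.histogram(filtered, x="posted_date", color="type",
                   title="Opportunities by posting date")
st.plotly_chart(fig, use_container_width=True)
st.dataframe(filtered)
```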
The primary stakeholders for the F.O.A.M dashboard include
These users benefit from the dashboard's ability to consolidate and analyze contract data, helping them identify new opportunities and understand competitor strategies.
This project was developed for an eBay seller store, leveraging their sales data to generate meaningful insights.
The "Business Performance Analytics" project is designed to provide a comprehensive and interactive dashboard for the Chief Financial Officer (CFO) of a company. This dashboard aims to offer deep insights into the company’s financial health, sales performance, customer behavior, demand elasticity, and marketing effectiveness. By consolidating data from various departments of the company, the dashboard enables informed decision-making and strategic planning to drive business growth and efficiency.
The scope of the Business Performance Analytics project includes the development of a multi-functional dashboard with the following key aspects:
The primary objectives of the Business Performance Analytics project are:
The project is developed using the following technologies and tools:
The Business Performance Analytics dashboard is structured into six main sections, each containing specific charts and figures to deliver targeted insights:
Provides a high-level summary of the company’s financial performance, including key metrics and trends.
Includes visualizations for income statements, debt and equity analysis, cash flow, and profit/loss over time.
Focuses on detailed analysis of sales performance, revenue, and costs.
Offers insights into revenue trends, product performance, cost breakdowns, and sales volumes.
Delivers insights into customer behavior, value, and segmentation.
Includes metrics for customer lifetime value, customer acquisition costs, sales by customer groups, and conversion rates.
Analyzes the impact of price changes on product demand.
Provides insights into price elasticity, product pricing strategies, and the relationship between pricing and sales volumes (a worked elasticity example is sketched after this list).
Evaluates the effectiveness of various marketing channels.
Includes analysis of advertising performance, conversion rates, customer journey funnels, and average order value by marketing channels.
Offers detailed financial reports and expense breakdowns.
Includes visualizations for cash flow, expense categorization, and overall financial health.
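To make the Demand Elasticity section concrete, the sketch below computes arc (midpoint) price elasticity from a small stand-in sales table; the numbers and column names are illustrative, not the store's actual data.

```python
import pandas as pd

# Illustrative data; the real dashboard derives this from the eBay sales records.
sales = pd.DataFrame({
    "price":    [20.0, 22.0, 24.0, 26.0],
    "quantity": [500,  460,  400,  330],
})

# Arc (midpoint) price elasticity of demand between consecutive price points:
# E = (dQ / mean Q) / (dP / mean P); |E| > 1 indicates elastic demand.
dq = sales["quantity"].diff()
dp = sales["price"].diff()
mean_q = sales["quantity"].rolling(2).mean()
mean_p = sales["price"].rolling(2).mean()
sales["elasticity"] = (dq / mean_q) / (dp / mean_p)
print(sales)
```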
This project was completed while learning about LLMs in the course Generative AI with LLMs taught by DeepLearning.ai.
This project focuses on enhancing daily-life dialogue summarization by fine-tuning the Flan-T5 model. The objective is to explore Parameter Efficient Fine-Tuning (PEFT) techniques to improve the model's performance in generating concise and accurate summaries of everyday conversations.
The scope of this project encompasses the implementation and comparison of two fine-tuning approaches:
The project aims to achieve better summarization results with reduced computational overhead and training time.
Using the Hugging Face "knkarthick/dialogsum" dataset.
Loading the base Flan-T5 model and tokenizer.
Conducting initial tests to establish a baseline performance.
Preparing the dataset by formatting dialogue-summary pairs into the required prompt structure, i.e.:
Prompt:
Summarize the following conversation.
Chris: This is his part of the conversation.
Antje: This is her part of the conversation.
Summary:
Training the model with the entire dataset using standard fine-tuning techniques.
Introducing and training the model with LoRA adapters.
Assessing performance through qualitative human evaluations and quantitative metrics like ROUGE.
The project is developed using the following technologies and tools:
The primary data source for this project is the "knkarthick/dialogsum" dataset available on Hugging Face. This dataset contains various dialogue-summary pairs, which are essential for training and evaluating the summarization model.
The model building process began with loading the Hugging Face
"knkarthick/dialogsum" dataset and initializing the Flan-T5 base model along
with its tokenizer. We first conducted zero-shot inferencing to establish a
performance baseline. The dataset was then preprocessed by formatting
dialogue-summary pairs into structured prompts suitable for the model. For full
fine-tuning, the Flan-T5 model was trained using the entire dataset with
specific parameters like learning rate, epochs, and weight decay. Subsequently,
we implemented Parameter Efficient Fine-Tuning (PEFT) using the LoRA method,
which involved adding low-rank adapter layers to the base model and freezing its
parameters, allowing only the adapters to be trained. This dual approach allowed
us to compare the efficacy of traditional fine-tuning against the more efficient
PEFT method.
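A minimal sketch of the PEFT setup with Hugging Face `transformers` and `peft` is shown below; the LoRA hyperparameters follow the configuration tables later in this section, while the base checkpoint name (`google/flan-t5-base`) is an assumption.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
from peft import LoraConfig, get_peft_model, TaskType

# Assumed base checkpoint; the course exercise may use a different Flan-T5 size.
model_name = "google/flan-t5-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
base_model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

lora_config = LoraConfig(
    r=32,                        # low-rank matrix size
    lora_alpha=32,               # scaling factor
    target_modules=["q", "v"],   # attention projections to adapt
    lora_dropout=0.05,
    bias="none",
    task_type=TaskType.SEQ_2_SEQ_LM,
)

# Wrap the frozen base model with trainable LoRA adapters.
peft_model = get_peft_model(base_model, lora_config)
peft_model.print_trainable_parameters()  # only a small fraction of weights are trainable
```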
Below is the summary of approach-specific model training.
Trained the Flan-T5 model on the entire dataset.
Training parameters such as learning rate, number of epochs, and weight decay were configured as follows:
| Training Parameter | Description | Value |
| --- | --- | --- |
| Learning Rate | Rate of learning updates | 0.00001 |
| Epochs | Number of training epochs | 15 |
| Weight Decay | Regularization term | 0.01 |
| Max Steps | Maximum training steps | 10 |
Added LoRA adapter layers to the base model.
Froze the underlying model parameters, training only the adapter layers.
PEFT-specific training parameters configured as:
| PEFT-Training Parameter | Description | Value |
| --- | --- | --- |
| Rank | Low-rank matrix size | 32 |
| LoRA alpha | Scaling factor | 32 |
| Target Modules | Modules to adapt | q, v |
| LoRA Dropout | Dropout rate | 0.05 |
General model training parameters were configured as:
| Training Parameter | Description | Value |
| --- | --- | --- |
| Learning Rate | Rate of learning updates | 32 |
| Epochs | Number of training epochs | 32 |
| Max Steps | Maximum training steps | 150 |
Qualitatively, the generated summaries were evaluated by humans.
| Source | Summary |
| --- | --- |
| Baseline Summary | Person1 teaches Person2 how to upgrade software and hardware in Person2's system. |
| Original Model | Person1: Have you considered upgrading your system? |
| Fine-Tuned Model | Person1: I'm thinking of upgrading my computer. |
Quantitative evaluation was done using ROUGE score.
| Model | ROUGE-1 | ROUGE-2 | ROUGE-L | ROUGE-Lsum |
| --- | --- | --- | --- | --- |
| Original Model | 0.2111 | 0.0613 | 0.1799 | 0.1800 |
| Full Fine-Tuned Model | 0.2329 | 0.0724 | 0.2015 | 0.2015 |
Qualitatively, the generated summaries were evaluated as follows.
| Source | Summary |
| --- | --- |
| Baseline Summary | Person1 teaches Person2 how to upgrade software and hardware in Person2's system. |
| Original Model | Person1: I'm thinking of upgrading my computer. |
| PEFT Model | Upgrade your computer. |
For quantitative evaluation, the ROUGE score was used.
| Model | ROUGE-1 | ROUGE-2 | ROUGE-L | ROUGE-Lsum |
| --- | --- | --- | --- | --- |
| Original Model | 0.2409 | 0.1177 | 0.2200 | 0.2213 |
| PEFT Model | 0.3007 | 0.1434 | 0.2475 | 0.2506 |
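For reference, scores like these can be computed with Hugging Face's `evaluate` library; the sketch below uses placeholder summary lists rather than the actual dialogsum test outputs.

```python
import evaluate

# Placeholder inputs; in the project these are the model outputs and human baselines
# from the dialogsum test split.
rouge = evaluate.load("rouge")

peft_summaries = ["Upgrade your computer."]
human_baselines = ["Person1 teaches Person2 how to upgrade software and hardware in Person2's system."]

scores = rouge.compute(
    predictions=peft_summaries,
    references=human_baselines,
    use_stemmer=True,
)
print(scores)  # dict with rouge1, rouge2, rougeL, rougeLsum
```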
The PEFT-enhanced model demonstrated superior performance over the fully fine-tuned model, achieving higher ROUGE scores and producing more accurate and concise summaries. With an absolute improvement of 5.98 percentage points in ROUGE-1 over the original model, the PEFT approach proves to be more efficient and effective for dialogue summarization.
Chest Cancer Classification using MLFlow and CI/CD is a deep learning project that classifies chest cancer from CT scan images. It utilizes MLFlow for experiment tracking, employs pre-trained models, and integrates CI/CD for seamless deployment on AWS EC2 using GitHub Actions and Docker.
Chest Cancer Classification using MLFlow is a deep learning project focused on detecting chest cancer from CT scan images. The project incorporates MLFlow for experiment tracking, CI/CD for deployment, and utilizes pre-trained models like VGG16, MobileNet, and ResNet50 for classification.
The project aims to streamline chest cancer detection by training deep learning classifiers using chest CT scan images. It also practices CI/CD principles for seamless deployment and robustness in machine learning applications.
The project uses a Kaggle chest CT scan dataset for classification tasks. Data ingestion is managed using the Strategy pattern, supporting local, Kaggle, and Google Drive data sources, which can be configured in a YAML configuration file.
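The sketch below illustrates the Strategy-pattern ingestion idea; the class names, YAML keys, and methods are hypothetical, not the project's actual identifiers.

```python
from abc import ABC, abstractmethod
from pathlib import Path
import yaml

class IngestionStrategy(ABC):
    """Hypothetical interface for fetching the CT-scan dataset from a configured source."""
    @abstractmethod
    def download(self, destination: Path) -> Path:
        ...

class LocalIngestion(IngestionStrategy):
    def __init__(self, source: str):
        self.source = Path(source)
    def download(self, destination: Path) -> Path:
        return self.source                 # data already on disk

class KaggleIngestion(IngestionStrategy):
    def __init__(self, dataset: str):
        self.dataset = dataset
    def download(self, destination: Path) -> Path:
        # e.g. invoke the Kaggle API here: kaggle datasets download -d <dataset>
        ...
        return destination

def build_strategy(config_path: str = "config.yaml") -> IngestionStrategy:
    """Pick the ingestion strategy from the YAML configuration.
    A Google Drive strategy would plug in the same way."""
    cfg = yaml.safe_load(open(config_path))["data_ingestion"]
    strategies = {"local": LocalIngestion, "kaggle": KaggleIngestion}
    return strategies[cfg["source_type"]](cfg["source"])
```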
Key technologies and tools used in building the project include:
House Price Prediction is a comprehensive ML project that predicts house prices using the Ames Housing dataset from Kaggle. It employs MLflow for experiment tracking, integrates a Flask API for real-time predictions, and features a Streamlit app for user interaction.
House Price Prediction is an end-to-end machine learning project that predicts house prices based on input features. It leverages MLflow for experiment tracking and model management, and integrates a Flask API with a Streamlit app for real-time predictions.
The project aims to predict house prices using a comprehensive ML pipeline that includes data preprocessing, model training, and deployment. It focuses on using design patterns for robust code, with a strong emphasis on data analysis and feature engineering.
The project utilizes the Ames Housing dataset from Kaggle, which contains comprehensive information about housing features and sale prices. It has a rich set of attributes, including both numerical and categorical features that provide insights into various aspects of the properties.
Key technologies and tools used in building the project include:
The model training pipeline uses MLflow to track experiments, log metrics, and manage model versions. The trained model is registered in MLflow, allowing for easy versioning and comparison of different models.
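A minimal sketch of this tracking step is shown below; the regressor, the synthetic stand-in data, and the registered model name are assumptions standing in for the project's actual pipeline.

```python
import mlflow
import mlflow.sklearn
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the preprocessed Ames Housing features.
X, y = make_regression(n_samples=500, n_features=10, noise=10.0, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

mlflow.set_experiment("house-price-prediction")

with mlflow.start_run():
    model = RandomForestRegressor(n_estimators=200, random_state=42)
    model.fit(X_train, y_train)

    mse = mean_squared_error(y_test, model.predict(X_test))
    mlflow.log_param("n_estimators", 200)
    mlflow.log_metric("mse", mse)

    # Register the model so the inference pipeline can load a specific version later.
    mlflow.sklearn.log_model(model, "model", registered_model_name="house_price_model")
```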
The trained model is served through a Flask API with an endpoint /predict for real-time predictions. The API uses an inference pipeline to load the model from MLflow and respond to prediction requests.
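The endpoint can be sketched roughly as follows; the registered model URI, version, and payload format are assumptions rather than the project's exact implementation.

```python
import mlflow.pyfunc
import pandas as pd
from flask import Flask, jsonify, request

app = Flask(__name__)

# Load a specific registered model version from MLflow (name/version assumed).
model = mlflow.pyfunc.load_model("models:/house_price_model/1")

@app.route("/predict", methods=["POST"])
def predict():
    # Expect a JSON payload with one record of housing features.
    features = pd.DataFrame([request.get_json()])
    prediction = model.predict(features)
    return jsonify({"predicted_price": float(prediction[0])})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```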
A Streamlit app provides a user-friendly interface for inputting data and receiving predictions. The app communicates with the Flask API to get predictions and display results to the user.
The project was done in collaboration with Ammonite LLC, a shipping company that offers intermodal logistics services to traders, retailers, freight forwarders, shipping lines, and other groups. The goal is to provide the company with real-time insights and analytics to make informed decisions regarding inventory management and trading activities.
The Inventory Management App for Ammonite LLC is a web-based application designed to streamline the analysis and presentation of container shipping, inventory management, and trading activities. Utilizing the power of the Streamlit framework, this app integrates multiple data sources to provide real-time insights and analytics, helping Ammonite LLC optimize their operations and decision-making processes.
The project aims to develop a comprehensive dashboard that covers various aspects of inventory management, including tracking inventory levels, analyzing sales and costs, monitoring inventory movement, updating sales port information, analyzing trading prices, and providing commodity prices and relevant news.
The app is built using Python. Key technologies & tools include:
The Inventory Management App features a comprehensive dashboard with multiple tabs, each providing crucial insights and functionalities for effective inventory and trading management. These tabs include:
Offers a snapshot of key metrics, helping users quickly grasp the overall status of inventory levels and financial performance.
Provides detailed analysis of inventory sales and cost structure.
Tracks the movement of inventory, ensuring accurate stock management and timely replenishment.
Provides updates on sales activities across different ports, facilitating better logistics planning and coordination.
Offers real-time analysis of commodity prices, aiding users in making informed trading decisions.
Integrates data from yfinance to provide up-to-date information on commodity prices and market trends (a minimal yfinance sketch appears at the end of this section).
Includes a geo-political calendar to keep users informed about global events that might impact operations.
Consolidates relevant news and updates from the company's Telegram channel, keeping users informed about recent developments and strategic insights.
All combined, the app provides a holistic view of operations, helping users make well-informed decisions, optimize inventory levels, enhance sales strategies, and stay updated on market trends and global events.
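As referenced in the Commodity Prices tab above, a minimal yfinance sketch might look like this; the futures tickers are illustrative choices, not necessarily the ones tracked in the app.

```python
import yfinance as yf
import pandas as pd

# Common Yahoo Finance futures symbols, used here purely for illustration.
tickers = {"Crude Oil": "CL=F", "Natural Gas": "NG=F", "Copper": "HG=F"}

frames = []
for name, symbol in tickers.items():
    # Pull one month of daily closes for each commodity.
    hist = yf.Ticker(symbol).history(period="1mo")[["Close"]].rename(columns={"Close": name})
    frames.append(hist)

prices = pd.concat(frames, axis=1)
latest = prices.iloc[-1]
print(latest)  # latest close for each commodity, ready for the dashboard KPIs
```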