Machine learning engineer with expertise in building efficient pipelines and conducting in-depth data analysis. Focused on creating scalable solutions that turn raw data into valuable insights, and committed to improving model accuracy while streamlining processes for optimal performance across diverse projects and applications.
I am a Software Engineering graduate with a strong interest in the data space. As a freelance ML Engineer, I help clients design, develop, and deploy machine learning solutions that address complex business problems and generate value for their organizations. I am skilled in Tableau, Power BI, and Python tools for data insights.
Deep Learning
Edge Computing
Driver Drowsiness Detection is a system that monitors driver behavior using a camera mounted on the dashboard. It is an application of computer vision utilizing an edge device to monitor the driver.
Data Analysis
Data Visualization
This project, conducted in collaboration with KravitzLab at Washington University in St. Louis, involved experiments on MOUSERAT using the Pallidus MR1 to collect activity data. The study focused on temporal analysis to derive the desired insights.
Machine Learning
Data Analysis
An application of Deep Learning to forecast future stock volumes and prices.
Machine Learning
The project was completed as part of an internship at a research lab for Software Defined Optical Networks. The task was to predict the state of the control switches when light of certain wavelengths passes through a specialized photonic switch.
Data Analysis
Data Visualization
A Streamlit app that uses the Google Places API to analyze and visualize reviews of various business places across different locations.
NEAT-AI
The project aimed at building an AI agent that learns to play the famous Chrome Dino game.
Data Analysis
Data Visualization
The F.O.A.M dashboard is a tool that provides insights into past and current U.S. government contracts, helping contractors and businesses identify new opportunities by analyzing relevant data and competitor info.
Data Analysis
Data Dashboards
This project, in collaboration with Ammonite LLC, aimed to deliver real-time insights and analytics for improved decision-making. The solution helps optimize inventory management and trading activities for enhanced operational efficiency in the logistics sector.
Data Analysis
Dashboards
This project was developed for an eBay seller store, leveraging their sales data to generate meaningful insights.
NLP
Supervised Fine-Tuning
This project was completed while learning about LLMs in the course Generative AI with LLMs taught by DeepLearning.ai.
Deep Learning
MLOps
AWS
This project is designed for chest cancer classification using the Keras Model API and MLflow for experiment tracking.
MLOps
Data Analysis
This project is designed to predict house prices using an end-to-end machine learning (ML) pipeline. It incorporates MLOps principles to streamline the development, deployment, and maintenance of the model.
Working with Huda on our data visualization project was a breath of fresh air! 😊 Her attention to detail and professionalism exceeded all expectations, and her cooperation and deep understanding of our needs were second to none. Always the best and most efficient worker - we're already collaborating on more projects!
Very very professional, very dedicated to excellent work, very good understanding of data and translating it into meaningful dashboards
She got my attention with her first response, and even though I had started the work with another, I still decided to go with her. I was able to understand the plot and also make changes, and she was there to assist with all my questions. I just learned about Streamlit thanks to her. I highly recommend this freelancer.
Amazing performance and very high-quality delivery. Communication is great, engagement from her side is super and she does not stop until we get to the desired target. She masters streamlit and plotly which has been incredibly valuable for me. We will keep working together for sure :-)
This is by far the best person on Fiverr I have worked with! Extremely talented, and goes the extra mile to ensure you are happy with the order. 10/10 experience and I highly recommend you pick this person. In my case she took an Excel workbook and transferred it into Python. Couldn’t be more happy :-)
This was an amazing developer. She is great at communication, answering questions, and she delivered an outstanding dashboard for me!
She is very passionate about her work and does more than we expect. Very kind and humble, she has a full grasp of the concepts and responds very quickly. She also keeps helping even after the project is completed.
Went above and beyond to give me the result I wanted. Highly recommend!
It has been a pleasure to work with Hudiye. Very attentive to all questions. She takes her work seriously and diligently.
Driver Drowsiness Detection is a system that monitors driver behavior using a camera mounted on the dashboard. It is an application of computer vision utilizing an edge device to monitor the driver. A buzzer is used to alert the drowsy driver.
Despite the maturity of drowsiness detection technology, existing solutions are often too expensive or unreliable for widespread use. The growing number of road users highlights the need for an affordable and dependable system to reduce accidents caused by driver inattentiveness, thereby protecting both drivers and other road users.
The project involves developing a driver drowsiness detection system using an edge device with a camera and deploying a neural network model to monitor driver behavior. Key steps include:
The project utilizes a combination of hardware and software technologies:
Data for model training and testing is gathered from Kaggle and sources like UTA-RLDD. The gathered data is diverse and covers various scenarios. It is transformed to simulate an infrared effect and pre-processed to extract relevant features such as facial landmarks. Additional data is collected via accelerometer readings to detect accidents.
The model is built using TensorFlow and converted to TensorFlow Lite to optimize for edge device deployment. It employs convolutional neural networks (CNNs) for detecting driver drowsiness by analyzing facial features. OpenCV and Dlib libraries are used for image processing and facial landmark detection.
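As a rough illustration of this step, the sketch below builds a small Keras CNN and converts it to TensorFlow Lite for the edge device; the input size and layer sizes are illustrative assumptions, not the project's exact architecture.

```python
import tensorflow as tf

# Minimal CNN for binary drowsiness classification (alert vs. drowsy).
# Input shape and layer sizes are illustrative, not the project's exact design.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(96, 96, 1)),        # grayscale / IR-style frames
    tf.keras.layers.Conv2D(16, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),   # drowsiness probability
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
# model.fit(train_ds, validation_data=val_ds, epochs=10)  # train on the prepared dataset

# Convert the trained model to TensorFlow Lite for edge deployment.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]   # quantize for a smaller footprint
tflite_model = converter.convert()
with open("drowsiness_detector.tflite", "wb") as f:
    f.write(tflite_model)
```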
The system architecture consists of an edge device with a camera and accelerometer, running the drowsiness detection model. The architecture follows a procedural programming approach for the device and an MVC architecture for the web portal. The device continuously monitors driver behavior and triggers alerts if drowsiness is detected. The web portal allows users to download updates and report incidents.
Implementation includes integrating the hardware components, developing the machine learning model, and setting up the web portal. The system undergoes rigorous testing using Pytest to ensure reliability and performance. Real-world testing is conducted to validate the system's effectiveness in detecting driver drowsiness and preventing accidents. Continuous feedback and updates will be provided to enhance system safety and functionality.
This project, conducted in collaboration with KravitzLab at Washington University in St. Louis, involved experiments on MOUSERAT using the Pallidus MR1 to collect activity data. The study focused on temporal analysis to derive the desired insights.
The Temporal Analysis of MR1 Data project was conducted for KravitzLab, focusing on the activity tracking data of subjects over time. The analysis involved identifying temporal patterns, testing hypotheses, and comparing day versus night activity levels using a Streamlit-based Python app featuring interactive Plotly charts.
The project was developed using Python with libraries such as Pandas for data manipulation, Plotly for interactive visualizations, and Streamlit for building the web app. The environment was set up to handle and visualize large datasets efficiently.
The project's scope included:
Data was sourced from the Pallidus MR1 activity tracking system, provided by KravitzLab. The dataset contained timestamped activity readings for multiple subjects, recorded at regular intervals.
The analysis included:
Key hypotheses tested were:
Statistical tests such as t-tests and ANOVA were employed to validate these hypotheses.
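As an illustration of the day-versus-night comparison, here is a minimal sketch using pandas and SciPy; the column names and the day/night cutoff hours are assumptions, not the exact MR1 export format.

```python
import pandas as pd
from scipy import stats

# Illustrative column names ("timestamp", "activity") -- the real MR1 export may differ.
df = pd.read_csv("mr1_activity.csv", parse_dates=["timestamp"])

# Label each reading as day (07:00-19:00) or night; the cutoff hours are assumptions.
df["period"] = df["timestamp"].dt.hour.map(lambda h: "day" if 7 <= h < 19 else "night")

day = df.loc[df["period"] == "day", "activity"]
night = df.loc[df["period"] == "night", "activity"]

# Welch's t-test for a difference in mean activity between day and night.
t_stat, p_value = stats.ttest_ind(day, night, equal_var=False)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```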
The Stock Insights project aims to leverage machine learning to forecast future stock values, thereby assisting investors and businesses in making informed decisions. By analyzing historical stock data of top-performing companies on the Pakistan Stock Exchange (PSX), the future volume and opening prices of stocks are predicted.
Stock market forecasting is a challenging task due to the volatile and unpredictable nature of stock prices. This project addresses the need for a reliable and accurate forecasting method by using Long Short-Term Memory (LSTM) neural networks, which are well-suited for time series prediction.
The project focuses on forecasting two key stock indicators: volume and opening price. The process involves several steps:
The project is developed using the following tools and technologies:
Historical data of top-performing PSX companies is collected by scraping the PSX Data Portal. The data includes variables such as Open, High, Low, Close, and Volume.
The project utilizes LSTM neural networks, which are particularly effective for time series prediction. The data is preprocessed and transformed into the 3-D shape required by LSTM models. Two separate models are trained: one for volume forecasting and one for opening-price forecasting.
The performance of the models is evaluated using Mean Squared Error (MSE) on the test set. The loss curves and accuracy metrics for both volume and open price forecasts are analyzed to assess the models' effectiveness. Regularization techniques, such as L2 regularization and dropout, are employed to improve model performance and reduce overfitting.
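The sketch below shows one way such an LSTM forecaster can be assembled in Keras, including the 3-D windowing, dropout, and L2 regularization mentioned above; the lookback window and layer sizes are illustrative assumptions, not the project's exact configuration.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, regularizers

def make_windows(series: np.ndarray, lookback: int = 30):
    """Slice a 1-D series into (samples, timesteps, features) windows for the LSTM."""
    X, y = [], []
    for i in range(len(series) - lookback):
        X.append(series[i:i + lookback])
        y.append(series[i + lookback])
    X = np.array(X)[..., np.newaxis]   # 3-D shape: (samples, lookback, 1)
    return X, np.array(y)

# Illustrative architecture -- the real layer sizes and lookback may differ.
model = tf.keras.Sequential([
    layers.Input(shape=(30, 1)),
    layers.LSTM(64, return_sequences=True, kernel_regularizer=regularizers.l2(1e-4)),
    layers.Dropout(0.2),
    layers.LSTM(32),
    layers.Dense(1),                   # next-day volume or opening price
])
model.compile(optimizer="adam", loss="mse")
# X, y = make_windows(scaled_prices); model.fit(X, y, epochs=50, validation_split=0.2)
```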
Future predictions are plotted to provide a clear insight into the stock's potential direction. The project demonstrates that while LSTM models can capture the overall trend in stock prices, achieving precise day-to-day predictions remains challenging due to the inherent volatility of the stock market.
The project was completed as part of an internship at a research lab for Software Defined Optical Networks. The task was to predict the state of the control switches when light of certain wavelengths passes through a specialized photonic switch.
The project builds a model for photonics switching systems that employs the XGBoost algorithm with chain classifiers to optimize the control of optical switches. This approach aims to predict the precise control parameters required to achieve desired switching states, thereby improving the efficiency and reliability of photonics systems in high-speed communication networks.
Photonics switching systems require precise control to manage the routing of light signals in communication networks. Traditional control methods can be complex and inefficient. This project addresses the challenge by developing an ML-based inverse model that learns from data to predict the necessary control actions, thereby simplifying the control process and improving system performance.
The project scope includes data collection, model training, evaluation, and deployment. Key steps involve:
The project is developed using Python, with libraries such as XGBoost for model implementation and Scikit-learn for preprocessing and evaluation. Development was done in Jupyter notebook and VS Code, with version control managed through Git.
Data is sourced from simulations and real-world photonics switching systems, including control inputs, system states, and resulting behaviors. Data gathering involves recording extensive operational data, which is then preprocessed through normalization, noise reduction, and splitting into training, validation, and test sets to ensure robust model training.
The model for controlling photonics switching systems is treated as a multilabel binary classification task, with each output being binary (0 or 1). The Classifier Chain approach from the Scikit-learn library is utilized, with XGBoost as the base estimator, to address the multi-label nature of the control parameters, allowing sequential prediction of each control variable based on previous predictions.
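A minimal sketch of this setup with scikit-learn and XGBoost is shown below; the synthetic data stands in for the real optical measurements, while the hyperparameters match the table below.

```python
from sklearn.datasets import make_multilabel_classification
from sklearn.metrics import hamming_loss
from sklearn.model_selection import train_test_split
from sklearn.multioutput import ClassifierChain
from xgboost import XGBClassifier

# Synthetic stand-in for the optical measurements and binary switch states.
X, Y = make_multilabel_classification(n_samples=1000, n_features=8, n_classes=6, random_state=42)
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.2, random_state=42)

# XGBoost base estimator wrapped in a classifier chain for the multilabel targets.
base = XGBClassifier(n_estimators=140, max_depth=5, reg_lambda=1)
chain = ClassifierChain(base, order="random", random_state=42)
chain.fit(X_train, Y_train)

Y_pred = chain.predict(X_test)
print("Hamming loss:", hamming_loss(Y_test, Y_pred))
```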
| Hyperparameter | Description | Value |
| --- | --- | --- |
| n_estimators | Number of Trees | 140 |
| max_depth | Maximum Tree Depth | 5 |
| reg_lambda | L2 Regularization | 1 |
| Metric | Description | Score |
| --- | --- | --- |
| Hamming Loss | Fraction of incorrectly predicted labels | 0.0401 (4%) |
| Precision | Ratio of correctly predicted positive labels to total predicted labels | 95% |
| Recall | Ratio of correctly predicted positive labels to total positive labels | 96% |
| Macro F1-Measure | Harmonic mean of precision and recall | 95.6% |
BizReview Analyzer is a Streamlit web application designed to analyze business reviews and provide insights. It allows users to visualize business locations on a map, explore review details, and conduct market comparison analytics based on location and business type.
BizReview Analyzer is a web application built with Streamlit, aimed at delivering in-depth insights into business performance through the analysis of customer reviews. Users can visualize business locations on an interactive map, view detailed lists of businesses and reviews, and conduct market analysis based on geographic locations and business categories.
This project focuses on helping users make data-driven decisions by offering location-based business insights. It includes features such as mapping business locations, analyzing customer reviews, and comparing market performance across different regions and business types, making it useful for business owners and market analysts.
The project workflow consists of four main steps:
Integrating the Google Places API to fetch business locations and reviews.
Processing the retrieved data using Python requests and pandas for data handling.
Building visualizations using geopandas, folium, and plotly to create interactive maps and analytical charts.
Developing a Streamlit interface to display business data on a map, list reviews, and perform analytics.
The project is developed using Python 3 in a Streamlit environment. It requires access to the Google Places API for data gathering. Key frameworks include:
The application retrieves live data from the Google Places API. This includes business information such as location, reviews, and ratings. The data is filtered and presented based on user selections like business type, country, and city, enabling dynamic insights.
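The snippet below sketches this retrieval step against the Places Text Search and Place Details endpoints; the example query and the exact field selection are assumptions rather than the app's actual parameters.

```python
import requests
import pandas as pd

API_KEY = "YOUR_GOOGLE_PLACES_API_KEY"  # assumption: a valid Places API key is available

# Search for businesses of a given type in a given city (example query only).
search = requests.get(
    "https://maps.googleapis.com/maps/api/place/textsearch/json",
    params={"query": "coffee shops in Lahore", "key": API_KEY},
).json()

rows = []
for place in search.get("results", []):
    # Fetch name, rating, reviews, and coordinates via the Place Details endpoint.
    details = requests.get(
        "https://maps.googleapis.com/maps/api/place/details/json",
        params={"place_id": place["place_id"],
                "fields": "name,rating,reviews,geometry",
                "key": API_KEY},
    ).json().get("result", {})
    for review in details.get("reviews", []):
        rows.append({
            "business": details.get("name"),
            "rating": review.get("rating"),
            "text": review.get("text"),
            "lat": details["geometry"]["location"]["lat"],
            "lng": details["geometry"]["location"]["lng"],
        })

reviews_df = pd.DataFrame(rows)  # fed into the map and analytics views
```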
BizReview Analyzer features four core functionalities:
Visualizes business locations on an interactive map.
Displays detailed information about businesses and their reviews.
Provides statistical insights and visualizations of customer reviews.
Compares the performance of various businesses based on location and review data.
The primary stakeholders for the app include business owners, market analysts, and investors who seek to gain insights into business performance based on customer reviews and location data. It also serves data-driven decision-makers in industries like retail, hospitality, and services, helping them understand market trends, customer sentiment, and competitive positioning in specific geographic areas.
NEAT Chrome Dino Game is an AI-driven project where an agent is trained using the NEAT algorithm to play the Chrome Dino game autonomously.
NEAT Chrome Dino Game is an AI-powered Python application where the AI agent learns to play the popular Chrome Dino game. The AI model is trained using the NEAT (NeuroEvolution of Augmenting Topologies) algorithm, and after successful training, it can autonomously achieve high scores.
The project aims to develop an AI model that learns to play and improve at the Chrome Dino game using NEAT. The goal is for the agent to consistently score 1000+ points, demonstrating the success of the NEAT algorithm in reinforcement learning.
The project involved the following key steps:
Developed the Chrome Dino game from scratch using Pygame, applying OOP principles.
Integrated the NEAT algorithm to evolve an AI agent that learns to play the game.
Trained the AI model, saving the best-performing agent.
Tested and refined the agent to ensure it reaches a score of 1000+.
The project is done in Python 3 with the following key tools:
The game was built in Pygame, using object-oriented principles. Key classes include:
The game simulates the original Chrome Dino game, with randomized obstacles and a scoring system.
The NEAT algorithm was integrated to train an AI agent to control the Dino’s actions
based on game state observations. The NEAT library was used to evolve neural networks, optimizing the agent to survive longer and score higher.
The AI agent was trained over multiple generations using the NEAT algorithm. The fitness function was designed to reward the agent for surviving longer (scoring higher) and avoiding obstacles. After training, the best-performing agent was saved for future use.
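The sketch below outlines such a training loop with the neat-python library; the `DinoGame` class, its methods, and the state inputs are placeholders for the project's own Pygame code, not its actual identifiers.

```python
import neat

def eval_genomes(genomes, config):
    """Fitness loop: each genome controls a Dino; longer survival means higher fitness."""
    for genome_id, genome in genomes:
        net = neat.nn.FeedForwardNetwork.create(genome, config)
        genome.fitness = 0
        game = DinoGame()                     # placeholder for this project's Pygame game class
        while not game.game_over:
            # Inputs such as distance to the next obstacle and game speed are assumptions.
            output = net.activate(game.get_state())
            if output[0] > 0.5:
                game.jump()
            game.step()
            genome.fitness = game.score       # reward surviving longer / scoring higher

config = neat.Config(
    neat.DefaultGenome, neat.DefaultReproduction,
    neat.DefaultSpeciesSet, neat.DefaultStagnation,
    "neat_config.txt",                        # NEAT hyperparameters (population size, mutation rates, ...)
)
population = neat.Population(config)
population.add_reporter(neat.StdOutReporter(True))
winner = population.run(eval_genomes, 50)     # evolve for up to 50 generations
```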
The F.O.A.M dashboard is a tool that retrieves and consolidates past and current U.S. government contracts. It helps contractors and businesses identify new opportunities and craft winning proposals by analyzing relevant data and competitor successes.
The Future Opportunity Assessment Manager (F.O.A.M) is a dashboard designed to consolidate new government contract opportunities and past competitor successes into a single, accessible location. The dashboard aims to aid users in spotting relevant contracts and assist in crafting proposals using previously successful strategies. The project is developed using Python and various libraries, with a focus on data manipulation, analysis, and interactive web application development.
The scope of the F.O.A.M project includes developing a user-friendly dashboard for accessing and analyzing government contract data, providing filters and visualizations to identify relevant opportunities, and implementing features to display and analyze past competitor successes. Key steps involve:
The F.O.A.M dashboard is built using Python, chosen for its robust data manipulation and web application libraries. Key technologies include:
Data gathering for the F.O.A.M project involves
The F.O.A.M dashboard is designed to be user-friendly and interactive, allowing users to analyze government contract opportunities and competitor information using various filters. It features various sections, including current opportunities, competitor analysis, and forecast recompetes, each equipped with filters, KPIs and visualizations to aid in data interpretation.
This section allows users to explore and filter current government contract opportunities. It provides detailed insights and visualizations on different types of opportunities, agencies, and posting dates, helping users identify and prioritize potential contracts.
The Competitor Info section offers detailed information about past government contracts awarded to competitors. It includes key metrics and visualizations showing the number and value of past awards by recipient and agency, helping users understand the competitive landscape and strategies.
In the Forecast Recompetes section, users can analyze upcoming contract recompetes and expiring contracts. Visualizations display award amounts and timelines, aiding users in planning and preparing for future contract opportunities.
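As an illustration of how such a filtered view can be wired together in Streamlit and Plotly, here is a minimal sketch; the data file and column names are assumptions, not the dashboard's actual schema.

```python
import streamlit as st
import pandas as pd
import plotly.express as px

# Illustrative Current Opportunities view; column names are assumptions.
df = pd.read_csv("opportunities.csv", parse_dates=["posted_date"])

st.sidebar.header("Filters")
agencies = st.sidebar.multiselect("Agency", sorted(df["agency"].unique()))
opp_types = st.sidebar.multiselect("Opportunity type", sorted(df["type"].unique()))

filtered = df.copy()
if agencies:
    filtered = filtered[filtered["agency"].isin(agencies)]
if opp_types:
    filtered = filtered[filtered["type"].isin(opp_types)]

# KPI and a simple postings-over-time chart.
st.metric("Open opportunities", len(filtered))
fig = px.histogram(filtered, x="posted_date", color="type",
                   title="Opportunities by posting date")
st.plotly_chart(fig, use_container_width=True)
st.dataframe(filtered)
```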
The primary stakeholders for the F.O.A.M dashboard include
These users benefit from the dashboard's ability to consolidate and analyze contract data, helping them identify new opportunities and understand competitor strategies.
This project was developed for an eBay seller store, leveraging their sales data to generate meaningful insights.
The "Business Performance Analytics" project is designed to provide a comprehensive and interactive dashboard for the Chief Financial Officer (CFO) of a company. This dashboard aims to offer deep insights into the company’s financial health, sales performance, customer behavior, demand elasticity, and marketing effectiveness. By consolidating data from various departments of the company, the dashboard enables informed decision-making and strategic planning to drive business growth and efficiency.
The scope of the Business Performance Analytics project includes the development of a multi-functional dashboard with the following key aspects:
The primary objectives of the Business Performance Analytics project are:
The project is developed using the following technologies and tools:
The Business Performance Analytics dashboard is structured into six main sections, each containing specific charts and figures to deliver targeted insights:
Provides a high-level summary of the company’s financial performance, including key metrics and trends.
Includes visualizations for income statements, debt and equity analysis, cash flow, and profit/loss over time.
Focuses on detailed analysis of sales performance, revenue, and costs.
Offers insights into revenue trends, product performance, cost breakdowns, and sales volumes.
Delivers insights into customer behavior, value, and segmentation.
Includes metrics for customer lifetime value, customer acquisition costs, sales by customer groups, and conversion rates.
Analyzes the impact of price changes on product demand.
Provides insights into price elasticity, product pricing strategies, and the relationship between pricing and sales volumes (a worked elasticity example is sketched after this list).
Evaluates the effectiveness of various marketing channels.
Includes analysis of advertising performance, conversion rates, customer journey funnels, and average order value by marketing channels.
Offers detailed financial reports and expense breakdowns.
Includes visualizations for cash flow, expense categorization, and overall financial health.
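To make the Demand Elasticity section concrete, the sketch below computes arc (midpoint) price elasticity from a small stand-in sales table; the numbers and column names are illustrative, not the store's actual data.

```python
import pandas as pd

# Illustrative data; the real dashboard derives this from the eBay sales records.
sales = pd.DataFrame({
    "price":    [20.0, 22.0, 24.0, 26.0],
    "quantity": [500,  460,  400,  330],
})

# Arc (midpoint) price elasticity of demand between consecutive price points:
# E = (dQ / mean Q) / (dP / mean P); |E| > 1 indicates elastic demand.
dq = sales["quantity"].diff()
dp = sales["price"].diff()
mean_q = sales["quantity"].rolling(2).mean()
mean_p = sales["price"].rolling(2).mean()
sales["elasticity"] = (dq / mean_q) / (dp / mean_p)
print(sales)
```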
This project was completed while learning about LLMs in the course Generative AI with LLMs taught by DeepLearning.ai.
This project focuses on enhancing daily-life dialogue summarization by fine-tuning the Flan-T5 model. The objective is to explore Parameter Efficient Fine-Tuning (PEFT) techniques to improve the model's performance in generating concise and accurate summaries of everyday conversations.
The scope of this project encompasses the implementation and comparison of two fine-tuning approaches:
The project aims to achieve better summarization results with reduced computational overhead and training time.
Using the Hugging Face "knkarthick/dialogsum" dataset.
Loading the base Flan-T5 model and tokenizer.
Conducting initial tests to establish a baseline performance.
Preparing the dataset by formatting dialogue-summary pairs into the required prompt structure, i.e.:
Prompt:
Summarize the following conversation.
Chris: This is his part of the conversation.
Antje: This is her part of the conversation.
Summary:
Training the model with the entire dataset using standard fine-tuning techniques.
Introducing and training the model with LoRA adapters.
Assessing performance through qualitative human evaluations and quantitative metrics like ROUGE.
The project is developed using the following technologies and tools:
The primary data source for this project is the "knkarthick/dialogsum" dataset available on Hugging Face. This dataset contains various dialogue-summary pairs, which are essential for training and evaluating the summarization model.
The model building process began with loading the Hugging Face
"knkarthick/dialogsum" dataset and initializing the Flan-T5 base model along
with its tokenizer. We first conducted zero-shot inferencing to establish a
performance baseline. The dataset was then preprocessed by formatting
dialogue-summary pairs into structured prompts suitable for the model. For full
fine-tuning, the Flan-T5 model was trained using the entire dataset with
specific parameters like learning rate, epochs, and weight decay. Subsequently,
we implemented Parameter Efficient Fine-Tuning (PEFT) using the LoRA method,
which involved adding low-rank adapter layers to the base model and freezing its
parameters, allowing only the adapters to be trained. This dual approach allowed
us to compare the efficacy of traditional fine-tuning against the more efficient
PEFT method.
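A minimal sketch of the PEFT setup with Hugging Face `transformers` and `peft` is shown below; the LoRA hyperparameters follow the configuration tables later in this section, while the base checkpoint name (`google/flan-t5-base`) is an assumption.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
from peft import LoraConfig, get_peft_model, TaskType

# Assumed base checkpoint; the course exercise may use a different Flan-T5 size.
model_name = "google/flan-t5-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
base_model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

lora_config = LoraConfig(
    r=32,                        # low-rank matrix size
    lora_alpha=32,               # scaling factor
    target_modules=["q", "v"],   # attention projections to adapt
    lora_dropout=0.05,
    bias="none",
    task_type=TaskType.SEQ_2_SEQ_LM,
)

# Wrap the frozen base model with trainable LoRA adapters.
peft_model = get_peft_model(base_model, lora_config)
peft_model.print_trainable_parameters()  # only a small fraction of weights are trainable
```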
Below is the summary of approach-specific model training.
Trained the Flan-T5 model on the entire dataset.
Training parameters such as learning rate, number of epochs, and weight decay were configured as follows:
| Training Parameter | Description | Value |
| --- | --- | --- |
| Learning Rate | Rate of learning updates | 0.00001 |
| Epochs | Number of training epochs | 15 |
| Weight Decay | Regularization term | 0.01 |
| Max Steps | Maximum training steps | 10 |
Added LoRA adapter layers to the base model.
Froze the underlying model parameters, training only the adapter layers.
PEFT-specific training parameters configured as:
| PEFT-Training Parameter | Description | Value |
| --- | --- | --- |
| Rank | Low-rank matrix size | 32 |
| LoRA alpha | Scaling factor | 32 |
| Target Modules | Modules to adapt | q, v |
| LoRA Dropout | Dropout rate | 0.05 |
General model training parameters were configured as:
| Training Parameter | Description | Value |
| --- | --- | --- |
| Learning Rate | Rate of learning updates | 32 |
| Epochs | Number of training epochs | 32 |
| Max Steps | Maximum training steps | 150 |
Qualitatively, the generated summaries were evaluated by humans.
| Source | Summary |
| --- | --- |
| Baseline Summary | Person1 teaches Person2 how to upgrade software and hardware in Person2's system. |
| Original Model | Person1: Have you considered upgrading your system? |
| Fine-Tuned Model | Person1: I'm thinking of upgrading my computer. |
Quantitative evaluation was done using ROUGE score.
| Model | ROUGE-1 | ROUGE-2 | ROUGE-L | ROUGE-Lsum |
| --- | --- | --- | --- | --- |
| Original Model | 0.2111 | 0.0613 | 0.1799 | 0.1800 |
| Full Fine-Tuned Model | 0.2329 | 0.0724 | 0.2015 | 0.2015 |
Qualitatively, the generated summaries were evaluated as follows.
| Source | Summary |
| --- | --- |
| Baseline Summary | Person1 teaches Person2 how to upgrade software and hardware in Person2's system. |
| Original Model | Person1: I'm thinking of upgrading my computer. |
| PEFT Model | Upgrade your computer. |
For quantitative evaluation, the ROUGE score was used.
| Model | ROUGE-1 | ROUGE-2 | ROUGE-L | ROUGE-Lsum |
| --- | --- | --- | --- | --- |
| Original Model | 0.2409 | 0.1177 | 0.2200 | 0.2213 |
| PEFT Model | 0.3007 | 0.1434 | 0.2475 | 0.2506 |
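For reference, scores like these can be computed with Hugging Face's `evaluate` library; the sketch below uses placeholder summary lists rather than the actual dialogsum test outputs.

```python
import evaluate

# Placeholder inputs; in the project these are the model outputs and human baselines
# from the dialogsum test split.
rouge = evaluate.load("rouge")

peft_summaries = ["Upgrade your computer."]
human_baselines = ["Person1 teaches Person2 how to upgrade software and hardware in Person2's system."]

scores = rouge.compute(
    predictions=peft_summaries,
    references=human_baselines,
    use_stemmer=True,
)
print(scores)  # dict with rouge1, rouge2, rougeL, rougeLsum
```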
The PEFT-enhanced model demonstrated superior performance over the fully fine-tuned model, achieving higher ROUGE scores and producing more accurate and concise summaries. With an absolute improvement of 5.98 percentage points in ROUGE-1 over the original model, the PEFT approach proves to be more efficient and effective for dialogue summarization.
Chest Cancer Classification using MLFlow and CI/CD is a deep learning project that classifies chest cancer from CT scan images. It utilizes MLFlow for experiment tracking, employs pre-trained models, and integrates CI/CD for seamless deployment on AWS EC2 using GitHub Actions and Docker.
Chest Cancer Classification using MLFlow is a deep learning project focused on detecting chest cancer from CT scan images. The project incorporates MLFlow for experiment tracking, CI/CD for deployment, and utilizes pre-trained models like VGG16, MobileNet, and ResNet50 for classification.
The project aims to streamline chest cancer detection by training deep learning classifiers using chest CT scan images. It also practices CI/CD principles for seamless deployment and robustness in machine learning applications.
The project uses a Kaggle chest CT scan dataset for classification tasks. Data ingestion is managed using the Strategy pattern, supporting local, Kaggle, and Google Drive data sources, which can be configured in a YAML configuration file.
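The sketch below illustrates the Strategy-pattern ingestion idea; the class names, YAML keys, and methods are hypothetical, not the project's actual identifiers.

```python
from abc import ABC, abstractmethod
from pathlib import Path
import yaml

class IngestionStrategy(ABC):
    """Hypothetical interface for fetching the CT-scan dataset from a configured source."""
    @abstractmethod
    def download(self, destination: Path) -> Path:
        ...

class LocalIngestion(IngestionStrategy):
    def __init__(self, source: str):
        self.source = Path(source)
    def download(self, destination: Path) -> Path:
        return self.source                 # data already on disk

class KaggleIngestion(IngestionStrategy):
    def __init__(self, dataset: str):
        self.dataset = dataset
    def download(self, destination: Path) -> Path:
        # e.g. invoke the Kaggle API here: kaggle datasets download -d <dataset>
        ...
        return destination

def build_strategy(config_path: str = "config.yaml") -> IngestionStrategy:
    """Pick the ingestion strategy from the YAML configuration.
    A Google Drive strategy would plug in the same way."""
    cfg = yaml.safe_load(open(config_path))["data_ingestion"]
    strategies = {"local": LocalIngestion, "kaggle": KaggleIngestion}
    return strategies[cfg["source_type"]](cfg["source"])
```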
Key technologies and tools used in building the project include:
House Price Prediction is a comprehensive ML project that predicts house prices using the Ames Housing dataset from Kaggle. It employs MLflow for experiment tracking, integrates a Flask API for real-time predictions, and features a Streamlit app for user interaction.
House Price Prediction is an end-to-end machine learning project that predicts house prices based on input features. It leverages MLflow for experiment tracking and model management, and integrates a Flask API with a Streamlit app for real-time predictions.
The project aims to predict house prices using a comprehensive ML pipeline that includes data preprocessing, model training, and deployment. It focuses on using design patterns for robust code, with a strong emphasis on data analysis and feature engineering.
The project utilizes the Ames Housing dataset from Kaggle, which contains comprehensive information about housing features and sale prices. It has a rich set of attributes, including both numerical and categorical features that provide insights into various aspects of the properties.
Key technologies and tools used in building the project include:
The model training pipeline uses MLflow to track experiments, log metrics, and manage model versions. The trained model is registered in MLflow, allowing for easy versioning and comparison of different models.
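A minimal sketch of this tracking step is shown below; the regressor, the synthetic stand-in data, and the registered model name are assumptions standing in for the project's actual pipeline.

```python
import mlflow
import mlflow.sklearn
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the preprocessed Ames Housing features.
X, y = make_regression(n_samples=500, n_features=10, noise=10.0, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

mlflow.set_experiment("house-price-prediction")

with mlflow.start_run():
    model = RandomForestRegressor(n_estimators=200, random_state=42)
    model.fit(X_train, y_train)

    mse = mean_squared_error(y_test, model.predict(X_test))
    mlflow.log_param("n_estimators", 200)
    mlflow.log_metric("mse", mse)

    # Register the model so the inference pipeline can load a specific version later.
    mlflow.sklearn.log_model(model, "model", registered_model_name="house_price_model")
```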
The trained model is served through a Flask API with an endpoint /predict for real-time predictions. The API uses an inference pipeline to load the model from MLflow and respond to prediction requests.
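The endpoint can be sketched roughly as follows; the registered model URI, version, and payload format are assumptions rather than the project's exact implementation.

```python
import mlflow.pyfunc
import pandas as pd
from flask import Flask, jsonify, request

app = Flask(__name__)

# Load a specific registered model version from MLflow (name/version assumed).
model = mlflow.pyfunc.load_model("models:/house_price_model/1")

@app.route("/predict", methods=["POST"])
def predict():
    # Expect a JSON payload with one record of housing features.
    features = pd.DataFrame([request.get_json()])
    prediction = model.predict(features)
    return jsonify({"predicted_price": float(prediction[0])})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```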
A Streamlit app provides a user-friendly interface for inputting data and receiving predictions. The app communicates with the Flask API to get predictions and display results to the user.
The project was done in collaboration with Ammonite LLC, a shipping company that offers intermodal logistics services to traders, retailers, freight forwarders, shipping lines, and other groups. The goal is to provide the company with real-time insights and analytics to make informed decisions regarding inventory management and trading activities.
The Inventory Management App for Ammonite LLC is a web-based application designed to streamline the analysis and presentation of container shipping, inventory management, and trading activities. Utilizing the power of the Streamlit framework, this app integrates multiple data sources to provide real-time insights and analytics, helping Ammonite LLC optimize their operations and decision-making processes.
The project aims to develop a comprehensive dashboard that covers various aspects of inventory management, including tracking inventory levels, analyzing sales and costs, monitoring inventory movement, updating sales port information, analyzing trading prices, and providing commodity prices and relevant news.
The app is built using Python. Key technologies & tools include:
The Inventory Management App features a comprehensive dashboard with multiple tabs, each providing crucial insights and functionalities for effective inventory and trading management. These tabs include:
Offers a snapshot of key metrics, helping users quickly grasp the overall status of inventory levels and financial performance.
Provides detailed analysis of inventory sales and cost structure.
Tracks the movement of inventory, ensuring accurate stock management and timely replenishment.
Provides updates on sales activities across different ports, facilitating better logistics planning and coordination.
Offers real-time analysis of commodity prices, aiding users in making informed trading decisions.
Integrates data from yfinance to provide up-to-date information on commodity prices and market trends (a minimal yfinance sketch appears at the end of this section).
Includes a geo-political calendar to keep users informed about global events that might impact operations.
Consolidates relevant news and updates from the company's Telegram channel, keeping users informed about recent developments and strategic insights.
All combined, the app provides a holistic view of operations, helping users make well-informed decisions, optimize inventory levels, enhance sales strategies, and stay updated on market trends and global events.
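As referenced in the Commodity Prices tab above, a minimal yfinance sketch might look like this; the futures tickers are illustrative choices, not necessarily the ones tracked in the app.

```python
import yfinance as yf
import pandas as pd

# Common Yahoo Finance futures symbols, used here purely for illustration.
tickers = {"Crude Oil": "CL=F", "Natural Gas": "NG=F", "Copper": "HG=F"}

frames = []
for name, symbol in tickers.items():
    # Pull one month of daily closes for each commodity.
    hist = yf.Ticker(symbol).history(period="1mo")[["Close"]].rename(columns={"Close": name})
    frames.append(hist)

prices = pd.concat(frames, axis=1)
latest = prices.iloc[-1]
print(latest)  # latest close for each commodity, ready for the dashboard KPIs
```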