Skills & Interests

Graph Making

Statistical Analysis

Logical Thinking

Projects

Cluster Analysis Research Project

Research study documenting the cluster analysis of ordinal and binary linguistic data (obtained from the Ewave data source). A model-based clustering approach using finite mixtures is proposed, and general ordinal models are described in the context of one-mode and two-mode hard clustering.

Kaggle Data Shiny Application

Shiny application for interactive analysis of Kaggle suicide statistics aggregated at world level. I built this application when I first learned how to use Shiny for a project assignment at university in 2019.

Time Series Analysis Shiny Dashboard for Orange data

Shiny dashboard presenting time series analysis of randomly generated data. It reproduces the structure of a dashboard I previously created for a client during my internship at Harmonic Analytics.

Time Series Analysis Technical Report

This report summarises a variety of methods used for fitting time series data. The aim is to investigate and compare the performance and predictive ability of different models, based on accuracy metrics and prediction results.

ICMR in patients under 18 years old design study

Design study applying multinomial logistic regression and Kaplan-Meier survival methods to health data. The purpose of this study is to examine the long-term effect of Carpentier-Edwards ring or band annuloplasty in patients under 18 years old. ICMR is a recently introduced medical term, short for Isolated Congenital Mitral Regurgitation.
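As a rough illustration of the Kaplan-Meier estimator named above (a minimal sketch, not the code used in the study), the survival probability at each event time is the running product of 1 - d/n, where d is the number of events at that time and n is the number of patients still at risk. The function name and the follow-up data below are hypothetical.

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each distinct event time.

    durations: follow-up time for each patient
    events: 1 if the event was observed, 0 if the patient was censored
    """
    observed = Counter(t for t, e in zip(durations, events) if e)
    leaving = Counter(durations)  # events and censorings both leave the risk set
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = observed.get(t, 0)
        if d:
            # Patients censored at time t still count as at risk at t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]
    return curve


# Hypothetical follow-up times (years) and event indicators.
durations = [2, 3, 3, 5, 8]
events = [1, 1, 0, 1, 0]
curve = kaplan_meier(durations, events)
for time, prob in curve:
    print(f"t={time}: S(t)={prob:.2f}")
```

Censored patients contribute to the risk set up to their last follow-up but never trigger a drop in the curve, which is why the estimate only changes at the observed event times.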

Statistical Consulting Assignment

I wrote this report in 2019 for a university assignment. The research question concerns the relationship between the amount of fish intake and the mercury level found in the hair of fishermen living in Doha fishing village, Kuwait. I recently redid the analysis and rewrote the findings in order to improve my statistical writing skills.

ICMR in patients under 18 years old follow-up report

This follow-up report explores three statistical methodologies and their application to health data: contingency table analysis, a Bayesian approach to multiple multinomial logistic regression, and the Random Forest classification algorithm. The results are used to draw conclusions about the relationship between the side effect of the annuloplasty band and the clinical outcome of mitral regurgitation in patients with mitral valve diseases.

Multi-class Random Forest classifier in Python

This short notebook describes classification with Random Forest in a multi-class setting, where each classifier fits one class against all the other classes. The confusion matrix and other performance metrics are reported to compare models fitted to imbalanced and balanced health data.
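To show why the confusion matrix and per-class metrics matter on imbalanced data, here is a minimal standard-library sketch. It is not taken from the notebook; the class labels and predictions are made up for illustration, and the helper names are hypothetical.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, labels):
    """Rows are true classes, columns are predicted classes."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


def per_class_recall(matrix):
    """Share of each true class that was predicted correctly."""
    return [row[i] / sum(row) if sum(row) else 0.0
            for i, row in enumerate(matrix)]


# Hypothetical imbalanced outcome: 6 mild, 3 moderate, 1 severe.
labels = ["mild", "moderate", "severe"]
y_true = ["mild"] * 6 + ["moderate"] * 3 + ["severe"]
y_pred = ["mild"] * 5 + ["moderate"] * 3 + ["mild"] * 2

matrix = confusion_matrix(y_true, y_pred, labels)
correct = sum(matrix[i][i] for i in range(len(labels)))
accuracy = correct / len(y_true)
recall = per_class_recall(matrix)
macro_recall = sum(recall) / len(recall)

print(matrix)
print(f"accuracy={accuracy:.2f}, macro recall={macro_recall:.2f}")
```

In this made-up example the overall accuracy looks acceptable while the rare class is never predicted correctly, which the macro-averaged recall exposes. That gap is one reason to compare models on both imbalanced and balanced data.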

NZ COVID-19 (fictitious) flexdashboard

Data visualisation of the number of positive COVID-19 cases (fictitious) by ethnicity at Statistical Area 2018 geographical level, using the highcharter and flexdashboard packages. Please view the dashboard in full screen for a better experience.

Python Plotly Dash Analytics

Application made in Python Dash, summarising the data description and sklearn machine learning (regression and classification) algorithms. The data used is available at this GitHub source.