Pinned Project

Data Science Projects notebook set

Collection of data science notebooks spanning EDA, translation, scraping, classical ML, and causal analysis.

Language: Jupyter Notebook
Default branch: main
Last pushed: Jan 25, 2026

Technologies Used

Python notebooks across analytics and ML. The typical stack includes pandas, NumPy, Seaborn/Matplotlib, scikit-learn, TensorFlow/Keras (for translation), and BeautifulSoup (for scraping).

Notebook Breakdown

  1. Student Performance Indicator — EDA through model selection; walks the ML lifecycle from problem framing to training and choosing the best model.
  2. Effect of Government Social Programs on Poverty in Kenya — Descriptive analytics and correlations to understand program impact.
  3. Effect of Petroleum Prices on Demand in Kenya — Correlation and descriptive analysis of price changes vs demand.
  4. Effect of Taxation on SME Performance — Frequency analysis plus descriptive analytics on taxation effects.
  5. Fine-Tuning English-Swahili Translation Model — Deep-learning fine-tuning using TensorFlow/Keras and CIFAR-10 style preprocessing.
  6. Lyrics Finder — Scrapes Genius.com to gather song lyrics (URL collection + HTML parsing).
  7. English-Kiswahili Translation Notebook — GPU-backed fine-tuning for translation tasks.
  8. PandemAI — Data cleaning and formatting pipeline preparation.
  9. Supervised Learning with SVM — Implements and evaluates SVM models.
  10. Supervised Learning with Random Forests — Attribute selection, training, and evaluation with confusion matrices.
  11. Customer Churn Prediction — Feature engineering plus multiple classifiers (RF, AdaBoost, SVC, XGBoost) compared via accuracy and reports.
  12. Causal Inference with Bayesian Networks — Bayesian causal analysis (as listed in the contents).
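
Items 10 and 11 above evaluate classifiers with confusion matrices and accuracy. As a rough illustration of what those two metrics compute, here is a minimal standard-library sketch. The labels, the predictions, and the names `model_a` and `model_b` are made up for illustration and are not taken from the notebooks, which presumably rely on scikit-learn's `confusion_matrix` and `accuracy_score` rather than hand-written helpers.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, labels):
    """Return a matrix with rows = true label, columns = predicted label."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Hypothetical data for illustration only (1 = churned, 0 = stayed).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
predictions = {
    "model_a": [1, 0, 1, 0, 0, 0, 1, 1],
    "model_b": [1, 0, 1, 1, 0, 1, 1, 0],
}

# Score every model the same way, then pick the highest accuracy.
scores = {name: accuracy(y_true, y_pred) for name, y_pred in predictions.items()}
best = max(scores, key=scores.get)

for name, y_pred in predictions.items():
    print(name, scores[name], confusion_matrix(y_true, y_pred, labels=[0, 1]))
print("best:", best)
```

Accuracy alone can hide which class a model gets wrong, which is why the confusion matrix is printed next to it: two models with similar accuracy may differ in how many churners they miss.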

Getting Started

Each notebook can be run in Google Colab via the links provided in the repository. Prerequisites are Python 3.x and common data/ML libraries (TensorFlow, Keras, Matplotlib, Seaborn, NumPy, scikit-learn, BeautifulSoup, pandas, etc.).
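
When running outside Colab, it may help to check which of the libraries listed above are installed before opening a notebook. The sketch below is one possible way to do that with the standard library only. The `PREREQUISITES` list and the `missing_packages` helper are hypothetical names introduced here, and the import names are an assumed mapping (scikit-learn imports as `sklearn`, BeautifulSoup as `bs4`).

```python
from importlib.util import find_spec

# Import names for the libraries listed above (assumed mapping:
# scikit-learn -> "sklearn", BeautifulSoup -> "bs4").
PREREQUISITES = [
    "tensorflow",
    "keras",
    "matplotlib",
    "seaborn",
    "numpy",
    "sklearn",
    "bs4",
    "pandas",
]


def missing_packages(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if find_spec(name) is None]


missing = missing_packages(PREREQUISITES)
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("All listed libraries are available.")
```

`find_spec` only locates a module without importing it, so the check stays fast even for heavy packages such as TensorFlow.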
