Project: Analysis of the Personal Key Indicators of Heart Disease Dataset


Authors: Pradeep Muthaiya, Sean Shimohara, Thi Nguyen, and Wesley Cheung

Project Description: This project is our final homework assignment for the Stat 159 course at UC Berkeley. In it, we analyze the publicly available Personal Key Indicators of Heart Disease Dataset. The repo contains the following files and folders:

  • data is a folder that contains the .csv file of the dataset used in our analysis

  • figures is a folder that contains all of the .png files we generated from our analysis

  • tools is a folder that contains the custom functions we used in our analysis, along with some simple tests for these functions

  • is a Markdown file outlining each group member’s contributions to this repo

  • EDA.ipynb is a Jupyter Notebook displaying the exploratory data analysis we ran on the dataset

  • environment.yml is a .yml file that can be used to reproduce the environment in which we ran our analysis

  • is a Markdown file containing the instructions for this homework assignment

  • LICENSE contains the license for our work

  • main.ipynb is a Jupyter Notebook that contains a detailed description of our analysis and findings

  • Makefile lets you easily rerun our analysis

  • model.ipynb is a Jupyter Notebook that displays the process by which we created our classifiers from the dataset

  • pyproject.toml is used to install our custom package

  • setup.cfg is used to install our custom package

  • is used to install our custom package

  • _config.yml is used to build the Jupyter Book

  • _toc.yml is used to build the Jupyter Book
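As a quick check after cloning, the sketch below lists the .csv files in the data folder and prints each file's column names. It is only an illustration: the helper names `list_csv_files` and `read_header` are hypothetical and are not part of the tools package, and it assumes the script is run from the repo root so that the relative path `data` resolves.

```python
import csv
from pathlib import Path


def list_csv_files(data_dir="data"):
    """Return the .csv files found directly inside data_dir, sorted by name.

    Hypothetical helper; "data" is the folder name given in the file list above.
    """
    return sorted(Path(data_dir).glob("*.csv"))


def read_header(csv_path):
    """Return the column names from the first row of a CSV file."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        # next(..., []) yields an empty list if the file has no rows
        return next(csv.reader(f), [])


if __name__ == "__main__":
    # Assumes the current working directory is the repo root.
    for path in list_csv_files():
        print(path.name, read_header(path))
```

Reading only the header row keeps the check fast even for a large dataset. The notebooks themselves may load the data differently.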

Citations: The Dataset Used in Our Analysis:

The Full CDC BRFSS Dataset: