Created by Hieu Nguyen
https://hnguyen76.github.io/Teen_Mental_Health_Analysis/reports/DASHBOARD.html
Exploratory data analysis project focused on teen social media behavior, sleep habits, academic performance, stress, anxiety, addiction level, and depression label indicators.
This repository contains a complete beginner-friendly data workflow:
Status: Active learning project
Primary dataset: Teen_Mental_Health_Dataset.csv
Cleaned dataset: Teen_Mental_Health_Cleaned.csv
Cleaning script: cleaning.py
Main visualization notebook: teen_visiualize.ipynb
Dashboard app: dashboard.py
GitHub-viewable dashboard: reports/DASHBOARD.md
.
|-- Teen_Mental_Health_Dataset.csv
|-- Teen_Mental_Health_Cleaned.csv
|-- reports/
| |-- DATA_CLEANING_REPORT.md
| |-- DATA_DICTIONARY.md
| |-- DASHBOARD.md
| |-- EDA_REPORT.md
| |-- figures/
| |-- SOW.md
| `-- VISUALIZATION_REPORT.md
|-- cleaning.py
|-- dashboard.py
|-- generate_static_dashboard.py
|-- teen_visiualize.ipynb
|-- requirements.txt
`-- README.md
How do teen lifestyle and digital behavior variables relate to sleep, academic performance, stress, anxiety, addiction level, and depression label outcomes?
This project is descriptive EDA. It identifies patterns and relationships in the dataset, but it does not prove causation and should not be used as medical advice.
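One way to look at the "patterns and relationships" mentioned above is a simple correlation ranking. The sketch below is illustrative only: the helper name `numeric_correlations` is hypothetical, and it assumes `depression_label` is stored as a numeric 0/1 column. Correlation describes association in this dataset and says nothing about cause.

```python
import pandas as pd


def numeric_correlations(df: pd.DataFrame, target: str = "depression_label") -> pd.Series:
    """Return Pearson correlations of each numeric column with `target`.

    Hypothetical helper for illustration; assumes `target` is numeric (0/1).
    """
    # Keep only numeric columns; text columns such as gender are skipped
    numeric = df.select_dtypes(include="number")
    if target not in numeric.columns:
        raise KeyError(f"{target!r} is missing or not numeric")

    # Correlation of every numeric column with the target, excluding itself
    corr = numeric.corr()[target].drop(target)

    # Order by absolute strength so the strongest associations come first
    order = corr.abs().sort_values(ascending=False).index
    return corr.reindex(order)


# Example call (requires the cleaned CSV in the working directory):
# numeric_correlations(pd.read_csv("Teen_Mental_Health_Cleaned.csv"))
```

A negative value for `sleep_hours` would mean that rows labelled 1 tend to report less sleep, which is a description of the data, not evidence of a causal link.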
The cleaned teen dataset contains:
Key columns include:
age
gender
daily_social_media_hours
platform_usage
sleep_hours
screen_time_before_sleep
academic_performance
physical_activity
social_interaction_level
stress_level
anxiety_level
addiction_level
depression_label
See reports/DATA_DICTIONARY.md for the full data dictionary.
Rows with depression_label 1 represent 2.58% of the dataset.
Rows with depression_label 1 show lower average sleep hours than label 0.
Rows with depression_label 1 show higher average daily social media hours than label 0.
Detailed interpretation is available in reports/EDA_REPORT.md.
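The label comparisons above can be reproduced with a short group-by. This is a minimal sketch, not the code used to produce the report: `summarize_by_label` is a hypothetical helper, and it assumes the column names listed earlier (`depression_label`, `sleep_hours`, `daily_social_media_hours`).

```python
import pandas as pd


def summarize_by_label(
    df: pd.DataFrame,
    label: str = "depression_label",
    metrics=("sleep_hours", "daily_social_media_hours"),
) -> pd.DataFrame:
    """Per-label share of rows (percent) and mean of each metric.

    Hypothetical helper for illustration only.
    """
    # Percentage of rows carrying each label value
    share = (
        df[label]
        .value_counts(normalize=True)
        .mul(100)
        .round(2)
        .rename("share_pct")
    )

    # Average of each metric within each label group
    means = df.groupby(label)[list(metrics)].mean().round(2)

    # One row per label value: metric means plus the share column
    return means.join(share)


# Example call (requires the cleaned CSV in the working directory):
# print(summarize_by_label(pd.read_csv("Teen_Mental_Health_Cleaned.csv")))
```

Because label 1 is a small share of the rows, its group means rest on few observations and are best read as descriptive.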
Open the full static dashboard here:

Create and activate a virtual environment:
python -m venv .venv2
.\.venv2\Scripts\activate
Install dependencies:
python -m pip install -r requirements.txt
Start Jupyter:
jupyter notebook
Open the visualization notebook:
teen_visiualize.ipynb
View the static dashboard directly on GitHub:
reports/DASHBOARD.md
Run the Streamlit dashboard:
streamlit run dashboard.py
Regenerate the GitHub dashboard charts:
python generate_static_dashboard.py
The dashboard includes:
Load the cleaned dataset:
import pandas as pd
df = pd.read_csv("Teen_Mental_Health_Cleaned.csv")
df.head()
Check missing values:
df.isnull().sum()
Check duplicate rows:
df.duplicated().sum()
Export a cleaned CSV:
df.to_csv("Teen_Mental_Health_Cleaned.csv", index=False)
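The individual checks above can be bundled into one pass. The sketch below is an assumption about what a basic cleaning step might look like; it does not reproduce cleaning.py, whose actual steps are not shown here. The names `quality_report` and `basic_clean` are hypothetical, and dropping every row with a missing value is an assumed policy.

```python
import pandas as pd


def quality_report(df: pd.DataFrame) -> dict:
    """Summarize row count, missing values per column, and duplicate rows."""
    return {
        "rows": len(df),
        "missing_by_column": df.isnull().sum().to_dict(),
        "duplicate_rows": int(df.duplicated().sum()),
    }


def basic_clean(df: pd.DataFrame) -> pd.DataFrame:
    """Drop exact duplicate rows, then rows with any missing value.

    Assumed policy for illustration; the project's cleaning.py may differ.
    """
    return df.drop_duplicates().dropna().reset_index(drop=True)


# Example workflow (requires the raw CSV in the working directory):
# raw = pd.read_csv("Teen_Mental_Health_Dataset.csv")
# print(quality_report(raw))
# basic_clean(raw).to_csv("Teen_Mental_Health_Cleaned.csv", index=False)
```

Running `quality_report` before and after cleaning gives a quick check that the cleaned file has no remaining duplicates or missing values.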
The depression_label column is treated as a dataset label, not a clinical diagnosis.