SpaceX Launch Analysis

From May 2025 to July 2025, I completed a 5-week, 45-hour capstone project for the IBM Data Science Professional Certificate as part of my B.S. in Computer Science from MSU Denver. The project analyzed SpaceX launch outcomes through an end-to-end data science pipeline in Python: collecting and cleaning launch data, exploring it with interactive dashboards, and predicting mission success with machine learning classifiers.

👉 View IBM Data Science Specialization Certificate
👉 View GitHub Repository


🔍 Project Highlights

  • Designed a Python-based data science pipeline over five weeks, combining data from the SpaceX REST API with web-scraped launch records.
  • Built interactive dashboards with Plotly Dash and Folium for launch site analysis.
  • Achieved >90% accuracy in predicting launch outcomes using machine learning classifiers.
  • Delivered a technical presentation summarizing methodology and insights.
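
For context on the accuracy figure above, here is a minimal sketch of how accuracy and a confusion matrix can be computed on a held-out test set. The labels are invented for illustration (1 = successful outcome, 0 = failure) and are not this project's data or results; the helper names `accuracy` and `confusion_counts` are hypothetical.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and the same length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def confusion_counts(y_true, y_pred):
    """Count (true, predicted) pairs; (0, 1) is a failure predicted as a success."""
    return Counter(zip(y_true, y_pred))


# Hypothetical held-out labels, NOT results from this project.
y_true = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
y_pred = [1, 1, 0, 1, 1, 1, 1, 0, 1, 1]

print("accuracy:", accuracy(y_true, y_pred))
print("confusion:", dict(confusion_counts(y_true, y_pred)))
```

Reporting the confusion counts next to the accuracy is useful when the classes are imbalanced, since a high accuracy alone does not show which kind of error the model makes.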

📦 My Role: Data Scientist

  • Pipeline Development: Built and executed a data science workflow, from data collection to model deployment.
  • Data Analysis: Performed web scraping, data wrangling, and exploratory analysis with Pandas and SQL.
  • Visualization: Created interactive dashboards using Plotly Dash and Folium maps.
  • Modeling: Developed and tuned machine learning classifiers (Logistic Regression, Decision Trees).
  • Presentation: Authored a technical report and presentation for Coursera evaluation.

This role strengthened my skills in data science, machine learning, and data visualization.
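
As an illustration of the web scraping task listed above, the sketch below pulls rows out of an HTML table. The project used BeautifulSoup; this example swaps in the standard library's `html.parser` so it runs without extra dependencies. The sample HTML, its column names, and the `TableParser` class are assumptions made for the example, not the pages or code used in the project.

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


# Hypothetical table, standing in for a scraped launch-records page.
SAMPLE_HTML = """
<table>
  <tr><th>Flight No.</th><th>Launch site</th><th>Outcome</th></tr>
  <tr><td>1</td><td>Site A</td><td>Success</td></tr>
  <tr><td>2</td><td>Site B</td><td>Failure</td></tr>
</table>
"""

parser = TableParser()
parser.feed(SAMPLE_HTML)
header, *records = parser.rows
launches = [dict(zip(header, record)) for record in records]
print(launches)
```

Turning each row into a dictionary keyed by the header makes the scraped records easy to load into a DataFrame for the wrangling step.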


👥 Contributors and Credits

This was a solo academic project for the IBM Data Science Professional Certificate, completed for educational purposes.


✨ Key Features

The SpaceX Launch Analysis project offers:

  1. Data Collection: Extracted launch data via the SpaceX REST API and web scraping with BeautifulSoup.
  2. Data Wrangling: Cleaned and preprocessed data using Pandas.
  3. Exploratory Analysis: Generated scatter plots, bar charts, and SQL-based insights.
  4. Geolocation Visualization: Mapped launch sites with Folium.
  5. Interactive Dashboards: Built real-time filtering dashboards with Plotly Dash.
  6. Predictive Modeling: Classified launch outcomes with >90% accuracy using scikit-learn.

Integrations: SpaceX REST API, Plotly Dash, Folium, Jupyter Notebooks.
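
The collection and wrangling steps above can be sketched roughly as follows. This is a minimal, dependency-free illustration rather than the project's notebook code (which used Pandas): the record layout, the field names, the `flatten` and `fill_missing_with_mean` helpers, and the choice to fill a missing payload mass with the column mean are all assumptions made for the example.

```python
import json

# Hypothetical, trimmed-down records in the general shape of a JSON API
# response. The field names are assumptions, not the real API schema.
RAW = json.loads("""
[
  {"flight_number": 1, "launch_site": "Site A", "payload_mass_kg": 500.0,  "success": true},
  {"flight_number": 2, "launch_site": "Site B", "payload_mass_kg": null,   "success": false},
  {"flight_number": 3, "launch_site": "Site A", "payload_mass_kg": 1500.0, "success": true}
]
""")


def flatten(record):
    """Keep only the columns used downstream and add a binary class label."""
    return {
        "flight_number": record["flight_number"],
        "launch_site": record.get("launch_site"),
        "payload_mass_kg": record.get("payload_mass_kg"),
        "class": 1 if record.get("success") else 0,
    }


def fill_missing_with_mean(rows, key):
    """Replace None values in one column with the mean of the known values."""
    known = [row[key] for row in rows if row[key] is not None]
    mean = sum(known) / len(known) if known else None
    return [{**row, key: mean if row[key] is None else row[key]} for row in rows]


rows = fill_missing_with_mean([flatten(r) for r in RAW], "payload_mass_kg")
for row in rows:
    print(row)
```

Encoding the outcome as a 0/1 `class` column at this stage means the same table can feed both the exploratory charts and the classifiers.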


🛠️ Technologies Used

  • Languages/Libraries: Python, Pandas, scikit-learn, Plotly, BeautifulSoup, Folium
  • Tools: Jupyter Notebooks, GitHub, SQL
  • Workflow: Data wrangling, visualization, statistical modeling, dashboarding
  • Documentation: README, Technical Presentation

📁 Repository Contents

Resource                     Description
SpaceX_API.ipynb             Data collection via SpaceX REST API
Web_Scraping.ipynb           HTML scraping for additional launch records
Data_Wrangling.ipynb         Data cleaning and preprocessing
EDA_Visualization.ipynb      Scatter plots, bar charts, and line graphs
EDA_SQL.ipynb                SQL-based payload and booster insights
Folium_Map.ipynb             Launch site geolocation and outcomes
Plotly_Dash.ipynb            Interactive dashboard with filters and metrics
Predictive_Analysis.ipynb    Classification model predictions
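
To give a flavor of the SQL-based payload and booster analysis, here is a small sketch using Python's built-in `sqlite3` module with an in-memory database. The `launches` table, its columns, and every row in it are hypothetical; the actual schema and queries live in EDA_SQL.ipynb.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE launches ("
    "booster_version TEXT, launch_site TEXT, payload_mass_kg REAL, outcome TEXT)"
)
# Made-up rows for illustration only.
conn.executemany(
    "INSERT INTO launches VALUES (?, ?, ?, ?)",
    [
        ("Booster X", "Site A", 500.0, "Success"),
        ("Booster X", "Site B", 1500.0, "Failure"),
        ("Booster Y", "Site A", 4000.0, "Success"),
        ("Booster Y", "Site A", 6000.0, "Success"),
    ],
)

# Launch count, average payload, and number of successes per booster version.
query = """
    SELECT booster_version,
           COUNT(*)                 AS launches,
           AVG(payload_mass_kg)     AS avg_payload_kg,
           SUM(outcome = 'Success') AS successes
    FROM launches
    GROUP BY booster_version
    ORDER BY booster_version
"""
results = conn.execute(query).fetchall()
conn.close()

for booster, launches, avg_payload, successes in results:
    print(booster, launches, avg_payload, successes)
```

In SQLite a comparison such as `outcome = 'Success'` evaluates to 0 or 1, so summing it counts the matching rows without a separate subquery.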

📈 Project Rigor

The GitHub repository showcases:

  • Structured commit history across five weeks (May–July 2025).
  • Comprehensive Jupyter notebooks covering API integration, web scraping, analysis, visualization, and modeling.
  • Locally deployed Plotly Dash dashboards for interactive exploration.

Setup:

  1. Clone: git clone https://github.com/willmaddock/Data-Science-Capstone-SpaceX.git
  2. Install dependencies: pip install -r requirements.txt
  3. Run notebooks: Use Jupyter to execute SpaceX_API.ipynb, Plotly_Dash.ipynb, etc.
  4. See README for details.

Data Science Pipeline:

  • Stages: Data Collection → Data Wrangling → Exploratory Data Analysis → Interactive Dashboards and Predictive Modeling
  • Outputs: Launch Site Insights, Launch Outcome Predictions
  • Methods and tools along the way: API & Scraping, Cleaning & Preprocessing, SQL & Visualization, Feature Engineering, Plotly Dash & Folium, scikit-learn
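
As one example of what the launch site insights at the end of this pipeline might look like, the sketch below computes a success rate per launch site from cleaned rows. The rows, the site names, and the `success_rate_by_site` helper are hypothetical; in the project this kind of summary is explored through the Plotly Dash dashboard rather than a standalone script.

```python
from collections import defaultdict


def success_rate_by_site(rows):
    """Return {launch_site: share of launches with class == 1}."""
    totals = defaultdict(lambda: [0, 0])  # site -> [successes, launches]
    for row in rows:
        counts = totals[row["launch_site"]]
        counts[0] += row["class"]
        counts[1] += 1
    return {site: wins / total for site, (wins, total) in totals.items()}


# Made-up cleaned rows; "class" is 1 for a successful outcome, 0 otherwise.
rows = [
    {"launch_site": "Site A", "class": 1},
    {"launch_site": "Site A", "class": 0},
    {"launch_site": "Site A", "class": 1},
    {"launch_site": "Site A", "class": 1},
    {"launch_site": "Site B", "class": 0},
    {"launch_site": "Site B", "class": 1},
]

rates = success_rate_by_site(rows)
for site, rate in sorted(rates.items()):
    print(f"{site}: {rate:.0%}")
```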


© 2025 William Maddock - All Rights Reserved