Video Tutorial – Full Hockey Analytics Project

Every Wednesday I will upload a new video to YouTube. I will add the videos to this article as we go along.

Schedule

2024-08-21: Getting data from the NHL API based on date selection

2024-08-28: Scrape Shift data from the NHL HTML reports

2024-09-04: Get Play-by-Play and shift data for a full season at a time

2024-09-11: Scrape Play-by-Play and TOI data from the PWHL or the AHL

2024-09-18: Install MySQL and load data to a database

2024-09-25: Clean the NHL Play-by-Play data in MySQL

2024-10-02: Clean and transform the NHL Shift data in MySQL

2024-10-10: Combine the Play-by-Play and Shift data

2024-10-23: Prepare shot data for logistic regression modelling

2024-10-30: Make an xG model in Python using Scikit-learn

2024-11-06: Install Power BI Desktop and load data from the MySQL database

2024-11-13: Build a Shot Visualization in Power BI using Icon Map

2024-11-20: Add Rink zones with GeoJSON

2024-11-27: Build a Line Tool visualization

Introduction

Video 1 – Getting data from the NHL API based on a date selection

In the first video we build a tool to get play-by-play and shift data from the NHL API based on a date selection. This is useful if the goal is to do daily updates.

Link to the Excel file

Video 2 – Scraping Shift Data from the NHL HTML reports

In the second video we scrape and clean shift data directly from the HTML reports. This is relevant if you want to get live shift data, because the shift data in the NHL API doesn’t update until after the game.

Link to the Excel file

Video 3 – Getting data from the NHL API based on a season selection

In the third video we build a tool to get data for a full season at a time. This is relevant if you want to build a database with historic data.
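As a rough illustration of what "a full season at a time" can mean, here is a Python sketch rather than the Excel tool from the video. NHL game IDs are built from the season's start year, a two-digit game type (02 for the regular season) and a four-digit game number. The function name and the game count passed in are assumptions for illustration; the number of games varies by season.

```python
def season_game_ids(start_year: int, n_games: int, game_type: int = 2) -> list[int]:
    """Build NHL-style game IDs such as 2023020001 for one season.

    n_games is supplied by the caller (an assumption here), because the
    number of scheduled games differs between seasons.
    """
    return [
        int(f"{start_year}{game_type:02d}{number:04d}")
        for number in range(1, n_games + 1)
    ]


# 32 teams x 82 games / 2 = 1312 regular-season games in 2023-24
ids = season_game_ids(2023, 1312)
print(ids[0], ids[-1])
```

Each ID in the list can then be used to request that game's play-by-play and shift data.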

Link to the Excel file

Video 4 – Scraping Play-By-Play data from the PWHL (or the AHL)

In the fourth and final video about getting data we will build a tool to scrape data from the PWHL website. We will also do some data cleaning directly in Power Query.

Link to the Excel file

Video 5 – Installing MySQL and Loading NHL Data into a Database

In the fifth video we learn how to install MySQL and load data into a database.

Link to the MySQL Installation Guide

Video 6 – Transforming and Cleaning Play-By-Play NHL Data in MySQL

In the sixth video we’re transforming and cleaning the play-by-play data in our database.

Link to updated Excel file

Link to MySQL script

Video 7 – Transforming and Cleaning Shift Data from the NHL API in MySQL

In the seventh video we’re transforming and cleaning the shift data in our database.

Video 8 – Combining Play-By-Play and Shift Data from the NHL in MySQL

In the eighth video we’re combining the play-by-play and shift data. This allows us to determine who is on the ice for every event.
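The idea behind the combination can be sketched in a few lines of Python (the video does this in MySQL). A player is on the ice for an event if the event time falls inside one of that player's shifts. The `Shift` class, the field names and the half-open interval convention are assumptions for illustration, not the video's actual schema.

```python
from dataclasses import dataclass


@dataclass
class Shift:
    player: str
    start: int  # seconds elapsed in the game when the shift starts
    end: int    # seconds elapsed in the game when the shift ends


def players_on_ice(shifts: list[Shift], event_time: int) -> list[str]:
    """Return the players whose shift covers event_time.

    Assumed convention: start is inclusive and end is exclusive, so a
    player who steps off at second 45 is not counted for an event at 45.
    """
    return sorted(s.player for s in shifts if s.start <= event_time < s.end)


shifts = [Shift("A", 0, 45), Shift("B", 0, 60), Shift("C", 45, 90)]
print(players_on_ice(shifts, 30))  # ['A', 'B']
print(players_on_ice(shifts, 45))  # ['B', 'C']
```

How to treat events that land exactly on a shift boundary is a design choice. A goal is often recorded at the same second the scoring shift ends, so real data may need different boundary rules for different event types.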

Video 9 – Prepare Shot Data for logistic regression modelling (xG) – MySQL

In the ninth video we do some final preparations before building the xG model and write the data from MySQL to a CSV file.
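To give a feel for this kind of preparation, here is a small Python sketch (the video itself works in MySQL). It derives two common xG features, shot distance and shot angle, and writes them as CSV text. It assumes coordinates are in feet and already flipped so the attacking net sits at x = 89, y = 0. The column names and sample shots are made up for illustration.

```python
import csv
import io
import math

GOAL_X, GOAL_Y = 89.0, 0.0  # assumed position of the attacking net, in feet


def shot_features(x: float, y: float) -> tuple[float, float]:
    """Return (distance in feet, angle in degrees) from the shot to the net."""
    dx = GOAL_X - x
    dy = abs(y - GOAL_Y)
    distance = math.hypot(dx, dy)
    angle = math.degrees(math.atan2(dy, dx))  # 0 = straight in front of the net
    return distance, angle


def shots_to_csv(shots: list[tuple[float, float, bool]]) -> str:
    """Turn (x, y, is_goal) rows into CSV text with derived features."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["distance", "angle", "is_goal"])
    for x, y, is_goal in shots:
        distance, angle = shot_features(x, y)
        writer.writerow([f"{distance:.1f}", f"{angle:.1f}", int(is_goal)])
    return buffer.getvalue()


print(shots_to_csv([(59, 0, True), (79, 10, False)]))
```

Writing to an in-memory buffer keeps the sketch self-contained. Passing an opened file to `csv.writer` instead produces a file on disk.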

Video 10 – Build an xG model in Python using scikit-learn

In the tenth video we’re building an xG model in Python using the scikit-learn library.
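For readers new to logistic regression, this is roughly what a fitted model does with a shot. It combines the features linearly and passes the result through the logistic (sigmoid) function to get a goal probability. The sketch below uses only the standard library, and the coefficients are invented for illustration. In practice they would be estimated from the shot data, for example by scikit-learn's `LogisticRegression`.

```python
import math


def xg(distance: float, angle: float,
       intercept: float = -1.0,
       b_distance: float = -0.05,
       b_angle: float = -0.02) -> float:
    """Goal probability from a logistic model.

    The default coefficients are hypothetical placeholders, not values
    from the video or from any fitted model.
    """
    z = intercept + b_distance * distance + b_angle * angle
    return 1.0 / (1.0 + math.exp(-z))


# With negative coefficients, closer and more central shots score higher.
print(round(xg(10, 0), 3), round(xg(50, 30), 3))
```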

Contact

If you have questions or comments you’re more than welcome to contact me.

email: hockeystatistics.com@gmail.com

Twitter: https://twitter.com/HockeySkytte

YouTube: https://www.youtube.com/@hockey-statistics268

Support

If you like my work you should consider following me on Twitter and subscribing to my YouTube channel.

You can also subscribe to my website for just $10 per year. There are a few tools and books only available to subscribers, but view it mostly as a way to support my continued work.
