Code-Ebullient/agape_datathon

FIFA Women’s World Cup - WiDA Datathon

─ Prepared by Team Agape on 24.03.2023

Members: Peace Oluchi Onyekachi | Owolola Rakayat

Overview / Introduction

Exploring a dataset is rewarding because it shows what the data has to say: what happened, how it unfolded, and what can be learned from it. The FIFA Women's World Cup Stats dataset from Kaggle contains detailed information on the players, matches, and results of every Women's World Cup tournament from 1991 to 2019.

By analyzing this dataset, we gained valuable insights into the evolution of women's soccer over the years, the players' performances, and their countries, all of which play a part in how they succeed in this highly competitive sport.

Goals

  1. Thoroughly explore and analyze the dataset (EDA) to identify the top performers, their goal tallies, and their countries.
  2. Demonstrate the workflow of exploratory data analysis and the major part it plays in understanding every aspect of life.

Specifications

The dataset was obtained from Kaggle: https://www.kaggle.com/datasets/mattop/fifa-womens-world-cup-stats. It contains 136 rows and 21 columns, with many insights into this highly competitive game.

Dataset Exploration

This is the first step in analyzing any dataset. We first worked to understand the dataset, then prepared and cleaned it for easy use in our analysis.

Data Cleaning and Transformation

  1. Data Preparation:

To understand the dataset, we used functions such as shape, head, describe, info, and columns to identify the core features and how they relate to one another. To make sure the data was ready for use, we cleaned it by removing duplicates, dropping irrelevant rows and columns, and renaming the columns for easy identification.
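The preparation steps above can be sketched with pandas roughly as follows. This is a minimal illustration, not the team's actual notebook: the small inline DataFrame, its column names (`Unnamed: 0`, `Player`, `Squad`, `Gls`), and the renaming map are all hypothetical stand-ins for the Kaggle file.

```python
import pandas as pd

# Hypothetical stand-in for the Kaggle CSV; the real column names may differ.
raw = pd.DataFrame(
    {
        "Unnamed: 0": [0, 1, 2, 3],
        "Player": ["A", "B", "C", "C"],
        "Squad": ["X", "Y", "X", "X"],
        "Gls": [10, 7, 5, 5],
    }
)

# First look at the data: size, column names, first rows, summary statistics.
print(raw.shape)
print(raw.columns.tolist())
print(raw.head())
print(raw.describe())
raw.info()

# Clean up: drop an irrelevant column, remove duplicate rows,
# then rename the remaining columns for easier identification.
clean = (
    raw.drop(columns=["Unnamed: 0"])
    .drop_duplicates()
    .rename(columns={"Player": "player", "Squad": "country", "Gls": "goals"})
    .reset_index(drop=True)
)
print(clean)
```

Dropping the index-like column before calling `drop_duplicates` matters in this sketch, because otherwise the two identical player rows would still differ in that column.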

  2. Understanding The Features (univariate):

Univariate analysis simply means analyzing each feature on its own to understand its aggregates. We plotted the distribution of each series using bar, histogram, area, density, and box plots, among others. This let us see the value counts of each series, its unique values, and how its distribution is shaped and can best be presented.
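A univariate pass of this kind might look like the sketch below. The sample rows and the column names `country` and `goals` are assumptions made for illustration and are not taken from the dataset. The binning step only mimics what a histogram would show, without drawing a chart.

```python
import pandas as pd

# Hypothetical sample; column names and values are illustrative only.
df = pd.DataFrame(
    {
        "country": ["X", "Y", "X", "Z", "X", "Y"],
        "goals": [10, 7, 5, 5, 3, 7],
    }
)

# Categorical feature: how often each value occurs, and which values exist.
country_counts = df["country"].value_counts()
unique_countries = df["country"].unique()

# Numeric feature: summary statistics, plus histogram-style bin counts.
goal_summary = df["goals"].describe()
goal_bins = pd.cut(df["goals"], bins=[0, 4, 8, 12]).value_counts().sort_index()

print(country_counts)
print(unique_countries)
print(goal_summary)
print(goal_bins)
```

With matplotlib installed, `df["goals"].plot(kind="hist")` or `df["goals"].plot(kind="box")` would draw the corresponding charts from the same series.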

Figure: Total number of goals series analysis

  3. Feature Relationships:

Comparing the features side by side with methods such as scatter plots, heatmaps, and pair plots showed us that two or more variables can be far more closely connected than anticipated. Visualizing the distributions of the data helped us understand its behaviours and patterns. It also revealed the relationships among the independent variables, among the dependent variables, and between the independent and dependent ones. From this we were able to establish a perfect positive relationship among the features.
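As a rough sketch of how such relationships can be quantified, the example below computes a Pearson correlation matrix, which is the table of numbers a heatmap would colour. The columns `matches_played`, `minutes`, and `goals` and their values are hypothetical. `minutes` is deliberately built as an exact multiple of `matches_played` to show what a perfect positive relationship looks like, and it does not claim which features showed that relationship in the real data.

```python
import pandas as pd

# Hypothetical numeric features; the values are made up for illustration.
df = pd.DataFrame(
    {
        "matches_played": [3, 4, 5, 6, 7],
        "minutes": [270, 360, 450, 540, 630],  # exactly 90 * matches_played
        "goals": [1, 0, 3, 2, 5],
    }
)

# Pearson correlation matrix between every pair of numeric columns.
corr = df.corr()
print(corr.round(2))

# A coefficient of 1 indicates a perfect positive linear relationship.
perfect_pair = corr.loc["matches_played", "minutes"]
partial_pair = corr.loc["matches_played", "goals"]
print(perfect_pair, partial_pair)
```

A coefficient strictly between 0 and 1, as for `matches_played` and `goals` here, indicates a positive but imperfect relationship.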

Conclusion & Recommendation

What did this analysis solve? What is the most important takeaway from the whole exploration process? The analysis shed light on several areas:

  1. The most important point is that the game is all about goals and winning.
  2. The starting time of players is very important.
  3. Goals scored outside of penalties seem special.
  4. Major top players tend to attract yellow cards.
  5. Some countries have been producing many top players for years.
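Findings like these typically come from ranking and grouping the player rows. The sketch below shows one way to do that in pandas. Every name and number in it (`player`, `country`, `goals`, `penalty_goals`, `yellow_cards`) is a hypothetical placeholder, so it illustrates the technique and does not reproduce the team's results.

```python
import pandas as pd

# Hypothetical player-level rows; names, columns, and numbers are illustrative.
df = pd.DataFrame(
    {
        "player": ["A", "B", "C", "D", "E"],
        "country": ["X", "Y", "X", "Z", "Y"],
        "goals": [10, 7, 5, 5, 3],
        "penalty_goals": [2, 0, 1, 0, 1],
        "yellow_cards": [3, 1, 0, 2, 0],
    }
)

# Goals scored outside of penalties.
df["non_penalty_goals"] = df["goals"] - df["penalty_goals"]

# Top three scorers overall (ties keep the earlier row).
top_scorers = df.nlargest(3, "goals")[["player", "country", "goals"]]

# Goals and yellow cards aggregated per country, highest-scoring first.
by_country = (
    df.groupby("country")[["goals", "yellow_cards"]]
    .sum()
    .sort_values("goals", ascending=False)
)

print(top_scorers)
print(by_country)
```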

Submission Agape - Datathon

The drive that contains all the necessary documents: Agape - Datathon.
