Wheat Seed Characteristics

Multivariate Analysis | Factor Analysis | PCA | K-Means Clustering

Project Overview

This academic project applies multivariate statistical techniques (factor analysis, PCA, and clustering) to physical measurements of wheat seed kernels, with the goal of identifying the seed variety best suited for cultivation.

Dataset

  • Source: Kaggle (Public Dataset)
  • Seed Types: Kama, Rosa, Canadian
  • Attributes: Area, Perimeter, Compactness, Kernel Length, Kernel Width, Asymmetry Coefficient, Kernel Groove Length
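As a minimal sketch of how one record could be represented, the Python snippet below models a single kernel and derives compactness from area and perimeter. It assumes the Kaggle copy follows the UCI Seeds definition, C = 4πA / P², so that a perfectly circular outline has C = 1. The class name, field names, and sample values are hypothetical and chosen only for illustration. The project itself reports using R and SPSS.

```python
import math
from dataclasses import dataclass


@dataclass
class Kernel:
    """One wheat kernel record (field names are illustrative, not the dataset's)."""
    area: float
    perimeter: float
    length: float
    width: float
    asymmetry: float
    groove_length: float
    variety: str

    @property
    def compactness(self) -> float:
        # Assumed definition: C = 4 * pi * A / P^2 (equals 1.0 for a circle).
        return 4 * math.pi * self.area / self.perimeter ** 2


# Illustrative values only, not quoted from the dataset.
sample = Kernel(area=15.0, perimeter=14.5, length=5.7, width=3.3,
                asymmetry=2.2, groove_length=5.2, variety="Kama")
print(round(sample.compactness, 3))
```

Because compactness is a function of area and perimeter, it is expected to be correlated with both, which is worth keeping in mind when reading PCA loadings.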

Tools & Techniques

  • R & SPSS
  • Exploratory Data Analysis (EDA)
  • Principal Component Analysis (PCA)
  • Factor Analysis
  • K-Means & Hierarchical Clustering
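The PCA step listed above can be sketched as follows. This is not the project's R or SPSS code. It is a small Python/NumPy illustration that standardizes the features, eigendecomposes the resulting correlation matrix, and reports the share of variance per component. The function name `pca_explained_variance` and the synthetic data are assumptions made for the example, and no real seed measurements are used.

```python
import numpy as np


def pca_explained_variance(X):
    """PCA on standardized features via the correlation matrix."""
    X = np.asarray(X, dtype=float)
    # Standardize so features on different scales contribute equally.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # The covariance of standardized data is the correlation matrix.
    corr = np.cov(Z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)
    # eigh returns ascending eigenvalues; reorder to descending.
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    ratio = eigvals / eigvals.sum()   # share of variance per component
    scores = Z @ eigvecs              # observations projected onto components
    return ratio, eigvecs, scores


# Synthetic stand-in data: two strongly related columns plus one independent column.
rng = np.random.default_rng(0)
size = rng.normal(size=200)
X = np.column_stack([
    size + 0.1 * rng.normal(size=200),      # plays the role of "area"
    2 * size + 0.1 * rng.normal(size=200),  # plays the role of "perimeter"
    rng.normal(size=200),                   # an unrelated feature
])
ratio, loadings, scores = pca_explained_variance(X)
print("cumulative explained variance:", ratio.cumsum())
```

The cumulative sum of `ratio` is the quantity usually read off when deciding how many components to keep.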

Key Observations

  • Area and Perimeter showed a strong positive correlation
  • The first two principal components explained 86% of the total variance
  • K-Means clustering with k = 3, matching the three seed types, gave the clearest separation
  • Canadian wheat seeds consistently formed the strongest cluster
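To illustrate how a k = 3 clustering can be checked against known group labels, here is a minimal Python/NumPy sketch of Lloyd's k-means with a purity score. It is an assumption-laden example, not the project's analysis: the data are synthetic, well-separated blobs, the initialization is a deterministic farthest-first heuristic (R's `kmeans` uses random starts instead), and the helper names `farthest_first_init`, `kmeans`, and `purity` are hypothetical.

```python
import numpy as np


def farthest_first_init(X, k):
    """Pick the first point, then repeatedly the point farthest from chosen centers."""
    centers = [X[0]]
    for _ in range(k - 1):
        d = np.min([np.linalg.norm(X - c, axis=1) for c in centers], axis=0)
        centers.append(X[d.argmax()])
    return np.array(centers)


def kmeans(X, k, n_iter=100):
    """Lloyd's algorithm: alternate nearest-center assignment and mean update."""
    centers = farthest_first_init(X, k)
    for _ in range(n_iter):
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break  # assignments are stable
        centers = new_centers
    return labels, centers


def purity(labels, truth, k):
    """Fraction of points that belong to the majority true group of their cluster."""
    hits = sum(np.bincount(truth[labels == j]).max()
               for j in range(k) if np.any(labels == j))
    return hits / len(truth)


# Synthetic data: three well-separated groups of 50 points (not real seed data).
rng = np.random.default_rng(42)
group_means = np.array([[0.0, 0.0], [20.0, 0.0], [0.0, 20.0]])
truth = np.repeat(np.arange(3), 50)
X = group_means[truth] + rng.normal(size=(150, 2))

labels, centers = kmeans(X, k=3)
print("purity:", purity(labels, truth, 3))
```

On real measurements the features would typically be standardized first, and per-variety purity (rather than a single overall score) shows which variety forms the cleanest cluster.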

Recommendation

Based on the dimensionality reduction and clustering results, the Canadian wheat seed variety is recommended for cultivation because of its superior physical structure and its consistent clustering across models.