Linking Machine Learning Dataset Structures with Theoretical Algebra
DOI:
https://doi.org/10.60787/gjmsti.vol1no1.48
Keywords:
Dataset structure, dimensionality reduction, machine learning, matrices, vectors
Abstract
Adequate knowledge of dataset structure helps researchers produce publishable scholarly work. Often, a poor understanding and interpretation of dataset concepts has hindered quality scholarly research in machine learning modeling. In this light, the study considered the theoretical structure of datasets by first presenting the algebraic theories of scalars, vectors, and matrices, along with their corresponding notations, and then linking these theories to datasets. Findings revealed that (a) a set of scalar objects cannot be linked to a dataset; (b) a set of vectors is linked to a simple dataset of one predictor variable and a response variable; and (c) a matrix is linked to a multidimensional dataset with more than one predictor variable and a response variable, whose data points can be visualized in a multidimensional space. The study also explored the dimensionality of a dataset, highlighted the curse of dimensionality and methods for resolving it, and offered guidance for academics who use datasets in research.
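Findings (b) and (c) can be illustrated with a minimal sketch. It assumes a plain-Python representation in which a vector is a list and a matrix is a list of rows. The values, the variable names (`x`, `y`, `X`, `simple_dataset`), and the helper `shape` are hypothetical and chosen only for illustration; they are not taken from the study.

```python
# Hypothetical toy data: values and names are illustrative assumptions.

# Finding (b): a simple dataset pairs one predictor vector with one response vector.
x = [1.0, 2.0, 3.0, 4.0]          # predictor variable
y = [2.1, 3.9, 6.2, 8.1]          # response variable
simple_dataset = list(zip(x, y))  # each entry is one data point (x_i, y_i)

# Finding (c): a multidimensional dataset stores several predictors as a matrix,
# with one row per data point and one column per predictor variable.
X = [
    [1.0, 0.5, 3.0],
    [2.0, 1.5, 1.0],
    [3.0, 2.5, 4.0],
    [4.0, 3.5, 2.0],
]


def shape(matrix):
    """Return (rows, columns) of a rectangular list-of-lists matrix."""
    return len(matrix), len(matrix[0])


n_samples, n_predictors = shape(X)
print(len(simple_dataset), n_samples, n_predictors)
```

In this sketch the number of columns of `X` plays the role of the dataset's dimensionality, which is the quantity that dimensionality reduction methods aim to shrink.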