Title: Statistical Models and Methods for Data Science Authors: Leonardo Grilli, Monia Lupparelli, Carla Rampichini Publisher: Springer Year: 2023 Pages: 186 Language: English Format: pdf (true), epub Size: 18.0 MB
This book focuses on methods and models in classification and data analysis and presents real-world applications at the interface with Data Science. Numerous topics are covered, ranging from statistical inference and modelling to clustering and factorial methods, and from directional data analysis to time series analysis and small area estimation. The applications deal with new developments in a variety of fields, including medicine, finance, engineering, marketing, and cyber risk.
Analyzing categorical data in Machine Learning generally requires a coding strategy. This problem is common to multivariate statistical techniques, and several approaches have been suggested in the literature. This article proposes a method for analyzing categorical variables with neural networks. Both supervised and unsupervised approaches are considered, in which the variables can have high cardinality. Applications to simulated data illustrate the usefulness of the proposal.
Most Machine Learning algorithms cannot be applied directly to categorical data, which are generally non-numeric. Applying them therefore requires some form of encoding that transforms the categorical features into one or more numeric variables. This can pose a serious difficulty when the variables have many categories, a common situation in big data, which generally include variables with mixed measurement levels.
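To make the encoding step concrete, here is a minimal sketch of one-hot (indicator) coding, one common way to turn a categorical feature into numeric variables. It is not the neural-network method proposed in the book; the function name `one_hot_encode` and the sample data are hypothetical and chosen only for illustration.

```python
from __future__ import annotations

from typing import Hashable, Sequence


def one_hot_encode(values: Sequence[Hashable]) -> tuple[list[Hashable], list[list[int]]]:
    """Return the category list and one 0/1 indicator row per observation.

    Hypothetical helper for illustration; not taken from the book.
    """
    # Order categories by their string form so the column order is reproducible.
    categories = sorted(set(values), key=str)
    column_of = {category: position for position, category in enumerate(categories)}

    rows = []
    for value in values:
        row = [0] * len(categories)  # one column per distinct category
        row[column_of[value]] = 1    # flag the observed category
        rows.append(row)
    return categories, rows


# A low-cardinality feature: three categories give three numeric columns.
colors = ["red", "green", "blue", "green"]
categories, encoded = one_hot_encode(colors)

# A high-cardinality feature: the number of columns grows with the number
# of distinct categories, which is the difficulty described above.
codes = [f"Z{i:05d}" for i in range(1000)]
_, wide = one_hot_encode(codes)
n_columns = len(wide[0])
```

With 1000 distinct codes the encoded feature already occupies 1000 columns, which shows why variables with many categories call for more compact coding strategies.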