Title: Reverse Clustering: Formulation, Interpretation and Case Studies Authors: Jan W. Owsinski, Jarosław Stanczak, Karol Opara Publisher: Springer Year: 2021 Pages: 177 Language: English Format: pdf (true), epub Size: 18.7 MB
This book presents a new perspective on, and a new approach to, a wide spectrum of data analysis situations, amounting to a kind of new paradigm. For a given data set and a partition of it, whose origins may be of any kind, the authors try to reconstruct that partition from the data set alone, using a very broadly conceived clustering procedure. The main advantages of the paradigm are substantive: they stem from the variety of interpretations that can be adopted within its framework for the particular cases considered. Owing to the novel problem formulation and the flexibility in interpreting the problem and its components, the domains encompassed (or at least affected) by the potential use of the paradigm include cluster analysis, classification, outlier detection, feature selection, and even factor analysis, as well as the geometry of the data set. The book is useful for all those looking for new, nonconventional approaches to their data analysis problems.
We are now witnessing explosive growth in the methods and techniques of data analysis. This growth is driven, on the one hand, by the rapidly expanding availability of data in virtually all domains of human activity and, on the other hand, by very substantial progress in the technical and scientific capabilities for dealing with increasing volumes of data. All this amounts to a dramatic change, especially in quantitative terms.
Yet, as researchers and practitioners working on the methodological side of data analysis know very well, many of the fundamental substantive problems in this domain still require solutions, or at least better solutions than those available now. This concerns, in particular, such fundamental areas as clustering, classification, and rule extraction. The primary issue here is the trade-off between precision or accuracy on the one hand and speed or computational cost on the other (once the problem at hand is truly well defined). Nor can one forget that the effectiveness and efficiency of many of the methodologies applied today depend very strongly on the data, which makes the situation even more difficult.
The present book addresses this nexus of issues. It is aimed, apparently, at the interface of clustering and classification, but is in fact relevant to a much broader domain, with much broader implications in terms of applicability and interpretation. Namely, it describes the paradigm of “reverse clustering”, introduced by the present authors. The paradigm concerns the situation in which we are given a certain data set, composed of entities, observations, objects…, as is usual in data analysis, and, at the same time, we are given, or we consider, a certain partition of this data set. We assume nothing a priori about the data set, about the partition, or, most importantly, about the relation between the two. Thus, the partition may be the result of a definite kind of analysis of the given data set, but it may equally result from quite a different mechanism (e.g. a division of the set of objects according to some variable or criterion not contained in the data set at hand).
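To make the idea of reconstructing a given partition concrete, here is a minimal Python sketch. It is not the authors' procedure: the text above does not specify which clustering algorithms, parameters, or similarity measure the book uses. The sketch assumes, purely for illustration, a plain k-means with per-feature weights, the Rand index as the agreement measure, and an exhaustive grid search. All function names, the toy data, and the parameter grid are hypothetical.

```python
from itertools import combinations, product


def rand_index(a, b):
    """Fraction of object pairs on which two partitions agree
    (both put the pair together, or both keep it apart)."""
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)


def kmeans(points, k, weights, iters=20):
    """Plain k-means with a weighted squared Euclidean distance.
    Assumption: the first k points serve as seeds, so runs are deterministic."""
    def dist(p, q):
        return sum(w * (x - y) ** 2 for w, x, y in zip(weights, p, q))

    centers = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        labels = [min(range(k), key=lambda c: dist(p, centers[c])) for p in points]
        # Update step: each non-empty cluster's center becomes its mean.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def reverse_clustering(points, given_partition, ks, weight_grid):
    """Search the (hypothetical) parameter grid for the clustering that
    best reproduces the given partition; returns (score, k, weights, labels)."""
    best = None
    for k, weights in product(ks, weight_grid):
        labels = kmeans(points, k, weights)
        score = rand_index(labels, given_partition)
        if best is None or score > best[0]:
            best = (score, k, weights, labels)
    return best


# Toy data (made up): the given partition follows the first feature only,
# while the second feature varies widely within each group.
points = [(0.0, 0.0), (0.2, 5.0), (0.1, 10.0), (3.0, 0.5), (3.2, 5.5), (3.1, 9.5)]
given = [0, 0, 0, 1, 1, 1]

best = reverse_clustering(
    points,
    given,
    ks=[2, 3],
    weight_grid=[(1.0, 1.0), (1.0, 0.0), (0.0, 1.0)],
)
print(best)
```

On this toy data the search settles on two clusters with the second feature weighted to zero. That illustrates one of the interpretations mentioned above: the parameters that best reproduce a partition can indicate which features the partition actually depends on. A low best score would instead suggest that the partition comes from a mechanism not reflected in the data set at hand.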