|
Categorical Data in Detail
What is categorical data?
Categorical data, also known as qualitative data, is a type of data that uses characters or symbols to indicate which category something belongs to.
The difference between categorical data and numerical data
Characteristic      Categorical data        Numerical data
Display format      characters, symbols     numbers
Measurement scale   nominal, ordinal        interval, ratio
Example             gender, color, brand    body weight, income
Types of categorical data
Nominal Data:
It only shows categories, with no order among them.
For example: gender.
Ordinal Data:
In addition to showing the category, it also has an order relationship.
For example: education level.
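The difference between the two types can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the color values, the education levels, and the names `EDUCATION_ORDER`, `rank`, and `at_least` are all assumptions made up for the example.

```python
# Nominal categories: only equality and membership are meaningful.
colors = ["red", "blue", "red", "green"]
distinct_colors = sorted(set(colors))  # sorted only to get a stable display order

# Ordinal categories: an explicit ranking gives them an order.
# The levels below are hypothetical and chosen for illustration.
EDUCATION_ORDER = ["high school", "bachelor", "master", "doctorate"]
rank = {level: i for i, level in enumerate(EDUCATION_ORDER)}


def at_least(level, threshold):
    """Return True if `level` ranks at or above `threshold`."""
    return rank[level] >= rank[threshold]


people = ["master", "high school", "doctorate", "bachelor"]
# Sorting by rank is meaningful for ordinal data, but not for nominal data.
sorted_people = sorted(people, key=rank.__getitem__)
```

Comparing two colors with `<` would run in Python (strings compare alphabetically), but the result would carry no meaning, which is why the ordinal case needs its own explicit ranking.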
Processing categorical data
Encoding:
Numeric encoding: Each category is mapped to a unique number.
One-hot encoding: Each category is converted into a binary vector in which one element is 1 and the rest are 0.
Label encoding: The categories are mapped to integers in order.
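The encodings listed above can be sketched with the standard library alone. The helper names `label_encode` and `one_hot_encode` are hypothetical, and assigning integers in alphabetical order of the category names is an assumption made for the example; real projects often use a library encoder instead.

```python
def label_encode(values):
    """Map each distinct category to an integer (alphabetical order, by assumption)."""
    categories = sorted(set(values))
    mapping = {cat: i for i, cat in enumerate(categories)}
    return [mapping[v] for v in values], mapping


def one_hot_encode(values):
    """Turn each value into a binary vector with a single 1 at its category's position."""
    categories = sorted(set(values))
    index = {cat: i for i, cat in enumerate(categories)}
    vectors = []
    for v in values:
        vec = [0] * len(categories)
        vec[index[v]] = 1
        vectors.append(vec)
    return vectors, categories


colors = ["red", "green", "blue", "green"]
codes, mapping = label_encode(colors)
vectors, columns = one_hot_encode(colors)
```

Integer codes imply an order that nominal categories do not have, so one-hot vectors are usually the safer choice for nominal data, at the cost of one column per category.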
Visualization:
Charts such as bar charts are used to display the frequency of different categories.
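Any frequency display starts from a count per category. The sketch below counts categories with `collections.Counter` and renders a plain-text bar chart. The brand values and the helper name `text_bar_chart` are assumptions for illustration; a plotting library would normally replace the text rendering.

```python
from collections import Counter

# Hypothetical sample of a categorical column.
brands = ["A", "B", "A", "C", "A", "B"]
counts = Counter(brands)


def text_bar_chart(counts):
    """Render one line per category, most frequent first."""
    lines = []
    for category, n in counts.most_common():
        lines.append(f"{category:<5} {'#' * n} ({n})")
    return "\n".join(lines)


chart = text_bar_chart(counts)
print(chart)
```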
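Algorithms: Tree-based algorithms such as decision trees and random forests can, in some implementations, process categorical data directly. Support vector machines and most other algorithms require numeric input.
Numerical conversion: For algorithms that require numeric input, convert the categorical data to numerical values first.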
Application of categorical data in machine learning
Classification problems: Categorical data serves directly as feature input to the classification model. It can also be used as a basis for …
Association rule mining: Find the associations between different categories.
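Association rule mining is usually described in terms of support and confidence. The sketch below computes both for a single candidate rule. The transactions and the function names `support` and `confidence` are hypothetical, and a real miner (for example, the Apriori algorithm) would search over many itemsets rather than evaluate one rule.

```python
# Hypothetical transactions: each one is a set of purchased categories.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimate P(consequent | antecedent) from the transactions."""
    both = support(set(antecedent) | set(consequent), transactions)
    return both / support(antecedent, transactions)


s = support({"bread", "milk"}, transactions)       # 2 of 4 transactions
c = confidence({"bread"}, {"milk"}, transactions)  # 2 of the 3 bread transactions
```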
Example: customer data analysis
Suppose we have a customer dataset that includes gender, age, education level, purchased goods, and other information.
Categorical data: gender, education level, purchased goods.
Numerical data: age.
By analyzing this data, we can understand the purchasing preferences of customers of different genders and educational backgrounds, which provides a basis for decisions about product recommendations and marketing.
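The analysis described above amounts to cross-tabulating purchases against a categorical field. The sketch below does this with the standard library. The customer records, the field names, and the helper `purchases_by` are all invented for illustration and do not come from any real dataset.

```python
from collections import Counter, defaultdict

# Hypothetical customer records; every value is made up for the example.
customers = [
    {"gender": "F", "age": 34, "education": "bachelor", "purchase": "book"},
    {"gender": "F", "age": 28, "education": "master", "purchase": "book"},
    {"gender": "F", "age": 45, "education": "high school", "purchase": "phone"},
    {"gender": "M", "age": 23, "education": "bachelor", "purchase": "phone"},
    {"gender": "M", "age": 51, "education": "master", "purchase": "phone"},
    {"gender": "M", "age": 39, "education": "bachelor", "purchase": "book"},
]


def purchases_by(field, records):
    """Count purchased goods for each value of a categorical field."""
    table = defaultdict(Counter)
    for r in records:
        table[r[field]][r["purchase"]] += 1
    return table


by_gender = purchases_by("gender", customers)
by_education = purchases_by("education", customers)

# Most frequently purchased item per gender in this toy sample.
top_by_gender = {g: c.most_common(1)[0][0] for g, c in by_gender.items()}
```

With only six records the counts say nothing about real customers; the point is the shape of the computation, which stays the same on a full dataset.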
Summary
Categorical data is a common type of data in data analysis. Handling and analyzing it correctly is essential for extracting valuable information from data.
Topics for further study include, for instance:
Application of categorical data in different machine learning algorithms
Preprocessing methods for categorical data
Combining categorical data with other types of data
|
|