Normalization is used to eliminate redundant data and to ensure that good-quality clusters are generated, which can improve the efficiency of clustering algorithms. It is therefore an essential step before clustering, because Euclidean distance is very sensitive to differences in the scale of the variables [3].
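A minimal sketch of that sensitivity, using made-up values (the people, features, and numbers are assumptions for illustration, not taken from the source). Changing only the unit of one feature changes which point counts as the nearest neighbor:

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length numeric sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical people: (height, weight in kg). Heights are in meters here.
a = (1.60, 60.0)
b = (1.95, 61.0)   # much taller, nearly the same weight
c = (1.61, 66.0)   # nearly the same height, somewhat heavier

# With height in meters, weight dominates the distance: b is closest to a.
d_ab_m = euclidean(a, b)
d_ac_m = euclidean(a, c)

def to_cm(person):
    """Return the same person with height converted from meters to centimeters."""
    return (person[0] * 100.0, person[1])

# Same people, height in centimeters: height now dominates and c is closest.
d_ab_cm = euclidean(to_cm(a), to_cm(b))
d_ac_cm = euclidean(to_cm(a), to_cm(c))

print(d_ab_m < d_ac_m, d_ac_cm < d_ab_cm)
```

Nothing about the people changed between the two comparisons, only the unit, which is why features are usually brought onto a comparable scale before distances are computed.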
Do we need to normalize data for K-means clustering?
As in the k-NN method, the features used for clustering must be measured in comparable units. In this case, units are not an issue because all 6 features are expressed on a 5-point scale, so neither normalization nor standardization is necessary.
How do you prepare data before clustering?
Data Preparation
To perform a cluster analysis in R, the data should generally be prepared as follows: rows are observations (individuals) and columns are variables. Any missing value in the data must be removed or estimated. The data must be standardized (i.e., scaled) to make the variables comparable.
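The preparation steps above can be sketched as follows. This is an illustration in plain Python rather than R, under stated assumptions: the data, the column meanings, and the helper names `drop_incomplete` and `standardize` are hypothetical, missing values are represented as `None`, rows with a missing value are removed rather than estimated, and no column is constant (a constant column would have zero standard deviation).

```python
import math

def drop_incomplete(rows):
    """Keep only rows (observations) that contain no missing (None) value."""
    return [row for row in rows if all(v is not None for v in row)]

def standardize(rows):
    """Z-score each column: subtract the column mean and divide by the
    sample standard deviation (n - 1 denominator)."""
    columns = list(zip(*rows))
    means = [sum(col) / len(col) for col in columns]
    sds = [
        math.sqrt(sum((v - m) ** 2 for v in col) / (len(col) - 1))
        for col, m in zip(columns, means)
    ]
    return [
        [(v - m) / s for v, m, s in zip(row, means, sds)]
        for row in rows
    ]

# Hypothetical data: rows are individuals, columns are (height_cm, weight_kg).
raw = [
    [170.0, 65.0],
    [180.0, None],   # missing value: this row is dropped
    [160.0, 55.0],
    [175.0, 80.0],
]

clean = drop_incomplete(raw)
scaled = standardize(clean)

for row in scaled:
    print([round(v, 3) for v in row])
```

After this step every column has mean 0 and sample standard deviation 1, so no single variable dominates a distance computation purely because of its units.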
Should data be scaled for clustering?
In clustering, you calculate the similarity between two examples by combining all the feature data for those examples into a single numeric value. Combining feature data requires that the features be on the same scale.
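One way to picture this is the sketch below. It assumes min-max scaling to [0, 1] and a similarity defined as 1 / (1 + Euclidean distance); both choices, the helper names, and the house data are assumptions for illustration, since the source does not name a specific scaling method or similarity measure.

```python
import math

def min_max_scale(rows):
    """Rescale every column to [0, 1] using that column's observed min and max."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [(v - lo) / (hi - lo) for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]

def similarity(a, b):
    """Combine all features into one number in (0, 1]; 1 means identical."""
    distance = math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return 1.0 / (1.0 + distance)

# Hypothetical examples: (price in dollars, number of rooms).
houses = [
    [200_000.0, 2.0],
    [205_000.0, 6.0],   # similar price, very different size
    [400_000.0, 2.0],   # very different price, same size
]

scaled = min_max_scale(houses)
sim_01 = similarity(scaled[0], scaled[1])
sim_02 = similarity(scaled[0], scaled[2])

print(round(sim_01, 4), round(sim_02, 4))
```

On the scaled data a full-range difference in rooms counts about as much as a full-range difference in price. On the raw values the price column alone would decide the similarity.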
Why is it important to normalize features before clustering?
Standardization is an important step in data preprocessing.
As explained in this paper, k-means minimizes its error function using the Newton algorithm, i.e. a gradient-based optimization algorithm. Normalizing the data improves the convergence of such algorithms.
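To make the "error function" concrete, here is a minimal sketch of k-means written as plain Lloyd iterations (alternating assignment and mean-update steps), not the Newton formulation the paper refers to. The points, the starting centers, and the function names are assumptions chosen for illustration. It records the sum of squared distances to the assigned center after each pass, which is the quantity k-means drives down.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, centers, iterations=10):
    """Run Lloyd iterations; return the final centers and the error of each pass."""
    errors = []
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest center.
        labels = [
            min(range(len(centers)), key=lambda k: sq_dist(p, centers[k]))
            for p in points
        ]
        # Error function: sum of squared distances to the assigned center.
        errors.append(sum(sq_dist(p, centers[l]) for p, l in zip(points, labels)))
        # Update step: move each center to the mean of its assigned points.
        new_centers = []
        for k, old in enumerate(centers):
            members = [p for p, l in zip(points, labels) if l == k]
            if members:
                new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centers.append(old)  # an empty cluster keeps its center
        centers = new_centers
    return centers, errors

# Hypothetical 2-D data with two well-separated groups and poor starting centers.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (5.0, 5.0), (5.0, 6.0), (6.0, 5.0)]
centers, errors = kmeans(points, [(0.0, 0.0), (1.0, 0.0)])

print([round(e, 4) for e in errors])
```

The recorded error never increases from one pass to the next. How many passes are needed depends on the geometry of the data, which is where feature scaling enters.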