Main contents of sklearn
Three core sklearn components: Estimator, Transformer, Pipeline
Six main sklearn modules: Classification, Regression, Clustering, Dimensionality reduction, Model selection, Preprocessing
Data preprocessing:
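The three components can be sketched on the iris dataset. This is a minimal illustration, not a recommended setup: the dataset, the step names "scale" and "clf", and the `max_iter` value are assumptions chosen for the example.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)

# Transformer: learns statistics in fit() and applies them in transform().
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# Estimator: learns a model in fit() and predicts with predict().
clf = LogisticRegression(max_iter=1000)
clf.fit(X_scaled, y)

# Pipeline: chains transformers and a final estimator behind one fit/predict API.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1000)),
])
pipe.fit(X, y)
pipe_pred = pipe.predict(X)
pipe_accuracy = pipe.score(X, y)
```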
MinMaxScaler
Normalizer
StandardScaler
LabelEncoder
OneHotEncoder
Binarizer
MultiLabelBinarizer
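A short sketch of what each of the preprocessing classes above does, on tiny made-up inputs (the arrays, labels and the threshold value are illustrative assumptions):

```python
import numpy as np
from sklearn.preprocessing import (
    Binarizer,
    LabelEncoder,
    MinMaxScaler,
    MultiLabelBinarizer,
    Normalizer,
    OneHotEncoder,
    StandardScaler,
)

X = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])

X_minmax = MinMaxScaler().fit_transform(X)          # each column rescaled to [0, 1]
X_std = StandardScaler().fit_transform(X)           # each column: zero mean, unit variance
X_norm = Normalizer().fit_transform(X)              # each row scaled to unit L2 norm
X_bin = Binarizer(threshold=2.0).fit_transform(X)   # 1 where value > threshold, else 0

labels = ["cat", "dog", "cat"]
y_int = LabelEncoder().fit_transform(labels)        # classes are sorted: cat -> 0, dog -> 1
onehot = OneHotEncoder().fit_transform(
    np.array(labels).reshape(-1, 1)
).toarray()                                         # one indicator column per category

# One indicator column per label; each sample may carry several labels or none.
multi = MultiLabelBinarizer().fit_transform([{"a", "b"}, {"b"}, set()])
```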
Feature extraction:
DictVectorizer
FeatureHasher
CountVectorizer
TfidfVectorizer
HashingVectorizer
Feature selection:
VarianceThreshold
SelectKBest
SelectPercentile
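The three selectors above can be compared on iris. This is a sketch under assumptions: the appended constant column, the `f_classif` score function, `k=2` and `percentile=50` are all choices made for the illustration.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.feature_selection import (
    SelectKBest,
    SelectPercentile,
    VarianceThreshold,
    f_classif,
)

X, y = load_iris(return_X_y=True)

# Append a constant column so VarianceThreshold has something to remove.
X_aug = np.hstack([X, np.ones((X.shape[0], 1))])
X_var = VarianceThreshold(threshold=0.0).fit_transform(X_aug)

# Keep the 2 features with the highest ANOVA F-score.
X_k = SelectKBest(score_func=f_classif, k=2).fit_transform(X, y)

# Keep the top 50% of features by the same score.
X_pct = SelectPercentile(score_func=f_classif, percentile=50).fit_transform(X, y)
```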
Clustering:
DBSCAN
KMeans
MiniBatchKMeans
SpectralClustering
MeanShift
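A minimal sketch of two of the clusterers above on synthetic data. The two well-separated blobs and the `eps` / `min_samples` values are assumptions picked so that both algorithms recover the same grouping:

```python
import numpy as np
from sklearn.cluster import DBSCAN, KMeans

rng = np.random.default_rng(0)
blob_a = rng.normal(loc=(0.0, 0.0), scale=0.1, size=(50, 2))
blob_b = rng.normal(loc=(5.0, 5.0), scale=0.1, size=(50, 2))
X = np.vstack([blob_a, blob_b])

# KMeans needs the number of clusters up front.
kmeans_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# DBSCAN infers the number of clusters from density; label -1 would mark noise.
dbscan_labels = DBSCAN(eps=0.5, min_samples=5).fit_predict(X)
```

MiniBatchKMeans, SpectralClustering and MeanShift expose the same `fit_predict` method.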
Matrix decomposition:
FactorAnalysis
FastICA
IncrementalPCA
KernelPCA
LatentDirichletAllocation
NMF
SparsePCA
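Most of the decomposition classes above follow the transformer interface: `fit_transform` returns the low-dimensional representation. A sketch on a random non-negative matrix (the data, component counts and batch size are illustrative assumptions):

```python
import numpy as np
from sklearn.decomposition import NMF, IncrementalPCA, KernelPCA

rng = np.random.default_rng(0)
X = rng.random((100, 10))  # non-negative entries, as NMF requires

# PCA fitted in mini-batches, useful when the data does not fit in memory.
X_ipca = IncrementalPCA(n_components=3, batch_size=25).fit_transform(X)

# Non-negative factorisation X ~ W @ H with W >= 0 and H >= 0.
nmf = NMF(n_components=3, init="nndsvda", max_iter=500, random_state=0)
W = nmf.fit_transform(X)
H = nmf.components_

# PCA in the feature space induced by an RBF kernel.
X_kpca = KernelPCA(n_components=3, kernel="rbf").fit_transform(X)
```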
Ensemble learning:
BaggingClassifier
BaggingRegressor
AdaBoostClassifier
AdaBoostRegressor
GradientBoostingClassifier
GradientBoostingRegressor
ExtraTreesClassifier
ExtraTreesRegressor
RandomForestClassifier
RandomForestRegressor
VotingClassifier
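The ensemble classes above can be combined with one another. A minimal sketch, assuming iris, a 70/30 split and soft voting over a random forest and a gradient-boosting model (the names "rf" and "gb" are arbitrary):

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    VotingClassifier,
)
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

forest = RandomForestClassifier(n_estimators=50, random_state=0)
boost = GradientBoostingClassifier(random_state=0)

# Soft voting averages the predicted class probabilities of the members.
vote = VotingClassifier([("rf", forest), ("gb", boost)], voting="soft")
vote.fit(X_train, y_train)
accuracy = vote.score(X_test, y_test)
```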
Generalized linear models (GLM):
BayesianRidge
Lasso
LinearRegression
LogisticRegression
Perceptron
Ridge
SGDClassifier
SGDRegressor
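The difference between the plain and the regularised linear models above shows up in their coefficients. A sketch on noise-free synthetic data where the third feature is irrelevant by construction; the `alpha` values are assumptions:

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression, Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + 0.5  # third feature is not used

ols = LinearRegression().fit(X, y)   # recovers the true coefficients
ridge = Ridge(alpha=1.0).fit(X, y)   # L2 penalty shrinks the coefficient vector
lasso = Lasso(alpha=0.1).fit(X, y)   # L1 penalty can set coefficients to zero
```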
Manifold learning:
Isomap
MDS
LocallyLinearEmbedding
SpectralEmbedding
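A minimal sketch of two of the manifold learners above, unrolling a swiss roll from 3-D to 2-D. The dataset, sample count and `n_neighbors` value are assumptions for illustration:

```python
from sklearn.datasets import make_swiss_roll
from sklearn.manifold import Isomap, LocallyLinearEmbedding

X, _ = make_swiss_roll(n_samples=300, random_state=0)

# Isomap preserves geodesic distances along the neighbourhood graph.
X_iso = Isomap(n_neighbors=10, n_components=2).fit_transform(X)

# LLE preserves each point's reconstruction from its nearest neighbours.
X_lle = LocallyLinearEmbedding(
    n_neighbors=10, n_components=2, random_state=0
).fit_transform(X)
```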
Model evaluation (metrics):
accuracy_score
auc
classification_report
confusion_matrix
f1_score
hamming_loss
log_loss
precision_score
recall_score
roc_auc_score
roc_curve
mean_absolute_error
mean_squared_error
median_absolute_error
r2_score
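Several of the metrics above, worked on small hand-made label vectors (the values are made up so the results can be checked by hand):

```python
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    f1_score,
    mean_absolute_error,
    mean_squared_error,
    precision_score,
    r2_score,
    recall_score,
    roc_auc_score,
)

# Classification: true labels, hard predictions, and predicted scores.
y_true = [0, 0, 1, 1, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0]
y_score = [0.1, 0.6, 0.8, 0.9, 0.4, 0.2]

acc = accuracy_score(y_true, y_pred)      # 4 of 6 correct
prec = precision_score(y_true, y_pred)    # 2 TP / (2 TP + 1 FP)
rec = recall_score(y_true, y_pred)        # 2 TP / (2 TP + 1 FN)
f1 = f1_score(y_true, y_pred)             # harmonic mean of precision and recall
cm = confusion_matrix(y_true, y_pred)     # rows: true class, columns: predicted class
auc_value = roc_auc_score(y_true, y_score)  # 8 of 9 positive/negative pairs ranked correctly

# Regression: true targets and predictions.
t_true = [3.0, 5.0, 7.0]
t_pred = [2.5, 5.0, 8.0]

mae = mean_absolute_error(t_true, t_pred)  # (0.5 + 0 + 1) / 3
mse = mean_squared_error(t_true, t_pred)   # (0.25 + 0 + 1) / 3
r2 = r2_score(t_true, t_pred)              # 1 - 1.25 / 8
```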
Gaussian mixture models:
BayesianGaussianMixture
GaussianMixture
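A minimal GaussianMixture sketch on one-dimensional data drawn from two Gaussians. The component centres at -5 and 5 and the sample sizes are assumptions for the example:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
X = np.concatenate(
    [rng.normal(-5.0, 1.0, 200), rng.normal(5.0, 1.0, 200)]
).reshape(-1, 1)

gmm = GaussianMixture(n_components=2, random_state=0).fit(X)

means = np.sort(gmm.means_.ravel())   # estimated component centres
probs = gmm.predict_proba(X)          # soft assignment of each sample to each component
```

BayesianGaussianMixture has the same interface and can drive the weights of unneeded components towards zero.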
Model selection (model_selection):
GridSearchCV
RandomizedSearchCV
cross_validate
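A sketch of hyperparameter search followed by cross-validation. The SVC estimator, the parameter grid and `cv=5` are assumptions chosen to keep the example small:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, cross_validate
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Exhaustive search over every combination in the grid, scored by 5-fold CV.
param_grid = {"C": [0.1, 1, 10], "kernel": ["linear", "rbf"]}
search = GridSearchCV(SVC(), param_grid=param_grid, cv=5)
search.fit(X, y)

# cross_validate returns per-fold scores and timings as a dict of arrays.
cv_results = cross_validate(
    search.best_estimator_, X, y, cv=5, scoring="accuracy"
)
```

RandomizedSearchCV takes the same arguments but samples a fixed number of candidates (`n_iter`) instead of trying them all.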
Multiclass and multilabel:
OneVsRestClassifier
OneVsOneClassifier
OutputCodeClassifier
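The one-vs-rest and one-vs-one wrappers differ in how many binary classifiers they train. A sketch on a synthetic 4-class problem; the dataset parameters and the LogisticRegression base estimator are assumptions:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsOneClassifier, OneVsRestClassifier

X, y = make_classification(
    n_samples=200,
    n_features=6,
    n_informative=4,
    n_redundant=0,
    n_classes=4,
    n_clusters_per_class=1,
    random_state=0,
)

base = LogisticRegression(max_iter=1000)

# One binary classifier per class: 4 in total.
ovr = OneVsRestClassifier(base).fit(X, y)

# One binary classifier per pair of classes: 4 * 3 / 2 = 6 in total.
ovo = OneVsOneClassifier(base).fit(X, y)

n_ovr = len(ovr.estimators_)
n_ovo = len(ovo.estimators_)
```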
Data preprocessing (preprocessing):
Naive Bayes
Nearest neighbors
Neural networks
Pipeline workflows
SVM
Decision Trees
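The model families listed above share the same fit / score interface, so they can be swapped freely. A minimal sketch on iris, where the particular classes, their settings and the 70/30 split are assumptions; a neural network such as MLPClassifier would slot into the same loop:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

models = {
    "naive_bayes": GaussianNB(),
    "nearest_neighbors": KNeighborsClassifier(n_neighbors=5),
    "svm": SVC(),
    "decision_tree": DecisionTreeClassifier(random_state=0),
}

# Fit each model on the training split and record held-out accuracy.
scores = {
    name: model.fit(X_train, y_train).score(X_test, y_test)
    for name, model in models.items()
}
```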