中国大学MOOC (China University MOOC): Below is a program that vectorizes a set of documents, with no stop-word filtering applied:

    from sklearn.feature_extraction.text import CountVectorizer
    corpus = ['Jobs was the chairman of Apple Inc., and he was very famous', 'I like to use apple computer', 'And I also like to eat apple']
    vectorizer = CountVectorizer()
    print(vectorizer.vocabulary_)
    print(vectorizer.fit_transform(corpus).todense())  # convert to the full (dense) feature matrix

Given that print(vectorizer.vocabulary_) outputs:

    {u'and': 1, u'jobs': 9, u'apple': 2, u'very': 15, u'famous': 6, u'computer': 4, u'eat': 5, u'he': 7, u'use': 14, u'like': 10, u'to': 13, u'of': 11, u'also': 0, u'chairman': 3, u'the': 12,
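To see where the indices in that vocabulary come from, here is a minimal plain-Python sketch of the counting step. It is not scikit-learn's implementation. It only approximates what `CountVectorizer` does with default settings: it lowercases the text, keeps tokens of two or more word characters, and numbers the terms in alphabetical order. The helper names `tokenize`, `build_vocabulary`, and `count_matrix` are hypothetical and exist only for this illustration. One caveat about the quoted program: in scikit-learn the `vocabulary_` attribute exists only after the vectorizer has been fitted, so printing it before `fit_transform` would fail. The sketch therefore builds the vocabulary first and prints afterwards.

```python
import re
from collections import Counter

corpus = [
    "Jobs was the chairman of Apple Inc., and he was very famous",
    "I like to use apple computer",
    "And I also like to eat apple",
]

# Assumption: mirrors CountVectorizer's default token pattern, which keeps
# only tokens made of two or more word characters (so "I" is dropped).
TOKEN_PATTERN = re.compile(r"\b\w\w+\b")


def tokenize(text):
    """Lowercase the text and split it into tokens (hypothetical helper)."""
    return TOKEN_PATTERN.findall(text.lower())


def build_vocabulary(documents):
    """Map each distinct term to its index in alphabetical order."""
    terms = sorted({token for doc in documents for token in tokenize(doc)})
    return {term: index for index, term in enumerate(terms)}


def count_matrix(documents, vocabulary):
    """Return one row of term counts per document, columns ordered by index."""
    ordered_terms = sorted(vocabulary, key=vocabulary.get)
    rows = []
    for doc in documents:
        counts = Counter(tokenize(doc))
        rows.append([counts.get(term, 0) for term in ordered_terms])
    return rows


vocabulary = build_vocabulary(corpus)
matrix = count_matrix(corpus, vocabulary)

print(vocabulary)
for row in matrix:
    print(row)
```

Each row corresponds to one document and each column to one vocabulary index, so a single term's count in a document can be read off as `matrix[doc_index][vocabulary[term]]`.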