Generating word clouds
Install the wordcloud module
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple/ wordcloud
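Build a word cloud from a single repeated word
import numpy as np
from wordcloud import WordCloud
text = "square"
# Build a 300 x 300 coordinate grid
x, y = np.ogrid[:300, :300]
# True for every pixel outside a circle of radius 130 centered at (150, 150)
mask = (x - 150) ** 2 + (y - 150) ** 2 > 130 ** 2
# White (255) pixels are masked out, so words are drawn only inside the circle
mask = 255 * mask.astype(int)
# repeat=True reuses the word until the available space is filled
wc = WordCloud(background_color="white", repeat=True, mask=mask)
wc.generate(text)
wc.to_file('wc.png')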
Generate a word cloud from a sentence
from wordcloud import WordCloud
wc = WordCloud() # create the word cloud object
wc.generate('This is not the end. It is not even the beginning of the end. But it is, perhaps, the end of the beginning.') # generate the word cloud
wc.to_file('wc.png') # save the word cloud
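Generate a word cloud from a txt file
import os
from os import path
from wordcloud import WordCloud
import matplotlib.pyplot as plt
# Use the script's directory, or the current working directory in an interactive session
d = path.dirname(__file__) if "__file__" in locals() else os.getcwd()
# Read the whole file and close it afterwards
with open(path.join(d, 'test.txt')) as f:
    text = f.read()
wordcloud = WordCloud(max_font_size=40).generate(text)
# Display the image without axes
plt.figure()
plt.imshow(wordcloud, interpolation="bilinear")
plt.axis("off")
plt.show()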
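Generating a word cloud file takes three steps:
1. Configure the object's parameters
2. Load the text for the word cloud
3. Write the word cloud to a file (unless specified otherwise, the default image size is 400 * 200 pixels)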
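wordcloud computes word frequencies in the following steps:
1. Split: separate the text into words at whitespace
2. Count: count how often each word occurs, then filter
3. Font: assign each word a font size according to its count
4. Layout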
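The first three steps above can be sketched in plain Python. This is a simplified illustration, not the wordcloud library's actual implementation: the stopword set, the linear font scaling, and the names word_frequencies and font_sizes are assumptions made for this example, and the layout step is left out.

```python
import re
from collections import Counter

# Hypothetical stopword set, chosen only for this illustration.
STOPWORDS = {"the", "is", "it", "of", "not", "but", "this", "even"}


def word_frequencies(text, stopwords=STOPWORDS):
    """Split text into lowercase words, drop stopwords, and count the rest."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(w for w in words if w not in stopwords)


def font_sizes(freqs, min_size=10, max_size=40):
    """Scale each count linearly so the most frequent word gets max_size."""
    top = max(freqs.values())
    return {w: min_size + (max_size - min_size) * c / top
            for w, c in freqs.items()}


text = ("This is not the end. It is not even the beginning of the end. "
        "But it is, perhaps, the end of the beginning.")
freqs = word_frequencies(text)
sizes = font_sizes(freqs)

# Print words from most to least frequent with their illustrative font size.
for word, count in freqs.most_common():
    print(word, count, sizes[word])
```

With this sentence, "end" is the most frequent remaining word, so it receives the largest font size, in the same way the largest words in the generated image correspond to the highest counts.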
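Common parameters
Example:
import os
from os import path
from wordcloud import WordCloud
import matplotlib.pyplot as plt
d = path.dirname(__file__) if "__file__" in locals() else os.getcwd()
with open(path.join(d, 'test.txt')) as f:
    text = f.read()
# Lowercase the text so differently cased forms of a word are counted together
text = text.lower()
# White background, 800 x 660 pixel image
wordcloud = WordCloud(background_color="white", width=800, height=660).generate(text)
plt.imshow(wordcloud)
plt.axis("off")
plt.show()
# Save the image (the object is named wordcloud here, not wc)
wordcloud.to_file('test.png')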
Getting test.txt
Link: https://pan.baidu.com/s/1zfuK9-W5tyq1P8ftlQJuJQ Extraction code: iet4
Further reading: http://amueller.github.io/word_cloud/
https://github.com/amueller/word_cloud