Generating Word Clouds

Install the wordcloud module

pip install -i https://pypi.tuna.tsinghua.edu.cn/simple/ wordcloud

Build a word cloud from a single repeated word

import numpy as np

from wordcloud import WordCloud

text = "square"

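x, y = np.ogrid[:300, :300]  # coordinate grids for a 300 x 300 canvas

mask = (x - 150) ** 2 + (y - 150) ** 2 > 130 ** 2  # True outside a circle of radius 130 centered at (150, 150)

mask = 255 * mask.astype(int)  # 255 (white) marks the area where no words are drawn

wc = WordCloud(background_color="white", repeat=True, mask=mask)  # repeat=True repeats the word until max_words or min_font_size is reached

wc.generate(text)

wc.to_file('wc.png')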

Generate a word cloud from a single sentence

from wordcloud import WordCloud

wc = WordCloud()  # create the word cloud object

wc.generate('This is not the end. It is not even the beginning of the end. But it is, perhaps, the end of the beginning.')  # generate the word cloud

wc.to_file('wc.png')  # save the word cloud

Generate a word cloud from a txt file

import os

from os import path

from wordcloud import WordCloud

import matplotlib.pyplot as plt

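# Use the script's directory when run as a file, otherwise the current working directory

d = path.dirname(__file__) if "__file__" in locals() else os.getcwd()

text = open(path.join(d, 'test.txt')).read()  # read the whole file into one string

wordcloud = WordCloud(max_font_size=40).generate(text)  # cap the largest font size at 40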

plt.figure()

plt.imshow(wordcloud, interpolation="bilinear")

plt.axis("off")

plt.show()

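Generating a word cloud file takes three steps:

   1. Configure the object's parameters

   2. Load the text for the word cloud

   3. Output the word cloud file (unless specified otherwise, the image defaults to 400 x 200 pixels)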

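wordcloud computes word frequencies in the following steps:

1. Split: separate the text into words on whitespace

2. Count: tally how often each word occurs, then filter

3. Font: assign each word a font size according to its count

4. Layout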
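To make the first three steps concrete, here is a simplified sketch using only the standard library. It is not wordcloud's actual implementation: the stopword set, the helper names `word_frequencies` and `font_sizes`, and the linear scaling between `min_size` and `max_size` are all assumptions made for illustration. The layout step is omitted.

```python
import re
from collections import Counter

# Hypothetical stopword list for illustration; not the list wordcloud ships with.
STOPWORDS = {"the", "is", "it", "of", "not", "but", "this", "even"}


def word_frequencies(text, stopwords=STOPWORDS):
    """Steps 1 and 2: split on whitespace, strip punctuation, count, filter."""
    words = [re.sub(r"[^\w']", "", w).lower() for w in text.split()]
    return Counter(w for w in words if w and w not in stopwords)


def font_sizes(counts, min_size=10, max_size=40):
    """Step 3: scale each count linearly up to max_size (an assumed rule)."""
    top = max(counts.values())
    return {w: min_size + (max_size - min_size) * c / top
            for w, c in counts.items()}


text = ("This is not the end. It is not even the beginning of the end. "
        "But it is, perhaps, the end of the beginning.")
counts = word_frequencies(text)
sizes = font_sizes(counts)
print(counts.most_common())
print(sizes)
```

With this stopword set only three words survive the filter, and the most frequent one receives the largest font size.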

Common parameters

 Example:

import os

from os import path

from wordcloud import WordCloud

d = path.dirname(__file__) if "__file__" in locals() else os.getcwd()

text = open(path.join(d, 'test.txt')).read()

text = text.lower()

wordcloud = WordCloud(background_color="white", width=800, height=660).generate(text)

import matplotlib.pyplot as plt

plt.imshow(wordcloud)

plt.axis("off")

plt.show()

wordcloud.to_file('test.png')

 

 Getting test.txt

Link: https://pan.baidu.com/s/1zfuK9-W5tyq1P8ftlQJuJQ Extraction code: iet4

Further reference: http://amueller.github.io/word_cloud/

    https://github.com/amueller/word_cloud
