You need to install nltk first. After installing it, you also need the stopwords corpus, which goes under the corpora folder of nltk_data.
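If you would rather not copy the files by hand, here is a rough sketch of how the download could be done from Python after `pip install nltk`. The helper name `ensure_nltk_data` is my own (hypothetical), not part of nltk. It only downloads a resource when `nltk.data.find` cannot locate it. As far as I know, newer nltk releases may additionally ask for the `punkt_tab` resource when `word_tokenize` is called; if that happens, `nltk.download('punkt_tab')` should fix it.

```python
def ensure_nltk_data():
    """Download the stopword list and tokenizer models if they are missing."""
    # Imported inside the function so a missing nltk install only fails here.
    import nltk

    # resource name for nltk.download -> path checked by nltk.data.find
    resources = {
        "stopwords": "corpora/stopwords",
        "punkt": "tokenizers/punkt",
    }
    for name, path in resources.items():
        try:
            nltk.data.find(path)      # raises LookupError when not installed
        except LookupError:
            nltk.download(name)       # needs a network connection
```

Call `ensure_nltk_data()` once before running the script below.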

import nltk
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize

# Paste the text to be stopword-filtered here (the abstract is shortened in this listing)
text = """Removal of amoxicillin from aqueous solution using sludge-based activated carbon modified ."""

# Build the English stopword set once; set lookup is fast
stop_words = set(stopwords.words('english'))

# Split the text into word and punctuation tokens
word_tokens = word_tokenize(text)

# Keep only the tokens that are not stopwords
filtered_sentence = []
for w in word_tokens:
    if w not in stop_words:
        filtered_sentence.append(w)

print("\n\nFiltered Sentence \n\n")
print(" ".join(filtered_sentence))
The output is:
Filtered Sentence
Removal amoxicillin aqueous solution using sludge-based activated carbon modified walnut shell nano-titanium dioxide . Dewatered municipal sludge used raw material prepare activated carbon ( SAC ) , SAC modified walnut shell nano-titanium dioxide ( MSAC ) . The results showed MSAC higher specific surface area ( S-BET ) ( 279.147 ( 2 ) /g ) total pore volume ( V-T ) ( 0.324 cm ( 3 ) /g ) SAC .
Process finished with exit code 0
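One thing visible in the output above: "The" at the start of a sentence is still there, and so are the brackets and periods. The nltk English stopword list is all lowercase, so `w not in stop_words` does not catch capitalized words. Below is a small sketch of one possible way to handle this, comparing in lowercase and optionally dropping tokens made up only of punctuation. The function name `filter_tokens` and the tiny demo stopword set are my own assumptions for illustration; in the real script you would pass `set(stopwords.words('english'))` and the result of `word_tokenize(text)`.

```python
import string


def filter_tokens(tokens, stop_words, drop_punctuation=True):
    """Return the tokens that are not stopwords, comparing in lowercase."""
    kept = []
    for token in tokens:
        # Lowercase only for the comparison; the original casing is kept
        if token.lower() in stop_words:
            continue
        # Skip tokens made up only of punctuation, e.g. "(", ")", "."
        if drop_punctuation and all(ch in string.punctuation for ch in token):
            continue
        kept.append(token)
    return kept


# Tiny stand-in stopword set, for demonstration only (assumption)
demo_stop_words = {"the", "that", "a", "of"}
demo_tokens = ["The", "results", "showed", "MSAC", "higher", "(", "S-BET", ")", "."]
print(" ".join(filter_tokens(demo_tokens, demo_stop_words)))
```

Whether to drop punctuation depends on what you do next; for word counts it is usually convenient, so I made it a switch.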
I'm just a beginner myself, a humanities master's student...
I'm writing down my own processing steps as I go.
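This post shows an example of stopword removal with Python's nltk library. English stopwords are filtered out of a passage about removing amoxicillin from aqueous solution, covering the whole process from tokenizing the text to applying the stopword filter.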




