Automating the Browser with Python

Last modified 2023-01-11 11:07:07

Tags: #python #selenium #chrome #crawler #html

First published 2022-12-28 18:39:57

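This article shows how to automate a browser with Python's Selenium library: installing the library and a browser driver, enabling headless mode, extracting data with XPATH and CSS_SELECTOR locators, and combining Selenium with BeautifulSoup to parse pages. Selenium is intuitive to work with but relatively slow, so it is best suited to pages that require dynamic interaction.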

1. Install the library

pip install selenium # install selenium on Windows
pip3 install selenium # install selenium on macOS
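
After running one of the commands above, it can be worth checking that the package actually landed in the Python environment you plan to use. Below is a minimal sketch using only the standard library; the helper name `installed_version` is made up for this example and is not part of Selenium.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent.

    `installed_version` is a hypothetical helper for this article.
    """
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Report whether selenium is visible to this interpreter.
print("selenium:", installed_version("selenium") or "not installed")
```

If this prints "not installed" right after a successful install, pip and the interpreter probably belong to different Python installations.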

2. Install the browser driver

Chrome

http://chromedriver.storage.googleapis.com/index.html?path=103.0.5060.134/
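
The link above points to ChromeDriver 103.0.5060.134. ChromeDriver is built for a specific Chrome release, so the driver you download should share its major version (the number before the first dot) with the Chrome installed on your machine. The sketch below compares two version strings that way; the helper names and the sample browser versions are illustrative assumptions, so substitute the version shown on your own chrome://version page.

```python
def major_version(version):
    """Return the major version number of a dotted version string."""
    return int(version.split(".")[0])


def driver_matches_browser(driver_version, browser_version):
    """Hypothetical check: True when driver and browser share a major version."""
    return major_version(driver_version) == major_version(browser_version)


# Sample values for illustration only.
print(driver_matches_browser("103.0.5060.134", "103.0.5060.114"))  # True
print(driver_matches_browser("103.0.5060.134", "108.0.5359.125"))  # False
```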

3. Set up the browser engine

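# Setting up Chrome
from selenium import webdriver
# Import the webdriver module from the selenium package
driver = webdriver.Chrome()
# Use Chrome as the engine; this opens a real Chrome window
driver.quit()
# Quit the browser and end the driver session so no resources are wasted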
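
In Selenium, close() closes only the current window, while quit() ends the whole session, including the driver process. If the script raises an exception before reaching that line, the browser stays open. One way to guard against this, sketched here as a suggestion rather than as part of the original setup, is a small context manager that always calls quit(). The name `browser_session` is hypothetical, and it accepts any zero-argument callable that returns a driver.

```python
from contextlib import contextmanager


@contextmanager
def browser_session(make_driver):
    """Create a driver with `make_driver()` and always quit it afterwards.

    `browser_session` is a hypothetical helper, not a Selenium API.
    """
    driver = make_driver()
    try:
        yield driver
    finally:
        # Runs even if the body of the `with` block raises.
        driver.quit()


# Usage (requires selenium and a matching ChromeDriver):
# from selenium import webdriver
# with browser_session(webdriver.Chrome) as driver:
#     driver.get("https://example.com")
#     print(driver.title)
```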

If you would rather not have the browser window pop up over your other windows, use the following form instead.

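# Headless (silent) mode for a local Chrome browser:
from selenium import webdriver
# Import the webdriver module from the selenium package
from selenium.webdriver.chrome.options import Options
# Import the Options class from the options module

chrome_options = Options()
# Instantiate an Options object
chrome_options.add_argument('--headless') # Run Chrome in headless mode
driver = webdriver.Chrome(options=chrome_options)
# Use Chrome as the engine, running quietly in the background
driver.quit()
# Quit the browser and end the driver session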
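
When the same script needs to run both with and without a visible window, it can help to build the argument list in one place. The following is a sketch under assumptions: `chrome_arguments` and `make_driver` are hypothetical helper names, and the `--window-size` flag is an optional addition not used in the original snippet (a headless browser has no real window, so a fixed size keeps page layouts predictable).

```python
def chrome_arguments(headless=True, window_size=(1920, 1080)):
    """Build a list of Chrome command-line flags (hypothetical helper)."""
    args = []
    if headless:
        args.append("--headless")
    if window_size is not None:
        width, height = window_size
        args.append(f"--window-size={width},{height}")
    return args


def make_driver(headless=True):
    """Create a Chrome driver with the flags above (hypothetical helper)."""
    # Imported here so the flag builder can be used without selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    for arg in chrome_arguments(headless=headless):
        options.add_argument(arg)
    return webdriver.Chrome(options=options)


print(chrome_arguments())  # ['--headless', '--window-size=1920,1080']
```

Calling make_driver(headless=False) while debugging brings the visible window back without touching the rest of the script.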