Python requests: Basic Usage
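This article is intended for learning and reference only. The basic usage shown here is best suited to debugging; for production code, prefer the advanced usage.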
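1. Request types

(***) marks methods that are used all the time, (**) marks methods that are used occasionally, and (*) marks methods that are rarely used.

- GET: retrieve a resource (***)
- POST: create a resource (***)
- PUT: update (replace) a resource (**)
- PATCH: partially update a resource (**)
- DELETE: delete a resource (**)
- HEAD: retrieve only the response headers (*)
- OPTIONS: list the request methods the server allows (*)

Every method is called the same way:

    requests.[method](url)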
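As a rough sketch of what requests.[method](url) stands for: each HTTP method has a helper function of the same name, and all of them go through the generic requests.request(method, url). The example below only builds the requests and never sends them; the URL is a placeholder chosen for illustration.

```python
import requests

# Placeholder URL for illustration; nothing is sent over the network here.
URL = "http://httpbin.org/anything"

METHODS = ["get", "post", "put", "patch", "delete", "head", "options"]

prepared = {}
for name in METHODS:
    # Each HTTP method has a helper of the same name: requests.get, requests.post, ...
    helper = getattr(requests, name)
    assert callable(helper)
    # Request(...).prepare() builds the request object that would be sent.
    prepared[name] = requests.Request(name.upper(), URL).prepare()

for name, req in prepared.items():
    print(name, "->", req.method, req.url)
```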
2. Passing URL parameters with GET

    In [1]: import requests

    In [2]: url = 'http://www.test.com'

    In [3]: params = {"key1": "val1", "key2": "val2"}

    In [4]: r = requests.get(url, params)

    In [5]: r.url
    Out[5]: 'https://www.test.com/?key1=val1&key2=val2'

    In [6]: params2 = {"key1": "val1", "key2": ["v2", "val2"]}

    In [7]: r = requests.get(url, params2)

    In [8]: r.url
    Out[8]: 'https://www.test.com/?key1=val1&key2=v2&key2=val2'
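- requests.get(url, params={"key1": "val1", "key2": "val2"}) appends the parameters to the URL as a query string.
- The params argument is mostly used with GET requests, although the other request methods accept it too.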
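To see how params ends up in the URL without touching the network, one option is to build the request and inspect it before sending. This is a minimal sketch that reuses the placeholder host from the session above; Request(...).prepare() is used here only for inspection.

```python
import requests

url = "https://www.test.com"
params = {"key1": "val1", "key2": ["v2", "val2"]}

# Build the request without sending it, then inspect the final URL.
prepared = requests.Request("GET", url, params=params).prepare()
print(prepared.url)
```

A list value is expanded into one key=value pair per item, which matches the second r.url shown earlier.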
3. Response content

3.1 Text content and its encoding

    In [3]: r = requests.get("https://www.baidu.com")

    In [4]: r.url
    Out[4]: 'https://www.baidu.com/'

    In [5]: r.text
    Out[5]: '<!DOCTYPE html>\r\n<!--STATUS OK--><html> <head><meta http-equiv=content-type content=text/html;charset=utf-8><meta http-equiv=X-UA-Compatible content=IE=Edge><meta content=always name=referrer><link rel=stylesheet type=text/css href=https://ss1.bdstatic.com/5eN1bjq8AAUYm2zgoY3K/r/www/cache/bdorz/baidu.min.css><title>莽\x99戮氓潞娄盲赂\x80盲赂\x8b茂录\x8c盲陆\xa0氓掳卤莽\x9f楼茅\x81\x93</title></head> <body link=#0000cc> ... several thousand characters omitted ...</body> </html>\r\n'

    In [6]: r.encoding
    Out[6]: 'ISO-8859-1'

    In [7]: r.encoding = 'utf-8'

    In [9]: r.encoding
    Out[9]: 'utf-8'

r.text decodes the response body with r.encoding. The title above is garbled because the page is UTF-8 but was decoded as ISO-8859-1; after r.encoding is set to 'utf-8', r.text returns the correctly decoded text.
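The effect of a wrong r.encoding can be reproduced with plain bytes. The sketch below does not use requests at all; the sample string is an assumption standing in for a UTF-8 page title.

```python
# Stand-in for the raw bytes of a UTF-8 encoded page title.
raw = "百度一下".encode("utf-8")

# Decoding as ISO-8859-1 never fails, it just yields garbled text.
garbled = raw.decode("iso-8859-1")

# Decoding with the right codec recovers the original text.
fixed = raw.decode("utf-8")

print(repr(garbled))
print(fixed)
```

When the correct encoding is unknown, r.apparent_encoding offers a guess based on the body itself.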
3.2 Binary content (commonly used)

    In [23]: r.content
    Out[23]: ...several thousand bytes omitted...

An image URL makes a good example: when r points to an image, the bytes can be loaded directly.

    In [24]: from PIL import Image

    In [25]: from io import BytesIO

    In [26]: i = Image.open(BytesIO(r.content))
3.3 JSON content (commonly used)

    In [1]: r = requests.get('https://api.github.com/events')

    In [2]: r.json()
    Out[2]: [{'id': '13092490355', 'type': 'PushEvent', 'actor': {'id': 46406730, 'login': 'NekoSilverFox ... }]
3.4 Raw response content

    In [1]: r = requests.get('https://api.github.com/events', stream=True)

    In [2]: r.json()
    Out[2]: [{'id': '13092490355', 'type': 'PushEvent', 'actor': {'id': 46406730, 'login': 'NekoSilverFox ... }]

    In [3]: r.raw
    Out[3]: <urllib3.response.HTTPResponse at 0x1b4c491ddd8>

    In [4]: r.raw.read(10)
    Out[4]: b''

r.raw.read(10) returns b'' here because r.json() has already consumed the body. To get the raw bytes, read r.raw before accessing the content in any other way.
In most cases, save the stream to a file with the following pattern instead of reading r.raw directly:

    with open(filename, 'wb') as fd:
        for chunk in r.iter_content(chunk_size):
            fd.write(chunk)
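The pattern above can be wrapped in a small helper. This is a sketch only: save_stream and FakeResponse are hypothetical names introduced here, and the fake response exists purely so the example runs without a network connection.

```python
import io


def save_stream(response, fileobj, chunk_size=8192):
    """Write a streamed response body to fileobj chunk by chunk; return bytes written."""
    written = 0
    for chunk in response.iter_content(chunk_size):
        if chunk:  # skip empty chunks
            fileobj.write(chunk)
            written += len(chunk)
    return written


class FakeResponse:
    """Hypothetical stand-in for a requests response, used only for this demo."""

    def __init__(self, body):
        self._body = body

    def iter_content(self, chunk_size):
        for start in range(0, len(self._body), chunk_size):
            yield self._body[start:start + chunk_size]


buffer = io.BytesIO()
total = save_stream(FakeResponse(b"x" * 20000), buffer, chunk_size=8192)
print(total)
```

With a real download, the response would come from requests.get(url, stream=True) and fileobj from open(filename, 'wb').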
3.5 Custom request headers

    >>> url = 'https://api.github.com/some/endpoint'
    >>> headers = {'user-agent': 'my-app/0.0.1'}
    >>> r = requests.get(url, headers=headers)
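4. POST requests

4.1 Uploading data

- requests.post(url, data={"key1": "val1", "key2": "val2"}) passes a dict to the data argument, which is sent form-encoded. data also accepts a list of tuples, or a string (in which case serialize the payload with json.dumps() first).
- requests.post(url, json={"key1": "val1", "key2": "val2"}) passes a dict to the json argument, which is JSON-encoded automatically.
- A POST request carries its body in data or json. params, if given, goes into the URL query string rather than the body.
- If the API expects a raw JSON body (application/json) and the payload is passed as a string through data, add a headers argument containing {"Content-Type": "application/json; charset=UTF-8"}.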
    import json
    import requests

    headers = {
        "Content-Type": "application/json; charset=UTF-8",
    }
    # url and payload are placeholders for the endpoint and the dict to send
    resp = requests.post(url, data=json.dumps(payload), headers=headers)

Passing a dict to data:

    >>> data = {'key1': 'value1', 'key2': 'value2'}
    >>> r = requests.post("http://httpbin.org/post", data=data)
    >>> print(r.text)
    {
      ...
      "form": {
        "key2": "value2",
        "key1": "value1"
      },
      ...
    }

Passing a tuple of pairs to data, which allows a key to repeat:

    >>> payload = (('key1', 'value1'), ('key1', 'value2'))
    >>> r = requests.post('http://httpbin.org/post', data=payload)
    >>> print(r.text)
    {
      ...
      "form": {
        "key1": [
          "value1",
          "value2"
        ]
      },
      ...
    }

Passing a JSON string to data:

    >>> import json
    >>> url = 'https://api.github.com/some/endpoint'
    >>> json_dict = {"some": "data"}
    >>> r = requests.post(url, data=json.dumps(json_dict))

Passing a dict to json:

    >>> url = 'https://api.github.com/some/endpoint'
    >>> json_dict = {"some": "data"}
    >>> r = requests.post(url, json=json_dict)
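The difference between data and json is easiest to see in the prepared request. The following sketch builds three POST requests without sending them and prints the Content-Type and body of each; the payload is the same illustrative dict used above.

```python
import json
import requests

url = "http://httpbin.org/post"  # test endpoint used above; nothing is sent here
payload = {"some": "data"}

# data=<dict> is form-encoded.
form_req = requests.Request("POST", url, data=payload).prepare()

# json=<dict> is serialized to JSON and the Content-Type header is set automatically.
json_req = requests.Request("POST", url, json=payload).prepare()

# data=<str> is sent as-is, so the Content-Type has to be set by hand.
raw_req = requests.Request(
    "POST",
    url,
    data=json.dumps(payload),
    headers={"Content-Type": "application/json; charset=UTF-8"},
).prepare()

for name, req in [("data", form_req), ("json", json_req), ("raw", raw_req)]:
    print(name, req.headers.get("Content-Type"), req.body)
```

In practice this suggests that json= is usually the simpler choice for JSON APIs, while the explicit header is only needed when a pre-serialized string goes through data=.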
4.2 Uploading files

The general form is:

    r = requests.post(url, files={'file': open('report.xls', 'rb')})

4.2.1 The simplest upload

    >>> url = 'http://httpbin.org/post'
    >>> files = {'file': open('report.xls', 'rb')}
    >>> r = requests.post(url, files=files)
    >>> r.text
    {
      ...
      "files": {
        "file": "<censored...binary...data>"
      },
      ...
    }

4.2.2 Setting the filename, content type, and headers explicitly

    >>> url = 'http://httpbin.org/post'
    >>> files = {'file': ('report.xls', open('report.xls', 'rb'), 'application/vnd.ms-excel', {'Expires': '0'})}
    >>> r = requests.post(url, files=files)
    >>> r.text
    {
      ...
      "files": {
        "file": "<censored...binary...data>"
      },
      ...
    }
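To check what a files upload actually produces, the request can be prepared and inspected without being sent. In this sketch an in-memory BytesIO object stands in for open('report.xls', 'rb'), and its contents are made up for the demo.

```python
import io
import requests

url = "http://httpbin.org/post"

# BytesIO stands in for open('report.xls', 'rb'). With a real file,
# open it in a with-block so it is closed after the upload.
fake_file = io.BytesIO(b"demo bytes")
files = {
    "file": ("report.xls", fake_file, "application/vnd.ms-excel", {"Expires": "0"})
}

# Build the multipart request without sending it.
prepared = requests.Request("POST", url, files=files).prepare()
print(prepared.headers["Content-Type"])
print(len(prepared.body), "bytes in the multipart body")
```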
5. Response status codes

    >>> r = requests.get('https://www.baidu.com')
    >>> r.status_code
    200
    >>> r.status_code == requests.codes.ok
    True
    >>> bad_r = requests.get("http://httpbin.org/status/404")
    >>> bad_r.status_code
    404
    >>> bad_r.raise_for_status()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "D:\Anaconda3\lib\site-packages\requests\models.py", line 940, in raise_for_status
        raise HTTPError(http_error_msg, response=self)
    requests.exceptions.HTTPError: 404 Client Error: NOT FOUND for url: http://httpbin.org/status/404
    >>> r.raise_for_status()
    >>>
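In a script, raise_for_status() is usually wrapped in try/except rather than left to print a traceback. A minimal sketch follows; describe is a hypothetical helper, and the Response objects are built by hand only so the example runs offline (normally they come from requests.get(...)).

```python
import requests


def describe(response):
    """Return 'ok' for a successful response, or a short message for 4xx/5xx."""
    try:
        response.raise_for_status()
    except requests.exceptions.HTTPError as exc:
        return f"failed with status {exc.response.status_code}"
    return "ok"


# Hand-built responses, used only to avoid network access in this demo.
good = requests.Response()
good.status_code = 200
bad = requests.Response()
bad.status_code = 404

print(describe(good))
print(describe(bad))
```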
6. Response headers

    >>> r.headers
    {'Cache-Control': 'private, no-cache, no-store, proxy-revalidate, no-transform', 'Connection': 'keep-alive', 'Content-Encoding': 'gzip', 'Content-Type': 'text/html', 'Date': 'Mon, 03 Aug 2020 09:15:49 GMT', 'Last-Modified': 'Mon, 23 Jan 2017 13:23:51 GMT', 'Pragma': 'no-cache', 'Server': 'bfe/1.0.8.18', 'Set-Cookie': 'BDORZ=27315; max-age=86400; domain=.baidu.com; path=/', 'Transfer-Encoding': 'chunked'}
    >>> r.headers.get('content-type')
    'text/html'
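The lookup with a lowercase 'content-type' works because r.headers is a case-insensitive dictionary. The sketch below builds one directly from two of the header values shown above, so no request is needed.

```python
from requests.structures import CaseInsensitiveDict

# r.headers is an instance of this class; two values from above are reused here.
headers = CaseInsensitiveDict({"Content-Type": "text/html", "Server": "bfe/1.0.8.18"})

print(headers["content-type"])      # lookup ignores case
print(headers.get("CONTENT-TYPE"))  # so does .get()
print(headers.get("x-missing"))     # missing headers give None
```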
7. Cookies

7.1 Quick access to cookies

    >>> url = 'http://example.com/some/cookie/setting/url'
    >>> r = requests.get(url)
    >>> r.cookies['example_cookie_name']
    'example_cookie_value'

7.2 Sending cookies to the server with the cookies argument

    >>> url = 'http://example.com/some/cookie/setting/url'
    >>> r = requests.get(url)
    >>> r.cookies
    <RequestsCookieJar[]>
    >>> r.cookies.get('example_cookie_name')
    >>> test_cookies = dict(cookies_are='working')
    >>> r = requests.get(url, cookies=test_cookies)
    >>> r.text
    '...several thousand characters omitted...'
    >>>
7.3 The returned cookies object

Cookies are returned as a RequestsCookieJar, which behaves much like a dict. A jar can also be passed to the cookies argument.

    >>> jar = requests.cookies.RequestsCookieJar()
    >>> jar.set('tasty_cookie', 'yum', domain='httpbin.org', path='/cookies')
    >>> jar.set('gross_cookie', 'blech', domain='httpbin.org', path='/elsewhere')
    >>> url = 'http://httpbin.org/cookies'
    >>> r = requests.get(url, cookies=jar)
    >>> r.text
    '{"cookies": {"tasty_cookie": "yum"}}'
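8. Request timeouts (production code must always set this parameter)

    requests.get(url, timeout=(3, 7))  # (connect timeout, read timeout) in seconds
    requests.get(url, timeout=10)      # a single value applies to both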
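Setting a timeout only helps if the resulting exception is handled. The sketch below shows one possible shape; fetch_status and always_times_out are hypothetical names, and the injectable get argument is an assumption made so the example can run without a network.

```python
import requests


def fetch_status(url, get=requests.get, timeout=(3, 7)):
    """Return the status code, or None if the request timed out.

    `get` is injectable so the function can be exercised offline.
    """
    try:
        response = get(url, timeout=timeout)
    except requests.exceptions.Timeout:
        # Covers both ConnectTimeout and ReadTimeout.
        return None
    return response.status_code


def always_times_out(url, timeout):
    # Hypothetical stub that simulates a server that never answers.
    raise requests.exceptions.ConnectTimeout(f"no answer from {url} within {timeout}")


result = fetch_status("http://example.invalid/", get=always_times_out)
print(result)
```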
Python requests: Advanced Usage
1. Session objects

1.1 A Session object persists certain parameters across requests and exposes all the methods of the main requests API.

    >>> import requests
    >>> s = requests.Session()
    >>> s.get("http://httpbin.org/cookies/set/sessioncookie/123456789")
    <Response [200]>
    >>> r = s.get("http://httpbin.org/cookies")
    >>> r.text
    '{\n "cookies": {\n "sessioncookie": "123456789"\n }\n}\n'
1.2 A session can supply default data for request methods

    >>> s = requests.Session()
    >>> s.auth = ("user", "pass")
    >>> s.headers.update({"x-test": "true"})
    >>> s.get('http://httpbin.org/headers', headers={'x-test2': 'true'})
    <Response [200]>
1.3 Parameter merging

Any dict passed to a request method is merged with the session-level values that have been set, and method-level parameters override session-level ones. Method-level parameters are not persisted across requests.

    >>> s = requests.Session()
    >>> r = s.get('http://httpbin.org/cookies', cookies={'from-my': 'browser'})
    >>> r.text
    '{\n "cookies": {\n "from-my": "browser"\n }\n}\n'
    >>> r = s.get('http://httpbin.org/cookies')
    >>> r.text
    '{\n "cookies": {}\n}\n'
    >>>
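The same merging rules apply to headers, and they can be checked offline with Session.prepare_request, which applies the session defaults to a request without sending it. This is a sketch under that assumption; the header names follow the examples above.

```python
import requests

with requests.Session() as s:
    s.headers.update({"x-test": "true"})

    # Method-level headers are merged with the session-level ones for this request only.
    merged = s.prepare_request(
        requests.Request("GET", "http://httpbin.org/headers", headers={"x-test2": "true"})
    )

    # A method-level value overrides the session-level value of the same name.
    override = s.prepare_request(
        requests.Request("GET", "http://httpbin.org/headers", headers={"x-test": "false"})
    )

    # The session defaults themselves are left unchanged.
    session_value = s.headers["x-test"]
    leaked = "x-test2" in s.headers

print(merged.headers["x-test"], merged.headers["x-test2"])
print(override.headers["x-test"], session_value, leaked)
```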
1.4 Recommended style: use the session as a context manager

    with requests.Session() as s:
        s.get('http://httpbin.org/cookies/set/sessioncookie/123456789')

    with requests.Session() as s:
        resp = s.request(method="GET", url=url)  # or method="POST"

    content_type = ["application/json", "application/stream", "text/explain"]
To be continued…
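Common Python tips

1. Make more use of libraries.
2. Mind where data structures are placed.
3. Minimize explicit for loops.
4. Avoid global variables.
5. Use list comprehensions more, instead of repeated list.append calls.
6. Replace range() with xrange() (Python 2 only; in Python 3, range() is already lazy).
7. Use generators, which compute data in chunks.
8. Concatenate strings with join().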
Python notes: the split() function in detail

1. split()

- Syntax
  - str.split(sep=None, maxsplit=-1)
- Parameters
  - sep: the separator. By default the string is split on any whitespace, including spaces, newlines (\n), and tabs (\t).
  - maxsplit: the maximum number of splits. Defaults to -1, meaning split at every occurrence.
- Note: when no separator is given and the string is split on whitespace, empty items between consecutive whitespace characters are dropped automatically.
- Examples

    >>> text = "Line1-abcdef \nLine2-abc Line4-abcd"
    >>> print(text.split())
    ['Line1-abcdef', 'Line2-abc', 'Line4-abcd']

    >>> text = "Line1-abcdef \nLine2-abc Line4-abcd"
    >>> (a, b, c) = text.split()
    >>> print(a)
    Line1-abcdef
    >>> print(b)
    Line2-abc
    >>> print(c)
    Line4-abcd

    >>> text = "hello boy<[www.baidu.com]>byebye"
    >>> print(text.split("[")[1].split("]")[0])
    www.baidu.com
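The note about empty items applies only when no separator is passed. A short sketch with made-up strings shows the difference between the default behaviour, an explicit space separator, and a limited number of splits.

```python
text = "a  b \n c"

# No separator: runs of whitespace count as one separator, empty items are dropped.
default_parts = text.split()
print(default_parts)

# Explicit separator: every single space splits, so empty strings are kept.
space_parts = "a  b".split(" ")
print(space_parts)

# maxsplit limits the number of splits; the remainder stays in the last item.
limited_parts = "k1=v1=v2".split("=", 1)
print(limited_parts)
```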
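2. os.path.split()

- Syntax
  - os.path.split(path)
- Parameters
  - If path contains a directory and a file name, the result is the directory path and the file name.
  - If path is only a directory name ending in a separator, the result is the path and an empty file name.
- Example (on Windows)

    >>> import os
    >>> file_path = "E:\\desktop\\cangjie.avi"
    >>> print(os.path.split(file_path))
    ('E:\\desktop', 'cangjie.avi')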
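os.path.split follows the path rules of the operating system it runs on, so the Windows example above gives a different result on Linux or macOS. To try both behaviours on any machine, the sketch below calls the platform-specific modules ntpath and posixpath directly; the POSIX path is a made-up example.

```python
import ntpath     # Windows path rules (what os.path is on Windows)
import posixpath  # POSIX path rules (what os.path is on Linux and macOS)

# Directory plus file name: head is the directory, tail is the file name.
win_file = ntpath.split("E:\\desktop\\cangjie.avi")
print(win_file)

# Directory ending in a separator: tail is an empty string.
win_dir = ntpath.split("E:\\desktop\\")
print(win_dir)

# The same split on a hypothetical POSIX path.
posix_file = posixpath.split("/home/user/report.xls")
print(posix_file)
```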