Python requests: Basic Usage

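This article is for learning and reference only. The basic usage shown here is best suited to debugging; for production code, consider the advanced usage.
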
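1. Request methods
(***) marks methods used frequently, (**) methods used occasionally, and (*) methods rarely used.
 - GET: retrieve a resource (***)
 - POST: create a resource (***)
 - PUT: update a resource (**)
 - PATCH: partially update a resource (**)
 - DELETE: delete a resource (**)
 - HEAD: fetch only the response headers (*)
 - OPTIONS: list the request methods the server allows (*)

Every method is called the same way:

    requests.[method](url)
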
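As a minimal offline sketch of that call form: each helper such as requests.get is named after an HTTP method, and the same request can be described with a requests.Request object and inspected before anything is sent. The URL here is only a placeholder and is never contacted.

```python
import requests

url = "http://httpbin.org/anything"  # placeholder URL; never contacted here

# Building a Request and preparing it shows what would be sent,
# without opening a connection.
for method in ("GET", "POST", "PUT", "PATCH", "DELETE", "HEAD", "OPTIONS"):
    prepared = requests.Request(method, url).prepare()
    print(prepared.method, prepared.url)

# The module-level helpers are named after the methods:
# requests.get(url), requests.post(url), requests.put(url), ...
helpers = [requests.get, requests.post, requests.put, requests.patch,
           requests.delete, requests.head, requests.options]
```
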
2. Passing URL parameters with GET

    In [1]: import requests
    In [2]: url = 'http://www.test.com'
    In [3]: params = {"key1": "val1", "key2": "val2"}
    In [4]: r = requests.get(url, params=params)
    In [5]: r.url
    Out[5]: 'https://www.test.com/?key1=val1&key2=val2'
    In [6]: params2 = {"key1": "val1", "key2": ["v2", "val2"]}
    In [7]: r = requests.get(url, params=params2)
    In [8]: r.url
    Out[8]: 'https://www.test.com/?key1=val1&key2=v2&key2=val2'

- requests.get(url, params={"key1": "val1", "key2": "val2"}) appends the parameters to the URL as a query string. A list value produces one key=value pair per element.
 - params is mostly used with GET, but requests accepts it for the other methods as well.

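The encoding can also be checked without sending anything. The sketch below prepares the request offline, using the same placeholder host as the session above, and prints the URL that requests would use.

```python
import requests

url = "http://www.test.com"  # same placeholder host as above; not contacted
params = {"key1": "val1", "key2": ["v2", "val2"]}

# prepare() applies the same URL encoding that requests.get() would use.
prepared = requests.Request("GET", url, params=params).prepare()
print(prepared.url)
```
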
3. Response content
3.1 Text content and its encoding

    In [3]: r = requests.get("https://www.baidu.com")
    In [4]: r.url
    Out[4]: 'https://www.baidu.com/'
    In [5]: r.text
    Out[5]: '<!DOCTYPE html>\r\n<!--STATUS OK--><html> <head><meta http-equiv=content-type content=text/html;charset=utf-8><meta http-equiv=X-UA-Compatible content=IE=Edge><meta content=always name=referrer><link rel=stylesheet type=text/css href=https://ss1.bdstatic.com/5eN1bjq8AAUYm2zgoY3K/r/www/cache/bdorz/baidu.min.css><title>ç\x99¾åº¦ä¸\x80ä¸\x8bï¼\x8cä½\xa0å°±ç\x9f¥é\x81\x93</title></head> <body link=#0000cc> ... several thousand characters omitted ...</body> </html>\r\n'
    In [6]: r.encoding
    Out[6]: 'ISO-8859-1'
    In [7]: r.encoding = 'utf-8'
    In [9]: r.encoding
    Out[9]: 'utf-8'

The garbled title shows that the body was decoded with the wrong encoding. Setting r.encoding to the correct value makes the next access to r.text decode the body properly.

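The effect can be reproduced with plain strings. As far as I know, requests falls back to ISO-8859-1 for text/* responses whose Content-Type header names no charset, which matches the session above. The sketch below uses only the standard library; the sample text is the beginning of the page title.

```python
# "百度一下" is the start of the page title in the session above.
title = "百度一下"
raw_bytes = title.encode("utf-8")        # the bytes the server sends

wrong = raw_bytes.decode("iso-8859-1")   # like r.text with the guessed encoding
right = raw_bytes.decode("utf-8")        # like r.text after r.encoding = 'utf-8'

print(repr(wrong))
print(right)
```

If the correct encoding is not known in advance, r.apparent_encoding offers a guess based on the response body.
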
3.2 Binary content (commonly used)

    In [23]: r.content
    Out[23]: ... several thousand bytes omitted ...

A URL that points to an image makes a good example:

    In [24]: from PIL import Image
    In [25]: from io import BytesIO
    In [26]: i = Image.open(BytesIO(r.content))

3.3 JSON content (commonly used)

    In [1]: r = requests.get('https://api.github.com/events')
    In [2]: r.json()
    Out[2]:
    [{'id': '13092490355',
      'type': 'PushEvent',
      'actor': {'id': 46406730,
       'login': 'NekoSilverFox
       ... }]

3.4 Raw response content

    In [1]: r = requests.get('https://api.github.com/events', stream=True)
    In [2]: r.json()
    Out[2]:
    [{'id': '13092490355',
      'type': 'PushEvent',
      'actor': {'id': 46406730,
       'login': 'NekoSilverFox
       ... }]
    In [3]: r.raw
    Out[3]: <urllib3.response.HTTPResponse at 0x1b4c491ddd8>
    In [4]: r.raw.read(10)
    Out[4]: b''

Here r.raw.read(10) returns b'' because r.json() has already consumed the stream. To read from r.raw, do it before accessing r.json(), r.text, or r.content.

In most cases, use the following pattern to save a streamed response to a file instead of reading r.raw directly:

    with open(filename, 'wb') as fd:
        for chunk in r.iter_content(chunk_size):
            fd.write(chunk)

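The pattern above can be wrapped in a small helper. This is only a sketch: save_stream is a hypothetical name, and a hand-built Response backed by an in-memory file stands in for requests.get(url, stream=True) so that the example runs offline.

```python
import io
import os
import tempfile

import requests


def save_stream(response, filename, chunk_size=8192):
    """Write a streamed response body to filename; return bytes written."""
    written = 0
    with open(filename, "wb") as fd:
        for chunk in response.iter_content(chunk_size):
            fd.write(chunk)
            written += len(chunk)
    return written


# Offline stand-in for requests.get(url, stream=True): a Response whose
# raw attribute is an in-memory file. Real code would pass the object
# returned by requests.get(..., stream=True) instead.
fake = requests.Response()
fake.status_code = 200
fake.raw = io.BytesIO(b"x" * 20000)

target = os.path.join(tempfile.mkdtemp(), "download.bin")
print(save_stream(fake, target, chunk_size=8192))
```

Writing chunk by chunk keeps memory use bounded by chunk_size, which matters for large downloads.
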
3.5 Custom request headers

    >>> url = 'https://api.github.com/some/endpoint'
    >>> headers = {'user-agent': 'my-app/0.0.1'}
    >>> r = requests.get(url, headers=headers)

4. POST requests
4.1 Uploading data
requests.post(url, data={"key1": "val1", "key2": "val2"}) passes a dictionary to the data parameter, which is sent form-encoded. data also accepts a list of tuples, or a string (for a JSON body, serialize it first with json.dumps()).
requests.post(url, json={"key1": "val1", "key2": "val2"}) passes the dictionary to the json parameter, which serializes it to JSON automatically.
- A POST request usually carries its payload in data or json. params is also accepted, but it only adds a query string to the URL.
 - If the server expects a raw JSON body (application/json) and you send it through data, add a headers argument containing {"Content-Type": "application/json; charset=UTF-8"}. The json parameter sets the Content-Type for you.

    # url and payload are assumed to be defined elsewhere
    import json
    import requests
    headers = {
        "Content-Type": "application/json; charset=UTF-8",
    }
    resp = requests.post(url, data=json.dumps(payload), headers=headers)

    >>> data = {'key1': 'value1', 'key2': 'value2'}
    >>> r = requests.post("http://httpbin.org/post", data=data)
    >>> print(r.text)
    {
      ...
      "form": {
        "key2": "value2",
        "key1": "value1"
      },
      ...
    }

    >>> payload = (('key1', 'value1'), ('key1', 'value2'))
    >>> r = requests.post('http://httpbin.org/post', data=payload)
    >>> print(r.text)
    {
      ...
      "form": {
        "key1": [
          "value1",
          "value2"
        ]
      },
      ...
    }

    >>> import json
    >>> url = 'https://api.github.com/some/endpoint'
    >>> json_dict = {"some": "data"}
    >>> r = requests.post(url, data=json.dumps(json_dict))

    >>> url = 'https://api.github.com/some/endpoint'
    >>> json_dict = {"some": "data"}
    >>> r = requests.post(url, json=json_dict)

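To see how data and json differ on the wire, the requests can be prepared offline and their bodies inspected. A minimal sketch, assuming a placeholder URL that is never contacted:

```python
import json
from urllib.parse import parse_qs

import requests

url = "http://httpbin.org/post"  # placeholder; no request is sent

# data=dict -> form-encoded body
form = requests.Request(
    "POST", url, data={"key1": "value1", "key2": "value2"}
).prepare()
print(form.headers["Content-Type"], form.body)

# data=list of tuples -> repeated keys are allowed
multi = requests.Request(
    "POST", url, data=[("key1", "value1"), ("key1", "value2")]
).prepare()
print(multi.body)

# json=dict -> JSON body, Content-Type set automatically
as_json = requests.Request("POST", url, json={"some": "data"}).prepare()
print(as_json.headers["Content-Type"], as_json.body)

# data=json.dumps(...) -> same body, but Content-Type must be set by hand
manual = requests.Request(
    "POST", url,
    data=json.dumps({"some": "data"}),
    headers={"Content-Type": "application/json; charset=UTF-8"},
).prepare()
print(manual.headers["Content-Type"], manual.body)
```
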
4.2 Uploading files
The general form is:

    r = requests.post(url, files={'file': open('report.xls', 'rb')})

4.2.1 The simplest upload

    >>> url = 'http://httpbin.org/post'
    >>> files = {'file': open('report.xls', 'rb')}
    >>> r = requests.post(url, files=files)
    >>> r.text
    {
      ...
      "files": {
        "file": "<censored...binary...data>"
      },
      ...
    }

4.2.2 Setting the file name, content type, and headers explicitly

    >>> url = 'http://httpbin.org/post'
    >>> files = {'file': ('report.xls', open('report.xls', 'rb'), 'application/vnd.ms-excel', {'Expires': '0'})}
    >>> r = requests.post(url, files=files)
    >>> r.text
    {
      ...
      "files": {
        "file": "<censored...binary...data>"
      },
      ...
    }

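The multipart body that files produces can be inspected offline as well. In this sketch an in-memory file with made-up bytes stands in for report.xls, and the URL is a placeholder.

```python
import io

import requests

url = "http://httpbin.org/post"  # placeholder; no request is sent

# An in-memory file stands in for open('report.xls', 'rb'). With a real
# file, wrap the call in "with open(...) as fh:" so the handle is closed.
fh = io.BytesIO(b"fake spreadsheet bytes")
files = {"file": ("report.xls", fh, "application/vnd.ms-excel", {"Expires": "0"})}

prepared = requests.Request("POST", url, files=files).prepare()
print(prepared.headers["Content-Type"])   # multipart/form-data with a boundary
print(b'filename="report.xls"' in prepared.body)
```
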
5. Response status codes

    >>> r = requests.get('https://www.baidu.com')
    >>> r.status_code
    200
    >>> r.status_code == requests.codes.ok
    True
    >>> bad_r = requests.get("http://httpbin.org/status/404")
    >>> bad_r.status_code
    404
    >>> bad_r.raise_for_status()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "D:\Anaconda3\lib\site-packages\requests\models.py", line 940, in raise_for_status
        raise HTTPError(http_error_msg, response=self)
    requests.exceptions.HTTPError: 404 Client Error: NOT FOUND for url: http://httpbin.org/status/404
    >>> r.raise_for_status()
    >>>

raise_for_status() raises HTTPError for 4xx and 5xx responses and returns None otherwise.

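In scripts, raise_for_status() is usually wrapped in try/except. The following sketch uses a hypothetical helper named describe and hand-built Response objects in place of real network replies, so it runs offline.

```python
import requests


def describe(response):
    """Return 'ok' for a successful response, else a short error string."""
    try:
        response.raise_for_status()
    except requests.exceptions.HTTPError as exc:
        return f"failed: {exc}"
    return "ok"


# Hand-built Response objects stand in for real network replies here.
good = requests.Response()
good.status_code = requests.codes.ok          # 200

bad = requests.Response()
bad.status_code = requests.codes.not_found    # 404
bad.url = "http://httpbin.org/status/404"

print(describe(good))
print(describe(bad))
```
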
6. Response headers

    >>> r.headers
    {'Cache-Control': 'private, no-cache, no-store, proxy-revalidate, no-transform',
     'Connection': 'keep-alive',
     'Content-Encoding': 'gzip',
     'Content-Type': 'text/html',
     'Date': 'Mon, 03 Aug 2020 09:15:49 GMT',
     'Last-Modified': 'Mon, 23 Jan 2017 13:23:51 GMT',
     'Pragma': 'no-cache',
     'Server': 'bfe/1.0.8.18',
     'Set-Cookie': 'BDORZ=27315; max-age=86400; domain=.baidu.com; path=/',
     'Transfer-Encoding': 'chunked'}
    >>> r.headers.get('content-type')
    'text/html'

Header names are case-insensitive, so 'content-type' and 'Content-Type' return the same value.

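r.headers is a case-insensitive dictionary. The same class can be used directly, as in this small sketch that reuses two of the header values shown above:

```python
from requests.structures import CaseInsensitiveDict

headers = CaseInsensitiveDict(
    {"Content-Type": "text/html", "Server": "bfe/1.0.8.18"}
)

print(headers["content-type"])
print(headers.get("CONTENT-TYPE"))
print(headers.get("x-missing", "n/a"))   # default for an absent header
```
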
7. Cookies
7.1 Reading cookies quickly

    >>> url = 'http://example.com/some/cookie/setting/url'
    >>> r = requests.get(url)
    >>> r.cookies['example_cookie_name']
    'example_cookie_value'

7.2 Sending cookies to the server with the cookies parameter

    >>> url = 'http://example.com/some/cookie/setting/url'
    >>> r = requests.get(url)
    >>> r.cookies
    <RequestsCookieJar[]>
    >>> r.cookies.get('example_cookie_name')
    >>> test_cookies = dict(cookies_are='working')
    >>> r = requests.get(url, cookies=test_cookies)
    >>> r.text
    '... several thousand characters omitted ...'
    >>>

7.3 The cookie jar object
Cookies are returned in a RequestsCookieJar, which behaves much like a dictionary. A jar can also be passed to the cookies parameter:

    >>> jar = requests.cookies.RequestsCookieJar()
    >>> jar.set('tasty_cookie', 'yum', domain='httpbin.org', path='/cookies')
    >>> jar.set('gross_cookie', 'blech', domain='httpbin.org', path='/elsewhere')
    >>> url = 'http://httpbin.org/cookies'
    >>> r = requests.get(url, cookies=jar)
    >>> r.text
    '{"cookies": {"tasty_cookie": "yum"}}'

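The response above contains only tasty_cookie because a jar attaches a cookie only when its domain and path match the request URL. A minimal offline sketch of that behavior, preparing the request instead of sending it:

```python
import requests

jar = requests.cookies.RequestsCookieJar()
jar.set("tasty_cookie", "yum", domain="httpbin.org", path="/cookies")
jar.set("gross_cookie", "blech", domain="httpbin.org", path="/elsewhere")

# Dictionary-style access
print(jar["tasty_cookie"])
print(sorted(jar.keys()))

# Only cookies whose domain and path match the URL are attached.
prepared = requests.Request(
    "GET", "http://httpbin.org/cookies", cookies=jar
).prepare()
print(prepared.headers.get("Cookie"))
```
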
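8. Request timeouts (production code must set this parameter)

    requests.get(url, timeout=(3, 7))
    requests.get(url, timeout=10)

A tuple sets the connect timeout and the read timeout separately, in seconds; a single number applies to both. Without a timeout, a request can wait indefinitely.
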
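A timeout is only useful if the resulting exception is handled. The sketch below is one way to do that; fetch and always_slow are hypothetical names, and the get argument exists only so the example can be exercised offline with a stand-in that behaves like a slow server.

```python
import requests


def fetch(url, timeout=(3, 7), get=requests.get):
    """Return the response, or None if the server was too slow.

    `get` is injectable only so this sketch can run offline.
    """
    try:
        return get(url, timeout=timeout)
    except requests.exceptions.Timeout:
        # Covers both ConnectTimeout and ReadTimeout.
        return None


def always_slow(url, timeout):
    """Hypothetical stand-in for a server that never answers in time."""
    raise requests.exceptions.ReadTimeout(f"no data within {timeout[1]}s")


print(fetch("http://httpbin.org/delay/10", get=always_slow))
```

In real code, call fetch(url) and let it use requests.get.
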
Python requests: Advanced Usage

1. Session objects
1.1 A session keeps certain parameters across requests
A session object has all the methods of the main requests API.

    >>> import requests
    >>> s = requests.Session()
    >>> s.get("http://httpbin.org/cookies/set/sessioncookie/123456789")
    <Response [200]>
    >>> r = s.get("http://httpbin.org/cookies")
    >>> r.text
    '{\n  "cookies": {\n    "sessioncookie": "123456789"\n  }\n}\n'

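The same persistence can be observed without a server. In this sketch the cookie is set by hand, standing in for the Set-Cookie header that httpbin would return, and the next request is prepared rather than sent.

```python
import requests

with requests.Session() as s:
    # Stand-in for the Set-Cookie header a server would send back.
    s.cookies.set("sessioncookie", "123456789")

    # prepare_request() merges session state into the outgoing request
    # without sending it.
    request = requests.Request("GET", "http://httpbin.org/cookies")
    prepared = s.prepare_request(request)
    print(prepared.headers["Cookie"])
```
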
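1.2 A session can supply default data for request methods

    >>> s = requests.Session()
    >>> s.auth = ("user", "pass")
    >>> s.headers.update({"x-test": "true"})
    >>> s.get('http://httpbin.org/headers', headers={'x-test2': 'true'})
    <Response [200]>

auth is an attribute, so it is assigned rather than called; s.auth("user", "pass") raises a TypeError. In the request above, both x-test and x-test2 are sent.
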
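A quick offline check of those defaults, using placeholder credentials and preparing the request instead of sending it:

```python
import requests

s = requests.Session()
s.auth = ("user", "pass")               # placeholder credentials
s.headers.update({"x-test": "true"})

request = requests.Request(
    "GET", "http://httpbin.org/headers", headers={"x-test2": "true"}
)
prepared = s.prepare_request(request)

print(prepared.headers["x-test"], prepared.headers["x-test2"])
print(prepared.headers["Authorization"].split()[0])   # auth scheme only
s.close()
```
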
1.3 Merging parameters
Any dictionary passed to a request method is merged with the session-level values. Method-level parameters override session-level ones, and they are not persisted across requests.

    >>> s = requests.Session()
    >>> r = s.get('http://httpbin.org/cookies', cookies={'from-my': 'browser'})
    >>> r.text
    '{\n  "cookies": {\n    "from-my": "browser"\n  }\n}\n'
    >>> r = s.get('http://httpbin.org/cookies')
    >>> r.text
    '{\n  "cookies": {}\n}\n'
    >>>

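The override and the lack of persistence can both be seen offline. A sketch under the same assumptions as before (placeholder URL, requests prepared but not sent):

```python
import requests

s = requests.Session()
s.headers.update({"x-test": "session"})
url = "http://httpbin.org/headers"  # placeholder; nothing is sent

# A method-level value overrides the session-level one for this request...
first = s.prepare_request(
    requests.Request("GET", url, headers={"x-test": "method"})
)
# ...but the session default is back for the next request.
second = s.prepare_request(requests.Request("GET", url))

# Method-level cookies are not stored on the session either.
s.prepare_request(requests.Request("GET", url, cookies={"from-my": "browser"}))

print(first.headers["x-test"], second.headers["x-test"], len(s.cookies))
s.close()
```
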
1.4 Recommended style: the context manager

    with requests.Session() as s:
        s.get('http://httpbin.org/cookies/set/sessioncookie/123456789')

    with requests.Session() as s:
        resp = s.request(method="get", url=url)  # method may also be "post", etc.

    content_type = ["application/json", "application/stream", "text/explain"]

The with block closes the session on exit, even if an exception is raised.

To be continued…

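Common Python tips

 1. Make good use of libraries.
 2. Use the right data structure in the right place.
 3. Keep explicit for loops to a minimum.
 4. Avoid global variables.
 5. Prefer list comprehensions to repeated list.append calls.
 6. Use xrange() instead of range(). This applies to Python 2 only; in Python 3, range() is already lazy and xrange() no longer exists.
 7. Use generators, which compute data lazily, in chunks.
 8. Concatenate strings with join.
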
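A few of these tips in code form. This is a small illustrative sketch with made-up sample data, written for Python 3:

```python
words = ["spam", "eggs", "ham"]

# Tip 5: a list comprehension instead of a loop with list.append
lengths = [len(w) for w in words]


# Tip 7: a generator produces values lazily, one at a time
def squares(n):
    for i in range(n):      # Python 3: range() is already lazy (tip 6)
        yield i * i


total = sum(squares(1000))

# Tip 8: str.join instead of repeated "+" concatenation
sentence = ", ".join(words)

print(lengths, total, sentence)
```
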
Python notes: split() explained
1. split()
- Syntax

- str.split(sep=None, maxsplit=-1)

- Parameters

- sep: the separator. By default, any run of whitespace, including spaces, newlines (\n), and tabs (\t).
 - maxsplit: the maximum number of splits. The default, -1, means no limit.
 - Note: when sep is omitted, consecutive whitespace counts as a single separator, so no empty strings appear in the result. With an explicit separator such as " ", empty strings are kept.

- Examples

    >>> text = "Line1-abcdef \nLine2-abc Line4-abcd"
    >>> print(text.split())
    ['Line1-abcdef', 'Line2-abc', 'Line4-abcd']

    >>> text = "Line1-abcdef \nLine2-abc Line4-abcd"
    >>> (a, b, c) = text.split()
    >>> print(a)
    Line1-abcdef
    >>> print(b)
    Line2-abc
    >>> print(c)
    Line4-abcd

    >>> text = "hello boy<[www.baidu.com]>byebye"
    >>> print(text.split("[")[1].split("]")[0])
    www.baidu.com

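The difference between the default separator and an explicit one, and the effect of maxsplit, can be seen side by side. The sample strings are made up for illustration.

```python
text = "a  b \n c"

print(text.split())            # sep omitted: whitespace runs collapse
print(text.split(" "))         # explicit sep: empty strings are kept
print("k=v=w".split("=", 1))   # maxsplit=1: split only once
```
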
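2. os.path.split()
- Syntax

- os.path.split(path)

- Parameters

- If path contains a directory and a file name, the result is the pair (directory, file name).
 - If path is a directory that ends with a separator, the result is the directory and an empty file name.

- Example

    >>> import os
    >>> file_path = "E:\\desktop\\cangjie.avi"
    >>> print(os.path.split(file_path))
    ('E:\\desktop', 'cangjie.avi')

This output is from Windows, where the backslash is the path separator.
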
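os.path follows the rules of the platform it runs on, so the example above gives a different result on Linux or macOS. To try both behaviors on any machine, the standard library exposes the two implementations as ntpath and posixpath. The POSIX path below is a made-up example.

```python
import ntpath      # Windows path rules, usable on any platform
import posixpath   # POSIX path rules, usable on any platform

# File path: (directory, file name)
print(ntpath.split("E:\\desktop\\cangjie.avi"))
print(posixpath.split("/home/user/report.txt"))

# Directory path ending with a separator: the file name part is empty
print(ntpath.split("E:\\desktop\\"))
print(posixpath.split("/home/user/"))
```
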