html – Python follow redirects and then download the page?
You might be better off with the Requests library, which has a better API for controlling redirect handling:
https://requests.readthedocs.io/en/master/user/quickstart/#redirection-and-history
Requests:
https://pypi.org/project/requests/ (urllib replacement for humans)
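As a rough sketch of what that redirect API looks like in practice: each intermediate response ends up in r.history, and the last one is the response object itself. The helper name redirect_chain and the optional session parameter are my own, added only so the function can be exercised without a network; the timeout value is an arbitrary choice.

```python
import requests


def redirect_chain(url, session=None, timeout=10):
    """Follow redirects for `url` and report each hop.

    Returns (final_url, hops), where hops is a list of
    (status_code, url) tuples, one per redirect response.
    `session` is a hypothetical hook: anything with a requests-style
    .get() method; by default the requests module itself is used.
    """
    http = requests if session is None else session
    r = http.get(url, allow_redirects=True, timeout=timeout)
    # r.history holds the intermediate redirect responses, oldest first.
    hops = [(resp.status_code, resp.url) for resp in r.history]
    return r.url, hops
```

Passing allow_redirects=False to the same call would instead return the first response untouched, which is handy if you want to inspect a redirect rather than follow it.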
Use requests, as the other answer states; here is an example. The final URL after any redirects is in r.url. In the example below, http is redirected to https.
For HEAD:
In [1]: import requests
...: r = requests.head("http://github.com", allow_redirects=True)
...: r.url
Out[1]: 'https://github.com/'
For GET:
In [1]: import requests
...: r = requests.get("http://github.com")
...: r.url
Out[1]: 'https://github.com/'
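Note that for HEAD you have to pass allow_redirects=True explicitly, because requests.head does not follow redirects by default. If you don't, you can still read the redirect target from the Location header, but this is not advised.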
In [1]: import requests
In [2]: r = requests.head("http://github.com")
In [3]: r.headers.get("location")
Out[3]: 'https://github.com/'
To download the page you need GET; you can then access the page body using r.content (raw bytes) or r.text (decoded text).
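Putting the pieces together, a minimal sketch of "follow redirects, then download" might look like the following. The function name download_page, the optional session parameter (there only so the function can be tested without network access), and the 10-second timeout are my own assumptions, not part of Requests.

```python
import requests


def download_page(url, session=None, timeout=10):
    """GET `url`, following redirects, and return (final_url, body_bytes).

    `session` is a hypothetical hook: anything with a requests-style
    .get() method; by default the requests module itself is used.
    """
    http = requests if session is None else session
    # GET follows redirects by default, so allow_redirects is not needed.
    r = http.get(url, timeout=timeout)
    # Raise an exception on 4xx/5xx instead of returning an error page.
    r.raise_for_status()
    return r.url, r.content
```

To save the result, write the returned bytes to a file opened in binary mode, e.g. open("page.html", "wb").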