Make Python requests work via a SOCKS proxy on a CentOS server

It is important for a crawler script to be able to get past the firewall :)

Installing Shadowsocks on CentOS

System environment

$ cat /etc/issue
CentOS release 6.6 (Final)
Kernel \r on an \m
$ python --version
Python 3.5.2
$ pip --version
pip 8.1.2 from /usr/local/lib/python3.5/site-packages (python 3.5)

Install Shadowsocks via pip

$ pip install shadowsocks
Collecting shadowsocks
Using cached shadowsocks-2.8.2.tar.gz
Building wheels for collected packages: shadowsocks
Running setup.py bdist_wheel for shadowsocks ... done
Stored in directory: /root/.cache/pip/wheels/c9/d8/ff/5425932823af361970658e9421b4d53ac50b08dcbe6fd41e5f
Successfully built shadowsocks
Installing collected packages: shadowsocks
Successfully installed shadowsocks-2.8.2

Create the configuration file /etc/shadowsocks.json

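# JSON does not allow comments, so the fields are described here:
#   server         IP of the Shadowsocks server
#   server_port    server port (a number, without quotes)
#   local_address  local IP
#   local_port     local port
#   password       password for connecting to the Shadowsocks server
#   timeout        timeout in seconds
#   method         encryption method
#   fast_open      true or false. If the server's Linux kernel is 3.7+, fast_open can be enabled to reduce latency:
#                  run `echo 3 > /proc/sys/net/ipv4/tcp_fastopen`, then set fast_open to true
#   workers        number of workers
$ cat <<EOF > /etc/shadowsocks.json
{
    "server": "your_server_ip",
    "server_port": your_server_port,
    "local_address": "127.0.0.1",
    "local_port": 1080,
    "password": "your_server_passwd",
    "timeout": 300,
    "method": "aes-256-cfb",
    "fast_open": false,
    "workers": 1
}
EOF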
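
Because the configuration file has to be strict JSON, one way to avoid syntax slips is to generate it from Python. The following is only a minimal sketch: the helper name `write_ss_config` and the example values passed to it are assumptions for illustration, not part of Shadowsocks itself. The keys and defaults mirror the file above.

```python
import json


def write_ss_config(path, server, server_port, password,
                    local_port=1080, method="aes-256-cfb"):
    """Write a Shadowsocks client config as strict JSON (hypothetical helper)."""
    config = {
        "server": server,
        "server_port": int(server_port),
        "local_address": "127.0.0.1",
        "local_port": int(local_port),
        "password": password,
        "timeout": 300,
        "method": method,
        "fast_open": False,
        "workers": 1,
    }
    with open(path, "w") as f:
        json.dump(config, f, indent=4)
    # Read the file back to confirm that it parses as JSON
    with open(path) as f:
        return json.load(f)
```

For example, `write_ss_config("/etc/shadowsocks.json", "203.0.113.10", 8388, "your_server_passwd")` would write the file, where the IP and port are placeholder values.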

Start the Shadowsocks client

$ nohup sslocal -c /etc/shadowsocks.json > /dev/null 2>&1 &
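
Since the client runs in the background with its output discarded, it can be useful to check that something is actually listening on the local port. Here is a small sketch using only the standard library; the function name `port_is_open` is made up for this example, and the default address and port are taken from the configuration file above.

```python
import socket


def port_is_open(host="127.0.0.1", port=1080, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable
        return False


if __name__ == "__main__":
    print("sslocal listening:", port_is_open())
```

This only shows that the port accepts TCP connections. It does not prove that the process behind it is sslocal or that the remote server is reachable.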

Start the Shadowsocks client automatically at boot

$ echo "nohup sslocal -c /etc/shadowsocks.json > /dev/null 2>&1 &" >> /etc/rc.local

Test the proxy

$ curl --socks5 127.0.0.1:1080 http://httpbin.org/ip
{
"origin": "xx.xx.xx.xx" # If this is the IP of your Shadowsocks server, the proxy works.
}

Using a SOCKS proxy with requests

Install requests and PySocks

$ pip install 'requests[socks]'

Usage

import requests

# Route both HTTP and HTTPS traffic through the SOCKS5 proxy
proxies = {
    'http': 'socks5://user:pass@host:port',
    'https': 'socks5://user:pass@host:port'
}
requests.get('https://httpbin.org/ip', proxies=proxies)
### Test the proxy from the Python interactive shell
>>> import requests
>>> proxies = {
...     'http': 'socks5://127.0.0.1:1080',
...     'https': 'socks5://127.0.0.1:1080'
... }
>>> requests.get('https://httpbin.org/ip', proxies=proxies)
<Response [200]>  # HTTP status code 200 means the proxy works
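
For a crawler it may be convenient to build the proxies dictionary in one place. The sketch below is one possible way to do that. The helpers `build_proxies` and `fetch_origin_ip` are hypothetical names introduced here. The `socks5h` scheme is a requests feature that lets the proxy resolve hostnames instead of the local machine, which is usually what you want when local DNS is unreliable; using it by default is a choice made for this example, not something the steps above require.

```python
def build_proxies(host="127.0.0.1", port=1080, user=None, password=None,
                  remote_dns=True):
    """Build a requests-style proxies dict for a SOCKS5 proxy (hypothetical helper)."""
    # socks5h: hostnames are resolved by the proxy; socks5: resolved locally
    scheme = "socks5h" if remote_dns else "socks5"
    auth = ""
    if user is not None:
        auth = "{}:{}@".format(user, password or "")
    url = "{}://{}{}:{}".format(scheme, auth, host, port)
    return {"http": url, "https": url}


def fetch_origin_ip(proxies, timeout=10):
    """Return the IP that httpbin.org sees for requests sent through the proxy."""
    import requests  # requires: pip install 'requests[socks]'

    resp = requests.get("https://httpbin.org/ip", proxies=proxies, timeout=timeout)
    resp.raise_for_status()
    return resp.json()["origin"]
```

With sslocal running, `fetch_origin_ip(build_proxies())` should return the IP of the Shadowsocks server, matching the curl check earlier.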

Reference

http://www.kickshaw.me/2/
http://overtrue.me/articles/2016/03/shadowsocks-on-server.html