NGINX Anti-Crawler Strategy


1) Configure agent_deny.conf

Create a new configuration file named agent_deny.conf in nginx's conf directory.

# Block search-engine crawlers (case-insensitive match on the User-Agent header)
if ($http_user_agent ~* "qihoobot|Baiduspider|Googlebot|Googlebot-Mobile|Googlebot-Image|Mediapartners-Google|Adsbot-Google|Feedfetcher-Google|Yahoo! Slurp|Yahoo! Slurp China|YoudaoBot|Sosospider|Sogou spider|Sogou web spider|MSNBot|ia_archiver|Tomato Bot|Catall Spider|AcoiRobot|Yisou|bingbot|360Spider") {
    return 403;
}

# Block assorted download tools, libraries, and other bots (case-sensitive match);
# the trailing ^$ also rejects requests with an empty User-Agent
if ($http_user_agent ~ "WinHttp|WebZIP|FetchURL|node-superagent|FeedDemon|Jullo|JikeSpider|Indy Library|Alexa Toolbar|AskTbFXTV|AhrefsBot|CrawlDaddy|Feedly|UniversalFeedParser|ApacheBench|Microsoft URL Control|Swiftbot|ZmEu|oBot|jaunty|Python-urllib|lightDeckReports Bot|YYSpider|DigExt|MJ12bot|heritrix|EasouSpider|Ezooms|BOT/0.1|YandexBot|FlightDeckReports|Linguee Bot|iaskspider|^$") {
    return 403;
}

# Allow only GET and POST; every other method, including HEAD, gets 403
if ($request_method !~ ^(GET|POST)$) {
    return 403;
}

# Block common scripted clients (case-insensitive match)
if ($http_user_agent ~* (Python|Wget|Scrapy|Spider)) {
    return 403;
}
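
The same User-Agent checks could also be expressed with a map that computes a flag, so that a single if tests it. This is an alternative sketch, not the approach this post uses. The variable name $deny_agent is hypothetical, only a handful of patterns from the lists above are shown, and the map block has to sit in the http context rather than in agent_deny.conf.

```nginx
# In the http block (map is not allowed inside server or location).
# $deny_agent is a hypothetical variable name chosen for this sketch.
map $http_user_agent $deny_agent {
    default                              0;
    ""                                   1;  # empty User-Agent
    "~*(Python|Wget|Scrapy|Spider)"      1;  # case-insensitive regex
    "~(AhrefsBot|MJ12bot|ApacheBench)"   1;  # case-sensitive regex
}

# In agent_deny.conf, in place of the individual User-Agent checks:
if ($deny_agent) {
    return 403;
}
```

An if with a bare variable is false when the value is empty or "0", so only requests matched by one of the map entries are rejected.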

2) Modify the reverse proxy configuration file

Add the line include agent_deny.conf; to the relevant location block, for example:

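# Server
server {
    listen 80;
    server_name localhost;
    charset utf-8;

    location / {
        # limit_req may be set at the http, server, or location level; with nodelay,
        # requests within the burst are served immediately rather than queued.
        # The ratelimit zone must be declared with limit_req_zone in the http block.
        limit_req zone=ratelimit burst=20 nodelay;
        include agent_deny.conf;
        proxy_pass http://127.0.0.1:8080/java-web;
    }
}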
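
The limit_req directive in the server block above refers to a zone named ratelimit, which nginx expects to be declared with limit_req_zone in the http context. The post does not show that declaration. The following is a minimal sketch of a complete nginx.conf; the key ($binary_remote_addr), the 10m zone size, and the 10r/s rate are assumed values chosen for illustration only.

```nginx
events {}

http {
    # Assumed values: track clients by IP address, 10 MB of shared memory,
    # an average of 10 requests per second per client.
    limit_req_zone $binary_remote_addr zone=ratelimit:10m rate=10r/s;

    server {
        listen 80;
        server_name localhost;
        charset utf-8;

        location / {
            limit_req zone=ratelimit burst=20 nodelay;
            include agent_deny.conf;
            proxy_pass http://127.0.0.1:8080/java-web;
        }
    }
}
```

Running nginx -t after editing checks the syntax before a reload, including that agent_deny.conf is found relative to the configuration directory.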