When making network requests, you sometimes need to originate them from different IP addresses. This article takes a technical-implementation view of several common ways to switch the egress IP in a Python program, including proxy IP rotation and multi-NIC source binding, with complete runnable code.

Runtime Environment

  Item               Version
  Operating system   Ubuntu 22.04 / macOS Ventura
  Python             3.8+
  Dependencies       requests, aiohttp (itertools ships with the standard library)

Method 1: Proxy IP Rotation

This is the most common approach: maintain a list of proxy IPs and rotate through them on each request.
Basic implementation

import requests
from itertools import cycle

# Proxy IP list (example format)
PROXY_LIST = [
    'http://user:pass@proxy1.example.com:8080',
    'http://user:pass@proxy2.example.com:8080',
    'http://user:pass@proxy3.example.com:8080',
]

def fetch_with_rotation(url: str, max_attempts: int = 3):
    """Send a request, rotating through the proxy list on failure."""
    proxy_cycle = cycle(PROXY_LIST)

    for attempt in range(max_attempts):
        proxy = next(proxy_cycle)
        proxies = {'http': proxy, 'https': proxy}

        try:
            response = requests.get(url, proxies=proxies, timeout=10)
            response.raise_for_status()
            print(f"Proxy used: {proxy} - status code: {response.status_code}")
            return response.text
        except requests.exceptions.RequestException as e:
            print(f"Proxy {proxy} failed: {e}")
            continue

    raise RuntimeError(f"All {max_attempts} proxy attempts failed")

if __name__ == '__main__':
    result = fetch_with_rotation('https://httpbin.org/ip')
    print(f"Response: {result}")

Expected output

Proxy used: http://user:pass@proxy1.example.com:8080 - status code: 200
Response: {"origin": "203.0.113.45"}

Method 2: Random Proxy Selection

Pick a proxy at random from the pool; suitable when strict round-robin order is not required.

import random
import requests

def fetch_with_random_proxy(url: str):
    """Pick a proxy IP at random."""
    proxy = random.choice(PROXY_LIST)
    proxies = {'http': proxy, 'https': proxy}

    response = requests.get(url, proxies=proxies, timeout=10)
    return response.text

# Usage example
for i in range(5):
    result = fetch_with_random_proxy('https://httpbin.org/ip')
    print(f"Request {i+1}: {result}")
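Plain random.choice treats every proxy equally. If some proxies are known to be faster or more reliable, a weighted variant can bias selection toward them. A minimal sketch using the standard library's random.choices; the weight values here are hypothetical examples, not measured data:

```python
import random

# Hypothetical per-proxy weights: higher = picked more often
WEIGHTED_PROXIES = {
    'http://user:pass@proxy1.example.com:8080': 5,
    'http://user:pass@proxy2.example.com:8080': 3,
    'http://user:pass@proxy3.example.com:8080': 1,
}

def pick_weighted_proxy(weighted: dict) -> str:
    """Pick one proxy, with probability proportional to its weight."""
    proxies = list(weighted)
    weights = list(weighted.values())
    return random.choices(proxies, weights=weights, k=1)[0]

print(pick_weighted_proxy(WEIGHTED_PROXIES))
```

In practice the weights could be derived from observed latency or success rate per proxy.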

Method 3: Failure-Driven Dynamic Switching

In practice, some proxies will go dead. The code below switches automatically on failure:

import requests
from threading import Lock

class ProxyManager:
    """Proxy manager with automatic switching on failure."""

    def __init__(self, proxy_list: list):
        self.proxy_list = proxy_list.copy()
        self.current_index = 0
        self.lock = Lock()
        self.failure_count = {p: 0 for p in proxy_list}
        self.max_failures = 3

    def get_current_proxy(self):
        with self.lock:
            if not self.proxy_list:
                raise RuntimeError("No proxies left in the pool")
            return self.proxy_list[self.current_index]

    def switch_to_next(self):
        with self.lock:
            if not self.proxy_list:
                raise RuntimeError("No proxies left in the pool")
            self.current_index = (self.current_index + 1) % len(self.proxy_list)
            return self.proxy_list[self.current_index]

    def report_failure(self, proxy: str):
        """Record a failure; drop the proxy once it hits the limit."""
        removed = False
        with self.lock:
            self.failure_count[proxy] += 1
            if self.failure_count[proxy] >= self.max_failures:
                print(f"Proxy {proxy} hit the failure limit and will be removed")
                if proxy in self.proxy_list:
                    self.proxy_list.remove(proxy)
                    removed = True
                    # Keep the index valid after shrinking the list
                    if self.proxy_list:
                        self.current_index %= len(self.proxy_list)
        if not removed:
            self.switch_to_next()

    def fetch(self, url: str, max_switches: int = 5):
        """Send a request via the manager."""
        for _ in range(max_switches):
            proxy = self.get_current_proxy()
            proxies = {'http': proxy, 'https': proxy}

            try:
                response = requests.get(url, proxies=proxies, timeout=10)
                response.raise_for_status()
                return response.text
            except requests.exceptions.RequestException as e:
                print(f"Request via proxy {proxy} failed: {e}")
                self.report_failure(proxy)

        raise RuntimeError("All proxies failed")

# Usage example
manager = ProxyManager(PROXY_LIST)
result = manager.fetch('https://httpbin.org/ip')
print(f"Final result: {result}")
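Switching to the next proxy immediately after a failure can hammer a flaky pool. A common refinement is to pause with exponential backoff between failed attempts; a minimal sketch (not part of ProxyManager above, and the base/cap values are arbitrary examples):

```python
def backoff_delay(attempt: int, base: float = 0.5, cap: float = 8.0) -> float:
    """Exponential backoff: base * 2**attempt, capped at `cap` seconds."""
    return min(base * (2 ** attempt), cap)

# Delays for the first few failed attempts
for attempt in range(5):
    print(f"attempt {attempt}: wait {backoff_delay(attempt)}s")
    # In real code: time.sleep(backoff_delay(attempt)) before retrying
```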

Method 4: Proxy Rotation in an Async Environment

For high-concurrency scenarios, use aiohttp with async rotation:

import asyncio
import aiohttp
from itertools import cycle

async def fetch_async(session: aiohttp.ClientSession, url: str, proxy: str):
    """Async request helper."""
    try:
        timeout = aiohttp.ClientTimeout(total=10)
        async with session.get(url, proxy=proxy, timeout=timeout) as resp:
            return await resp.text()
    except Exception as e:
        return f"Error: {e}"

async def batch_fetch_with_proxy_rotation(urls: list, proxy_list: list):
    """Batch async requests, rotating proxies automatically."""
    proxy_cycle = cycle(proxy_list)
    connector = aiohttp.TCPConnector(limit=10)

    async with aiohttp.ClientSession(connector=connector) as session:
        tasks = []
        for url in urls:
            proxy = next(proxy_cycle)
            tasks.append(fetch_async(session, url, proxy))

        results = await asyncio.gather(*tasks)
        return results

# Usage example
async def main():
    urls = ['https://httpbin.org/ip'] * 10
    results = await batch_fetch_with_proxy_rotation(urls, PROXY_LIST)
    for i, res in enumerate(results):
        print(f"Request {i+1}: {res[:100]}")

if __name__ == '__main__':
    asyncio.run(main())
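TCPConnector(limit=10) caps open connections per session, but you may also want to cap in-flight requests explicitly. A sketch using asyncio.Semaphore, with a simulated delay standing in for the real proxied request:

```python
import asyncio

async def limited_fetch(sem: asyncio.Semaphore, i: int):
    """Acquire the semaphore so at most `limit` requests run at once."""
    async with sem:
        await asyncio.sleep(0.01)  # stand-in for a real proxied request
        return f"done {i}"

async def run_batch(n: int, limit: int = 3):
    sem = asyncio.Semaphore(limit)
    return await asyncio.gather(*(limited_fetch(sem, i) for i in range(n)))

results = asyncio.run(run_batch(10))
print(results)
```

asyncio.gather preserves input order, so results line up with the request indices even though completion order varies.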

Method 5: Binding the Source IP on Multi-NIC Hosts

On a server with multiple physical NICs or multiple IP addresses, you can switch IPs by binding the source address:

import requests
from requests.adapters import HTTPAdapter

class SourceIPAdapter(HTTPAdapter):
    """Custom adapter that binds the source IP address."""

    def __init__(self, source_ip: str, **kwargs):
        self.source_ip = source_ip
        super().__init__(**kwargs)

    def init_poolmanager(self, *args, **kwargs):
        # urllib3 passes source_address down to the underlying socket
        kwargs['source_address'] = (self.source_ip, 0)
        super().init_poolmanager(*args, **kwargs)

def fetch_with_source_ip(url: str, source_ip: str):
    """Send a request from the given source IP."""
    session = requests.Session()
    adapter = SourceIPAdapter(source_ip)
    session.mount('http://', adapter)
    session.mount('https://', adapter)

    response = session.get(url, timeout=10)
    return response.text

# Usage example (assumes the server has multiple IPs bound)
# result = fetch_with_source_ip('https://httpbin.org/ip', '192.168.1.100')
# print(result)
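To find out which address the host uses by default (and thus which others you might bind instead), one standard-library trick is to connect a UDP socket to a routable address and read back the chosen local address. A sketch; connecting a UDP socket sends no packets, and 198.51.100.1 is a reserved TEST-NET-2 documentation address used here only as a routing target:

```python
import socket

def primary_source_ip() -> str:
    """Discover the default egress IPv4 address via a UDP socket."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        s.connect(('198.51.100.1', 80))  # no packets are sent by connect()
        return s.getsockname()[0]
    except OSError:
        return '127.0.0.1'  # no default route available
    finally:
        s.close()

print(primary_source_ip())
```

For the full list of addresses on every interface, system tools such as `ip addr` remain the authoritative source.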

Method Comparison Summary

  Method                     Best for                          Notes
  1. Round-robin proxies     General-purpose rotation          Simple, deterministic order
  2. Random proxy            No strict ordering needed         One-line selection
  3. Failure-driven manager  Pools with unreliable proxies     Removes dead proxies automatically
  4. Async rotation          High-concurrency workloads        Requires aiohttp
  5. Source IP binding       Multi-NIC / multi-IP servers      No third-party proxies needed

Notes

  1. Proxy sources: the proxy addresses in this article are placeholders; replace them with real proxy service endpoints before use.
  2. Request rate: throttle your requests sensibly to avoid putting excessive load on the target server.
  3. Error handling: in production, add more thorough exception handling and logging.
  4. Compliance: make sure your use of proxy IPs complies with the target site's terms of service and applicable laws and regulations.
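For note 2, a minimal pacing helper keeps requests at or below a fixed rate. A sketch; the 0.2-second interval is an arbitrary example:

```python
import time

class RateLimiter:
    """Ensure at least `min_interval` seconds between successive calls."""

    def __init__(self, min_interval: float):
        self.min_interval = min_interval
        self._last = 0.0

    def wait(self):
        now = time.monotonic()
        elapsed = now - self._last
        if elapsed < self.min_interval:
            time.sleep(self.min_interval - elapsed)
        self._last = time.monotonic()

# Example: at most ~5 requests per second
limiter = RateLimiter(0.2)
start = time.monotonic()
for _ in range(3):
    limiter.wait()
    # requests.get(...) would go here
elapsed_total = time.monotonic() - start
print(f"elapsed: {elapsed_total:.2f}s")
```

The first call passes through immediately; each subsequent call sleeps just long enough to honor the interval.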
