HTTP请求的整体处理过程大体可以理解为,
然而,当为系统设置了代理的时候,整个数据流都会经过代理服务器。那么代理设置究竟是如何工作的呢?它是如何影响我们上面看到的HTTP请求的处理过程的呢?是在操作系统内核的TCP实现中的策略呢,还是HTTP stack中的机制?这里我们通过OkHttp3中的实现来一探究竟。
代理分为两种类型,一种是SOCKS代理,另一种是HTTP代理。对于SOCKS代理,在HTTP的场景下,代理服务器完成TCP数据包的转发工作。而HTTP代理服务器,在转发数据之外,还会解析HTTP的请求及响应,并根据请求及响应的内容做一些处理。这里看一下OkHttp中对代理的处理。
在Java中,通过 java.net.Proxy 类描述一个代理服务器:
public class Proxy {
/**
* Represents the proxy type.
*
* @since 1.5
*/
public enum Type {
/**
* Represents a direct connection, or the absence of a proxy.
*/
DIRECT,
/**
* Represents proxy for high level protocols such as HTTP or FTP.
*/
HTTP,
/**
* Represents a SOCKS (V4 or V5) proxy.
*/
SOCKS
};
private Type type;
private SocketAddress sa; /**
* A proxy setting that represents a {@code DIRECT} connection,
* basically telling the protocol handler not to use any proxying.
* Used, for instance, to create sockets bypassing any other global
* proxy settings (like SOCKS):
* <P>
* {@code Socket s = new Socket(Proxy.NO_PROXY);}
*
*/
public final static Proxy NO_PROXY = new Proxy();
// Creates the proxy that represents a {@code DIRECT} connection.
private Proxy() {
type = Type.DIRECT;
sa = null;
}
/**
* Creates an entry representing a PROXY connection.
* Certain combinations are illegal. For instance, for types Http, and
* Socks, a SocketAddress <b>must</b> be provided.
* <P>
* Use the {@code Proxy.NO_PROXY} constant
* for representing a direct connection.
*
* @param type the {@code Type} of the proxy
* @param sa the {@code SocketAddress} for that proxy
* @throws IllegalArgumentException when the type and the address are
* incompatible
*/
public Proxy(Type type, SocketAddress sa) {
if ((type == Type.DIRECT) || !(sa instanceof InetSocketAddress))
throw new IllegalArgumentException("type " + type + " is not compatible with address " + sa);
this.type = type;
this.sa = sa;
}
/**
* Returns the proxy type.
*
* @return a Type representing the proxy type
*/
public Type type() {
return type;
}
/**
* Returns the socket address of the proxy, or
* {@code null} if its a direct connection.
*
* @return a {@code SocketAddress} representing the socket end
* point of the proxy
*/
public SocketAddress address() {
return sa;
}
......
}
只用代理的类型及代理服务器地址即可描述代理服务器的全部。对于HTTP代理,代理服务器地址可以通过域名和IP地址等方式来描述。
在Java中通过ProxySelector为一个特定的URI选择代理:
public abstract class ProxySelector {
......
/**
* Selects all the applicable proxies based on the protocol to
* access the resource with and a destination address to access
* the resource at.
* The format of the URI is defined as follow:
* <UL>
* <LI>http URI for http connections</LI>
* <LI>https URI for https connections
* <LI>{@code socket://host:port}<br>
* for tcp client sockets connections</LI>
* </UL>
*
* @param uri
* The URI that a connection is required to
*
* @return a List of Proxies. Each element in the
* the List is of type
* {@link java.net.Proxy Proxy};
* when no proxy is available, the list will
* contain one element of type
* {@link java.net.Proxy Proxy}
* that represents a direct connection.
* @throws IllegalArgumentException if the argument is null
*/
public abstract List<Proxy> select(URI uri);
/**
* Called to indicate that a connection could not be established
* to a proxy/socks server. An implementation of this method can
* temporarily remove the proxies or reorder the sequence of
* proxies returned by {@link #select(URI)}, using the address
* and the IOException caught when trying to connect.
*
* @param uri
* The URI that the proxy at sa failed to serve.
* @param sa
* The socket address of the proxy/SOCKS server
*
* @param ioe
* The I/O exception thrown when the connect failed.
* @throws IllegalArgumentException if either argument is null
*/
public abstract void connectFailed(URI uri, SocketAddress sa, IOException ioe);
}
这个组件会读区系统中配置的所有代理,并根据调用者传入的URI,返回特定的代理服务器集合。由于不同系统中,配置代理服务器的方法,及相关配置的保存机制不同,该接口在不同的系统中有着不同的实现。
OkHttp3中抽象出Route来描述网络数据包的传输路径,最主要还是要描述直接与其建立TCP连接的目标端点。
public final class Route {
final Address address;
final Proxy proxy;
final InetSocketAddress inetSocketAddress;
public Route(Address address, Proxy proxy, InetSocketAddress inetSocketAddress) {
if (address == null) {
throw new NullPointerException("address == null");
}
if (proxy == null) {
throw new NullPointerException("proxy == null");
}
if (inetSocketAddress == null) {
throw new NullPointerException("inetSocketAddress == null");
}
this.address = address;
this.proxy = proxy;
this.inetSocketAddress = inetSocketAddress;
}
public Address address() {
return address;
}
/**
* Returns the {@link Proxy} of this route.
*
* <strong>Warning:</strong> This may disagree with {@link Address#proxy} when it is null. When
* the address's proxy is null, the proxy selector is used.
*/
public Proxy proxy() {
return proxy;
}
public InetSocketAddress socketAddress() {
return inetSocketAddress;
}
/**
* Returns true if this route tunnels HTTPS through an HTTP proxy. See <a
* href="http://www.ietf.org/rfc/rfc2817.txt">RFC 2817, Section 5.2</a>.
*/
public boolean requiresTunnel() {
return address.sslSocketFactory != null && proxy.type() == Proxy.Type.HTTP;
}
......
}
主要通过 代理服务器的信息proxy ,及 连接的目标地址 描述路由。 连接的目标地址inetSocketAddress 根据代理类型的不同而有着不同的含义,这主要是由不同代理协议的差异而造成的。对于无需代理的情况, 连接的目标地址inetSocketAddress 中包含HTTP服务器经过了DNS域名解析的IP地址及协议端口号;对于SOCKS代理,其中包含HTTP服务器的域名及协议端口号;对于HTTP代理,其中则包含代理服务器经过域名解析的IP地址及端口号。
HTTP请求处理过程中所需的TCP连接建立过程,主要是找到一个Route,然后依据代理协议的规则与特定目标建立TCP连接。对于无代理的情况,是与HTTP服务器建立TCP连接;对于SOCKS代理及HTTP代理,是与代理服务器建立TCP连接,虽然都是与代理服务器建立TCP连接,而SOCKS代理协议与HTTP代理协议做这个动作的方式又会有一定的区别。
借助于域名解析做负载均衡已经是网络中非常常见的手法了,因而,常常会有相同域名对应不同IP地址的情况。同时相同系统也可以设置多个代理,这使Route的选择变得复杂起来。
在OkHttp中,对Route连接失败有一定的错误处理机制。OkHttp会逐个尝试找到的Route建立TCP连接,直到找到可用的那一个。这同样要求,对Route信息有良好的管理。
OkHttp3借助于 RouteSelector
类管理所有的路由信息,并帮助选择路由。 RouteSelector
主要完成3件事:
收集所有可用的路由。
public final class RouteSelector {
private final Address address;
private final RouteDatabase routeDatabase;
/* The most recently attempted route. */
private Proxy lastProxy;
private InetSocketAddress lastInetSocketAddress;
/* State for negotiating the next proxy to use. */
private List<Proxy> proxies = Collections.emptyList();
private int nextProxyIndex;
/* State for negotiating the next socket address to use. */
private List<InetSocketAddress> inetSocketAddresses = Collections.emptyList();
private int nextInetSocketAddressIndex;
/* State for negotiating failed routes */
private final List<Route> postponedRoutes = new ArrayList<>();
public RouteSelector(Address address, RouteDatabase routeDatabase) {
this.address = address;
this.routeDatabase = routeDatabase;
resetNextProxy(address.url(), address.proxy());
}
......
/** Prepares the proxy servers to try. */
private void resetNextProxy(HttpUrl url, Proxy proxy) {
if (proxy != null) {
// If the user specifies a proxy, try that and only that.
proxies = Collections.singletonList(proxy);
} else {
// Try each of the ProxySelector choices until one connection succeeds. If none succeed
// then we'll try a direct connection below.
proxies = new ArrayList<>();
List<Proxy> selectedProxies = address.proxySelector().select(url.uri());
if (selectedProxies != null) proxies.addAll(selectedProxies);
// Finally try a direct connection. We only try it once!
proxies.removeAll(Collections.singleton(Proxy.NO_PROXY));
proxies.add(Proxy.NO_PROXY);
}
nextProxyIndex = 0;
}
收集路由分为两个步骤:第一步收集所有的代理;第二步则是收集特定代理服务器选择情况下的所有 连接的目标地址 。 收集代理的过程如上面的这段代码所示,有两种方式,一是外部通过address传入了代理,此时代理集合将包含这唯一的代理。address的代理最终来源于OkHttpClient,我们可以在构造OkHttpClient时设置代理,来指定由该client执行的所有请求经过特定的代理。 另一种方式是,借助于ProxySelector获取多个代理。ProxySelector最终也来源于OkHttpClient,OkHttp的用户当然也可以对此进行配置。但通常情况下,使用系统默认的ProxySelector,来获取系统中配置的代理。 收集到的所有代理保存在列表 proxies
中。 为OkHttpClient配置Proxy或ProxySelector的场景大概是,需要让连接使用代理,但不使用系统的代理配置的情况。 收集特定代理服务器选择情况下的所有路由,因代理类型的不同而有着不同的过程:
/** Returns the next proxy to try. May be PROXY.NO_PROXY but never null. */
private Proxy nextProxy() throws IOException {
if (!hasNextProxy()) {
throw new SocketException("No route to " + address.url().host()
+ "; exhausted proxy configurations: " + proxies);
}
Proxy result = proxies.get(nextProxyIndex++);
resetNextInetSocketAddress(result);
return result;
}
/** Prepares the socket addresses to attempt for the current proxy or host. */
private void resetNextInetSocketAddress(Proxy proxy) throws IOException {
// Clear the addresses. Necessary if getAllByName() below throws!
inetSocketAddresses = new ArrayList<>();
String socketHost;
int socketPort;
if (proxy.type() == Proxy.Type.DIRECT || proxy.type() == Proxy.Type.SOCKS) {
socketHost = address.url().host();
socketPort = address.url().port();
} else {
SocketAddress proxyAddress = proxy.address();
if (!(proxyAddress instanceof InetSocketAddress)) {
throw new IllegalArgumentException(
"Proxy.address() is not an " + "InetSocketAddress: " + proxyAddress.getClass());
}
InetSocketAddress proxySocketAddress = (InetSocketAddress) proxyAddress;
socketHost = getHostString(proxySocketAddress);
socketPort = proxySocketAddress.getPort();
}
if (socketPort < 1 || socketPort > 65535) {
throw new SocketException("No route to " + socketHost + ":" + socketPort
+ "; port is out of range");
}
if (proxy.type() == Proxy.Type.SOCKS) {
inetSocketAddresses.add(InetSocketAddress.createUnresolved(socketHost, socketPort));
} else {
// Try each address for best behavior in mixed IPv4/IPv6 environments.
List<InetAddress> addresses = address.dns().lookup(socketHost);
for (int i = 0, size = addresses.size(); i < size; i++) {
InetAddress inetAddress = addresses.get(i);
inetSocketAddresses.add(new InetSocketAddress(inetAddress, socketPort));
}
}
nextInetSocketAddressIndex = 0;
}
/**
* Obtain a "host" from an {@link InetSocketAddress}. This returns a string containing either an
* actual host name or a numeric IP address.
*/
// Visible for testing
static String getHostString(InetSocketAddress socketAddress) {
InetAddress address = socketAddress.getAddress();
if (address == null) {
// The InetSocketAddress was specified with a string (either a numeric IP or a host name). If
// it is a name, all IPs for that name should be tried. If it is an IP address, only that IP
// address should be tried.
return socketAddress.getHostName();
}
// The InetSocketAddress has a specific address: we should only try that address. Therefore we
// return the address and ignore any host name that may be available.
return address.getHostAddress();
}
收集一个特定代理服务器选择下的 连接的目标地址 因代理类型的不同而不同,这主要分为3种情况。 对于没有配置代理的情况,会对HTTP服务器的域名进行DNS域名解析,并为每个解析到的IP地址创建 连接的目标地址;对于SOCKS代理,直接以HTTP服务器的域名及协议端口号创建 连接的目标地址;而对于HTTP代理,则会对HTTP代理服务器的域名进行DNS域名解析,并为每个解析到的IP地址创建 连接的目标地址。 这里是OkHttp中发生DNS域名解析唯一的场合。对于使用代理的场景,没有对HTTP服务器的域名做DNS域名解析,也就意味着HTTP服务器的域名解析要由代理服务器完成。 代理服务器的收集是在创建 RouteSelector
完成的;而一个特定代理服务器选择下的 连接的目标地址 收集则是在选择Route时根据需要完成的。
/**
* Returns true if there's another route to attempt. Every address has at least one route.
*/
public boolean hasNext() {
return hasNextInetSocketAddress()
|| hasNextProxy()
|| hasNextPostponed();
}
public Route next() throws IOException {
// Compute the next route to attempt.
if (!hasNextInetSocketAddress()) {
if (!hasNextProxy()) {
if (!hasNextPostponed()) {
throw new NoSuchElementException();
}
return nextPostponed();
}
lastProxy = nextProxy();
}
lastInetSocketAddress = nextInetSocketAddress();
Route route = new Route(address, lastProxy, lastInetSocketAddress);
if (routeDatabase.shouldPostpone(route)) {
postponedRoutes.add(route);
// We will only recurse in order to skip previously failed routes. They will be tried last.
return next();
}
return route;
}
/** Returns true if there's another proxy to try. */
private boolean hasNextProxy() {
return nextProxyIndex < proxies.size();
}
/** Returns true if there's another socket address to try. */
private boolean hasNextInetSocketAddress() {
return nextInetSocketAddressIndex < inetSocketAddresses.size();
}
/** Returns the next socket address to try. */
private InetSocketAddress nextInetSocketAddress() throws IOException {
if (!hasNextInetSocketAddress()) {
throw new SocketException("No route to " + address.url().host()
+ "; exhausted inet socket addresses: " + inetSocketAddresses);
}
return inetSocketAddresses.get(nextInetSocketAddressIndex++);
}
/** Returns true if there is another postponed route to try. */
private boolean hasNextPostponed() {
return !postponedRoutes.isEmpty();
}
/** Returns the next postponed route to try. */
private Route nextPostponed() {
return postponedRoutes.remove(0);
}
RouteSelector
实现了两级迭代器来提供选择路由的服务。
RouteSelector
借助于RouteDatabase
维护失败的路由的信息。/**
* Clients should invoke this method when they encounter a connectivity failure on a connection
* returned by this route selector.
*/
public void connectFailed(Route failedRoute, IOException failure) {
if (failedRoute.proxy().type() != Proxy.Type.DIRECT && address.proxySelector() != null) {
// Tell the proxy selector when we fail to connect on a fresh connection.
address.proxySelector().connectFailed(
address.url().uri(), failedRoute.proxy().address(), failure);
}
routeDatabase.failed(failedRoute);
}
RouteDatabase
是一个简单的容器:
public final class RouteDatabase {
private final Set<Route> failedRoutes = new LinkedHashSet<>();
/** Records a failure connecting to {@code failedRoute}. */
public synchronized void failed(Route failedRoute) {
failedRoutes.add(failedRoute);
}
/** Records success connecting to {@code failedRoute}. */
public synchronized void connected(Route route) {
failedRoutes.remove(route);
}
/** Returns true if {@code route} has failed recently and should be avoided. */
public synchronized boolean shouldPostpone(Route route) {
return failedRoutes.contains(route);
}
}
相关阅读:
网易云新用户大礼包:https://www.163yun.com/gift
本文来自网易实践者社区,经作者韩鹏飞授权发布。