Using ProxySelector to take control of what proxies to use and when

While helping some developers integrate code I had written on top of HttpClient, I ran into this problem: their existing code had to work through a SOCKS proxy, while the functionality my code provided would not work from behind a proxy. The JVM had been instructed to use SOCKS via the socksProxyHost system property.
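
To make the starting point concrete, here is a minimal sketch of how that JVM-wide setting shows up to the default selector. The host name socks.example.com is a placeholder I made up, and setting the property in code stands in for passing -DsocksProxyHost=... on the command line.

```java
import java.net.Proxy;
import java.net.ProxySelector;
import java.net.URI;
import java.util.List;

public class SocksPropertyDemo {

    // Asks the JVM's default selector which proxies it would use
    // for a plain socket connection to the given host and port.
    static List<Proxy> proxiesFor(String host, int port) {
        URI uri = URI.create("socket://" + host + ":" + port);
        return ProxySelector.getDefault().select(uri);
    }

    public static void main(String[] args) {
        // Placeholder host (an assumption for illustration);
        // same effect as -DsocksProxyHost=socks.example.com
        System.setProperty("socksProxyHost", "socks.example.com");
        System.out.println(proxiesFor("example.org", 80));
    }
}
```

With the property set, every socket the JVM opens is routed through that proxy unless something overrides the decision, which is exactly the situation described above.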

After reading this excellent guide on proxies in Java, I decided to try using the ProxySelector mechanism. The results were awesome!!

Using ProxySelector gives us the flexibility to :
  1. decide whether a proxy should be used or not for the URI being connected to. You can choose not to use a proxy at all.
  2. specify what proxies (yes multiple!) to use - including varying protocols in each proxy
  3. manage failures when connecting to proxy servers
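
To illustrate points 2 and 3, here is a rough sketch (not the code I ended up using) of a selector that offers two proxies of different protocols and stops offering one after a connection to it fails. The proxy host names, the ports, and the ".internal.example.com" suffix are all made up for the example.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.ProxySelector;
import java.net.SocketAddress;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class FailoverProxySelector extends ProxySelector {

    // Hypothetical proxies with different protocols, in order of preference.
    private final List<Proxy> candidates = List.of(
            new Proxy(Proxy.Type.HTTP,
                    InetSocketAddress.createUnresolved("http-proxy.example.com", 8080)),
            new Proxy(Proxy.Type.SOCKS,
                    InetSocketAddress.createUnresolved("socks.example.com", 1080)));

    // Addresses of proxies that have been reported as unreachable.
    private final Set<SocketAddress> failed = ConcurrentHashMap.newKeySet();

    @Override
    public List<Proxy> select(URI uri) {
        if (uri == null) {
            throw new IllegalArgumentException("uri must not be null");
        }
        // Point 1: bypass the proxy for selected hosts (made-up suffix).
        String host = uri.getHost();
        if (host != null && host.endsWith(".internal.example.com")) {
            return List.of(Proxy.NO_PROXY);
        }
        // Point 2: offer every proxy that has not been reported as failed.
        List<Proxy> result = new ArrayList<>();
        for (Proxy proxy : candidates) {
            if (!failed.contains(proxy.address())) {
                result.add(proxy);
            }
        }
        // If every proxy has failed, fall back to a direct connection.
        if (result.isEmpty()) {
            result.add(Proxy.NO_PROXY);
        }
        return result;
    }

    // Point 3: remember the proxy address that could not be reached.
    @Override
    public void connectFailed(URI uri, SocketAddress sa, IOException ioe) {
        if (sa != null) {
            failed.add(sa);
        }
    }
}
```

A real implementation would probably forget failures after a while so that a recovered proxy gets used again; I left that out to keep the sketch short.
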
So here is what I did :
  1. Wrote a new class that extends Java's java.net.ProxySelector
  2. The select method of this class (an override of the abstract method in ProxySelector) is called each time Java tries to make a network connection, to ask which proxy should be used for that connection. The URI being connected to is passed to the method. I checked the attributes of this URI and, if it matched the hostname that my code connects to (which did not work via a proxy), I returned java.net.Proxy.NO_PROXY to signify that no proxy should be used for this URI.
  3. For all other URIs, I did not want to interfere with the user's settings, so I delegated the proxy decision to the default ProxySelector that ships with Java
Let me put out the code I used to experiment with this, and everything should be crystal clear :

/**
 *  Created by : Madhur Tanwani
 *  Created on : May 28, 2010
 */
package edu.madhurtanwani.net;

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.Proxy;
import java.net.ProxySelector;
import java.net.SocketAddress;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import org.apache.commons.httpclient.HttpClient;
import org.apache.commons.httpclient.methods.GetMethod;

/**
 *
 * @author Madhur Tanwani (madhurt@yahoo-inc.com)
 */
class CustomProxySelector extends ProxySelector {

    private final ProxySelector def;

    CustomProxySelector(ProxySelector aDefault) {
        this.def = aDefault;
    }

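    @Override
    public List<Proxy> select(URI uri) {
        if (uri == null) {
            // The ProxySelector contract requires this for a null argument
            throw new IllegalArgumentException("URI must not be null");
        }
        System.out.println("select for URL : " + uri);

        // URLConnection asks with an http:// URI, while plain sockets (as used by HttpClient) ask with socket://
        if ("http".equalsIgnoreCase(uri.getScheme()) || "socket".equalsIgnoreCase(uri.getScheme())) {

            // getHost() can return null for some URIs, so guard before matching
            String host = uri.getHost();
            if (host != null && host.startsWith("mail")) {
                List<Proxy> proxyList = new ArrayList<Proxy>();
                proxyList.add(Proxy.NO_PROXY);
                System.out.println("NO PROXY TO BE USED");
                return proxyList;
            }
        }

        // A specific proxy could be returned here instead (this would also need an import of java.net.InetSocketAddress) :
        //Proxy proxy = new Proxy(Proxy.Type.SOCKS, new InetSocketAddress("socks.corp.yahoo.com", 1080));

        // For everything else, leave the decision to the default selector
        List<Proxy> select = def.select(uri);
        System.out.println("Default proxy list : " + select);
        return select;
    }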

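    @Override
    public void connectFailed(URI uri, SocketAddress sa, IOException ioe) {
        // Called when a connection to a proxy returned by select() fails.
        // Throwing here would turn an ordinary proxy failure into an
        // UnsupportedOperationException, so hand it to the default selector instead.
        def.connectFailed(uri, sa, ioe);
    }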
}

/**
 *
 * @author Madhur Tanwani (madhurt@yahoo-inc.com)
 */
public class Socks_Public {

    private static final String URL_BEHIND_SOCKS = "http://yahoo.com";
    private static final String URL_NO_SOCKS = "http://mail.yahoo.com";

    public static void main(String[] args) throws Exception {
        ProxySelector.setDefault(new CustomProxySelector(ProxySelector.getDefault()));

        System.out.println("\n\n++++++++++++++++++++USING HTTP CLIENT++++++++++++++++++++");

        HttpClient client = new HttpClient();

        System.out.println("\nURL : " + URL_NO_SOCKS);
        GetMethod get = new GetMethod(URL_NO_SOCKS);
        int response = client.executeMethod(get);
        System.out.println("Response code : " + response + " , Response : " + get.getResponseBodyAsString().substring(0, 50));

        System.out.println("\nURL : " + URL_BEHIND_SOCKS);
        get = new GetMethod(URL_BEHIND_SOCKS);
        response = client.executeMethod(get);
        System.out.println("Response code : " + response + " , Response : " + get.getResponseBodyAsString().substring(0, 50));


        System.out.println("\n\n++++++++++++++++++++USING JAVA URL CONNECTION++++++++++++++++++++");

        System.out.println("\nURL : " + URL_NO_SOCKS);
        URI uri = new URI(URL_NO_SOCKS);
        InputStream is = uri.toURL().openStream();
        BufferedReader rdr = new BufferedReader(new InputStreamReader(is));
        for (int i = 0; i < 2; i++) {
            System.out.println(rdr.readLine());
        }
        is.close();


        System.out.println("\nURL : " + URL_BEHIND_SOCKS);
        uri = new URI(URL_BEHIND_SOCKS);
        is = uri.toURL().openStream();
        rdr = new BufferedReader(new InputStreamReader(is));
        for (int i = 0; i < 2; i++) {
            System.out.println(rdr.readLine());
        }
        is.close();
    }
}

And here is the output of the code run :
++++++++++++++++++++USING HTTP CLIENT++++++++++++++++++++
URL : http://mail.yahoo.com
select for URL : socket://mail.yahoo.com:80
NO PROXY TO BE USED
select for URL : socket://login.yahoo.com:443
Default proxy list : [SOCKS @ socks.corp.yahoo.com:1080]
Response code : 200 , Response : 
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01

URL : http://yahoo.com
select for URL : socket://yahoo.com:80
Default proxy list : [SOCKS @ socks.corp.yahoo.com:1080]
select for URL : socket://www.yahoo.com:80
Default proxy list : [SOCKS @ socks.corp.yahoo.com:1080]
Response code : 200 , Response : <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN"


++++++++++++++++++++USING JAVA URL CONNECTION++++++++++++++++++++

URL : http://mail.yahoo.com
select for URL : http://mail.yahoo.com/
NO PROXY TO BE USED
<!-- l03.member.in2.yahoo.com uncompressed/chunked Fri May 28 20:32:24 IST 2010 -->
null

URL : http://yahoo.com
select for URL : http://yahoo.com/
Default proxy list : [SOCKS @ socks.corp.yahoo.com:1080]
select for URL : http://www.yahoo.com/
Default proxy list : [SOCKS @ socks.corp.yahoo.com:1080]
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN"
   "http://www.w3.org/TR/html4/strict.dtd">

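The same decisions can also be checked without opening any connection, by calling select directly. This is only a sketch: BypassSelector is a condensed stand-in I wrote for the CustomProxySelector above (same "mail" prefix rule, without the logging), and what the last line prints depends on the proxy settings of the JVM it runs in.

```java
import java.io.IOException;
import java.net.Proxy;
import java.net.ProxySelector;
import java.net.SocketAddress;
import java.net.URI;
import java.util.List;

public class SelectorCheck {

    // Condensed stand-in for CustomProxySelector: bypass the proxy for
    // hosts starting with "mail", delegate everything else.
    static class BypassSelector extends ProxySelector {
        private final ProxySelector delegate;

        BypassSelector(ProxySelector delegate) {
            this.delegate = delegate;
        }

        @Override
        public List<Proxy> select(URI uri) {
            String scheme = uri.getScheme();
            String host = uri.getHost();
            boolean handled = "http".equalsIgnoreCase(scheme) || "socket".equalsIgnoreCase(scheme);
            if (handled && host != null && host.startsWith("mail")) {
                return List.of(Proxy.NO_PROXY);
            }
            return delegate.select(uri);
        }

        @Override
        public void connectFailed(URI uri, SocketAddress sa, IOException ioe) {
            delegate.connectFailed(uri, sa, ioe);
        }
    }

    public static void main(String[] args) {
        ProxySelector selector = new BypassSelector(ProxySelector.getDefault());
        String[] uris = {"http://mail.yahoo.com/", "socket://mail.yahoo.com:80", "http://yahoo.com/"};
        for (String s : uris) {
            // No connection is made; this only asks the selector for its decision
            System.out.println(s + " -> " + selector.select(URI.create(s)));
        }
    }
}
```

Both URI forms are included because, as the output above shows, HttpClient's sockets ask with socket:// while URLConnection asks with http://.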


1 comment:

Ernesto said...

Mr. Tanwani: I am writing a Java program just like a bot. I am using : conn = (HttpURLConnection)url.openConnection(proxy); but seems it is not the correct statement to do this work because all the errors msg I have seen . I used conn = (HttpURLConnection)url.openConnection(); and that WORKS!!!! . I think that ProxySelector Class could be the solution. Whats wrong with my approach?, What do you think about this?

 