One of the goals of the FreedomBox project is to give users back their privacy. No more snooping by Google, Facebook, governments etc.
In order to give users back their privacy, two conditions must be met:
- First: It must not be possible for a third party to monitor (or change) the information a user exchanges with a website. In an ideal world this kind of protection should be provided by SSL. Unfortunately the world is not ideal. Companies that issue SSL certificates can be incompetent, or forced by a government to give out false certificates. It is not clear why they should have our trust.
- Second: It must not be possible for a third party to monitor which website a user contacts. Knowing which websites a user visits can be quite revealing and possibly dangerous for the user (think of someone in China who surfs to MaoWasACriminal.org).
One project that tackles both privacy conditions is the Tor project. Tor stands for The Onion Router. It is a special type of network in which all traffic is encrypted and there is never a direct connection from a user to a website (details are here). Protecting privacy the way Tor does comes at a price: connecting to a website through the Tor network is much slower than a direct connection.
How fast (slow) is Tor? At the Tor metrics portal you can find all kinds of interesting statistics that are updated every day. One statistic I find interesting is the time it takes to download files of different sizes. Although this measurement is a good indication of the speed of the network, it does not measure the user experience. A normal website downloads at least 10 files and does so over multiple parallel connections. To give an (extreme) example: http://edition.cnn.com contacts 11 different domains using 64 connections, and over these connections a total of 145 HTTP requests (wow!) are issued.
I really like the Tor project and I would love to see it improve (especially its speed). One thing that’s missing from the project is a way to accurately measure what is going on between Tor and the browser. I believe this information can be quite useful and therefore wrote a special measuring proxy: MITM.
What is MITM?
MITM stands for Monitor In The Middle. It is basically a Socks 5 proxy that is placed between the browser and Tor. The browser connects to MITM, MITM connects to Tor. All communication between the browser and Tor is intercepted and decoded.
MITM decodes both the Socks protocol and the HTTP protocol and collects the following data:
- Time of the available authentication methods request.
- Time of the selected authentication method response.
- Time, domain and port of the connect request.
- Time of the connect response.
- Time and lines of a request header.
- Time and data of a request body.
- Time and lines of a response header.
- Time and data of a response body (or each body chunk).
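To make the first items in this list concrete, here is a minimal sketch of how the two Socks 5 client messages could be decoded and timestamped. It follows the message layout of the Socks 5 specification (RFC 1928), but it is not MITM's actual code: the function names, the event names and the `record` helper are assumptions made for illustration, and the sketch is written for Python 3 using only the standard library.

```python
import struct
import time


def parse_greeting(data):
    """Decode the 'available authentication methods' message.

    Layout: version (1 byte), number of methods (1 byte), methods (1 byte each).
    """
    version, n_methods = data[0], data[1]
    if version != 5:
        raise ValueError("not a Socks 5 greeting")
    return list(data[2:2 + n_methods])


def parse_connect_request(data):
    """Decode a connect request and return (host, port).

    Layout: version, command, reserved, address type, address, port (2 bytes).
    """
    version, command, _reserved, addr_type = data[:4]
    if version != 5 or command != 1:  # command 1 = CONNECT
        raise ValueError("not a Socks 5 connect request")
    if addr_type == 3:  # domain name, prefixed by a length byte
        length = data[4]
        host = data[5:5 + length].decode("ascii")
        rest = data[5 + length:]
    elif addr_type == 1:  # IPv4 address, 4 bytes
        host = ".".join(str(b) for b in data[4:8])
        rest = data[8:]
    else:
        raise ValueError("address type not handled in this sketch")
    (port,) = struct.unpack("!H", rest[:2])  # network byte order
    return host, port


def record(events, name, detail=""):
    """Append a (timestamp, event name, detail) tuple to the event list."""
    events.append((time.monotonic(), name, detail))
```

A proxy built on this would call `record(events, "SocksConnectRequest", "%s:%d" % parse_connect_request(data))` at the moment the bytes arrive, which gives the "time, domain and port of the connect request" entry from the list above.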
All collected data is converted into a nice report that can be viewed by contacting an internal mini webserver that lives at: http://mitm.proxy.
The report consists of the following sections:
A time graph of all connections. The graph uses SVG so you need a recent browser to display the results.
In the graph you see the following information:
- Each line is a separate connection.
- Blue bars show the total time to handle the Socks protocol (DNS request+TCP connection).
- Gray bars show the time between an HTTP request and the start of a response (waiting time).
- Green bars show the time between the start and end of an HTTP response (transfer time).
- Green dots are displayed when there is no response body, or if the time of the response body is too short to display.
- On the right side of the graph are a number of circles that indicate if a connection is open or closed.
Each bar or circle in the graph has a tool-tip and a link connected to it. If you click on a bar you navigate to the connection information.
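The following sketch shows how such a time graph could be produced as SVG. It is an illustration only and not the code MITM uses: the `render_timeline` function, the input format and the pixel scale are assumptions. Tool-tips are done with the standard SVG `<title>` child element; the links and the open/closed circles are left out to keep it short.

```python
import math
from xml.sax.saxutils import escape

# Bar colors as described in the text: Socks handling, waiting, transfer.
COLORS = {"socks": "blue", "wait": "gray", "transfer": "green"}


def render_timeline(connections, px_per_second=50, row_height=12):
    """Render one row of colored bars per connection.

    connections: list of rows; each row is a list of
    (kind, start_seconds, end_seconds) tuples, with kind a key of COLORS.
    """
    parts = []
    right_edge = 0.0
    for row, bars in enumerate(connections):
        y = row * row_height
        for kind, start, end in bars:
            x = start * px_per_second
            # Very short phases still get a 1 pixel wide bar.
            w = max((end - start) * px_per_second, 1)
            right_edge = max(right_edge, x + w)
            tooltip = escape("%s: %.3f s" % (kind, end - start))
            parts.append(
                '<rect x="%.1f" y="%d" width="%.1f" height="%d" fill="%s">'
                "<title>%s</title></rect>"
                % (x, y, w, row_height - 2, COLORS[kind], tooltip)
            )
    return (
        '<svg xmlns="http://www.w3.org/2000/svg" width="%d" height="%d">%s</svg>'
        % (math.ceil(right_edge), len(connections) * row_height, "".join(parts))
    )
```

Calling `render_timeline([[("socks", 0.0, 1.0), ("wait", 1.0, 3.0), ("transfer", 3.0, 3.5)]])` returns a single-row graph with one blue, one gray and one green bar.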
This section shows the following statistics for the servers that were contacted:
|Connections||8 (still open : 6)|
|TotalConnectionTime||there are still open connections|
This information can be quite substantial. In the table below you only see the first lines of a request and response. You can use the settings page to change this to all the header lines.
|001.292||HttpRequestStart||GET / HTTP/1.1|
|004.293||HttpResponseStart||HTTP/1.1 200 OK|
|005.333||HttpRequestStart||GET /wp-includes/js/l10n.js?ver=20101110 HTTP/1.1|
|007.318||HttpResponseStart||HTTP/1.1 200 OK|
|010.390||HttpRequestStart||GET /wp-content/themes/graphene/images/sprite_h.png HTTP/1.1|
|012.338||HttpResponseStart||HTTP/1.1 200 OK|
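The waiting time per request can be derived from rows like these. The sketch below is an assumption-laden illustration, not part of MITM: it assumes the first column is a time in seconds, that rows have exactly the `|time||event||detail|` shape shown above, and that requests on one connection are not pipelined, so each `HttpResponseStart` belongs to the most recent `HttpRequestStart`. The function names are hypothetical.

```python
def parse_row(line):
    """Split a row such as '|001.292||HttpRequestStart||GET / HTTP/1.1|'."""
    time_text, event, detail = line.strip().strip("|").split("||", 2)
    return float(time_text), event, detail


def waiting_times(lines):
    """Return (request line, waiting time in seconds) for each request."""
    results = []
    pending = None  # (start time, request line) of the unanswered request
    for line in lines:
        timestamp, event, detail = parse_row(line)
        if event == "HttpRequestStart":
            pending = (timestamp, detail)
        elif event == "HttpResponseStart" and pending is not None:
            started, request_line = pending
            results.append((request_line, round(timestamp - started, 3)))
            pending = None
    return results
```

For the first two rows of the table this gives a waiting time of 3.001 seconds for `GET / HTTP/1.1` (004.293 minus 001.292).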
How to use MITM.
MITM is a Python 2.x program that uses the Twisted event-driven networking framework. You can download MITM by clicking this link. MITM has been tested with OpenSuse 11.3 and Firefox > 4. You need at least Firefox version 4 to see the SVG graphics.
In order to use MITM effectively you need to create and use a special Firefox profile. Use the following steps:
- Close all Firefox instances.
- In a terminal window type: firefox -ProfileManager.
- In the dialog, press Create Profile and answer the questions (more details).
If you named your profile “mitm” you can start using this profile by typing:
firefox -P mitm -no-remote
The first time you start you have to change the configuration of the profile:
- Remove all feeds if they are present.
- Choose: Manual proxy configuration, Socks Host 127.0.0.1, Port 9000.
- Clear the “No proxy for” setting.
- Save these settings and open the about:config page. Filter on socks_remote_dns and change this setting to true.
- On the privacy tab choose: Never remember history.
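The proxy part of these settings can also be written to a `user.js` file in the profile directory instead of being clicked together by hand. The sketch below generates the contents of such a file. It is an optional alternative that is not part of MITM; the preference names are the standard Firefox proxy preferences as far as I know, so check them against about:config before relying on them. The feed and history settings are not covered.

```python
import json

# Proxy preferences matching the manual steps above.
PREFS = {
    "network.proxy.type": 1,                 # 1 = manual proxy configuration
    "network.proxy.socks": "127.0.0.1",      # Socks host
    "network.proxy.socks_port": 9000,        # port MITM listens on
    "network.proxy.socks_remote_dns": True,  # resolve names through the proxy
    "network.proxy.no_proxies_on": "",       # empty "No proxy for" list
}


def render_user_js(prefs):
    """Return the text of a user.js file, one user_pref(...) line per entry."""
    lines = [
        "user_pref(%s, %s);" % (json.dumps(name), json.dumps(value))
        for name, value in prefs.items()
    ]
    return "\n".join(lines) + "\n"
```

Saving the output of `render_user_js(PREFS)` as `user.js` in the "mitm" profile directory should make Firefox apply these values each time the profile starts.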
With these settings in place and Tor running you can start mitm_proxy.py. To display the results I usually open a second tab and navigate to http://mitm.proxy.
Some extra functions.
MITM not only monitors all communication, it can also delay Socks connect requests and HTTP requests. This function can be used together with some bandwidth limiting software to simulate a slow Tor connection.
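The idea behind the delay function can be sketched in a few lines. MITM itself is built on Twisted; the sketch below uses Python 3's asyncio instead to stay within the standard library, and the `forward` function, the message kinds and the delay values are all hypothetical.

```python
import asyncio

# Delay in seconds per kind of message; anything not listed passes undelayed.
DELAYS = {"SocksConnectRequest": 0.5, "HttpRequestStart": 0.2}


async def forward(kind, data, send, delays=DELAYS):
    """Hold back selected messages for a while, then pass them on.

    kind: label the decoder assigned to this message.
    send: callable that writes the bytes to the next hop (for example Tor).
    """
    delay = delays.get(kind, 0.0)
    if delay > 0:
        # Other connections keep running while this message waits.
        await asyncio.sleep(delay)
    send(data)
```

Because the wait is a non-blocking sleep, delaying one request does not stall the other connections handled by the same event loop.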
MITM can also take measurements without connecting to Tor. On the settings page there is an option to use an internal Socks5 server.
Limitations and final words.
The HTTP specification(s) are large documents with many sections that are open to interpretation (confusing). This makes it difficult to write an HTTP decoder that’s 100% bulletproof. I did not try to handle all possible aspects of the specification. I just wanted something uncomplicated that is correct most of the time. Feel free to contact me if you find a problem.
MITM is my first Python program. I tried to make it as Pythonic as possible. Don’t hesitate to contact me if something in the code is less than optimal. I would like to learn more.