CS 352 Spring 2006

Programming Assignment 2

Due Date: 29 Mar, 2006 11:59pm (Extended (3))

Updates

Before jumping into the project, I would like to mention one thing. Most of the time, your first attempt at a software project is too slow, too buggy, or difficult to maintain and expand. Be prepared to throw away the first versions of most of your code; you can even plan for them to be throw-aways. This is not wasted time, because you learn where you can make improvements in your final version.

Theory of Operation

The second project will expand upon what you did for the first. Your client must now open a .torrent file, contact the tracker as well as multiple peers, and be able to upload to and download from the peers simultaneously. This means you will have to manage the state of several connections and work with multiple threads. Your client should also accept incoming connections from the peers at IP addresses listed above.

If your code for project 1 was well thought out, it should be easy to expand your program for multiple downloads, and uploading should not be much more difficult than listening for more types of packets and sending new ones. You will not have to worry about choking any peers but, as in the first project, you will have to make sure to unchoke them and keep them up to date on what pieces your client has downloaded.

Assignment Specifics

Your assignment should basically do the following:

  1. Take as command-line arguments the name of the .torrent file to be loaded and the name of the file to save the data to. For example:
        java mybtclient somefile.torrent picture.jpg
  2. Pass the name of the .torrent file to the TorrentFileHandler class as a String and store the returned TorrentFile object.
  3. Open a TCP socket on the local machine and send an HTTP GET request to the tracker at the IP address and port specified by the TorrentFile object.
  4. Capture the response from the tracker and decode it in order to get the list of peers. From this list, use only the pre-selected peers, which will be specified. You must extract these IP addresses from the tracker's list: hard-coding the addresses you connect to is not acceptable, although you may hard-code the values used in the comparison.
  5. Open TCP connections to the peers and be able to download from several of them simultaneously.
  6. Download one or more pieces of the file and verify each piece's SHA-1 hash against the hash stored in the TorrentFile object.
  7. After a piece is downloaded and verified, notify the peers that you have completed the piece.
  8. Other peers should be able to request pieces from your client that you have verified, and your client should send those pieces to the peers.
  9. Repeat steps 5-8 (using the same TCP connections) for the rest of the file. Make sure your client has uploaded at least 10% of the file size before closing your program.
  10. When the file is finished, you must contact the tracker and send it the completed event and properly close all connections.
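
Step 6 can be sketched as follows. This is a minimal illustration, not the required implementation: the class name PieceVerifier is hypothetical, and the expected hash is passed in as a plain byte array because the accessor on the TorrentFile object is not shown here.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Hypothetical helper: checks one downloaded piece against its expected SHA-1 hash.
final class PieceVerifier {
    /** Returns true if the SHA-1 digest of pieceData equals the expected 20-byte hash. */
    static boolean verify(byte[] pieceData, byte[] expectedHash) {
        try {
            MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
            byte[] actual = sha1.digest(pieceData);
            return Arrays.equals(actual, expectedHash);
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 is a required algorithm on the Java platform, so this is unexpected.
            throw new IllegalStateException(e);
        }
    }
}
```

A piece that fails this check should be discarded and requested again rather than announced to the peers.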

In addition to the above basic run-through of your program's execution, your program should be able to detect input from the user at any point (you can decide what the input should be), and allow the user to close the program, suspending download of the file. The user should be able to restart the program at a later time and resume download. Any non-verified pieces/blocks should be discarded before suspending, and the program should also save how much it has uploaded and downloaded for that torrent.

How you choose to implement saving the state of your program is up to you, but a good start is using the Serializable interface for many Java objects. If you do not use an object that implements the Serializable interface, you must write your own method for storing your data to disk and restoring it upon start-up.
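
One possible shape for the saved state, using the Serializable interface, is sketched below. The class name ResumeState and its fields are assumptions made for illustration; your own design may store different information.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical container for what must survive a suspend/resume cycle.
final class ResumeState implements Serializable {
    private static final long serialVersionUID = 1L;

    final boolean[] verifiedPieces; // true only for pieces that passed the SHA-1 check
    long uploaded;                  // total bytes uploaded for this torrent
    long downloaded;                // total bytes downloaded for this torrent

    ResumeState(int pieceCount) {
        verifiedPieces = new boolean[pieceCount];
    }

    /** Writes this object to the given file using Java serialization. */
    void save(File file) throws IOException {
        ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(file));
        try {
            out.writeObject(this);
        } finally {
            out.close();
        }
    }

    /** Reads a previously saved state back from the given file. */
    static ResumeState load(File file) throws IOException, ClassNotFoundException {
        ObjectInputStream in = new ObjectInputStream(new FileInputStream(file));
        try {
            return (ResumeState) in.readObject();
        } finally {
            in.close();
        }
    }
}
```

Because only verified pieces are marked, anything partially downloaded is implicitly discarded when the state is saved.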

Files Available

The files below are intended to let you concentrate on the networking portion of the project, without having to worry about how to bencode or "unbencode" the torrent file and some of the communication.

Programming Style

Writing maintainable code is just as important as writing correct code. Consequently, you will be graded on your style. Below is a list of suggestions to make your code easier to understand and maintain:

Bencoding (Pronounced "Bee Encoding")

Bencoding is a method of encoding data, including binary data. The metainfo (.torrent) file and tracker responses are bencoded; messages between peers are not, and use the binary format described under Communicating With the Peer. Below is how data types are bencoded according to the BT protocol. [The following list is taken from http://www.bittorrent.com/protocol.html ]
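
As a rough illustration of the format, the sketch below decodes the four bencoded types defined by the BT protocol: byte strings ("4:spam"), integers ("i1800e"), lists ("l...e") and dictionaries ("d...e"). The class name BencodeSketch is hypothetical, the code does no error handling for malformed input, and it is not a replacement for the provided files.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical minimal decoder: strings become byte[], integers Long,
// lists List<Object>, dictionaries Map<String, Object>.
final class BencodeSketch {
    private final byte[] data;
    private int pos = 0;

    private BencodeSketch(byte[] data) {
        this.data = data;
    }

    static Object decode(byte[] data) {
        return new BencodeSketch(data).readValue();
    }

    private Object readValue() {
        byte b = data[pos];
        if (b == 'i') {                       // integer: i<digits>e
            pos++;
            int end = indexOf('e');
            Long value = Long.valueOf(ascii(pos, end));
            pos = end + 1;
            return value;
        }
        if (b == 'l') {                       // list: l<values>e
            pos++;
            List<Object> list = new ArrayList<Object>();
            while (data[pos] != 'e') {
                list.add(readValue());
            }
            pos++;
            return list;
        }
        if (b == 'd') {                       // dictionary: d<key><value>...e
            pos++;
            Map<String, Object> dict = new LinkedHashMap<String, Object>();
            while (data[pos] != 'e') {
                String key = new String(readString(), StandardCharsets.ISO_8859_1);
                dict.put(key, readValue());
            }
            pos++;
            return dict;
        }
        return readString();                  // byte string: <length>:<bytes>
    }

    private byte[] readString() {
        int colon = indexOf(':');
        int length = Integer.parseInt(ascii(pos, colon));
        byte[] out = new byte[length];
        System.arraycopy(data, colon + 1, out, 0, length);
        pos = colon + 1 + length;
        return out;
    }

    // Finds the next occurrence of c at or after the current position.
    private int indexOf(char c) {
        int i = pos;
        while (data[i] != c) {
            i++;
        }
        return i;
    }

    private String ascii(int from, int to) {
        return new String(data, from, to - from, StandardCharsets.US_ASCII);
    }
}
```

Strings are kept as raw byte arrays because bencoded strings may hold binary data, such as SHA-1 hashes, that is not valid text.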

Communication With the Tracker

Your client must take the information supplied by the TorrentFile object and use it to communicate with the tracker. The tracker's IP address and port number will be given to you by the TorrentFile object, and your program must then contact the tracker. Your program will send an HTTP GET request to the tracker with the following key/value pairs. Note that these are NOT bencoded, but must be properly escaped [this list is taken from http://www.bittorrent.com/protocol.html ]:
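
Escaping matters most for info_hash and peer_id, which are raw 20-byte values rather than text. The sketch below percent-encodes arbitrary bytes and assembles a query string. The class name TrackerQuerySketch is hypothetical, and the key names used (info_hash, peer_id, port, uploaded, downloaded, left) are taken from the BT protocol page cited above, not from any provided class; treat that page as authoritative for the full list.

```java
// Hypothetical helper for building the query part of the tracker GET request.
final class TrackerQuerySketch {
    private static final char[] HEX = "0123456789ABCDEF".toCharArray();

    /** Percent-encodes raw bytes, leaving only unreserved URL characters as-is. */
    static String escape(byte[] raw) {
        StringBuilder sb = new StringBuilder(raw.length * 3);
        for (byte b : raw) {
            int v = b & 0xFF; // treat the byte as unsigned
            boolean unreserved = (v >= 'a' && v <= 'z') || (v >= 'A' && v <= 'Z')
                    || (v >= '0' && v <= '9')
                    || v == '-' || v == '_' || v == '.' || v == '~';
            if (unreserved) {
                sb.append((char) v);
            } else {
                sb.append('%').append(HEX[v >> 4]).append(HEX[v & 0x0F]);
            }
        }
        return sb.toString();
    }

    /** Builds the query string; numeric values need no escaping. */
    static String build(byte[] infoHash, byte[] peerId, int port,
                        long uploaded, long downloaded, long left) {
        return "?info_hash=" + escape(infoHash)
                + "&peer_id=" + escape(peerId)
                + "&port=" + port
                + "&uploaded=" + uploaded
                + "&downloaded=" + downloaded
                + "&left=" + left;
    }
}
```

Encoding the bytes directly avoids passing the hash through a String first, which can corrupt bytes that are not valid in the platform's default character set.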

The response from the tracker is a bencoded dictionary and contains two keys:

In addition to what your program did for the last project, your client should also periodically report its status to the tracker. The update period should be no less than the interval returned by the tracker. The interval may change during execution, so your client should update its stored value each time the tracker is contacted.
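
One simple way to honor the interval is to record it after every tracker response and check it before contacting the tracker again. The sketch below assumes the interval is given in seconds; the class name AnnounceTimer and the explicit clock parameter are illustrative choices, not part of any provided code.

```java
// Hypothetical helper that remembers the most recent tracker interval.
final class AnnounceTimer {
    private long intervalMillis = 0;
    private long lastAnnounceMillis = 0;
    private boolean announced = false;

    /** Call after every tracker response, passing the interval it returned. */
    void recordAnnounce(long nowMillis, long intervalSeconds) {
        lastAnnounceMillis = nowMillis;
        intervalMillis = intervalSeconds * 1000L;
        announced = true;
    }

    /** True if the tracker has never been contacted or the interval has elapsed. */
    boolean isDue(long nowMillis) {
        return !announced || nowMillis - lastAnnounceMillis >= intervalMillis;
    }
}
```

A tracker thread could call isDue(System.currentTimeMillis()) in its loop and sleep between checks.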

Communicating With the Peer

Handshaking between peers begins with a single byte of value nineteen (decimal) followed by the 19-character string 'BitTorrent protocol'. After this fixed header come 8 reserved bytes, which are set to 0. Next is the 20-byte SHA-1 hash of the bencoded form of the info value from the metainfo (.torrent) file. The final 20 bytes are the peer id generated by the client. Both the info_hash and the peer_id must be the same values that were sent to the tracker. If the info_hash differs between two peers, the connection is dropped.
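
The layout described above adds up to 68 bytes (1 + 19 + 8 + 20 + 20). A minimal sketch of building a handshake and checking a received one follows; the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper for the 68-byte handshake packet.
final class HandshakeSketch {
    static final byte[] PROTOCOL = "BitTorrent protocol".getBytes(StandardCharsets.US_ASCII);
    static final int LENGTH = 1 + 19 + 8 + 20 + 20; // 68 bytes in total

    static byte[] build(byte[] infoHash, byte[] peerId) {
        if (infoHash.length != 20 || peerId.length != 20) {
            throw new IllegalArgumentException("info_hash and peer_id must be 20 bytes each");
        }
        byte[] out = new byte[LENGTH];
        out[0] = 19;                                    // length of the protocol string
        System.arraycopy(PROTOCOL, 0, out, 1, 19);      // bytes 1..19
        // bytes 20..27 are the reserved bytes and stay zero
        System.arraycopy(infoHash, 0, out, 28, 20);     // bytes 28..47
        System.arraycopy(peerId, 0, out, 48, 20);       // bytes 48..67
        return out;
    }

    /** Checks the fixed header and compares the info_hash of a received handshake. */
    static boolean infoHashMatches(byte[] handshake, byte[] infoHash) {
        if (handshake.length != LENGTH || handshake[0] != 19) {
            return false;
        }
        if (!Arrays.equals(Arrays.copyOfRange(handshake, 1, 20), PROTOCOL)) {
            return false;
        }
        return Arrays.equals(Arrays.copyOfRange(handshake, 28, 48), infoHash);
    }
}
```

If infoHashMatches returns false for a received handshake, the connection should be closed.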

After the handshake, messages between peers take the form of <length prefix><message ID><payload>, where length prefix is a 4-byte big-endian value and message ID is a single byte. The length prefix counts the message ID and the payload, but not the prefix itself. The payload depends on the message. Please consult either of the BT-related resources for detailed information describing these messages. Below is a list of messages that need to be implemented in the project.
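
The framing can be sketched as below. The class name PeerMessageSketch is hypothetical, and the message ID constants are those given in the BT protocol specification; check them against the BT-related resources before relying on them.

```java
import java.nio.ByteBuffer;

// Hypothetical helper that frames peer messages as <length prefix><message ID><payload>.
final class PeerMessageSketch {
    // Message IDs as listed in the BT protocol specification.
    static final byte CHOKE = 0, UNCHOKE = 1, INTERESTED = 2, NOT_INTERESTED = 3,
            HAVE = 4, BITFIELD = 5, REQUEST = 6, PIECE = 7;

    /** Frames a message; the length prefix covers the ID byte plus the payload. */
    static byte[] encode(byte id, byte[] payload) {
        ByteBuffer buf = ByteBuffer.allocate(4 + 1 + payload.length);
        buf.putInt(1 + payload.length); // ByteBuffer writes big-endian by default
        buf.put(id);
        buf.put(payload);
        return buf.array();
    }

    /** A have message carries one 4-byte piece index. */
    static byte[] have(int pieceIndex) {
        return encode(HAVE, ByteBuffer.allocate(4).putInt(pieceIndex).array());
    }

    /** A request message carries piece index, byte offset within the piece, and length. */
    static byte[] request(int pieceIndex, int begin, int length) {
        ByteBuffer payload = ByteBuffer.allocate(12);
        payload.putInt(pieceIndex).putInt(begin).putInt(length);
        return encode(REQUEST, payload.array());
    }
}
```

For example, PeerMessageSketch.encode(PeerMessageSketch.UNCHOKE, new byte[0]) yields the five bytes 0, 0, 0, 1, 1.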

During the first project, a lot of people were confused about the order of messages after the handshake. Below is an example of what would take place between two peers setting up a connection and starting to share.

  1. The local host opens a TCP Socket to the remote peer and sends the handshake packet. The local host then listens for the remote peer to respond.
  2. Upon receiving the handshake packet and verifying the info_hash, the remote peer responds with a similar handshake packet (except with its own peer_id). The remote peer then listens for the local host to send a bitfield or other packet. The remote peer can also send a bitfield packet to the local host at this time.
  3. Upon receiving the handshake and verifying the info_hash, the local host then (optionally) sends a bitfield packet which tells the remote peer which pieces it has downloaded and verified so far.
  4. If the local host is interested in what the remote peer has downloaded, then it sends an interested packet, otherwise it sends an uninterested packet. If the remote peer is interested in what the local host has downloaded, then it sends an interested packet, otherwise it sends an uninterested packet.
  5. When the local host, or the remote peer, is ready to download/upload to the other, it will send an unchoke packet.

Please note that clients will, and your client should, ignore any request packets received while a remote peer is choked. A client should only upload to another client if the connection is unchoked AND the remote peer is interested. This means that a remote peer will not reply to the local host's request packets unless the local host has expressed interest AND the remote peer has sent an unchoke packet. If the remote peer sends a choke packet during data transfer, any outstanding requests will be discarded and left unanswered; they should be re-requested after the next unchoke packet.
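
These rules amount to four flags per connection plus a list of requests that must be re-sent after a choke. A minimal sketch is shown below; the class name PeerLinkState, the use of integer block identifiers, and the method names are assumptions made for illustration. The initial flag values follow the BT protocol, where a connection starts out choked and not interested on both sides.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical per-connection state; one instance per remote peer.
final class PeerLinkState {
    boolean amChoking = true;       // we are choking the remote peer
    boolean amInterested = false;   // we want pieces the remote peer has
    boolean peerChoking = true;     // the remote peer is choking us
    boolean peerInterested = false; // the remote peer wants pieces we have

    private final List<Integer> outstanding = new ArrayList<Integer>();
    private final List<Integer> toRequest = new ArrayList<Integer>();

    /** Upload only if we have unchoked the peer AND it is interested. */
    boolean shouldAnswerRequests() {
        return !amChoking && peerInterested;
    }

    /** Requests are only answered if we are interested AND the peer has unchoked us. */
    boolean maySendRequests() {
        return amInterested && !peerChoking;
    }

    void queue(int blockId) {
        toRequest.add(blockId);
    }

    /** Returns the next block to request, or -1 if nothing may be requested now. */
    int nextRequest() {
        if (!maySendRequests() || toRequest.isEmpty()) {
            return -1;
        }
        Integer id = toRequest.remove(0);
        outstanding.add(id);
        return id;
    }

    /** On choke, outstanding requests are dropped by the peer, so queue them again. */
    void onChoke() {
        peerChoking = true;
        toRequest.addAll(0, outstanding);
        outstanding.clear();
    }

    void onUnchoke() {
        peerChoking = false;
    }
}
```

With several connections handled by separate threads, each thread can own its PeerLinkState, which avoids sharing these flags across threads.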

The Write-Up

Submitting the Project

The project should be submitted through the Computer Science Handin Server.

Grading

Resources

It is strongly recommended that you bookmark or download the Sun Java 1.5.0 API, as well as read the following pages:

Please email romoore _at_ cs _dot_ rutgers _dot_ edu concerning any questions or problems with this page. Please visit the Sakai website for questions concerning this project.

Frequently Asked Questions

I've been getting a lot of similar questions lately, so I'll post general forms of them as well as advice on how to solve the underlying problems. As a bit of general advice, if you aren't sure what a packet is supposed to look like, you can always fire up a working client (the mainline BitTorrent client, for example) and then use a network packet sniffing tool like Ethereal to capture the traffic and analyze it.