CS 352 Fall 2006
Programming Assignment 1
Due Date: Extended to Wed 17 Oct 11:59pm
Sections
Updates
-
(10/14) Today the iLab machines have been used for CS111 midterm. So, we have to move our tracker over there to another machine. Please try to run your program with the new torrent file: photo2.jpg.torrent which will point you to the new tracker and new seeds.
-
(10/5) Robert warned me that there may be some bugs in the current Bencoder.java. If so, please use the Java 1.5 version of Bencoder.java that can be download HERE . Note that this version doesn't work with the old BencoderTester.java, and you have to use Java 1.5 or higher. !!!.
-
(9/27) Sample codes for analyzing the torrent file have been posted:
· TorrentFile.java
- Stores the data from a .torrent file in an easy-to-access format.
· TorrentFileHandler.java -
Accepts a String corresponding to a filename for a .torrent file and returns a TorrentFile object containing the unencoded
data stored in the file.
· TorrentFileHandlerTester.java - An
example of how to use the TorrentFileHandler and TorrentFile classes.
In this project, you will build a simple
BitTorrent(BT) client which will serve as the foundation for a fully-featured
BT client you will build in later assignments. For project 1, your client will
load a .torrent file, interface with a tracker and a single peer and will
download a single file(Jpeg) from that peer. When the file is finished
downloading, you will save it to the hard disk. All communication will be done
over TCP.
The file your program will read in is a
.torrent file containing metadata for the file to be downloaded.
Your assignment should basically do the
following:
- Take as
a command-line argument the name of the .torrent file to be loaded and the
name of the file to save the data to. For example:
java mybtclient somefile.torrent picture.jpg
- Open
the .torrent file and parse the data inside. You may use the Bencoder class to decode the data.
- Open a TCP socket on the local machine and send an HTTP
GET request to the tracker at the IP address and port specified by the TorrentFile object.
- Capture the response from the tracker and decode it in
order to get the list of peers. From this list of peers, use only
the peer at IP address 128.6.60.*. You must extract this IP from the list,
hard-coding it is not acceptable.
- Open a TCP socket on the local machine and contact the
peer using the BT peer protocol and request a piece of the file.
- Download the piece of the file and verify its SHA-1
hash against the hash stored in the TorrentFile
object. The first time you begin the download, you need to contact the
tracker and let it know you are starting download.
- After a piece is downloaded and verified, the peer is
notified that you have completed the piece.
- Repeat steps 5-7 (using the same TCP connection) for
the rest of the file.
- When the file is finished, you must contact the tracker
and send it the completed event and properly close all TCP
connections
- Save the file to the hard disk according to the second
command-line argument.
The files below are intended to help you concentrate
on the networking portion of the project and not have to worry about how to bencode or "unbencode"
the torrent file and some of the communication.
·
TorrentFile.java - Stores the data from a .torrent
file in an easy-to-access format.
·
TorrentFileHandler.java - Accepts a String
corresponding to a filename for a .torrent file and returns a TorrentFile object containing the unencoded
data stored in the file.
·
TorrentFileHandlerTester.java - An
example of how to use the TorrentFileHandler and TorrentFile classes.
- project1.torrent
- The torrent file containing metadata about the file to be downloaded.
- Bencoder.java
- Accepts Strings, Integers, Lists, and Maps (Dictionaries) and
returns them as bencoded byte arrays. Also
accepts bencoded byte arrays of Strings,
Integers, Lists, and Dictionaries and returns them as Strings, Integers,
Lists and Maps.
- BencoderTester.java
- An example of how to use the Bencoder
class.
- ToolKit.java
- A utility class that includes functions for conversion between byte[] and int.
Writing maintainable code is just as important as
writing correct code. Consequently, you will be graded on your style. Below is a
list of suggestions to make your code easier to understand and maintain:
- Conditions
- All of your methods must have clearly specified preconditions and postconditions. A precondition is what you assume to
be true before your method is called, and a postcondition
is what you can show to be true after your method has returned.
Trivial preconditions (such as "int a is an integer") should be omitted. The best
place to put the pre and postconditions is in a
comment before the first line of each method.
Example:
· /*
· Precondition: b is non-zero.
· Postcondition: returns a / b.
· */
· double divide(double a, double b)
· {· return (a / b);
· }
- Comments -
Comments for variables and for sections of code that does not have an
obvious result. Please do not comment obvious statements or sections of
code. For example,
· int i = 0; //i is an integer and the initial value is 0
is not useful and unnecessarily
clutters the code.
- Naming - Names
of classes, methods, and variables should be descriptive. For example, if
a class has a field containing its hash value, naming the field hash_value is much better than naming it
hv .
- Exceptions
- Catch exceptions and do something useful with them. For example, print a
statement describing what went wrong to System.error.
- Easy to
understand loops/recursion. When possible, it's best to avoid break statements
whenever possible. When using recursion, tail-recursion is often
preferable because "smart" compilers can often translate it into
a loop.
- Efficiency
- Avoid being terribly inefficient. For example, if you have to sort 10M
objects, it's better to sort them with a O(n*lg(n)) algorithm than a O(n2) algorithm.
- Encapsulation
- Encapsulation is a useful feature of object-oriented languages, and so
you should try to write classes that expose as little as possible. This
way you can change the implementation of a class without affecting the
rest of your program.
Bencoding
(Pronounced "Bee Encoding")
Bencoding a method of encoding binary data. Tracker responses, and interpeer communication will be bencoded.
Below is how data types are bencoded according to the
BT protocol. [The following list is taken from http://www.bittorrent.org/protocol.html
]
- Strings are length-prefixed base ten followed by a
colon and the string. They are encoded in UTF-8. For example 4:spam
corresponds to 'spam'.
- Integers are represented by an 'i'
followed by the number in ASCII base 10 followed by an 'e'. For example
i3e corresponds to 3 and i-3e corresponds to -3. Integers have no size
limitation. i-0e is invalid. All encodings with a
leading zero, such as i03e, are invalid, other than i0e, which of course
corresponds to 0.
- Lists are encoded as an 'l' followed by their elements
(also bencoded) followed by an 'e'. For example,
l4:spam4:egse corresponds to ['spam', 'eggs'].
- Dictionaries are encoded as a 'd' followed by a list of
alternating keys and their corresponding values followed by an 'e'. For
example, d3:cow3:moo4:spam4:eggse corresponds to {'cow':'moo',
'spam':'eggs'} and d4:spaml1:a1:bee corresponds
to {'spam': ['a', 'b']}. Keys must be strings and appear in sorted order
(sorted as raw strings, not alphanumerics).
Your client must take the information supplied by
the TorrentFile object and use it to communicate with
the tracker. The tracker's IP address and port number will be given to you by
the TorrentFile object, and your program must then
contact the tracker. Your program will send an HTTP_GET request to the tracker
with the following key/value pairs. Note that these are NOT bencoded,
but must be properly escaped [this list is taken from http://www.bittorrent.org/protocol.html
]:
- info_hash - The 20 byte(160-bit)
SHA1 hash of the bencoded form of the info value
from the metainfo file. This value will almost
certainly have to be escaped.
- peer_id - A string of length 20
which this downloader uses as its id. Each downloader generates its own id
at random at the start of a new download. This value will almost certainly
have to be escaped.
- ip - An optional parameter
giving the IP (or DNS name) which this peer is at.
- port - The port number this
peer is listening on. Common behavior is for a downloader to try to listen
on port 6881 and if that port is taken try 6882, then 6883, etc. and give
up after 6889.
- uploaded - The total amount
uploaded so far, encoded in base ten ascii.
- downloaded - The total amount
downloaded so far, encoded in base ten ascii.
- left - The number of bytes
this peer still has to downloaded, encoded in base ten ascii.
This key is important - If you do not specify how much you have
left to download, the tracker assumes you are a seed and will not return
any seeds in the peer list.
- event - This is an optional
key which maps to started , completed , or stopped (or
empty , which is the same as not being present. If not present,
this is one of the announcements done at regular intervals. An
announcement using started is sent when a download first begins,
and one using completed is sent when the download is complete. No completed
is sent if the file was complete when started. Downloaders
send an announcement using stopped when they cease downloading.
The response from the tracker
is a bencoded dictionary and contains two keys:
- interval - Maps to the number of
seconds the downloader should wait between regular rerequests.
- peers - Maps to a list of dictionaries corresponding to
peers, each of which contains the keys peer id , ip , and port , which map to the
peer's self-selected ID, IP address or dns name
as a string, and port number, respectively.
Handshaking between peers begins with character
nineteen (decimal) followed by the string 'BitTorrent protocol'. After the fixed headers are 8 reserved bytes which are set to 0.
Next is the 20-byte SHA-1 hash of the bencoded form
of the info value from the metainfo (.torrent) file.
The next 20-bytes are the peer id generated by the client. The info_hash should be the same as sent to the tracker, and
the peer_id is the same as sent to the tracker. If
the info_hash is different between two peers, then
the connection is dropped.
- All integers
are encoded as 4-bytes big-endian.
- The peer_id should simply be random numbers.
After
the handshake, messages between peers take the form of <length
prefix><message ID><payload> , where length prefix is a 4-byte
big-endian value and message ID is a single decimal character. The payload
depends on the message. Please consult either of the BT-related resources for
detailed information describing these messages. Below is a list of messages
that need to be implemented in the project.
- keep-alive:
<length prefix> is 0000. There is no message ID and no
payload. These should be sent around once every 2 minutes to prevent peers
from closing connections.
- interested:
<length prefix> is 0001 and message ID is 2. There is no
payload.
- uninterested:
<length prefix> is 0001 and message ID is 3. There is no
payload.
- have:
<length prefix> is 0005 and message ID is 4. The payload is a
zero-based index of the piece that has just been downloaded and verified.
- request:
<length prefix> is 0013 and message ID is 6. The payload is
as follows:
<index><begin><length>
Where <index> is an integer specifying the zero-based piece
index, <begin> is an integer specifying the zero-based byte
offset within the piece, and <length> is the integer
specifying the requested length. <length>
is typically 2^14 (16384) bytes. A smaller piece should only be used
if the piece length is not divisible by 16384. A peer may close the
connection if a block larger than 2^14 bytes is requested.
- piece:
<length prefix> is 0009+X and message ID is 7. The payload is
as follows:
<index><begin><block>
Where <index> is an integer specifying the zero-based piece
index, <begin> is an integer specifying the zero-based byte
offset within the piece, and <block> which is a block of data,
and is a subset of the piece specified by <index> .
- Include
your name as it appears on the roster and your student ID.
- A
high-level description of how your program works. Specifically, describe
the high-level interactions between the classes. This is a good way to
make sure that your program doesn't have any classes depending on the
implementations of other classes. If you know how to use LaTeX, this should be fairly easy to diagram. Also, OpenOffice.org's Draw program has an Export to PDF
feature that is very simple to use.
- A brief
(about 1 paragraph) description of each class in the program.
- Optionally,
you may include a section on feedback about the program. Please only
provide constructive/useful comments or insights. What parts of the
project were the most challenging and which were the easiest? Did you have
trouble finding resources for the project? Were any parts confusing to
you?
The project should be submitted through the Computer Science Handin Server.
- Make sure your name is in comments at the top of every
file submitted.
- The "main" starting method should be in a
file called mybtclient.java.
- The write-up in HTML or PDF format saved as a writeup.<html/pdf>.
- All files should be submitted either as a .zip or .tar
file. This file should be named <NetID>.<zip/tar>,
where <NetID> is your eden NetID.
- Programming
Style: 15%
- Correctly
setting up TCP sockets/connections: 15%
- Correctly
interfacing with the tracker: 20%
- Correctly
interfacing with the peer: 20%
- Correctly
downloading and verifying the file: 20%
- Write-up
10%
It is strongly recommended that you
bookmark or download the Sun Java 1.5.0
API , as well as read the following pages:
Please email tphan@cs.rutgers.edu concerning any
questions or problems with this page.
Frequently
Asked Questions
- This part
will be updated soon.