2. Common Scenario. Millions want to download the same popular, huge files (for free): software and media (the real example!). The client-server model fails: a single server fails, and nobody can afford to deploy enough servers.
11. BitTorrent Terminology. Peer: a node or computer that does not have the complete file. Seed or seeder: a computer with a complete copy of a BitTorrent file. Swarm: a group of computers simultaneously sending (uploading) or receiving (downloading) the same file. .torrent: a pointer file that directs your computer to the file you want to download. Tracker: a server that manages the BitTorrent file-transfer process.
18. A *.torrent guides users to owners of a file. (1) User obtains a *.torrent file, which contains meta info about a target file. (2) User loads the *.torrent file into a BitTorrent client, which then looks up the named tracker. (3) Tracker coordinates peers. (4) Armed with a list of peers holding pieces of the file, user downloads from many peers.
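The tracker lookup described above can be sketched as building an announce request. This is a minimal sketch, assuming an HTTP tracker and the basic query parameters of the original protocol (`info_hash`, `peer_id`, `port`, `uploaded`, `downloaded`, `left`); the tracker URL, peer ID, and the function name `build_announce_url` are hypothetical and exist only for illustration.

```python
from urllib.parse import urlencode


def build_announce_url(announce, info_hash, peer_id, port, left):
    """Build the GET URL a client could send to an HTTP tracker.

    announce  -- tracker URL taken from the .torrent file
    info_hash -- 20-byte SHA1 hash identifying the torrent
    peer_id   -- 20-byte identifier chosen by this client
    port      -- port on which this client accepts peer connections
    left      -- number of bytes this client still has to download
    """
    if len(info_hash) != 20 or len(peer_id) != 20:
        raise ValueError("info_hash and peer_id must be 20 bytes each")
    query = urlencode({
        "info_hash": info_hash,  # raw bytes are percent-encoded by urlencode
        "peer_id": peer_id,
        "port": port,
        "uploaded": 0,
        "downloaded": 0,
        "left": left,
    })
    return f"{announce}?{query}"


# Hypothetical values, for illustration only.
url = build_announce_url(
    "http://tracker.example.org/announce",
    info_hash=bytes(range(20)),
    peer_id=b"-XX0001-abcdefghijkl",
    port=6881,
    left=1024,
)
print(url)
```

The tracker's reply would carry the list of peers used in step 4; parsing that reply is left out of this sketch.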
19. All peers act as a source. A machine with a complete copy (the seed) can distribute incomplete pieces to multiple peers. Peers exchange different pieces of the file with one another until they assemble a whole. As soon as a user has a piece of the file on their machine, they can become a source of that piece to other peers, helping to speed up the download.
20. The key ingredients of the *.torrent file are the tracker's address and the unique SHA1 hash. All data in a metainfo file is bencoded. info: a dictionary that describes the file(s) of the torrent. announce: contains the URL of the "tracker". creation date. comment: comments from the author (optional). created by: (optional). piece length: number of bytes in each piece (integer). pieces: string consisting of the concatenation of all 20-byte SHA1 hash values, one per piece.
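To make the metainfo fields concrete, here is a minimal sketch that bencodes a small dictionary, builds the `pieces` string, and derives the SHA1 hash of the `info` dictionary. The file content, file name, tracker URL, and the helper names `bencode` and `hash_pieces` are assumptions for illustration; the `name` and `length` keys are the usual single-file fields and are not listed on the slide.

```python
import hashlib


def bencode(value):
    """Encode ints, strings/bytes, lists, and dicts in bencoding."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys are byte strings and must appear in sorted order.
        items = sorted(
            ((k.encode("utf-8") if isinstance(k, str) else k, v)
             for k, v in value.items()),
            key=lambda kv: kv[0],
        )
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


def hash_pieces(data, piece_length):
    """Concatenate the 20-byte SHA1 digest of every piece of data."""
    return b"".join(
        hashlib.sha1(data[i:i + piece_length]).digest()
        for i in range(0, len(data), piece_length)
    )


data = b"x" * 40000        # stand-in for the real file content
piece_length = 16384       # assumed piece size in bytes
info = {
    "name": "example.bin",
    "length": len(data),
    "piece length": piece_length,
    "pieces": hash_pieces(data, piece_length),
}
metainfo = {"announce": "http://tracker.example.org/announce", "info": info}

torrent_bytes = bencode(metainfo)                     # contents of the .torrent file
info_hash = hashlib.sha1(bencode(info)).hexdigest()   # identifies the torrent
print(len(info["pieces"]) // 20, "pieces, info hash", info_hash)
```

Because the hash covers only the `info` dictionary, changing the tracker URL or the comment would leave it unchanged in this sketch.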
22. Check and configure firewall and/or router for BitTorrent (if applicable)
27. Create a new .torrent file. Publish the .torrent file on torrent search and index sites such as PirateBay.
28. Peer-peer transactions: choosing pieces to request. Rarest-first: look at all pieces at all peers, and request the piece that's owned by the fewest peers. Increases diversity in the pieces downloaded: avoids the case where a node and each of its peers have exactly the same pieces; increases throughput. Increases the likelihood that all pieces are still available even if the original seed leaves before any one node has downloaded the entire file.
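The rarest-first rule can be sketched in a few lines. This is a minimal illustration, assuming each peer's holdings are known as a set of piece indices; the function name `rarest_first` and the tie-breaking rule (lowest index, to keep the example deterministic) are assumptions, and a real client may break ties differently.

```python
from collections import Counter


def rarest_first(my_pieces, peer_pieces):
    """Return the index of the missing piece held by the fewest peers.

    my_pieces   -- set of piece indices we already have
    peer_pieces -- dict mapping a peer name to the set of pieces it holds
    Returns None when no peer offers a piece we still need.
    """
    counts = Counter()
    for pieces in peer_pieces.values():
        counts.update(pieces - my_pieces)  # count only pieces we lack
    if not counts:
        return None
    # Fewest owners first; ties go to the lowest piece index.
    return min(counts, key=lambda piece: (counts[piece], piece))


peers = {"A": {0, 1, 2}, "B": {0, 1}, "C": {0, 3}}
print(rarest_first({0}, peers))
```

Here pieces 2 and 3 each have a single owner while piece 1 has two, so one of the single-owner pieces is requested first.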
29. Choosing pieces to request. Random first piece: when a peer starts to download, request a random piece, so as to assemble the first complete piece quickly and then participate in uploads. When the first complete piece is assembled, switch to rarest-first.
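The switch between the two policies might look like the sketch below. It assumes the client already knows how many peers hold each piece (the `rarity` mapping); the function name `choose_piece` and its parameters are hypothetical.

```python
import random


def choose_piece(my_complete, available, rarity, rng=random):
    """Pick the next piece to request.

    my_complete -- set of piece indices fully downloaded so far
    available   -- set of piece indices offered by at least one peer
    rarity      -- dict mapping piece index to the number of peers holding it
    rng         -- source of randomness (injectable for reproducible runs)
    """
    wanted = sorted(available - my_complete)
    if not wanted:
        return None
    if not my_complete:
        # No complete piece yet: request any piece at random.
        return rng.choice(wanted)
    # At least one complete piece: switch to rarest-first.
    return min(wanted, key=lambda piece: (rarity[piece], piece))


rarity = {0: 3, 1: 2, 2: 1}
first = choose_piece(set(), {0, 1, 2}, rarity, random.Random(0))
later = choose_piece({0}, {0, 1, 2}, rarity)
print(first, later)
```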
30. Why BitTorrent took off. Better performance through "pull-based" transfer: slow nodes don't bog down other nodes. Allows uploading from hosts that have downloaded parts of a file.
31. Why BitTorrent took off. Practical reasons (perhaps more important!): Working implementation (Bram Cohen) with simple, well-defined interfaces for plugging in new content. Many recent competitors got sued / shut down (Napster, Kazaa). Users use well-known, trusted sources to locate content, which avoids the pollution problem, where garbage is passed off as authentic content.
32. Pros and cons of BitTorrent. Pros: Proficient in utilizing partially downloaded files. Discourages "freeloading" by rewarding the fastest uploaders. Encourages diversity through "rarest-first", which extends the lifetime of the swarm. Works well for "hot content".
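Rewarding the fastest uploaders could be sketched as follows: a peer keeps uploading only to the few peers that currently upload to it at the highest rate. The function name `pick_unchoked`, the number of slots, and the rate figures are assumptions for illustration; real clients add refinements that are not modeled here.

```python
def pick_unchoked(upload_rates, slots=3):
    """Return the peers we keep uploading to.

    upload_rates -- dict mapping a peer name to the rate (e.g. KB/s) at
                    which that peer is currently uploading to us
    slots        -- how many peers we serve at once (assumed value)
    """
    # Highest rate first; ties broken by name to keep the result stable.
    ranked = sorted(upload_rates, key=lambda peer: (-upload_rates[peer], peer))
    return ranked[:slots]


rates = {"A": 50, "B": 10, "C": 80, "D": 0}
print(pick_unchoked(rates, slots=2))
```

Peer D uploads nothing and therefore never makes the cut in this sketch, which is the sense in which freeloading is discouraged.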
33. Pros and cons of BitTorrent. Cons: Assumes all interested peers are active at the same time; performance deteriorates if the swarm "cools off". Even worse: no trackers for obscure content.
34. Pros and cons of BitTorrent. Dependence on a centralized tracker: pro/con? Single point of failure: new nodes can't enter the swarm if the tracker goes down. Lack of a search feature: prevents pollution attacks, but users need to resort to out-of-band search (well-known torrent-hosting sites / plain old web search).
35. "Trackerless" BitTorrent. To be more precise, "BitTorrent without a centralized tracker". E.g. Azureus. Uses a Distributed Hash Table (Kademlia DHT). Tracker run by a normal end-host (not a web server anymore): the original seeder could itself be the tracker, or a node in the DHT could be randomly picked to act as the tracker.
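Kademlia measures the distance between two IDs as their bitwise XOR. One way a DHT can decide which end-hosts take on the tracker role for a torrent is to pick the nodes whose IDs are closest to the torrent's hash; since hashes are effectively random, so is the choice of node. The sketch below assumes 160-bit IDs derived with SHA1; the node names and the helpers `to_id` and `closest_nodes` are hypothetical.

```python
import hashlib


def to_id(name):
    """Derive a 160-bit integer ID from a name (illustrative only)."""
    return int.from_bytes(hashlib.sha1(name.encode("utf-8")).digest(), "big")


def closest_nodes(key, node_ids, k=2):
    """Return the k node IDs with the smallest XOR distance to key."""
    return sorted(node_ids, key=lambda node: node ^ key)[:k]


nodes = {to_id(f"node-{i}"): f"node-{i}" for i in range(8)}
torrent_key = to_id("example torrent")
for node in closest_nodes(torrent_key, nodes):
    print(nodes[node], "would hold the peer list for this torrent")
```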
36. Why is (studying) BitTorrent important? BitTorrent consumes a significant amount of internet traffic today. In 2004, BitTorrent accounted for 35 to 60% of all internet traffic (according to CacheLogic). BT has also always been used for legal software (Linux ISO) distribution.