
Development of a Tor library


  1. 2014 MASTERS ENGINEERING PROJECT. Richard Dennis (500198), University of Portsmouth. Development of a Tor library. Supervisor: Dr Gareth Owen. Moderator: Dr Nick Savage.
  2. Abstract. Given the increasing popularity of Tor following the Edward Snowden revelations, the dark net has been the subject of much debate, both amongst academics and in the media. Existing Tor libraries require Tor to be installed on the user's PC and are currently incapable of conducting an attack against Tor, a capability that would be advantageous to law enforcement agencies at local, national and international levels. This leaves room for further development within the field, and it was for these reasons that the current study was undertaken. The project aims to advance the field by creating an intuitive Tor library that does not rely on an existing Tor client and that can provide a foundation for further development that could lead to the de-anonymization of Tor users. The outcome was a fully functioning independent library that, following extensive testing, was capable of conducting an initial Denial of Service attack against a Tor node. The results of this attack were inconclusive; however, the developer fulfilled his objective of creating a forward-looking library that provides a solid base for future development in the field. It is hoped that the research and code developed throughout this project will contribute to the development of an attack on Tor that can de-anonymise users of the network, and thus make a valuable contribution to law enforcement and the field of cyber security.
  3. Acknowledgements. I wish to thank Dr Gareth Owen for providing valuable guidance and support throughout the course of the project in his role as project supervisor.
  4. Contents
     1. Introduction
        1.1. Introduction
        1.2. Rationale
        1.3. The problem
        1.4. Objectives
        1.5. Constraints
        1.6. Project Deliverables
        1.7. Report Structure overview
     2. Literature Review
        2.1. Introduction
        2.2. Analysis
             What is Tor?
             History of Tor
             How Tor works
             Hidden services
             Who uses Tor and for what?
             Current Tor development projects
             Current attacks on Tor
             Conclusion
     3. Project Management
        3.1. Brief description of chapter
        3.2. Select methodology
        3.3. Requirements elicitation
        3.4. Risk analysis
        3.5. Schedule
        3.6. Professional issues
        3.7. Conclusion of the section
     4. Application Development
        4.1. Iteration 1
             4.1.1. Requirements
             4.1.2. Design
             4.1.3. Implementation
             4.1.4. Testing
             4.1.5. Moving forward from Iteration 1
        4.2. Iteration 2
  5. Contents (continued)
             4.2.1. Requirements
             4.2.2. Design
             4.2.3. Implementation
             4.2.4. Testing
             4.2.5. Moving forward from Iteration 1
        4.3. Iteration 3
             4.3.1. Requirements
             4.3.2. Design
             4.3.3. Implementation
             4.3.4. Testing
             4.3.5. Moving forward from iteration 3
        4.4. Iteration 4
             4.4.1. Requirements
             4.4.2. Design
             4.4.3. Implementation
             4.4.4. Testing
        4.5. Iteration 5
             4.5.1. Requirements
             4.5.2. Design
             4.5.3. Implementation
             4.5.4. Testing
     5. Evaluation
     6. Summary, conclusion and recommendations
     7. Bibliography
     8. Appendixes
        Appendix 1 – Project Initialization document
        Appendix 2 – Ethical checklist
        Appendix 3 – Original planned project schedule
        Appendix 4 – Project schedule updated due to iteration 1 overrunning
        Appendix 5 – Project schedule updated due to iteration 2 overrunning
        Appendix 6 – Suitability analysis questionnaire
        Appendix 7 – Black box testing questionnaire
        Appendix 8 – Evaluating functional requirements
        Appendix 9 – Evaluating non-functional requirements
        Appendix 10 – Actual project schedule
  6. List of Figures
     Figure 1 - Onion routing circuit
     Figure 2 - Hidden service architecture
     Figure 3 - Agile methodology diagram
     Figure 4 - Table showing possible elicitation methods
     Figure 5 - Table containing potential risks and countermeasures
     Figure 6 - Table containing requirements for iteration 1
     Figure 7 - Non-functional requirements
     Figure 8 - Code snippet showing connection to a Tor node
     Figure 9 - Testing results for iteration 1
     Figure 10 - Functional requirements for iteration 2
     Figure 11 - Code snippet showing the creation of Diffie-Hellman public and private keys
     Figure 12 - Code snippet showing the Tor Circuit class
     Figure 13 - Code snippet showing the decoding of a Create cell
     Figure 14 - Code snippet showing the decryption function
     Figure 15 - Code snippet showing the creation of the Stream cell
     Figure 16 - Code snippet showing the conversion of error codes to English
     Figure 17 - Testing results iteration 2
     Figure 18 - Requirements iteration 3
     Figure 19 - Code snippet showing the creation of a rendezvous point
     Figure 20 - Code snippet showing how the service is downloaded and saved to a text file
     Figure 21 - Testing results iteration 3
     Figure 22 - Requirements for iteration 4
     Figure 23 - Code snippet showing how the service descriptor data is extracted
     Figure 24 - Code snippet showing base 64 decoding
     Figure 25 - Code snippet showing dynamic methods to extract data from a service descriptor
     Figure 26 - Code snippet showing how to connect to an introduction point
     Figure 27 - Code snippet showing the packet creation of stream to a hidden service
     Figure 28 - Testing results iteration 4
     Figure 29 - Diagram showing the bandwidth saturation attack
     Figure 30 - Configuration screenshot showing the max bandwidth allowed per day on a Tor node
     Figure 31 - Code snippet showing a function to allow code to run every 24 hours
     Figure 32 - Code snippet showing the DoS attack
  7. 1. Introduction
     1.1. Introduction
     This introduction discusses the motivation and reasons behind this project, as well as the project objectives and constraints, before concluding with a brief summary of all the chapters in this report.
     1.2. Rationale
     Since the Edward Snowden revelations regarding government surveillance, more and more people have wanted privacy and anonymity when using the internet. The Onion Router (hereafter Tor) provides such a service, and is widely used by people who want to remain anonymous on the internet and by people in countries where internet censorship is a major issue. As well as providing anonymity to internet users, Tor also allows a website or service to be hosted within the Tor network; these are known as hidden services. A normal website is hosted on the internet, so its hosting location can be found and visitors to the site can be monitored, whereas a hidden service provides anonymity to both the person hosting the service and the site's users. This can lead to websites offering illegal material, such as child pornography, or illegal goods, such as drugs, being hosted on Tor, as both the users and the host are anonymous. This demonstrates that being anonymous on the internet brings with it a number of issues, one of which is accountability. For example, if somebody has used the internet for nefarious means, how can we prove that they accessed a certain website or illegal service?
     1.3. The problem
     Despite the increased exposure and popularity of Tor, development in the field is currently almost non-existent. Although many research papers look at theoretical attacks on Tor (Borisov et al., 2013; Biryukov et al., 2013b; Jansen, 2014), there is currently no Tor application, library or framework that would facilitate these theoretical attacks. Currently, to use Tor, a user downloads and installs it on their machine. The application is very limited in terms of its use; it is designed only to connect a user to the Tor network and allow them to use a pre-configured web browser to use the internet anonymously. Although Tor is an open-source project, meaning that development of the Tor client is possible, the sheer size and complexity of the application make developing it extremely challenging. This means there is little room to build upon or further develop the current application, for example to increase the security of Tor or to program attacks that de-anonymise users. There is a clear need for an application that can connect a user to the Tor network and that provides all the functionality of the current Tor application, but that can also be expanded and built upon, with a final goal of implementing an attack such as a denial of service or de-anonymization attack.
     1.4. Objectives
     The objective of the project is to create a Tor library which, unlike previous Tor libraries, does not require the user to install Tor. This makes the project unique. It also makes the project more challenging, as the application will need to use the Tor protocol to communicate with the Tor network. This involves challenges such as encrypting and decrypting packets sent to and from Tor nodes and communicating with hidden services. The project aims to develop a fully functional library whose functionality is comparable to that of the Tor client and which, in the wider context, can be used as a base for future projects such as attacking or hacking Tor.
  8. 1.5. Constraints
     As with all projects, many constraints were faced during the development of the project, both internal and external.
     • Hard deadline – 12th September 2014. On this day all project deliverables must be handed in to the client.
     • Zero budget.
     • Availability of software – Although the software used will be open source because of the budget constraint, support for or access to the software may be limited.
     • Knowledge of the problem is limited; gaining the necessary understanding of Tor, as well as learning the new skills required for the project, will take a large amount of time.
     • Other commitments will interfere with the project; good project and time management are required.
     1.6. Project Deliverables
     The deliverables for this project are:
     • A Tor library that can connect to Tor and to Tor's hidden services, and which can be expanded on
     • A report documenting the development of the application, consisting of:
          o Literature review
          o Project management
          o Specification and discussion of the requirements
          o Application development
          o Summary of the project
          o Evaluation against requirements
          o Conclusion of the project
          o Bibliography
     1.7. Report Structure overview
     Chapter 1 – Introduction
     Chapter 2 – Literature review: Firstly, this chapter analyses what methods are available to ensure that the most appropriate documents were chosen for review. Next is the literature review itself, which covers the following topic areas:
          • What is Tor? (Historical and technical discussion)
          • How does Tor provide its users with anonymity and privacy?
          • Uses of Tor
          • What is the social impact of Tor?
          • Alternatives to Tor
     Chapter 3 – Project management: Discusses why the method of project management chosen for this project was the most appropriate choice, and the process used to elicit requirements for the project. Next, a risk analysis and various countermeasures are discussed, before the chapter concludes with a brief discussion of the intended schedule and any relevant professional issues that could arise.
     Chapter 4 – Application Development: Looks at each iteration of development. The requirements of each iteration are discussed, followed by a description of the design process and the implementation procedure. The testing of each iteration is explained, and the chapter concludes which features will be carried forward to, or dropped from, the application in the following iteration.
  9. Chapter 5 – Evaluation against requirements: Evaluates whether the project has met the requirements outlined earlier in the development. It looks at how the requirements have been met or exceeded, before looking at the requirements that have not been met, explaining why they were not met and recommending how they might be achieved.
     Chapter 6 – Conclusion of the project: Reflects on the project as a whole. Looks at how the project was developed, what mistakes were made during development and what has been achieved. This chapter discusses what the project has brought to the field of information security and possible future development of the project.
     Chapter 7 – Bibliography: Lists all the references used throughout this report, as well as sources used throughout the development of the project.
     Chapter 8 – Appendixes: Contain all the figures, tables and sections of code, along with any other information, such as testing strategies, that accompanies the report.
  10. 2. Literature Review
     2.1. Introduction
     Tor used to be unheard of, except within the tech community and amongst users of illegal sites. However, since 2013, when the Edward Snowden leaks revealed that GCHQ and the NSA are unable to de-anonymize all Tor users, Tor has been cast into the public eye. Now, with 200,000 daily connected users (Tor Project, 2014), Tor and its use are debated more than ever. This report examines Tor in detail: what it is, who uses it and for what reasons. It then discusses the illegal side of Tor, before looking into why Tor needs to be investigated and what has been done so far. Further research and review of the relevant literature are carried out throughout the project, as research into the Tor protocol, as well as into current attacks (not necessarily against Tor), will be required.
     2.2. Analysis
     What is Tor?
     Tor is an open source project: a decentralized, low latency mix network of specially configured nodes, commonly called relays or bridges, which transmits only TCP traffic through virtual tunnels from a client to a destination, usually, but not exclusively, the Internet. Tor has been incorrectly described as having a single authority and as being a Virtual Private Network (VPN) or a peer-to-peer (P2P) network (Hurley et al., 2013, p. 1), but this is not the case. Tor is described by McCoy et al. as a “privacy enhancing system, designed to protect the privacy of Internet users from traffic analysis” (McCoy et al., 2008, p. 63). Syverson et al. elaborate on this, stating that Tor provides anonymous connections to the Internet, providing protection against traffic analysis as well as eavesdropping (Syverson, Goldschlag & Reed, 1997, p. 44). Both groups of scholars agree that Tor provides anonymity. As well as providing users with anonymity and privacy, Tor can be a tool for anti-censorship. Endorsed by the Electronic Frontier Foundation and other civil liberties groups, one use of Tor is as a means of communication between journalists, whistle-blowers and human rights workers (Levine, 2014). Tor is also the network supporting what the media commonly refer to as the 'Darknet', due to the encrypted nature of the network and its association with illegal, or ‘dark’, activities.
     History of Tor
     Roger Dingledine is largely credited with being one of Tor’s creators; in 2004 he was part of the team that released the paper Tor: The Second-Generation Onion Router (Dingledine, Mathewson & Syverson, 2004). Levine explains how the concept behind Tor, “onion routing”, can be traced back to 1995, and argues that this should be considered the origin of Tor (Levine, 2014). Michael Reed, one of Tor’s creators, revealed that Tor was originally created for military intelligence usage, such as open source intelligence gathering, and that the reason for releasing it as open source was to provide better cover traffic to hide what the network was really being used for (Reed, 2011). Reed expands upon this, adding that Tor was not designed to help dissidents in repressive countries or to help criminals avoid law enforcement (Reed, 2011).
  11. Until 2013, Tor was almost unheard of except amongst technology and criminal circles. In 2013, Edward Snowden made a series of high profile disclosures about several global surveillance programs. One leaked document, “Tor Stinks”, made international news; it describes how the NSA tried to compromise Tor anonymity (The Guardian, 2013). Silk Road, a Tor hidden service also known as the eBay for drugs, was an online marketplace where users could buy and sell drugs around the world (Barratt, 2012, p. 683), and was hailed as a “criminal innovation” (Aldridge & Décary-Hétu, 2014). It was taken offline by the FBI in 2013. This was the first public demonstration of a government agency taking down a website hosted on Tor, and it attracted significant international media attention (Greenberg, A., 2013b). However, the effectiveness of the closure has been questioned by Greenberg, who reports that Silk Road 2.0, an updated and improved Silk Road, was online just one month after the closure of Silk Road (Greenberg, A., 2013a). Since these leaks and the FBI takedown of Silk Road, Tor has never been more popular (Jeffries, 2013); its popularity has sky-rocketed and it now has around 2.5 million monthly users (Tor Project, 2014).
     How Tor works
     Tor is built on second generation onion routing. Syverson et al. (2000, p. 1) state that one of the major reasons for the change of design from generation 1 to generation 2 was to be able to release the source code for public distribution, as the patent on generation 1 onion routing prevented this. Dingledine et al. (2004, p. 1) describe how another major reason for changing the design of onion routing was to solve “many critical design and deployment issues that were never resolved”, and state that the “design has not been updated in years”. Tor creates a “circuit” through several, usually three, Tor nodes: a guard (entry) node, a relay and an exit node. Borisov et al. (2007) strengthen the case for selecting three nodes by demonstrating how increased circuit lengths compromise anonymity. Once the nodes have been selected and the user wishes to send a message through the Tor network, the message is encrypted in layers, like an onion, with the keys of all three nodes. The Diffie-Hellman (DH) handshake is used between each relay and the client to create the session key (Jagerman et al., 2014, p. 4). However, the DH handshake has also been criticized, with some scholars proposing a protocol based on ElGamal key agreement, although this has not currently been implemented (Øverlier & Syverson, 2007, p. 4). Catalano et al. are also critical of the current onion routing protocol, claiming that it has a high round complexity which affects the running time, although they agree that Tor onion routing provides forward secrecy and is secure (Catalano et al., 2011, p. 255). Loesing argues that using the DH key exchange makes building circuits in Tor “a time consuming task”, although he fails to fully justify this statement (Loesing, 2009, p. 48). Dingledine et al. (2004), Øverlier and Syverson (2007), and Loesing (2009) all acknowledge that using DH to create the circuits in Tor achieves perfect forward secrecy, with Loesing explaining that if an attacker were to collect and store all traffic, trying to force the nodes to decrypt it would be ineffective because of the “telescoping” approach (2009, p. 45). This means that once the session keys are deleted, a relay cannot be forced to decrypt old traffic. In a similar vein, Øverlier & Syverson (2007, p. 3) demonstrate how, because of the forward secrecy feature of Tor, such attacks do not succeed. The message is encrypted with the last node’s key first; the encrypted message is then encrypted again with the second node’s key, and finally this message is encrypted with the first node’s key.
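The layering order described above can be sketched in a few lines of Python. This is a minimal illustration only, not the code developed in this project: the helper names (keystream, xor_layer, wrap, peel) are hypothetical, the session keys are random bytes rather than the output of a DH handshake, and a toy SHA-256 keystream stands in for the real cipher used by Tor. Because the toy cipher is XOR-based its layers would commute; the order is followed here only to mirror the description.

```python
import hashlib
import os


def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 in counter mode (an assumption, not Tor's cipher)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def xor_layer(key: bytes, data: bytes) -> bytes:
    """Add or remove one layer; XOR with the keystream is its own inverse."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


def wrap(message: bytes, hop_keys: list[bytes]) -> bytes:
    """Encrypt for a circuit; hop_keys is ordered guard, relay, exit."""
    for key in reversed(hop_keys):  # exit key first, guard key last
        message = xor_layer(key, message)
    return message


def peel(cell: bytes, hop_keys: list[bytes]) -> list[bytes]:
    """Return the cell as it looks after each hop has removed its layer."""
    views = []
    for key in hop_keys:  # guard peels first, exit peels last
        cell = xor_layer(key, cell)
        views.append(cell)
    return views


# Hypothetical per-hop session keys (in Tor these come from the handshake).
hop_keys = [os.urandom(16) for _ in range(3)]
onion = wrap(b"GET / HTTP/1.0", hop_keys)
views = peel(onion, hop_keys)
```

In this sketch only the last entry of views, the cell as seen after the exit node has removed its layer, equals the original message; the guard and the middle relay see only ciphertext.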
  12. Onion routing is shown in Figure 1.
     [Figure 1 - Onion routing circuit. Labels: guard node, relay node, exit node; data inside the circuit is carried over the Tor protocol; data leaving the exit node is unencrypted unless using HTTPS.]
     Dingledine et al. (2004, p. 5) explain how a message sent through an onion routing network is then decrypted: the message arrives at the first node, which removes its layer of encryption; the only information this reveals is which node to send the packet to next. This is repeated at each node in the circuit, until the message is sent out to the Internet. The removal of a layer of encryption is like peeling a layer off an onion, hence the name onion routing. Apart from the three main node types (entry, relay and exit), there is another type of relay, the bridge; this is a relay that is not listed in the consensus, and it is used as the first hop in the circuit to route traffic out of a certain country. Ling et al. (2012, p. 2381) conclude that Tor bridges are critical to counter censorship blocking. Winter and Lindskog reinforce this viewpoint by explaining how China is blocking all public relays in the consensus, stating that only “1.6% of public relays are able to be connected to” (2012, p. 11); this suggests that bridges, being unlisted, are harder to block.
     Hidden services
     Services such as websites can also be hosted within the Tor network. These are known as “Hidden Services” and can only be visited through Tor. Tor Hidden Services were added in 2004, when the second generation of onion routing was developed (Dingledine et al., 2004). Loesing (2009, p. 36) describes how Tor allows a TCP-based service to be accessed while concealing the IP address of the hosting server. This enables a user (Alice) to connect to another user's (Bob’s) server without knowing where, or who, it is. Loesing (2009, p. 40) also succinctly sums up the hidden service design as “connecting two circuits created by client and server on a common rendezvous point”. Biryukov et al. (2013, p. 82) expand on Loesing’s explanation, stating that all communication between a client and the hidden service is done through a rendezvous point (RP), which connects the circuit from the user to the circuit from the hidden service. These RPs are mutually agreed points (Dingledine et al., 2004).
Dingledine et al. (2004) also discuss how to connect to a hidden service: the user first uses a hidden service descriptor to search the distributed hash table, a lookup service that informs the user which introduction points (IPs) are serving the hidden service. The client then tells the hidden service, through one of these IPs, which RP will be used, and the RP is then used to communicate with the hidden service.

Who uses Tor and for what?

Tor was originally designed for military intelligence gathering (Reed, 2011); however, its user base has since become more diverse. Whilst it is still used by the military and law enforcement, its users now include activists reporting abuse from danger zones (Tor Project, 2014b). This is particularly relevant given the continued fighting in areas such as Iran and Syria; Tor has been instrumental in getting reports from within these countries to the outside world. Tor was even awarded the FSF's Award for Projects of Social Benefit for its role in the revolutions in the Middle East (Sullivan, 2011).

Since the Snowden revelations about the surveillance program PRISM, a data mining program used by the NSA and GCHQ to store Internet communications from companies such as Google and Yahoo, Tor has been increasingly used by ordinary people seeking privacy online and wanting to prevent government agencies from monitoring them. Munson supports this view by showing that Tor usage has increased dramatically since the Snowden leaks, which he feels suggests that Tor is being used by users seeking privacy online (Munson, 2013). Arma (2013), however, contradicts Munson’s conclusion, stating that, due to the increase in “ESTABLISH_RENDEZVOUS requests”, this growth is instead likely to come from a botnet, although he also acknowledges that some growth can be attributed to activists in Syria and the United States.
The distribution of copyrighted digital material has moved from peer-to-peer to the “Darknet” (Wood, 2010, p. 1); this contradicts the Tor Project’s message that users are only using Tor for good, and legal, means. Wood believes users of the “Darknet” exploit the anonymity provided by Tor to illegally download material without being traceable. Biryukov et al. (2013b, p. 2) strengthen the argument that Tor is used for illegal purposes; they show that 44% of hidden services were hosting illegal content such as drugs, pornography and illegally copied copyrighted material.

[Figure 2 - Hidden service architecture. Figure from Dingledine et al., 2004, p. 3.]

Arguably the most famous hidden service is Silk Road, also known as the eBay for drugs (Barratt, 2012, p. 683). Reportedly, the Silk Road had between 30,000 and 150,000 active customers (Christin, 2012, p. 2). The Silk Road was used to make significant numbers of transactions; Konrad (2013) reports that the Silk Road’s owner and operator, Ross William Ulbricht, handled $1.2 billion of transactions in the 2.5 years before the FBI seizure. This clearly shows how popular illegal activities are on Tor, greatly contradicting the Tor Project’s stance that Tor is used exclusively for good and instead showing that
Tor is being used to conduct illegal activity anonymously, thus making traceability and accountability extremely difficult.

Cox (2014) cites Bartlett’s radical views about Tor. Bartlett sets out to discover “the range of things that people do under the conditions of anonymity” (Cox, 2014) and implies that, under the cover of anonymity, people will do more distressing and ‘dark’ things, for example talking to “trolls on pro-anorexia forums” (Cox, 2014). He goes on to suggest that human nature finds a haven on the Internet and uses the tools available to enable this; in this example Tor allows these illegal and immoral activities to be conducted anonymously. While this view of why Tor is being used for illegal purposes may be a radical one, it does not hide the fact that a large percentage of Tor is used for illegal activities by criminals using the blanket of Tor’s anonymity as protection from prosecution.

Current Tor development projects

With Tor being open source, developers are free to use the source code to modify and develop their own applications using Tor. As well as being able to modify the source code, there are existing libraries which allow developers to use Tor in their projects. Stem, a Python controller library for Tor, requires Tor to be installed on the machine and controls Tor through the control port (usually port 9051). Winter demonstrates how this can be used to create circuits as well as streams within these circuits, although he implies that the lack of features and functionality in Stem inspired him to create his own application rather than using the existing Stem library (2014, p. 3). However, Atagar praises Stem for its friendly API and documentation, while simultaneously criticizing the lack of backward compatibility it offers (Atagar, 2012). Txtorcon, previously TorCtl, is another Python controller library. It works in exactly the same way, using the control port to control Tor.
The TorCtl library was used for the development of the Torbutton Firefox extension. Meejah (2014) describes Txtorcon as “an asynchronous API to speak the Tor client protocol in Python” and believes the main goal of Txtorcon is to enable applications to use the Tor network to improve people's privacy and anonymity on the Internet. Atagar draws a comparison between both libraries, praising the fact that both have “extensive test suites and are being very actively maintained” (Atagar, 2012) and noting that both just control the Tor client.

Current attacks on Tor

With such a diverse client base, it is easy to see who may want to attack Tor, and their reasons for doing so. For example, law enforcement will want to catch paedophiles who access hidden services containing child abuse material, whilst regimes such as China or Iran want to prevent users accessing Tor and so may try to take down Tor altogether.

Denial-of-Service (DoS)

Wood and Stankovic (2002, p. 55) describe a DoS attack as “any event that diminishes or eliminates a network’s capacity to perform its expected function”. Borisov et al. agree with this definition and also claim that, as well as the blanket DoS which affects the whole network, a selective DoS attack can target just one small section of the network (Borisov et al., 2007). Unlike Wood and Stankovic (2002, p. 55), Wang et al. (2004, p. 193) describe a DoS attack more narrowly as the flooding of a node with traffic, such as requests, until the node is unable to function.
Jansen (2014) details a selective DoS attack that targets either the entry or exit node. The attack works by requesting data from a source such as a file server and then having the client stop reading from the TCP connection; by exploiting Tor’s flow control, the client keeps requesting more data from the source, which causes the memory use of the target node to grow until the OS terminates the Tor process on the relay. Jansen (2014) acknowledges the effectiveness of the attack, but he also discusses several simple defences that would prevent it, such as the implementation of authenticated SendMe cells to prevent the flow control being misused.

De-anonymization of hidden services

Rob Jansen expands on his aforementioned DoS attack so that it can be used to de-anonymize hidden services, building on the attack first developed by Biryukov, Pustogarov and Weinmann (2013). This attack requires the current guard of a selected hidden service to be known and taken offline, forcing the hidden service to select a new guard node; this is repeated until the attacker’s guard node is picked. However, this requires the attacker to run a compromised guard relay, and Jansen is critical of this attack’s chances of success. He suggests that if middle guards were used, the attack would not work. Jansen also criticises the attack’s method of taking a node offline, stating that if the node were correctly configured to prevent a sniper attack, this step would fail. He goes on to suggest that if the node were simply to reboot after it was taken offline, this alone would be enough to render the attack ineffective (Jansen, 2014). ASN (2013) frame this attack in a new light: instead of trying to prevent the attack, they suggest the core design of Hidden Services is flawed and in need of a redesign if it is to continue to be secure and effective.
Conclusion

The review has enabled the developer to gain a solid understanding of what Tor is, how it works and who it is used by. In addition, it has provided the opportunity for existing Tor libraries to be reviewed. By comparing Stem and Txtorcon it became clear that both offer the same level of functionality, but in some areas this is at such a low level that they can be considered to be severely lacking, as they only allow an application to control the Tor client and not to control Tor directly. From this section it is clear that this is an area of research that is significantly lacking, and would benefit greatly from more research and development.

The section on current attacks on Tor is also extremely relevant to this project; it analysed two different attacks, a DoS attack and an attack which de-anonymised hidden services. This section in particular featured analysis from several authors who were not in agreement over the effectiveness of the attacks demonstrated, but who all agree on one thing: Tor is vulnerable to attacks.

Overall this literature review has demonstrated that while Tor is a network that has been around for over 10 years, and has been heavily researched, there are still many opposing viewpoints regarding certain topics such as attacks. It has shown why Tor is more popular today than it ever has been, and the impact that the anonymity of Tor has on how Tor is used. Tor development libraries were shown to be an area in which little research has been conducted, and where the current solutions have been met with heavy criticism and opposing views.
3. Project Management

3.1. Brief description of chapter

This chapter discusses the project and requirement elicitation methodologies that were used for this project. It also considers the project’s potential risks and the countermeasures that were implemented to mitigate these, as well as discussing the original schedule designed for the project.

3.2. Select methodology

Project management has been used in various forms for centuries, but only since the 1950s has the modern concept of project management existed (Kwak, 2003, p. 1). The origin of modern project management is disputed, although there are records of it being used in the 1950s by US defence, Xerox, Bell Laboratories and even NASA. Modern project management can be defined as the “application of knowledge, skills and techniques to execute projects effectively and efficiently” (PMI, 2014). Since the 1950s, project management has become a necessity on all projects; it reduces the risk of the project failing or of the wrong features being developed, and ensures that the project is completed efficiently and on time.

The Agile Method of project management was chosen for this project. The Agile Method favours “working software over comprehensive documentation” (Paulk, 2002, p. 15). In this project the client only expects a working application and this report to be produced; no other documentation is required. This makes the Agile Method a good choice for this project, as it permits development to be started immediately and provides the client with what they want, whereas methods such as the Waterfall method focus a lot more on documentation, thus delaying the start of development.

Another reason for choosing this method was that it promotes adaptability throughout the project’s lifecycle. Unlike other methods, such as the Waterfall Model, the Agile Method allows and encourages changes to be made as and when they are required.
This ensures development is continuous and will not be stopped by a requirement that cannot be implemented. This method also allows for regular testing and for working software to be shown frequently to the client; this promotes client feedback, and any necessary changes can be implemented immediately with little cost to the design and development of the project. This would not be possible with other traditional methodologies, in which changes can be costly and can usually only be implemented at the end of the development. This feature also mitigates risks and ensures that a quality application is produced; any bugs or issues are identified within an iteration (a short timescale of around three weeks) and can be dealt with at that point, rather than having long-lasting effects on the application as would be the case with the Waterfall model.

A further benefit of showing the client frequent iterations of the project is that they get to see the progress that is being made, which on a complex project allows them to gain a sense of the challenges faced by the developer and to feel that they are having an input in the application’s development. If the Waterfall or Spiral models are used, the client only receives the end product and thus does not gain the same understanding of the project and the issues that the developer faced. The Agile Method can therefore be argued to lead to better customer relations.

[Figure 3 - Agile methodology diagram. Figure from: Stack Exchange, 2013.]
The Agile Method also promotes continual improvement of the application by taking positive, and negative, aspects of the current iteration forward to be expanded upon in the next iteration. In other methods, this can only be achieved at the end of development, ready for the next major release. This ensures that positive features are capitalised on and promoted, optimising the quality of the application.

A further benefit of this method over other traditional methods, such as the Waterfall or Spiral models, is that if the development is running behind schedule and the deadline is likely to be missed, it is possible to liaise with the client and choose to focus on core requirements. Dropping any non-critical requirements from the development plan ensures that a working application, albeit one with reduced functionality, can be handed to the client instead of an application that is only half-developed. Using this method therefore increases the chances of the developer meeting the deadline, by keeping the project moving consistently forward rather than grinding to a halt over goals and requirements that cannot be implemented.

3.3. Requirements elicitation

Well thought out, well-structured requirements generally lead to a more successful project, one which meets client expectations and delivers an application that is fit for purpose (Hickey & Davies, 2002). It is therefore important to ensure that the requirements gathered provide sufficient detail, and are realistic and achievable within the timeframe of the project.

“Requirements elicitation is the process of seeking, uncovering, acquiring and elaborating requirements for computer based systems” (Zowghi & Coulin, 2005, p. 1). This process may be time consuming, but it helps to prevent a project going over-budget, being delivered late or failing to provide the required functionality (Jones, 1995, p. 86).

Using the Agile Method means that requirements will be gathered at the start of each iteration. This allows the requirements to take into account any issues faced or lessons learned during the previous iteration, and differs from the Waterfall Method, where the requirements for the entire project are defined prior to starting development.

Using the Agile Method means that the requirements elicitation method will need to be run several times throughout the course of the project. This makes choosing an elicitation method critical; a method that takes a long time to get results would be an inappropriate choice as it would delay the development of the project. For this reason a questionnaire would be considered an inappropriate method of requirement elicitation for this project, as the time spent waiting for replies to the questionnaire could eat into valuable development time.

There is more to requirement elicitation than the client simply telling the developer what they want; a more detailed research process is required. This involves finding out exactly what the client realistically expects from the application, as well as looking at similar projects that have been developed to see what features these offer that could be integrated into the development of the application. This information can then be condensed into well-structured and achievable requirements that can be implemented in the project.
The elicitation methods that were considered for this project are shown in the table below, along with a summary of their appropriateness for the project:

Method | Outcome | Appropriate?
Interview the client (may be formal or informal, in person or via online correspondence) | Greater understanding of what is expected, issues with the current solution etc.; however, the quality of the answers depends on the questions asked. | Yes: this is a core method that will be used. It will allow the developer to understand exactly what the client is expecting, the reasoning behind the project and the time scale in which it is required. Keeping in constant communication with the client will promote customer satisfaction.
Interview the end users (may be formal or informal, in person or via online correspondence) | Understand what the users actually want and whether the client’s and end users’ expectations are the same; again, the quality of the answers depends on the questions asked. | Yes: another core method that will be used to see whether the users want the same features etc. as the client. It also allows information to be gathered that could have been missed in the client interview.
Prototyping | Rapid prototyping would develop a small section of functionality; this could be used to get feedback on the section from users or the client, and could be used to estimate a timescale for the project. | No: failed prototypes carry a high cost, and prototyping is not required due to the chosen Agile methodology; however, a prototype would allow evaluation of the proposed approach to development.
Case study | A report which allows the current system/application to be understood. An example of a case study model is the critical incident technique, which observes human interaction with the current system (Woolsey, 1986). | No: this project does not build upon an existing application and, as case studies are almost always retrospective, this method is not appropriate for this project.
Brainstorming | Could be conducted alongside the interview process; a way for all ideas that may not necessarily have been discussed during the interview to be presented. | Yes: will allow the members involved to discuss ideas that may not be core to the project, but that are revolutionary or have never been done before, and would be welcomed if possible.

Figure 4 - Table showing possible elicitation methods

As the above table shows, three different methods were selected: interviews with both the client and the potential end-users of the application, and brainstorming sessions which will be run in conjunction with the interviews. Using several different methods, and focusing on the end-user as well as the client, ensures that the functions of the application meet the client’s expectations and results in a usable application that meets the requirements of the end-user. Brainstorming with the client allows the developer to pick up on any implicit requirements that have not been explicitly stated by the client but that would still greatly benefit the application and ensure the client is satisfied with the final product.
3.4. Risk analysis

As with any project there is risk involved, and although using the Agile method mitigates risk, there are still some risks to the project. The table below shows the possible risks, their potential impact level and the reduction strategy that will be in place to ensure they are mitigated as much as possible during the project development.

Risk | Impact level | Reduction strategy
Tor network unavailable: Internet access or the Tor network is unavailable. | High | Use a Tor simulator such as the Tor Path Simulator (TorPS).
Poor productivity: the developer’s motivation inhibits the project’s development. | High | Set 20 hours a week minimum for the project, more when needed. Setting small milestones will increase motivation and productivity. Regular meetings with the client will ensure the milestones are met.
Technical risk: the project is too complex to implement. | High | Hold regular meetings with the client, ensuring they are kept up to date with the development, and adjust the requirements to allow for a workaround if possible.
Programmatic risk: the customer changes their mind about wanting the project developed. | High | Find another client or adapt the project to cater for the client’s change of heart.
Inherent schedule flaws: due to the uniqueness of the project, it is difficult to estimate and schedule. | Medium | Better to overestimate than underestimate timescales; use the Agile methodology to renegotiate the schedule with the client.
Requirements inflation: more features that were not identified at the beginning of the project emerge and threaten estimates and timelines. | Medium | Keep in constant contact with the client through regular meetings etc.; only accept more features if the timescale allows.
Specification breakdown: a conflicting requirement only becomes apparent during development. | Medium | Contact the client and work out the solution that would have the lowest impact.
Insufficient resources: unable to develop the project due to not having access to a required resource. | Medium | Check whether the resource is really required, look for ways to reduce resource use earlier in the project, and try to obtain the required resource.
Incorrect budget estimation: the overall cost of the project starts to increase and spiral. | Low | There is a budget of zero for this project; to maintain this, open source software and libraries will be used.

Figure 5 - Table containing potential risks and countermeasures
3.5. Schedule

The project has a hard deadline of September 12th 2014, at which point the application, all documentation and the accompanying report must be completed. A Gantt chart showing the planned schedule can be found in Appendix 3. As the chart shows, some additional time has been allowed to factor in potential delays during the project. However, due to the nature of the project and the methodology used, it is possible that some iterations will take less time than others, and that more or fewer iterations will be needed. This is a very adaptable schedule, and is only used as a base, as it is likely to change once development begins.

3.6. Professional issues

Appendix 2 contains the ethical checklist that accompanies this project. This document revealed that there were no ethical concerns raised by this project.

To ensure copyrighted code is not used, only open source libraries and code will be used and, should code be required from other sources, it will only be used after getting the express written permission of the author/owner.

Should any questionnaires or user feedback be required during the testing phases of the project, this will be conducted anonymously, and respondents will be under no pressure from the developer to take part. User information will never be put at risk and the creation of this application will at no point compromise users’ data or identity.

Any attacks that are developed as part of this project will be conducted on a closed network where the developer has complete control of the Tor node. This ensures that users of the Tor network are not, at any point, affected by the development of this project.

Should a potential professional issue arise during the development of the application, it will be dealt with before it occurs to ensure that the project never breaks any ethical codes or laws.

3.7. Conclusion of the section

This section has justified the use of the Agile Method as a project methodology, having fully considered its advantages and disadvantages over more traditional models, like the Waterfall model. Furthermore, the importance of requirements was considered and the methods used to elicit the requirements for this project were discussed. Potential risks of the project were analysed and countermeasures were put in place to mitigate the possible effects of these risks. Finally, an estimated schedule for the development of the application was drawn up, which factors in some unforeseen delays during the development stages. With these aspects of project management in place, a smoother development should be possible and the application should meet the requirements and be delivered to the client by the deadline.
4. Application Development

4.1. Iteration 1

4.1.1. Requirements

All requirements for this project will be split into two categories: functional requirements and non-functional requirements. Functional requirements describe what the software should do, whilst non-functional requirements judge the operation of the software. By their nature, non-functional requirements can be difficult to evaluate because they tend to be based on the subjective opinion of the assessor rather than being fact-based.

During the requirements elicitation for this iteration, all of the non-functional requirements were elicited. Unless otherwise stated, these will be presumed to apply to each iteration of the project, although they will only be discussed in this section of the report. In the case of this project, the non-functional requirements can be considered as principles, based on the ISO 9126-1 software quality model (ISO, 2001), which the project should aim to meet and which can therefore not be attributed to one specific iteration.

The functional requirements for this iteration were elicited using the aforementioned methods and are shown below:

Requirement | Importance Level
Connect to the Tor network | High
Send and receive a version cell | High
Decode NetInfo cell to extract data from it | High
Handle errors from destroy cells | Medium

Figure 6 - Table containing requirements for iteration 1

The non-functional requirements for this project can be seen in the table below (a blank first column means the quality characteristic is the same as in the row above):

Quality Characteristic | Requirement | Importance Level
Portability | Able to run the application without installing Tor | High
 | Able to run on multiple platforms (Windows, Mac, Linux) | Medium
 | Not require the application to be installed to run | Medium
Reliability | No more than 10 bugs on delivery | High
Efficiency | Use as few computational resources, such as RAM, as possible (no more than 1 GB of RAM) | Low
Usability | No GUI | High
 | Precise and constructive error messages | High
 | Documentation | High
 | Universal naming standard | High
Dependability | Able to operate normally or abnormally without threat to life or environment | Medium
Legal | Only use open source software | High
Maintainability | Able to expand the system to incorporate new features, fix defects or deal with new technology | High
Adaptability | Able to change the system to handle additional domain concepts | Medium

Figure 7 - Non-functional requirements

The importance of considering both functional and non-functional requirements when developing the application can be seen from the first functional requirement: to be able to connect to the Tor network. Clearly, this is a critical requirement; failure to connect to the network will prevent the project from being continued. One simple way to connect to the Tor network would be to install Tor and allow the application to use the Tor client. However, this would violate the first non-functional requirement: to not need to install a Tor client in order to use the application. Failure to consider both functional and non-functional requirements during the development of the application could result in some of the requirements being contradictory, and thus not all of the requirements could be met.

The second functional requirement, to be able to send a version cell, requires a packet to be sent to a Tor node informing it of the protocol version with which the client wishes to communicate. This packet must fulfil the criteria outlined in the Tor protocol specification document (Dingledine & Mathewson, n.d.). This should be a simple requirement to achieve. It is critical to the development of the project as it sets up the communication between the client and a Tor node.

The third requirement, to decode a NetInfo cell, will likely prove to be challenging. The data contained within the NetInfo cell must be extracted accurately and in the correct order.

The first three functional requirements were all considered to be critical; these requirements provide the base upon which the application can be developed.
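To make the second and third functional requirements more concrete, the following Python sketch shows one way a version cell could be built and a NetInfo payload decoded. It is not the project's code. The function names are hypothetical, and the byte layouts are assumptions based on the Tor protocol specification as generally documented: a VERSIONS cell is taken to be a 2-byte circuit ID of zero, a 1-byte command (7), a 2-byte payload length and a list of 2-byte version numbers; a NETINFO payload is taken to be a 4-byte timestamp, one address for the other side, a 1-byte count and then that many addresses for the relay itself, each address being a type byte, a length byte and the value. The specification should be consulted before relying on any of these details. The sketch is written for Python 3, although the report itself targets Python 2.x.

```python
import ipaddress
import struct

# Command numbers assumed from the Tor protocol specification.
CMD_VERSIONS = 7
ADDR_IPV4 = 4
ADDR_IPV6 = 6


def build_versions_cell(versions):
    """Build a variable-length VERSIONS cell advertising the given versions."""
    payload = b"".join(struct.pack(">H", v) for v in versions)
    # CircID (2 bytes, zero) | Command (1 byte) | Length (2 bytes) | Payload
    return struct.pack(">HBH", 0, CMD_VERSIONS, len(payload)) + payload


def parse_versions_cell(cell):
    """Return the list of versions carried in a VERSIONS cell."""
    _circ_id, command, length = struct.unpack(">HBH", cell[:5])
    payload = cell[5:5 + length]
    if command != CMD_VERSIONS or len(payload) != length or length % 2:
        raise ValueError("not a well-formed VERSIONS cell")
    return [version for (version,) in struct.iter_unpack(">H", payload)]


def _read_address(payload, offset):
    """Read one (type, length, value) address; return (text, new_offset)."""
    atype, alen = struct.unpack_from(">BB", payload, offset)
    offset += 2
    raw = payload[offset:offset + alen]
    if len(raw) != alen:
        raise ValueError("truncated address")
    if atype == ADDR_IPV4 and alen == 4:
        text = str(ipaddress.IPv4Address(raw))
    elif atype == ADDR_IPV6 and alen == 16:
        text = str(ipaddress.IPv6Address(raw))
    else:
        text = raw.hex()  # unknown address type: keep the raw bytes as hex
    return text, offset + alen


def parse_netinfo_payload(payload):
    """Decode the fields of a NETINFO payload in the order they appear."""
    (timestamp,) = struct.unpack_from(">I", payload, 0)
    other_address, offset = _read_address(payload, 4)
    count = payload[offset]
    offset += 1
    relay_addresses = []
    for _ in range(count):
        address, offset = _read_address(payload, offset)
        relay_addresses.append(address)
    return {
        "timestamp": timestamp,
        "other_address": other_address,
        "relay_addresses": relay_addresses,
    }
```

The order of the reads in `parse_netinfo_payload` mirrors the point made above: each field's position depends on the length of the fields before it, so the data has to be extracted accurately and in the correct order.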
Failure to meet these requirements at this stage of the project could jeopardise the entire project as they provide key functionality to the application. The fourth functional requirement - to handle data from a destroy cell - is also important, but is not a critical requirement as, although it is desirable, it will not affect the application’s functionality. Therefore, in this iteration, the first three functional requirements should be prioritised. As already established, the non-functional requirements (NFRs) will affect the entire project and their importance should not be under-estimated. The portability NFRs may seem simple to achieve, but fulfilling these requirements will have major impacts on the project, and will, for example, have an effect on the programming language chosen as it must be cross-platform compatible and be capable of being used to achieve the functional requirements. Usability, maintainability and adaptability NFRs should be simple to implement and can be said to be of critical importance to the project. This project intends to create a library suitable for further development, an application with poor usability features would not be chosen over existing Tor libraries and therefore if this application is to be successful the usability NFRs need to be met. The legal NFR of only using open source software needs to be achieved as the project has a budget of zero. This is therefore a simple, but critical, requirement to implement. The reliability NFR, to have no more than ten bugs on delivery, was explicitly mentioned by the client. However, this could prove to be a challenging requirement to assess the success of. Whilst testing may
show that there are few or no bugs in the application, this might not be a true representation of the application because there may be bugs that did not show up during testing.

4.1.2. Design

By using the Agile Method, the upfront design is minimized; the developer only designs what is required for each iteration, which dramatically reduces the large upfront design cost that other methodologies incur. Moreover, by only implementing the design as and when it is required, risk is reduced and the developer ensures that all the necessary features are designed. Implementing the entire design in one go could lead to features being designed but never used, making the application confusing to the end user. This does not mean that features designed in earlier iterations will not be carried over to later iterations of the application.

Despite this being the first iteration, some design decisions made here will impact the rest of the application. An example of a design decision that will affect the entire application is the programming language used, as this cannot be changed after the first iteration without dramatic consequences. This makes the choice of programming language a critical design decision.

There were three key contenders for the programming language: C, Python and Java. C was discounted as the author has considerably less experience in this language than in either Python or Java. To decide which of the two remaining languages was more suitable for this project, the advantages and disadvantages of each were considered.

Python was found to be the more suitable language for this project, as existing Tor libraries use Python and it makes sense to use the language that Tor developers are already using, since this will help to achieve the application’s goal of being used for future development. Another reason for choosing Python over Java was that it is much easier and more effective to extract bytes from packets of network data in Python than it is in Java.
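As an illustration of the byte-extraction point, the short Python sketch below uses the standard `struct` module to pull the header fields out of a fixed-length cell and to report the reason carried by a destroy cell, which relates to the fourth functional requirement. It is not the project's code: the function names are hypothetical, and the layout (a 512-byte cell with a 2-byte circuit ID, a 1-byte command, command 4 for DESTROY and the reason in the first payload byte) and the reason names are assumptions based on the Tor protocol specification for older link protocol versions. It is written for Python 3.

```python
import struct

# Assumed values; check them against the Tor protocol specification.
CELL_LEN = 512
CMD_DESTROY = 4
DESTROY_REASONS = {
    0: "NONE",
    1: "PROTOCOL",
    2: "INTERNAL",
    3: "REQUESTED",
    4: "HIBERNATING",
    5: "RESOURCELIMIT",
    6: "CONNECTFAILED",
}


def split_fixed_cell(cell):
    """Split a fixed-length cell into (circuit id, command, payload)."""
    if len(cell) != CELL_LEN:
        raise ValueError("expected a %d-byte cell" % CELL_LEN)
    # ">HB": big-endian unsigned 16-bit circuit ID, then one command byte.
    circ_id, command = struct.unpack_from(">HB", cell, 0)
    return circ_id, command, cell[3:]


def describe_destroy(cell):
    """Return (circuit id, reason name) for a DESTROY cell."""
    circ_id, command, payload = split_fixed_cell(cell)
    if command != CMD_DESTROY:
        raise ValueError("not a DESTROY cell")
    reason = payload[0]
    return circ_id, DESTROY_REASONS.get(reason, "UNKNOWN(%d)" % reason)
```

A single format string such as `">HB"` replaces the manual shifting and masking that byte handling would otherwise need, which is the kind of convenience the comparison above refers to.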
Despite this, Java was a serious contender due to the developer’s considerable experience in the language and the speed in which Java can run – which can be up to ten times quicker than Python. The decision was further complicated by the fact that both languages are cross-platform compatible and therefore would both be able to achieve the non-functional portability requirements. Python is not without its disadvantages in relation to this project; at the start of the project the developer was relatively inexperienced in this language, and threading in Python is extremely hard and has been strongly criticised as being “fundamentally broken” (Wittber, 2009). The deciding factor was that the client implied that he had a preference for Python being used for this application. The version of Python to be used was also seriously considered, with the final choice being Python 2.x. Despite being the older version of the language, this was deemed the most appropriate version of the language to use as several existing libraries anticipated to be used to provide functionality are currently only fully compatible with Python 2.x. While some libraries have 3.x versions available, these still contain bugs and tend to be considered to be in Beta mode. The operating system used to develop the application is an unimportant decision, as the portability NFRs state that the application must be compatible with all operating systems and Python can be used on all operating systems. The only potential issue is that Python will have to be installed on Mac and Windows operating systems, although it comes preinstalled on Linux. This also applies to the libraries that the developer expects to use throughout the project. However, the developer’s personal preference for developing applications is to use Linux and consequently this operating system will be used to develop the entire application. 
As discussed in the requirements and specification section of this report, the application does not require a GUI as it would bring no benefit to the application. Designing a user-friendly, efficient and
scalable GUI would take considerable time, and the absence of a GUI significantly reduces the complexity of the design section. The time saved by not having to design and develop a GUI will be invested in further increasing the quality of the code, as well as in trying to implement more of the requirements.

4.1.2.1. Design features of a library

To achieve the usability NFR, it is important to consider the way that the library will be designed. A poorly designed library would likely not be used for future development, as developers would probably opt for one of the existing libraries if it were significantly easier to use. It is therefore important to ensure that the design structure is simple to use, is intuitive and promotes efficiency.

To enhance usability, the single responsibility principle will be applied; this means that each component implemented in the library should only be responsible for a single section of functionality or a single feature. This makes it easier for users to understand precisely what they can expect from each function of the application, which should help them to feel confident in further developing the application in the future.

Two naming conventions are popular in Python: mixed case, and lower case with words separated by underscores. The Python PEP 8 documentation recommends that the words be separated by an underscore, as it is claimed that this aids readability (Van Rossum, 2014). This naming convention was therefore chosen, as it further supports the usability NFR.

The names of the relevant functions and variables also needed to be considered. Variable names such as x, y, etc. are extremely poor names, as they do not give any information about the data they contain. It was decided that all names should provide as much information as possible whilst remaining a sensible length.
This will facilitate easy development and usability as there should be no confusion over what a variable contains or what a function will do; the name should make this information clear to the user. Both the chosen naming convention and the descriptive variable names help to meet the NFR of maintainability – making it easier for the current developer to work on the application as well as for users to further develop it in the future.

To further achieve the usability NFRs, detailed comments about all functions within the application will be required. These should provide the user with information concerning the required input, what the function does and what the function will return.

It could be argued that the above features are not strictly necessary as a good application would always be favoured over a lesser application, however the amount of effort required to make these significant improvements is negligible and the implementation of these decisions could potentially increase the speed of development by making it easier for the developer to identify pre-existing functions.

4.1.2.2. Version control

Version control is an essential feature to be implemented in the application. Although it will not affect the development of the project, it is a safeguard that means were anything to go wrong the code can be retrieved from a specified point. It also offers the ability to track all changes made to the code, which will help locate bugs within the application. For this project, Git was selected over Subversion. This decision was based on the personal preference of both the developer and the client, who also uses Git and therefore it was easy to share code between the parties involved in the application’s creation.
4.1.2.3. Design conclusions

This section has required more time than was previously anticipated, because so many of the design decisions that needed to be made in the first iteration affect the entire development of the project. It was therefore essential that sufficient thought was put into these decisions, as failure to make the right choice would lead to greater delays later in the project. It could also be argued that creating such a detailed design in the first iteration will speed up the development process and avert potential issues later on.

4.1.3. Implementation

The aim of this iteration was to implement all four functional requirements, as detailed at the start of this section. Furthermore, it was hoped that as many of the NFRs as possible would also be achieved during this time.

Developing the application required several pieces of software to be selected. The most important tool was the text editor Sublime Text 2, in which the code was written. While it is argued that an Integrated Development Environment (IDE) is more appropriate for developing code, due to the extensive testing and debugging functionality that IDEs provide, they are more complicated to use than a text editor and the testing environment may not be suitable for this application. It is also the developer’s preference to use a text editor, as he has more experience of this method. By using the extensive testing functionality that Python provides, no negative effects of using a text editor over an IDE will be present in the final application.

The developer tried to implement the functional requirements in order of their importance; for example, connecting to Tor was the first step undertaken. To do this, an SSL/TLS connection was made to the Tor node, using the node’s IP address and ORPort.
This was easily implemented and only required the three lines of code shown below:

Figure 8 - Code snippet showing connection to a Tor node

While this is a simple method of connecting to the Tor node, it works and is less complicated than other methods, and it was felt best to avoid over-complicating things where possible. However, an improvement was thought of almost immediately. This method requires the user to know the IP address and ORPort of the Tor node, which may not be easy to find out. To simplify the method and increase usability, it was decided that users should be able to enter either the nickname or the IP address and ORPort of the node that they want to connect to. This was not implemented during this iteration, as it was felt that it should be suggested to the client at the end of this iteration and, if approved, implemented in the following iteration.

The second requirement, to be able to send and receive a version cell, was the next to be implemented. It was decided that, because all cells need to be created in the same format and following the same protocol instructions, a function would be created to automatically pack a cell to
the correct format, thus preventing any code duplication. This supports the usability and maintainability NFRs. A build cell function was therefore created, which takes the command to be used and the payload and packs these into the correct cell format. This is shown in the code below:
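The original figure is not reproduced in this transcript, so the following is only a minimal sketch of what such a build cell function could look like, not the author's code. It assumes the cell layout from the Tor protocol specification with 2-byte circuit IDs (link protocol version 3 or earlier) and is written for Python 3, whereas the project itself targeted Python 2.x. The names `build_cell` and `build_versions_cell` are illustrative.

```python
import struct

# Sizes assume 2-byte circuit IDs (link protocol version 3 or earlier).
CELL_LEN = 512
PAYLOAD_LEN = 509
CMD_NETINFO = 8
CMD_VERSIONS = 7


def is_variable_length(command):
    """VERSIONS cells and commands >= 128 carry an explicit length field."""
    return command == CMD_VERSIONS or command >= 128


def build_cell(circ_id, command, payload=b""):
    """Pack a circuit ID, numeric command and payload into a Tor cell."""
    if is_variable_length(command):
        # CircID (2) | Command (1) | Length (2) | Payload (Length bytes)
        return struct.pack(">HBH", circ_id, command, len(payload)) + payload
    if len(payload) > PAYLOAD_LEN:
        raise ValueError("payload too long for a fixed-length cell")
    # CircID (2) | Command (1) | Payload zero-padded to 509 bytes
    return struct.pack(">HB", circ_id, command) + payload.ljust(PAYLOAD_LEN, b"\x00")


def build_versions_cell(versions=(3,)):
    """Build a VERSIONS cell listing the link protocol versions on offer."""
    payload = b"".join(struct.pack(">H", version) for version in versions)
    return build_cell(0, CMD_VERSIONS, payload)
```

Keeping the packing in one place means every other cell type only has to supply a command number and a payload, which is the duplication-avoiding property the text describes.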
The decoding of the NetInfo cell was perhaps the most challenging requirement to be implemented in this iteration. This was because the developer was still relatively inexperienced with Python, and the Tor protocol documentation is not very clear and contains several ambiguities. Despite these challenges, the NetInfo cell was successfully decoded, although the process overran the time allotted to this section.

To promote efficient code, the developer used an ‘if’ statement to dynamically extract the data contained within the packet. The NetInfo cell can contain multiple IP addresses in multiple formats (i.e. IPv4 or IPv6). An appropriate but somewhat inefficient method would be to run a separate ‘if’ statement for every possible eventuality. This, however, would mean that at least eight ‘if’ statements would be required just to extract the client’s IP address. As the code below shows, the developer managed to use a single ‘if-elif’ statement, dramatically reducing the chance of errors in the code and increasing readability for users.

Due to the complexity of decoding the NetInfo cell, this iteration was already starting to fall behind schedule. The decision was therefore made not to implement the handling of the destroy cell as part of this iteration, as it does not affect the core functionality and was merely a desirable, rather than a core, requirement. However, it was mentioned to the client and it was agreed that this feature would be implemented in a future iteration.

4.1.4. Testing

Although testing may “Often feel like an exercise in futility or at best a waste of time” (Arbuckle, 2010, p. 1), it is a critical area of development. Testing ensures the software functions according to the expectations defined by the requirements/specifications.
The overall aim of testing is to find bugs or issues that would negatively affect the functionality of the application, its usability and/or maintainability.
For this iteration two kinds of testing will be conducted: functional testing, which verifies that a function performs as expected using a small subset of inputs, and white box testing, where the tester has full knowledge of the implementation.

To enable the application to be thoroughly tested, several testing methods were considered. It was decided that a combination of the unittest framework and manual testing was the most appropriate for this application. This is because unittest makes it possible to quickly test a large number of input values and is part of Python’s standard library. The results from unittest are also displayed in a very comprehensible manner, making it easy to locate and fix bugs. Manual testing also has its advantages, such as being able to test features of the application that a unit test might not be able to cover, and manually testing each function allows a realistic user scenario to be tested.

To ensure that there are no anomalies in the results, each test will be run three times in each testing round. Should any issues arise, these will be investigated and corrected before the tests are re-run to ensure the bugs have been removed. This process will be continued until all bugs found have been removed from the application.

Test No. | Test | Test method | Succeeded? | Comments
1 | Can connect to a node | Unittest | Yes |
2 | Passes correct value to version function | Unittest and Manually | Yes |
3 | Creates the correct version cell | Unittest | Yes |
4 | Sends the version cell | Manual | Yes |
5 | Receives the NetInfo cell | Unittest | Yes |
6 | Able to extract the payload of a NetInfo cell | Unittest | Yes |
7 | Successfully able to extract the data contained within the payload | Unittest | Yes |
8 | Store the extracted data as a dictionary | Unittest | Yes |
9 | Create the payload of a NetInfo cell to be sent | Unittest and Manually | First round: No. Second round: Yes | The first round of testing showed an error where the IP addresses were being displayed as negative numbers because they were not being formatted correctly. Once this was fixed, the test passed.
10 | Builds the NetInfo cell correctly to be sent | Unittest and Manually | Yes |
11 | Send the NetInfo cell to the first node | Unittest and Manually | Yes |

Figure 9 - Testing results for iteration 1

As can be seen in the testing results above, eleven tests were conducted for this iteration. All functions were thoroughly tested, with ten tests being passed first time. One test, however, failed.
This test was to ensure that the correct payload of the NetInfo cell was created. The creation of the NetInfo cell proved to be incorrect, as negative IP addresses were being passed. This obviously cannot be allowed, and was found to be a result of the IP addresses having been formatted using the signed char format rather than the required unsigned char format. Once this had been changed, the test was re-run and passed.

4.1.5. Moving forward from Iteration 1

While this iteration has overrun the allotted time by two weeks, and not all of the functional requirements have been met, for the most part it can be considered a success. A new project plan has been created to show this delay and how it has been taken into account in future iterations, to ensure that the project is still completed on time; this can be found in appendix 4. The delay arose because a detailed design section was developed, and this should enable future design work to be completed more quickly and efficiently.

The three functional requirements that were implemented have been implemented successfully and to a high standard; for example, the use of functions to reduce code duplication helps to achieve many of the usability NFRs. The majority of the NFRs have already been achieved, which is a significant achievement in such a small amount of time. The testing of the implemented features was a success, despite one function requiring a bug to be fixed.

Moving forward to iteration 2, the recommendation of using an Onion router nickname as well as the IP address to connect to a Tor node will be suggested to the client, and the timescale will be altered to take the complexity of the Tor protocol documentation into account.
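The signed/unsigned formatting bug found in test 9 of this iteration, and the single ‘if-elif’ used to decode NetInfo addresses, can be illustrated with Python's `struct` module. This is a sketch under assumptions rather than the project's code: it assumes the NetInfo address encoding from the Tor specification (a type byte, a length byte, then the address, with type 4 for IPv4 and type 6 for IPv6), is written for Python 3, and uses the illustrative names `pack_ipv4` and `unpack_address`.

```python
import ipaddress
import struct

ADDR_TYPE_IPV4 = 4
ADDR_TYPE_IPV6 = 6


def pack_ipv4(address):
    """Encode a dotted-quad address as type | length | 4 unsigned bytes."""
    octets = [int(part) for part in address.split(".")]
    # "B" is an unsigned char (0..255); "b" would be signed (-128..127).
    return struct.pack(">BB4B", ADDR_TYPE_IPV4, 4, *octets)


def unpack_address(data, offset=0):
    """Decode one address; return (type, text or None, next offset)."""
    addr_type, length = struct.unpack_from(">BB", data, offset)
    start = offset + 2
    value = data[start:start + length]
    if addr_type == ADDR_TYPE_IPV4 and length == 4:
        text = str(ipaddress.IPv4Address(value))
    elif addr_type == ADDR_TYPE_IPV6 and length == 16:
        text = str(ipaddress.IPv6Address(value))
    else:
        text = None  # unknown address type: skip it using the length byte
    return addr_type, text, start + length


encoded = pack_ipv4("192.168.0.1")
# Reading the same four bytes as signed chars reproduces the negative values.
signed_view = struct.unpack(">4b", encoded[2:])
unsigned_view = struct.unpack(">4B", encoded[2:])
```

Here `signed_view` is `(-64, -88, 0, 1)` while `unsigned_view` is `(192, 168, 0, 1)`, which is why any octet above 127 appeared as a negative number until the format character was changed.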
4.2. Iteration 2

4.2.1. Requirements

During the demonstration of the previous iteration, the client was generally pleased with the development to date. The unachieved requirement of handling data in the destroy cells was mentioned to him and he explicitly requested that this be completed in this iteration. It was also decided to implement the use of Onion router nicknames to identify Tor nodes, in addition to the existing IP addresses, to improve usability. This discussion, as well as several informal interviews conducted with potential end-users of the application, elicited the following requirements.

Requirement | Importance Level
Create a circuit through the Tor network | High
Create a circuit of any length | High
Create a stream through Tor to a web server | High
Able to retrieve webpages from an internet web server through Tor | High
Create a circuit using specified nodes | Medium
Create multiple streams through Tor to a web server | Medium
Handle errors from destroy cells | Low

Figure 10 - Functional requirements for iteration 2

It was evident that no further NFRs needed to be added to the original specification and that the existing NFRs should be carried forward into this iteration.

The requirement of creating a circuit through the Tor network is perhaps the most challenging requirement faced in the project to date. To achieve it, both the calculation of shared keys between the client and each node and the encryption of packets need to be implemented. This is extremely difficult, and the Tor documentation is, once again, full of ambiguities and presents a major challenge to developers. In light of this, the predicted timescale has increased from three weeks to four and the hours allocated to the project have been increased in order to develop this aspect of the project.
However, once this requirement is implemented it will provide a base upon which the application can be developed. It is therefore critical that this requirement be fulfilled during this iteration.

The requirement of being able to create a circuit of any length should be easily achieved once a circuit can be created, as it expands on the code used for this process. The creation of a stream through Tor to a web server is also likely to be a simple requirement to fulfil, as this will, once again, build on the circuit that has been created. Overall, this iteration contains some very difficult requirements, with the majority of them being dependent on the successful creation of a circuit through Tor.

4.2.2. Design

The design for this iteration builds on the design section from the previous iteration; however, one small design feature needed to be considered before implementation. With the greater use of cells, each requiring a different command to be associated with it, it was important that the
method used to identify the cell was clear. There were two possible solutions: to use the command ID number of the cell or to use the English command name. This was considered at great length, as using the English name for the cell command would make the code easier to use, but it would also increase the chance of errors being introduced into a program, e.g. through a spelling mistake. In addition, with so many different commands with similar names, it might be confusing for the user to decide which command to use. It was therefore decided to use number-based commands when setting the command in a packet, due to the lower risk of a user entering an incorrect value, the reduced chance of errors caused by spelling mistakes, and the simpler implementation, as no formatting issues need to be considered. It was also chosen over the English string version as this is what the Tor protocol itself uses.

4.2.3. Implementation

The key requirement for this iteration was to build a circuit through the Tor network. This needed to be split into two parts. The first challenge was to create the circuit to the first node; once this was successful, the circuit had to be extended to the second and subsequent nodes. The work was split in this way because the cells that need to be sent are different. The cell sent to the first node needs to be a ‘create’ cell, and a ‘created’ cell is expected back. For the second node onwards, an ‘extend’ cell is sent and an ‘extended’ cell is received back.

Here an implementation decision needed to be made. Tor currently uses two circuit handshake methods (Dingledine & Mathewson, n.d.): the NTOR and TAP protocols. For the purposes of this project there was no particular advantage to either method; however, the TAP handshake is slightly easier to implement and is the original Tor handshake, which means that it can be used with nodes running older versions of Tor, whereas the newer NTOR protocol may not.
In addition, more documentation is available for the TAP handshake, and for these reasons it was chosen as the handshake to be used in the project.

To create a ‘create’ cell, the Diffie-Hellman (DH) protocol must first be used to create a shared key which only the client and the first node know. A function was created to calculate the client’s half of the handshake; this prevents code duplication, as it will be used extensively for circuit creation throughout the application. The code below shows the DH protocol being conducted, with the creation of x (the private key) and X (the public key that will be sent to the Tor node). The public key is encrypted with the onion router’s key.

To prevent errors being introduced into the application, the decision was made to use a hybridEncryption function that had already been created by Dr Gareth Owen. This was chosen over creating our own because the time for this iteration was fast running out due to its complexity. It allowed more time to be spent on other sections rather than on redeveloping something that already has a proven success rate. Finally, it was chosen because it has been thoroughly tested and is therefore unlikely to introduce bugs into the application.
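As the original figure is not reproduced in this transcript, the Diffie-Hellman step described above can be sketched as follows. This is an illustrative sketch, not the author's function: it assumes the parameters given for the TAP handshake in the Tor specification (generator 2, the 1024-bit prime of the second Oakley group from RFC 2409, and 320-bit private keys), it is written for Python 3, and it leaves out the hybrid encryption of the public value to the onion router's key. The prime is quoted from memory of the specification and should be checked against it before any real use.

```python
import secrets

# Assumed TAP parameters: 1024-bit safe prime from RFC 2409 section 6.2, g = 2.
DH_PRIME = int(
    "FFFFFFFFFFFFFFFFC90FDAA22168C234C4C6628B80DC1CD129024E08"
    "8A67CC74020BBEA63B139B22514A08798E3404DDEF9519B3CD3A431B"
    "302B0A6DF25F14374FE1356D6D51C245E485B576625E7EC6F44C42E9"
    "A637ED6B0BFF5CB6F406B7EDEE386BFB5A899FA5AE9F24117C4B1FE6"
    "49286651ECE65381FFFFFFFFFFFFFFFF",
    16,
)
DH_GENERATOR = 2
DH_PRIVATE_BITS = 320
DH_LEN = 128  # bytes needed to hold a 1024-bit public value


def generate_dh_keypair():
    """Return (x, X): a random private exponent and X = g^x mod p."""
    private_key = secrets.randbits(DH_PRIVATE_BITS)
    public_key = pow(DH_GENERATOR, private_key, DH_PRIME)
    return private_key, public_key


def compute_shared_secret(private_key, peer_public):
    """Return g^xy mod p, rejecting degenerate public values."""
    if not 1 < peer_public < DH_PRIME - 1:
        raise ValueError("peer public value out of range")
    return pow(peer_public, private_key, DH_PRIME)


def to_fixed_bytes(value, length=DH_LEN):
    """Encode an integer as big-endian bytes, as it is carried in a cell."""
    return value.to_bytes(length, "big")
```

Both sides arrive at the same value because `pow(Y, x, p)` and `pow(X, y, p)` both equal g^xy mod p; only the public values X and Y ever cross the network.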
Figure 11 - Code snippet showing the creation of Diffie Hellman public and private keys

Once the payload and the client’s half of the Diffie-Hellman key had been calculated using the above function, the packet was created using the build cell function implemented previously, thus dramatically reducing code duplication. It is important that the client’s private key, x, is stored, as it will be needed to decrypt the packets that are received. For this, a variable within the TorCircuit class has been used. This method of storing the key was chosen over, for example, storing all keys in an array, as it makes the key easier to use later on in the application, with less chance of error. The TorCircuit class is shown below:

Figure 12 - Code snippet showing the Tor Circuit Class

It was then important to receive and decode the ‘created’ cell, as this completes the handshake and provides the server’s public key and the shared key. This means that packets can be encrypted to the first node, the first step of the circuit. To retrieve the data contained within the packet, the payload was extracted and, by using the Tor documentation, which states where in the packet each piece of data is located, we were able to extract the public key, the derived key data and a unique key shared between the client and the first node.
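The decoding step described above can be sketched as follows. This is not the project's code: it assumes the TAP layout from the Tor specification, in which the ‘created’ payload carries the server's 128-byte DH public value followed by 20 bytes of derivative key data (KH), and in which key material is expanded from the shared secret with SHA-1 and a one-byte counter (KDF-TOR) into KH, Df, Db, Kf and Kb. It is written for Python 3, and `HopKeys` and the function names are illustrative.

```python
import hashlib
import hmac
from collections import namedtuple

HASH_LEN = 20  # SHA-1 output size
KEY_LEN = 16   # AES-128 key size
DH_LEN = 128   # size of a DH public value in the payload

HopKeys = namedtuple("HopKeys", ["KH", "Df", "Db", "Kf", "Kb"])


def kdf_tor(shared_secret, length=3 * HASH_LEN + 2 * KEY_LEN):
    """Expand K0 as SHA1(K0 | 0x00) | SHA1(K0 | 0x01) | ... to `length` bytes."""
    material = b""
    counter = 0
    while len(material) < length:
        material += hashlib.sha1(shared_secret + bytes([counter])).digest()
        counter += 1
    return material[:length]


def split_key_material(material):
    """Slice the expanded material into KH, Df, Db, Kf and Kb, in that order."""
    return HopKeys(
        KH=material[0:20],
        Df=material[20:40],
        Db=material[40:60],
        Kf=material[60:76],
        Kb=material[76:92],
    )


def parse_created_payload(payload):
    """Return (server public value as int, received derivative key data)."""
    server_public = int.from_bytes(payload[:DH_LEN], "big")
    received_kh = payload[DH_LEN:DH_LEN + HASH_LEN]
    return server_public, received_kh


def derive_hop_keys(shared_secret, received_kh):
    """Derive the hop keys and check KH against the value the node sent."""
    keys = split_key_material(kdf_tor(shared_secret))
    if not hmac.compare_digest(keys.KH, received_kh):
        raise ValueError("derivative key data does not match KH")
    return keys
```

The order of the slices matters: swapping Df with Db or Kf with Kb yields keys of the right length that silently fail to decrypt anything, which fits the ordering problems described later in this section.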
This is shown in the code below:

Figure 13 - Code snippet showing the decoding of a Create cell

As shown in the code above, the calculated KH, Df, Db, Kf and Kb are returned to TorHop. TorHop is a class which handles the creation of circuits. By saving the values of these variables in the class, they can easily be retrieved later on, when they are needed for encrypting and decrypting packets.

This method of extracting the shared key data, as well as the various other keys, could be reused for decoding received ‘extended’ cells, as these use the same handshake. The only difference is that the ‘extended’ cells first require decryption, as they are encrypted with the keys of the other Onion nodes in the circuit. The method used to decrypt the cells is shown below:

Figure 14 - Code snippet showing the decryption function

This was extremely challenging to implement, as it required the public and shared keys of the node, as well as other encryption variables, to be correctly stored, as these would later be used in decryption. It
was vital to get these in the right order, as otherwise the packets would not be correctly decrypted or encrypted.

During the implementation of this requirement, the Tor protocol became a hindrance to development. This was mostly down to the fact that Tor does not send back a cell if an incorrectly configured cell has been sent, and so gives the user no feedback about what has gone wrong. It was also discovered that, if a cell was received, it would be either a destroy cell that contained little to no information, or a relay cell that could not be decrypted to provide any useful information because the data had not been extracted properly. This was a very frustrating time for the developer, as many days were spent trying to chase down an error without knowing where to even start looking for it.

After several weeks of chasing down errors, a circuit could finally be created and thus the first two requirements of this iteration were completed. By this point, however, the project was running behind schedule; this can be attributed to the complexity of Tor and the accompanying documentation.

Fortunately, from this point onwards, this iteration’s remaining requirements proved simple to implement, as creating a stream used the circuit and encryption previously implemented and it was simply a case of sending a specially configured relay cell through the Tor network. This is shown in the code below:

Figure 15 - Code snippet showing the creation of the Stream cell

As shown in the above code section, the creation of a stream simply requires the host name or IP address along with the port. These are passed to the build cell function, which builds a relay cell containing the data, before the cell is encrypted with the selected nodes’ encryption keys and sent. Once again, this was created as a function within the TorCircuit class.
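The stream request described above can be sketched as the construction of a RELAY_BEGIN payload. This is a sketch under assumptions rather than the code from the missing figure: it assumes the relay cell layout from the Tor specification (command, ‘recognized’, stream ID, 4-byte digest, length, then 498 bytes of data), a running SHA-1 digest seeded with Df, and a NUL-terminated ‘host:port’ string as the data, leaving out the FLAGS field that newer versions of the specification describe. The per-hop AES encryption is also left out. It is written for Python 3 and the function names are illustrative.

```python
import hashlib
import struct

RELAY_BEGIN = 1
RELAY_DATA_LEN = 498  # 509-byte cell payload minus the 11-byte relay header


def build_relay_payload(relay_command, stream_id, data, running_digest):
    """Build an unencrypted 509-byte relay payload and update the digest."""
    if len(data) > RELAY_DATA_LEN:
        raise ValueError("relay data too long")
    padded = data.ljust(RELAY_DATA_LEN, b"\x00")

    def header(digest_field):
        # command (1) | recognized (2, zero) | stream ID (2) | digest (4) | length (2)
        return struct.pack(">BHH4sH", relay_command, 0, stream_id,
                           digest_field, len(data))

    # The digest covers the payload with the digest field set to zero.
    running_digest.update(header(b"\x00" * 4) + padded)
    return header(running_digest.digest()[:4]) + padded


def build_begin_payload(host, port, stream_id, running_digest):
    """Build the RELAY_BEGIN payload asking the exit node to open host:port."""
    addrport = "{}:{}".format(host, port).encode("ascii") + b"\x00"
    return build_relay_payload(RELAY_BEGIN, stream_id, addrport, running_digest)


# Usage: the digest object would be created once per hop from that hop's Df.
forward_digest = hashlib.sha1(b"\x00" * 20)  # placeholder Df for illustration
begin_payload = build_begin_payload("example.com", 80, 1, forward_digest)
```

The resulting payload would then be encrypted once for each hop in the circuit, starting with the exit node's forward key, before being placed in a relay cell.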
The regularly received destroy cells provided the perfect opportunity to achieve the requirement carried forward from Iteration 1 - to handle the destroy cells, which now inform the user of the cause of the error.
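A lookup of this kind can be sketched with a dictionary keyed by the numeric reason code. The reason names follow the list of destroy reasons in the Tor specification as the editor recalls it; the function names, the wording of the messages and the fallback text are illustrative, and the sketch is written for Python 3 rather than the project's Python 2.x.

```python
# Reason codes carried in the first payload byte of a DESTROY cell.
DESTROY_REASONS = {
    0: ("NONE", "No reason given"),
    1: ("PROTOCOL", "Tor protocol violation"),
    2: ("INTERNAL", "Internal error"),
    3: ("REQUESTED", "A client sent a TRUNCATE command"),
    4: ("HIBERNATING", "Not currently operating; trying to save bandwidth"),
    5: ("RESOURCELIMIT", "Out of memory, sockets, or circuit IDs"),
    6: ("CONNECTFAILED", "Unable to reach relay"),
    7: ("OR_IDENTITY", "Connected to relay, but its OR identity was not as expected"),
    8: ("OR_CONN_CLOSED", "The OR connection that was carrying this circuit died"),
    9: ("FINISHED", "The circuit has expired for being dirty or old"),
    10: ("TIMEOUT", "Circuit construction took too long"),
    11: ("DESTROYED", "The circuit was destroyed without client TRUNCATE"),
    12: ("NOSUCHSERVICE", "Request for unknown hidden service"),
}


def describe_destroy_reason(code):
    """Translate a numeric destroy reason into a readable message."""
    name, meaning = DESTROY_REASONS.get(code, ("UNKNOWN", "Unrecognised reason code"))
    return "DESTROY reason {} ({}): {}".format(code, name, meaning)


def describe_destroy_payload(payload):
    """Read the reason from the first byte of a destroy cell payload."""
    return describe_destroy_reason(payload[0])
```

Using `dict.get` with a fallback keeps the function from raising on a code it does not know, so an unexpected value still produces a message the user can act on.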
Figure 16 - Code snippet showing the conversion of error codes to English

The above code snippet shows how the error code contained within the destroy cell is passed to the function, which looks the code up in a dictionary containing all error codes and their meanings before returning the English error to the user. This method was chosen over simply informing the user that a destroy cell has been received because it helps to achieve the NFR of usability. Providing the user with a more detailed error message allows for easier debugging.

4.2.4. Testing

As with the first iteration, testing will be conducted using unittest and manually. However, unlike the first iteration, which only looked at functional and white box testing, this testing section also needs to consider integration. An integration test verifies that the parameters passed between modules are handled correctly, and is used when a module is developed at a later stage than the module it is interacting with. This is an important testing area, as it ensures that functions developed in the first iteration can be used for the functionality developed in this iteration. Although the effectiveness of this kind of test can be argued, with some suggesting it is a waste of time, the little time it adds to the testing makes it worthwhile, especially if a bug is found, as this can be fixed quickly before it affects functions developed later that may depend on it.

The testing strategy can be seen below; it shows the tests conducted, how each was conducted and whether it was successful. As with the first iteration, in each round the tests were run three times to ensure no anomalies were present in the results.

Test No. | Test | Test method | Succeeded? | Comments
12 | Connect to the first hop | Unittest | Yes |
13 | Calculate the shared key between client and node | Unittest | |
14 | Create a CREATE cell containing the relevant data | Unittest | Yes |
15 | Send the CREATE cell to the first node | Unittest | Yes |
16 | Receive the CREATED cell back from the first node | Unittest | Yes |
17 | Ensure a cell of cmd 3 is received | Unittest | Yes |
18 | Extract the payload of the CREATED cell | Unittest | Yes |
19 | Extract the first node half of the key | Unittest | Yes |
20 | Calculate KH, Df, Db, Kf, Kb from the payload | Unittest | Yes |
21 | Check the derived key data is the same as KH | Unittest | Yes |
22 | Calculate the shared key | Unittest | Yes |
23 | Ensure the nodes entered in the array are correctly passed to the function | Unittest / Manually | First round: No. Second round: Yes | Failed the first round due to a single value being passed for all nodes, rather than a value for each node. This was a simple change to make, and the test passed in the second round.
24 | Search the consensus by a node’s nickname for its IP address and OR port | Unittest | Yes |
25 | Packs the IP address and OR port of a selected node in the right format | Unittest | Yes |
26 | Calculate the shared key half | Unittest | Yes |
27 | Build the EXTEND cell | Unittest | Yes |
28 | Correctly encrypt the packet | Unittest | Yes |
29 | Ensure the packet count is correct | Unittest | Yes |
30 | Send the packet to the correct node | Unittest | Yes |
31 | Receive an EXTENDED packet back | Unittest | Yes |
32 | Handle a destroy cell correctly | Unittest | Yes |
33 | Extract the payload of the EXTENDED packet | Unittest | Yes |
34 | Correctly decrypt the payload of an EXTENDED packet | Unittest | Yes |
35 | Ensure a RELAY_EXTENDED cell is received | Unittest | Yes |
36 | Calculate shared key | Unittest | Yes |
37 | Extract derivative key data from the payload | Unittest | Yes |
38 | Ensure KH and derivative key data are the same | Unittest | Yes |
39 | Ensure KH, Df, Db, Kf, Kb are updated in the TorHop object | Unittest | Yes |
40 | Ensure a stream can be created to a specified web server | Unittest / Manually | First round: No. Second round: Yes | Failed the first time due to incorrect formatting of the target web server. Once the IP address or web address and the port were formatted correctly, the test passed in the second round.
41 | Ensure the payload of the stream packet is correctly formatted | Unittest / Manually | Yes |
42 | Correctly create the stream relay cell | Unittest | Yes |
43 | Correctly encrypts the packet to allow it to be sent through the network | Unittest | Yes |
44 | Ensure a packet is received back from the stream request and handled appropriately | Unittest | Yes |
45 | Ensure a RELAY_CONNECTED cell is received | Unittest | Yes |
46 | Check the data (GET request) is correctly formatted to a packet | Unittest / Manually | Yes |
47 | Ensure the packet is encrypted correctly | Unittest | Yes |
48 | Ensure a packet is received back and handled appropriately | Unittest | Yes |
49 | Ensure all the data is received | Unittest / Manually | First round: No. Second round: Yes | In the first round only a single packet was received, which did not contain all the data. This showed that more than a single packet must be read, which was implemented using a ‘while True’ loop; the test then passed in the second round.

Figure 17 - Testing results iteration 2
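The fix recorded for test 49, reading until all of the data has arrived rather than assuming a single packet, can be sketched as a receive loop. This is an assumption-laden illustration, not the project's code: it assumes fixed 512-byte cells, a socket-like object with a `recv` method, and a caller-supplied predicate `is_last_cell` that recognises the final cell (for example a decrypted RELAY_END). Both function names are hypothetical.

```python
CELL_LEN = 512  # fixed cell size, assuming 2-byte circuit IDs


def read_exact(sock, length):
    """Keep calling recv until exactly `length` bytes have been read."""
    chunks = []
    remaining = length
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            raise EOFError("connection closed before a full cell arrived")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def receive_cells(sock, is_last_cell):
    """Collect whole cells until the predicate marks one as the last."""
    cells = []
    while True:
        cell = read_exact(sock, CELL_LEN)
        cells.append(cell)
        if is_last_cell(cell):
            return cells
```

The inner loop matters as much as the outer one: a single `recv` call may return fewer bytes than requested, so a cell can arrive split across several reads.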
As shown by the above test results, several tests failed the first time. This was to be expected in such a complex iteration; fortunately, the three tests that failed were easily corrected. For example, test 49 (Ensure all the data is received) failed because it was wrongly assumed that all data would be received in a single packet. Once this was found not to be the case, a simple loop was implemented to ensure all packets were received, and the test passed when it was re-run. Overall, the tests that failed did so not because of issues in the functionality of the application, but because of developer error. This shows the importance of testing, so that issues such as these can be picked up early in development and fixed.

4.2.5. Moving forward from Iteration 2

As with the first iteration, this iteration also overran the predicted timescale by two weeks; it will therefore be necessary in the next iterations to increase the amount of time dedicated to the development. A new project plan has been created to show this delay and how it has been taken into account in future iterations, to ensure that the project is still completed on time; this can be found in appendix 5.

The main reason for the delay is that the project is proving much harder than first expected. The lack of error messages provided by Tor is the most difficult feature of the protocol to handle, and days have been spent trying to debug the software without knowing what is going wrong. In the normal development of an application, if something goes wrong an error message is presented to the developer to indicate the area where the error occurred, but with the Tor protocol this does not happen.
A further delay is caused by the confusing Tor protocol documentation, which contains many inconsistencies that make it difficult to understand exactly what is required for each function; on several occasions help from the Tor community has been required to understand certain points.

However, all requirements in this iteration were completed, including the requirement that was not completed in iteration 1. All functional requirements have been completed while also satisfying the NFRs of maintainability and usability. It would have been quicker to develop the functions without considering these requirements, but considering them during development will help to ensure an application that meets and exceeds the client’s expectations.
