Section: New Results
Monitoring content access in peer-to-peer networks
Participants: Thibault Cholez, Isabelle Chrisment [contact], Olivier Festor.
Peer-to-peer (P2P) networks are now commonly used to share files over the Internet. Compared to the client-server model, they make it possible to gather and share a large amount of resources through the collaboration of many individual peers. However, peer-to-peer networks also support harmful and malicious activities that deliberately propagate highly undesirable content.
Because peer-to-peer systems are self-organized, dynamic and lack a centralized infrastructure, collecting information to measure them and to observe the behavior of malicious users is not straightforward. Passive monitoring observes the P2P traffic from a single point without sending additional data into the network. However, this approach does not allow specific contents to be studied at the scale of the network. Active monitoring removes this drawback but is more intrusive, since some traffic (queries, files) is injected into the network to gather more information about the P2P system. Many crawlers have been used to study P2P protocols such as Gnutella, Napster, eDonkey and KAD. On its own, a crawler can only observe the network without acting on it. In the case of KAD, a crawler discovers the peers but not the shared contents. To obtain a better view of the network and to control it, a crawler has been combined with a Sybil attack, which consists in creating a very large number ( ) of fake peers, controlled by one computer, and actively placing them in the part of the DHT to observe.
We showed that the protection mechanisms recently introduced in KAD make this intrusive approach inefficient. We assessed the mechanisms integrated into recent clients to fight against the Sybil attack in KAD, a widely deployed Distributed Hash Table. We studied three main mechanisms: a protection against flooding through packet tracking, an IP address limitation and a verification of identities. We evaluated their efficiency by designing and adapting an attack for several KAD clients with different levels of protection. Our results showed that the new security rules mitigate the Sybil attacks launched previously. However, we showed that a distributed eclipse attack with limited resources can still control a small part of the network despite the new defenses.
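To make the IP address limitation more concrete, here is a minimal sketch of a routing table that refuses contacts once too many of them come from the same address or the same /24 subnet. It is an illustration only, not the code of any KAD client: the class `RoutingTable`, its method names and the two limits `MAX_PER_IP` and `MAX_PER_SUBNET` are hypothetical values chosen for the example.

```python
from collections import Counter
from ipaddress import ip_address

# Assumed limits for illustration; they are not taken from the text above.
MAX_PER_IP = 1
MAX_PER_SUBNET = 10


class RoutingTable:
    """Toy routing table enforcing per-IP and per-/24 contact limits."""

    def __init__(self):
        self.contacts = {}           # kad_id -> IPv4 address string
        self.per_ip = Counter()      # contacts accepted per address
        self.per_subnet = Counter()  # contacts accepted per /24 subnet

    @staticmethod
    def subnet_of(ip):
        # Keep the first 24 bits of an IPv4 address.
        return int(ip_address(ip)) >> 8

    def add(self, kad_id, ip):
        """Return True if the contact is accepted, False if it is rejected."""
        if kad_id in self.contacts:
            return False
        subnet = self.subnet_of(ip)
        if self.per_ip[ip] >= MAX_PER_IP:
            return False
        if self.per_subnet[subnet] >= MAX_PER_SUBNET:
            return False
        self.contacts[kad_id] = ip
        self.per_ip[ip] += 1
        self.per_subnet[subnet] += 1
        return True
```

Under such a rule, many fake peers announced from a single machine are mostly rejected, whereas peers spread over distinct subnets are still accepted, which is consistent with the observation that a distributed attack remains possible.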
We then proposed a new P2P honeynet architecture called HAMACK that bypasses the Sybil attack protection mechanisms recently introduced in KAD. HAMACK is composed of distributed Honeypeers in charge of monitoring and acting on specific malicious contents in KAD through keywords and files. These Honeypeers are placed very close to malicious references of the DHT and are able to take control over them. Our approach does not rely on the injection of Sybils and is non-intrusive for the network apart from the targeted contents. HAMACK can silently monitor all the incoming requests and eclipse the malicious contents, which makes it useful both to study and to protect the network. Its most accomplished feature is the ability to announce many files for a given keyword with realistic and attractive attributes, in particular the number of sources. By announcing fake files, HAMACK is able to attract and capture all the requests of a malicious peer, from the keyword search to the final download request, and thus to assess the actions of malicious users.
To achieve these features, HAMACK exploits a weakness of KAD, which lets a peer freely choose its KADID, and relies on the very efficient search process of the KAD DHT. As described in our model, a search launched on a target of HAMACK is captured by the Honeypeers with a very high probability (93%). Our work highlights a new dilemma for KAD, which has to choose between its routing efficiency and the safety of its indexed contents. HAMACK is implemented as a lightweight, fully functional architecture. It uses modified aMuled clients deployed on PlanetLab nodes and coupled with a secured database. The first experiments run on the real KAD network helped to set the parameters of the architecture. They showed three important results: (1) the coordination between Honeypeers increases the efficiency of HAMACK, (2) the architecture has to comply with the latest constraints introduced in KAD, and (3) the number of Honeypeers needed has a low upper bound. Several further experiments showed that HAMACK is extremely efficient at attracting all the requests for the target IDs, resulting in total control of the contents. Finally, our last experiment, which poisoned a real content, confirmed how important it is to control the number of sources in order to build an efficient honeypot.
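The role of a freely chosen KADID can be sketched as follows. KAD measures the distance between two 128-bit identifiers with the XOR metric, so an identifier that shares a long prefix with a target is very close to it and is likely to be among the contacts a search converges to. The function names and the 96-bit prefix length below are hypothetical choices for the example; the sketch does not reproduce the HAMACK implementation or the model behind the 93% figure.

```python
import secrets

ID_BITS = 128  # KAD identifiers are 128 bits long


def xor_distance(a, b):
    """Kademlia distance between two identifiers."""
    return a ^ b


def common_prefix_bits(a, b):
    """Number of leading bits shared by two identifiers."""
    return ID_BITS - (a ^ b).bit_length()


def id_near(target, prefix_bits):
    """Return a random ID sharing at least prefix_bits leading bits with target."""
    if not 0 <= prefix_bits <= ID_BITS:
        raise ValueError("prefix_bits must be between 0 and ID_BITS")
    suffix_bits = ID_BITS - prefix_bits
    if suffix_bits == 0:
        return target
    mask = (1 << suffix_bits) - 1
    # Keep the target's prefix and randomize the remaining low-order bits.
    return (target & ~mask) | secrets.randbits(suffix_bits)


def closest(ids, target, k):
    """Return the k identifiers closest to target under the XOR metric."""
    return sorted(ids, key=lambda i: xor_distance(i, target))[:k]


if __name__ == "__main__":
    target = secrets.randbits(ID_BITS)
    honeypeers = [id_near(target, 96) for _ in range(5)]
    others = [secrets.randbits(ID_BITS) for _ in range(1000)]
    winners = closest(honeypeers + others, target, 5)
    print(sum(1 for w in winners if w in honeypeers), "of 5 closest are honeypeers")
```

Because randomly chosen identifiers almost never share such a long prefix with the target, a few well-placed identifiers are enough to occupy the closest positions, which illustrates the trade-off between routing efficiency and the safety of indexed contents mentioned above.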
HAMACK was designed to study and fight against pedocriminal content in P2P networks, and was deployed for this purpose.