!BuddyCast4

Status: initial design

!BuddyCast4 improves the ability for keyword search by spreading more information about recent downloads. Buddycast3 spreads mere the SHA1 hash of the most recent 25 downloaded torrents. !BuddyCast4 augments this information with the search keywords used to discover this torrent, the ranking which the torrent had in the search results, and the full torrentname of the conducted download. With !BuddyCast4 the Tribler Megacache can be bootstrapped more quickly due to the spreading of torrentname and the sophistication of searching is boosted due to the availability of {search keywords, search result rank, selected torrent SHA1}

!BuddyCast4 improves the distribution of swarm popularity data by including recently checked torrent swarm sizes in a message.

!BuddyCast4 also enables each peer to build a table of surrounding peers that are special, such as in terms of taste similarity, Internet distance, upload capacity, bandwidth altruism, social distance, experience (age), or reliability. We believe that !BuddyCast4 represents the state-of-the-art in operational epidemic or gossip protocols.

To efficiently distribute peer information we transmit very extensive peer records and store them at each peer. By building a web-of-trust we estimate the truthfulness of each peer and use this to build a local view of peer characteristics. Note that this local view does not need to be accurate or reliable, as long as the majority is honest we converge. As we transmit an extensive peer record we only transfer this information once to a peer.

Previous BuddyCast implementations where extended by BarterCast, Moderation, and SocioCast. However, these where single-purpose ad-hoc extensions. Adding more extensions in this fashion is not feasible. With !BuddyCast4 we move towards a generic distributed database architecture with epidemic style synchronization. !BuddyCast4 can be further extended with more information and complexity.

Design principles

!BuddyCast4 is designed to balance complexity, state, and security.

An important design decision is how we deal with peers that are not connectable. Do we gossip information about peers which cannot be validated? For security reasons this should not be possible or do we allow the usage of friends as proxies?

Due to the extend of the information we require binairy encoding. Thus fixed order of fields and efficient (B)encoding.

Protocol

A !BuddyCast4 message contains of information on 16 peers. Those peers are selected from the peer cache on the basis of their properties in terms of taste similarity, latency, etc. Only the top peers for a given property are included, for instance, the top most similar peers and the peers with the smallest Internet distance. The following list shows the complete breakdown for all peers.

ToDo: what is the overhead for the establishment of an overlay connection? This determines the optimal exchange per connection (roughly 1 KByte or 4 KByte)

Top peers Amount
Taste similarity 5
Bartered bytes 5
Social network distance2
Internet distance1
Max upload speed 1
Moderator 1
Random peers 1
Total 16 Peers

For each of those 16 special peers we include the following information in a !BuddyCast4 message.

Field Size in Bytes Remarks
PermID 82
IP address 4
Port number 2
Connectable (boolean) 1
Similarity 2
Last seen (seconds ago)2
Internet distance (ms)2 Measured at the application layer
Max speed from peer (Kbps)2 Use measurements period of several seconds
Total from this peer (KBytes) 4 Zero if have not obtained any bytes from this peers
Total towards peer (KBytes) 4 Zero if have send any bytes to this peers
Social network distance (hops)1 Zero if there is no indirect relation through friends, foaf
Max_flow (KBytes) 2
Total 120+ Bytes

In order not to transfer information multiple times we maintain a database of what information has been sent to which peer. Timestamps are used to keep track of the last timepoint of synchronization with each peer. The delta since each timepoint is stored in the database and can be send to a peer.

After we contact a peer for the first time we send topmost similar peer 1 & 2. In the next Buddycast interval we include most similar peer 3 & 4, etc.