distributed-systems

Kademlia: wth is it? 🤔

A casual explanation of Kademlia, XOR distance, distributed hashmaps, torrents, and why this idea matters for large distributed systems.

#distributed-systems#system-design#ML

The previous week I was reading a paper about Kademlia (wth is it?🤔).

Don't worry it is very clear concept to discuss about, but some prerequisites first.

Prerequisites

1. Bitwise XOR: I know you will say wth but this analogy helps me to remember 😂. Think of XOR as a guy trying to manage a very messy love life (he has a wife and ex-wife😭). In any situation he don't want his wife to know about his ex-wife.

If he is with just his current wife, he is happy (1). If he is with just his ex-wife, he is also happy (1). But if his wife AND his ex-wife are in the exact same room at the exact same time? Absolute disaster (0). If neither of them is there? He is just bored (0). So that is XOR 😂

2. HashMap/Hashtable: If you worked on one programming I am sure you will know it. It is just a dictionary that stores data exactly like this: <Key, Value>. Eg. <Name, Someone>

3. Distributed Systems: In computer science, if a task is too massive for one computer to handle (like storing a petabyte of data or handling a billion requests), we don't build a bigger computer. We get thousands of normal computers to collaborate and share the workload.

About Kademlia

Kademlia is essentially a Distributed Hashmap. In languages like Python or Java, a traditional hashmap stores data strictly on local memory. However, if a dataset is too large for one machine and you need a reliable method to distribute millions of <Key, Value> pairs across a network of independent computers, Kademlia is the blueprint you use to make it happen.

How it works?

In Kademlia, every single file (Key) and every single computer (Node) gets assigned a random 160-bit(20-byte) ID number.

Major takeaway from the paper: Store the file on the computer whose ID number is mathematically closest to the file's ID number.

So to calculate which one is closest we will use some distance metrics aka the Bitwise XOR. To find out how far a computer is from a file, it just XORs their two ID numbers together.

Because XOR math is perfectly symmetrical and consistent, every computer inherently knows exactly which direction to send your search request.

When you search for a file, your computer doesn't check every computer that exists inside the system. It looks at its address book and sends a request to the 3 computers it knows that are mathematically closer to the file's ID. Time Complexity: O(log n), where n is the total number of computers on the entire network.

Where it is used?

Yeah you guessed it! It is in Torrents 😂. The dudes that created torrent are so genius right? The way how it internally works OMG.

I would expect it to be used in LLMs in the future. Since the current models have billions of parameters, we can't store all the layers inside a single computer. Rather, we store the layers across multiple computers then use Kademlia. Just a thought that I have lol 😂

Suggestions

If you want to checkout the math proves why this network never crashes, checkout the Original Paper.

Hope it wasn't boring.