IPFS is the InterPlanetary File System, presented by Juan Benet of Stanford. It can be seen as a peer-to-peer exchange of Git objects over a BitTorrent-like protocol, with all peers in a single swarm sharing a single repository. In his paper, Benet describes how it could become the permanent, distributed web. IPFS provides a high-throughput, content-addressed block storage model with content-addressed hyperlinks. This forms a generalized Merkle DAG, a data structure on which one can build a versioned file system.
Key features:
- DHT
  - IPFS uses the S/Kademlia DHT to find peers in the network, query for providers, and get and put values. Each node has a public key, and its NodeId is the hash of that key.
  - Kademlia uses XOR distance to store values on the nodes closest to the key. S/Kademlia adds resistance to Sybil attacks: it requires nodes to create a PKI key pair, derive their identity from it, and sign their messages to each other.
  - It also uses some features of Coral.
- Block Exchange
  - The BitTorrent protocol exchanges data in pieces (blocks). Pieces are exchanged according to a strategy such as tit-for-tat or rarest-piece-first. IPFS uses its own strategy, BitSwap.
  - Unlike BitTorrent, BitSwap is not limited to the blocks in one torrent. BitSwap operates as a persistent marketplace where nodes can acquire the blocks they need, regardless of which files those blocks are part of.
  - The BitSwap strategy is based on a credit and debt ratio. A peer's debt ratio increases when it has received more bytes than it has sent. Nodes send blocks to debtor peers probabilistically, and the probability drops as the debt ratio grows.
- Merkle DAG
- Mutable Namespace
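Two of the mechanisms in the list above, XOR distance and the BitSwap debt ratio, can be sketched in a few lines of Python. This is only an illustration, not how any IPFS implementation is written: the function names are made up for this post, SHA-256 is assumed as the hash that turns a public key into a NodeId, and the two BitSwap formulas are the ones given in the IPFS paper (r = bytes_sent / (bytes_recv + 1) and P(send | r) = 1 - 1 / (1 + exp(6 - 3r))).

```python
import hashlib
import math
from typing import List


def node_id(public_key: bytes) -> int:
    """NodeId as the hash of the public key (SHA-256 is an assumption here)."""
    return int.from_bytes(hashlib.sha256(public_key).digest(), "big")


def xor_distance(a: int, b: int) -> int:
    """Kademlia distance between two IDs: bitwise XOR, read as an integer."""
    return a ^ b


def closest_nodes(key: int, nodes: List[int], k: int = 2) -> List[int]:
    """Return the k node IDs closest to key; these are the ones that store the value."""
    return sorted(nodes, key=lambda n: xor_distance(n, key))[:k]


def debt_ratio(bytes_sent: int, bytes_recv: int) -> float:
    """Debt ratio of a peer: bytes we sent to it over bytes we received from it."""
    return bytes_sent / (bytes_recv + 1)


def send_probability(r: float) -> float:
    """Probability of sending a block to a peer whose debt ratio is r."""
    return 1 - 1 / (1 + math.exp(6 - 3 * r))


if __name__ == "__main__":
    print(closest_nodes(0b1000, [0b1001, 0b0001, 0b1100]))
    for r in (0.0, 1.0, 2.0, 4.0):
        print(r, round(send_probability(r), 4))
```

With this function a peer with no debt is almost always served, a peer with a debt ratio of 2 is served about half the time, and beyond that the probability falls off quickly.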
IPFS Objects and Merkle DAG
An IPFS object has the following structure:
- Links - an array of the links it references
- Data - a byte array (a blob smaller than 256 KB)
An IPFS link has the following structure:
- Name - the string name of the link
- Hash - the hash of the linked IPFS object
- Size - the total size of the target object
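The two structures above map naturally onto a pair of small classes. The sketch below is my own illustration (the class and method names are hypothetical, not part of any IPFS library); it only assumes the JSON shape that ipfs object get prints, with "Links", "Data", "Name", "Hash" and "Size" keys.

```python
import json
from dataclasses import dataclass, field
from typing import List


@dataclass
class IPFSLink:
    name: str   # empty for the chunks of a single file
    hash: str   # hash of the linked object
    size: int   # total size of the target object


@dataclass
class IPFSObject:
    links: List[IPFSLink] = field(default_factory=list)
    data: str = ""

    @classmethod
    def from_json(cls, text: str) -> "IPFSObject":
        """Build an object from the JSON text printed by `ipfs object get`."""
        raw = json.loads(text)
        links = [
            IPFSLink(link["Name"], link["Hash"], link["Size"])
            for link in raw.get("Links", [])
        ]
        return cls(links=links, data=raw.get("Data", ""))

    def is_leaf(self) -> bool:
        """A leaf carries data only and references no other objects."""
        return not self.links
```

For example, IPFSObject.from_json(output).links gives the list of children, and is_leaf() tells a small file apart from a directory or a chunked file.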
Here is an example directory. file.txt and ss have the same content, so their hashes must be the same.
ipfs object get QmawZYe7nVgbonstM9YLkbJPrwaSMAJ7nkWsPFxHJbCLRF
Running this on the root object gives the following output:
{
"Links": [
{
"Name": "2A94M5J1Z",
"Hash": "QmNhPUwuUQ1uD1n22h2CEBFLKPCExCiVc7rcgHmMftmzsv",
"Size": 12562
},
{
"Name": "bank-full.csv",
"Hash": "QmXhyWEd21XEv4pJGHbxoFq6oud3HhADQjw6f5xR4NwDvo",
"Size": 4611473
},
{
"Name": "file.txt",
"Hash": "QmXrP2yBFo1jvWw2WnY1mdCYJdiabW1WCmQwsYw1Ltfd2M",
"Size": 32
},
{
"Name": "shogun",
"Hash": "QmdWtUhQzAX6e2xpDxZTJEwobHzUTuuVBWaYM8D5rzMTQs",
"Size": 622130
}
],
"Data": "\u0008\u0001"
}
As you can see, the link names in a directory object are the names of its files and subdirectories, but the links inside an individual file have no names. Also, a file smaller than 256 KB does not reference any other objects, i.e. its Links array is empty. Here file.txt is small and bank-full.csv is large.
ipfs object get QmXhyWEd21XEv4pJGHbxoFq6oud3HhADQjw6f5xR4NwDvo
Output for bank-full.csv:
{
"Links": [
{
"Name": "",
"Hash": "QmRA9jHW1DFa4brtGSSmWeEpXRX5apS7zxvAfgbJ3F599N",
"Size": 262158
},
{
"Name": "",
"Hash": "QmNN8xinNToC6sz7xHMcBe6YPyd8Ryx3wWqkEeRYUTEEhn",
"Size": 262158
},
{
"Name": "",
"Hash": "QmbSXZPGz7GiMz3iP6r7V6zMCxhT2EzTZGVkdJc3mcXPkj",
"Size": 262158
},
{
"Name": "",
"Hash": "QmUEEzoSFDVQwKSZmQMW8U79jUptjLkJAcjMbZjoWrsnKa",
"Size": 262158
},
{
"Name": "",
"Hash": "QmQwWkwAiHDuuuYTX8S1Hbks7USkfaD7A5Vf8Qmpyz1uaP",
"Size": 262158
},
{
"Name": "",
"Hash": "QmY4QEsrrCWdmqAKtSUZWtpTPd58niySdHsq4YXH59ZpiK",
"Size": 262158
},
{
"Name": "",
"Hash": "Qmbp6oskBFZGE3AQhjmm8ZRzZ1rCRaWzUg34zQK7SP8Mxm",
"Size": 262158
},
{
"Name": "",
"Hash": "QmQyn37YawL1mCGs3SNmyLNRi1AuXsaNNwWVxkzuomTvQX",
"Size": 262158
},
{
"Name": "",
"Hash": "QmbjD9fqBk9kGF9W5vFFLHcnfiiXH8zE2pVRwBTWjxGdV3",
"Size": 262158
},
{
"Name": "",
"Hash": "QmU42pLqrKNp3hDNfgw74omWaqLrjMBWw3Uvx98d2CNn2u",
"Size": 262158
},
{
"Name": "",
"Hash": "QmPNrToiZUfUEC2w75bw51GizQPP9xwm6wa56vKgGfHZW3",
"Size": 262158
},
{
"Name": "",
"Hash": "QmSu1UK8xqvDbWZSTvHzYxEPz2qLNTcii5NVd7NSnDcSAm",
"Size": 262158
},
{
"Name": "",
"Hash": "QmVxGKfp77DfPUjvzKfKx8bpYDbSHZtrmSXzz8wyD7t7nH",
"Size": 262158
},
{
"Name": "",
"Hash": "QmZaiEbhTiXt7rvwPSR9FS6WyEosj2KmZdLqxPeZ8WCYrt",
"Size": 262158
},
{
"Name": "",
"Hash": "QmWhkGkiw5REEqntnke2v6SbzqpF5SctuwKtwngu28sARv",
"Size": 262158
},
{
"Name": "",
"Hash": "QmQsv8Nbfvt1RjtxbU5gyQLVprJ6Uz81N5HdAiffJ6zRoX",
"Size": 262158
},
{
"Name": "",
"Hash": "QmTZwqWCVjs876DQBQxZhG5XngVPpXAz8h8fsRonMQGruW",
"Size": 262158
},
{
"Name": "",
"Hash": "QmaMFU4hByFEpAX6ZEvcvia2Su8xwbKMfTTL74VMh3rYRM",
"Size": 153914
}
],
"Data": "\b\u0002\u0018���\u0002 ��\u0010 ��\u0010 ��\u0010 ��\u0010 ��\u0010 ��\u0010 ��\u0010 ��\u0010 ��\u0010 ��\u0010 ��\u0010 ��\u0010 ��\u0010 ��\u0010 ��\u0010 ��\u0010 ��\u0010 ��\t"
}
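The sizes in the two outputs above can be cross-checked with a little arithmetic. The numbers below are copied from the listings. The interpretation in the comments is my own reading, not something the tool states: the few extra bytes per chunk look like wrapper overhead, and the remainder looks like the serialized bank-full.csv node itself.

```python
# Sizes of the 18 unnamed links under bank-full.csv, copied from the output above.
chunk_link_sizes = [262158] * 17 + [153914]

# Size of the bank-full.csv link as reported in the root directory object.
file_link_size = 4611473

# 256 KB of file data per full chunk.
chunk_payload = 256 * 1024  # 262144

total_of_chunks = sum(chunk_link_sizes)             # 4610600
per_chunk_overhead = 262158 - chunk_payload         # 14, presumably wrapper bytes
node_own_bytes = file_link_size - total_of_chunks   # 873, presumably the node's own links and Data

print(total_of_chunks, per_chunk_overhead, node_own_bytes)
```

So Size really is the total size of the target: the chunks plus the object that links to them.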
ipfs object get QmXrP2yBFo1jvWw2WnY1mdCYJdiabW1WCmQwsYw1Ltfd2M
Output for file.txt (or ss):
{
"Links": [],
"Data": "\b\u0002\u0012\u0018dfad\nf\nc\na\nadkfakdfmaaa\n\u0018\u0018"
}
Merkle DAG
Merkle trees and DAGs are used in Git, Bitcoin, and cryptography in general. Each node has a hash, which is computed from the combined hashes of its children. The root hash is the final hash of the object.
The leaf nodes contain the data, and their Links arrays are empty. A large file that is split into many IPFS objects/blocks does not have a link name for each one. graphmd can also be used to visualize the graph.
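The hashing rule described above can be sketched as a toy two-level Merkle structure. This is not the real IPFS encoding (IPFS hashes a serialized object, and the hashes shown in this post are encoded differently); the function names and the "leaf:"/"node:" prefixes are made up here to keep the example self-contained.

```python
import hashlib
from typing import List


def leaf_hash(data: bytes) -> str:
    """Hash of a leaf: depends only on the data it holds."""
    return hashlib.sha256(b"leaf:" + data).hexdigest()


def node_hash(child_hashes: List[str]) -> str:
    """Hash of an inner node: the hash of its children's hashes combined, in order."""
    return hashlib.sha256(("node:" + "".join(child_hashes)).encode()).hexdigest()


def root_hash(chunks: List[bytes]) -> str:
    """Root of a two-level DAG: one leaf per chunk, one parent over all leaves."""
    return node_hash([leaf_hash(chunk) for chunk in chunks])


if __name__ == "__main__":
    same = root_hash([b"part one", b"part two"]) == root_hash([b"part one", b"part two"])
    changed = root_hash([b"part one", b"part two"]) != root_hash([b"part one", b"part 2"])
    print(same, changed)
```

Identical content always produces the same hash, which is why file.txt and ss share one object, and changing any leaf changes the root.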
Versioning
IPFS uses Git-like commit trees. Unchanged files point to previous objects. In IPFS, files are split into blocks, so if part of a large file changes, only the new objects are added to the tree; the rest is deduplicated. The object types are:
- block : a variable-size block of data.
- list : a collection of blocks or other lists.
- tree : a collection of blocks, lists, or other trees.
- commit : a snapshot in the version history of a tree.
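The deduplication described above follows directly from content addressing. Here is a minimal sketch, assuming fixed-size chunks and a plain dictionary as the block store (both are simplifications, and the names are hypothetical): storing a second version of a file only adds the chunks whose content changed.

```python
import hashlib
from typing import Dict

CHUNK_SIZE = 256 * 1024  # same chunk size as in the outputs above


def add_version(store: Dict[str, bytes], data: bytes, chunk_size: int = CHUNK_SIZE) -> int:
    """Split data into chunks, store each under its hash, return how many were new."""
    new_chunks = 0
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        key = hashlib.sha256(chunk).hexdigest()
        if key not in store:  # a chunk that is already stored is shared, not copied
            store[key] = chunk
            new_chunks += 1
    return new_chunks


if __name__ == "__main__":
    store: Dict[str, bytes] = {}
    v1 = b"aaaabbbbccccdddd"
    v2 = b"aaaabbbbXXXXdddd"  # only the third chunk differs
    print(add_version(store, v1, chunk_size=4))
    print(add_version(store, v2, chunk_size=4))
```

Note that this works for an in-place change. With fixed-size chunks, inserting bytes in the middle of a file shifts every later chunk boundary, so those chunks would hash differently.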
Sharing Files
Now to the main use case: sharing files with peers. I added some files, sent the hash to my friend, and asked him to fetch it:
ipfs get QmawZYe7nVgbonstM9YLkbJPrwaSMAJ7nkWsPFxHJbCLRF
But even after waiting a long time, nothing downloaded on his PC, and I didn't understand why. He was also not in my list of peers (ipfs swarm peers), yet he was able to download the files via the browser:
https://ipfs.io/ipfs/QmawZYe7nVgbonstM9YLkbJPrwaSMAJ7nkWsPFxHJbCLRF
The problem was that he had a different version of IPFS than mine. You can check the version via ipfs id:
"AgentVersion": "go-libp2p/0.1.0",
"ProtocolVersion": "ipfs/0.1.0"
Once we were both on the same version, I was able to download the file he sent instantly. His ID is the highlighted one.
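A quick way to catch this kind of mismatch is to compare the version fields of two ipfs id outputs. The sketch below assumes each peer has saved the JSON printed by ipfs id as text; the function names are hypothetical, and only the "AgentVersion" and "ProtocolVersion" fields shown above are used.

```python
import json
from typing import Tuple


def version_info(ipfs_id_json: str) -> Tuple[str, str]:
    """Extract (AgentVersion, ProtocolVersion) from the JSON printed by `ipfs id`."""
    info = json.loads(ipfs_id_json)
    return info.get("AgentVersion", ""), info.get("ProtocolVersion", "")


def same_version(mine: str, theirs: str) -> bool:
    """True if both peers report identical agent and protocol versions."""
    return version_info(mine) == version_info(theirs)


if __name__ == "__main__":
    mine = '{"AgentVersion": "go-libp2p/0.1.0", "ProtocolVersion": "ipfs/0.1.0"}'
    print(same_version(mine, mine))
```

If same_version returns False, upgrading one side is worth trying before debugging anything else.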
I was also able to download the file via java-ipfs-api.
Things to note: java-ipfs-api requires JDK 1.8 as the target. When I tried to run my code, I got a major.minor version error. Also, the daemon must be running before you run the code:
ipfs daemon
Objects added via ipfs add are pinned.
ipfs pin ls shows all the pinned IPFS objects, and you serve them while the daemon is running.
IPNS
If the content changes, the hash changes. If you need to serve mutable content, you can do so via IPNS. All you have to do is publish the IPFS path under your public key:
ipfs name publish /ipfs/QmawZYe7nVgbonstM9YLkbJPrwaSMAJ7nkWsPFxHJbCLRF
/ipns/<your public key> (more precisely, your NodeId, the hash of your public key) resolves to the contents linked above. Using this IPNS link, we can later publish a new IPFS path under our public key, and other users do not need to be given the new IPFS link.
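Conceptually, IPNS is a mutable pointer from a fixed name to the current immutable path. The toy model below shows only that idea: the class is hypothetical, the peer ID is a placeholder string, and record signing and distribution over the network are left out entirely.

```python
from typing import Dict


class ToyNameSystem:
    """Toy stand-in for IPNS: one mutable record per peer ID, no signatures."""

    def __init__(self) -> None:
        self._records: Dict[str, str] = {}

    def publish(self, peer_id: str, ipfs_path: str) -> None:
        """Point /ipns/<peer_id> at an immutable /ipfs/... path, replacing any old one."""
        if not ipfs_path.startswith("/ipfs/"):
            raise ValueError("expected a path starting with /ipfs/")
        self._records[peer_id] = ipfs_path

    def resolve(self, ipns_path: str) -> str:
        """Return the /ipfs/... path currently published under an /ipns/... name."""
        prefix = "/ipns/"
        if not ipns_path.startswith(prefix):
            raise ValueError("expected a path starting with /ipns/")
        return self._records[ipns_path[len(prefix):]]


if __name__ == "__main__":
    names = ToyNameSystem()
    names.publish("my-peer-id", "/ipfs/QmawZYe7nVgbonstM9YLkbJPrwaSMAJ7nkWsPFxHJbCLRF")
    print(names.resolve("/ipns/my-peer-id"))
```

Publishing again under the same peer ID changes what the /ipns/ name resolves to, while the name that other users hold stays the same.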