Networking
Introduction
p2panda doesn't have a strict specification for its networking layer. We want p2panda to be usable in all sorts of contexts, be it Bluetooth Low Energy, LoRa, Tor, mesh networks, the internet or anything else.
Terminology
- In p2panda nodes participate actively in the network, storing and replicating data with each other.
- In order to achieve this, certain networking guarantees need to be met:
- Nodes must know or be able to discover the addresses of other nodes on their network
- Nodes must be able to establish (uni- or bi-directional) channels of communication with other nodes at these known addresses
- Patterns needed for achieving these conditions can be described as discovery and connectivity.
While the current replication protocol assumes a bi-directional communication channel, it would be theoretically possible to build p2panda on top of a broadcast-only networking topology (for example on top of LoRa or other radio protocols). The append-only nature of the underlying p2panda data type makes this a good fit.
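To illustrate why append-only data suits a one-way channel, here is a minimal sketch of a receiver that only ever listens. The `Entry` shape (author, sequence number, payload) and the `BroadcastReceiver` class are assumptions made for this example, not p2panda's actual data format or API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Entry:
    """Hypothetical log entry: author, sequence number and payload."""
    author: str
    seq_num: int
    payload: bytes


@dataclass
class BroadcastReceiver:
    """Collects entries heard on a one-way channel, in any order."""
    logs: dict = field(default_factory=dict)  # author -> {seq_num: Entry}

    def receive(self, entry: Entry) -> bool:
        """Store an entry; return False if it was already known."""
        log = self.logs.setdefault(entry.author, {})
        if entry.seq_num in log:
            return False  # a repeated broadcast is a harmless duplicate
        log[entry.seq_num] = entry
        return True

    def contiguous(self, author: str) -> list:
        """Return the gap-free prefix of an author's log, starting at 1."""
        log = self.logs.get(author, {})
        result = []
        seq = 1
        while seq in log:
            result.append(log[seq])
            seq += 1
        return result
```

Because entries are never modified once written, the sender can simply repeat its broadcasts and the receiver fills gaps as they arrive, without ever sending a request back.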
Strategies
Rather than giving strict requirements we are listing currently known, implemented and recommended strategies for different networks to achieve connectivity and discovery of nodes.
Developers can treat these recommendations as starting points: they can experiment with their own approaches, or fork the existing ones and build on top of them.
Internet layer with libp2p
To achieve discovery and connectivity on top of the internet layer of the internet protocol suite, we've successfully implemented a stack based on libp2p, using QUIC with TLS encryption as the underlying transport layer.
libp2p is a collection of general purpose, modular, p2p networking protocols. It solves discovery of nodes in local networks (via mDNS) and the internet (via Rendezvous nodes) and establishes connectivity between them, even when they are behind firewalls or NATs (via UDP holepunching and / or Relay nodes).
This strategy is currently implemented in our aquadoggo reference implementation.
Core Abstractions
libp2p comes with its own set of core abstractions and data types which are used throughout the system. While p2panda does not make use of them in its own core specification, they are part of this strategy.
- Addressing - Working with addresses in libp2p
- Connections and Upgrading - Establishing secure, multiplexed connections between nodes, possibly over insecure, single stream transports
- Peer Ids and Keys - Public key types & encodings, peer id calculation, and message signing semantics
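As a rough illustration of the addressing abstraction, the sketch below splits a textual multiaddress into its protocol components. It is a simplified stand-in written for this page, not the libp2p implementation: the `VALUELESS` set only covers the few protocols used here, and the IP address and port in the usage example are placeholders.

```python
# Protocols that carry no value in this sketch (an illustrative subset only).
VALUELESS = {"quic", "quic-v1", "p2p-circuit"}


def parse_multiaddr(addr: str) -> list:
    """Split a textual multiaddress into (protocol, value) pairs."""
    if not addr.startswith("/"):
        raise ValueError("a multiaddress must start with '/'")
    parts = addr[1:].split("/")
    components = []
    i = 0
    while i < len(parts):
        proto = parts[i]
        if not proto:
            raise ValueError("empty protocol name")
        if proto in VALUELESS:
            components.append((proto, None))
            i += 1
        else:
            if i + 1 >= len(parts):
                raise ValueError(f"protocol {proto!r} is missing its value")
            components.append((proto, parts[i + 1]))
            i += 2
    return components
```

For example, `parse_multiaddr("/ip4/203.0.113.7/udp/2022/quic-v1")` yields the pairs `("ip4", "203.0.113.7")`, `("udp", "2022")` and `("quic-v1", None)`, which shows how one address string describes the whole stack from network layer to transport.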
Diagram
Transport Layer
- aquadoggo uses QUIC as the transport layer for communication between nodes
- libp2p QUIC specification: https://github.com/libp2p/specs/tree/master/quic
Discovery
- Addresses can be added manually if they're known and static
- On the same local network, discovery is achieved via mDNS
- Otherwise we're utilising Rendezvous Nodes to allow discovery over the internet for nodes with dynamic addresses
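The three discovery sources listed above can be combined into a single list of candidate addresses. The following sketch shows one way to do that; the function name and the ordering (manually configured first, then mDNS, then rendezvous) are assumptions for illustration and do not describe how aquadoggo prioritises addresses.

```python
def collect_candidates(static, mdns, rendezvous):
    """Merge addresses from all three sources, dropping duplicates.

    Returns (source, address) pairs. An address found through several
    sources is attributed to the first source that reported it.
    """
    seen = set()
    ordered = []
    sources = (("static", static), ("mdns", mdns), ("rendezvous", rendezvous))
    for source, addrs in sources:
        for addr in addrs:
            if addr not in seen:
                seen.add(addr)
                ordered.append((source, addr))
    return ordered
```

A node with no static configuration and no peers on its LAN would end up with rendezvous results only, which matches the "otherwise" case in the list above.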
mDNS
- Nodes existing on the same LAN can discover each other over mDNS and then initiate connections
- libp2p mDNS discovery specification: https://github.com/libp2p/specs/blob/master/discovery/mdns.md
Rendezvous Nodes
- A rendezvous server handles registering new nodes and making their addresses known to other nodes on the same network
- libp2p rendezvous server specification: https://github.com/libp2p/specs/tree/master/rendezvous
- Any node on the network can act as a rendezvous node
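The register-then-discover pattern of a rendezvous point can be sketched with an in-memory registry. The class below is a toy model: its method names, the `namespace` grouping, the `ttl` default of 60 seconds and the injectable `clock` are all assumptions for this example and are not taken from the libp2p API or from aquadoggo.

```python
import time


class RendezvousRegistry:
    """In-memory stand-in for a rendezvous point (hypothetical API)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._records = {}  # namespace -> {peer_id: (addresses, expires_at)}

    def register(self, namespace, peer_id, addresses, ttl=60.0):
        """Record a peer's addresses until the time-to-live runs out."""
        if not addresses:
            # A node that does not know its own address has nothing to announce.
            raise ValueError("cannot register without at least one address")
        expires_at = self._clock() + ttl
        self._records.setdefault(namespace, {})[peer_id] = (list(addresses), expires_at)

    def discover(self, namespace):
        """Return the unexpired registrations of a namespace."""
        now = self._clock()
        peers = self._records.get(namespace, {})
        return {
            peer_id: addrs
            for peer_id, (addrs, expires_at) in peers.items()
            if expires_at > now
        }
```

Expiring registrations keeps the registry from advertising nodes with dynamic addresses long after they have gone offline or moved.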
Identity
- libp2p relies on the identify protocol to exchange basic information between nodes. This includes identification of external addresses and the exchange of unique identifiers and supported network protocols
- The identify protocol provides a vital mechanism for a node to learn its external address before registering with a rendezvous server. Without this information, rendezvous registration will fail
- p2panda does not have a strong recommendation around node identities. The identity is derived from the key pair and hashed to not leak the original public key. It is possible to generate a new key pair each time a node starts
- libp2p identify protocol specification: https://github.com/libp2p/specs/tree/master/identify
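The idea of deriving an identifier by hashing the public key, and of using fresh key material on every start, can be sketched as follows. This is deliberately simplified: the random bytes stand in for a real public key, and plain SHA-256 in hex is an assumption chosen for readability. libp2p's actual peer id encoding is defined in its Peer Ids and Keys specification and differs from this.

```python
import hashlib
import secrets


def new_key_material() -> bytes:
    """Stand-in for a freshly generated public key (random bytes, not a real key)."""
    return secrets.token_bytes(32)


def derive_node_id(public_key: bytes) -> str:
    """Hash the public key so the identifier does not reveal the key itself."""
    return hashlib.sha256(public_key).hexdigest()


# Generating new key material on each start yields an unrelated identifier.
first_start = derive_node_id(new_key_material())
second_start = derive_node_id(new_key_material())
```

A node that wants a stable identity would persist its key pair instead, so that `derive_node_id` returns the same value across restarts.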
Connectivity
- Once nodes have discovered each other, they need to be able to establish a connection. As stated above, aquadoggo uses QUIC as the transport layer for all application data. However, nodes often exhibit different networking capabilities depending on several factors:
- Do they have a static IP?
- Are they publicly accessible over the internet?
- Are they behind a public or private NAT or firewall?
- Strategies for answering these questions dynamically and negotiating how a connection can be established are required
Direct connection
- The easiest situation is when one node has a public IP address. In this case it can be dialed by the other node on its libp2p multiaddress
- Nodes listen on their announced multiaddresses for incoming connections
- libp2p connection specification: https://github.com/libp2p/specs/tree/master/connections
Relayed connection
- If a node wishes to connect to a second node that is not publicly addressable, a third node with a public address can act as a relay for their messages
- Nodes listen on their announced relay multiaddress for incoming, relayed connections
- libp2p relay specification: https://github.com/libp2p/specs/blob/master/relay/circuit-v2.md
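A relay multiaddress combines the relay's own address with the identity of the node behind it. The helper below sketches that composition following the `/p2p/<relay>/p2p-circuit/p2p/<target>` convention used for circuit relay addresses. The function name is made up for this example, and the peer ids in the usage note are placeholders rather than valid identifiers.

```python
def relayed_address(relay_addr: str, relay_peer_id: str, target_peer_id: str) -> str:
    """Build the address under which a target node is reachable through a relay."""
    base = relay_addr.rstrip("/")
    return f"{base}/p2p/{relay_peer_id}/p2p-circuit/p2p/{target_peer_id}"
```

Calling `relayed_address("/ip4/203.0.113.7/udp/2022/quic-v1", "RELAY_ID", "TARGET_ID")` returns `/ip4/203.0.113.7/udp/2022/quic-v1/p2p/RELAY_ID/p2p-circuit/p2p/TARGET_ID`. Only the relay needs a public address; the target is identified by its peer id alone.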
Direct Connection Upgrade through Relay (DCUtR)
- Where possible, relayed traffic will be upgraded to a direct connection
- This involves a process of gathering knowledge about the nature of the NAT a node is hidden behind and then negotiating a hole-punching procedure which ultimately results in a direct connection being established
- This is not always successful, although using QUIC improves the chances of success
- libp2p DCUtR specification: https://github.com/libp2p/specs/blob/master/relay/DCUtR.md
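Taken together, the three connectivity options form a simple decision: dial directly if possible, otherwise go through a relay and try to upgrade. The sketch below models only that decision. The `Peer` fields, the `Connection` names and the `hole_punch` callable (a stand-in for the whole DCUtR negotiation) are assumptions for illustration and do not mirror libp2p or aquadoggo internals.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Connection(Enum):
    DIRECT = "direct"
    RELAYED = "relayed"
    NONE = "none"


@dataclass(frozen=True)
class Peer:
    """What the local node believes about a remote node (hypothetical fields)."""
    publicly_reachable: bool
    relay_available: bool = False


def connect(remote: Peer, hole_punch: Callable[[], bool]) -> Connection:
    """Return the kind of connection the node ends up with."""
    if remote.publicly_reachable:
        return Connection.DIRECT  # dial the announced multiaddress
    if not remote.relay_available:
        return Connection.NONE  # no public address and no relay to go through
    # Start over the relay, then attempt the upgrade to a direct connection.
    if hole_punch():
        return Connection.DIRECT
    return Connection.RELAYED  # upgrade failed, keep using the relay
```

The important property is the fallback: a failed hole-punching attempt leaves the relayed connection in place rather than dropping the peer.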
aquadoggo networking modes
- In order to enable discovery and facilitate connectivity, either as an edge node or on behalf of edge nodes, any node on the network can serve the above protocols in "client" and/or "server" mode. In short, an aquadoggo node can function in the following modes:
- Rendezvous server
- Rendezvous client
- Relay server
- Relay client
- The network modes can also be combined. For example, a node may run as both a relay client and rendezvous client or both a relay server and rendezvous server.
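Since the modes can be combined freely, they map naturally onto a set of flags. The following sketch models the two example combinations mentioned above. It is an illustration only: the `NetworkMode` type and the variable names are assumptions and do not reflect aquadoggo's actual configuration options.

```python
from enum import Flag, auto


class NetworkMode(Flag):
    """The four modes, modelled as combinable flags (hypothetical type)."""
    RENDEZVOUS_SERVER = auto()
    RENDEZVOUS_CLIENT = auto()
    RELAY_SERVER = auto()
    RELAY_CLIENT = auto()


# An edge node that relies on others for discovery and connectivity.
edge_node = NetworkMode.RENDEZVOUS_CLIENT | NetworkMode.RELAY_CLIENT

# A publicly reachable node that provides both services to others.
public_node = NetworkMode.RENDEZVOUS_SERVER | NetworkMode.RELAY_SERVER
```

A check such as `NetworkMode.RELAY_SERVER in public_node` then tells a node which protocol handlers it needs to start.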
Security and privacy
- These strategies provide very flexible discovery and connectivity building blocks which vary drastically in terms of privacy and security. While aquadoggo by default opts into the most secure setting, different measures should be taken into account depending on the security and privacy requirements of an application
- Utilising Rendezvous and Relay nodes might leak IP addresses on the internet, potentially to untrusted and unknown nodes. It is recommended to keep a table of known IP addresses instead and only connect to them. If this is not an option, it is recommended to run p2panda over an anonymization layer like Tor. If Tor is not an option, it is possible to only create data in a federated setting where a trusted node is statically hosted and used by multiple clients. This node will not forward IP addresses from clients
- Nodes can connect to each other as soon as they are discovered and speak the same protocol. If you need to isolate your network from other "valid" nodes while keeping dynamic discovery intact, a form of protected overlay network is recommended. This can be achieved by using VPNs or by adding authentication to centralised and known rendezvous points. "Network keys" and similar mechanisms are currently not supported by libp2p / QUIC and are also not recommended (redundant and expensive double encryption of transported data)
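The recommendation to keep a table of known IP addresses and only connect to them can be sketched with Python's standard `ipaddress` module. The helper names are made up for this example, the addresses come from documentation ranges, and how such a check would be wired into a node is left open.

```python
import ipaddress


def build_allow_list(entries):
    """Parse single addresses or CIDR ranges into network objects."""
    return [ipaddress.ip_network(entry, strict=False) for entry in entries]


def is_allowed(remote_ip: str, allow_list) -> bool:
    """Return True only if the remote address is in the table of known peers."""
    addr = ipaddress.ip_address(remote_ip)
    return any(addr in network for network in allow_list)


# Placeholder entries: one single host and one trusted range.
known_peers = build_allow_list(["203.0.113.7", "192.0.2.0/24"])
```

With this table, `is_allowed("192.0.2.55", known_peers)` is true while `is_allowed("198.51.100.1", known_peers)` is false, so connection attempts to or from unknown addresses can be rejected before any data is exchanged.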
Tor Strategy
Integration of onion services into aquadoggo is pending until the Rust port arti is ready. For now, it can still be achieved by wrapping the node in an external onion service layer.