White Paper
United States Patent
The Crosspass protocol is protected by the United States patents US11658823B1, US12113906B1, US11870908B1.
Overview
There are two widely used methods to transfer confidential data. The first method consists of sending URLs containing lookup identifiers that reference data stored in the cloud. When a recipient accesses one of these URLs, the data are retrieved from the cloud and the URL expires. But are the data deleted from the cloud afterwards? There is no way to verify this, and users must trust the service operator. Through either malice or mistake, confidential data may remain in the cloud, where an attacker may steal them at some future time.
A variant of this scheme is to store only encrypted blobs in the cloud and to include the decryption keys in the URLs. Recipients download blobs from the cloud and then decrypt them. However, because these URLs must be sent by some medium (email or IM), there is no guarantee that the decryption keys themselves will not linger in the cloud. Again, the attacker could obtain confidential data, this time by harvesting both the decryption keys and the blobs.
The second method involves asymmetric cryptography and is used by Instant Messengers. They offer end-to-end encryption by storing public keys of users in the cloud and brokering an exchange of these keys. This requires that users trust server APIs to give out correct public keys. A compromised server may send an attacker’s public key, thus positioning him as a Man in the Middle (MITM) who can eavesdrop on the communication and impersonate users.
In order to verify that the public keys are correct, users must use a secondary communication channel to verify the fingerprints of the public keys. However, the fingerprints are as long as four credit card numbers and are bewildering. As a result, public key verification is not enforced by IM apps and users often do not know that they must do it for the end-to-end encryption guarantee.
Crosspass takes a third approach by requiring that users exchange short verification codes (PINs). These codes facilitate negotiating a symmetric encryption key and authenticating the communicating parties. This does not inconvenience users because they are already accustomed to short verification codes when they sign into websites.
The Crosspass server, unlike servers supporting Instant Messengers, does not broker public keys. Consequently, users’ devices need to be online at the time of the transfer in order to communicate peer to peer. Because mobile phones are almost always online through either WiFi or Cellular Data, Crosspass is designed as a mobile app.
Security Model
Crosspass assumes that:
- The mobile devices of the sender and the recipient have not been compromised.
- If the PIN is intercepted by a third party, the intended recipient will use the PIN earlier than the third party.
Protocol
The protocol requires two legs of communication between the sender and recipient and takes on average under 5 seconds.
sequenceDiagram
actor Alice
participant Server
actor Bob
Note left of Alice: (a) store payload on device <br> (b) generate random PIN
Alice->>Server: Register device
Note left of Server: reserve new lookup ID
Server->>Alice: lookup ID
Alice-->>Bob: communicate lookup ID and PIN out-of-band
Bob->>Server: retrieve by lookup ID
activate Bob
Note right of Bob: wait for response
Server->>Alice: relay by<br>Push Notification
Alice->>Server: challenge
activate Alice
Note left of Alice: wait for response
Server->>Bob: relay
deactivate Bob
Note right of Bob: derive shared secret
Bob->>Server: challenge answer
activate Bob
Note right of Bob: wait for response
Server->>Alice: relay
deactivate Alice
Note left of Alice: derive shared secret
Alice->>Server: encrypted payload
Server->>Bob: relay
deactivate Bob
Note left of Server: free lookup ID
Note right of Bob: decrypt payload
Password Authenticated Key Exchange (PAKE)
Crosspass uses the oblivious pseudorandom function (OPRF) described in the OPAQUE protocol to authenticate the two mobile phones peer to peer.
Crosspass uses a cryptographic blinding technique to hide the PIN from the relaying servers and to establish a symmetric key securely. To do this, Crosspass uses the Curve25519 Elliptic Curve group for cryptographic operations. This group has $8q$ elements, an identity element $O$, and an abstract addition operation $\oplus$ on the elements, where $q$ is a large prime number slightly bigger than $2^{252}$. By Cauchy’s theorem this group has a subgroup $G$ with $q$ elements. Because $q$ is prime, this subgroup is cyclic and each of its non-identity elements $P$ is a generator. We can list the elements of $G$ as $O, P, P \oplus P, P \oplus P \oplus P, \dots$ such that the list loops back onto itself once $P$ is added to itself $q$ times. Define $O = 0P$, and $nP$ as $P$ added to itself $n$ times. Now we can rewrite the same list as
$$0P, 1P, 2P, \dots, (q-1)P$$
The integer multipliers $0, 1, 2, \dots, (q-1)$ of $P$ are called scalars. Because these scalars cycle between $0$ and $(q-1)$, they are integers modulo $q$. If $r^{-1}$ is the inverse of a non-zero scalar $r$ modulo $q$, then for any $P$ in $G$:
$$r^{-1}(rP) = P$$
Let $h$ be a hash-to-curve function, hardened with Scrypt, that maps arbitrary strings to the elements of $G$. Designate a PIN by $x$ and let $P_x = h(x)$ be an element of $G$, represented as a 32-byte value.
- The recipient’s device selects an ephemeral 32-byte random scalar $r$ and “mixes” the PIN using scalar multiplication, $$ P_α = rP_x $$ Using the TweetNaCl library this would be computed with scalarMult($ r, P_x $). An eavesdropper who captures $P_α$ would not be able to recover $x$ because it is “blinded” by the ephemeral value $r$, which is unknown to him.
- When the sender’s Crosspass app receives $P_α$, it uses another random scalar $k$ (also 32 bytes long) to compute the following two values, which it sends to Bob’s mobile device:
$$ P_β=kP_α $$ $$ V=kP_0 $$
Here $P_0$ is the generator of $G$, also known as the base point. Using the TweetNacl library, $V$ would be computed as scalarMultBase(k).
After these values have been exchanged, both Alice and Bob can derive the same symmetric cryptographic key.
Let $r^{-1}$ be the inverse of $r$ in group $ Z_q^{*} $ (which can be computed in polynomial time) and consider the scalar product of this value with $P_\beta$:
$$ r^{-1} P_β = r^{-1} (kP_α) = r^{-1} (k (rP_x)) $$
We can reorder the scalars as $r^{-1} r k$ to observe that the $r$ blinder “cancels out” and that we get the equation,
$$ r^{-1} P_β = k P_x $$
The left side of this equation can be computed by Bob and the right side can be computed by Alice. Therefore, Alice derives the encryption key as $$ H(x, V, kP_x) $$ where $H$ is the SHA-256 hash function, whereas Bob derives the encryption key as $$ H(x, V, r^{-1} P_β) $$
Now both Alice and Bob have the same 32-byte symmetric key, and further encryption proceeds by conventional means. In addition to deriving the same key, the protocol ensures that each side has derived the same key before the encrypted payload is released. This makes it possible (a) to mark the exchange as completed once the two sides have authenticated, and (b) to detect a brute-force attack on the PIN and to deny requests after 3 failures.
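The blinding and unblinding steps above can be checked numerically. The following is a minimal sketch, not the Crosspass implementation: it swaps Curve25519 for a tiny multiplicative group modulo the prime 2039 (so the scalar multiplication $nP$ becomes modular exponentiation), replaces the Scrypt-hardened hash-to-curve function with a plain SHA-256 reduction, and assumes a simple separator-based encoding for the inputs to $H$. All function names are hypothetical, and the parameters are far too small to be secure.

```python
import hashlib
import secrets

# Toy parameters (assumption, NOT secure): p = 2q + 1 with q prime, so the
# squares modulo p form a cyclic subgroup of prime order q, mirroring G.
P_MOD = 2039
Q = 1019
BASE = 4  # 2^2 mod p: a non-identity square, hence a generator (plays P_0)


def hash_to_group(pin: str) -> int:
    """Stand-in for h(x): map a PIN to a non-identity subgroup element."""
    digest = hashlib.sha256(pin.encode()).digest()
    u = int.from_bytes(digest, "big") % (P_MOD - 3) + 2  # u in [2, p - 2]
    return pow(u, 2, P_MOD)  # squaring lands in the order-q subgroup


def random_scalar() -> int:
    """Uniform non-zero scalar modulo q."""
    return secrets.randbelow(Q - 1) + 1


def derive_key(pin: str, v: int, shared: int) -> bytes:
    """H(x, V, k*P_x) with SHA-256; the '|' encoding is an assumption."""
    return hashlib.sha256(f"{pin}|{v}|{shared}".encode()).digest()


def run_exchange(sender_pin: str, recipient_pin: str) -> tuple[bytes, bytes]:
    """Run both legs locally and return (Alice's key, Bob's key)."""
    # Recipient (Bob): blind the PIN with an ephemeral scalar r.
    r = random_scalar()
    p_alpha = pow(hash_to_group(recipient_pin), r, P_MOD)  # P_alpha = r*P_x

    # Sender (Alice): apply her scalar k and publish V = k*P_0.
    k = random_scalar()
    p_beta = pow(p_alpha, k, P_MOD)  # P_beta = k*P_alpha
    v = pow(BASE, k, P_MOD)

    # Bob: remove the blinder with r^{-1} mod q, leaving k*P_x.
    r_inv = pow(r, -1, Q)
    bob_shared = pow(p_beta, r_inv, P_MOD)

    # Alice: compute k*P_x directly from the PIN she generated.
    alice_shared = pow(hash_to_group(sender_pin), k, P_MOD)

    return (derive_key(sender_pin, v, alice_shared),
            derive_key(recipient_pin, v, bob_shared))


alice_key, bob_key = run_exchange("4821", "4821")
print(alice_key == bob_key)  # matching PINs yield the same 32-byte key
```

If the recipient enters a different PIN, the two derived keys differ, which is what lets each side detect a wrong guess before the payload is released.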
Symmetric Cryptography
The symmetric cryptography is based on the Advanced Encryption Standard (AES) with 256-bit keys in Galois/Counter Mode (GCM).
Push Notifications
The protocol requires Alice’s mobile device to have an Internet connection (WiFi or Cellular Data) when Bob initiates the first leg of communication. However, her mobile app does not need to be running at that time because it will be activated by a Push Notification. (On iOS this functionality is implemented using a Notification Service Extension.)
Scrypt
As mentioned above, the hash-to-curve function is hardened by Scrypt, which raises the cost of a brute force attack. Its input is formed from the concatenation of the PIN, the note’s lookup ID, and a timestamp nonce. The timestamp helps prevent replay attacks because the sender’s device checks that the retrieval request is fresh.
In an integrated setting, such as with the Crosspass Slack plugin, a salt known to both sender and recipient is added to the Scrypt input.
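The way the Scrypt input is assembled can be sketched as follows. This is an illustration under stated assumptions, not the Crosspass code: the `|` separator, the cost parameters `n`, `r`, `p`, the 60-second freshness window, the domain-separation prefix on the salt, and both function names are hypothetical. Only the ingredients (PIN, lookup ID, timestamp nonce, optional shared salt) come from the description above.

```python
import hashlib
import time
from typing import Optional

MAX_AGE_SECONDS = 60  # assumed freshness window, not stated in the source


def scrypt_pin_hash(pin: str, lookup_id: str, timestamp: int,
                    shared_salt: bytes = b"") -> bytes:
    """Harden the PIN with Scrypt before it is mapped to a curve point."""
    # Concatenate PIN, lookup ID and timestamp nonce; the '|' separator and
    # the text encoding are assumptions made for this sketch.
    material = f"{pin}|{lookup_id.upper()}|{timestamp}".encode()
    # In an integrated setting the shared salt is mixed in here; the fixed
    # prefix is a hypothetical domain-separation label.
    salt = b"crosspass-sketch:" + shared_salt
    # Cost parameters are illustrative; the real ones are not documented.
    return hashlib.scrypt(material, salt=salt, n=2**14, r=8, p=1, dklen=32)


def is_fresh(timestamp: int, now: Optional[int] = None) -> bool:
    """Sender-side replay check: reject stale or future-dated requests."""
    current = int(time.time()) if now is None else now
    return 0 <= current - timestamp <= MAX_AGE_SECONDS
```

Because the timestamp is part of the hashed material, a captured request cannot be replayed later without failing the freshness check on the sender’s device.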
Pseudo Anonymity
Crosspass does not require users to log in, and it does not ask for phone numbers or email addresses. Simply opening the app allows one to use it.
However, Crosspass relies on Push Notification delivery via the Apple Push Notification Service (APNS) for iOS and via the Firebase Cloud Messaging (FCM) for Android. On iOS this exposes Alice’s device ID which Apple Inc. could use to link Alice to a Lookup ID of the note.
Also, Crosspass verifies the in-app purchase receipt on the sender’s side. If Alice uses iOS, then Apple Inc. can use this information to link Alice to a Lookup ID of the note. Similarly, if Alice uses Android, then Google LLC can use this information to link Alice to a Lookup ID of the note.
Anonymous 2FA
When creating a new share, the sender can opt in to enable second-factor authentication (2FA). It is needed whenever he considers it insufficient that his shared note could be opened by an attacker with a probability of 3 in 10,000. (It’s the same difficulty as guessing a bank ATM PIN, given a stolen ATM card.)
Sometimes, there is an implicit 2FA opportunity that reduces the burden on users. In scenarios where Crosspass is integrated with other systems, the second authentication factor is provided by the context. For example, with Crosspass Slack integration, the sender and recipient share a direct message (DM) channel. A random salt—unique per note—is added to the Retrieve button action inside the DM channel, without exposing the salt to the Crosspass server. This salt is automatically appended to the PIN.
Explicit 2FA is often implemented in other systems as a short OTP code delivered via email or SMS. But in Crosspass, the 4-digit PIN already serves as the first factor. Also, emailing or texting recipients as part of the authentication flow would deanonymize them. Thus, Crosspass uses a different approach that preserves anonymity: it provides several 2FA methods, each suited to different scenarios. This allows for a balance between ease of use for the recipient and robust security.
Method 1: Challenge-Response Scheme
In this 2FA variant, the recipient answers a question about himself that is easy for the true recipient but difficult for an attacker to guess. This could be the recipient’s phone number or email address. After all, the sender will know something about the recipient from the means by which he is communicating with him (e.g., WhatsApp, Telegram, email). Thus, while creating a new share, the sender is asked:
What do you know about the recipient? Pick one:
- His email address
- His phone number
- His username on: [Telegram, Signal, Twitter/X, …]
Note: This will not be exposed to servers and will not deanonymize the recipient.
When the time comes for the recipient to answer the chosen question, he is shown a hint—after all, he might have two phone numbers and several active email addresses. The hint takes one of the following forms:
- email: m...g@gmail.com
- phone number: 3xx-7xx-5xxx
- Telegram nickname: @b...z
Even after the hint is revealed, the answer must retain at least as much entropy as a 4-digit PIN, so that the hint does not weaken the security of the system. For this reason, the hint should also omit the length of the nickname or email prefix. (Note that three letters already provide more entropy.)
The hint is transferred to the recipient in the challenge leg (see flow chart above), and the answer must be returned as part of the challenge answer leg. This way, the hint is never exposed to the Crosspass server. (Otherwise, an attacker could retroactively deanonymize the recipient if he later gains access to the sender’s contact lists.)
This method has the advantage of requiring no additional communication between sender and recipient beyond the lookup ID and PIN. However, there is a potential risk: if the sender’s identity is discovered (e.g., via the device’s push-notification identifier such as APNS or FCM), and if the attacker gains access to the sender’s contacts, the recipient could be deanonymized. Such an attack would require the cooperation of Apple, Google, or cellular providers—as well as access to the sender’s contacts—making it unlikely, though theoretically possible.
Method 2: Visual PIN
This 2FA method strengthens authentication by extending the PIN, not by adding digits but by appending 4 hexadecimal digits presented to the user as emojis (for example: 👕 🌹 🔨 🍇). This yields 65,536 possible combinations. Combined with the 10,000 possible 4-digit PINs, this results in a total guess space of 655,360,000, so a single guess succeeds with a probability of 1 in 655,360,000.
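As a quick check of the arithmetic, assuming the four emojis and the four digits are chosen independently and uniformly:

$$ 16^4 = 65{,}536, \qquad 10^4 \times 16^4 = 655{,}360{,}000 $$

With the three wrong attempts that the sender’s device tolerates before locking a note, the attacker’s overall chance is therefore

$$ \frac{3}{655{,}360{,}000} \approx 4.6 \times 10^{-9} $$

compared with $3/10{,}000$ for the 4-digit PIN alone.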
The user selects an emoji from each of the following 4×4 grids, presented one at a time:
(For more details, see the Visual PIN demo.)
The advantage of this method, compared to the previous one, is that it relies on numeric probability rather than on the hope that large internet companies won’t be compromised. The disadvantage is that it adds one more thing to communicate to the recipient: the emojis.
Also, emojis add an element of fun. Back in 1995, when one layman was asked why he preferred Windows 95 over Windows 3.11, he replied, “Because when you copy a file, you see it fly.”
Note: Why not simply add more digits to the PIN instead of using emojis? Because the Crosspass app is designed for laymen, the recipients should not be overwhelmed with what they have to type. To achieve this, Crosspass UX relies on the principle of symmetry. If the lookup ID is 4 letters, so is the PIN. If the PIN is 6 digits, then the lookup ID must also be 6 letters. But while users are used to 6-digit OTPs, a 6-letter lookup ID would feel too long to input manually. Thus, the combination of 4 letters (lookup ID), 4 digits (PIN), and 4 emojis (2FA) enhances security without breaking the symmetry in the UX.
Method 3: Proximity by LAN and Mesh
This 2FA method leverages local network proximity to provide an additional layer of security. Even if an attacker intercepts the PIN, he would need to be physically present on the same local network as the sender within the note’s expiration time to successfully retrieve the note.
This includes conventional LAN, as well as WiFi and Bluetooth mesh communication such as BitChat. For more details, see How Crosspass and BitChat can Strengthen Each Other.
Longer Lookup IDs
The lookup ID is designed to be 4 letters for convenience: it’s easy to type it while looking at it on another device or window, without complex copy-paste flows which increase the attack surface. The downside is that there are only so many combinations of 4 letters, which will be exhausted on a busy system. (An attacker can also stumble upon a valid note by guessing a random 4-letter combination; see the section on brute-force attacks.)
To address this, the Crosspass server sometimes issues longer lookup IDs upon share registration, following these rules:
- 4-letter IDs do not use the letter I
- 8-letter IDs have the letter I in the 4th position
- 12-letter IDs have the letter I in the 4th and 8th positions
Lookup IDs are always case-insensitive and are presented as uppercase letters. Example:
ABCD
ABCI-DEFG
ABCI-DEFI-GHJK
From a UX perspective, the Crosspass mobile app dynamically expands the Lookup ID input to request more letters when the user types I in the last input square. When Crosspass is integrated with communication apps like Slack, Microsoft Teams, or Gmail, the lookup ID can be passed automatically, so the user never needs to type it manually.
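The length rules can be expressed as a small validator. This is a sketch rather than the app’s actual logic: the function names are hypothetical, and the alphabet for the random positions (A to Z without I and O) is an assumption based on the exclusions mentioned in this paper.

```python
# Assumed alphabet for the random positions: A-Z without I and O.
RANDOM_LETTERS = set("ABCDEFGHJKLMNPQRSTUVWXYZ")
MARKER = "I"  # continuation marker closing every group except the last
GROUP = 4


def normalize(raw: str) -> str:
    """Lookup IDs are case-insensitive; hyphens are display-only."""
    return raw.replace("-", "").replace(" ", "").upper()


def expects_more_letters(typed: str) -> bool:
    """True when a completed 4th or 8th square holds the marker letter."""
    letters = normalize(typed)
    return len(letters) in (4, 8) and letters[-1] == MARKER


def is_valid_lookup_id(raw: str) -> bool:
    """Check the 4-, 8- and 12-letter formats described above."""
    letters = normalize(raw)
    if len(letters) not in (4, 8, 12):
        return False
    last = len(letters) - 1
    for i, ch in enumerate(letters):
        # The 4th and 8th positions must be the marker unless they end the ID.
        marker_slot = (i % GROUP == GROUP - 1) and i != last
        if marker_slot:
            if ch != MARKER:
                return False
        elif ch not in RANDOM_LETTERS:
            return False
    return True


def display(raw: str) -> str:
    """Format an ID in uppercase groups of four, e.g. ABCI-DEFG."""
    letters = normalize(raw)
    return "-".join(letters[i:i + GROUP] for i in range(0, len(letters), GROUP))
```

For example, `is_valid_lookup_id("abci-defg")` accepts the 8-letter form, while `expects_more_letters("ABCI")` tells the input widget to reveal four more squares.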
Emoji + 4 letters as lookup ID
When four emojis are used from the user’s perspective for 2FA (representing 16 bits), a portion of these bits can be allocated to the lookup ID. For example, if 4 of the 16 bits are reserved for this purpose, the lookup ID space increases from approximately 460,000 to 7.3 million possible combinations, while the remaining 12 bits still provide sufficient entropy for 2FA — extending the equivalent of PIN space from 10,000 combinations to 40 million.
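The figures in this example follow from a short calculation. It assumes the 4-letter space is counted over all 26 letters, which is what reproduces the quoted figure of roughly 460,000:

$$ 26^4 \times 2^4 = 456{,}976 \times 16 = 7{,}311{,}616 $$

$$ 10^4 \times 2^{12} = 10{,}000 \times 4{,}096 = 40{,}960{,}000 $$

The first line is the enlarged lookup ID space (about 7.3 million); the second is the combined PIN and 2FA space left for authentication (about 40 million).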
Self-Hosted Backends
Corporate clients or technically advanced users with their own IT infrastructure may choose to operate a dedicated Crosspass backend. This may be motivated by a desire to monitor all encrypted note traffic within their environment, or to implement custom controls against brute-force attacks and unauthorized access attempts. In such cases, full backend ownership allows them to apply their own policies and infrastructure-level protections while still benefiting from the Crosspass protocol and client applications.
In scenarios where a third party integrates the Crosspass OEM library into its own applications, that third party assumes responsibility for signing all push notification requests to sending devices. It also hosts the complete backend service to support the Crosspass protocol. In this configuration, the third party owns all of the lookup ID namespace and is responsible for directing recipients to its backend servers, without requiring any mediation from the central Crosspass infrastructure.
Alternatively, a third party may rely on users to continue using the standalone Crosspass app, while still managing all traffic through its own backend. In this case, the main Crosspass server maintains a mapping of lookup IDs to the third-party backend endpoints responsible for handling them, along with the push notification tokens of the original sending devices. When a recipient attempts to access a note using the Crosspass app, the app first contacts the main Crosspass server to resolve the lookup ID. The server responds with the address of the appropriate third-party backend, and the app redirects all subsequent requests to that backend. When a push notification needs to be triggered on the sending device, the third-party backend submits an authorized request to the central Crosspass server, which then delivers the push message using the registered token. The payload of the push notification includes the address of the third-party backend, allowing the device—once awakened—to respond directly to that backend.
In this shared-lookup configuration, all third-party backends operate within the same global lookup ID space, coordinated by the central Crosspass server.
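The shared-lookup flow can be modeled in a few lines. The sketch below is purely illustrative: the class, its methods, the API-key check used to represent an "authorized request", and the example URL are all hypothetical, and real push delivery via APNS or FCM is replaced by an in-memory list.

```python
from dataclasses import dataclass


@dataclass
class Route:
    backend_url: str  # third-party backend responsible for this lookup ID
    push_token: str   # push token of the original sending device


class CentralDirectory:
    """Hypothetical in-memory model of the main server's lookup-ID mapping."""

    def __init__(self) -> None:
        self._routes: dict[str, Route] = {}
        self._api_keys: dict[str, str] = {}  # backend_url -> key (assumed auth)
        self.delivered: list[dict[str, str]] = []  # stands in for APNS/FCM

    def authorize_backend(self, backend_url: str, api_key: str) -> None:
        self._api_keys[backend_url] = api_key

    def register(self, lookup_id: str, backend_url: str, push_token: str) -> None:
        # One global namespace: a lookup ID can belong to only one backend.
        if lookup_id in self._routes:
            raise ValueError("lookup ID already in use")
        self._routes[lookup_id] = Route(backend_url, push_token)

    def resolve(self, lookup_id: str) -> str:
        """Recipient's app asks where to send its subsequent requests."""
        return self._routes[lookup_id].backend_url

    def request_push(self, lookup_id: str, backend_url: str,
                     api_key: str) -> dict[str, str]:
        """Backend asks the central server to wake the sending device."""
        route = self._routes[lookup_id]
        if (route.backend_url != backend_url
                or self._api_keys.get(backend_url) != api_key):
            raise PermissionError("backend not authorized for this lookup ID")
        # The payload carries the backend address so the device can reply to it.
        message = {"token": route.push_token, "backend": route.backend_url}
        self.delivered.append(message)
        return message
```

The point of the design is that the central server only resolves IDs and relays wake-up messages; the protocol traffic itself goes to the third-party backend.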
On-Device Storage
Data on the device is deleted after two weeks at the latest. When the shared item (note or password) is transferred to the recipient, it is deleted from the sender’s mobile device immediately. A received item remains on the recipient’s mobile device for a day and then gets deleted.
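These retention rules amount to a small decision function. The following sketch uses hypothetical names and assumes that `stored_at` is the creation time on the sender’s device and the arrival time on the recipient’s device.

```python
from datetime import datetime, timedelta

MAX_LIFETIME = timedelta(weeks=2)      # upper bound for any stored item
RECEIVED_LIFETIME = timedelta(days=1)  # received items live for one day


def should_delete(role: str, stored_at: datetime, now: datetime,
                  transferred: bool = False) -> bool:
    """Decide whether an item must be purged from the local database."""
    age = now - stored_at
    if age >= MAX_LIFETIME:
        return True  # two-week ceiling applies to everything
    if role == "sender":
        return transferred  # delete as soon as the recipient has the item
    if role == "recipient":
        return age >= RECEIVED_LIFETIME
    raise ValueError(f"unknown role: {role}")
```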
Crosspass stores a cryptographic key on the mobile device. The key is not backed up to the cloud and is only accessible to the Crosspass app. Confidential data is encrypted with this key and is stored in a ciphertext form in a local database on the device.
Crosspass blanks the screens when the app is placed into the background by the operating system. This avoids leaking confidential data via the screenshots that the operating system takes for performance reasons.
Exploits in Files
The Crosspass mobile app allows transferring JPEG images and (soon) PDFs. Both formats can harbor exploits and are carefully handled.
When one normally emails images and PDFs, they are validated by the webmail service (e.g., Gmail). The downside, of course, is that the file attachments are exposed to a third party. In contrast, in an E2EE setting, the safety check must be performed by the recipient’s device.
Crosspass takes the approach of simply stripping anything that could contain exploits before the files are exposed to the recipient.
JPEG
All images are converted to JPEG before they are sent and are reduced in size, while remaining suitable for printing photographs of ID documents. On the receiving side, it is first considered an error if the format is not a valid JPEG or if the data is too large. Second, the image is safely checked for a decompression bomb. Finally, all EXIF metadata is removed by redrawing the image on a clean canvas and returning only the rendered result.
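One way to check for a decompression bomb without decoding the image is to read the declared dimensions from the JPEG frame header first. The sketch below illustrates that idea only; it is not the Crosspass implementation, the size limits and function names are assumptions, and the EXIF-stripping redraw step is not shown because it needs a platform imaging API.

```python
MAX_BYTES = 2 * 1024 * 1024  # assumed upper bound on the received file size
MAX_PIXELS = 12_000_000      # assumed upper bound on width * height

# Start-of-frame markers (SOF0..SOF15) excluding DHT (C4), JPG (C8), DAC (CC).
SOF_MARKERS = set(range(0xC0, 0xD0)) - {0xC4, 0xC8, 0xCC}


def jpeg_dimensions(data: bytes) -> tuple[int, int]:
    """Return (width, height) declared in the frame header, without decoding."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG (missing SOI marker)")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError("corrupt segment marker")
        marker = data[pos + 1]
        if marker == 0xFF:  # fill byte before a marker
            pos += 1
            continue
        if marker == 0x01 or 0xD0 <= marker <= 0xD8:  # markers with no length
            pos += 2
            continue
        length = int.from_bytes(data[pos + 2:pos + 4], "big")
        if length < 2:
            raise ValueError("corrupt segment length")
        if marker in SOF_MARKERS:
            if pos + 9 > len(data):
                raise ValueError("truncated frame header")
            height = int.from_bytes(data[pos + 5:pos + 7], "big")
            width = int.from_bytes(data[pos + 7:pos + 9], "big")
            return width, height
        if marker == 0xDA:  # image data starts; no frame header was seen
            break
        pos += 2 + length
    raise ValueError("no frame header found")


def check_incoming_jpeg(data: bytes) -> tuple[int, int]:
    """Reject oversized files and implausible dimensions before decoding."""
    if len(data) > MAX_BYTES:
        raise ValueError("file too large")
    width, height = jpeg_dimensions(data)
    if width == 0 or height == 0 or width * height > MAX_PIXELS:
        raise ValueError("suspicious dimensions (possible decompression bomb)")
    return width, height
```

Only after these checks pass would the image be handed to a decoder and redrawn on a clean canvas.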
For PDFs, all active scripting elements are removed. On iOS, the PDF is completely flattened using the “print-to-PDF” functionality that iOS provides. On Android, this cannot be done, so all scripting elements, launch actions, and most annotations (including embedded files) are removed.
Timing Attacks
By way of constant-time computation, Curve25519 shields the encryption from CPU side-channel timing attacks. However, timing data can associate the sender and recipient at the network level, nullifying the anonymity of the exchange. For maximum anonymity the recipient should avoid using an IP address that could be connected to him. The recipient can dissociate himself from the IP address by using a public Wi-Fi network. (Note that when a mobile device connects to a Wi-Fi network, it already spoofs the MAC address; on iOS see Settings, Wi-Fi, Settings, Private Wi-Fi Address. This way the operator of the Wi-Fi router will not be able to learn the recipient’s identity.)
What’s more, real-time timing data may also set off a phishing attack against one or both parties. For example, when the recipient starts a Crosspass retrieval, the attacker can make the retrieval fail (for example, through a forced network outage) and then send the recipient an email asking him to reveal the PIN. Staying anonymous to the attacker reduces the likelihood of such attacks. The sender, however, divulges his mobile device ID to the Crosspass server in order to receive Push Notifications, so full anonymity is not feasible for him. An adversary who is able to connect this data to the sender’s identity could initiate a phishing campaign via a secondary channel (email, SMS, etc.). The sender needs to be security aware in order to defend against this attack: he needs to be able to spot a phishing attempt and keep his PIN private. Crosspass will never ask you to enter your PIN anywhere except in the Crosspass mobile or desktop app.
Brute Force Attacks
Brute-force attacks cover the cases in which an attacker attempts to unlock (retrieve and decrypt) a note. Crosspass handles them as follows:
- If Alice’s device observes a wrong PIN three times, the note is locked and cannot be retrieved by Bob even if he provides the correct PIN. Alice must manually unlock the note before Bob can retry. This prevents a brute force attack on Alice’s device.
- If Bob gets an error message that he entered a wrong PIN five times in a row, then the Crosspass app causes him to wait a day before retrying. This prevents a brute force attack on Bob by a MITM impersonating Alice.
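The two counters described above can be sketched as follows. The class and method names are hypothetical; only the thresholds (three wrong PINs on the sender’s side, five in a row on the recipient’s side, a one-day wait) come from the text.

```python
from datetime import datetime, timedelta
from typing import Optional

SENDER_MAX_WRONG = 3
RECIPIENT_MAX_WRONG = 5
RECIPIENT_COOLDOWN = timedelta(days=1)


class SenderNoteGuard:
    """Tracks wrong PINs observed by Alice's device for one note."""

    def __init__(self) -> None:
        self.wrong = 0
        self.locked = False

    def record_attempt(self, pin_ok: bool) -> bool:
        """Return True if the payload may be released for this attempt."""
        if self.locked:
            return False  # even a correct PIN is refused while locked
        if pin_ok:
            return True
        self.wrong += 1
        if self.wrong >= SENDER_MAX_WRONG:
            self.locked = True
        return False

    def unlock(self) -> None:
        """Manual unlock by Alice."""
        self.wrong = 0
        self.locked = False


class RecipientGuard:
    """Tracks consecutive wrong-PIN errors shown to Bob."""

    def __init__(self) -> None:
        self.consecutive_wrong = 0
        self.blocked_until: Optional[datetime] = None

    def may_try(self, now: datetime) -> bool:
        return self.blocked_until is None or now >= self.blocked_until

    def record_result(self, pin_ok: bool, now: datetime) -> None:
        if pin_ok:
            self.consecutive_wrong = 0
            return
        self.consecutive_wrong += 1
        if self.consecutive_wrong >= RECIPIENT_MAX_WRONG:
            self.blocked_until = now + RECIPIENT_COOLDOWN
            self.consecutive_wrong = 0
```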
Horizontal Spray Attack
Since Crosspass uses short 4-letter uppercase lookup IDs for usability—and many of these will be used as the number of users grows—an attacker can eventually try all of them exhaustively using the same PIN, without needing a list of active IDs. Because there are only 10,000 possible PINs, there is a statistical chance that some notes will use the attacker’s chosen PIN.
However, from the sender’s perspective, the chance that it is his specific note that gets retrieved this way is less than 1 in 10,000—lower than the probability of a brute-force attack focused solely on his note.
This attack could only occur if the attacker gains full control of the Crosspass server or if he bypasses throttling mechanisms (e.g., via a botnet). Each attempt requires running a resource-intensive Scrypt hash function, which increases the computational cost of the attack.
Ultimately, this scenario is further mitigated by the use of 2FA—either implicit or explicit—as described in the preceding sections.
Denial of Service (DoS)
The number of API requests to retrieve a particular share is throttled irrespective of the originating IP or device. This limits the total number of Push Notifications to a particular device, thereby mitigating a targeted attack against that device. The remaining discussion concerns the case when different lookup IDs are being requested.
Each shared note in Crosspass is uniquely identified by just four letters, allowing for a maximum of 450,000 active shares at any given time. A share locks up if someone tries to retrieve it three times with the same four-letter code and a wrong PIN. Therefore, a large-scale denial-of-service attack can attempt to lock as many shares as possible by doing this for as many random 4-letter codes as possible. Since DoS attacks are rare, mitigation support should not make the app hard to use when a DoS attack is not occurring. For this reason, Crosspass takes a multistage approach:
- Device Attestation
- CAPTCHA
- Money Hold
First, Crosspass relies on the device attestation services of Apple on iOS and Google Services on Android. Crosspass interfaces with their APIs in order to verify that the user is using a physical device and not a simulator.
Second, some cellular network operators place mobile customers behind a single public IP address. A CAPTCHA is displayed if an excessive number of API calls are coming from the same IP address.
Both CAPTCHA and device attestation methods can be circumvented through inexpensive manual labor, rooted mobile devices, or language model generators. The Money Hold mechanism, however, raises the cost of attempting to brute-force PINs. Here’s how it works: the in-app purchase system on both iOS and Android is used to charge a set amount to the recipient—let’s say \$10 for this example. The Crosspass server will only forward the Push Notification once it verifies a proof of purchase by checking with the relevant store server-to-server (either the App Store or Play Store). If the sender releases the secret message, the Crosspass server observes this event (but not the encrypted content) and then refunds the \$10 to the recipient. However, if there are four incorrect PIN attempts, the recipient forfeits the \$10.
The money hold is activated only if it is detected that the server is under a heavy attack. Note that the sender’s device notifies the Crosspass server when the PIN is wrong. The server, in turn, tracks the total count of wrong PINs for active shares. If the number of active shares is $N$ and the number of wrong PINs is $M$, then the ratio $R = M/N$ can be used to detect a DoS attack early. If $R$ is greater than a certain threshold (e.g. $0.1$), then the money hold is activated. The actual amount held must grow in proportion with the attempted abuse of the Crosspass server.
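The activation rule is simple enough to state in code. This is a sketch with hypothetical function names; the threshold of 0.1 is the example value from the text, and treating an empty server as "no attack" is an assumption.

```python
def wrong_pin_ratio(wrong_pins: int, active_shares: int) -> float:
    """R = M / N, where M counts wrong PINs and N counts active shares."""
    if active_shares <= 0:
        return 0.0  # assumption: with no active shares there is nothing to attack
    return wrong_pins / active_shares


def money_hold_active(wrong_pins: int, active_shares: int,
                      threshold: float = 0.1) -> bool:
    """Activate the money hold only when R exceeds the threshold."""
    return wrong_pin_ratio(wrong_pins, active_shares) > threshold
```

For instance, 50 wrong PINs across 1,000 active shares gives $R = 0.05$ and leaves the hold off, whereas 200 wrong PINs gives $R = 0.2$ and turns it on.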
In summary, this layered approach allows the app to remain simple to use and to mitigate DoS attacks with certainty. Please note that CAPTCHA and Money Hold are not yet implemented.
DoS when the Crosspass server is compromised
The approaches above do not prevent abuse of the sending devices if the Crosspass server itself is compromised. However, the Money Hold concept can be enhanced so that the sender requires the Crosspass server to place funds in escrow when initiating a Push Notification. This can be achieved using a cryptocurrency that supports smart contracts or equivalent features. With this setup, upon successful PIN entry, the escrowed amount is returned to the Crosspass server. If the attempt fails, however, the funds remain on hold for an extended period.
The senders’ devices query the Smart Contract to determine the total funds the Crosspass server currently has on hold. Based on this information, the devices set new minimum amounts that the Crosspass server must escrow for subsequent transactions. These required hold amounts increase exponentially relative to the current total held in escrow. Consequently, a brute-force attack would compel the Crosspass server to continually lock up growing amounts of funds, making such attacks increasingly costly.
Policy on Short and Long Lookup IDs
Longer lookup IDs diminish the possibility for attackers to guess lookup IDs. With 8-letter lookup IDs, seven of the letters are random, since the letter I is fixed in one position. (The letter O is also excluded.) This gives $24^7$ possibilities, no less than 4.5 billion variants. A 12-letter lookup ID, with ten random letters, increases the space to $24^{10}$, more than 63 trillion possibilities.
To prevent short IDs from being quickly exhausted, if a user creates many notes in rapid succession (perhaps, while experimenting or testing an integration), the server assigns longer IDs for those notes.
One potential attack vector is to create numerous app installations (or API keys in the Crosspass desktop app) to consume all short lookup IDs. This risk is mitigated by issuing only long lookup IDs for note registrations from new installations. Once a user is verified as human rather than a bot, subsequent notes he creates receive shorter IDs.