Summary on Recently Discovered V2Ray Weaknesses

Several weaknesses were recently discovered in V2Ray that could be used to identify V2Ray clients or servers running the VMess, TLS or HTTP protocols. Below is our summary and understanding of these weaknesses.

In general, these weaknesses fall into three categories:

Inappropriate authentication in VMess, making the servers vulnerable to replay attacks.
Hardcoded, unusual ciphersuites, leading to rarely seen fingerprints in the TLS ClientHello messages.
A failed attempt to parrot/mimic an HTTP server.

Replay Attacks against the VMess Protocol

As introduced in the specification (English version) of the VMess protocol, a VMess request looks like this:

16 bytes	X bytes	Other Parts
Authentication Credential	Command	Data

The 16-byte Authentication Credential is an HMAC associated with the user ID and a UTC timestamp.
The Command is encrypted using AES-128-CFB(iv, key), where the iv is the MD5 hash value of the UTC timestamp, and the key is the pre-shared key associated with the user ID.

The following table shows the structure of Command after decryption:

1 byte	16 bytes	16 bytes	1 byte	1 byte	4 bits	4 bits	1 byte	1 byte	2 bytes	1 byte	N bytes	P bytes	4 bytes
Version	Encryption IV	Encryption Key	Response Auth V	Options	Margin P	Encrypt Method	Reserved	CMD	Port	Address Type	Address	Random Value	Checksum F

The Encryption IV and the Encryption Key are used to decrypt Data, not Command.
The Margin P and Random Value are used as a padding scheme. Specifically, the 4-bit Margin P gives the length of the Random Value, which therefore ranges from 0 to 15 bytes.
The Checksum F, serving as a MAC, should be the FNV1a hash of all plaintext in Command, excluding itself.
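As a rough illustration of the fields above, the sketch below computes an Authentication Credential and a Checksum F with the Python standard library. It assumes, based on our reading of the VMess specification, that the credential is HMAC-MD5 keyed with the 16-byte user ID over the timestamp encoded as an 8-byte big-endian integer. The function names, the user ID and the timestamp are ours and purely illustrative.

```python
import hashlib
import hmac
import struct
import uuid

FNV32_OFFSET = 0x811C9DC5  # standard 32-bit FNV offset basis
FNV32_PRIME = 0x01000193   # standard 32-bit FNV prime


def auth_credential(user_id: uuid.UUID, timestamp: int) -> bytes:
    """Assumed construction: HMAC-MD5 keyed with the 16-byte user ID."""
    message = struct.pack(">Q", timestamp)  # 8-byte big-endian UTC seconds
    return hmac.new(user_id.bytes, message, hashlib.md5).digest()


def fnv1a32(data: bytes) -> int:
    """32-bit FNV1a hash, the function named for Checksum F."""
    h = FNV32_OFFSET
    for byte in data:
        h ^= byte
        h = (h * FNV32_PRIME) & 0xFFFFFFFF
    return h


# Hypothetical user ID and an arbitrary timestamp, for illustration only.
user = uuid.UUID("00000000-0000-0000-0000-000000000001")
credential = auth_credential(user, 1590883200)
print(len(credential), hex(fnv1a32(b"a")))
```

Note that nothing in the credential depends on the Command that follows it, which is why a recorded credential can be reused with a modified Command while the timestamp is still accepted.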

Inappropriate authentication

On May 31, 2020, @p4gefau1t reported that VMess servers could be identified by replay-based active probing, due to inappropriate authentication.

VMess authenticates each request in two steps, using the Authentication Credential and the checksum. Unfortunately, both of them can be circumvented.

First, the VMess server validates whether the timestamp in Authentication Credential is expired. The expiration time is 120 seconds at maximum and 60 seconds on average (see here and here for implementation details). That is to say, an attacker can record and replay a legitimate Authentication Credential within around 60 seconds to bypass this authentication.

Second, since the AES-CFB mode used to encrypt the Command does not provide any authentication, a MAC-then-Encrypt mechanism is used. As pointed out by @p4gefau1t, VMess fell into the same pitfall as Shadowsocks OTA mode did (see the English summary on the weakness of Shadowsocks OTA mode here). Specifically, since the length of the Random Value varies, the server cannot know where the Checksum F (MAC) is located unless it blindly trusts the unauthenticated value in Margin P (see here for implementation details). In other words, only after reading P+4 bytes can V2Ray validate whether the decrypted content is valid. If it is not valid, the V2Ray server closes the connection.
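The dependency on Margin P can be sketched as follows. This is a simplified model of the decrypted Command, not V2Ray's code: it assumes Margin P is the high nibble of byte 35 (following the field order in the table above), that the checksum is stored big-endian, and that the address length is already known. `build_command` is a hypothetical helper that zero-fills every unrelated field.

```python
import struct

HEADER_LEN = 41       # Version through Address Type, per the table above
MARGIN_P_OFFSET = 35  # byte holding Margin P (high nibble) and Encrypt Method


def fnv1a32(data: bytes) -> int:
    h = 0x811C9DC5
    for byte in data:
        h = ((h ^ byte) * 0x01000193) & 0xFFFFFFFF
    return h


def bytes_needed(header: bytes, addr_len: int) -> int:
    """Total Command length the server must read before it can check F."""
    margin_p = header[MARGIN_P_OFFSET] >> 4  # trusted before any authentication
    return HEADER_LEN + addr_len + margin_p + 4


def checksum_ok(command: bytes, addr_len: int) -> bool:
    total = bytes_needed(command[:HEADER_LEN], addr_len)
    if len(command) < total:
        raise ValueError("a server would keep waiting for more bytes")
    body, checksum = command[:total - 4], command[total - 4:total]
    return struct.unpack(">I", checksum)[0] == fnv1a32(body)


def build_command(margin_p: int, addr: bytes) -> bytes:
    """Hypothetical helper: zero-filled header, address, padding, checksum."""
    header = bytearray(HEADER_LEN)
    header[MARGIN_P_OFFSET] = margin_p << 4
    body = bytes(header) + addr + bytes(margin_p)
    return body + struct.pack(">I", fnv1a32(body))


command = build_command(5, b"\x01\x02\x03\x04")
print(bytes_needed(command[:HEADER_LEN], addr_len=4), checksum_ok(command, 4))
```

Whoever controls the nibble at byte 35 controls how many bytes the server reads before it can reject the request, and that read length is observable from outside.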

The VMess server does have a replay defense mechanism. In particular, the server records the (Encryption IV, Encryption Key) of each request, regardless of the validity of the request, and closes the connection immediately when the (Encryption IV, Encryption Key) has been seen before. Depending on her needs, an attacker can:

bypass this replay defense by altering the (ciphertext of the) Encryption IV or Encryption Key;
or intentionally trigger the replay defense to observe how the server's behavior differs between seeing a given (Encryption IV, Encryption Key) for the first time and seeing it again.

Exploiting these weaknesses, researchers have creatively crafted several replay-based probes to identify VMess servers. We introduce them below in separate sections.

Replays with padding length field changed

Based on @p4gefau1t’s findings, @studentmain proposed and @p4gefau1t improved an attack to identify the VMess servers. For simplicity, below we present this attack in a slightly different way.

The malicious probe is a replay of the legitimate request, with many bytes changed as follows:

16 bytes	41 bytes	M bytes
Auth Info	Malicious Incomplete Command	Zeros

The Malicious Incomplete Command includes:

1 byte	16 bytes	16 bytes	1 byte	1 byte	4 bits	4 bits	1 byte	1 byte	2 bytes	1 byte
Version	Encryption IV	Encryption Key	Response Auth V	Options	Margin P	Encrypt Method	Reserved	CMD	Port	Address Type

In total, the attacker makes 16 connections to the server. In each connection, the attacker:

first sends a replay of the first 16 + 41 bytes of the legitimate connection, with both the last byte of the Encryption Key and the 4-bit Margin P changed to a value different from the ones in other connections;
then sends zero (or random) bytes at a rate of one byte per second until the server closes the connection, recording the number of bytes sent as M.

If the values of M recorded across the 16 connections are 16 distinct integers whose maximum and minimum differ by 15, then it is very likely that the server runs the VMess protocol.
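The decision rule can be written down directly. The function name below is ours, and the measured values in the example are made up for illustration.

```python
from typing import Sequence


def looks_like_vmess(ms: Sequence[int]) -> bool:
    """True if 16 measurements are distinct and span exactly 15 bytes."""
    return len(ms) == 16 and len(set(ms)) == 16 and max(ms) - min(ms) == 15


# Hypothetical measurements: 4-byte address + P-byte padding + 4-byte checksum.
measured = [8 + p for p in (3, 0, 15, 7, 1, 9, 12, 4, 2, 14, 6, 11, 5, 13, 10, 8)]
print(looks_like_vmess(measured))
```

Sixteen distinct integers with a spread of 15 necessarily form a run of 16 consecutive values, which is what one value per possible padding length would produce.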

The explanations of the attack are as follows:

To circumvent the authentication based on Auth Info, the attacker replays an Auth Info sent by the legitimate client within around 60 seconds.
To circumvent the replay defense based on (Encryption IV, Encryption Key), the attacker uses a different value of the Encryption Key in each connection.
To keep random bit errors from reaching the Margin P, the attacker deliberately alters the last byte of the Encryption Key, because this byte lies within the same 16-byte cipher block as the Margin P. (Note that bit errors in AES-128-CFB propagate as follows: changing a bit in cipher block Ci will 1) flip the corresponding bit in plaintext block Pi; and 2) cause random bit errors in the following plaintext block Pi+1.)
The attacker then exploits the malleability of the stream cipher to enumerate all possible values of the 4-bit Margin P in 16 connections.
After reading the 16+41 bytes, the server waits for the Address, padding and Checksum before closing the connection due to a checksum error. Thus, the M measured here is actually the N-byte address + P-byte padding + 4-byte checksum.
The attacker can thus infer the value of Margin P from M, because the padding is the only field whose length varies. (The length of the Address is a fixed value, because the address type is not changed.)
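The malleability used in these steps can be reproduced with a toy model. The sketch below implements full-block CFB on top of a stand-in block function (truncated HMAC-SHA256, which is not AES and is used only because CFB needs just the forward direction of a block cipher). The key, IV and 64-byte Command are hypothetical, and offsets are relative to the start of the Command. In this full-block model the random errors are confined to the single block that follows the altered one.

```python
import hashlib
import hmac

BLOCK = 16
KEY_LAST_BYTE = 32  # last byte of Encryption Key inside Command
MARGIN_P_BYTE = 35  # Margin P is the high nibble of this byte


def toy_block_encrypt(key: bytes, block: bytes) -> bytes:
    """Stand-in for AES-128 block encryption (an assumption, NOT real AES)."""
    return hmac.new(key, block, hashlib.sha256).digest()[:BLOCK]


def cfb_crypt(key: bytes, iv: bytes, data: bytes, decrypt: bool) -> bytes:
    out, prev = bytearray(), iv
    for i in range(0, len(data), BLOCK):
        chunk = data[i:i + BLOCK]
        keystream = toy_block_encrypt(key, prev)
        result = bytes(a ^ b for a, b in zip(chunk, keystream))
        out += result
        prev = chunk if decrypt else result  # feedback is always ciphertext
    return bytes(out)


def make_probe(cipher_cmd: bytes, i: int) -> bytes:
    """Probe i (0..15): fresh key byte and a different Margin P, same block."""
    probe = bytearray(cipher_cmd)
    probe[KEY_LAST_BYTE] ^= i + 1   # XOR with 1..16, so never the original byte
    probe[MARGIN_P_BYTE] ^= i << 4  # XOR the 4-bit Margin P with i
    return bytes(probe)


key, iv = b"k" * 16, b"i" * 16  # hypothetical key and IV
plain = bytes(range(64))        # hypothetical 64-byte decrypted Command
cipher = cfb_crypt(key, iv, plain, decrypt=False)
decoded = [cfb_crypt(key, iv, make_probe(cipher, i), decrypt=True)
           for i in range(16)]
print(sorted(d[MARGIN_P_BYTE] >> 4 for d in decoded))
```

The attacker never learns the key or the original Margin P. XORing the nibble with every value from 0 to 15 is enough to visit all 16 padding lengths exactly once.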

Replays that trigger inconsistent draining behaviors

After the patches to defeat the probes above, @nametoolong found two more types of replay-based probes that can still detect VMess servers. Both of them are related to how the server closes the connection. Below, we introduce the first of them, and we leave the explanation of the second attack as an exercise to the reader.

@nametoolong described the probes and the behaviors of the server as follows:

    Vector 1:
    Let M1 be the first 54 bytes of a valid session.
    Let M2=M1. Tamper with M2[48] (i.e. alter the 49th byte of M2).
    Replay M1. Connection is closed immediately.
    Replay M2. Connection is not closed.
    Replay M2 again. Connection is closed immediately.

The byte 48 (counting from 0) that got changed is the last byte of the Encryption Key.

In this attack, the attacker intentionally triggers the replay defense and observes how the server's behavior differs between seeing a given (Encryption IV, Encryption Key) for the first time and seeing it again. The detailed explanations are as follows:

Since the (Encryption IV, Encryption Key) in M1 is the same as the one in the legitimate connection, the server will detect this replay attack and thus close the connection immediately.
When M2 is sent for the first time, the server has never seen the altered (Encryption IV, Encryption Key), so the probe bypasses the replay defense. The server thus waits for more bytes to come, rather than closing the connection.
When M2 is sent for the second time, the server has already seen the same (Encryption IV, Encryption Key), so it closes the connection immediately.
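The three observations can be reproduced with a toy model of the replay defense. This is a deliberate simplification rather than V2Ray's implementation: it skips decryption and keys the replay set on the raw bytes at offsets 17 to 48 of the request (the positions of Encryption IV and Encryption Key after the 16-byte Auth Info and the 1-byte Version), which is enough because altering ciphertext byte 48 also alters the decrypted key. The class name and the session bytes are ours.

```python
IV_KEY = slice(17, 49)  # Encryption IV + Encryption Key within the request


class ToyServer:
    """Remembers every (IV, Key) it has seen, valid request or not."""

    def __init__(self):
        self.seen = set()

    def handle(self, prefix: bytes) -> str:
        iv_key = prefix[IV_KEY]
        if iv_key in self.seen:
            return "closed immediately"  # replay defense triggers
        self.seen.add(iv_key)
        return "kept open"               # server waits for more bytes


server = ToyServer()
m1 = bytes(range(54))  # hypothetical first 54 bytes of a valid session
server.handle(m1)      # the legitimate session itself registers (IV, Key)

m2 = bytearray(m1)
m2[48] ^= 0xFF         # tamper with the last byte of the Encryption Key
m2 = bytes(m2)

results = [server.handle(m1), server.handle(m2), server.handle(m2)]
print(results)
```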

V2Ray has actually been patched so that it closes connections after reading a random number of bytes within a certain range, or after waiting for a random amount of time within a certain range. However, this attack is still possible because the draining methods are applied inconsistently across different types of errors.

@nametoolong thus suggested:

    Drain the connection on all types of errors.
    It still needs to be considered whether draining the connection itself is an attack vector.

Our comments

Although we do not know whether the GFW uses active probing against the VMess protocol, the attacks proposed above are feasible for the GFW. For example, it has been observed that the GFW is capable of sending replay-based probes with no delay or with an arbitrarily long delay. We will investigate whether the GFW uses active probing against the VMess protocol in future work. In the meantime, it would save us a lot of time if users could report which V2Ray servers were blocked and under what settings.

It may be a good idea to use a replay defense mechanism for the auth info that is based on both an expiration time and a nonce. On one hand, V2Ray uses a replay defense mechanism based on expiration time, so it considers a replay sent within the expiration time to be valid. On the other hand, Shadowsocks-libev uses a replay defense mechanism based on nonces, but this requires the server to remember every nonce until the key is changed. This seems complicated to implement, as the server has to remember the nonces even across a restart of the software. Therefore, a replay defense mechanism based on both expiration time and nonce may be a good choice.
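A minimal sketch of such a combined filter is shown below. It is our own illustration, not code from V2Ray or Shadowsocks-libev. It assumes the nonce and the timestamp are authenticated together by some other means, and the 120-second window is only an example value. Because a request is rejected on its timestamp once the window has passed, each nonce only needs to be remembered until then.

```python
import time


class ReplayFilter:
    """Rejects stale timestamps and, within the window, repeated nonces."""

    def __init__(self, window: float = 120.0, clock=time.time):
        self.window = window
        self.clock = clock
        self.seen = {}  # nonce -> time after which it can be forgotten

    def accept(self, nonce: bytes, timestamp: float) -> bool:
        now = self.clock()
        # Forget nonces whose timestamps can no longer pass the time check.
        self.seen = {n: exp for n, exp in self.seen.items() if exp >= now}
        if abs(now - timestamp) > self.window:
            return False  # expired or too far in the future
        if nonce in self.seen:
            return False  # replay inside the validity window
        self.seen[nonce] = timestamp + self.window
        return True


fake_now = [1000.0]  # controllable clock for the example
replay_filter = ReplayFilter(window=120.0, clock=lambda: fake_now[0])
first = replay_filter.accept(b"nonce-1", 1000.0)
second = replay_filter.accept(b"nonce-1", 1000.0)
fake_now[0] = 1200.0
late = replay_filter.accept(b"nonce-1", 1000.0)
print(first, second, late)
```

The memory needed is bounded by the number of requests arriving within one window, and losing the table on restart only exposes requests whose timestamps are still within the window.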

Frolov et al. found that various popular circumvention tools, including obfs4, Shadowsocks Outline, Psiphon’s OSSH and Lantern’s Lampshade, can be identified using the TCP flags and timing information when the servers close the connections. Frolov et al. thus suggested that servers should “forever read” on errors, so that the probers will be the first to close the connection. This way, it not only reduces the information leaked by server’s timeout value, but also lets the server close the connection with FIN/ACK consistently (see Fig. 1 here for more details).
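The "forever read" idea can be sketched with a blocking socket as below. This is our own minimal illustration, not code from any of the tools mentioned: the server keeps discarding input and closes only after the peer has closed. A real deployment would also have to bound the resources such idle connections consume, which this sketch ignores.

```python
import socket


def drain_until_peer_closes(conn: socket.socket, bufsize: int = 4096) -> int:
    """Discard incoming data and close only after the peer closes first."""
    discarded = 0
    while True:
        data = conn.recv(bufsize)  # blocks; no timeout is set on purpose
        if not data:               # empty read: the peer closed its side
            break
        discarded += len(data)
    conn.close()
    return discarded


# Local demonstration with a connected socket pair standing in for TCP.
server_side, prober_side = socket.socketpair()
prober_side.sendall(b"x" * 10)  # invalid probe data
prober_side.close()             # the prober gives up first
print(drain_until_peer_closes(server_side))
```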

Unique TLS ClientHello Fingerprints

On May 30, 2020, @p4gefau1t reported that V2Ray clients would send TLS ClientHello messages with rarely seen fingerprints. Such unique fingerprints not only gave a censor the opportunity to identify V2Ray clients and servers, but also allowed a censor to accurately block V2Ray's TLS traffic without much collateral damage.

@p4gefau1t further identified that these unique fingerprints were partially caused by the use of a hardcoded ciphersuite. Specifically, this rarely seen ciphersuite would be used when the AllowInsecureCiphers flag was set to its default value, false.

V2Ray developer @xiaokangwang mitigated this weakness by using the default settings of the go-tls library since v4.23.4 (see patches #2510, #2512, #2518). @tomac4t compiled a table comparing the ClientHello fingerprints before and after the patches using tlsfingerprint.io. However, the fingerprints still seem to be quite unique.

To the best of our knowledge, as early as November 2019, @klzgrad had already investigated the fingerprints of V2Ray v4.21.3 as well as many other TLS-based circumvention tools. The results show that most of them have rarely seen TLS ClientHello fingerprints.

Side notes:

As summarized in the Client Hello Fingerprinting section, many works have used ClientHello messages to fingerprint different TLS implementations. Frolov et al. discovered that the TLS ClientHello fingerprints of many popular circumvention tools were very unique (see Table 2 for more details). Frolov et al. thus developed utls and created tlsfingerprint.io.
@p4gefau1t investigated this issue because @rickyzhang82 demonstrated a machine learning model that can identify the TLS traffic by V2Ray with 0.9999 accuracy. The same model, without additional training, could not accurately identify the new TLS traffic of V2Ray after the developers made changes to the fingerprint.
@DuckSoft demonstrated that blocking based on TLS ciphersuites can be written as a one-line iptables rule.
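To illustrate why a hardcoded ciphersuite list is easy to match, the toy sketch below derives an identifier from the ordered list of ciphersuite code points. It is not the algorithm used by tlsfingerprint.io, which takes more ClientHello fields into account, and the lists are IANA-registered code points chosen for illustration rather than V2Ray's actual lists.

```python
import hashlib
import struct
from typing import Sequence


def toy_fingerprint(cipher_suites: Sequence[int]) -> str:
    """Hash the ordered 16-bit ciphersuite code points (toy scheme)."""
    packed = b"".join(struct.pack(">H", suite) for suite in cipher_suites)
    return hashlib.sha256(packed).hexdigest()[:16]


# Illustrative lists only; not taken from any V2Ray release.
hardcoded_client = [0xC02F, 0xC02B, 0x1301]
other_client = [0x1301, 0x1302, 0x1303, 0xC02B, 0xC02F]

blocklist = {toy_fingerprint(hardcoded_client)}
print(toy_fingerprint(hardcoded_client) in blocklist,
      toy_fingerprint(other_client) in blocklist)
```

A client that always offers the same uncommon list in the same order produces the same identifier on every connection, so a single blocklist entry covers all of its traffic.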

Failed to Mimic the HTTP Server

On June 2, 2020, @p4gefau1t reported that V2Ray failed to mimic real HTTP communications. In particular, the two reported issues are:

Both V2Ray clients and servers prepend an HTTP header only to the first TCP payload they send in each connection, making the mimicking traffic easy to detect.
V2Ray servers use a hardcoded 500 response for various types of failures, making the mimicking server easy to distinguish by active probing.

Since the parrot has been dead since 2013, using a real HTTP engine may be a more promising solution here than trying to revive the parrot. Many circumvention tools, including forwardproxy, naiveproxy and trojan, already use the idea of application fronting.

Credits

All credit goes to the authors of the corresponding works.

Thanks

We want to thank @studentmain and @p4gefau1t for helping us understand their proposed replay attacks, and for sharing their inspiring thoughts on future work. We are also grateful to David Fifield and @studentmain for offering detailed feedback on a draft of this summary.

Contacts

This report first appeared on GFW Report. We also maintain an up-to-date copy of the report on both net4people and ntc.party.

We will investigate whether the GFW uses active probing against the VMess protocol in future work. In the meantime, it would save us a lot of time if you, as a user, could report which circumvention services were blocked and under what settings. We encourage you to share your comments publicly or privately. Our private contact information can be found at the footer of GFW Report.