A typical DNS poisoning event consists of three steps:
But what will happen after step 3?
While packets sent to these wrong IP addresses are often believed to be dropped or null-routed, in this report we document an interesting phenomenon: the GFW pretends to be some of the injected IP addresses in order to accept (or deny) TCP handshakes from clients. Such behavior tricks censored clients into sending data that would never be sent if the handshakes were not accepted. The censor can thus learn more about the effect of its censorship and about the intentions of clients.
We characterize this behavior and fingerprint the censoring machine.
Our findings suggest that the censoring machine is likely to be stateless;
it also employs some load-balancing that accepts around 75%
of the traffic.
We locate the injecting point of the censoring machine,
and find that DNS queries or TLS connections to these IP addresses are not affected by the DNS or SNI censorship.
Finally, we suggest that users encrypt their DNS queries and
block all outgoing traffic to these injected IP addresses.
The load-balancing algorithm is associated with the (srcIP, dstIP, srcPort, dstPort) tuple.
Our investigation started from something we could not explain.
As shown in Figure 3 of Anonymous et al. 2020, when testing the reachability of the injected IP addresses from China and the US, around 40% of the results are different. For example, 0.4% of the IP-port pairs are open when tested from China, but are filtered when tested from the US. (See the definitions of the port states open, closed and filtered in the Nmap manual.)
To answer why such inconsistency exists, we re-tested their reachability. We first obtained the injected IP addresses from the open dataset. We then used Nmap to SYN ping all 65535 ports of the 215 IP addresses from China:
nmap -iL ips_after_drop.txt -p1-65535 -Pn --min-rate=500 -oX all_ips_65535.xml
In total, we discovered 14 IP addresses that have at least one non-filtered port; the remaining 201 IP addresses did not appear to respond on any port.
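As a rough sketch of how such a scan could be summarized, the following Python snippet parses Nmap XML output and lists, for each host, the ports whose state is not filtered. The function name `non_filtered_ports`, the inline sample, and its documentation-range addresses are illustrative assumptions and not part of the original analysis; the element and attribute names follow Nmap's XML output format.

```python
import xml.etree.ElementTree as ET

def non_filtered_ports(xml_text):
    """Map each scanned IP to its (port, state) pairs that are not 'filtered'."""
    result = {}
    root = ET.fromstring(xml_text)
    for host in root.iter("host"):
        addr = host.find("address")
        if addr is None:
            continue
        ports = []
        for port in host.iter("port"):
            state = port.find("state")
            if state is not None and state.get("state") != "filtered":
                ports.append((int(port.get("portid")), state.get("state")))
        if ports:
            result[addr.get("addr")] = sorted(ports)
    return result

# Hypothetical two-host sample; real input would be the file written by -oX.
SAMPLE = """<nmaprun>
  <host><address addr="203.0.113.1" addrtype="ipv4"/>
    <ports>
      <extraports state="filtered" count="65533"/>
      <port protocol="tcp" portid="443"><state state="open"/></port>
      <port protocol="tcp" portid="1723"><state state="closed"/></port>
    </ports>
  </host>
  <host><address addr="203.0.113.2" addrtype="ipv4"/>
    <ports><extraports state="filtered" count="65535"/></ports>
  </host>
</nmaprun>"""

summary = non_filtered_ports(SAMPLE)
print(summary)
```

For a real scan, one could pass the contents of all_ips_65535.xml to the same function and count the keys of the returned dictionary.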
Based on the port states, we categorized these 14 IP addresses into three groups:
Group 1: all ports are open when tested from China, but all are filtered when tested from the US:
8.7.198.46
46.82.174.69
59.24.3.174
93.46.8.90
Group 2: a few closed ports, and all other ports are filtered when tested from China. All ports are filtered when tested from the US:
8.7.198.45
67.228.126.62
93.46.8.89
118.5.49.6
188.5.4.96
203.98.7.65
208.101.48.171
Group 3: a few open ports and a few closed ports; the rest are filtered when tested from China. Similar results with slight differences were discovered when tested from the US:
31.13.64.49
31.13.72.54
31.13.85.1
We first discuss the four IP addresses in group 1:
8.7.198.46
46.82.174.69
59.24.3.174
93.46.8.90
We call them listening IPs because all their ports from 1 to 65535 were observed to accept TCP handshakes when tested from China.
Prior work shows that the GFW uses multiple DNS injectors, and that each maintains a separate blacklist (see Figure 5) and injects a different set of IP addresses (see Table 3). Therefore, it is natural to ask which injectors inject these listening IPs, and which domains they are used to poison.
These questions can be answered easily using the open dataset:
grep --max-count 1 "59\.24\.3\.174" injector*.csv
Interestingly, we found that these four listening IPs exactly match the first row of Table 3. In other words, they happen to be the same four IP addresses used by injector 1, and they are used to poison 88 domains:
cut -d";" -f2 injector1.csv | sort | uniq
www.8800.org
www.expressvpn.com
www.google.as
www.google.bf
www.google.bi
www.google.bj
www.google.bs
www.google.bt
www.google.by
www.google.cat
www.google.cd
www.google.cg
www.google.ci
www.google.cm
www.google.co.ao
www.google.co.ck
www.google.co.ls
www.google.com.af
www.google.com.ag
www.google.com.ai
www.google.com.ar
www.google.com.bd
www.google.com.bz
www.google.com.cu
www.google.com.do
www.google.com.eg
www.google.com.et
www.google.com.fj
www.google.com.gh
www.google.com.gi
www.google.com.lb
www.google.com.ly
www.google.com.mm
www.google.com.ng
www.google.com.np
www.google.com.pg
www.google.com.pk
www.google.com.py
www.google.com.sb
www.google.com.sl
www.google.com.tj
www.google.com.vc
www.google.co.mz
www.google.co.tz
www.google.co.ug
www.google.co.uz
www.google.co.ve
www.google.co.zw
www.google.cv
www.google.dj
www.google.ga
www.google.gg
www.google.gl
www.google.gm
www.google.gp
www.google.gy
www.google.hn
www.google.ht
www.google.im
www.google.iq
www.google.it
www.google.je
www.google.kg
www.google.ki
www.google.la
www.google.li
www.google.me
www.google.mg
www.google.ml
www.google.mn
www.google.mv
www.google.mw
www.google.ne
www.google.pn
www.google.ps
www.google.rs
www.google.sm
www.google.sn
www.google.so
www.google.sr
www.google.st
www.google.td
www.google.tg
www.google.tl
www.google.tm
www.google.to
www.google.ws
www.kuniao.com
As listed above, most of these domains are related to Google, with three exceptions:
www.kuniao.com (see the related censorship event)
www.8800.org (archive)
www.expressvpn.com
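The grep and cut lookups above can also be sketched in Python, which makes it easy to match an IP address as a whole field rather than as a substring. This is only a sketch under assumptions: the semicolon-delimited layout and the domain being the second field are inferred from the cut command, and the function name `domains_for_ip` is hypothetical.

```python
import csv
import glob

def domains_for_ip(ip, pattern="injector*.csv"):
    """Return {filename: sorted unique domains} for rows that contain `ip`."""
    found = {}
    for path in sorted(glob.glob(pattern)):
        domains = set()
        with open(path, newline="") as fh:
            for row in csv.reader(fh, delimiter=";"):
                # Assumption: the domain is the second field, as in `cut -f2`.
                if len(row) >= 2 and ip in row:
                    domains.add(row[1])
        if domains:
            found[path] = sorted(domains)
    return found
```

Calling `domains_for_ip("59.24.3.174")` in the dataset directory would then report which injector files mention that address and which domains they poison with it.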
The behavior of the censoring machine is rather simple. It spoofs these four listening IPs and:
If a packet has the SYN flag turned on AND the PSH and ACK flags turned off, it replies with a SYN+ACK packet.
If a packet has the PSH flag turned on, it replies with a single RST packet.
You can also quickly test it using Nping:
sudo nping -c 0 --tcp 59.24.3.174 -p65535 --flags S
sudo nping -c 0 --tcp 59.24.3.174 -p65535 --flags P
While, in theory, the GFW has the capability of replying to clients’ requests, we did not observe such behaviors. The GFW appears to tear down connections regardless of the data sent by our clients. We tested this by trying to establish typical HTTP or TLS connections with these four IP addresses from China. The three commands we used are as follows:
# HTTP GET
wget http://59.24.3.174 -v
# TLS with SNI=www.google.com
openssl s_client -servername www.google.com -tlsextdebug -msg -connect 59.24.3.174:443
# TLS with SNI=www.baidu.com
openssl s_client -servername www.baidu.com -tlsextdebug -msg -connect 59.24.3.174:443
As a result, all our connections were RSTed after successful handshakes.
We have two different conjectures on why the censor behaves this way.
The first conjecture is that the censor simply intends to disrupt TCP connections to these four IP addresses.
Indeed, the forged SYN+ACKs and RSTs are reminiscent of the GFW’s obsolete way of disrupting TCP connections during the residual censorship period.
(Note, we say “obsolete” because the GFW does not send any packet during the 60-second residual censorship anymore as of November 10, 2020. We triggered the residual censorship by using a TLS ClientHello with sensitive SNI.)
As introduced by Wang et al. in the TCP connection reset section:
“During this period, any SYN packet between the two end-hosts will trigger a forged SYN/ACK packet with a wrong sequence number from the GFW, which will obstruct the legitimate handshake; any other packets will trigger forged RST and RST/ACK packets, which will tear down the connection.”
However, the first conjecture cannot explain why the censor spends extra resources sending packets, instead of simply dropping or null-routing all packets to these four IP addresses.
The second conjecture is that, by pretending to be these injected IP addresses and accepting TCP handshakes, the GFW can lure clients into sending data that would never be sent if the handshakes were not accepted. The censor can thus learn more about the effect of its censorship and about the intentions of clients.
Specifically, connection attempts to these injected IP addresses provide the censor with a different angle from which to measure the effectiveness of its DNS censorship. The censor would underestimate the number of DNS censorship events if it merely counted the sensitive DNS queries it observes at the Internet backbone or border. This is because the majority of Chinese clients use a local resolver, and their sensitive DNS queries are answered by local resolvers with poisoned caches, not by the GFW. Thus, by observing the traffic to these injected IP addresses, the censor gets a better idea of how many clients received poisoned DNS answers.
Furthermore, by luring censored clients into sending more data than a SYN packet, the GFW can learn more about what the clients would do if their connections were not censored.
We remind our readers that, although the conjectures may sound reasonable, we have no way to validate or falsify them.
Several pieces of evidence suggest that the censoring machine is stateless and that its implementation is crude. For example:
It does not retransmit SYN+ACK, even if the client does not ACK.
It does not respond to an unexpected SYN+ACK with RST.
First, to measure the timeout value of the censoring machine, we used the following command to connect to one of the listening IPs:
nc -v 59.24.3.174 443
After seeing the message Connection to 59.24.3.174 443 port [tcp/https] succeeded!, we intentionally did not send any data for more than 30 minutes; however, the censoring machine did not send any packet to close the connection. As soon as we sent a chunk of data, TEST, via nc, the censoring machine sent a single RST to tear down the connection. This experiment shows that the censoring machine either does not time out or has an unusually large timeout value. If the censoring machine were indeed stateful, such a large timeout value would easily exhaust its resources.
Second, the censoring machine does not retransmit SYN+ACK, even if the client does not ACK. One can test it with commands like these:
# capture traffic using tcpdump,
# as nping will not show if RST was sent by kernel
sudo tcpdump -n host 59.24.3.174 and port 443
## open another terminal:
# drop RST sent by kernel due to unexpected SYN+ACK
sudo iptables -A OUTPUT -p tcp --tcp-flags RST RST -s $(hostname -I) -j DROP
# send SYN packets
sudo nping -c 0 --tcp 59.24.3.174 -p443 --flags S
# delete the dropping RST rule
sudo iptables -D OUTPUT -p tcp --tcp-flags RST RST -s $(hostname -I) -j DROP
Third, the censoring machine does not respond to an unexpected SYN+ACK with RST:
# Sending SA to open ports of listening IPs will not get RST
sudo nping -c 0 --tcp 59.24.3.174 -p443 --flags SA
# Sending SA to open ports of common TCP server will get RST
sudo nping -c 0 --tcp 1.1.1.1 -p443 --flags SA
Fourth, the censoring machine replies with a SYN+ACK even if the SYN packet has a wrong IP and/or TCP checksum:
sudo nping -c 0 --tcp 59.24.3.174 -p443 --flags S --badsum-ip --badsum
All the evidence above suggests that the TCP implementation of the censoring machine is simple and crude. Indeed, sometimes “worse is better”. If all the censor needs is to lure clients into TCP handshakes, then this stateless implementation means simpler code, more efficient use of resources, fewer fingerprints, and a smaller attack surface.
We fuzzed and sent packets to these listening IPs
using the following script:
#!/usr/bin/env python3
from scapy.all import *
packet=Ether() / fuzz(IP(dst="59.24.3.174")) / fuzz(TCP())
sendp(packet, loop=1)
We then compared the triggering packets with the forged replies from the censoring machine.
On the IP layer, the fingerprints of the forged replies are:
0x68 (throughput bit is turned on).
The Don't Fragment (DF) bit is the same as in the triggering packet; the censoring machine does not respond to packets with the More Fragments (MF) bit turned on.
On the TCP layer, the fingerprints of the forged replies are:
0 to 2^32.
If the triggering packet has the SYN bit turned on AND the PSH and ACK bits turned off, then the flags are a copy of those of the triggering packet with the ACK bit turned on; if the triggering packet has the PSH bit turned on, then the flags are a copy of those of the triggering packet with the RST bit turned on and the ACK, PSH, SYN and FIN bits turned off.
1424.
If you want to do a quick sanity check to see whether the censoring machine has the same fingerprints in your area, you may use a command similar to this:
sudo nping -c 0 --tcp 59.24.3.174 -p443 --id 3333 --tos 0x02 -df --flags S --win 1 -v4
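The TCP flag rule described above can be written down as a small function. This is a minimal sketch of the rule as stated in this report, not the censor's actual code; the function name `forged_reply_flags` is hypothetical, and flag combinations the report does not discuss simply follow the literal wording of the rule.

```python
# TCP flag bits as defined in the TCP header.
FIN, SYN, RST, PSH, ACK = 0x01, 0x02, 0x04, 0x08, 0x10

def forged_reply_flags(trigger_flags):
    """Return the flag byte of the forged reply, or None if no reply is expected."""
    if trigger_flags & PSH:
        # Copy the trigger, turn RST on, turn ACK/PSH/SYN/FIN off.
        return (trigger_flags | RST) & ~(ACK | PSH | SYN | FIN) & 0xFF
    if trigger_flags & SYN and not trigger_flags & ACK:
        # Copy the trigger and turn ACK on (PSH is known to be off here).
        return (trigger_flags | ACK) & 0xFF
    return None

print(forged_reply_flags(SYN) == (SYN | ACK))   # SYN       -> SYN+ACK
print(forged_reply_flags(PSH | ACK) == RST)     # PSH+ACK   -> RST
print(forged_reply_flags(SYN | ACK) is None)    # SYN+ACK   -> no reply
```

Because the reply depends only on the triggering packet, no per-connection state is needed to compute it, which is consistent with the stateless behavior observed above.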
We also note that some seemingly random fingerprints of the GFW exhibit certain patterns only when the GFW is probed at a very high rate. One example is Figure 4.a by Anonymous et al. We thus SYN-flooded one of the listening IPs (for less than 2 seconds). Although we received the GFW’s forged SYN+ACKs at a rate of 15,000 packets per second, as shown in the figure below, the TCP sequence numbers still appear to be random.
We find that SYN or PSH packets do not always trigger the censoring machine to send the corresponding SYN+ACK or RST. Further investigation reveals that, for a given (srcIP, dstIP, srcPort, dstPort) tuple, whether the packets trigger the censoring machine is deterministic. In addition, when fixing (srcIP, dstIP, dstPort) and enumerating all srcPort values from 1 to 65535, almost exactly 75% (min: 49132/65535; max: 491363/65535) of the srcPorts trigger the censoring machine. This result suggests that the censoring machine employs some kind of load balancing.
We could not figure out the exact load-balancing algorithm used, but this directory includes the code, data, and analysis we have done. We encourage you to explore this intriguing problem.
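To illustrate what a deterministic, tuple-based decision with a 75% acceptance rate could look like, here is a purely hypothetical sketch. The real algorithm is unknown; the SHA-256 hash, the four-bucket split, and the names `triggers` and `accepted_fraction` are all assumptions made for illustration only.

```python
import hashlib

def triggers(src_ip, dst_ip, src_port, dst_port):
    """Hypothetical rule: hash the 4-tuple and accept 3 out of every 4 buckets."""
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}".encode()
    bucket = hashlib.sha256(key).digest()[0] % 4
    return bucket != 0

def accepted_fraction(src_ip, dst_ip, dst_port):
    """Fraction of source ports 1..65535 that would trigger a response."""
    hits = sum(triggers(src_ip, dst_ip, p, dst_port) for p in range(1, 65536))
    return hits / 65535

# 192.0.2.10 is a documentation address standing in for the client.
fraction = accepted_fraction("192.0.2.10", "59.24.3.174", 443)
print(f"{fraction:.4f}")
```

Any rule of this shape is deterministic per tuple and accepts roughly three quarters of source ports, so matching the observed rate alone does not identify the algorithm; comparing which specific source ports are accepted across different tuples would be needed for that.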
As introduced above, the packets sent by the censoring machine have the same IP TTL value as the triggering packet received. This IP TTL mirroring behavior also exists in some DNS poisoning injectors of the GFW (see Figure 8).
One important implication of IP TTL mirroring is that the censoring machine appears to be much further away from the testing host than it actually is. Taking this into consideration, we found that the censoring machine is 8-9 hops away from our host. We then used a limited-TTL approach to locate the DNS injection points on the path from this host to an IP address within the same /30 as the G2 IP addresses. We found that the censoring machine may share the same location as the DNS injection points. We say “may” because we could not accurately locate the censoring machine due to asymmetric routing.
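The effect of TTL mirroring on distance estimates can be illustrated with a small worked example. This is a sketch under stated assumptions, not the exact procedure used for the measurement above: it assumes the forward and return paths have the same number of hops, and the TTL values and function names are made up for illustration.

```python
def naive_hops(assumed_initial_ttl, received_ttl):
    """Usual estimate: assume the responder chose a fixed initial TTL."""
    return assumed_initial_ttl - received_ttl

def hops_with_ttl_mirroring(sent_ttl, received_ttl):
    """Estimate when the reply starts with the TTL the trigger had on arrival.

    The TTL is decremented once on the way to the responder and once more
    on the way back, so the total drop is split across the two directions.
    """
    return (sent_ttl - received_ttl) / 2

# Illustrative values: a probe sent with TTL 64 whose reply arrives with TTL 48.
print(naive_hops(64, 48))               # 16
print(hops_with_ttl_mirroring(64, 48))  # 8.0
```

Under these assumptions, a naive reading of the reply TTL would place the responder about twice as far away as it really is.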
We also find that connections to these listening IPs are not subject to DNS or SNI censorship, while the IP addresses within the same /30 are under censorship.
For example:
# 46.82.174.69 is one of the listening IPs and we get no forged response.
dig @46.82.174.69 www.google.sm
# 46.82.174.70 is within the same /30 and we get forged responses.
dig @46.82.174.70 www.google.sm
These 7 IP addresses have a few closed ports, and all other ports are filtered when tested from China. When SYN pinging these addresses from the US, all 65535 ports appear to be filtered. We SYN-pinged these ports with limited TTL. The result shows that we received RSTs even before reaching the actual IP addresses, indicating that these RSTs were actually sent by the GFW. The closed ports are listed as follows:
118.5.49.6,1723,closed,pptp
188.5.4.96,1723,closed,pptp
203.98.7.65,1080,closed,socks
203.98.7.65,1723,closed,pptp
208.101.48.171,5222,closed,xmpp-client
67.228.126.62,443,closed,https
8.7.198.45,443,closed,https
8.7.198.45,1080,closed,socks
8.7.198.45,1723,closed,pptp
93.46.8.89,443,closed,https
Note that 118.5.49.6 and 188.5.4.96 look very similar, as if they were carelessly picked by someone. Among the 10 closed ports, the typical services running on these ports include PPTP, SOCKS, xmpp-client and HTTPS. PPTP and SOCKS can be used as censorship circumvention protocols.
We find that we could still get ICMP TTL=0 messages when the initial TTL is large enough to trigger three RSTs:
sudo nping -c 0 --tcp 203.98.7.65 -p1080 --flags S --ttl 10
SENT (0.0019s) TCP host:19825 > 203.98.7.65:1080 S ttl=10 id=951 iplen=40 seq=18295394 win=1480
RCVD (0.2008s) TCP 203.98.7.65:1080 > host:19825 RA ttl=251 id=20366 iplen=40 seq=0 win=3509
RCVD (0.2008s) TCP 203.98.7.65:1080 > host:19825 RA ttl=251 id=20366 iplen=40 seq=0 win=3509
RCVD (0.2008s) TCP 203.98.7.65:1080 > host:19825 RA ttl=251 id=20366 iplen=40 seq=0 win=3509
RCVD (0.2008s) ICMP [202.97.90.114 > host TTL=0 during transit (type=11/code=0) ] IP [ttl=243 id=17562 iplen=96 ]
SENT (1.0033s) TCP host:19825 > 203.98.7.65:1080 S ttl=10 id=951 iplen=40 seq=1829538994 win=1480
RCVD (1.2208s) ICMP [202.97.90.114 > host TTL=0 during transit (type=11/code=0) ] IP [ttl=243 id=17732 iplen=96 ]
This suggests that the on-path GFW injects the three RSTs without dropping the SYN packets.
There is a 31-second residual censorship after the GFW sends the RSTs. During this period, the GFW does not send RSTs in response to our SYNs if they have the same (srcIP, dstIP, srcPort, dstPort) tuple:
sudo nping -c 0 --tcp 118.5.49.6 -p1723 -g10001 --flags S
SENT (0.0020s) TCP host:10001 > 118.5.49.6:1723 S ttl=64 id=45322 iplen=40 seq=3600506539 win=1480
RCVD (0.2010s) TCP 118.5.49.6:1723 > host:10001 RA ttl=251 id=29186 iplen=40 seq=0 win=2857
RCVD (0.2010s) TCP 118.5.49.6:1723 > host:10001 RA ttl=251 id=29186 iplen=40 seq=0 win=2857
RCVD (0.2010s) TCP 118.5.49.6:1723 > host:10001 RA ttl=251 id=29186 iplen=40 seq=0 win=2857
SENT (1.0022s) TCP host:10001 > 118.5.49.6:1723 S ttl=64 id=45322 iplen=40 seq=3600506539 win=1480
SENT (2.0037s) TCP host:10001 > 118.5.49.6:1723 S ttl=64 id=45322 iplen=40 seq=3600506539 win=1480
SENT (3.0051s) TCP host:10001 > 118.5.49.6:1723 S ttl=64 id=45322 iplen=40 seq=3600506539 win=1480
....
SENT (31.0406s) TCP host:10001 > 118.5.49.6:1723 S ttl=64 id=45322 iplen=40 seq=3600506539 win=1480
RCVD (31.2090s) TCP 118.5.49.6:1723 > host:10001 RA ttl=251 id=29359 iplen=40 seq=0 win=2904
RCVD (31.2090s) TCP 118.5.49.6:1723 > host:10001 RA ttl=251 id=29359 iplen=40 seq=0 win=2904
SENT (32.0410s) TCP host:10001 > 118.5.49.6:1723 S ttl=64 id=45322 iplen=40 seq=3600506539 win=1480
RCVD (32.0411s) TCP 118.5.49.6:1723 > host:10001 RA ttl=251 id=29359 iplen=40 seq=0 win=2904
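The length of this quiet period can be read off an Nping log programmatically. The sketch below is one possible way to do it, assuming the log format shown above; the function name `rst_quiet_period` is hypothetical, and the embedded excerpt reuses a few lines from the log above.

```python
import re

# Matches the start of an nping output line, e.g. "RCVD (0.2010s) TCP ...".
LINE = re.compile(r"^(SENT|RCVD) \((\d+\.\d+)s\) (\w+)")

def rst_quiet_period(log_text):
    """Seconds from the first TCP reply to the first TCP reply that follows
    at least one unanswered probe, or None if replies never resume."""
    first_reply = None
    silent_probes = 0
    for line in log_text.splitlines():
        m = LINE.match(line.strip())
        if m is None or m.group(3) != "TCP":
            continue  # skip ICMP lines, ellipses and anything unparseable
        kind, when = m.group(1), float(m.group(2))
        if kind == "SENT":
            silent_probes += 1
            continue
        if first_reply is None:
            first_reply = when
        elif silent_probes > 1:
            return when - first_reply
        silent_probes = 0
    return None

LOG = """\
SENT (0.0020s) TCP host:10001 > 118.5.49.6:1723 S ttl=64 id=45322 iplen=40 seq=3600506539 win=1480
RCVD (0.2010s) TCP 118.5.49.6:1723 > host:10001 RA ttl=251 id=29186 iplen=40 seq=0 win=2857
SENT (1.0022s) TCP host:10001 > 118.5.49.6:1723 S ttl=64 id=45322 iplen=40 seq=3600506539 win=1480
SENT (2.0037s) TCP host:10001 > 118.5.49.6:1723 S ttl=64 id=45322 iplen=40 seq=3600506539 win=1480
SENT (31.0406s) TCP host:10001 > 118.5.49.6:1723 S ttl=64 id=45322 iplen=40 seq=3600506539 win=1480
RCVD (31.2090s) TCP 118.5.49.6:1723 > host:10001 RA ttl=251 id=29359 iplen=40 seq=0 win=2904
"""

gap = rst_quiet_period(LOG)
print(f"{gap:.3f}")
```

For the excerpt above this yields a gap of roughly 31 seconds, matching the residual censorship duration reported here.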
Unlike the IP addresses in group 1 and group 2, we find no evidence that the group 3 IP addresses are spoofed by the GFW. However, we do find that packets to some of their ports are dropped by the GFW.
Specifically, we first SYN pinged all their ports from both China and the US:
nmap 31.13.64.49 31.13.72.54 31.13.85.1 -p1-65535 -Pn --min-rate=5000
The result from China is:
31.13.64.49 (443 open; other filtered)
31.13.72.54 (80,443 open; 843,5222,5228,8883 closed; other filtered)
31.13.85.1 (80,443 open; 843,5222,8883 closed; other filtered)
The result from the US is:
31.13.64.49 (80,443 open; 843,5222,5228,8883 closed; other filtered)
31.13.72.54 (80,443 open; 843,5222,5228,8883 closed; other filtered)
31.13.85.1 (80,443 open; 843,5222,5228,8883 closed; other filtered)
We repeated the SYN ping several times to rule out inconsistencies due to packet loss; however, some ports remained unreachable from China. Using a traceroute-like SYN ping, we found that packets to those ports were dropped by a router at the CHINANET backbone. For example:
# This will get ICMP TTL=0 from the router at the CHINANET backbone
sudo nping -c 0 --tcp 31.13.64.49 -p443 --flags S --ttl 7
# But this will not get such a message, suggesting the packets were dropped
sudo nping -c 0 --tcp 31.13.64.49 -p80 --flags S --ttl 7
For those ports that are reachable from China, we tried to establish HTTP or TLS connections to them. The result shows that the client successfully received the corresponding HTTP 400 Bad Request or the TLS certificate from the server, without being disrupted by the GFW. The commands we used are as follows:
wget http://31.13.85.1:80
openssl s_client -tlsextdebug -msg -connect 31.13.85.1:443
We suggest that users encrypt their DNS traffic whenever possible. At the same time, one can use iptables rules to block all outgoing traffic to these IPs so that connection attempts are not logged by the GFW. Blocking traffic to these IP addresses does little harm to your Internet connectivity, as these IP addresses are blocked, spoofed or filtered anyway. Specifically, one can try executing the following commands to add the iptables rules:
#!/bin/bash
# Get the 215 injected IP addresses
wget https://gfw.report/publications/foci20_dns/foci20_anonymous/injected_ips/ips_after_drop.txt
# source: https://www.cyberciti.biz/faq/iptables-read-and-block-ips-subnets-from-text-file/
### Setup our black list ###
# Create a new chain
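iptables -N droplist
# Read the file line by line (including a last line without a trailing
# newline) and store each IP or subnet in $ip
while IFS="" read -r ip || [ -n "$ip" ]
do
  # Skip blank lines and comments
  [ -z "$ip" ] && continue
  case "$ip" in \#*) continue ;; esac
  # Log, then drop, packets destined for the injected IP (-d matches the
  # destination, which is what blocks outgoing traffic)
  iptables -A droplist -d "$ip" -j LOG --log-prefix " Drop Bad IP List "
  iptables -A droplist -d "$ip" -j DROP
done <ips_after_drop.txt
# Finally, insert our black list for locally generated and forwarded traffic
iptables -I OUTPUT -j droplist
iptables -I FORWARD -j droplist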