Bypassing RKN blocks with DNSTap and BGP

This topic is well-worn, I know. For example, there is an excellent article, but it only considers the IP part of the blocklist. We will add domains as well.

Because the courts and Roskomnadzor (RKN) block everything left and right, and providers try hard to avoid the fines issued via “Revizorro”, the collateral damage from blocking is quite large. Besides, among the "legitimately" blocked sites there are many useful ones (hello, rutracker).

I live outside RKN's jurisdiction, but my parents, relatives and friends remain back home. So I decided to come up with a way to bypass the blocks that is easy for people far from IT, preferably without any involvement on their part.

In this article I will not walk through basic networking step by step; instead I will describe the general principles of how this scheme can be implemented. So an understanding of how networks work in general, and in Linux in particular, is a must.

Types of blocks

First, let's refresh our memory of what gets blocked.

There are several types of blocks in the XML dump exported by RKN:

IP
Domain
URL

For simplicity, we will reduce them to two, IP and domain, and for URL blocks we will simply extract the domain (more precisely, this has already been done for us).

Good people from Roskomsvoboda have implemented a wonderful API through which we can get what we need.

Access blocked sites

To do this, we need a small foreign VPS, preferably with unlimited traffic; there are plenty of these for 3-5 bucks. Pick one in a nearby country so that the ping is not too high, but keep in mind that the Internet and geography do not always coincide. And since there is no SLA for 5 bucks, it is better to take two or more from different providers for fault tolerance.

Next, we need to set up an encrypted tunnel from the client router to the VPS. I use WireGuard as the fastest and easiest to configure, since my client routers are also Linux-based (APU2 or something running OpenWRT). With Mikrotik / Cisco you can use the protocols available on them, such as OpenVPN or GRE over IPsec.

Identification and redirection of traffic of interest

You can, of course, route all Internet traffic abroad. But the speed of working with local content will most likely suffer greatly. Plus, the bandwidth requirements on the VPS will be much higher.

Therefore, we need to somehow single out the traffic to blocked sites and selectively send it into the tunnel. Even if some "extra" traffic ends up there, it is still much better than pushing everything through the tunnel.

To manage traffic, we will use BGP and announce routes to the required networks from our VPS to the clients. As the BGP daemon we will take BIRD, one of the most functional and convenient.

IP

With IP blocking, everything is clear: we just announce all blocked IPs from the VPS. The problem is that the list returned by the API contains about 600 thousand subnets, and the vast majority of them are /32 hosts. That many routes can overwhelm weak client routers.

Therefore, when processing the list, I decided to summarize hosts to a /24 network whenever it contains 2 or more of them. This reduced the number of routes to ~100 thousand. The script for this follows later.
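
As a rough illustration of this summarization rule (separate from the actual script shown later), the sketch below groups /32 hosts by their enclosing /24 and emits the /24 only when it contains at least two listed hosts. The function name summarize and its parameters are assumptions made for this example.

```python
import ipaddress
from collections import defaultdict


def summarize(hosts, prefix_len=24, min_hosts=2):
    """Collapse host addresses into /prefix_len networks where dense enough.

    Hypothetical helper: a network is emitted only if it contains at least
    min_hosts distinct listed hosts; otherwise the hosts stay as /32 routes.
    """
    groups = defaultdict(set)
    for host in hosts:
        addr = ipaddress.IPv4Address(host)
        # strict=False lets us derive the enclosing network from a host address
        net = ipaddress.IPv4Network(f"{addr}/{prefix_len}", strict=False)
        groups[net].add(addr)

    routes = []
    for net, addrs in groups.items():
        if len(addrs) >= min_hosts:
            routes.append(net)
        else:
            routes.extend(ipaddress.IPv4Network(f"{a}/32") for a in addrs)
    return [str(r) for r in sorted(routes)]
```

For example, summarize(["192.0.2.1", "192.0.2.7", "198.51.100.5"]) produces one /24 route for the first two hosts and a single /32 route for the third.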

Domains

This is more complicated, and there are several ways. For example, you can put a transparent Squid on each client router, intercept HTTP and peek into the TLS handshake, getting the requested URL in the first case and the domain from SNI in the second.

But with the newfangled TLS 1.3 + eSNI, HTTPS analysis is becoming less and less feasible every day. And the infrastructure on the client side gets more complicated: you would need at least OpenWRT.

Therefore, I decided to go the way of intercepting responses to DNS queries. Here, too, DNS-over-TLS / HTTPS is starting to loom over us, but we can (for now) control this part on the client: either disable it or use our own server for DoT / DoH.

How to intercept DNS?

There are also several possible approaches.

Interception of DNS traffic through PCAP or NFLOG
Both of these interception methods are implemented in the sidmat utility. But it has long been unmaintained and its functionality is very primitive, so you would need to write a wrapper around it anyway.
DNS server log analysis
Unfortunately, the recursors I know of cannot log responses, only queries. In principle this is logical since, unlike queries, responses have a complex structure and are hard to write in text form.
DNSTap
Fortunately, many of them already support DNSTap for this purpose.

What is DNSTap?

This is a client-server protocol based on Protocol Buffers and Frame Streams for transferring structured DNS queries and responses from a DNS server to a collector. In essence, the DNS server transmits metadata about queries and responses (message type, client / server IP, etc.) plus the full DNS messages in the same (binary) form in which it handles them on the network.
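
Because the DNS message inside a DNSTap frame is in the ordinary wire format, the addresses can be pulled out with a small parser once the Protobuf envelope has been decoded. The sketch below is a minimal illustration using only the Python standard library; it is not how dnstap-bgp itself is implemented, and the function names read_name and extract_addresses are made up for this example. It collects only A and AAAA records and does no hardening beyond a limit on compression pointers.

```python
import ipaddress
import struct


def read_name(msg, off):
    """Decode a domain name at offset off; return (name, offset after it)."""
    labels = []
    resume = None  # where parsing continues after the first compression pointer
    hops = 0
    while True:
        length = msg[off]
        if length & 0xC0 == 0xC0:  # two high bits set: compression pointer
            if resume is None:
                resume = off + 2
            off = ((length & 0x3F) << 8) | msg[off + 1]
            hops += 1
            if hops > 32:
                raise ValueError("too many compression pointers")
            continue
        off += 1
        if length == 0:  # root label terminates the name
            break
        labels.append(msg[off:off + length].decode("ascii"))
        off += length
    return ".".join(labels), (resume if resume is not None else off)


def extract_addresses(msg):
    """Return (queried name, [addresses]) from a wire-format DNS response."""
    _id, _flags, qdcount, ancount, _ns, _ar = struct.unpack_from("!6H", msg, 0)
    off = 12  # fixed header size
    qname = ""
    for _ in range(qdcount):
        qname, off = read_name(msg, off)
        off += 4  # skip QTYPE and QCLASS
    addrs = []
    for _ in range(ancount):
        _owner, off = read_name(msg, off)
        rtype, _rclass, _ttl, rdlen = struct.unpack_from("!HHIH", msg, off)
        off += 10
        rdata = bytes(msg[off:off + rdlen])
        off += rdlen
        if rtype == 1 and rdlen == 4:        # A record
            addrs.append(ipaddress.IPv4Address(rdata))
        elif rtype == 28 and rdlen == 16:    # AAAA record
            addrs.append(ipaddress.IPv6Address(rdata))
    return qname, addrs
```

A real collector would also need the Frame Streams and Protobuf layers in front of this, plus stricter validation of untrusted input.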

It is important to understand that in the DNSTap paradigm, the DNS server acts as a client, and the collector as a server. That is, the DNS server connects to the collector, and not vice versa.

Today DNSTap is supported by all popular DNS servers. But BIND, for example, is for some reason often built without it in many distributions (such as Ubuntu LTS). So rather than bother rebuilding it, we will take a lighter and faster recursor, Unbound.

How to catch DNSTap?

There are a number of CLI utilities for working with a stream of DNSTap events, but they are a poor fit for our task. So I decided to reinvent the wheel and write my own tool that does everything needed: dnstap-bgp

How it works:

On startup, it loads a list of domains from a text file, reverses them (habr.com -> com.habr), drops broken lines, duplicates and subdomains (i.e. if both habr.com and www.habr.com are in the list, only the first is loaded) and builds a prefix tree for fast lookups in this list
Acting as a DNSTap server, it waits for a connection from the DNS server. In principle, it supports both UNIX and TCP sockets, but the DNS servers I know of can only use UNIX sockets
Incoming DNSTap packets are first deserialized into a Protobuf structure, and then the binary DNS message itself, stored in one of the Protobuf fields, is parsed down to the level of DNS resource records (RRs)
It checks whether the requested host (or its parent domain) is in the loaded list; if not, the response is ignored
Only A / AAAA / CNAME RRs are selected from the response, and the corresponding IPv4 / IPv6 addresses are extracted from them
IP addresses are cached with a configurable TTL and announced to all configured BGP peers
When a response points to an already cached IP, its TTL is refreshed
After the TTL expires, the entry is removed from the cache and from the BGP announcements
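
The domain matching from the first and fourth steps can be sketched as follows. This is only an illustration of the idea in Python, not the tool's own code: the class name DomainTree and the helper reversed_labels are hypothetical, and the tree is a plain nested dict keyed by labels in reversed order.

```python
_END = object()  # marker key: a listed domain ends at this node


def reversed_labels(domain):
    """'www.habr.com.' -> ['com', 'habr', 'www'] (lowercased, empty labels dropped)."""
    return [label for label in reversed(domain.lower().split(".")) if label]


class DomainTree:
    """Prefix tree over reversed domain labels."""

    def __init__(self):
        self.root = {}

    def add(self, domain):
        labels = reversed_labels(domain)
        if not labels:
            return  # broken or empty line
        node = self.root
        for label in labels:
            if _END in node:
                return  # a parent domain is already listed, skip the subdomain
            node = node.setdefault(label, {})
        node.clear()  # drop subdomains that were added before their parent
        node[_END] = True

    def match(self, host):
        """True if host or any of its parent domains is in the tree."""
        node = self.root
        for label in reversed_labels(host):
            node = node.get(label)
            if node is None:
                return False
            if _END in node:
                return True
        return False
```

With habr.com loaded, lookups for habr.com and www.habr.com both match, while nothabr.com and the bare com do not, because matching happens on whole labels rather than on string suffixes.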

Additional functionality:

Reloading the domain list on SIGHUP
Synchronizing the cache with other dnstap-bgp instances via HTTP / JSON
Persisting the cache to disk (in a BoltDB database) to restore its contents after a restart
Support for switching to a different network namespace (why this is needed is described below)
IPv6 support

Limitations:

IDN domains are not yet supported
Few BGP settings

I have built RPM and DEB packages for easy installation. They should work on all relatively recent OSs with systemd, since they have no dependencies.

Scheme

So, let's start assembling all the components together. As a result, we should get something like this network topology:

The logic, I think, is clear from the diagram:

The client has our server configured as its DNS, and DNS queries must also go through the VPN. This is necessary so that the provider cannot use DNS interception for blocking.
When opening a site, the client sends a DNS query of the form "what are the IPs of xxx.org"
Unbound resolves xxx.org (or takes it from the cache) and sends the client a response "xxx.org has such and such IPs", duplicating it in parallel through DNSTap
dnstap-bgp announces these addresses to BIRD via BGP if the domain is on the blocklist
BIRD announces a route to these IPs to the client router with next-hop self
Subsequent packets from the client to these IPs go through the tunnel

On the server, routes to blocked sites live in a separate table inside BIRD that does not intersect with the OS routing table.

There is a drawback in this scheme: the first SYN packet from the client will most likely manage to leave via the domestic provider, since the route is not announced instantly. What happens next depends on how the provider implements blocking. If it simply drops the traffic, there is no problem. But if it redirects the traffic to some DPI, then (theoretically) special effects are possible.

Surprises are also possible with clients that do not honor the DNS TTL, which can lead to a client using stale entries from its own cache instead of asking Unbound.

In practice, neither the first nor the second caused me problems, but your mileage may vary.

Server setup

For ease of deployment, I wrote an Ansible role. It can configure both servers and Linux-based clients (it is designed for deb-based distributions). All settings are fairly obvious and are set in inventory.yml. This role was cut out of my big playbook, so it may contain errors; pull requests are welcome :)

Let's go through the main components.

BGP

When running two BGP daemons on the same host, a fundamental problem arises: BIRD refuses to bring up BGP peering with localhost (or with any local interface). At all. Googling and reading mailing lists did not help; they claim it is by design. Perhaps there is some way, but I did not find it.

You can try another BGP daemon, but I like BIRD and use it everywhere, and I don't want to multiply entities.

Therefore, I hid dnstap-bgp inside a network namespace that is connected to the root one via a veth interface: it is like a pipe whose ends stick out in different namespaces. On each of these ends we hang private p2p IP addresses that never leave the host, so they can be anything. This is the same mechanism that is used to reach processes inside the beloved Docker and other containers.

To do this, a script was written, and dnstap-bgp gained the functionality mentioned above for dragging itself by the hair into another namespace. Because of this, it must be run as root or be granted CAP_SYS_ADMIN on the binary via the setcap command.

Sample script to create namespace

    #!/bin/bash

    NS="dtap"

    IP="/sbin/ip"
    IPNS="$IP netns exec $NS $IP"

    IF_R="veth-$NS-r"
    IF_NS="veth-$NS-ns"

    IP_R="192.168.149.1"
    IP_NS="192.168.149.2"

    /bin/systemctl stop dnstap-bgp || true

    $IP netns del $NS > /dev/null 2>&1
    $IP netns add $NS

    $IP link add $IF_R type veth peer name $IF_NS
    $IP link set $IF_NS netns $NS

    $IP addr add $IP_R remote $IP_NS dev $IF_R
    $IP link set $IF_R up

    $IPNS addr add $IP_NS remote $IP_R dev $IF_NS
    $IPNS link set $IF_NS up

    /bin/systemctl start dnstap-bgp

dnstap-bgp.conf

    namespace = "dtap"
    domains = "/var/cache/rkn_domains.txt"
    ttl = "168h"

    [dnstap]
    listen = "/tmp/dnstap.sock"
    perm = "0666"

    [bgp]
    as = 65000
    routerid = "192.168.149.2"
    peers = [
        "192.168.149.1",
    ]

bird.conf

    router id 192.168.1.1;

    table rkn;

    # Clients
    protocol bgp bgp_client1 {
        table rkn;
        local as 65000;
        neighbor 192.168.1.2 as 65000;
        direct;
        bfd on;
        next hop self;
        graceful restart;
        graceful restart time 60;
        export all;
        import none;
    }

    # DNSTap-BGP
    protocol bgp bgp_dnstap {
        table rkn;
        local as 65000;
        neighbor 192.168.149.2 as 65000;
        direct;
        passive on;
        rr client;
        import all;
        export none;
    }

    # Static routes list
    protocol static static_rkn {
        table rkn;
        include "rkn_routes.list";
        import all;
        export none;
    }

rkn_routes.list

    route 3.226.79.85/32 via "ens3";
    route 18.236.189.0/24 via "ens3";
    route 3.224.21.0/24 via "ens3";
    ...

DNS

By default on Ubuntu, the Unbound binary is confined by an AppArmor profile, which prevents it from connecting to arbitrary DNSTap sockets. You can either remove this profile or disable it:

    # cd /etc/apparmor.d/disable && ln -s ../usr.sbin.unbound .
    # apparmor_parser -R /etc/apparmor.d/usr.sbin.unbound

This should probably be added to the playbook. Ideally, of course, one should fix the profile and grant the necessary permissions, but I was too lazy.

unbound.conf

    server:
        chroot: ""
        port: 53
        interface: 0.0.0.0
        root-hints: "/var/lib/unbound/named.root"
        auto-trust-anchor-file: "/var/lib/unbound/root.key"
        access-control: 192.168.0.0/16 allow

    remote-control:
        control-enable: yes
        control-use-cert: no

    dnstap:
        dnstap-enable: yes
        dnstap-socket-path: "/tmp/dnstap.sock"
        dnstap-send-identity: no
        dnstap-send-version: no
        dnstap-log-client-response-messages: yes

Downloading and processing the lists

Script for downloading and processing a list of IP addresses
It downloads the list and summarizes it to the pfx prefix length. In dont_add and dont_summarize you can list IPs and networks to skip or not to summarize. I needed this because my VPS subnet was in the blocklist :)

The funny thing is that the RosKomSvoboda API blocks requests with the default Python user agent. Looks like script kiddies got to it. So we change it to a browser one.

So far it only works with IPv4, since there are few IPv6 entries, but this would be easy to fix. Except that you would also need to use bird6.

rkn.py

    #!/usr/bin/python3

    import json
    import urllib.request
    import ipaddress as ipa

    url = 'https://api.reserve-rbl.ru/api/v2/ips/json'
    pfx = '24'  # prefix length that hosts are summarized to

    # Networks whose hosts must stay as separate /32 routes
    dont_summarize = {
        # ipa.IPv4Network('1.1.1.0/24'),
    }

    # Addresses that must never be announced
    dont_add = {
        # ipa.IPv4Address('1.1.1.1'),
    }

    # The API rejects the default Python user agent, so send a browser one
    req = urllib.request.Request(
        url,
        data=None,
        headers={
            'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.1916.47 Safari/537.36'
        }
    )

    f = urllib.request.urlopen(req)
    ips = json.loads(f.read().decode('utf-8'))

    # r maps a target network to [first address seen, number of entries]
    r = {}
    for i in ips:
        ip = ipa.ip_network(i)
        if not isinstance(ip, ipa.IPv4Network):
            continue

        addr = ip.network_address

        if addr in dont_add:
            continue

        # Entries that are already networks are kept as is
        # (a count of 2 makes the output loop print the network itself)
        if ip.prefixlen != 32:
            r[ip] = [addr, 2]
            continue

        sn = ipa.IPv4Network(str(addr) + '/' + pfx, strict=False)

        if sn in dont_summarize:
            tgt = ip  # keep the host as its own /32
        else:
            tgt = sn

        if tgt not in r:
            r[tgt] = [addr, 1]
        else:
            r[tgt][1] += 1

    for n, v in r.items():
        if v[1] == 1:
            print(str(v[0]) + '/32')  # a single host in this network
        else:
            print(n)                  # 2+ entries: announce the whole network

Update script
It runs from my cron once a day; it may be worth running it every 4 hours since, I believe, that is the update period RKN requires of providers. Plus there are some ultra-urgent blocks that may arrive even faster.

It does the following:

Runs the first script and updates the route list ( rkn_routes.list ) for BIRD
Reloads BIRD
Updates and cleans the domain list for dnstap-bgp
Reloads dnstap-bgp

rkn_update.sh

    #!/bin/bash

    # Make a pipeline fail if any command in it fails, so that the $? checks
    # below also catch a failed download and not only a failed sed / uniq
    set -o pipefail

    ROUTES="/etc/bird/rkn_routes.list"
    DOMAINS="/var/cache/rkn_domains.txt"

    # Get & summarize routes
    /opt/rkn.py | sed 's/\(.*\)/route \1 via "ens3";/' > $ROUTES.new

    if [ $? -ne 0 ]; then
        rm -f $ROUTES.new
        echo "Unable to download RKN routes"
        exit 1
    fi

    if [ -e $ROUTES ]; then
        mv $ROUTES $ROUTES.old
    fi

    mv $ROUTES.new $ROUTES

    /bin/systemctl try-reload-or-restart bird

    # Get domains
    curl -s https://api.reserve-rbl.ru/api/v2/domains/json -o - | jq -r '.[]' | sed 's/^\*\.//' | sort | uniq > $DOMAINS.new

    if [ $? -ne 0 ]; then
        rm -f $DOMAINS.new
        echo "Unable to download RKN domains"
        exit 1
    fi

    if [ -e $DOMAINS ]; then
        mv $DOMAINS $DOMAINS.old
    fi

    mv $DOMAINS.new $DOMAINS

    /bin/systemctl try-reload-or-restart dnstap-bgp

They were written without much thought, so if you see something that can be improved, go for it.

Client setup

Here I will give examples for Linux routers, but in the case of Mikrotik / Cisco this should be even simpler.

First, configure BIRD:

bird.conf

    router id 192.168.1.2;

    table rkn;

    protocol device {
        scan time 10;
    };

    # Servers
    protocol bgp bgp_server1 {
        table rkn;
        local as 65000;
        neighbor 192.168.1.1 as 65000;
        direct;
        bfd on;
        next hop self;
        graceful restart;
        graceful restart time 60;
        rr client;
        export none;
        import all;
    }

    protocol kernel {
        table rkn;
        kernel table 222;
        scan time 10;
        export all;
        import none;
    }

Thus, we will synchronize the routes received from BGP with the kernel routing table number 222.

After that, it is enough to ask the kernel to consult this table before looking at the default one:

    # ip rule add from all pref 256 lookup 222
    # ip rule
    0:      from all lookup local
    256:    from all lookup 222
    32766:  from all lookup main
    32767:  from all lookup default

All that remains is to configure DHCP on the router to hand out the server's tunnel IP address as the DNS server, and the scheme is ready.

Disadvantages

With the current algorithm for building and processing the domain list, youtube.com and its CDNs, among others, end up in it.

As a result, all video will go through the VPN, which can clog the entire channel. Perhaps it is worth compiling a list of exceptions for popular domains that RKN does not yet have the guts to actually block, and skipping them while parsing.
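
Such an exception list could be applied with a filter along these lines. This is a hedged sketch rather than part of the existing scripts: the function name filter_domains is hypothetical, and the exception entries used with it are placeholders to be replaced with whatever you decide to exclude.

```python
def filter_domains(domains, exceptions):
    """Drop every domain that equals an exception or is a subdomain of one.

    Hypothetical helper: both arguments are iterables of plain domain names.
    """
    excluded = {e.lower().strip(".") for e in exceptions}

    def is_excluded(domain):
        parts = domain.lower().strip(".").split(".")
        # Check the domain itself and each of its parent domains
        return any(".".join(parts[i:]) in excluded for i in range(len(parts)))

    return [d for d in domains if not is_excluded(d)]
```

Matching is done on whole labels, so an exception for youtube.com also removes m.youtube.com but leaves an unrelated name such as notyoutube.com in the list.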

Conclusion

The described method lets you bypass almost any blocking that providers currently implement.

In principle, dnstap-bgp can be used for any other purpose that requires some degree of traffic control based on domain names. Just keep in mind that nowadays a thousand sites can sit on the same IP address (behind Cloudflare, for example), so this method has rather low accuracy.

But for bypassing blocks, this is quite enough.

Additions, edits and pull requests are welcome!

Source: https://habr.com/ru/post/467547/
