Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace old waitForTunnelUp function #7458

Open
wants to merge 43 commits into
base: main
Choose a base branch
from
Open

Conversation

Rawa
Copy link
Contributor

@Rawa Rawa commented Jan 13, 2025

After invoking VpnService.establish() we will get a tunnel file descriptor that corresponds to the interface that was created. However, this has no guarantee of the routing table beeing up to date, and we might thus send traffic outside the tunnel. Previously this was done through looking at the tunFd to see that traffic is sent to verify that the routing table has changed. If no traffic is seen some traffic is induced to a random IP address to ensure traffic can be seen. This new implementation is slower but won't risk sending UDP traffic to a random public address at the internet.


This change is Reviewable

@Rawa Rawa added the Android Issues related to Android label Jan 13, 2025
@Rawa Rawa self-assigned this Jan 13, 2025
@Rawa Rawa force-pushed the remove-wait-for-tunnel-up branch from 283e460 to e1e3f7a Compare January 13, 2025 14:33
@Rawa Rawa marked this pull request as ready for review January 14, 2025 16:08
@Rawa Rawa force-pushed the remove-wait-for-tunnel-up branch 3 times, most recently from 3426108 to 02784e9 Compare January 15, 2025 12:11
Copy link
Contributor

@Pururun Pururun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 9 of 9 files at r1, all commit messages.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @Rawa)


android/lib/talpid/src/main/kotlin/net/mullvad/talpid/TalpidVpnService.kt line 145 at r1 (raw file):

        // send traffic before the routes are set up. Otherwise we might send traffic
        // through the wrong interface
        runBlocking { waitForRoutesWithTimeout(config) }.bind()

Is this main/ui thread blocking?


android/lib/talpid/src/main/kotlin/net/mullvad/talpid/TalpidVpnService.kt line 184 at r1 (raw file):

                connectivityListener.currentNetworkState
                    .map { it?.linkProperties }
                    .distinctUntilChanged()

Is distinct until change required if we have first?


android/lib/talpid/src/main/kotlin/net/mullvad/talpid/TalpidVpnService.kt line 210 at r1 (raw file):

    companion object {
        const val FALLBACK_DUMMY_DNS_SERVER = "192.0.2.1"
        private val ROUTES_SETUP_TIMEOUT = 5000.milliseconds

What is the motivation for this timeout length?

@Rawa Rawa force-pushed the remove-wait-for-tunnel-up branch from 02784e9 to 3e379c2 Compare January 15, 2025 13:04
Copy link
Contributor Author

@Rawa Rawa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 8 of 9 files reviewed, 3 unresolved discussions (waiting on @Pururun)


android/lib/talpid/src/main/kotlin/net/mullvad/talpid/TalpidVpnService.kt line 145 at r1 (raw file):

Previously, Pururun (Jonatan Rhodin) wrote…

Is this main/ui thread blocking?

Previously this function has also been blocking (with wait_for_tunnel_up in Rust), I believe the tunnel is opening from daemon with runs on a separate thread, but we should ensure and verify this so we don't block UI.


android/lib/talpid/src/main/kotlin/net/mullvad/talpid/TalpidVpnService.kt line 184 at r1 (raw file):

Previously, Pururun (Jonatan Rhodin) wrote…

Is distinct until change required if we have first?

Correct, this was lingering since how it worked in the previous implementation.


android/lib/talpid/src/main/kotlin/net/mullvad/talpid/TalpidVpnService.kt line 210 at r1 (raw file):

Previously, Pururun (Jonatan Rhodin) wrote…

What is the motivation for this timeout length?

Generally it should be a timeout for how long we reasonable think a system should take until it's routes are setup, previously there was a timeout of 60 seconds. However, that timeout was based on the expected time to able to send a single udp packet successfully.

Copy link
Contributor Author

@Rawa Rawa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 8 of 9 files reviewed, 3 unresolved discussions (waiting on @Pururun)


android/lib/talpid/src/main/kotlin/net/mullvad/talpid/TalpidVpnService.kt line 145 at r1 (raw file):

Previously, Rawa (David Göransson) wrote…

Previously this function has also been blocking (with wait_for_tunnel_up in Rust), I believe the tunnel is opening from daemon with runs on a separate thread, but we should ensure and verify this so we don't block UI.

Verified by adding a sleep, it does not block UI.

Copy link
Contributor Author

@Rawa Rawa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 8 of 9 files reviewed, 3 unresolved discussions (waiting on @Pururun)


android/lib/talpid/src/main/kotlin/net/mullvad/talpid/TalpidVpnService.kt line 210 at r1 (raw file):

Previously, Rawa (David Göransson) wrote…

Generally it should be a timeout for how long we reasonable think a system should take until it's routes are setup, previously there was a timeout of 60 seconds. However, that timeout was based on the expected time to able to send a single udp packet successfully.

If we reach this timeout I believe we should consider the tunnel broken, and enter error state.

Copy link
Contributor

@kl kl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 8 of 9 files reviewed, 4 unresolved discussions (waiting on @Pururun and @Rawa)


android/lib/talpid/src/main/kotlin/net/mullvad/talpid/TalpidVpnService.kt line 158 at r2 (raw file):

        }

    private fun Builder.addDnsServerE(dnsServer: InetAddress) =

Personal preference, but I think it's clearer to write out the return type explicitly for all functions. Also maybe call this function addDnsServerEither would be clearer?

Pururun
Pururun previously approved these changes Jan 16, 2025
Copy link
Contributor

@Pururun Pururun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 1 of 1 files at r2, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @Rawa)


android/lib/talpid/src/main/kotlin/net/mullvad/talpid/TalpidVpnService.kt line 145 at r1 (raw file):

Previously, Rawa (David Göransson) wrote…

Verified by adding a sleep, it does not block UI.


android/lib/talpid/src/main/kotlin/net/mullvad/talpid/TalpidVpnService.kt line 210 at r1 (raw file):

Previously, Rawa (David Göransson) wrote…

If we reach this timeout I believe we should consider the tunnel broken, and enter error state.

Agree on the error state if timed out. Since it "should" be fast I think it is ok with a rather short timeout, but some phones are still very slow so a good idea to test this on multiple devices.

Copy link
Contributor Author

@Rawa Rawa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 8 of 9 files reviewed, 1 unresolved discussion (waiting on @kl and @Pururun)


android/lib/talpid/src/main/kotlin/net/mullvad/talpid/TalpidVpnService.kt line 210 at r1 (raw file):

Previously, Pururun (Jonatan Rhodin) wrote…

Agree on the error state if timed out. Since it "should" be fast I think it is ok with a rather short timeout, but some phones are still very slow so a good idea to test this on multiple devices.

Agree, it should be tested on multiple devices. Along with leak tests. Once reviewed by us I'll go through it with @pinkisemils as well.


android/lib/talpid/src/main/kotlin/net/mullvad/talpid/TalpidVpnService.kt line 158 at r2 (raw file):

Previously, kl (Kalle Lindström) wrote…

Personal preference, but I think it's clearer to write out the return type explicitly for all functions. Also maybe call this function addDnsServerEither would be clearer?

Renamed it to addDnsServerSafe to align with prepareVpnSafe and establishSafe. Would like to avoid the Safe-part and just let the signature speak for it self unfortunately it can't distinguish between the functions then. :(

kl
kl previously approved these changes Jan 16, 2025
Copy link
Contributor

@kl kl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 8 of 9 files at r1, 1 of 1 files at r3, all commit messages.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved

@Rawa Rawa force-pushed the remove-wait-for-tunnel-up branch from 2f2700e to 451ddb0 Compare January 16, 2025 10:41
@Rawa Rawa requested a review from albin-mullvad January 17, 2025 08:56
@Rawa Rawa force-pushed the remove-wait-for-tunnel-up branch from 451ddb0 to 3bec3b1 Compare January 17, 2025 09:30
@Rawa Rawa requested a review from pinkisemils January 17, 2025 09:31
Copy link
Collaborator

@pinkisemils pinkisemils left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 9 files at r1, all commit messages.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved


android/lib/talpid/src/main/kotlin/net/mullvad/talpid/TalpidVpnService.kt line 210 at r1 (raw file):

Previously, Rawa (David Göransson) wrote…

Agree, it should be tested on multiple devices. Along with leak tests. Once reviewed by us I'll go through it with @pinkisemils as well.

If we enter the error state but we failed to setup routes, we should inform the daemon that the error state is not blocking. Are we doing that? It is my current understanding that the error state on Android by default is assumed to be blocking. This is more so that when we fail to establish routes on a tunnel,

Copy link
Collaborator

@pinkisemils pinkisemils left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 5 of 9 files at r1.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved

Copy link
Collaborator

@albin-mullvad albin-mullvad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 5 of 9 files at r1, all commit messages.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved

@Rawa Rawa added the On hold Means the PR is paused for some reason. No need to review it for now label Jan 17, 2025
@Rawa Rawa force-pushed the remove-wait-for-tunnel-up branch from ebe027c to 093dac6 Compare January 27, 2025 18:04
After invoking VpnService.establish() we will get a tunnel file
descriptor that corresponds to the interface that was created. However,
this has no guarantee of the routing table beeing up to date, and we
might thus send traffic outside the tunnel. Previously this was done
through looking at the tunFd to see that traffic is sent to verify that
the routing table has changed. If no traffic is seen some traffic is
induced to a random IP address to ensure traffic can be seen. This new
implementation is slower but won't risk sending UDP traffic to a random
public address at the internet.
@Rawa Rawa force-pushed the remove-wait-for-tunnel-up branch from 093dac6 to 1e1b9d0 Compare January 27, 2025 18:27
@MarkusPettersson98 MarkusPettersson98 force-pushed the remove-wait-for-tunnel-up branch from 0d1be33 to 1d886b5 Compare January 30, 2025 09:16
@MarkusPettersson98 MarkusPettersson98 force-pushed the remove-wait-for-tunnel-up branch 2 times, most recently from 0843137 to 858fc2f Compare January 30, 2025 10:22
@MarkusPettersson98 MarkusPettersson98 force-pushed the remove-wait-for-tunnel-up branch from d2f014f to 88cf53a Compare January 30, 2025 10:27
Copy link
Contributor

@Pururun Pururun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 7 of 11 files at r5, 20 of 20 files at r6, all commit messages.
Reviewable status: all files reviewed, 6 unresolved discussions (waiting on @Rawa)


mullvad-management-interface/proto/management_interface.proto line 171 at r6 (raw file):

    INVALID_DNS_SERVERS = 11;
    SPLIT_TUNNEL_ERROR = 12;
    // Android only

Two "Android only"?


android/lib/talpid/src/main/kotlin/net/mullvad/talpid/model/RouteInfo.kt line 5 at r6 (raw file):

import java.net.InetAddress

typealias AndroidRouteInfo = android.net.RouteInfo

I guess this also should be in classes.rs


android/lib/talpid/src/main/kotlin/net/mullvad/talpid/model/RouteInfo.kt line 8 at r6 (raw file):

data class RouteInfo(
    val destination: InetNetwork,

This should be in classes.rs


android/app/src/main/kotlin/net/mullvad/mullvadvpn/compose/component/notificationbanner/NotificationData.kt line 201 at r6 (raw file):

    return when {
        isBlocking -> cause.errorMessageId()
        cause is ErrorStateCause.RoutesTimedOut -> "Yo routes is fukkkd"

Reminder to fix this before merge
This should probably be part of errorMessageId()


android/app/src/main/kotlin/net/mullvad/mullvadvpn/compose/component/notificationbanner/NotificationData.kt line 228 at r6 (raw file):

            )

        ErrorStateCause.RoutesTimedOut -> "Yo routes is fukkked"

Reminder to fix this before merge


talpid-core/src/tunnel_state_machine/connecting_state.rs line 141 at r6 (raw file):

                        retry_attempt,
                    );

Should probably be removed

Copy link
Contributor Author

@Rawa Rawa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 28 of 31 files reviewed, 6 unresolved discussions (waiting on @Pururun)


android/app/src/main/kotlin/net/mullvad/mullvadvpn/compose/component/notificationbanner/NotificationData.kt line 201 at r6 (raw file):

Previously, Pururun (Jonatan Rhodin) wrote…

Reminder to fix this before merge
This should probably be part of errorMessageId()

It wouldn't be considered Blocked since we can't guarantee the routes are there, so it possible we would fallback on "Failed to block internet" or another string.


android/lib/talpid/src/main/kotlin/net/mullvad/talpid/TalpidVpnService.kt line 210 at r1 (raw file):

Previously, pinkisemils (Emīls Piņķis) wrote…

If we enter the error state but we failed to setup routes, we should inform the daemon that the error state is not blocking. Are we doing that? It is my current understanding that the error state on Android by default is assumed to be blocking. This is more so that when we fail to establish routes on a tunnel,

There are a critical error state that are not blocking, failing to find the routes we expect to be there should also fall into that category in my opinion.


android/lib/talpid/src/main/kotlin/net/mullvad/talpid/model/RouteInfo.kt line 5 at r6 (raw file):

Previously, Pururun (Jonatan Rhodin) wrote…

I guess this also should be in classes.rs

We don't expose this to daemon, so it should not be exposed in classes.rs


android/lib/talpid/src/main/kotlin/net/mullvad/talpid/model/RouteInfo.kt line 8 at r6 (raw file):

Previously, Pururun (Jonatan Rhodin) wrote…

This should be in classes.rs

It should already be in the classes.rs


mullvad-management-interface/proto/management_interface.proto line 171 at r6 (raw file):

Previously, Pururun (Jonatan Rhodin) wrote…

Two "Android only"?

Yes, INVALID_DNS_SERVERS and ROUTES_TIMED_OUT are android only. E.g SPLIT_TUNNEL_ERRORand NEED_FULL_DISK_PERMISSIONS are not.

Copy link
Contributor

@Pururun Pururun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 3 of 3 files at r7, 1 of 1 files at r8, all commit messages.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @Rawa)


android/app/src/main/kotlin/net/mullvad/mullvadvpn/compose/component/notificationbanner/NotificationData.kt line 201 at r6 (raw file):

Previously, Rawa (David Göransson) wrote…

It wouldn't be considered Blocked since we can't guarantee the routes are there, so it possible we would fallback on "Failed to block internet" or another string.

That sounds reasonable


android/lib/talpid/src/main/kotlin/net/mullvad/talpid/model/RouteInfo.kt line 5 at r6 (raw file):

Previously, Rawa (David Göransson) wrote…

We don't expose this to daemon, so it should not be exposed in classes.rs

👍


android/lib/talpid/src/main/kotlin/net/mullvad/talpid/model/RouteInfo.kt line 8 at r6 (raw file):

Previously, Rawa (David Göransson) wrote…

It should already be in the classes.rs

Hmm somehow missed it.


mullvad-management-interface/proto/management_interface.proto line 171 at r6 (raw file):

Previously, Rawa (David Göransson) wrote…

Yes, INVALID_DNS_SERVERS and ROUTES_TIMED_OUT are android only. E.g SPLIT_TUNNEL_ERRORand NEED_FULL_DISK_PERMISSIONS are not.

Right

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Android Issues related to Android On hold Means the PR is paused for some reason. No need to review it for now
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants