Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add sunos support #39

Closed
wants to merge 1 commit into from
Closed

Conversation

nshalman
Copy link

@nshalman nshalman commented Aug 21, 2020

Wireguard (wireguard-go) for sunos (illumos/Solaris)

  • reliability testing via Tailscale (Tailscale for both Solaris and illumos can be built on top of this work and it has been quite reliable.)
  • a wg-quick(8) implementation, like we have for FreeBSD and OpenBSD and such: wg-quick for sunos wireguard-tools#17

Original Description

A currently working cleanup of the code by @jclulow from https://github.com/jclulow/wireguard-go-illumos-wip squashed down to a single commit.

This code works with the caveats noted in the commit message, and with additional modifications, can be added to the https://github.comtailscale/wireguard-go tree and be used to build mostly working tailscale binaries.

I barely know what I'm doing here, so any help carrying this across the finish line would be greatly appreciated.

@nshalman
Copy link
Author

@zx2c4
Copy link
Member

zx2c4 commented Aug 22, 2020

No response from David per usual, so I'll review this one. As a preliminary, it sounds like the current shortcoming is the requirement of -f? Any other quirks?

I'll need to get a Illumos VM running to test this out in.

Also, while we're doing this, we'll also need a wg-quick(8) implementation, like we have for FreeBSD and OpenBSD and such: https://git.zx2c4.com/wireguard-tools/tree/src/wg-quick . Interested in doing that? This can be a lot more fiddly than wireguard-go, depending on what kinds of routing schemes are supported by the Illumos kernel.

@zx2c4 zx2c4 self-requested a review August 22, 2020 19:18
Copy link
Member

@zx2c4 zx2c4 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like a lot of the heavy lifting is done and now there's just a lot of cleanup and refactoring done, plus that one thing about figuring out tun%d or something similar. There's work to be done, but this is a great start.

@@ -0,0 +1,146 @@
// +build illumos
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering -- is there a reason you couldn't add the illumos build tag to the uapi_bsd.go file? What's different about Illumos here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a diff, unfortunately. The BSD implementation uses kqueue to monitor for deletion of the unix socket (which is how wg-quick signals wireguard-go to shut down, afaict). A quick search tells me that Illumos doesn't have kqueue, but does have its own eventport API we could use.

Or if we're using SMF for lifetime management, we can skip all that and greatly simplify the UAPI listener, because wg-quick can make SMF shut stuff down reliably.

Reviewing the rest of this file on the assumption that we do indeed need a separate illumos implementation.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack. So we'll keep it as a separate thing and monitor socket file deletion using an eventport. That's fine with me.

tun/asm_illumos_amd64.s Outdated Show resolved Hide resolved
tun/tun_illumos.go Outdated Show resolved Hide resolved
tun/tun_illumos.go Outdated Show resolved Hide resolved
tun/tun_illumos.go Outdated Show resolved Hide resolved
tun/tun_illumos.go Outdated Show resolved Hide resolved
* descriptor number as the fd number is passed (with no other
* information) through the environment to the child. For now, use the
* "-f" flag.
*/
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like you already have everything, except for the name of the tun device, right? The tun%d part? Different Unices have various IOCTLs for grabbing this. Or some will expose the PID of the owning application and then you can enumerate. Unless the man pages tell some obvious way, you might be best off just poking at the kernel sources. If you can't find anything in a few days, let me know and I'll trawl around too. Every OS seems to have a different version of this same exercise.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did some quick rooting around in illumos-gate, but couldn't find net/if_tun.h (which seems to be the expected include file for both Solaris and Illumos). I also couldn't find it in SmartOS or OmniOS's forks, though I did find TritonDataCenter/smartos-live#330 .

This might be a minor issue for adding tuntap things to x/sys/unix, in that you'll have to manually define the constant blocks and whatnot, rather than being able to autogenerate them from the C headers.

Presumably that missing include also has the answer to "how do I get the interface name for this fd".

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original source for the if_tun.h is probably here: https://github.com/kaizawa/tuntap/blob/master/if_tun.h

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any updates on these investigations?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per the thread below on "cheating", I have thus far been unable to find an IOCTL for getting back the PPA.

tun/tun_illumos.go Outdated Show resolved Hide resolved
tun/tun_illumos.go Outdated Show resolved Hide resolved
tun/tun_illumos.go Outdated Show resolved Hide resolved
Copy link
Contributor

@danderson danderson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a little disheartening to encourage people to try and upstream useful things to wireguard-go (rather than keep them tailscale-specific), and then get chided for not doing code review on the weekend. I was taking my time because this is a fairly large change, and I'm unfamiliar with illumos's conventions and system facilities. Sending this very incomplete draft now, but I plan to do the bulk of reviewing during business hours.

Thanks @nshalman for sending this!

ipc/uapi_illumos.go Outdated Show resolved Hide resolved
tun/tun_illumos.go Outdated Show resolved Hide resolved
@zx2c4
Copy link
Member

zx2c4 commented Aug 22, 2020

get chided for not doing code review on the weekend

The comment was directed at David, not you Dave, who, by his own repeated admission, is often too busy for this stuff.

to encourage people to try and upstream useful things to wireguard-go (rather than keep them tailscale-specific)

If you consider that a remarkable thing to do, you've made it clear where you stand with regards to this ecosystem. Thanks for making that clear to me.

@danderson
Copy link
Contributor

If you consider that a remarkable thing to do, you've made it clear where you stand with regards to this ecosystem. Thanks for making that clear to me.

I don't consider it remarkable, I don't see how that follows from what I said. Regardless, I'll resume reading this code on Monday, since I'd certainly like Illumos support in wireguard-go, and it sounds like we agree on that.

@danderson
Copy link
Contributor

Oh, and regarding wg-quick: we did some preliminary digging on the facilities available in illumos.

The networking facilities are similar to the BSDs (no policy routing, IP_BOUND_IF socket option), so that part of the wg-quick implementation looks almost identical to the openbsd userspace mode: overlay routes for endpoint IPs, and /etc/resolv.conf replacement for DNS configuration.

For running and monitoring the daemon, SMF obviously springs to mind, but would lead to different semantics than the other wg-quick implementations (notably, wg-quick up would survive reboots until turned back off with wg-quick down). So there again, for uniformity of behavior the implementation would look like what openbsd does when kernel support is missing.

Happy to send the patches for wg-quick.

@nshalman
Copy link
Author

Thank you both, @zx2c4 and @danderson, for the feedback. I'll get started on cleaning up the stuff that I already understand, and work on figuring out the bits I don't understand just yet. :)

In the meantime, replying to some non-code threads:

As a preliminary, it sounds like the current shortcoming is the requirement of -f? Any other quirks?

No others that I noticed, though the only exercise of this code on my end was that it worked connecting to a raspberry pi running the kernel module based version running on a rasberry pi a while back, and seems to mostly work when used by my hacked-up fork of tailscale.

For running and monitoring the daemon, SMF obviously springs to mind, but would lead to different semantics than the other wg-quick implementations (notably, wg-quick up would survive reboots until turned back off with wg-quick down). So there again, for uniformity of behavior the implementation would look like what openbsd does when kernel support is missing.

SMF's svcadm enable does support a -t flag which makes it temporary; svadm enable -t svc:/wireguard/wg-quick:default (or however it would be named) would not remain enabled across a reboot.

Copy link
Contributor

@danderson danderson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for sending the patch! As Jason already pointed out, I think the big thing to adjust is to get a lot of the low-level OS interaction stuff merged into golang.org/x/sys/unix instead of here, at which point this code can be significantly simplified.

I'm happy to take care of the x/sys/unix bits if you'd like, I don't know what your appetite is for learning "deep Go" (the various autogenerators that power x/sys/unix - useful knowledge if you plan to write more Go near the OS interface layer). Lemme know if I should take the lead, or just review the PR when you post it to https://github.com/golang/sys .

The other question we should resolve soon-ish is whether we're going to try for SMF integration in wg-quick, or go down the less-managed route that the BSDs took. My bias would be to go with SMF, since it looks like transient services gives us the semantics we want (wg-quick up not persisting across reboots), but surfaces wireguard-go's existence in system management tooling. That said, I'm not an illumos expert, so I defer to the approach you think is most idiomatic for the system, as long as the behavior of wg-quick matches other OSes.

@@ -0,0 +1,146 @@
// +build illumos
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a diff, unfortunately. The BSD implementation uses kqueue to monitor for deletion of the unix socket (which is how wg-quick signals wireguard-go to shut down, afaict). A quick search tells me that Illumos doesn't have kqueue, but does have its own eventport API we could use.

Or if we're using SMF for lifetime management, we can skip all that and greatly simplify the UAPI listener, because wg-quick can make SMF shut stuff down reliably.

Reviewing the rest of this file on the assumption that we do indeed need a separate illumos implementation.

ipc/uapi_illumos.go Outdated Show resolved Hide resolved
ipc/uapi_illumos.go Outdated Show resolved Hide resolved
ipc/uapi_illumos.go Outdated Show resolved Hide resolved
ipc/uapi_illumos.go Outdated Show resolved Hide resolved

/*
* Open another temporary file descriptor to the TUN driver.
* XXX It's not clear if this is actually needed, or if we could
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless there's surprising semantics in the Illumos API surface, seems like it should be fine to reuse the device FD you extracted from the os.File.

tun/tun_illumos.go Outdated Show resolved Hide resolved
tun/tun_illumos.go Outdated Show resolved Hide resolved
tun/tun_illumos.go Outdated Show resolved Hide resolved

var flags int32 = 0

_, err := getmsg(int(tun.tunFile.Fd()), uintptr(0),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do STREAMS offer any kind of batched read/write API? Or does getmsg/putmsg handle batched delivery under the hood? I ask because "one syscall per packet" ends up limiting tuntap performance on many platforms, and I'm wondering if illumos' use of STREAMS would let it sidestep that from the start.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jclulow any insight here?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gco pointed me at bufmod(4M)

I haven't explored what implementing that would look like here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds like a nice optimization. Let's get things working right the vanilla way before we do things the wild way, though. It's already been a few years and this still hasn't shipped.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. I just wanted to leave the breadcrumb for that here rather than leaving it sitting in my Twitter DMs. :)

Copy link

@gco gco Oct 27, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will add latency, though. I suspect that putting the STREAM in RMSGD mode and using readv with sufficiently large buffers will work to read multiple packets with one syscall.

https://docs.oracle.com/cd/E19683-01/806-6546/appmech3-5/index.html

@zx2c4
Copy link
Member

zx2c4 commented Aug 24, 2020

The other question we should resolve soon-ish is whether we're going to try for SMF integration in wg-quick, or go down the less-managed route that the BSDs took.

Thanks for offering to work on this. If you're big into Illumos, taking the lead on the wg-quick integration would be much appreciated. I'd prefer to keep the same semantics as all the other wg-quick implementations. If we want to later add deeper Illumos integration with SMF, that sounds fine, but would be a separate endeavor. For now, let's get everything all lined up.

Usually there are a few problems to solve in wg-quick porting, beyond obvious things:

  • How do you change the DNS server in an exclusive way? Most implementations use resolvconf -m 0 -x, which is preferable. Some use the dns hatchet (see contrib/), which is grotesque. Some just overwrite /etc/resolv.conf, which is insufficient. I assume Illumos has something nice and uniform we can use.
  • How do we exempt the WireGuard socket from the default routing rule? Linux uses fwmarks, for example. [@nshalman - you'll want to add an implementation for this in mark_unix.go, by the way.]
  • How do we overwrite the default route without removing the prior default route? Linux uses multiple routing tables. Some of the BSDs &disown a little route monitor in bash. Hopefully Illumos has a Linux-like solution that doesn't require background processes.
  • How do you give an IP address to an interface without providing a "gateway" IP? Linux has the dev wg0 notation. Some BSDs support a -interface switch to ifconfig. Others rely on gross hacks. Hopefully Illumos has something clean.

I would suggest reading all of the current wg-quick implementations, except perhaps the Android one, before starting, and then maybe start with one of the BSD implementations and tweak it.

If you have additional questions about wg-quick development, I suggest putting this discussion on the mailing list, where wg-quick development happens, and where mailing list lurkers periodically pop up with some nice domain specific knowledge.

@danderson
Copy link
Contributor

Thanks for offering to work on this. If you're big into Illumos, taking the lead on the wg-quick integration would be much appreciated. I'd prefer to keep the same semantics as all the other wg-quick implementations. If we want to later add deeper Illumos integration with SMF, that sounds fine, but would be a separate endeavor. For now, let's get everything all lined up.

Sounds good. In that case, the Illumos implementation of wg-quick is likely to be nearly identical to the OpenBSD implementation:

  • How do you change the DNS server in an exclusive way? Most implementations use resolvconf -m 0 -x, which is preferable. Some use the dns hatchet (see contrib/), which is grotesque. Some just overwrite /etc/resolv.conf, which is insufficient. I assume Illumos has something nice and uniform we can use.

Illumos has no facilities for this that I could find (no resolvconf, and no Illumos-specific facility for managing DNS). It also lacks the piecemeal namespacing support that dns hatchet relies on, so we're stuck with replacing /etc/resolv.conf.

  • How do we exempt the WireGuard socket from the default routing rule? Linux uses fwmarks, for example. [@nshalman - you'll want to add an implementation for this in mark_unix.go, by the way.]

There's no policy routing facilities. Similar to OpenBSD, we'll have to add /32 or /128 routes to the endpoints.

  • How do we overwrite the default route without removing the prior default route? Linux uses multiple routing tables. Some of the BSDs &disown a little route monitor in bash. Hopefully Illumos has a Linux-like solution that doesn't require background processes.

Again, similar situation to the BSDs, no good solution. AFAICT, the current FreeBSD and OpenBSD scripts just add/delete routes per the wg-quick configuration and don't handle conflict at all. The route monitor updates endpoint routes to avoid loops, but does nothing to handle repairing the routing table on exit.

  • How do you give an IP address to an interface without providing a "gateway" IP? Linux has the dev wg0 notation. Some BSDs support a -interface switch to ifconfig. Others rely on gross hacks. Hopefully Illumos has something clean.

Illumos's route(1M) supports -interface, so again should be the same as OpenBSD.

Based on this, once we've merged the initial Illumos support, I'll probably start with a copy of the OpenBSD implementation, then circulate that in the Illumos community to see if there's something we can do better.

Copy link
Author

@nshalman nshalman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the etiquette for resolving conversations? Safe for me to mark nits and things as resolved? Preference for the original commenter to resolve?

I've replied to a few threads, and am about to push a commit with some initial nits fixed.
I just realized that I should have run gofmt over this code before opening the PR. My current plan is to delay the noise from making that change until the substantive fixes are in place.

ipc/uapi_illumos.go Outdated Show resolved Hide resolved
* descriptor number as the fd number is passed (with no other
* information) through the environment to the child. For now, use the
* "-f" flag.
*/
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original source for the if_tun.h is probably here: https://github.com/kaizawa/tuntap/blob/master/if_tun.h

tun/tun_illumos.go Outdated Show resolved Hide resolved

var flags int32 = 0

_, err := getmsg(int(tun.tunFile.Fd()), uintptr(0),
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jclulow any insight here?

tun/tun_illumos.go Outdated Show resolved Hide resolved
@nshalman nshalman force-pushed the illumos-to-upstream branch from ffbba2c to c6ce31a Compare March 13, 2023 14:20
@nshalman
Copy link
Author

Rebased to clear merge conflicts. Still ready for review by @zx2c4.

return tun.events
}

func (tun *NativeTun) Read(bufs [][]byte, sizes []int, offset int) (int, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

buffs is used consistently elsewhere

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Thank you for the close read!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Darn. I was rushing home to beat you to it but...

https://git.zx2c4.com/wireguard-go/commit/?id=0ad14a89f5f9da577dae6a63ad196015e51a0381

bufs seems more correct to me, so I fixed it everywhere.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ha! I will "unfix" it.

@nshalman nshalman force-pushed the illumos-to-upstream branch from c6ce31a to 15692aa Compare March 13, 2023 16:49
l.connErr <- err
return
}
_, perr := l.evPort.GetOne(nil)
Copy link
Member

@zx2c4 zx2c4 Mar 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_, err := l.evPort.GetOne(nil)

}

go func(l *UAPIListener) {
for {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you confirm that you've tested that this loop works as expected in a variety of circumstances?

}(uapi)

// watch for new connections

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need for this new line.

defer unix.Close(routeSocket)

data := make([]byte, os.Getpagesize())
for {
Copy link
Member

@zx2c4 zx2c4 Mar 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you confirm that this loop works in a variety of circumstances? Did you test it?

// a chance to notice
// 2. It doesn't really respect 'ifconfig tunN down' properly so this code
// won't notice, and it will be broken and refuse to come back up
// if it is set down.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I understand what you mean. It'll get set down but won't notice coming back up? It won't notice going down but it'll come back up just fine?

}
defer unix.Close(fd)

// get the STREAMS muxid for the tun device stream
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"get" -- what are you getting? You're putting a constant into a variable, casting it to an int. And then later...


// solaris/illumos defines the ioctl number as a signed int, but the
// common x/sys/unix functions use uint.
if err := unix.IoctlLifreq(fd, uint(reqnum), &l); err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...and then later you make it into a uint. Why?

Just write IoctlLifreq(fd, uint(unix.SIOG...), &l)


// The "tun" device is a STREAMS module. We need to make a STREAMS ioctl
// request to that module using the TUNNEWPPA command. Returns the newly
// allocated PPA number. Passing -1 will request a dynamically assinged
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

*assigned

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like that this comment uses capital letters at the beginning of the sentence and periods at the end. All comments should do that unless they really aren't sentences.

But then you mix two spaces between sentences and one space. Just do one space.

assignedName := fmt.Sprintf("tun%d", ppa)

// Clean up anything possibly left behind by a crash
tunDestroy(assignedName)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting...

What have you experienced being left behind? What are the circumstances here?

tunDestroy(assignedName)

// The solaris/illumos tun driver doesn't have an ioctl for reading back the PPA.
// Just pass it through to the daemon process via environment variable.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Darn. Still? Weren't you talking to them about adding this at some point?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where's the source code for the tun driver? Is it not part of the base illumos?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Weren't you talking to them about adding this at some point?

No, that was other functionality.

Also, I'm told that the code as it is today works on Solaris as well, an operating system that is closed source and over which I have even less influence. As gross as this is, I think it's a decent tradeoff to make to be able to support Solaris too.

If anything, some day illumos might get a newer better TUN driver that doesn't even use STREAMS and the solaris and illumos implementations in here would diverge. But that might not happen until I learn enough C to write one (and find the time and motivation to do so), so that could be (even more) years away...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you link to the source code of the tun driver in illumos?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As linked in a much older thread, the header file is here:
https://github.com/kaizawa/tuntap/blob/v1.3.3/if_tun.h

(Linking the version string referenced in e.g. the OmniOS build recipe:
https://github.com/omniosorg/omnios-build/blob/master/build/tuntap/build.sh#L21)

@gco or @alanc might be able to confirm that Solaris is using the same version...

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Solaris does not include the tuntap driver AFAIK, end-users need to install it if they need it. @danmcd would probably know the history (and merged tuntap into illumos!).

It's not in-gate in illumos. The distros ship it separately.

I linked to the OmniOS build recipe above. The SmartOS one is https://github.com/TritonDataCenter/illumos-extra/tree/master/tun and also references 1.3.
The OpenIndiana one is at https://github.com/OpenIndiana/oi-userland/tree/oi/hipster/components/network/tuntap

I don't see the Solaris build recipe in https://github.com/oracle/solaris-userland/tree/master/components so I'm guessing that it's in a closed repo. I was just wondering what version string you folks have and whether you've made any interesting changes to it.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll emphasize "does not include". Solaris users need to get it from the 10 year old github repository.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll emphasize "does not include". Solaris users need to get it from the 10 year old github repository.

Got it. So to be very explicit, in order to use my wireguard-go and tailscale ports you installed the driver directly from the https://github.com/kaizawa/tuntap repo. Is that correct?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I got mine from @danmcd many years ago as part of a private tool called punchin. It's built from the same sources.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay so it sounds like there's a single (old & crusty) codebase for tuntap shared across Solaris, Illumos, and derivatives. So,

Also, I'm told that the code as it is today works on Solaris as well, an operating system that is closed source and over which I have even less influence. As gross as this is, I think it's a decent tradeoff to make to be able to support Solaris too.

This is not correct then, right?

var ppa int
_, err := fmt.Sscanf(tun.name, "tun%d", &ppa)
if err != nil {
return fmt.Errorf("received a bogus tun name: %s", tun.name)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/bogus/invalid/
s/received a //

return fmt.Errorf("received a bogus tun name: %s", tun.name)
}

// Unclear why, but we need to open a second handle on the tun device for
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Figure out why and document it.


// Unclear why, but we need to open a second handle on the tun device for
// the ioctls that follow. Reusing the existing one does not work.
tunFD, err := unix.Open(tunNode, unix.O_RDWR, 0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Everywhere else in the codebase, the D in Fd is lowercase.


// select which ppa we're using
req := int(unix.IF_UNITSEL)
err = unix.IoctlSetPointerInt(tunFD, uint(req), ppa)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to make req a separate variable.

return fmt.Errorf("unable to IF_UNITSEL: %v", err)
}

// link the tun stream to the IP multiplexor
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Multiplexer with an e is the more common spelling. Is "o" a illumos thing?

No capital letter and period on this sentence, but it is elsewhere. I'm going to stop commenting on these things throughout the patchset. For next review just make sure they're all fixed per above.

}

// By holding open this file descriptor on /dev/ip for the life of the connection
// and using I_LINK, STREAMS will tear everything down cleanly when we close these
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But then I_LINK isn't actually set until deep down in a separate function. Maybe configureSTREAMS should be inlined into this function instead of being standalone, so that it's more clear.


var l unix.Lifreq
if err := l.SetName(tun.name); err != nil {
return fmt.Errorf("failed to set name on Lifreq :%v", err)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

failed to here, but unable to elsewhere. Look at what wireguard-go does elsewhere and copy that.

}

// set the IP muxid
reqnum := int(unix.SIOCSLIFMUXID)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reqnum doesn't need to be a variable.

return err
}

reqnum := int(unix.SIOCSLIFMTU)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reqnum doesn't need to be a separate variable.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like I've tried to remove all these reqnum assignment before, but to no avail.

We'll use this thread for you to help me satisfy the compiler.
The code as written compiles.
For each of the following variations I get the corresponding failure:

Just move it:

-       reqnum := int(unix.SIOCGLIFMTU)
-       if err := unix.IoctlLifreq(tun.ipfd, uint(reqnum), &l); err != nil {
+       if err := unix.IoctlLifreq(tun.ipfd, uint(int(unix.SIOCGLIFMTU)), &l); err != nil {

Error:

tun/tun_solaris.go:389:44: cannot convert int(unix.SIOCGLIFMTU) (constant -1065850502 of type int) to type uint

Skip the intermediate int conversion:

-       reqnum := int(unix.SIOCGLIFMTU)
-       if err := unix.IoctlLifreq(tun.ipfd, uint(reqnum), &l); err != nil {
+       if err := unix.IoctlLifreq(tun.ipfd, uint(unix.SIOCGLIFMTU), &l); err != nil {

Error:

tun/tun_solaris.go:389:44: cannot convert unix.SIOCGLIFMTU (untyped int constant -1065850502) to type uint

No conversions:

-       reqnum := int(unix.SIOCGLIFMTU)
-       if err := unix.IoctlLifreq(tun.ipfd, uint(reqnum), &l); err != nil {
+       if err := unix.IoctlLifreq(tun.ipfd, unix.SIOCGLIFMTU, &l); err != nil {

Error:

tun/tun_solaris.go:389:39: cannot use unix.SIOCGLIFMTU (untyped int constant -1065850502) as uint value in argument to unix.IoctlLifreq (overflows)

I'm using go 1.20.1.

So. How should I proceed to clean these up?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See also: https://go.googlesource.com/sys/+/master/unix/syscall_solaris_test.go#370

Dunno if that counts; I wrote it. lolsob

Copy link
Member

@zx2c4 zx2c4 Mar 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The compiler isn't confused; you are. Your comment in that test isn't accurate and does a disservice.

It looks like ioctl has the wrong signature in Go. According to https://illumos.org/man/2/ioctl req is an int. In Go, that's int32, not int or uint as the Go code has now. Presumably it's too late to fix that, which is unfortunate.

What you can do, though, is just avoid the negative ioctl constant for the ones you added, which probably have no users yet anyway. This way you don't run into ambiguity with sign extension. The easiest way to do this is probably just by adding & 0xffffffff onto the end of the constant definition. While you're at it, you can fix your bogus comment in the test too.

Be sure to submit this to x/sys with a compelling argument that you understand well; don't just link to this comment or something and call it a day.

(The lazy version, by the way, is just unix.IoctlLifreq(tun.ipfd, unix.SIOCGLIFMTU & 0xffffffff, &l)).

@zx2c4
Copy link
Member

zx2c4 commented Mar 13, 2023

From your top message:

reliability testing via Tailscale (Tailscale for both Solaris and illumos can be built on top of this work and it has been quite reliable.)

Testing with third-party software is not sufficient. Test wireguard-go itself and then report back. Please write what you've tested and how.

@nshalman
Copy link
Author

Testing with third-party software is not sufficient. Test wireguard-go itself and then report back. Please write what you've tested and how.

Agreed. I have done testing with wireguard-go itself alongside WireGuard/wireguard-tools#17 too.

They are packaged for OmniOS here:
https://github.com/omniosorg/omnios-extra/tree/master/build/wireguard-go
https://github.com/omniosorg/omnios-extra/tree/master/build/wireguard-tools
And for SmartOS here:
https://github.com/TritonDataCenter/pkgsrc-extra/tree/main/wireguard-go
https://github.com/TritonDataCenter/pkgsrc-extra/tree/main/wireguard-tools

I have also gotten feedback from other users using both (some of which resulted in the recent bug fix in that PR.)

Do you have a list of specific tests you would want to see done?

@zx2c4
Copy link
Member

zx2c4 commented Mar 13, 2023

Do you have a list of specific tests you would want to see done?

I'm primarily concerned about all the weird cases with event listers on the unix socket, tun interface, and all that. So it'd be nice to know that you've printf'd up those code paths at some point to make sure its the behavior you're expecting. The lengthy development here and unsatisfactory earlier results make me slightly more wary of code that looks fine but might not be. So please just make sure that all parts of this PR are intentional and understood, and test that they're doing what that understanding expects them to do.

@zx2c4
Copy link
Member

zx2c4 commented Mar 14, 2023

I just spent a while reading through all the review over the last 3 years and the various revisions and such. I think I'm uncomfortable with this landing through Github and don't have enough trust that you've dotted all your is and crossed your ts. So I think we need to move to a more rigid and less haphazard review process.

Please submit this to the wireguard mailing list when it's ready, as "PATCH v1", with a proper, detailed, and well-written commit message that explains the whole patch, the testing you've done, the provenance of the tun driver this is written for, rudimentary performance measurements, and so on. Be sure to add to the Cc: ... trailers relevant people, such as those various Illumos developers and maybe the author of the tun driver, and whoever else you feel should be there. I'll want to see a review in that thread from somebody who knows Illumos, so if you know anybody there, be sure to have them in the Cc also.

Be sure to read the documentation on submitting patches to mailing lists in the Linux kernel repository.

Yes, this is annoying and probably feels like paperwork. But there's a lot of inconsistency here, and I don't feel alright moving forward with the usual "it's probably fine" stance I ordinarily have.

@zx2c4 zx2c4 closed this Mar 14, 2023
@WireGuard WireGuard locked and limited conversation to collaborators Mar 14, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Development

Successfully merging this pull request may close these issues.

6 participants