04 Feb 2019, 17:15

Pkgsrc Buildbots

After talking to Sijmen Mulder on IRC (thanks, TGV Wi-Fi!), I began thinking more about how you could automate the pkgsrc release engineers away.

The basic idea for a buildbot would be this:

  1. Download and unpack latest pkgsrc.tar.gz for the stable branch.
  2. Run the pullup script with the ticket number, then run whatever shell script it outputs.
  3. Figure out the package that this concerns (perhaps from filenames).
  4. Go to the package in question, install its dependencies from binary packages.
  5. Build (make package is probably enough, or perhaps also install?).
  6. Upload build log to Cloud Storage.
  7. Post an email to the pullup thread with status and a link to the log.

For extra points, do this in a fresh, ephemeral VM, triggered by an incoming mail.
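The steps above could be sketched as a set of shell functions, something like the following. This is only a sketch: the pullup-script interface, URLs, bucket names, and mail addresses here are assumptions for illustration, not the real release-engineering tooling.

```shell
# Sketch of a pullup buildbot, one function per step. All paths,
# URLs and addresses below are illustrative assumptions.

fetch_pkgsrc() {    # step 1: download and unpack the stable branch
    ftp "https://cdn.NetBSD.org/pub/pkgsrc/$1/pkgsrc.tar.gz"
    tar xzf pkgsrc.tar.gz
}

run_pullup() {      # step 2: generate the pullup script, then run it
    ./pullup.sh "$1" > "ticket-$1.sh" && sh "ticket-$1.sh"
}

pkgdir_from_file() {  # step 3: "devel/gmake/Makefile" -> "devel/gmake"
    echo "$1" | cut -d/ -f1-2
}

build_package() {   # steps 4+5: dependencies (ideally from binary
                    # packages), then the build itself
    cd "pkgsrc/$1" && make depends && make package 2>&1 | tee build.log
}

report() {          # steps 6+7: upload the log, mail the ticket thread
    gsutil cp build.log "gs://pullup-logs/$1.log" &&
    echo "See https://storage.googleapis.com/pullup-logs/$1.log" |
        mail -s "pullup #$1 build result" "pullup-pkgsrc@NetBSD.org"
}
```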

You would also need a buildbot supervisor that receives mails (to know that it should build something) and that launches the VM. I know that Google App Engine could do it, as it can receive emails. But maybe Cloud Functions would be the way to go?

In any case, this would be a cool project for someone, maybe myself :)

Issues with Pull-up Ticket Tracking

This project is largely orthogonal to improvements in the pullup script. Right now, there are a number of issues with it that make it require manual intervention in many cases:

  • The tracker (req) doesn’t do MIME, so mails are sometimes encoded with base64 or quoted-printable. This breaks parsing the commit mails.
  • Sometimes, submitters of tickets insert mail-index.netbsd.org URLs instead of copies of the message.
  • Some pullup tickets include a patch instead of, or in addition to, a list of commits. For instance, this may happen when backporting a fix to an older release instead of pulling up a bigger update.
  • Sometimes, commit messages are truncated, or there are merge conflicts. This mostly happens when there has been a revbump before the change that is to be committed – in the majority of cases, the merge conflicts only concern PKGREVISION lines.

I am wondering how much we could gain, e.g. in terms of MIME support, from changing the request tracking software. admins@ uses RT, which has more features. Perhaps that could be brought to pullup tickets?

29 Dec 2018, 13:12

Supporting Go Modules in pkgsrc, a Proposal

Go 1.11 introduced a new way of building Go code that no longer needs a GOPATH at all. In due course, this will become the default way of building. What’s more, sooner or later, we are going to want to package software that only builds with modules.

There should be some package-settable variable that controls whether you want to use modules or not. If you are going to use modules, then the repo should have a go.mod file. Otherwise (e.g. if there is a dep file or something), the build could start by doing go mod init (which needs to be after make extract).

fetch

There can be two implementations of the fetch phase:

  1. Run go mod download.

This downloads the required packages into a cache directory, $GOPATH/pkg/mod/cache/download. Then, I propose tarring up the whole tree into a single .tar.gz and putting that into the distfile directory for make checksum. Alternatively, we could have the individual files from the cache as “distfiles”. Note, however (see below), that the filenames alone do not contain the module name, so there would be tons of files named v1.0.zip and so on.

  2. “Regular fetch”

Download the .tar.gz (or the set of individual files) from above, as usual, from the LOCAL_PORTS directory on ftp.NetBSD.org.

The files that go mod download creates are different from any of the ones that upstream provides. Notably, the zip files are based on a VCS checkout followed by re-zipping. Here is an example for the piece of a cache tree corresponding to a single dependency (ignore the lock files):

./github.com/nsf/termbox-go/@v:
list
list.lock
v0.0.0-20180613055208-5c94acc5e6eb.info
v0.0.0-20180613055208-5c94acc5e6eb.lock
v0.0.0-20180613055208-5c94acc5e6eb.mod
v0.0.0-20180613055208-5c94acc5e6eb.zip
v0.0.0-20180613055208-5c94acc5e6eb.ziphash

As an additional complication, (2) needs to run after “make extract”. Method (1) cannot always be the default, as it needs access to some kind of hosting. A non-developer cannot easily upload the distfile.
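Method (1) could be sketched as a small shell function along these lines. This is a sketch only: the function name is made up, and the lowercase variables stand in for the usual pkgsrc WRKSRC, DISTDIR and PKGNAME.

```shell
# Sketch of fetch method (1): populate a module cache under the work
# directory, then tar the whole cache up as a single distfile that
# "make checksum" can verify. All names are placeholders.
make_module_distfile() {
    wrksrc=$1 distdir=$2 pkgname=$3
    # go mod download fills $GOPATH/pkg/mod/cache/download
    ( cd "$wrksrc" && GOPATH="$wrksrc/gopath" go mod download )
    tar czf "$distdir/$pkgname-modules.tar.gz" \
        -C "$wrksrc/gopath/pkg/mod/cache" download
}
```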

extract

In a GOPATH build, we do some gymnastics to move the just-extracted source code into the correct place in a GOPATH. This is no longer necessary, and module builds can just use the same $WRKSRC logic as other software.

build

The dependencies tarball (or individual dependencies files) should be extracted into $GOPATH, which in non-mod builds is propagated through buildlink3.mk files of dependent packages. After this, in all invocations of the go tool, we set GOPROXY=file://$GOPATH/pkg/mod/cache/download, as per this comment from the help:

A Go module proxy is any web server that can respond to GET requests for URLs of a specified form. The requests have no query parameters, so even a site serving from a fixed file system (including a file:/// URL) can be a module proxy.

Even when downloading directly from version control systems, the go command synthesizes explicit info, mod, and zip files and stores them in its local cache, $GOPATH/pkg/mod/cache/download, the same as if it had downloaded them directly from a proxy.
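Concretely, the build-phase environment could then be set up as follows; /tmp/work is a placeholder for pkgsrc's real ${WRKDIR}.

```shell
# Set up the module-aware build environment: GOPATH lives under the
# package's work directory, and GOPROXY points the go tool at the
# extracted download cache instead of the network.
WRKDIR=/tmp/work    # placeholder for pkgsrc's real ${WRKDIR}
GOPATH="$WRKDIR/gopath"
GOPROXY="file://$GOPATH/pkg/mod/cache/download"
export GOPATH GOPROXY
echo "$GOPROXY"    # prints file:///tmp/work/gopath/pkg/mod/cache/download
```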

20 Nov 2018, 20:10

Race Condition at the Pool

Recently, I stumbled upon an odd race condition, at the local public pool of all places. The following workflow, which should be standard, does not work:

  1. Buy a 10-entry ticket and pay with debit card.
  2. Immediately try to redeem one entry to, well, go for a swim.

The freshly printed card will be declined, and you have to ask for help. When you leave (because this is Switzerland and everyone is honest, right!?), you hand in the card at the cash desk, and it is perfectly fine.

They tell me that this is a common issue, and that they see it a lot.

The only explanation I can find for this behavior is that the system handling the tickets only declares the ticket as valid once it has fully cleared the card transaction – perhaps to reduce the risk of fraud. This fits with the observation that paying with cash does not trigger this issue. Also, I note that the ticket itself is apparently only a record number in a database.

So this is how fraud prevention can annoy the hell out of your customers :)

10 Nov 2018, 18:46

pkgsrc: Upgrading, Part 1

I found this text in my post drafts, where it had been sitting for a bit. Consider this the first part of a series on keeping pkgsrc up to date.

If you have not upgraded the packages in your pkgsrc installation in a while, you might be so far behind on updates that most or all your packages are outdated. Now what?

The easiest way to update your packages in order is to use pkg_rolling-replace. Update your pkgsrc tree (either to the latest from CVS, or to a supported quarterly release), then simply run

$ pkg_rolling-replace -uv

This rebuilds the required set of packages, in the right order. It takes a while, since everything is rebuilt from source, and the process is somewhat likely to break in the middle. When the compilation of a package fails, the tool just stops and leaves you with an inconsistent (and, in the worst case, non-working) set of packages. Good luck fixing things. Making a backup of your /usr/pkg and /var/db/pkg* directories before you start is a good idea.
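One possible way to take that backup, sketched as a small helper; the paths in the example invocation are the defaults for a pkgsrc installation, so adjust them if yours differ.

```shell
# Tar up the package prefix and package database before upgrading,
# so a broken pkg_rolling-replace run can be rolled back.
backup_pkg() {
    tar czf "pkg-backup-$(date +%Y%m%d).tar.gz" "$@"
}

# Typical invocation (as root, default paths):
# backup_pkg /usr/pkg /var/db/pkg /var/db/pkg.refcount
```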

10 Nov 2018, 17:53

Build Systems: CMake and Autotools

I think I am finally warming up to CMake.

Eight years ago (at FOSDEM 2010), I gave a talk on build systems that explains the fundamentals of automake, autoconf and libtool:

As far as I can see, nothing in this talk has stopped being valid, though CMake was “newfangled” then and is a lot less so today. In any case, my conclusion still stands:

  • Don’t try to reinvent the wheel, use a popular build system.
  • You cannot write a portable build system from scratch – so don’t try.

My advice came from pent-up frustration over software that did not build on my platform (MirBSD, at the time), but it remains true today. And to be clear:

Autotools is still a good choice for new code.

CMake

However, recent experience has made me like CMake a lot more. For one, it is more common today, which means that packaging systems such as pkgsrc have good support for using it. For instance, in a pkgsrc Makefile, configuring using CMake is as easy as specifying

USE_CMAKE=	yes

As the user of a package (i.e. the person who compiles it), CMake builds are compelling because they (a) configure faster and (b) build faster.

Regarding configuring, it is infuriating (to me) how the run time of the autotools configure script totally dominates the build time once you run make -j12 or similar. CMake typically checks fewer things (I think) and does not run giant blobs of shell, so it is faster.

For the latter, I have noticed that CMake builds typically manage to use all the cores of the machine, while automake-based builds do not. I think (again, this is speculation) that this is because automake encourages one Makefile per directory (which are run sequentially, not in parallel) and one directory per target, while CMake builds everything in one go. Automake can use a single Makefile for all directories too, but support for that was only added a few years ago, and it seems rarely used.

CMake builds also have different diagnostics (console output), optionally in color. Some people hate the colors, and they can be garish, but I do like the percentages that are shown for every line.

Concrete case: icewm

When I recently packaged wm/icewm14, I noticed that you now have the choice of CMake or autotools, and I ended up with CMake. There were a few things to fix but its CMakeLists.txt file is reasonably easy to edit. Note that it contains both configuration tests and target declarations. Here is a small example:

ADD_EXECUTABLE(genpref${EXEEXT} genpref.cc ${MISC_SRCS})
TARGET_LINK_LIBRARIES(genpref${EXEEXT} ${EXTRA_LIBS})

# ... other targets ...

INSTALL(TARGETS icewm${EXEEXT} icesh${EXEEXT} icewm-session${EXEEXT} icewmhint${EXEEXT} icewmbg${EXEEXT} DESTINATION ${BINDIR})

Compared to the same thing in automake:

noinst_PROGRAMS = \
	genpref

genpref_SOURCES = \
	intl.h \
	debug.h \
	sysdep.h \
	base.h \
	bindkey.h \
	themable.h \
	default.h \
	genpref.cc
genpref_LDADD = libice.la @LIBINTL@

So if anything, the syntax is no worse but the result is a bit better. I was able to rummage around in CMakeLists.txt without reading any documentation.

05 Aug 2018, 14:43

Working categories

Vacation is a good time for some housekeeping. So I managed to get categories for posts working! In the process, I learned a bit about how Hugo works.

Hugo automatically creates so-called taxonomies for tags and categories. The theme I have been using only shows categories in the header itself. On top of that, I had managed to disable the categories taxonomy in the config file, and I had gotten the syntax for specifying categories wrong.

It turns out that the config file is TOML while the front matter of posts is YAML, for some inexplicable reason. So the correct syntax for categories is a YAML list:

categories:
 - NetBSD
 - Cloud

Hugo will then create pages for each category under /categories, with a filtered list of posts. Click a category on a post below to see it.
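For completeness, the config side: if I read the Hugo documentation right, declaring the taxonomies explicitly in config.toml (TOML, unlike the front matter) looks like this.

```toml
# config.toml: declare the taxonomies explicitly instead of
# accidentally disabling them.
[taxonomies]
  tag = "tags"
  category = "categories"
```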

Google Analytics disabled

While here, I also got rid of the Google Analytics include (yay data protection!). However, the whole site remains hosted on Firebase Hosting, which is a Google product.

09 May 2018, 18:44

Windows 10 April Update, unbootable system

A few days ago, I installed the Windows 10 “April update”, and it broke my GRUB installation. What happened?

My primary disk has an MBR partition table. (Apparently, booting from GPT requires using UEFI, which exposes a whole new exciting set of firmware bugs.) GRUB was installed in a small ext2 partition (primary partition #3), while primary partitions #1 and #2 were used by Windows 10. Installing the April update created another primary partition and moved my ext2 partition to slot #4, so GRUB could no longer find its files.

Rescuing with grub

When this happens, you are greeted with a nice “rescue shell” mode.

grub rescue>

Now what? Unfortunately, unlike GRUB 1, GRUB 2 cannot do much without loading additional modules. You cannot even chainload Windows from a PBR, because even the chainloader command lives in a module that must be loaded first.

You can list partitions and directories with the ls command. The set command shows the variables that have been set; there is a variable named prefix that contains the path to the grub files. So after finding out that /boot/grub is now on (hd0,msdos4) instead of (hd0,msdos3) as before, you can re-set the prefix, load the normal module and execute the normal command to do the, well, normal boot:

set prefix=(hd0,msdos4)/boot/grub
insmod normal
normal

Unfortunately, there is no way (that I could find) to make the new prefix permanent, other than running grub-install from a running system. That is, you would be doing this dance on every boot from now on. Also unfortunately,

  • the version of grub2 in pkgsrc is kind of old and
  • the grub2 package has not built, as far as I can tell, for anyone in the last two years.

As a result, I would need to boot a grml.org rescue system and reinstall grub from there.

Working around

There is an easier way to make GRUB find its files again. We just need to make sure the partition table matches what it expects.

A partition table is like an array of pointers (start and length). So you can delete two partitions and recreate them with exactly the same (start, length) tuples but different partition numbers, and things will just work. So I did exactly that, using fdisk -u ld0 on NetBSD. It helps to do this on X, where you can open two terminal windows side by side: one shows the previous partition table, one the interactive fdisk session, and you can copy values around with the mouse. tmux (in the NetBSD base system) on a text screen would also work.

So in my case, I swapped partitions #3 and #4, and GRUB worked again!

And one day later, Windows decided to apply the May update to the April update -.-

Addendum: What new partition!?

Honestly, I am not sure what the new partition is actually doing. Even though I want a single Windows partition, there are now three of them:

  1. an NTFS partition of about 100 MB,
  2. my actual C drive,
  3. a partition of type 0x27 of about 450 MB. Googling shows that this might be MirOS BSD (which I doubt) or a hidden recovery partition of some sort.

I am not sure what is going on here. I would appreciate hints from readers as to why these partitions are there.

18 Jan 2018, 19:44

New Blog!

My new year’s resolution for 2018 has been to blog more. So I decided to create an actual blog!

It started with me closing my Amazon AWS account and writing about it. The posting was up as a Gist on GitHub, and I shared that URL. This does give a useful viewer and the ability to add comments, but it is hardly discoverable. Anyway, the post was somewhat widely circulated (the provocative title certainly helped) and even made it to Reddit.

After deciding to create a real blog, I started looking into how to do that.

Now I finally have something up and running. The address is simple enough, and based on my Twitter handle:

https://bentsukun.ch/

In true Open Source tradition, the source code for the site is visible online at https://github.com/bsiegert/blog.

For now, I am quite happy with this setup.

The setup

  • Hugo with the purehugo theme,
  • Firebase Hosting,
  • domain at HostTech.

The site itself is created with the excellent Hugo, which has the added advantage of being written in Go :) The installation is as easy as

go get github.com/gohugoio/hugo

or by using the www/hugo package in pkgsrc. Hugo ends up creating a fully static web site, so simple static hosting is enough.

Google Cloud Storage supports hosting a static web site – that’s great, right? Well, not quite. There is no support for HTTPS in plain GCS. If you want HTTPS, apparently you need:

  1. the GCS bucket that holds the files,
  2. a Cloud Load Balancer instance in front, and
  3. Cloud CDN for delivering content from the edge.

Or, you can use Firebase Hosting, also by Google. This is what I am doing. A simple firebase deploy uploads all the static files, and an SSL certificate is included in the offer.

I tried using Google Domains to register the domain, but that service is not available in Switzerland, unfortunately. hosttech.ch to the rescue! It took me a while to figure out (a) that a DNS zone for the domain is included and (b) how to add entries to it.

To get Firebase Hosting to serve directly on the domain, it is necessary to add a google-site-verification TXT record into the DNS for it. And that’s it.

03 Dec 2017, 22:16

Leaving AWS

Today, I deleted my Amazon AWS account.

And done!

I had been on AWS since about 2011. My usage was mainly for two things:

  1. Saving large amounts of files (build logs and such) on S3;
  2. Running NetBSD VMs on EC2.

EC2 is based on Xen, and NetBSD runs really well in PV (paravirtualized) mode on Xen. However, XSA-240 means that a malicious PV guest may crash (or even otherwise exploit) the hypervisor, with the recommended fix being to not run untrusted PV guests. Overnight, Amazon disabled PV, making NetBSD VMs useless.

In general, EC2 has been moving away from Xen. The newer instance types already no longer supported PV; there are two higher-performance paravirtualized modes (PVH and PVHVM) that are preferred these days, and that NetBSD does not support. The newest machine types use a custom hypervisor based on KVM.

The way the PV change was rolled out highlighted another long-standing EC2 problem: instances would continue running until the server they ran on got rebooted, at which point they were migrated to a random machine. If the target machine had PV disabled, the VM simply did not come up again. I have had the same type of issue in the past, where a VM randomly landed on a “good” or a “bad” machine and did not come up on the bad one. There is no way (AFAIK) to constrain a VM to a certain subset of servers, e.g. those running a certain hypervisor version.

Also, of course, there was no warning or announcement, just that VMs stopped working all of a sudden. A bunch of people were completely caught by surprise when their service became unavailable. I hope you have monitoring!?

The Alternative

Which brings me to where I did take my workloads: Google Cloud Platform.

(This has nothing to do whatsoever with who my employer is. I pay for my GCP usage with my own money.)

These days, NetBSD (8+) runs great on Google Compute Engine. There is a script (that I created) to stage instances at https://github.com/google/netbsd-gce, though there are no official NetBSD images around. My S3 usage works equally well using Google Cloud Storage. And I have always been a fan of App Engine, particularly because of its great Go support. https://bulktracker.appspot.com/ runs on App Engine.

Conclusion

My general impression is: features are roughly on par, prices on GCP are a bit cheaper, and the Google Cloud SDK and command-line tools are better. So rather than let old, unusable VM images continue to rot and pay Amazon $2 a month for that bit of storage, I let go of that AWS account. Bye, Amazon.

20 Mar 2012, 20:53

blog @ TNF

So now I am even posting over at TNF on http://blog.NetBSD.org/. Julian Fagir made new NetBSD flyers, and I committed them to the TNF website.

I know that I should write more here but there is not much new on the MirBSD front.

I updated the showcase to NetBSD-6_BETA on the Dom0, and now X refuses to start. Oh well. X does start when using a GENERIC kernel. This is very bad for showcase use, of course :(. pkgsrc is going into freeze very soon, and I did not do a whole lot of MirBSD fixes this time around. This is due to illness, searching for a new job, and working on the Go programming language, which is expected to hit version 1.0 Real Soon Now™.

I brushed up my Algorithms and Data Structures a bit by reading the third volume of TAOCP. Fantastic book.