Kernel Planet

April 20, 2018

Kees Cook: UEFI booting and RAID1

I spent some time yesterday building out a UEFI server that didn’t have on-board hardware RAID for its system drives. In these situations, I always use Linux’s md RAID1 for the root filesystem (and/or /boot). This worked well for BIOS booting since BIOS just transfers control blindly to the MBR of whatever disk it sees (modulo finding a “bootable partition” flag, etc, etc). This means that BIOS doesn’t really care what’s on the drive, it’ll hand over control to the GRUB code in the MBR.

With UEFI, the boot firmware is actually examining the GPT partition table, looking for the partition marked with the “EFI System Partition” (ESP) UUID. Then it looks for a FAT32 filesystem there, and does more things like looking at NVRAM boot entries, or just running BOOT/EFI/BOOTX64.EFI from the FAT32. Under Linux, this .EFI code is either GRUB itself, or Shim which loads GRUB.
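For context, a quick way to see which partition the firmware will treat as the ESP is to look for that type GUID directly. This is a hedged sketch (lsblk output columns vary by version, and device names are examples):

```shell
# The ESP is identified by a fixed GPT partition type GUID from the
# UEFI specification, not by its name or mount point:
ESP_GUID="c12a7328-f81f-11d2-ba4b-00a0e93ec93b"

# Show each partition's type GUID and filesystem; the ESP's row will
# carry the GUID above and, normally, a vfat filesystem:
command -v lsblk >/dev/null && lsblk -o NAME,PARTTYPE,FSTYPE
echo "looking for partition type $ESP_GUID"
```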

So, if I want RAID1 for my root filesystem, that’s fine (GRUB will read md, LVM, etc), but how do I handle /boot/efi (the UEFI ESP)? In everything I found answering this question, the answer was “oh, just manually make an ESP on each drive in your RAID and copy the files around, add a separate NVRAM entry (with efibootmgr) for each drive, and you’re fine!” I did not like this one bit since it meant things could get out of sync between the copies, etc.

The current implementation of Linux’s md RAID puts metadata at the front of a partition. This solves more problems than it creates, but it means the RAID isn’t “invisible” to something that doesn’t know about the metadata. In fact, mdadm warns about this pretty loudly:

# mdadm --create /dev/md0 --level 1 --raid-disks 2 /dev/sda1 /dev/sdb1
mdadm: Note: this array has metadata at the start and
    may not be suitable as a boot device.  If you plan to
    store '/boot' on this device please ensure that
    your boot-loader understands md/v1.x metadata, or use
    --metadata=0.90

Reading from the mdadm man page:

-e, --metadata=
    ...
    1, 1.0, 1.1, 1.2 default
        Use the new version-1 format superblock.  This has fewer
        restrictions.  It can easily be moved between hosts with
        different endian-ness, and a recovery operation can be
        checkpointed and restarted.  The different sub-versions store
        the superblock at different locations on the device, either at
        the end (for 1.0), at the start (for 1.1) or 4K from the start
        (for 1.2).  "1" is equivalent to "1.2" (the commonly preferred
        1.x format).  "default" is equivalent to "1.2".

First we toss a FAT32 on the RAID (mkfs.fat -F32 /dev/md0). Looking at the results, the first 4K is entirely zeros, and file doesn’t see a filesystem:

# dd if=/dev/sda1 bs=1K count=5 status=none | hexdump -C
00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00001000  fc 4e 2b a9 01 00 00 00  00 00 00 00 00 00 00 00  |.N+.............|
...
# file -s /dev/sda1
/dev/sda1: Linux Software RAID version 1.2 ...

So, instead, we’ll use --metadata 1.0 to put the RAID metadata at the end:

# mdadm --create /dev/md0 --level 1 --raid-disks 2 --metadata 1.0 /dev/sda1 /dev/sdb1
...
# mkfs.fat -F32 /dev/md0
# dd if=/dev/sda1 bs=1 skip=80 count=16 status=none | xxd
00000000: 2020 4641 5433 3220 2020 0e1f be77 7cac    FAT32   ...w|.
# file -s /dev/sda1
/dev/sda1: ... FAT (32 bit)

Now we have a visible FAT32 filesystem on the ESP. UEFI should be able to boot whatever disk hasn’t failed, and grub-install will write to the RAID mounted at /boot/efi.

However, we’re left with a new problem: on (at least) Debian and Ubuntu, grub-install attempts to run efibootmgr to record which disk UEFI should boot from. This fails, though, since the detection logic expects a single disk, not a RAID set: it comes up with nothing, and grub-install ends up running efibootmgr with an empty -d argument:

Installing for x86_64-efi platform.
efibootmgr: option requires an argument -- 'd'
...
grub-install: error: efibootmgr failed to register the boot entry: Operation not permitted.
Failed: grub-install --target=x86_64-efi
WARNING: Bootloader is not properly installed, system may not be bootable

Luckily my UEFI boots without NVRAM entries, and I can disable the NVRAM writing via the “Update NVRAM variables to automatically boot into Debian?” debconf prompt when running: dpkg-reconfigure -p low grub-efi-amd64

So, now my system will boot with both or either drive present, and updates from Linux to /boot/efi are visible on all RAID members at boot-time. HOWEVER there is one nasty risk with this setup: if UEFI writes anything to one of the drives (which this firmware did when it wrote out a “boot variable cache” file), it may lead to corrupted results once Linux mounts the RAID (since the member drives won’t have identical block-level copies of the FAT32 any more).
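One way to notice that kind of divergence is to compare just the leading filesystem area of each member, since with 1.0 metadata the per-device superblocks at the tail legitimately differ. This is a sketch of my own (the same_prefix helper is not something mdadm provides, and the device paths and size are examples):

```shell
# same_prefix A B NBYTES -- succeed if the first NBYTES of the two
# files/devices are identical.
same_prefix() {
    a=$(head -c "$3" "$1" | sha256sum | cut -d' ' -f1)
    b=$(head -c "$3" "$2" | sha256sum | cut -d' ' -f1)
    [ "$a" = "$b" ]
}

# e.g. (as root), compare the first 64MB of the two ESP members:
#   same_prefix /dev/sda1 /dev/sdb1 $((64 * 1024 * 1024)) \
#       || echo "ESP copies diverged; resync needed"
```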

To deal with this “external write” situation, the cleanest solution I found is to force a full resync of the RAID at every boot, using mdadm’s “--update=resync” assembly option. This required updating /etc/mdadm/mdadm.conf to add <ignore> on the RAID’s ARRAY line to keep it from auto-starting:

ARRAY <ignore> metadata=1.0 UUID=123...

(Since it’s ignored, I’ve chosen /dev/md100 for the manual assembly below.) Then I added the noauto option to the /boot/efi entry in /etc/fstab:

/dev/md100 /boot/efi vfat noauto,defaults 0 0

And finally I added a systemd oneshot service that assembles the RAID with resync and mounts it:

[Unit]
Description=Resync /boot/efi RAID
DefaultDependencies=no
After=local-fs.target

[Service]
Type=oneshot
ExecStart=/sbin/mdadm -A /dev/md100 --uuid=123... --update=resync
ExecStart=/bin/mount /boot/efi
RemainAfterExit=yes

[Install]
WantedBy=sysinit.target

(And don’t forget to run “update-initramfs -u” so the initramfs has an updated copy of /etc/mdadm/mdadm.conf.)

If mdadm.conf supported an “update=” option for ARRAY lines, this would have been trivial. Looking at the source, though, that kind of change doesn’t look easy. I can dream!

And if I wanted to keep a “pristine” version of /boot/efi that UEFI couldn’t update I could rearrange things more dramatically to keep the primary RAID member as a loopback device on a file in the root filesystem (e.g. /boot/efi.img). This would make all external changes in the real ESPs disappear after resync. Something like:

# truncate --size 512M /boot/efi.img
# losetup -f --show /boot/efi.img
/dev/loop0
# mdadm --create /dev/md100 --level 1 --raid-disks 3 --metadata 1.0 /dev/loop0 /dev/sda1 /dev/sdb1

And at boot just rebuild it from /dev/loop0, though I’m not sure how to “prefer” that partition…

© 2018, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.

April 20, 2018 12:34 AM

April 16, 2018

Pete Zaitcev: Suddenly Liferea tonight

Liferea irritated me for many years with a strange behavior when dragging a subscription. You mouse down on the feed, it becomes selected — so far so good. Then you drag it somewhere — possibly far off screen, making the view scroll — then drop it. Drops fine, updates the DB, model, and the view fine. But! The selection then jumps to a completely random feed somewhere.

Well, it's not actually random. What happens instead is that GtkTreeView implements DnD by removing a row, then re-inserting it. When a selected row is removed, the selection obviously has to disappear, but instead it's set to the next row after the removed one. I suppose I may be uniquely vulnerable to this because I have 300+ feeds and I drag them around all the time. If Liferea weren't kind enough to remember the preferred order, this would not matter so much.

I meant to fix this for a long time, but somehow wrong information got stuck in my head: I thought that Liferea was written in C++, so it took years to gather the motivation. Imagine my surprise when I found plain old C. I spent a good chunk of Sunday figuring out GTK's tree view thingie, but in the end it was quite simple.

April 16, 2018 03:09 PM

April 13, 2018

Kees Cook: security things in Linux v4.16

Previously: v4.15

Linux kernel v4.16 was released last week. I really should write these posts in advance, otherwise I get distracted by the merge window. Regardless, here are some of the security things I think are interesting:

KPTI on arm64

Will Deacon, Catalin Marinas, and several other folks brought Kernel Page Table Isolation (via CONFIG_UNMAP_KERNEL_AT_EL0) to arm64. While most ARMv8+ CPUs were not vulnerable to the primary Meltdown flaw, the Cortex-A75 does need KPTI to be safe from memory content leaks. It’s worth noting, though, that KPTI also protects other ARMv8+ CPU models from having privileged register contents exposed. So, whatever your threat model, it’s very nice to have this clean isolation between kernel and userspace page tables for all ARMv8+ CPUs.

hardened usercopy whitelisting
While whole-object bounds checking was implemented in CONFIG_HARDENED_USERCOPY already, David Windsor and I finished another part of the porting work of grsecurity’s PAX_USERCOPY protection: usercopy whitelisting. This further tightens the scope of slab allocations that can be copied to/from userspace. Now, instead of allowing all objects in slab memory to be copied, only the whitelisted areas (where a subsystem has specifically marked the memory region allowed) can be copied. For example, only the auxv array out of the larger mm_struct.

As mentioned in the first commit from the series, this reduces the scope of slab memory that could be copied out of the kernel in the face of a bug to under 15%. As can be seen, one area of remaining work is the kmalloc regions. Those are regularly used for copying things in and out of userspace, but they’re also used for small simple allocations that aren’t meant to be exposed to userspace. Separating these kmalloc users needs some careful auditing.

Total Slab Memory:       48074720
Usercopyable Memory:      6367532  13.2%
        task_struct         0.2%         4480/1630720
        RAW                 0.3%          300/96000
        RAWv6               2.1%         1408/64768
        ext4_inode_cache    3.0%       269760/8740224
        dentry             11.1%       585984/5273856
        mm_struct          29.1%        54912/188448
        kmalloc-8         100.0%        24576/24576
        kmalloc-16        100.0%        28672/28672
        kmalloc-32        100.0%        81920/81920
        kmalloc-192       100.0%        96768/96768
        kmalloc-128       100.0%      143360/143360
        names_cache       100.0%      163840/163840
        kmalloc-64        100.0%      167936/167936
        kmalloc-256       100.0%      339968/339968
        kmalloc-512       100.0%      350720/350720
        kmalloc-96        100.0%      455616/455616
        kmalloc-8192      100.0%      655360/655360
        kmalloc-1024      100.0%      812032/812032
        kmalloc-4096      100.0%      819200/819200
        kmalloc-2048      100.0%     1310720/1310720

This series took quite a while to land (you can see David’s original patch date as back in June of last year). Partly this was due to having to spend a lot of time researching the code paths so that each whitelist could be explained for commit logs, partly due to making various adjustments from maintainer feedback, and partly due to the short merge window in v4.15 (when it was originally proposed for merging) combined with some last-minute glitches that made Linus nervous. After baking in linux-next for almost two full development cycles, it finally landed. (Though be sure to disable CONFIG_HARDENED_USERCOPY_FALLBACK to gain enforcement of the whitelists — by default it only warns and falls back to the full-object checking.)
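To check how a given kernel was built in this regard, one can grep its config. A sketch (the usercopy_cfg helper name is mine, and config file paths vary by distro):

```shell
# usercopy_cfg FILE -- report the hardened-usercopy options in a
# kernel build config, or say so if they are absent.
usercopy_cfg() {
    grep -E '^CONFIG_HARDENED_USERCOPY(_FALLBACK)?=' "$1" || echo "not set"
}

# Typical use against the running kernel's config:
#   usercopy_cfg /boot/config-$(uname -r)
```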

automatic stack-protector

While the stack-protector features of the kernel have existed for quite some time, it has never been enabled by default. This was mainly due to needing to evaluate compiler support for the feature, and Kconfig didn’t have a way to check the compiler features before offering CONFIG_* options. As a defense technology, the stack protector is pretty mature. Having it on by default would have greatly reduced the impact of things like the BlueBorne attack (CVE-2017-1000251), as fewer systems would have lacked the defense.

After spending quite a bit of time fighting with ancient compiler versions (*cough*GCC 4.4.4*cough*), I landed CONFIG_CC_STACKPROTECTOR_AUTO, which defaults to on and tries to use the stack protector if it is available. The implementation of the solution did not please Linus, though he allowed it to be merged. In the future, Kconfig will gain the knowledge to make better decisions, letting the kernel expose the availability of the (now default) stack protector directly in Kconfig rather than depending on rather ugly Makefile hacks.
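The build-time probe essentially boils down to asking the compiler whether it accepts the flag. A minimal sketch of that idea (the cc_supports helper is mine, not the kernel's actual Kconfig/Makefile logic):

```shell
# cc_supports FLAG -- succeed if the C compiler can build a trivial
# program with FLAG, roughly what the kernel's probe does.
cc_supports() {
    echo 'int main(void) { char buf[64]; buf[0] = 0; return buf[0]; }' \
        | "${CC:-cc}" -x c "$1" -o /dev/null - 2>/dev/null
}

cc_supports -fstack-protector-strong && echo "strong available" || echo "strong not available"
cc_supports -fstack-protector        && echo "regular available" || echo "regular not available"
```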

That’s it for now; let me know if you think I should add anything! The v4.17 merge window is open. :)

Edit: added details on ARM register leaks, thanks to Daniel Micay.

© 2018, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.

April 13, 2018 12:04 AM

April 11, 2018

James Morris: Linux Security Summit North America 2018 CFP Announced


The CFP for the 2018 Linux Security Summit North America (LSS-NA) is announced.

LSS will be held this year as two separate events, one in North America
(LSS-NA), and one in Europe (LSS-EU), to facilitate broader participation in
Linux Security development. Note that this CFP is for LSS-NA; a separate CFP
will be announced for LSS-EU in May. We encourage everyone to attend both
events.

LSS-NA 2018 will be held in Vancouver, Canada, co-located with the Open Source Summit.

The CFP closes on June 3rd and the event runs from 27th-28th August.

To make a CFP submission, click here.

April 11, 2018 11:29 PM

April 10, 2018

Linux Plumbers Conference: Welcome to the 2018 LPC blog

Planning for the 2018 Linux Plumbers Conference is well underway at this point. The planning committee will be posting various informational blurbs here, including information on hotels, microconference acceptance, evening events, scheduling, and so on. Next up will be a “call for microconferences” that should appear soon.

LPC will be held at the Sheraton Vancouver Wall Center in Vancouver, British Columbia, Canada, November 13-15, colocated with the Linux Kernel Summit.

April 10, 2018 04:40 PM

April 06, 2018

Pete Zaitcev: With Blockchain Technology

Recently it became common to see startup founders mocked for adding "blockchain" to something, then selling it to gullible VCs and reaping the green harvest. Apparently it has become quite a thing. But now they went a step further.

The other day I was watching some anime at Crunchyroll, when a commercial came up. It pitched a fantasy sports site "with blockchain technology" and smart contracts. The remarkable part about it is, it wasn't aimed at investors. It was a consumer advertisement. Its creators apparently expect members of the public — who play fantasy sports, no less — to know that blockchain exists and think about it in positive terms.

April 06, 2018 05:19 AM

April 05, 2018

Matthew Garrett: Linux kernel lockdown and UEFI Secure Boot

David Howells recently published the latest version of his kernel lockdown patchset. This is intended to strengthen the boundary between root and the kernel by imposing additional restrictions that prevent root from modifying the kernel at runtime. It's not the first feature of this sort - /dev/mem no longer allows you to overwrite arbitrary kernel memory, and you can configure the kernel so only signed modules can be loaded. But the present state of things is that these security features can be easily circumvented (by using kexec to modify the kernel security policy, for instance).

Why do you want lockdown? If you've got a setup where you know that your system is booting a trustworthy kernel (you're running a system that does cryptographic verification of its boot chain, or you built and installed the kernel yourself, for instance) then you can trust the kernel to keep secrets safe from even root. But if root is able to modify the running kernel, that guarantee goes away. As a result, it makes sense to extend the security policy from the boot environment up to the running kernel - it's really just an extension of configuring the kernel to require signed modules.

The patchset itself isn't hugely conceptually controversial, although there's disagreement over the precise form of certain restrictions. But one patch in the set has been controversial, because it ties whether lockdown is enabled to whether UEFI Secure Boot is enabled. There's some backstory that's important here.

Most kernel features get turned on or off by either build-time configuration or by passing arguments to the kernel at boot time. There's two ways that this patchset allows a bootloader to tell the kernel to enable lockdown mode - it can either pass the lockdown argument on the kernel command line, or it can set the secure_boot flag in the bootparams structure that's passed to the kernel. If you're running in an environment where you're able to verify the kernel before booting it (either through cryptographic validation of the kernel, or knowing that there's a secret tied to the TPM that will prevent the system booting if the kernel's been tampered with), you can turn on lockdown.
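On such a system, turning lockdown on via the command line is then a one-line change. A hedged sketch for GRUB-based distros (the bare "lockdown" parameter name is the one this patchset uses, and may differ in what eventually lands):

```shell
# /etc/default/grub (fragment)
GRUB_CMDLINE_LINUX="$GRUB_CMDLINE_LINUX lockdown"

# Then regenerate the GRUB configuration, e.g. on Debian/Ubuntu:
#   update-grub
```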

There's a catch on UEFI systems, though - you can build the kernel so that it looks like an EFI executable, and then run it directly from the firmware. The firmware doesn't know about Linux, so can't populate the bootparam structure, and there's no mechanism to enforce command lines so we can't rely on that either. The controversial patch simply adds a kernel configuration option that automatically enables lockdown when UEFI secure boot is enabled and otherwise leaves it up to the user to choose whether or not to turn it on.

Why do we want lockdown enabled when booting via UEFI secure boot? UEFI secure boot is designed to prevent the booting of any bootloaders that the owner of the system doesn't consider trustworthy[1]. But a bootloader is only software - the only thing that distinguishes it from, say, Firefox is that Firefox is running in user mode and has no direct access to the hardware. The kernel does have direct access to the hardware, and so there's no meaningful distinction between what grub can do and what the kernel can do. If you can run arbitrary code in the kernel then you can use the kernel to boot anything you want, which defeats the point of UEFI Secure Boot. Linux distributions don't want their kernels to be used as part of an attack chain against other distributions or operating systems, so they enable lockdown (or equivalent functionality) for kernels booted this way.

So why not enable it everywhere? There's a couple of reasons. The first is that some of the features may break things people need - for instance, some strange embedded apps communicate with PCI devices by mmap()ing resources directly from sysfs[2]. This is blocked by lockdown, which would break them. Distributions would then have to ship an additional kernel that had lockdown disabled (it's not possible to just have a command line argument that disables it, because an attacker could simply pass that), and users would have to disable secure boot to boot that anyway. It's easier to just tie the two together.

The second is that it presents a promise of security that isn't really there if your system didn't verify the kernel. If an attacker can replace your bootloader or kernel then the ability to modify your kernel at runtime is less interesting - they can just wait for the next reboot. Appearing to give users safety assurances that are much less strong than they seem to be isn't good for keeping users safe.

So, what about people whose work is impacted by lockdown? Right now there's two ways to get stuff blocked by lockdown unblocked: either disable secure boot[3] (which will disable it until you enable secure boot again) or press alt-sysrq-x (which will disable it until the next boot). Discussion has suggested that having an additional secure variable that disables lockdown without disabling secure boot validation might be helpful, and it's not difficult to implement that so it'll probably happen.
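Scripts that need to know the current Secure Boot state can ask mokutil, or decode the EFI variable directly. A sketch (the efivar_value helper is my own; efivarfs dumps carry 4 bytes of attributes before the payload):

```shell
# efivar_value FILE -- print the first payload byte of an efivarfs
# variable dump (4 attribute bytes precede the data).
efivar_value() {
    od -An -tu1 -j4 -N1 "$1" | tr -d ' '
}

# Typical use on an EFI system (1 = Secure Boot enabled):
#   mokutil --sb-state
#   efivar_value /sys/firmware/efi/efivars/SecureBoot-8be4df61-93ca-11d2-aa0d-00e098032b8c
```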

Overall: the patchset isn't controversial, just the way it's integrated with UEFI secure boot. The reason it's integrated with UEFI secure boot is because that's the policy most distributions want, since the alternative is to enable it everywhere even when it doesn't provide real benefits but does provide additional support overhead. You can use it even if you're not using UEFI secure boot. We should have just called it securelevel.

[1] Of course, if the owner of a system isn't allowed to make that determination themselves, the same technology is restricting the freedom of the user. This is abhorrent, and sadly it's the default situation in many devices outside the PC ecosystem - most of them not using UEFI. But almost any security solution that aims to prevent malicious software from running can also be used to prevent any software from running, and the problem here is the people unwilling to provide that policy to users rather than the security features.
[2] This is how X.org used to work until the advent of kernel modesetting
[3] If your vendor doesn't provide a firmware option for this, run sudo mokutil --disable-validation


April 05, 2018 01:07 AM

April 04, 2018

Pete Zaitcev: Jim Whitehurst on OpenStack in 2018

Remarks of our CEO, as captured in an interview by TechCrunch:

The other major open-source project Red Hat is betting on is OpenStack. That may come as a bit of a surprise, given that popular opinion in the last year or so has shifted against the massive project that wants to give enterprises an open source on-premise alternative to AWS and other cloud providers. “There was a sense among big enterprise tech companies that OpenStack was going to be their savior from Amazon,” Whitehurst said. “But even OpenStack, flawlessly executed, put you where Amazon was five years ago. If you’re Cisco or HP or any of those big OEMs, you’ll say that OpenStack was a disappointment. But from our view as a software company, we are seeing good traction.”

He's over-simplifying things for the constraints of an interview: the last sentence needs unpacking. Why do you think that "traction" happens? Because OpenStack gives its users something that Amazon does not. For example, Swift isn't trying to match the features of S3. Attempting to do that would cause the exact lag he's referring to. Instead, Swift works to solve the problem of people who want to own their own data in general. So, it's mostly about the implementation: how to make it scalable, inexpensive, etc. And, of course, keeping it open source, preserving users' freedom to modify it. This is why you often see people installing a truncated OpenStack that only has Swift. I'm sure this applies to other parts of OpenStack, in particular the SDN/NFV.

April 04, 2018 04:44 PM

April 03, 2018

Paul E. Mc Kenney: A Linux-kernel memory model!

A big “thank you” to all my partners in LKMM crime, most especially to Jade, Luc, Andrea, and Alan! Jade presented our paper (slides, supplementary material) at ASPLOS, which was well-received. A number of people asked how they could learn more about LKMM, which is what much of this blog post is about.

Approaches to learning LKMM include:


  1. Read the documentation, starting with explanation.txt. This documentation replaces most of the older LWN series.
  2. Go through Ted Cooper's coursework for Portland State University's CS510 Advanced Topics in Concurrency class, taught by Jon Walpole.
  3. Those interested in the history of LKMM might wish to look at my 2017 linux.conf.au presentation (video).
  4. Play with the actual model.
The first three options are straightforward, but playing with the model requires some installation. However, playing with the model is probably key to gaining a full understanding of LKMM, so this installation step is well worth the effort.

Installation instructions may be found here (see the “REQUIREMENTS” section). The ocaml language is a prerequisite, which is fortunately included in many Linux distros. If you choose to install ocaml from source (for example, because you need a more recent version), do yourself a favor and read the instructions completely before starting the build process! Otherwise, you will find yourself learning of the convenient one-step build process only after carrying out the laborious five-step process, which can be a bit frustrating.
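Concretely, the setup and a first run look roughly like this. This is a sketch: the package and file names are the ones the kernel's tools/memory-model directory uses, but check the instructions linked above before relying on them:

```shell
# One-time setup; needs network access and an ocaml toolchain:
#   opam init
#   opam install herdtools7
#
# Then, from a Linux kernel source tree, run a bundled litmus test
# against the model:
KMM=tools/memory-model
#   cd "$KMM"
#   herd7 -conf linux-kernel.cfg litmus-tests/MP+pooncerelease+poacquireonce.litmus
echo "the model and its litmus tests live in $KMM"
```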

Of course, if you come across better methods to quickly, easily, and thoroughly learn LKMM, please do not keep them a secret!

Those wanting a few rules of thumb safely approximating LKMM should look at slide 96 (PDF page 78) of the aforementioned linux.conf.au presentation. Please note that material earlier in the presentation is required to make sense of the three rules of thumb.

We also got some excellent questions during Jade's ASPLOS talk, mainly from the renowned and irrepressible Sarita Adve. There were of course a great many other excellent presentations at ASPLOS, but that is a topic for another post!

April 03, 2018 06:35 PM

April 02, 2018

Pete Zaitcev: Wayland versus Glib in Liferea on F27

I decided to build Liferea over the weekend, and the build crashes at the introspection phase.

Apparently, GTK+ programs are set up to introspect themselves: basically the binary can look at its own types or whatnot, then output the result. I'm not quite clear what the purpose of that is, the online docs imply that it's for API documentation mostly. Anyhow, the build runs the liferea binary itself, with arguments that make it run the introspection, then this happens:

(gdb) where
#0  0x00007fa90a2a93b0 in wl_list_insert_list ()
    at /lib64/libwayland-server.so.0
#1  0x00007fa90a2a4e6f in wl_priv_signal_emit ()
    at /lib64/libwayland-server.so.0
#2  0x00007fa90a2a5477 in wl_display_destroy ()
    at /lib64/libwayland-server.so.0
#3  0x00007fa916d163d9 in \
  WebCore::PlatformDisplayWayland::~PlatformDisplayWayland() () at \
  /lib64/libwebkit2gtk-4.0.so.37
#4  0x00007fa916d163e9 in \
  WebCore::PlatformDisplayWayland::~PlatformDisplayWayland() () at \
  /lib64/libwebkit2gtk-4.0.so.37
#5  0x00007fa91100cb58 in __run_exit_handlers () at /lib64/libc.so.6
#6  0x00007fa91100cbaa in  () at /lib64/libc.so.6
#7  0x00007fa911e9d367 in  () at /lib64/libgirepository-1.0.so.1
#8  0x00007fa91197d188 in parse_arg.isra () at /lib64/libglib-2.0.so.0
#9  0x00007fa91197d8ca in parse_long_option () at /lib64/libglib-2.0.so.0
#10 0x00007fa91197f2d6 in g_option_context_parse () at \
  /lib64/libglib-2.0.so.0
#11 0x00007fa91197fd84 in g_option_context_parse_strv ()
    at /lib64/libglib-2.0.so.0
#12 0x00007fa912164558 in g_application_real_local_command_line ()
    at /lib64/libgio-2.0.so.0
#13 0x00007fa912164bf6 in g_application_run () at /lib64/libgio-2.0.so.0
#14 0x000000000041b9ff in main (argc=2, argv=0x7fff2e1203d8) at main.c:77

As far as I can tell, despite being asked only to do the introspection, Liferea (unknowingly, through GTK+) pokes Wayland, which sets exit handlers. However, Wayland is never used (introspection, duh) and never completely initialized, so when its exit handlers run, it crashes.

Well, now what?

I suppose the cleanest approach might be to modify Glib so it avoids provoking Wayland when merely introspecting. But honestly I have no clue about desktop apps and do not know where to even start looking.

UPDATE: Much thanks to Branko Grubic, who pointed me to a bug in WebKit. Currently building with this as a workaround:

--- a/src/Makefile.am
+++ b/src/Makefile.am
@@ -82,6 +82,7 @@ INTROSPECTION_GIRS = Liferea-3.0.gir
 
 Liferea-3.0.gir: liferea$(EXEEXT)
 INTROSPECTION_SCANNER_ARGS = -I$(top_srcdir)/src --warn-all -......
+INTROSPECTION_SCANNER_ENV = WEBKIT_DISABLE_COMPOSITING_MODE=1
 Liferea_3_0_gir_NAMESPACE = Liferea
 Liferea_3_0_gir_VERSION = 3.0
 Liferea_3_0_gir_PROGRAM = $(builddir)/liferea$(EXEEXT)

April 02, 2018 05:59 PM

March 20, 2018

Davidlohr Bueso: Linux v4.15: Performance Goodies

With the Meltdown and Spectre fiascos, performance isn't a very hot topic at the moment. In fact, with Linux v4.15 released, it is one of the rare times I've seen security win over performance in such a one-sided way. Normally security features are tucked away under a kernel config option nobody really uses. Of course the software fixes are also backported in one way or another, so this isn't really specific to the latest kernel release.

All this said, v4.15 came out with a few performance enhancements across subsystems. The following is an unsorted and incomplete list of changes that went in. Note that the term 'performance' can be vague in that some gains in one area can negatively affect another, so take everything with a grain of salt and reach your own conclusions.

epoll: scale nested calls

Nested epolls are necessary to allow semantics where a file descriptor in the epoll interested-list is itself an epoll instance. Such calls are not all that common, but some real-world applications suffered severe performance issues because the implementation relied on global spinlocks, acquired throughout the callbacks in the epoll state machine. With those removed, adding fds to an instance and polling both get faster, such that epoll_wait() can improve by 100x, scaling linearly as increasing numbers of cores block on an event.
[Commit 57a173bdf5ba,  37b5e5212a44]


pvspinlock: hybrid fairness paravirt semantics

Locking under virtual environments can be tricky: balancing performance and fairness while avoiding artifacts such as starvation and lock holder/waiter preemption. The current paravirtual queued spinlocks, while free from starvation, can perform less optimally than an unfair lock in guests with CPU over-commitment. With Linux v4.15, guest spinlocks combine the best of both worlds, with an unfair and a queued mode. The idea is that, upon contention, the lock-stealing attempt in the slowpath (unfair mode) is extended for as long as there are queued MCS waiters present, improving performance while avoiding starvation. Kernel build experiments show that as a VM becomes more and more over-committed, the ratio of locks acquired in unfair mode increases.
[Commit 11752adb68a3]


mm,x86: avoid saving/restoring interrupts state in gup

When x86 was converted to use the generic get_user_pages_fast() call, a performance regression was introduced at the microbenchmark level. The generic gup function attempts to walk the page tables without acquiring any locks, such as the mmap semaphore. In order to do this, interrupts must be disabled, which is where the arch-specific and generic flavors diverged: the latter must save and restore the current interrupt state, introducing extra overhead compared to a simple local_irq_disable/enable().
[Commit 5b65c4677a57]



ipc: scale INFO commands

Any syscall used to get info from sysvipc (such as semctl(IPC_INFO) or shmctl(SHM_INFO)) internally requires computing the last ipc identifier. For cases with large numbers of keys, this operation alone can consume a large number of cycles, as it was looked up on demand, in O(N). To make this information available in constant time, the kernel now keeps track of it whenever a new identifier is added.
[Commit 15df03c87983]



ext4:  improve smp scalability for inode generation

The superblock's inode generation number was sequentially increased (from a randomly initialized value) and protected by a spinlock, making the usage pattern quite primitive and not very friendly to workloads that generate files/inodes concurrently. The inode generation path was optimized to remove the lock altogether and simply rely on prandom_u32(), such that a fast, seeded pseudo-random number generator computes the i_generation.
[Commit 232530680290]

March 20, 2018 05:37 PM

March 15, 2018

Pete Zaitcev: The more you tighten your grip

Seen at the webpage for RancherOS:

Everything in RancherOS is a Docker container. We accomplish this by launching two instances of Docker. One is what we call System Docker, the first process on the system. All other system services, like ntpd, syslog, and console, are running in Docker containers. System Docker replaces traditional init systems like systemd, and can be used to launch additional system services.

March 15, 2018 10:33 PM

March 13, 2018

Pete Zaitcev: You Are Not Uber: Only Uber Are Uber

Remember how FAA shut down the business of NavWorx, with heavy monetary and loss-of-use consequences for its customers? Imagine receiving a letter from U.S. Government telling you that your car is not compatible with roads, and therefore you are prohibited from continuing to drive it. Someone sure forgot that the power to regulate is the power to destroy. This week, we have this report by IEEE Spectrum:

IEEE Spectrum can reveal that the SpaceBees are almost certainly the first spacecraft from a Silicon Valley startup called Swarm Technologies, currently still in stealth mode. Swarm was founded in 2016 by one engineer who developed a spacecraft concept for Google and another who sold his previous company to Apple. The SpaceBees were built as technology demonstrators for a new space-based Internet of Things communications network.

The only problem is, the Federal Communications Commission (FCC) had dismissed Swarm’s application for its experimental satellites a month earlier, on safety grounds.

On Wednesday, the FCC sent Swarm a letter revoking its authorization for a follow-up mission with four more satellites, due to launch next month. A pending application for a large market trial of Swarm’s system with two Fortune 100 companies could also be in jeopardy.

Swarm Technologies, based in Menlo Park, Calif., is the brainchild of two talented young aerospace engineers. Sara Spangelo, its CEO, is a Canadian who worked at NASA’s Jet Propulsion Laboratory, before moving to Google in 2016. Spangelo’s astronaut candidate profile at the Canadian Space Agency says that while at Google, she led a team developing a spacecraft concept for its moonshot X division, including both technical and market analyses.

Swarm CFO Benjamin Longmier has an equally impressive resume. In 2015, he sold his near-space balloon company Aether Industries to Apple, before taking a teaching post at the University of Michigan. He is also co-founder of Apollo Fusion, a company producing an innovative electric propulsion system for satellites.

Although a leading supplier in its market, NavWorx was a bit player at the government level. Not that many people have small private airplanes anymore. But Swarm operates at a different level, and may be able to grease enough palms in Washington, D.C. to survive this debacle. Or, they may reconstitute as a notionally new company, then claim a clean start. Again unlike NavWorx, there's no installed base.

March 13, 2018 03:45 PM

March 11, 2018

Greg Kroah-Hartman: My affidavit in the Geniatech vs. McHardy case

As many people know, last week there was a court hearing in the Geniatech vs. McHardy case. This was a case brought claiming a license violation of the Linux kernel in Geniatech devices in the German court of OLG Cologne.

Harald Welte has written up a wonderful summary of the hearing, I strongly recommend that everyone go read that first.

In Harald’s summary, he refers to an affidavit that I provided to the court. Because the case was withdrawn by McHardy, my affidavit was not entered into the public record. I had always assumed that my affidavit would be made public, and since I have had a number of people ask me about what it contained, I figured it was good to just publish it for everyone to be able to see it.

There are some minor edits from what was exactly submitted to the court, such as the side-by-side German translation of the English text, and some reformatting around some footnotes in the text, because I don’t know how to do that directly here, and they really were not all that relevant for anyone who reads this blog. Exhibit A is also not reproduced, as it’s just a huge list of all of the kernel releases for which I felt there was no evidence of any contribution by Patrick McHardy.

AFFIDAVIT

I, the undersigned, Greg Kroah-Hartman,
declare in lieu of an oath and in the
knowledge that a wrong declaration in
lieu of an oath is punishable, to be
submitted before the Court:

I. With regard to me personally:

1. I have been an active contributor to
   the Linux Kernel since 1999.

2. Since February 1, 2012 I have been a
   Linux Foundation Fellow.  I am currently
   one of five Linux Foundation Fellows
   devoted to full time maintenance and
   advancement of Linux. In particular, I am
   the current Linux stable Kernel maintainer
   and manage the stable Kernel releases. I
   am also the maintainer for a variety of
   different subsystems that include USB,
   staging, driver core, tty, and sysfs,
   among others.

3. I have been a member of the Linux
   Technical Advisory Board since 2005.

4. I have authored two books on Linux Kernel
   development including Linux Kernel in a
   Nutshell (2006) and Linux Device Drivers
   (co-authored Third Edition in 2009.)

5. I have been a contributing editor to Linux
   Journal from 2003 - 2006.

6. I am a co-author of every Linux Kernel
   Development Report. The first report was
   based on my Ottawa Linux Symposium keynote
   in 2006, and the report has been published
   every few years since then. I have been
   one of the co-authors on all of them. This
   report includes a periodic in-depth
   analysis of who is currently contributing
   to Linux. Because of this work, I have an
   in-depth knowledge of the various records
   of contributions that have been maintained
   over the course of the Linux Kernel
   project.

   For many years, Linus Torvalds compiled a
   list of contributors to the Linux kernel
   with each release. There are also usenet
   and email records of contributions made
   prior to 2005. In April of 2005, Linus
   Torvalds created a program now known as
   “Git” which is a version control system
   for tracking changes in computer files and
   coordinating work on those files among
   multiple people. Every Git directory on
   every computer contains an accurate
   repository with complete history and full
   version tracking abilities.  Every Git
   directory captures the identity of
   contributors.  Development of the Linux
   kernel has been tracked and managed using
   Git since April of 2005.

   One of the findings in the report is that
   since the 2.6.11 release in 2005, a total
   of 15,637 developers have contributed to
   the Linux Kernel.

7. I have been an advisor on the Cregit
   project and compared its results to other
   methods that have been used to identify
   contributors and contributions to the
   Linux Kernel, such as a tool known as “git
   blame” that is used by developers to
   identify contributions to a git repository
   such as the repositories used by the Linux
   Kernel project.

8. I have been shown documents related to
   court actions by Patrick McHardy to
   enforce copyright claims regarding the
   Linux Kernel. I have heard many people
   familiar with the court actions discuss
   the cases and the threats of injunction
   McHardy leverages to obtain financial
   settlements. I have not otherwise been
   involved in any of the previous court
   actions.

II. With regard to the facts:

1. The Linux Kernel project started in 1991
   with a release of code authored entirely
   by Linus Torvalds (who is also currently a
   Linux Foundation Fellow).  Since that time
   there have been a variety of ways in which
   contributions and contributors to the
   Linux Kernel have been tracked and
   identified. I am familiar with these
   records.

2. The first record of any contribution
   explicitly attributed to Patrick McHardy
   to the Linux kernel is April 23, 2002.
   McHardy’s last contribution to the Linux
   Kernel was made on November 24, 2015.

3. The Linux Kernel 2.5.12 was released by
   Linus Torvalds on April 30, 2002.

4. After review of the relevant records, I
   conclude that there is no evidence in the
   records that the Kernel community relies
   upon to identify contributions and
   contributors that Patrick McHardy made any
   code contributions to versions of the
   Linux Kernel earlier than 2.4.18 and
   2.5.12. Attached as Exhibit A is a list of
   Kernel releases which have no evidence in
   the relevant records of any contribution
   by Patrick McHardy.

March 11, 2018 01:51 AM

March 07, 2018

Dave Airlie (blogspot): radv - Vulkan 1.1 conformant on launch day

Vulkan 1.1 was officially released today, and thanks to a big effort by Bas and a lot of shared work from the Intel anv developers, radv is a launch day conformant implementation.

https://www.khronos.org/conformance/adopters/conformant-products#submission_308

is a link to the conformance results. This is also radv's first time to be officially conformant on Vega GPUs. 

https://patchwork.freedesktop.org/series/39535/
is the patch series; it requires a bunch of common anv patches to land first. This stuff should all be landing in Mesa shortly, or most likely already will have by the time you read this.

In order to advertise 1.1 you need at least a 4.15 Linux kernel.

Thanks to all involved in making this happen, including the behind-the-scenes effort to allow radv to participate in the launch day!

March 07, 2018 07:13 PM

March 04, 2018

Pete Zaitcev: MITM in Ireland

I'm just back from OpenStack PTG (Project Technical Gathering) in Dublin, Ireland and while I was there, Firefox reported wrong TLS certificates for some obscure websites, although not others. Example: zaitcev.us retains old certificate, as does wrk.ru. But sealion.club goes bad. I presume that Irish authorities and/or ISPs deemed it proper to MITM these sites. The question is, why such a strange choice of targets?

The sealion.club is a free speech and discussion site, named, as far as I can tell, after an old (possibly classic or memetic) Wondermark cartoon. Maybe the Irish just hate free speech.

Or, they do not MITM sites that have TLS settings that are too easy to break... and Gmail.

March 04, 2018 07:14 AM

February 21, 2018

Paul E. Mc Kenney: Exit Libris

I have only so many bookshelves, and I have not yet bought into ereaders, so from time to time books must leave. Here is the current batch:



It is a bit sad to abandon some old friends, but such is life with physical books!

February 21, 2018 05:06 AM

February 16, 2018

Pete Zaitcev: ARM servers apparently exist at last

Check out what I found at Pogo Linux (h/t Bryan Lunduke):

ARM R150-T62
2 x Cavium® ThunderX™ 48 Core ARM processors
16 x DDR4 DIMM slots
3 x 40GbE QSFP+ LAN ports
4 x 10GbE SFP+ LAN ports
4 x 3.5” hot-swappable HDD/SSD bays
650W 80 PLUS Platinum redundant PSU
$5,638.82

The prices are ridiculous, but at least it's a server with CentOS.

February 16, 2018 06:42 AM

Dave Airlie (blogspot): virgl caps - oops I messed up

When I designed virgl I added a capability system to pass some info about the host GL to the guest driver, along the lines of gallium caps. The design was that at the virtio GPU level you have a number of capsets, each of which has a max version and max size.

The virgl capset is capset 1 with max version 1 and size 308 bytes.

Until now we've happily been using version 1 at 308 bytes. Recently we decided we wanted to have a v2 at 380 bytes, and the world fell apart.

It turned out there is a bug in the guest kernel driver: it asks the host for a list of capsets and allows guest userspace to retrieve them. The guest userspace has its own copy of the struct.

The flow is:
1) The guest mesa driver gives the kernel a caps struct to fill out for capset 1.
2) The kernel driver asks the host over virtio for the latest capset 1 info: max size and version.
3) The host gives it the max_size and version for capset 1.
4) The kernel driver asks the host to fill out malloced memory of max_size with the caps struct.
5) The kernel driver copies the returned caps struct to userspace, using the size of the returned host struct.

The bug is in the last step: it uses the size of the returned host struct, which ends up corrupting the guest in the scenario where the host has a capset 1 v2, size 380, but the guest is still running old userspace which understands only capset 1 v1, size 308.

The 380 bytes get memcpy'd over the 308-byte struct, and boom.

Now we can fix the kernel to not do this, but we can't upgrade every kernel in existing VMs. So if we allow the virglrenderer process to expose a v2, all older software will explode unless it is also upgraded, which isn't really something you want in a VM world.

I came up with some virglrenderer workarounds, but due to another bug where qemu doesn't reset virglrenderer when it should, there was no way to make it reliable, and things like kexec old kernel from new kernel would blow up.

I decided in the end to bite the bullet and just make capset 2 be a repaired one. Unfortunately this needs patches in all 4 components before it can be used.

1) virglrenderer needs to expose capset 2 with the new version/size to qemu.
2) qemu needs to allow the virtio-gpu to transfer capset 2 as a virgl capset to the host.
3) The kernel on the host needs fixing to make sure we copy the minimum of the host caps and the guest caps into the guest userspace driver; then it needs to provide a way for guest userspace to know the fixed version is in place.
4) The guest userspace needs to check if the guest kernel has the fix, and then query capset 2 first, and fallback to querying capset 1.

After talking to a few other devs in virgl land, they pointed out we could probably just never add a new version of capset 2, and grow the struct endlessly.

The guest driver would fill out the struct it wants to use with its copy of default minimum values.
It would then call the kernel ioctl to copy over the host caps. The kernel ioctl would copy the minimum of the host caps size and the guest caps size.

In this case, if the host has a 400-byte capset 2 and the guest still only has a 380-byte capset 2, the new fields from the host won't get copied into the guest struct, and it will be fine.

If the guest has the 400-byte capset 2, but the host only has the 380-byte capset 2, the guest would preinit the extra 20 bytes with its default values (0 or whatever), and the kernel would only copy 380 bytes into the start of the 400 bytes and leave the extra bytes alone.

Now I just have to go write the patches and confirm it all.

Thanks to Stephane at google for creating the patch that showed how broken it was, and to others in the virgl community who noticed how badly it broke old guests! Now to go write the patches...

February 16, 2018 12:11 AM

February 14, 2018

Pete Zaitcev: More system administration in the age of SystemD

I'm tinkering with OpenStack TripleO in a simulated environment. It uses a dedicated non-privileged user, "stack", which can do things such as list VMs with "virsh list". So, yesterday I stopped the undercloud VM, and went to sleep. Today, I want to restart it... but virsh says:

error: failed to connect to the hypervisor
error: Cannot create user runtime directory '/run/user/1000/libvirt': Permission denied

What seems to happen is that when one logs into the stack@ user over ssh, systemd-logind mounts that /run/user/UID thing, but if I log in as zaitcev@ and then do "su - stack", this fails to occur.

I have no idea what to do about this. It's probably trivial for someone more knowledgeable to throw the right pam_systemd line into /etc/pam.d/su. But su-l includes system-auth, which invokes pam_systemd.so, and yet... Oh well.

February 14, 2018 11:23 PM

February 06, 2018

Eric Sandeen: LEAF battery replacement update

New LEAF battery

Just a quick note here – the LEAF battery did finally go under warranty on Sept 24, 2017, and I got it replaced with minimal hassle back in great shape on October 3.  The LeafSPY stats on the new battery actually dropped fairly quickly after I got it which was worrisome, but now (in the very cold weather) it’s holding steady at about 97% state of health, with 62.3Ahr and 90.35Hx.

The stats when it finally dropped the 9th bar were:

Miles: 40623
Ahr: 43.51
Hx: 45.25

I’ve definitely needed that fresh capacity for this harsh winter; it’s been fine, but frigid mornings still show the Guess-o-Meter at as low as 50-60 miles at times.

February 06, 2018 08:25 PM

February 05, 2018

Kees Cook: security things in Linux v4.15

Previously: v4.14.

Linux kernel v4.15 was released last week, and there’s a bunch of security things I think are interesting:

Kernel Page Table Isolation
PTI has already gotten plenty of reporting, but to summarize, it is mainly to protect against CPU cache timing side-channel attacks that can expose kernel memory contents to userspace (CVE-2017-5754, the speculative execution “rogue data cache load” or “Meltdown” flaw).

Even for just x86_64 (as CONFIG_PAGE_TABLE_ISOLATION), this was a giant amount of work, and tons of people helped with it over several months. PowerPC also had mitigations land, and arm64 (as CONFIG_UNMAP_KERNEL_AT_EL0) will have PTI in v4.16 (though only the Cortex-A75 is vulnerable). For anyone with really old hardware, x86_32 is under development, too.

An additional benefit of the x86_64 PTI is that since there are now two copies of the page tables, the kernel-mode copy of the userspace mappings can be marked entirely non-executable, which means pre-SMEP hardware now gains SMEP emulation. Kernel exploits that try to jump into userspace memory to continue running malicious code are dead (even if the attacker manages to turn SMEP off first). With some more work, SMAP emulation could also be introduced (to stop even just reading malicious userspace memory), which would close the door on these common attack vectors. It’s worth noting that arm64 has had the equivalent (PAN emulation) since v4.10.

retpoline
In addition to the PTI work above, the retpoline kernel mitigations for CVE-2017-5715 (“branch target injection” or “Spectre variant 2”) started landing. (Note that to gain full retpoline support, you’ll need a patched compiler, as appearing in gcc 7.3/8+, and currently queued for release in clang.)

This work continues to evolve, and clean-ups are continuing into v4.16. Also in v4.16 we’ll start to see mitigations for the other speculative execution variant (i.e. CVE-2017-5753, “bounds check bypass” or “Spectre variant 1”).

x86 fast refcount_t overflow protection
In v4.13 the CONFIG_REFCOUNT_FULL code was added to stop many types of reference counting flaws (with a tiny performance loss). In v4.14 the infrastructure for a fast overflow-only refcount_t protection on x86 (based on grsecurity’s PAX_REFCOUNT) landed, but it was disabled at the last minute due to a bug that was finally fixed in v4.15. Since it was a tiny change, the fast refcount_t protection was backported and enabled for the Longterm maintenance kernel in v4.14.5. Conversions from atomic_t to refcount_t have also continued, and are now above 168, with a handful remaining.

%p hashing
One of the many sources of kernel information exposures has been the use of the %p format string specifier. The strings end up in all kinds of places (dmesg, /sys files, /proc files, etc), and usage is scattered throughout the kernel, which had made it a very hard exposure to fix. Earlier efforts like kptr_restrict‘s %pK didn’t really work since it was opt-in. While a few recent attempts (by William C Roberts, Greg KH, and others) had been made to provide toggles for %p to act like %pK, Linus finally stepped in and declared that %p should be used so rarely that it shouldn’t be used at all, and Tobin Harding took on the task of finding the right path forward, which resulted in %p output getting hashed with a per-boot secret. The result is that simple debugging continues to work (two reports of the same hash value can confirm the same address without saying what the address actually is) but frustrates attackers’ ability to use such information exposures as building blocks for exploits.

For developers needing an unhashed %p, %px was introduced but, as Linus cautioned, either your %p remains useful when hashed, your %p was never actually useful to begin with and should be removed, or you need to strongly justify using %px with sane permissions.

It remains to be seen if we’ve just kicked the information exposure can down the road and in 5 years we’ll be fighting with %px and %lx, but hopefully the attitudes about such exposures will have changed enough to better guide developers and their code.

struct timer_list refactoring
The kernel’s timer (struct timer_list) infrastructure is, unsurprisingly, used to create callbacks that execute after a certain amount of time. They are one of the more fundamental pieces of the kernel, and as such have existed for a very long time, with over 1000 call sites. Improvements to the API have been made over time, but old ways of doing things have stuck around. Modern callbacks in the kernel take an argument pointing to the structure associated with the callback, so that a callback has context for which instance of the callback has been triggered. The timer callbacks didn’t, and took an unsigned long that was cast back to whatever arbitrary context the code setting up the timer wanted to associate with the callback, and this variable was stored in struct timer_list along with the function pointer for the callback. This creates an opportunity for an attacker looking to exploit a memory corruption vulnerability (e.g. heap overflow), where they’re able to overwrite not only the function pointer, but also the argument, as stored in memory. This elevates the attack into a weak ROP, and has been used as the basis for disabling SMEP in modern exploits (see retire_blk_timer). To remove this weakness in the kernel’s design, I refactored the timer callback API and all its callers, for a whopping:

1128 files changed, 4834 insertions(+), 5926 deletions(-)

Another benefit of the refactoring is that once the kernel starts getting built by compilers with Control Flow Integrity support, timer callbacks won’t be lumped together with all the other functions that take a single unsigned long argument. (In other words, some CFI implementations wouldn’t have caught the kind of attack described above since the attacker’s target function still matched its original prototype.)

That’s it for now; please let me know if I missed anything. The v4.16 merge window is now open!

© 2018, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.
Creative Commons License

February 05, 2018 11:45 PM

Greg Kroah-Hartman: Linux Kernel Release Model

Note

This post is based on a whitepaper I wrote at the beginning of 2016 to be used to help many different companies understand the Linux kernel release model and encourage them to start taking the LTS stable updates more often. I then used it as a basis of a presentation I gave at the Linux Recipes conference in September 2017 which can be seen here.

With the recent craziness of Meltdown and Spectre, I’ve seen lots of things written about how Linux is released and how we handle security patches that are totally incorrect, so I figured it is time to dust off the text, update it in a few places, and publish this here for everyone to benefit from.

I would like to thank the reviewers who helped shape the original whitepaper, which has helped many companies understand that they need to stop “cherry picking” random patches into their device kernels. Without their help, this post would be a total mess. All problems and mistakes in here are, of course, all mine. If you notice any, or have any questions about this, please let me know.

Overview

This post describes how the Linux kernel development model works, what a long term supported kernel is, how the kernel developers approach security bugs, and why all systems that use Linux should be using all of the stable releases and not attempting to pick and choose random patches.

Linux Kernel development model

The Linux kernel is the largest collaborative software project ever. In 2017, over 4,300 different developers from over 530 different companies contributed to the project. There were 5 different releases in 2017, with each release containing between 12,000 and 14,500 different changes. On average, 8.5 changes are accepted into the Linux kernel every hour, every hour of the day. A non-scientific study (i.e. Greg’s mailbox) shows that each change needs to be submitted 2-3 times before it is accepted into the kernel source tree due to the rigorous review and testing process that all kernel changes are put through, so the engineering effort happening is much larger than the 8.5 changes per hour.

At the end of 2017 the size of the Linux kernel was just over 61 thousand files consisting of 25 million lines of code, build scripts, and documentation (kernel release 4.14). The Linux kernel contains the code for all of the different chip architectures and hardware drivers that it supports. Because of this, an individual system only runs a fraction of the whole codebase. An average laptop uses around 2 million lines of kernel code from 5 thousand files to function properly, while the Pixel phone uses 3.2 million lines of kernel code from 6 thousand files due to the increased complexity of a SoC.

Kernel release model

With the release of the 2.6 kernel in December of 2003, the kernel developer community switched from the previous model of having a separate development and stable kernel branch, and moved to a “stable only” branch model. A new release happened every 2 to 3 months, and that release was declared “stable” and recommended for all users to run. This change in development model was due to the very long release cycle prior to the 2.6 kernel (almost 3 years), and the struggle to maintain two different branches of the codebase at the same time.

The numbering of the kernel releases started out being 2.6.x, where x was an incrementing number that changed on every release. The value of the number has no meaning, other than that it is newer than the previous kernel release. In July 2011, Linus Torvalds changed the version number to 3.x after the 2.6.39 kernel was released. This was done because the higher numbers were starting to cause confusion among users, and because Greg Kroah-Hartman, the stable kernel maintainer, was getting tired of the large numbers and bribed Linus with a fine bottle of Japanese whisky.

The change to the 3.x numbering series did not mean anything other than a change of the major release number, and this happened again in April 2015 with the movement from the 3.19 release to the 4.0 release number. It is not remembered if any whisky exchanged hands when this happened. At the current kernel release rate, the number will change to 5.x sometime in 2018.

Stable kernel releases

The Linux kernel stable release model started in 2005, when the existing development model of the kernel (a new release every 2-3 months) was determined to not be meeting the needs of most users. Users wanted bugfixes that were made during those 2-3 months, and the Linux distributions were getting tired of trying to keep their kernels up to date without any feedback from the kernel community. Trying to keep individual kernels secure and with the latest bugfixes was a large and confusing effort by lots of different individuals.

Because of this, the stable kernel releases were started. These releases are based directly on Linus’s releases, and are released every week or so, depending on various external factors (time of year, available patches, maintainer workload, etc.)

The numbering of the stable releases starts with the number of the kernel release, and an additional number is added to the end of it.

For example, the 4.9 kernel is released by Linus, and then the stable kernel releases based on this kernel are numbered 4.9.1, 4.9.2, 4.9.3, and so on. This sequence is usually shortened with the number “4.9.y” when referring to a stable kernel release tree. Each stable kernel release tree is maintained by a single kernel developer, who is responsible for picking the needed patches for the release, and doing the review/release process. Where these changes are found is described below.

Stable kernels are maintained for as long as the current development cycle is happening. After Linus releases a new kernel, the previous stable kernel release tree is stopped and users must move to the newer released kernel.

Long-Term Stable kernels

After a year of this new stable release process, it was determined that many different users of Linux wanted a kernel to be supported for longer than just a few months. Because of this, the Long Term Supported (LTS) kernel release came about. The first LTS kernel was 2.6.16, released in 2006. Since then, a new LTS kernel has been picked once a year. That kernel will be maintained by the kernel community for at least 2 years. See the next section for how a kernel is chosen to be a LTS release.

Currently the LTS kernels are the 4.4.y, 4.9.y, and 4.14.y releases, and a new kernel is released on average, once a week. Along with these three kernel releases, a few older kernels are still being maintained by some kernel developers at a slower release cycle due to the needs of some users and distributions.

Information about all long-term stable kernels, who is in charge of them, and how long they will be maintained, can be found on the kernel.org release page.

LTS kernel releases average 9-10 patches accepted per day, while the normal stable kernel releases contain 10-15 patches per day. The number of patches fluctuates per release given the current time of the corresponding development kernel release, and other external variables. The older a LTS kernel is, the fewer patches are applicable to it, because many recent bugfixes are not relevant to older kernels. However, the older a kernel is, the harder it is to backport the changes that need to be applied, due to the changes in the codebase. So while there might be a lower number of overall patches being applied, the effort involved in maintaining a LTS kernel is greater than maintaining the normal stable kernel.

Choosing the LTS kernel

The method of picking which kernel will be the LTS release, and who will maintain it, has changed over the years from a semi-random method to something that is hopefully more reliable.

Originally it was merely based on what kernel the stable maintainer’s employer was using for their product (2.6.16.y and 2.6.27.y) in order to make the effort of maintaining that kernel easier. Other distribution maintainers saw the benefit of this model and got together and colluded to get their companies to all release a product based on the same kernel version without realizing it (2.6.32.y). After that was very successful, and allowed developers to share work across companies, those companies decided to not do that anymore, so future LTS kernels were picked on an individual distribution’s needs and maintained by different developers (3.0.y, 3.2.y, 3.12.y, 3.16.y, and 3.18.y) creating more work and confusion for everyone involved.

This ad-hoc method of catering to only specific Linux distributions was not beneficial to the millions of devices that used Linux in an embedded system and were not based on a traditional Linux distribution. Because of this, Greg Kroah-Hartman decided that the choice of the LTS kernel needed to change to a method in which companies can plan on using the LTS kernel in their products. The rule became “one kernel will be picked each year, and will be maintained for two years.” With that rule, the 3.4.y, 3.10.y, and 3.14.y kernels were picked.

Due to a large number of different LTS kernels being released all in the same year, causing lots of confusion for vendors and users, the rule of no new LTS kernels being based on an individual distribution’s needs was created. This was agreed upon at the annual Linux kernel summit and started with the 4.1.y LTS choice.

During this process, the LTS kernel would only be announced after the release happened, making it hard for companies to plan ahead of time what to use in their new product, causing lots of guessing and misinformation to be spread around. This was done on purpose as previously, when companies and kernel developers knew ahead of time what the next LTS kernel was going to be, they relaxed their normal stringent review process and allowed lots of untested code to be merged (2.6.32.y). The fallout of that mess took many months to unwind and stabilize the kernel to a proper level.

The kernel community discussed this issue at its annual meeting and decided to mark the 4.4.y kernel as a LTS kernel release, much to the surprise of everyone involved, with the goal that the next LTS kernel would be planned ahead of time to be based on the last kernel release of 2016 in order to provide enough time for companies to release products based on it in the next holiday season (2017). This is how the 4.9.y and 4.14.y kernels were picked as the LTS kernel releases.

This process seems to have worked out well, without many problems being reported against the 4.9.y tree, despite it containing over 16,000 changes, making it the largest LTS kernel series ever released.

Future LTS kernels should be planned based on this release cycle (the last kernel of the year). This should allow SoC vendors to plan ahead on their development cycle to not release new chipsets based on older, and soon to be obsolete, LTS kernel versions.

Stable kernel patch rules

The rules for what can be added to a stable kernel release have remained almost identical for the past 12 years. The full list of the rules for patches to be accepted into a stable kernel release can be found in the Documentation/process/stable_kernel_rules.rst kernel file and are summarized here. A stable kernel change:

  - must be obviously correct and tested;
  - must not be bigger than 100 lines;
  - must fix only one thing;
  - must fix a real bug that bothers people;
  - must already be merged into Linus's development tree.

The last rule, “a change must be in Linus’s tree”, prevents the kernel community from losing fixes. The community never wants a fix to go into a stable kernel release without it also being in Linus’s tree, so that anyone who upgrades to a newer kernel never sees a regression. This prevents many problems that other projects who maintain separate stable and development branches can have.

Kernel Updates

The Linux kernel community has promised its userbase that no upgrade will ever break anything that is currently working in a previous release. That promise was made in 2007 at the annual Kernel developer summit in Cambridge, England, and still holds true today. Regressions do happen, but those are the highest priority bugs and are either quickly fixed, or the change that caused the regression is quickly reverted from the Linux kernel tree.

This promise holds true for both the incremental stable kernel updates, as well as the larger “major” updates that happen every three months.

The kernel community can only make this promise for the code that is merged into the Linux kernel tree. Any code that is merged into a device’s kernel that is not in the kernel.org releases is unknown and interactions with it can never be planned for, or even considered. Devices based on Linux that have large patchsets can have major issues when updating to newer kernels, because of the huge number of changes between each release. SoC patchsets are especially known to have issues with updating to newer kernels due to their large size and heavy modification of architecture specific, and sometimes core, kernel code.

Most SoC vendors do want to get their code merged upstream before their chips are released, but the reality of project-planning cycles and ultimately the business priorities of these companies prevent them from dedicating sufficient resources to the task. This, combined with the historical difficulty of pushing updates to embedded devices, results in almost all of them being stuck on a specific kernel release for the entire lifespan of the device.

Because of the large out-of-tree patchsets, most SoC vendors are starting to standardize on using the LTS releases for their devices. This allows devices to receive bug and security updates directly from the Linux kernel community, without having to rely on the SoC vendor’s backporting efforts, which traditionally are very slow to respond to problems.

It is encouraging to see that the Android project has standardized on the LTS kernels as a “minimum kernel version requirement”. Hopefully that will allow the SoC vendors to continue to update their device kernels in order to provide more secure devices for their users.

Security

When doing kernel releases, the Linux kernel community almost never declares specific changes as “security fixes”. This is due to the basic difficulty of determining whether a bugfix is a security fix at the time it is created. Also, many bugfixes are only determined to be security related after much time has passed, so to keep users from getting a false sense of security by not taking patches, the kernel community strongly recommends always taking all bugfixes that are released.

Linus summarized the reasoning behind this behavior in an email to the Linux Kernel mailing list in 2008:

On Wed, 16 Jul 2008, pageexec@freemail.hu wrote:
>
> you should check out the last few -stable releases then and see how
> the announcement doesn't ever mention the word 'security' while fixing
> security bugs

Umm. What part of "they are just normal bugs" did you have issues with?

I expressly told you that security bugs should not be marked as such,
because bugs are bugs.

> in other words, it's all the more reason to have the commit say it's
> fixing a security issue.

No.

> > I'm just saying that why mark things, when the marking have no meaning?
> > People who believe in them are just _wrong_.
>
> what is wrong in particular?

You have two cases:

 - people think the marking is somehow trustworthy.

   People are WRONG, and are misled by the partial markings, thinking that
   unmarked bugfixes are "less important". They aren't.

 - People don't think it matters

   People are right, and the marking is pointless.

In either case it's just stupid to mark them. I don't want to do it,
because I don't want to perpetuate the myth of "security fixes" as a
separate thing from "plain regular bug fixes".

They're all fixes. They're all important. As are new features, for that
matter.

> when you know that you're about to commit a patch that fixes a security
> bug, why is it wrong to say so in the commit?

It's pointless and wrong because it makes people think that other bugs
aren't potential security fixes.

What was unclear about that?

    Linus

This email can be found here, and the whole thread is recommended reading for anyone who is curious about this topic.

When security problems are reported to the kernel community, they are fixed as soon as possible and pushed out publicly to the development tree and the stable releases. As described above, the changes are almost never described as a “security fix”, but rather look like any other bugfix for the kernel. This is done to allow affected parties the ability to update their systems before the reporter of the problem announces it.

Linus describes this method of development in the same email thread:

On Wed, 16 Jul 2008, pageexec@freemail.hu wrote:
>
> we went through this and you yourself said that security bugs are *not*
> treated as normal bugs because you do omit relevant information from such
> commits

Actually, we disagree on one fundamental thing. We disagree on
that single word: "relevant".

I do not think it's helpful _or_ relevant to explicitly point out how to
tigger a bug. It's very helpful and relevant when we're trying to chase
the bug down, but once it is fixed, it becomes irrelevant.

You think that explicitly pointing something out as a security issue is
really important, so you think it's always "relevant". And I take mostly
the opposite view. I think pointing it out is actually likely to be
counter-productive.

For example, the way I prefer to work is to have people send me and the
kernel list a patch for a fix, and then in the very next email send (in
private) an example exploit of the problem to the security mailing list
(and that one goes to the private security list just because we don't want
all the people at universities rushing in to test it). THAT is how things
should work.

Should I document the exploit in the commit message? Hell no. It's
private for a reason, even if it's real information. It was real
information for the developers to explain why a patch is needed, but once
explained, it shouldn't be spread around unnecessarily.

    Linus

Full details of how security bugs can be reported to the kernel community in order to get them resolved and fixed as soon as possible can be found in the kernel file Documentation/admin-guide/security-bugs.rst.

Because security bugs are not announced to the public by the kernel team, CVE numbers for Linux kernel-related issues are usually released weeks, months, and sometimes years after the fix was merged into the stable and development branches, if at all.

Keeping a secure system

When deploying a device that uses Linux, it is strongly recommended that all LTS kernel updates be taken by the manufacturer and pushed out to their users after proper testing shows the update works well. As was described above, it is not wise to try to pick and choose various patches from the LTS releases because:

  - it is almost impossible to determine which fixes are relevant to security for any given system;
  - many fixes depend on earlier changes in the tree, and cherry-picking a patch without its prerequisites can silently break it;
  - the LTS releases are reviewed and tested as a whole; an arbitrary subset of them is not.

Note, this author has audited many SoC kernel trees that attempt to cherry-pick random patches from the upstream LTS releases. In every case, severe security fixes have been ignored and not applied.

As proof of this, I demoed at the Kernel Recipes talk referenced above how trivial it was to crash all of the latest flagship Android phones on the market with a tiny userspace program. The fix for this issue had been released six months earlier in the LTS kernel that the devices were based on, yet none of the devices had upgraded or fixed their kernels for this problem. As of this writing (five months later), only two devices have fixed their kernels and are no longer vulnerable to that specific bug.

February 05, 2018 05:13 PM

February 04, 2018

Pete Zaitcev: Farewell Nexus 7, Hello Huawei M3

Flying a photoshoot of the Carlson, I stuffed my Nexus 7 under my thighs and cracked the screen. In my defense, I did it several times before, because I hate leaving it on the cockpit floor. I had to fly uncoordinated for the photoshoot, which causes anything that's not fixed in place to slide around, and I'm paranoid about controls interference. Anyway, the cracked screen caused a significant dead zone where touch didn't register anymore, and that made the tablet useless. I had to replace it.

In the years since I had the Nexus (apparently since 2014), the industry stopped making good 7-inch tablets. Well, you can still buy $100 tablets in that size. But because the Garmin Pilot was getting spec-hungry recently, I had no choice but to step up. Sad, really. Naturally, I'm having trouble fitting the M3 into pockets where Nexus lived comfortably before. {It's a full-size iPad in the picture, not a Mini.}

The most annoying problem that I encountered was Chrome not liking the SSL certificate of www.zaitcev.us. It bails with ERR_SSL_SERVER_CERT_BAD_FORMAT. I have my own fake CA, so I install my CA certificate on clients and I sign my hosts. I accept the consequences and inconvenience. The annoyance arises because Chrome does not tell what it does not like about the certificate. Firefox works fine with it, as do other applications (like IMAP clients). Chrome on the Nexus worked fine. A cursory web search suggests that Chrome may want alternative names keyed with "DNS.1" instead of "DNS". Dunno what it means and if it is true.
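For what it's worth, "DNS.1" is how OpenSSL configuration files enumerate subjectAltName entries, so the search result probably refers to an extensions file like the following sketch. The file and section names here are hypothetical, and whether this is really what this Chrome build objects to is, as the post says, unverified:

```shell
# Hypothetical: write an OpenSSL extensions file for a private CA.
# "DNS.1", "DNS.2", ... is OpenSSL's config-file syntax for
# enumerating subjectAltName entries.
cat > san.cnf <<'EOF'
[ v3_req ]
subjectAltName = @alt_names

[ alt_names ]
DNS.1 = www.zaitcev.us
DNS.2 = zaitcev.us
EOF
```

Re-signing the host certificate with something like `openssl x509 -req -in host.csr -CA ca.crt -CAkey ca.key -CAcreateserial -extfile san.cnf -extensions v3_req -out host.crt` would then emit the names as proper SAN entries; Chrome stopped falling back to the Common Name in 2017, which may or may not be related to this particular error.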

UPDATE: "Top FBI, CIA, and NSA officials all agree: Stay away from Huawei phones"

February 04, 2018 05:17 AM

February 02, 2018

Michael Kerrisk (manpages): man-pages-4.15 is released

I've released man-pages-4.15. The release tarball is available on kernel.org. The browsable online pages can be found on man7.org. The Git repository for man-pages is available on kernel.org.

This release resulted from patches, bug reports, reviews, and comments from 26 contributors. Just over 200 commits changed around 75 pages. In addition, 3 new manual pages were added.

Among the more significant changes in man-pages-4.15 are the following:

February 02, 2018 03:21 PM

Daniel Vetter: LCA Sydney: Burning Down the Castle

I’ve done a talk about the kernel community. It’s a hot take, but judging by the feedback I’ve received thus far I think it was spot-on, and it started a lot of uncomfortable, but necessary, discussion. I don’t think it’s time yet to give up on this project, even if it will take years.

Without further ado, the recording of my talk “Burning Down the Castle” is on YouTube. For those who prefer reading, LWN has you covered with “Too many lords, not enough stewards”. I think Jake Edge and Jon Corbet have done an excellent job in capturing my talk in a balanced fashion. I have also uploaded my slides.

Further Discussion

For understanding abuse dynamics I can’t recommend “Why Does He Do That?: Inside the Minds of Angry and Controlling Men” by Lundy Bancroft enough. All the examples are derived from a few decades of working with abusers in personal relationships, but the patterns and archetypes that Lundy Bancroft extracts transfers extremely well to any other kind of relationship, whether that’s work, family or open source communities.

There’s endless amounts of stellar talks about building better communities. I’d like to highlight just two: “Life is better with Rust’s community automation” by Emily Dunham and “Have It Your Way: Maximizing Drive-Thru Contribution” by VM Brasseur. For learning more there’s lots of great community topic tracks at various conferences, but also dedicated ones - often as unconferences: Community Leadership Summit, including its various offsprings and maintainerati are two I’ve been at and learned a lot.

Finally there’s the fun of trying to change a huge existing organization with lots of inertia. “Leading Change” by John Kotter has some good insights and frameworks to approach this challenge.

Despite what it might look like I’m not quitting kernel hacking nor the X.org community, and I’m happy to discuss my talk over mail and in upcoming hallway tracks.

February 02, 2018 12:00 AM

January 23, 2018

Pete Zaitcev: 400 gigabits, every second

I keep waiting for RJ-45 to fail to keep pace with the gigabits, for many years. And it always catches up. But maybe not anymore. Here's what the connector looks like for QSFP-DD, a standard module connector for 400GbE:

Two rows, baby, same as on USB3.

These speeds are mostly used between leaf and spine switches, but I'm sure we'll see them in the upstream routers, too.

January 23, 2018 07:43 PM

January 22, 2018

James Morris: LCA 2018 Kernel Miniconf – SELinux Namespacing Slides

I gave a short talk on SELinux namespacing today at the Linux.conf.au Kernel Miniconf in Sydney — the slides from the talk are here: http://namei.org/presentations/selinux_namespacing_lca2018.pdf

This is a work in progress to which I’ve been contributing, following on from initial discussions at Linux Plumbers 2017.

In brief, there’s a growing need to be able to provide SELinux confinement within containers: typically, SELinux appears disabled within a container on Fedora-based systems, as a workaround for a lack of container support.  Underlying this is a requirement to provide per-namespace SELinux instances,  where each container has its own SELinux policy and private kernel SELinux APIs.

A prototype for SELinux namespacing was developed by Stephen Smalley, who released the code via https://github.com/stephensmalley/selinux-kernel/tree/selinuxns.  There were and still are many TODO items.  I’ve since been working on providing namespacing support to on-disk inode labels, which are represented by security xattrs.  See the v0.2 patch post for more details.

Much of this work will be of interest to other LSMs such as Smack, and many architectural and technical issues remain to be solved.  For those interested in this work, please see the slides, which include a couple of overflow pages detailing some known but as yet unsolved issues (supplied by Stephen Smalley).

I anticipate discussions on this and related topics (LSM stacking, core namespaces) later in the year at Plumbers and the Linux Security Summit(s), at least.

The session was live streamed — I gather a standalone video will be available soon!

ETA: the video is up! See:

January 22, 2018 08:38 AM

January 20, 2018

Pete Zaitcev: NUC versus laptop

When I split off the router, I received a bit of a breather from the Fedora killing i686, because I do not have to upgrade the non-routing server as faithfully as an Internet-facing firewall. Still, eventually I must switch from the ASUS EEEPC to something viable.

So, I considered a NUC, just like the one that Richard W.M. Jones bought. It beats an old laptop in every way. In particular, it's increasingly difficult to disassemble laptops nowadays, and the candidate I have now has its hard drive buried in a particularly vexing way: the whole thing must be taken apart, with a dozen tiny connectors carefully pried off, before the disk can be extracted. Still, a laptop offers a couple of features. #1: it always has a monitor and keyboard, and #2: it comes with its own uninterruptible power supply. And the cost is already amortized.

Long term, I am inclined to believe that Atwood is right and all user-facing computers will morph into tablets. When that happens, the supply of useful laptops will dry up and I will have to resort to whatever microserver box is available. But today is not that day.

January 20, 2018 05:03 PM

January 19, 2018

Greg Kroah-Hartman: Meltdown and Spectre Linux kernel status - update

I keep getting a lot of private emails about my previous post about the latest status of the Linux kernel patches to resolve both the Meltdown and Spectre issues.

These questions all seem to break down into two different categories, “What is the state of the Spectre kernel patches?”, and “Is my machine vulnerable?”

State of the kernel patches

As always, lwn.net covers the technical details about the latest state of the kernel patches to resolve the Spectre issues, so please go read that to find out that type of information.

And yes, it is behind a paywall for a few more weeks. You should be buying a subscription to get this type of thing!

Is my machine vulnerable?

For this question, it’s now a very simple answer, you can check it yourself.

Just run the following command at a terminal window to determine what the state of your machine is:

$ grep . /sys/devices/system/cpu/vulnerabilities/*

On my laptop, right now, this shows:

$ grep . /sys/devices/system/cpu/vulnerabilities/*
/sys/devices/system/cpu/vulnerabilities/meltdown:Mitigation: PTI
/sys/devices/system/cpu/vulnerabilities/spectre_v1:Vulnerable
/sys/devices/system/cpu/vulnerabilities/spectre_v2:Vulnerable: Minimal generic ASM retpoline

This shows that my kernel is properly mitigating the Meltdown problem by implementing PTI (Page Table Isolation), and that my system is still vulnerable to Spectre variant 1; it is trying really hard to resolve variant 2, but is not quite there (because I did not build my kernel with a compiler that properly supports the retpoline feature).

If your kernel does not have that sysfs directory or files, then obviously there is a problem and you need to upgrade your kernel!

Some “enterprise” distributions did not backport the changes for this reporting, so if you are running one of those types of kernels, go bug the vendor to fix that, you really want a unified way of knowing the state of your system.

Note that right now, these files are only valid for the x86-64 based kernels, all other processor types will show something like “Not affected”. As everyone knows, that is not true for the Spectre issues, except for very old CPUs, so it’s a huge hint that your kernel really is not up to date yet. Give it a few months for all other processor types to catch up and implement the correct kernel hooks to properly report this.
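Since the sysfs files are plain "file:status" text, a small wrapper can turn them into a pass/fail summary. A hypothetical sketch (the helper name is mine, not from the post), here fed the same output shown above:

```shell
# Hypothetical helper: classify the "file:status" lines that
#   grep . /sys/devices/system/cpu/vulnerabilities/*
# prints; anything starting with "Vulnerable" is flagged.
check_vulns() {
  while IFS=: read -r file status; do
    case "$status" in
      Vulnerable*) echo "NEEDS ATTENTION: ${file##*/}: $status" ;;
      *)           echo "ok: ${file##*/}: $status" ;;
    esac
  done
}

# Live use would be:
#   grep . /sys/devices/system/cpu/vulnerabilities/* | check_vulns
# Fed the laptop's output quoted above:
printf '%s\n' \
  '/sys/devices/system/cpu/vulnerabilities/meltdown:Mitigation: PTI' \
  '/sys/devices/system/cpu/vulnerabilities/spectre_v1:Vulnerable' \
  '/sys/devices/system/cpu/vulnerabilities/spectre_v2:Vulnerable: Minimal generic ASM retpoline' |
check_vulns
```

Note that `read` with IFS=: splits only at the first colon, so statuses like "Mitigation: PTI" survive intact.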

And yes, I need to go find a working microcode update to fix my laptop’s CPU so that it is not vulnerable against Spectre…

January 19, 2018 10:30 AM

Gustavo F. Padovan: Save the date! Annoucing linuxdev-br Conference 2018

We are proud to tell you that the second edition of the linuxdev-br conference will happen on August 25th and 26th, 2018, again at the University of Campinas. The first edition, last November, was a massive success. The second edition will take place in a bigger venue over two days, so it can accommodate a wider range of talks while still leaving attendees time to connect during the coffee breaks and happy hours!

Stay tuned for more updates, soon we will publish a call for talks and open the registrations. We want to make linuxdev-br always better! See you there! :)

January 19, 2018 12:44 AM

January 18, 2018

Pavel Machek: Fun with Rust (not spinning this time)

Rust... took me a while to install. I decided I did not like curl | sh, so I created a fresh VM for that. That took a while, and in the end I ran curl | sh anyway. I coded the weather forecast core in Rust... And I feel like every second line needs an explicit typecast. Not nice, but ok; the result will be fast, right? Rust: 6m45s, Python less than 1m7s. Ouch. Ok, Rust really needs optimizations to get anywhere near reasonable run-time speed. 7 seconds optimized. Compile time is... 4 seconds for 450 lines of code. Hmm. Not great... but I guess better than the alternatives.

January 18, 2018 06:42 PM

Pavel Machek: Hey Intel, what about an apology?

Hey, Intel. You were selling faulty CPUs for 15+ years, you are still selling faulty CPUs, and there are no signs you even intend to fix them. You sold faulty CPUs for half a year, knowing they were faulty, without telling your customers. You helped develop band-aids for a subset of the problems, and a subset of configurations. Yeah, so there's a workaround for Meltdown on 64-bit Linux. Where's the workaround for Meltdown on 32-bit? What about the BSDs? MINIX? L4? Where are the workarounds for Spectre? And more importantly -- where are the real fixes? You know, your CPUs fail to do security checks in time. Somehow I think that maybe you should fix your CPUs? I hear you want to achieve “quantum supremacy". But maybe I'd like to hear how you intend to fix the mess you created, first? I actually started creating a workaround for x86-32, but I somehow feel like I should not be the one fixing this. I'm willing to test the patches...

(And yes, Spectre is industry-wide problem. Meltdown is -- you screwed it up.)

January 18, 2018 06:38 PM

January 17, 2018

Matthew Garrett: Privacy expectations and the connected home

Traditionally, devices that were tied to logins tended to indicate that in some way - turn on someone's xbox and it'll show you their account name, run Netflix and it'll ask which profile you want to use. The increasing prevalence of smart devices in the home changes that, in ways that may not be immediately obvious to the majority of people. You can configure a Philips Hue with wall-mounted dimmers, meaning that someone unfamiliar with the system may not recognise that it's a smart lighting system at all. Without any actively malicious intent, you end up with a situation where the account holder is able to infer whether someone is home without that person necessarily having any idea that that's possible. A visitor who uses an Amazon Echo is not necessarily going to know that it's tied to somebody's Amazon account, and even if they do they may not know that the log (and recorded audio!) of all interactions is available to the account holder. And someone grabbing an egg out of your fridge is almost certainly not going to think that your smart egg tray will trigger an immediate notification on the account owner's phone that they need to buy new eggs.

Things get even more complicated when there's multiple account support. Google Home supports multiple users on a single device, using voice recognition to determine which queries should be associated with which account. But the account that was used to initially configure the device remains the fallback, with unrecognised voices ending up logged to it. If a voice is misidentified, the query may end up being logged to an unexpected account.

There are some interesting questions about consent and expectations of privacy here. If someone sets up a smart device in their home then at some point they'll agree to the manufacturer's privacy policy. But if someone else makes use of the system (by pressing a lightswitch, making a spoken query or, uh, picking up an egg), have they consented? Who has the social obligation to explain to them that the information they're producing may be stored elsewhere and visible to someone else? If I use an Echo in a hotel room, who has access to the Amazon account it's associated with? How do you explain to a teenager that there's a chance that when they asked their Home for contact details for an abortion clinic, it ended up in their parent's activity log? Who's going to be the first person divorced for claiming that they were vegan but having been the only person home when an egg was taken out of the fridge?

To be clear, I'm not arguing against the design choices involved in the implementation of these devices. In many cases it's hard to see how the desired functionality could be implemented without this sort of issue arising. But we're gradually shifting to a place where the data we generate is not only available to corporations who probably don't care about us as individuals, it's also becoming available to people who own the more private spaces we inhabit. We have social norms against bugging our houseguests, but we have no social norms that require us to explain to them that there'll be a record of every light that they turn on or off. This feels like it's going to end badly.

(Thanks to Nikki Everett for conversations that inspired this post)

(Disclaimer: while I work for Google, I am not involved in any of the products or teams described in this post and my opinions are my own rather than those of my employer's)


January 17, 2018 09:45 PM

January 15, 2018

Pete Zaitcev: New toy

Guess what.

A Russian pillowcase is much wider (or squar-er) than tubular American ones, so it works perfectly as a cover.

January 15, 2018 09:38 PM

January 12, 2018

Pete Zaitcev: Old news

Per U.S. News:

Alphabet Inc's (GOOG, GOOGL) Google said in 2016 that it was designing a server based on International Business Machines Corp's (IBM) Power9 processor.

Have they put anything into production since then? If not, why bring this up?

UPDATE: R. Hubbell writes by e-mail:

So yes I think the move to the IBM is due to their encounter of the exploits.

A lot of lip service is given to the hazards of the monoculture. But why PPC of all things? Is Google becoming incapable of dealing with any supplier that is not a megacorp?

January 12, 2018 04:07 PM

January 11, 2018

Pete Zaitcev: A split home network

Real quick, why a 4-port router was needed.

  1. Red: Upstream link to ISP
  2. Grey: WiFi
  3. Blue: Entertainment stack
  4. Green: General Ethernet

The only reason to split the blue network is to prevent TiVo from attacking other boxes, such as desktops and printers. Yes, this is clearly not paranoid enough for a guy who insists on a dumb TV.
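Keeping the TiVo fenced in amounts to one forwarding rule on the router. A hypothetical nftables sketch, assuming the blue and green segments sit on interfaces named "blue0" and "green0" (the names are mine, not from the post): connections initiated from the entertainment stack toward the general Ethernet are dropped, while replies to green-initiated connections still flow.

```shell
# Hypothetical ruleset; interface names "blue0"/"green0" are illustrative.
cat > tivo-fence.nft <<'EOF'
table inet filter {
    chain forward {
        type filter hook forward priority 0; policy accept;
        ct state established,related accept
        iifname "blue0" oifname "green0" drop
    }
}
EOF
# Loading it on the router would be:  nft -f tivo-fence.nft
```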

January 11, 2018 04:53 AM

January 10, 2018

Pete Zaitcev: Buying a dumb TV in 2018 America

I wanted to buy a TV a month ago and found that almost all of them are "Smart" nowadays. When I asked for a conventional TV, people ranging from a floor worker at Best Buy to Nikita Danilov at Facebook implied that I was an idiot. Still, I succeeded.

At first, I started looking at what is positioned as a "conference room monitor". The NEC E506 is far and away the leader, but it's expensive at $800 or so.

Then, I went to Fry's, who advertise quasi-brands like SILO. They had TVs on display, but were out of stock. I was even desperate enough to be upsold to an Athyme for $450, but fortunately they were out of that one too.

At that point, I headed to Best Buy, who have an exclusive agreement with Toshiba (h/t Matt Kern on Facebook). I was not happy to support this kind of distasteful arrangement, but very few options remained. There, it was either waiting for delivery, or driving 3 hours to a warehouse store. Considering how much my Jeep burns per mile, I declined.

Finally, I headed to a local Wal-Mart and bought a VIZIO for $400 out the door. No fuss, no problem, easy peasy. Should've done that from the start.

P.S. Some people suggested buying a Smart TV and then not plugging it in. That includes not giving it the password for the house WiFi. Unfortunately, it is still problematic, as some of these TVs will associate with any open wireless network by default. An attacker drives by with a passwordless AP and roots all the TVs on the block. And I live in a high-tech area where stuff like that happens all the time. When I mentioned it to Nikita, he thought that I was an idiot for sure. It's like the Russian joke about "dropping everything and moving to Uryupinsk."

January 10, 2018 06:29 PM

James Bottomley: GPL as the Best Licence – Governance and Philosophy

In the first part I discussed the balancing mechanisms the GPL provides for enabling corporate contributions, giving users a voice and the right one for mutually competing corporations to collaborate on equal terms.  In this part I’ll look at how the legal elements of the GPL licences make it pretty much the perfect one for supporting a community of developers co-operating with corporations and users.

As far as a summary of my talk goes, this series is complete.  However, I’ve been asked to add some elaboration on the legal structure of GPL+DCO contrasted to other CLAs and also include AGPL, so I’ll likely do some other one off posts in the Legal category about this.

Free Software vs Open Source

There have been many definitions of both of these.  Rather than review them, in the spirit of Humpty Dumpty, I’ll give you mine: Free Software, to me, means espousing a set of underlying beliefs about the code (for instance the four freedoms of the FSF).  While this isn’t problematic for many developers (code freedom, of course, is what enables developer driven communities) it is an anathema to most corporations and in particular their lawyers because, generally applied, it would require the release of all software based intellectual property.  Open Source on the other hand, to me, means that you follow all the rules of the project (usually licensing and contribution requirements) but don’t necessarily sign up to the philosophy underlying the project (if there is one; most Open Source projects won’t have one).

Open Source projects are compatible with Corporations because, provided they have some commonality in goals, even a corporation seeking to exploit a market can march a long way with a developer driven community before the goals diverge.  This period of marching together can be extremely beneficial for both the project and the corporation and if corporate priorities change, the corporation can simply stop contributing.  As I have stated before, Community Managers serve an essential purpose in keeping this goal alignment by making the necessary internal business adjustments within a corporation and by explaining the alignment externally.

The irony of the above is that collaborating within the framework of the project, as Open Source encourages, could work just as well for a Free Software project, provided the philosophical differences could be overcome (or overlooked).  In fact, one could go a stage further and theorize that the four freedoms as well as being input axioms to Free Software are, in fact, the generated end points of corporate pursuit of Open Source, so if the Open Source model wins in business, there won’t actually be a discernible difference between Open Source and Free Software.

Licences and Philosophy

It has often been said that the licence embodies the philosophy of the project (I’ve said it myself on more than one occasion, for which I’d now like to apologize).  However, it is an extremely reckless statement because it’s manifestly untrue in the case of GPL.  Neither v2 nor v3 does anything to require that adopters also espouse the four freedoms, although it could be said that the Tivoization Clause of v3, to which the kernel developers objected, goes slightly further down the road of trying to embed philosophy in the licence.  The reason for avoiding this statement is that it’s very easy for an inexperienced corporation (or pretty much any corporate legal counsel with lack of Open Source familiarity) to take this statement at face value and assume adopting the code or the licence will force some sort of viral adoption of a philosophy which is incompatible with their current business model and thus reject the use of reciprocal licences altogether.  Whenever any corporation debates using or contributing to Open Source, there’s inevitably an internal debate and this licence embeds philosophy argument is a powerful weapon for the Open Source opponents.

Equity in Contribution Models

Some licensing models, like those pioneered by Apache, are thought to require a foundation to pass the rights through under the licence: developers (or their corporations) sign a Contributor Licence Agreement (CLA) which basically grants the foundation redistributable licences to both copyrights and patents in the code, and then the Foundation licenses the contribution to the Project under Apache-2.  The net result is that the outbound rights (what consumers of the project get) are Apache-2 but the inbound rights (what contributors are required to give away) are considerably more.  The danger in this model is that control of the foundation gives control of the inbound rights, so who controls the foundation and how control may be transferred forms an important part of the analysis of what happens to contributor rights.  Note that this model is also the basis of open core, with a corporation taking the place of the foundation.

Inequity in the inbound versus the outbound rights creates an imbalance of power within the project between those who possess the inbound rights and everyone else (who only possess the outbound rights) and can damage developer driven communities by creating an alternate power structure (the one which controls the IP rights).  Further, the IP rights tend to be a focus for corporations, so simply joining the controlling entity (or taking a licence from it) instead of actually contributing to the project can become an end goal, thus weakening the technical contributions to the project and breaking the link with end users.

Creating equity in the licensing framework is thus a key to preserving the developer driven nature of a community.  This equity can be preserved by using the Inbound = Outbound principle, first pioneered by Richard Fontana, the essential element being that contributors should only give away exactly the rights that downstream recipients require under the licence.  This structure means there is no need for a formal CLA and instead a model like the Developer Certificate of Origin (DCO) can be used whereby the contributor simply places a statement in the source control of the project itself attesting to giving away exactly the rights required by the licence.  In this model, there’s no requirement to store non-electronic copies of the contribution attestation (which inevitably seem to get lost), because the source control system used by the project does this.  Additionally, the source browsing functions of the source control system can trace a single line of code back exactly to all the contributor attestations, thus allowing fully transparent inspection and independent verification of all the inbound contribution grants.
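As a minimal sketch of how this works in practice (the contributor name and file are invented; the commands are standard git), the attestation is just a Signed-off-by trailer recorded in the commit itself:

```shell
# Minimal DCO sketch in a scratch repository; in a real project you
# would simply run "git commit -s" in the project's own tree.
set -e
tmp=$(mktemp -d) && cd "$tmp" && git init -q .
git config user.name "Jane Developer"
git config user.email "jane@example.com"

echo 'int main(void) { return 0; }' > hello.c
git add hello.c
# -s appends the DCO attestation as a commit trailer:
#   Signed-off-by: Jane Developer <jane@example.com>
git commit -q -s -m "hello: add initial program"

# Anyone can later trace the code back to its attestation:
git log -1 -- hello.c | grep Signed-off-by
```

Because the trailer lives in the commit, `git log` and `git blame` let any third party independently verify the inbound grant for any line of code, with no foundation or paper archive involved.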

The Dangers of Foundations

Foundations which have no special inbound contribution rights can still present a threat to the project by becoming an alternate power structure.  In the worst case, the alternate power structure is cemented by the Foundation having a direct control link with the project, usually via some Technical Oversight Committee (TOC).  In this case, the natural developer driven nature of the project is sapped by the TOC creating uncertainty over whether a contribution should be accepted or not, so now the object isn’t to enthuse fellow developers; it’s to please the TOC.  The control confusion created by this type of foundation directly atrophies the project.

Even if a Foundation specifically doesn’t create any form of control link with the project, there’s still the danger that a corporation’s marketing department sees joining the Foundation as a way of linking itself with the project without having to engage the engineering department, and thus still causing a weakening in both potential contributions and the link between the project and its end users.

There are specific reasons why projects need foundations (anything requiring financial resources like conferences or grants requires some entity to hold the cash) but they should be driven by the need of the community for a service and not by the need of corporations for an entity.

GPL+DCO as the Perfect Licence and Contribution Framework

Reciprocity is the key to this: the requirement to give back the modifications levels the playing field for corporations by ensuring that they each see what the others are doing.  Since there’s little benefit (and often considerable downside) to hiding modifications and doing a dump at release time, it actively encourages collaboration between competitors on shared features.  Reciprocity also contains patent leakage, as we saw in Part 1.  Coupled with a DCO that follows the Inbound = Outbound principle, this means the licence and the DCO process are everything you need to form an effective and equal community.

Equality enforced by licensing coupled with reciprocity also provides a level playing field for corporate contributors as we saw in part 1, so equality before the community ensures equity among all participants.  Since this is analogous to the equity principles that underlie a lot of the world’s legal systems, it should be no real surprise that it generates the best contribution framework for the project.  Best of all, the model works simply and effectively for a group of contributors without necessity for any more formal body.

Contributions and Commits

Although GPL+DCO can ensure equity in contribution, some human agency is still required to go from contribution to commit.  The application of this agency is one of the most important aspects of the vibrancy of the project and the community.  The agency can be exercised by an individual or a group; however, the composition of the agency doesn’t much matter.  What does matter is that the commit decisions of the agency essentially (and impartially) judge the technical merit of the contribution in relation to the project.

A bad commit agency can be even more atrophying to a community than a Foundation because it directly saps the confidence the community has in the ability of good (or interesting) code to get into the tree.  Conversely, a good agency is simply required to make sound technical decisions about the contribution, which directly preserves the confidence of the community that good code gets into the tree.   As such, the role requires leadership, impartiality and sound judgment rather than any particular structure.

Governance and Enforcement

Governance seems to have many meanings depending on context, so let’s narrow it to the rules by which the project is run (this necessarily includes gathering the IP contribution rights) and how they get followed.  In a GPL+DCO framework, the only additional governance component required is the commit agency.

However, having rules isn’t sufficient unless you also follow them; in other words you need some sort of enforcement mechanism.  In a non-GPL+DCO system, this usually involves having an elaborate set of sanctions and some sort of adjudication system, which, if not set up correctly, can also be a source of inequity and project atrophy.  In a GPL+DCO system, most of the adjudication system and sanctions can be replaced by copyright law (this was the design of the licence, after all), which means licence enforcement (or at least the threat of it) becomes the enforcement mechanism.  The only aspect of governance this doesn’t cover is the commit agency.  However, with no other formal mechanisms to support its authority, the commit agency depends on the trust of the community to function and could easily be replaced by that community simply forking the tree and trusting a new commit agency.

The two essential corollaries of the above are that enforcement does serve an essential governance purpose in a GPL+DCO ecosystem and that the lack of a formal power structure keeps the commit agency honest, because the community could replace it.

The final thing worth noting is that too many formal rules can also seriously weaken a project by encouraging infighting over rule interpretations, how exactly they should be followed, and who did or did not dot the i’s and cross the t’s.  This makes the very lack of formality and of a formalised power structure that GPL+DCO encourages a key strength of the model.

Conclusions

In the first part I concluded that the GPL fostered the best ecosystem between developers, corporations and users by virtue of the essential ecosystem fairness it engenders.  In this part I conclude that formal control structures are actually detrimental to a developer driven community and thus the best structural mechanism is pure GPL+DCO with no additional formality.  Finally I conclude that this lack of ecosystem control is no bar to strong governance, since that can be enforced by any contributor through the copyright mechanism, and the very lack of control is what keeps the commit agency correctly serving the community.

January 10, 2018 04:38 PM

January 08, 2018

Pete Zaitcev: Caches are like the government

From an anonymous author, a follow-up to the discussion about the cache etc.:

counterpoint 1: Itanium, which was EPIC like Elbrus, failed even with Intel behind it. And it added prefetching before the end. Source: https://en.wikipedia.org/wiki/Itanium#Itanium_9500_(Poulson):_2012

counterpoint 2: To get fast, Elbrus has also added at least one kind of prefetch (APB, "Array Prefetch Buffer") and has the multimegabyte cache that Zaitcev decries. Source: [kozhin2016, 10.1109/EnT.2016.027]

counterpoint 3: "According to Keith Diefendorff, in 1978 almost 15 years ahead of Western superscalar processors, Elbrus implemented a two-issue out-of-order processor with register renaming and speculative execution" https://www.theregister.co.uk/1999/06/07/intel_uses_russia_military_technologies/

1. Itanium, as I recall, suffered too much from the poor initial implementation. Remember that the 1st implementation was designed at Intel, while the 2nd implementation was designed at HP. Intel's chip stunk on ice. By the time the HP design came along, AMD64 had become a thing, and then it was over.

Would Itanium win over the AMD64 if it were better established, burned less power, and were faster, sooner? There's no telling. The compatibility is an important consideration, and the binary translation was very shaky back then, unless you count Crusoe.

2. It's quite true that modern Elbrus runs with a large cache. That is because cache is obviously beneficial. All this is about is considering once again whether better software control of caches, and better cache architecture in general, would disrupt side-channel signalling and bring performance advantages.

By the way, people might not remember it now, but a large chunk of Opteron's performance derived from its excellent memory controller. It's a component of CPU that tended not to get noticed, but it's essential. Fortunately, the Rowhammer vulnerability drew some much-needed attention to it, as well as a possible role for software control there.

3. Well, Prof. Babayan's own outlook on Elbrus-2 and its superscalar, out-of-order core was, "As you can see, I tried this first, and found that VLIW was better", which is why Elbrus-3 dispensed with all that stuff. Naturally, all that stuff came back when we started to find the limits of EPIC (née VLIW), just like the cache did.

January 08, 2018 10:51 PM

January 06, 2018

Greg Kroah-Hartman: Meltdown and Spectre Linux kernel status

By now, everyone knows that something “big” just got announced regarding computer security. Heck, when the Daily Mail does a report on it, you know something is bad…

Anyway, I’m not going to go into the details about the problems being reported, other than to point you at the wonderfully written Project Zero paper on the issues involved here. They should just give out the 2018 Pwnie award right now, it’s that amazingly good.

If you do want technical details for how we are resolving those issues in the kernel, see the always awesome lwn.net writeup for the details.

Also, here’s a good summary of lots of other postings that includes announcements from various vendors.

As for how this was all handled by the companies involved, well this could be described as a textbook example of how NOT to interact with the Linux kernel community properly. The people and companies involved know what happened, and I’m sure it will all come out eventually, but right now we need to focus on fixing the issues involved, and not pointing blame, no matter how much we want to.

What you can do right now

If your Linux systems are running a normal Linux distribution, go update your kernel. They should all have the updates in them already. And then keep updating them over the next few weeks; we are still working out lots of corner case bugs, as the testing involved here is complex given the huge variety of systems and workloads this affects. If your distro does not have kernel updates, then I strongly suggest changing distros right now.

However there are lots of systems out there that are not running “normal” Linux distributions for various reasons (rumor has it that it is way more than the “traditional” corporate distros). They rely on the LTS kernel updates, or the normal stable kernel updates, or they are in-house franken-kernels. For those people here’s the status of what is going on regarding all of this mess in the upstream kernels you can use.

Meltdown – x86

Right now, Linus’s kernel tree contains all of the fixes we currently know about to handle the Meltdown vulnerability for the x86 architecture. Go enable the CONFIG_PAGE_TABLE_ISOLATION kernel build option, and rebuild and reboot and all should be fine.
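A quick way to confirm what you are actually running (a sketch; config paths and messages vary by distro, and the sysfs vulnerabilities file only exists on kernels that already carry the reporting patches):

```shell
# Check whether the running kernel was built with, and booted with, PTI.
cfg="/boot/config-$(uname -r)"
if [ -r "$cfg" ]; then
    grep CONFIG_PAGE_TABLE_ISOLATION "$cfg" || echo "PTI not set in $cfg"
else
    echo "no readable kernel config at $cfg"
fi

# Fixed kernels log the PTI state at boot (needs dmesg access):
dmesg 2>/dev/null | grep -i 'page tables isolation' \
    || echo "no PTI boot message visible"

# Kernels carrying the reporting patches summarize the status here:
cat /sys/devices/system/cpu/vulnerabilities/meltdown 2>/dev/null \
    || echo "no sysfs vulnerabilities entry on this kernel"
```

If PTI is built in but you need to turn it off for debugging, the fixed kernels also accept `pti=off` on the kernel command line.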

However, Linus’s tree is currently at 4.15-rc6 + some outstanding patches. 4.15-rc7 should be out tomorrow, with those outstanding patches to resolve some issues, but most people do not run a -rc kernel in a “normal” environment.

Because of this, the x86 kernel developers have done a wonderful job in their development of the page table isolation code, so much so that the backport to the latest stable kernel, 4.14, has been almost trivial for me to do. This means that the latest 4.14 release (4.14.12 at this moment in time), is what you should be running. 4.14.13 will be out in a few more days, with some additional fixes in it that are needed for some systems that have boot-time problems with 4.14.12 (it’s an obvious problem, if it does not boot, just add the patches now queued up.)

I would personally like to thank Andy Lutomirski, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, Peter Zijlstra, Josh Poimboeuf, Juergen Gross, and Linus Torvalds for all of the work they have done in getting these fixes developed and merged upstream in a form that was so easy for me to consume to allow the stable releases to work properly. Without that effort, I don’t even want to think about what would have happened.

For the older long term stable (LTS) kernels, I have leaned heavily on the wonderful work of Hugh Dickins, Dave Hansen, Jiri Kosina and Borislav Petkov to bring the same functionality to the 4.4 and 4.9 stable kernel trees. I have also had immense help from Guenter Roeck, Kees Cook, Jamie Iles, and many others in tracking down nasty bugs and missing patches. I want to also call out David Woodhouse, Eduardo Valentin, Laura Abbott, and Rik van Riel for their help with the backporting and integration as well; their help was essential in numerous tricky places.

These LTS kernels also have the CONFIG_PAGE_TABLE_ISOLATION build option that should be enabled to get complete protection.

As this backport is very different from the mainline version that is in 4.14 and 4.15, there are different bugs happening. Right now we know of some VDSO issues that are getting worked on, and some odd virtual machine setups are reporting strange errors, but those are the minority at the moment and should not stop you from upgrading right now. If you do run into problems with these releases, please let us know on the stable kernel mailing list.

If you rely on any other kernel tree other than 4.4, 4.9, or 4.14 right now, and you do not have a distribution supporting you, you are out of luck. The lack of patches to resolve the Meltdown problem is so minor compared to the hundreds of other known exploits and bugs that your kernel version currently contains. You need to worry about that more than anything else at this moment, and get your systems up to date first.

Also, go yell at the people who forced you to run an obsoleted and insecure kernel version, they are the ones that need to learn that doing so is a totally reckless act.

Meltdown – ARM64

Right now the ARM64 set of patches for the Meltdown issue are not merged into Linus’s tree. They are staged and ready to be merged into 4.16-rc1 once 4.15 is released in a few weeks. Because these patches are not in a released kernel from Linus yet, I can not backport them into the stable kernel releases (hey, we have rules for a reason…)

Due to them not being in a released kernel, if you rely on ARM64 for your systems (i.e. Android), I point you at the Android Common Kernel tree. All of the ARM64 fixes have been merged into the 3.18, 4.4, and 4.9 branches as of this point in time.

I would strongly recommend just tracking those branches, as more fixes get added through testing and as they catch up with what gets merged into the upstream kernel releases, especially since I do not know when these patches will land in the stable and LTS kernel releases.

For the 4.4 and 4.9 LTS kernels, odds are these patches will never get merged into them, due to the large number of prerequisite patches required. All of those prerequisite patches have been long merged and tested in the android-common kernels, so I think it is a better idea to just rely on those kernel branches instead of the LTS release for ARM systems at this point in time.

Also note, I merge all of the LTS kernel updates into those branches usually within a day or so of being released, so you should be following those branches no matter what, to ensure your ARM systems are up to date and secure.
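Tracking those branches is plain git remote work. A sketch (the remote name is arbitrary; the URL is the public android common kernel tree, and the fetch/merge steps need network access, so they are shown commented out):

```shell
# Sketch: point a kernel checkout at the android common tree.
# A scratch repository is used here so the commands are self-contained;
# in practice you would run this in your existing kernel tree.
set -e
tmp=$(mktemp -d) && cd "$tmp" && git init -q .

git remote add aosp https://android.googlesource.com/kernel/common
git remote get-url aosp

# With network access, pull in and follow the branch for your series:
#   git fetch aosp android-4.4
#   git merge aosp/android-4.4
```

Repeating the fetch and merge periodically keeps you current with both the ARM64 fixes and the LTS updates that get merged into those branches.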

Spectre

Now things get “interesting”…

Again, if you are running a distro kernel, you might be covered, as some of the distros have merged various patches that they claim mitigate most of the problems here. I suggest updating and testing for yourself to see if you are worried about this attack vector.

For upstream, well, the status is that there are no fixes merged into any upstream tree for these types of issues yet. There are numerous patches floating around on the different mailing lists proposing solutions, but they are under heavy development; some of the patch series do not even build or apply to any known trees, the series conflict with each other, and it’s a general mess.

This is due to the fact that the Spectre issues were the last to be addressed by the kernel developers. All of us were working on the Meltdown issue, and we had no real information on exactly what the Spectre problem was at all, and what patches were floating around were in even worse shape than what has been publicly posted.

Because of all of this, it is going to take us in the kernel community a few weeks to resolve these issues and get them merged upstream. The fixes are coming in to various subsystems all over the kernel, and will be collected and released in the stable kernel updates as they are merged, so again, you are best off just staying up to date with either your distribution’s kernel releases, or the LTS and stable kernel releases.

It’s not the best news, I know, but it’s reality. If it’s any consolation, it does not seem that any other operating system has full solutions for these issues either, the whole industry is in the same boat right now, and we just need to wait and let the developers solve the problem as quickly as they can.

The proposed solutions are not trivial, but some of them are amazingly good. The Retpoline post from Paul Turner is an example of some of the new concepts being created to help resolve these issues. This is going to be an area of lots of research over the next years to come up with ways to mitigate the potential problems involved in hardware that wants to try to predict the future before it happens.
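Once the mitigations did start landing, the kernel also grew a runtime report of which defence is active. On kernels carrying those reporting patches (added shortly after this was written; the files do not exist on older kernels) you can check:

```shell
# Report the active Spectre mitigations, if the kernel exposes them.
for f in /sys/devices/system/cpu/vulnerabilities/spectre_v1 \
         /sys/devices/system/cpu/vulnerabilities/spectre_v2; do
    if [ -r "$f" ]; then
        printf '%s: ' "${f##*/}"; cat "$f"
    else
        echo "${f##*/}: not reported by this kernel"
    fi
done
```

On a kernel built with a retpoline-aware compiler, spectre_v2 typically reports something like "Mitigation: Full generic retpoline"; the exact strings vary by kernel version.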

Other arches

Right now, I have not seen patches for any other architectures than x86 and arm64. There are rumors of patches floating around in some of the enterprise distributions for some of the other processor types, and hopefully they will surface in the weeks to come to get merged properly upstream. I have no idea when that will happen; if you are dependent on a specific architecture, I suggest asking on the arch-specific mailing list to get a straight answer.

Conclusion

Again, update your kernels, don’t delay, and don’t stop. The updates to resolve these problems will be continuing to come for a long period of time. Also, there are still lots of other bugs and security issues being resolved in the stable and LTS kernel releases that are totally independent of these types of issues, so keeping up to date is always a good idea.

Right now, there are a lot of very overworked, grumpy, sleepless, and just generally pissed off kernel developers working as hard as they can to resolve these issues that they themselves did not cause at all. Please be considerate of their situation right now. They need all the love and support and free supply of their favorite beverage that we can provide them to ensure that we all end up with fixed systems as soon as possible.

January 06, 2018 12:36 PM

January 05, 2018

Pete Zaitcev: Police action in the drone-to-helicopter collision

The year 2017 was the first year when a civilian multicopter drone collided with a manned aircraft. It was expected for a while and there were several false starts. One thing is curious though - how did they find the operator of the drone? I presume it wasn't something simple like a post on Facebook with a video of the collision. They must've polled witnesses in the area, then looked at surveillance cameras or whatnot, to get it narrowed down to vehicles.

UPDATE: Readers mkevac and veelmore inform that a serialized part of the drone was recovered, and the investigators worked through seller records to identify the buyer.

January 05, 2018 11:06 PM

Pete Zaitcev: Prof. Babayan's Revenge

Someone at GNUsocial posted:

I suspect people trying to find alternate CPU architectures that don't suffer from #Spectre-like bugs have misunderstood how fundamental the problem is. Your CPU will not go fast without caches. Your CPU will not go fast without speculative execution. Solving the problem will require more silicon, not less. I don't think the market will accept the performance hit implied by simpler architectures. OS, compiler and VM (including the browser) workarounds are the way this will get mitigated.

CPUs will not go fast without caches and speculative execution, you say? Prof. Babayan may have something to say about that. Back when I worked under him in the 1990s, he considered caches a primitive workaround.

The work on Narch was informed by the observation that submicron feature sizes provided designers with more silicon than they knew what to do with. So, the task of a CPU designer was to identify ways to use massive amounts of gates productively. But instead, mediocre designers simply added more cache, even multi-level cache.

Talking about it was not enough, so he set out to design and implement his CPU, called "Narch" (later commercialized as "Elbrus-2000"). And he did. The performance was generally on par with its contemporaries, such as Pentium III and UltraSparc. It had a cache, but measured in kilobytes, not megabytes. But there were problems beyond the cache.

The second part of the Bee Yarn Knee's objection deals with the speculative execution. Knocking that out required software known as a binary translator, which did basically the same thing, only in software[*]. Frankly, at this point I cannot guarantee that it wasn't possible to abuse that mechanism for unintentional signaling in the same ways Meltdown works. You don't have cache for timing signals in Narch, but you do have the translator, which can be timed if it runs at run time, like in Transmeta Crusoe. In Narch's case it only ran ahead of time, so it was not exploitable, but the result turned out to be not fast enough for workloads that make good use of speculative execution today (such as LISP and gcc).

Still, I think that a blanket objection that a CPU cannot run fast with no cache and no speculative execution is, IMHO, informed by ignorance of alternatives. I cannot guarantee that E2k would solve the problem for good; after all, its later models sit on top of a cache. But at least we have a hint.

[*] The translator grew from a language toolchain and could be used in creative ways to translate source. It would not be binary in such case. I omit a lot of detail here.

UPDATE: Oh, boy:

But the speedup from speculative execution IS from parallelism. We're just asking the CPU to find it instead of the compiler. So couldn't you move the smarts into the compiler?

Sean, this is literally what they said 30 years ago.

January 05, 2018 04:56 PM

January 04, 2018

Kees Cook: SMEP emulation in PTI

A nice additional benefit of the recent Kernel Page Table Isolation (CONFIG_PAGE_TABLE_ISOLATION) patches (to defend against CVE-2017-5754, the speculative execution “rogue data cache load” or “Meltdown” flaw) is that the userspace page tables visible while running in kernel mode lack the executable bit. As a result, systems without the SMEP CPU feature (before Ivy Bridge) get it emulated for “free”.

Here’s a non-SMEP system with PTI disabled (booted with “pti=off“), running the EXEC_USERSPACE LKDTM test:

# grep smep /proc/cpuinfo
# dmesg -c | grep isolation
[    0.000000] Kernel/User page tables isolation: disabled on command line.
# cat <(echo EXEC_USERSPACE) > /sys/kernel/debug/provoke-crash/DIRECT
# dmesg
[   17.883754] lkdtm: Performing direct entry EXEC_USERSPACE
[   17.885149] lkdtm: attempting ok execution at ffffffff9f6293a0
[   17.886350] lkdtm: attempting bad execution at 00007f6a2f84d000

No crash! The kernel was happily executing userspace memory.

But with PTI enabled:

# grep smep /proc/cpuinfo
# dmesg -c | grep isolation
[    0.000000] Kernel/User page tables isolation: enabled
# cat <(echo EXEC_USERSPACE) > /sys/kernel/debug/provoke-crash/DIRECT
Killed
# dmesg
[   33.657695] lkdtm: Performing direct entry EXEC_USERSPACE
[   33.658800] lkdtm: attempting ok execution at ffffffff926293a0
[   33.660110] lkdtm: attempting bad execution at 00007f7c64546000
[   33.661301] BUG: unable to handle kernel paging request at 00007f7c64546000
[   33.662554] IP: 0x7f7c64546000
...

It should only take a little more work to leave the userspace page tables entirely unmapped while in kernel mode, and only map them in during copy_to_user()/copy_from_user() as ARM already does with ARM64_SW_TTBR0_PAN (or CONFIG_CPU_SW_DOMAIN_PAN on arm32).

© 2018, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.

January 04, 2018 09:43 PM

Pete Zaitcev: More bugs

Speaking of stupid bugs that make no sense to report, Anaconda fails immediately in F27 if one of the disks has an exported volume group on it — in case you thought it was a clever way to protect some data from being overwritten accidentally by the installation. The workaround was to unplug the drives that contained the PVs in the problematic VG. Now, why not report this? It's 100% reproducible. But reporting presumes a responsibility to re-test, and hopefully I'm not going to install a fresh Fedora again for a long time, so I'm not in a position to discharge my bug reporter's responsibilities.

January 04, 2018 08:27 PM

January 03, 2018

James Bottomley: GPL as the best licence – Community, Code and Licensing

This article is the first of a set supporting the conclusion that the GPL family of copyleft licences are the best ones for maintaining a healthy development pace while providing a framework for corporations and users to influence the code base.  It is based on an expansion of the thoughts behind the presentation GPL: The Best Business Licence for Corporate Code at the Compliance Summit 2017 in Yokohama.

A Community of Developers

The standard definition of any group of people building some form of open source software is usually that they’re developers (people with the necessary technical skills to create or contribute to the project).  In pretty much every developer driven community, they’re doing it because they get something out of the project itself (this is the scratch your own itch idea in the Cathedral and the Bazaar): usually because they use the project in some form, but sometimes because they’re fascinated by the ideas it embodies (this latter is really how the Linux Kernel got started because ordinarily a kernel on its own isn’t particularly useful but, for a lot of the developers, the ideas that went into creating unix were enormously fascinating and implementations were completely inaccessible in Europe thanks to the USL vs BSDi lawsuit).

The reason for discussing developer driven communities is very simple: they’re the predominant type of community in open source (Think Linux Kernel, Gnome, KDE etc) which implies that they’re the natural type of community that forms around shared code collaboration.  In this model of interaction, community and code are interlinked: Caring for the code means you also care for the community.  The health of this type of developer community is very easily checked: ask how many contributors would still contribute to the project if they weren’t paid to do it (reduction in patch volume doesn’t matter, just the desire to continue sending patches).  If fewer than 50% of the core contributors would cease contributing if they weren’t paid then the community is unhealthy.

Developer driven communities suffer from three specific drawbacks:

  1. They’re fractious: people who care about stuff tend to spend a lot of time arguing about it.  Usually some form of self organising leadership fixes a significant part of this, but it’s not guaranteed.
  2. Since the code is built by developers for developers (which is why they care about it) there’s no room for users who aren’t also developers in this model.
  3. The community is informal so there’s no organisation for corporations to have a peer relationship with; plus, developers don’t tend to trust corporate motives anyway, making it very difficult for corporations to join the community.

Trusting Corporations and Involving Users

Developer communities often distrust the motives of corporations because they think corporations don’t care about the code in the same way as developers do.  This is actually completely true: developers care about code for its own sake but corporations care about code only as far as it furthers their business interests.  However, this business interest motivation does provide the basis for trust within the community: as long as the developer community can see and understand the business motivation, they can trust the Corporation to do the right thing; within limits, of course, for instance code quality requirements of developers often conflict with time to release requirements for market opportunity.  This shared interest in the code base becomes the framework for limited trust.

Enter the community manager:  A community manager’s job, when executed properly, is twofold: one is to take corporate business plans and realign them so that some of the corporate goals align with those of useful open source communities and the second is to explain this goal alignment to the relevant communities.  This means that a good community manager never touts corporate “community credentials” but instead explains in terms developers can understand the business reasons why the community and the corporation should work together.  Once the goals are visible and aligned, the developer community will usually welcome the addition of paid corporate developers to work on the code.  Paying for contributions is the most effective path for Corporations to exert significant influence on the community and assurance of goal alignment is how the community understands how this influence benefits the community.

Involving users is another benefit corporations can add to the developer ecosystem.  Users who aren't developers don't have the technical skills necessary to make their voices and opinions heard within the developer-driven community, but corporations, which usually have paying users consuming the packaged code in some form, can respond to user input and act as a proxy between the user base and the developer community.  For some corporations, responding to user feedback that enhances uptake of the product is a natural business goal.  For others, it could be a goal the community manager pushes for within the corporation, as something that would help the business and could be aligned with the developer community.  In either case, as long as the motives and goals are clearly understood, the corporation can exert influence in the community directly on behalf of users.

Corporate Fear around Community Code

All corporations have a significant worry about investing in something they don't control. However, these worries become definite fears around community code, because not only might it take a significant investment to exert the needed influence, but there's also the possibility that the investment might enable a competitor to secure market advantage.

Another big potential fear is loss of intellectual property in the form of patent grants.  Specifically, permissive licences with patent grants allow any other entity to take the code on which the patent reads, incorporate it into a proprietary code base, and then claim the benefit of the patent grant under the licence.  This problem, essentially, means that, unless it doesn't care about IP leakage (or the benefit gained outweighs the problem caused), no corporation should contribute code to which it owns patents under a permissive licence with a patent grant.

Both of these fears are significant drivers of “privatisation”, the behaviour whereby a corporation takes community code but does all of its enhancements and modifications in secret and never contributes them back to the community, under the assumption that bearing the forking cost of doing this is less onerous than the problems above.

GPL is the Key to Allaying these Fears

The IP leak fear is easily allayed: whether or not the version of the GPL in question includes an explicit or implicit patent licence, the IP can only leak as far as the code can go, and the code cannot be included in a proprietary product because of the reciprocal code release requirements.  Thus the corporation always has visibility into how far the IP rights might leak, by following the licence-mandated code releases.

GPL cannot entirely allay the fear of being out-competed with your own code, but it can, at least, ensure that if a competitor is using a modification of your code, you know about it (as does your competition), so everyone has a level playing field.  Most customers tend to prefer active participants in open code bases, so to be competitive in the marketplace, corporations using the same code base tend to contribute actively.  The reciprocal requirements of GPL provide assurance that no one can go to market with a secret modification of the code base that they haven't shared with others.  Therefore, although corporations would prefer dominance and control, they're prepared to settle for a fully level playing field, which the GPL provides.

Finally, from the community’s point of view, reciprocal licences prevent code privatisation (you can still work from a fork, but you must publish it) and thus encourage code sharing which tends to be a key community requirement.

Conclusions

In this first part, I conclude that the GPL, by ensuring fairness between mutually distrustful contributors and stemming IP leaks, can act as a guarantor of a workable code ecosystem for both developers and corporations and, by using the natural desire of corporations to appeal to customers, can use corporations to bridge the gap between user requirements and the developer community.

In the second part of this series, I'll look at philosophy and governance and why GPL creates self-regulating ecosystems which give corporations and users a useful function while not constraining the natural desire of developers to contribute, and contrast this with other possible ecosystem models.

January 03, 2018 11:43 PM

January 02, 2018

Pete Zaitcev: The gdm spamming logs in F27, RHbz#1322588

Speaking of the futility of reporting bugs, check out the 1322588. Basically, gdm tries to adjust the screen brightness when a user is already logged in on that screen (fortunately, it fails). Fedora users report the bug, the maintainer asks them to report it upstream. They report it upstream. The upstream tinkers with something tangentially related and closes the bug. The maintainer closes the bug in Fedora. The issue is not fixed, users re-open the bug, and the process continues. This has been going on for close to two years now. I don't know why the GNOME upstream cannot program gdm not to screw with the screen after the very same gdm has logged a user in. It's beyond stupid, and I don't know what can be done. I can buy a Mac, I suppose.

UPDATE:

-- Comment #71 from Cédric Bellegarde
Simple workaround:
- Disable auto brightness in your gnome session
- Logout and stop gdm
- Copy ~/.config/dconf/user to /var/lib/gdm/.config/dconf/user

UPDATE 2018-01-10:

-- Comment #75 from Bastien Nocera
Maybe you can do something there, instead of posting passive aggressive blog entries.

Back when Bastien maintained xine, we enjoyed a cordial working relationship, but I guess that does not count for anything.

January 02, 2018 11:36 PM

Pete Zaitcev: No more VLAN in the home network

Thanks to Fedora dropping 32-bit x86 (i686) support in F27, I had no choice but to upgrade the home router. I used this opportunity to get rid of VLANs and return to a conventional setup with 4 Ethernet ports. The main reason is, VLANs were not entirely stable in Fedora. Yes, they mostly worked, but I could never be sure that they would continue to work. Also, "mostly" in this context means, for example, that some time around F24 the boot-up process started hanging on the "Starting the LSB Networking" job for about a minute. It never was worth the trouble raising any bugs or tickets with upstreams; I never was able to resolve a single one of them. Not in Zebra, not in radvd, not in NetworkManager. Besides, if something is broken, I need a solution right now, not when developers turn it around. I suppose VLANs could be all right if I stuck to initscripts, but I needed NetworkManager to interact properly with the upstream ISP at some point. So, whatever. Fedora cost me $150 for the router and killed my VLAN setup.

I looked at ARM routers, but there was nothing. Or, nothing affordable that was SBSA and RHEL compatible. Sorry, ARM, you're still immature. Give me a call when you grow up.

Buying from a Chinese vendor was a mostly typical experience. They try to do good, but... Look at the questions about the console pinout on Amazon. The official answer is, "Hello,the pinouts is 232." Yes, really. When I tried to contact them by e-mail, they sent me a bunch of pictures that included pinouts for Ethernet RJ-45, a pinout for a motherboard header, and a photograph of a Cisco console cable. No, they don't use the Cisco pinout. Instead, they use DB9 pin numbers on RJ-45 (obviously, pin 9 is not connected). It was easy to figure out using a multimeter, but I thought I'd ask properly first. The result was very stereotypical.

P.S. The bright green light is blink(1), a Christmas present from my daughter. I'm not yet using it to its full potential. The problem is, if it only shows a static light, it cannot indicate if the router hangs or fails to boot. It needs some kind of daemon job that constantly changes it.

P.P.S. The SG200 is probably going into the On-Q closet, where it may actually come useful.

P.P.P.S. There's a PoE injector under the white cable loop somewhere. It powers a standalone Cisco AP, a 1040 model.

January 02, 2018 11:08 PM

January 01, 2018

Paul E. Mc Kenney: 2017 Year-End Advice

One of the occupational hazards of being an old man is the urge to provide unsolicited advice on any number of topics. This time, the topic is weight lifting.

Some years ago, I decided to start lifting weights. My body no longer tolerated running, so I had long since substituted various low-impact mechanical means of aerobic exercise. But there was growing evidence that higher muscle mass is a good thing as one ages, so I figured I should give it a try. This posting lists a couple of my mistakes, which could enable you to avoid them, which in turn could enable you to make brand-spanking new mistakes of your very own design!

The first mistake resulted in sporadic pains in my left palm and wrist, which appeared after many months of upper-body weight workouts. In my experience, at my age, any mention of this sort of thing to medical professionals will result in a tentative diagnosis of arthritis, with the only prescription being continued observation. This experience motivated me to do a bit of self-debugging beforehand, which led me to notice that the pain was only in my left wrist and only in the center of my left palm. This focused my attention on my two middle fingers, especially the one on which I have been wearing a wedding ring pretty much non-stop since late 1985. (Of course, those prone to making a certain impolite hand gesture might have reason to suspect their middle finger.)

So I tried removing my wedding ring. I was unable to do so, even after soaking my hand for some minutes in a bath of water, soap, and ice. This situation seemed like a very bad thing, regardless of what might be causing the pain. I therefore consulted my wife, who suggested a particular jewelry store. Shortly thereafter, I was sitting in a chair while a gentleman used a tiny but effective hand-cranked circular saw to cut through the ring and a couple pairs of pliers to open it up. The gentleman was surprised that it took more than ten turns of the saw to cut through the ring, in contrast to the usual three turns. Apparently wearing a ring for more than 30 years can cause it to work harden.

The next step was for me to go without a ring for a few weeks to allow my finger to decide what size it wanted to be, now that it had a choice. They gave me back the cut-open ring, which I carried in my pocket. Coincidence or not, during that time the pains in my wrists and palms vanished. Later, the jewelry store resized the ring.

I now remove my ring every night. If you take up any sort of weight lifting involving use of your hands, I recommend that you also remove any rings you might wear, just to verify that you still can.

My second mistake was to embark upon a haphazard weight-lifting regime. I felt that this was OK because I wasn't training for anything other than advanced age, so that any imbalances should be fairly easily addressed.

My body had other ideas, especially in connection with the bout of allergy/asthma/sinusitis/bronchitis/whatever that I have (knock on wood) mostly recovered from. This condition of course results in coughing, in which the muscles surrounding your chest work together to push air out of your lungs as abruptly and quickly as humanly possible. (Interestingly enough, the maximum velocity of cough-driven air seems to be subject to great dispute, perhaps because it is highly variable and because there are so many different places you could measure it.)

The maximum-effort nature of a cough is just fine if your various chest muscles are reasonably evenly matched. Unfortunately, I had not concerned myself with the effects of my weight-lifting regime on my ability to cough, so I learned the hard way that the weaker muscles might object to this treatment, and make their objections known by going into spasms. Spasms involving one's back can be surprisingly difficult to pin down, but for me, otherwise nonsensical shooting pains involving the neck and head are often due to something in my back. I started some simple and gentle back exercises, and also indulged in Warner Brothers therapy, which involves sitting in an easy chair watching Warner Brothers cartoons, assisted by a heating pad lent by my wife.

In summary, if you are starting weight training, (1) take an organized approach and (2) remove any rings you are wearing at least once a week.

Other than that, have a very happy new year!!!

January 01, 2018 02:29 AM

December 29, 2017

Dave Airlie (blogspot): radv and vega conformance test status

We've been passing the Vulkan conformance test suite 1.0.2 mustpass list on radv for quite a while now on the CIK/VI/Polaris cards. However, Vega hadn't achieved the same pass rate.

With a bunch of fixes I pushed this morning and one fix for all GPUs, we now have the same pass rate on all GPUs and 0 fails.

This means Vega on radv can now be submitted for conformance under Vulkan 1.0. Not sure when I'll get time to do the paperwork, maybe early next year sometime.

December 29, 2017 03:24 AM

December 26, 2017

Pavel Machek: PostmarketOS and digital cameras

I did some talking in 2017. If you want to learn about postmarketOS (in Czech), there's a recording at https://www.superlectures.com/openalt2017/telefonni-revoluce-se-blizi . My ELCE talk about the status of phone cameras is at https://www.youtube.com/watch?v=fH6zuK2OOVU .

December 26, 2017 10:59 AM

December 23, 2017

Paul E. Mc Kenney: Book review: "Engineering Reminiscences" and "Tears of Stone"

I believe that Charles T. Porter's “Engineering Reminiscences“ was a gift from my grandfather, who was himself a machinist. Porter's most prominent contribution was the high-speed steam engine, that is to say, a steam engine operating at more than about 100 RPM. Although steam engines and their governors proved to be somewhat of a dead end, some of his dynamic balancing techniques are still in use.

Technology changes, people and organizations not so much. Chapter XVII starting on page 189 describes a demonstration of two of his new high-speed steam engines (one operating at 150 RPM, the other at 300 RPM) along with one of his colleague's new boilers at the 1870 Fair of the American Institute in New York. The boiler ran slanted water tubes through the firebox to more efficiently separate steam from the remaining water. The engines were small by 1870s standards, one having 16-inch diameter cylinders with a 30-inch stroke and the other having 6-inch diameter cylinders with a 12-inch stroke.

Other exhibitors also had boilers and steam engines, and yet other exhibitors had equipment driven by steam engines. All the boilers and steam engines were connected, but given that steam engines were, then as now, considered to be way cooler than mere boilers, it should not be too surprising that the boilers could not produce enough steam to keep all the engines running. In fact, by the end of the day, the steam pressure had dropped by half, resulting in great consternation and annoyance all around. The finger of suspicion quickly pointed at Porter's two high-speed steam engines—after all, great speed clearly must imply equally great consumption of steam, right?

Porter had anticipated this situation, and had therefore installed a shutoff valve that isolated the boiler and his two high-speed steam engines from the rest of the Fair's equipment. Porter therefore closed his valve, with the result that the steam pressure within his little steam network immediately rose to 70 PSI and the pressure to the rest of the network dropped to 25 PSI. In fact, the boiler generated excess steam even at 70 PSI, so that the fireman had to leave the firebox door slightly open to artificially lower the boiler temperature.

The steam pressure to the rest of the fair continued to decrease until it was but 15 PSI. Over the noon hour, an additional boiler was installed, which brought the pressure up to 70 PSI. Restarting the steam engines of course reduced the pressure, but at 5PM it was still 25 PSI.

The superintendent of the machinery department had repeatedly asked Porter to reopen the valve, but each time Porter had refused. At 5PM, the superintendent made it clear that his request was now a demand, and that if Porter would not open the valve, the superintendent would open it for him. Porter finally agreed to open the valve, but only on the condition that the other managers of the institute verify that the boiler was in fact generating more than enough steam for both engines. These managers were summoned forthwith, and they agreed that the boiler had been producing most of the show's steam and that the pair of high-speed steam engines had been consuming very little. Porter opened the valve, and there was no further trouble with low-pressure steam.

It is all too easy to imagine a roughly similar story unfolding in today's world. ;–)

Porter went on to develop steam engines capable of running well in excess of 1,000 RPM, with one key challenge being convincing onlookers that the motion-blurred engine really was running that fast.

Interestingly enough, steam engines were Porter's third career. He was a lawyer for several years, but became disgusted with legal practice. At about that same time, he became quite interested in the problem of facing stone, that is, producing a machine that would take a rough-cut stone and give it a smooth planar face (smooth by the standards of the mid-1800s, anyway). After a couple of years of experimentation, he produced a steam-powered machine that efficiently faced stone. Unfortunately, at about that same time, others realized that saws could even more efficiently face stone, so his invention was what we might now call a technical success and a business failure.

Oddly enough, we have recently learned that the application of saws to stone was not an invention of the mid-1800s, but rather a re-invention of a technique used heavily in the ancient Roman Empire, and suspected of having been used as early as the 13th century BC. This is one of many interesting nuggets on life in the Roman Empire brought out by the historical novel “Tears of Stone” by Vannoy and Zeigler. This novel is informed by Zeigler's application of Cold War remote-sensing technology to interesting areas of the Italian landscape, a fact that I had the privilege of learning directly from Zeigler himself.

On the other hand, perhaps Porter's ghost can console himself with the fact that the earliest stone saws were hand-powered, and those of the Roman Empire were water powered. Porter's stone-facing machine was instead powered by modern steam engines. Yes, the ancient Egyptians also made some use of steam power, but as far as we know they never applied it industrially, and never via a reciprocating engine driving a rotary shaft. And yes, all of the qualifiers in the preceding sentence are necessary.

As we learn more about ancient civilizations, it will be interesting to see what other “modern inventions” turn out to have deep roots in ancient times!

December 23, 2017 10:29 PM

Paul E. Mc Kenney: Book review: "Make Trouble"

This book, by John Waters of “Hairspray” fame, was an impulse purchase. After all, who could fail to notice a small pink book with large white textured letters saying “Make Trouble”? It is a transcription of Waters's commencement address to the Rhode Island School of Design's Class of 2015. Those who have known me over several decades might be surprised by this purchase, but what old man could resist a book whose flyleaf states “Anyone embarking on a creative path, he tells us, would do well to realize that pragmatism and discipline are as important as talent and that rejection is nothing to fear.”

They might be even more surprised that I agree with much of his advice. For but three examples:


  1. “A career in the arts is like a hitchhiking trip: All you need is one person to say ‘get in,’ and off you go.” Not really any different from my advising people to use the “high-school boy” algorithm when submitting papers and proposals.
  2. “Keep up with what's causing chaos in your field.” Not really any different from my “Go where there is trouble!”
  3. “Listen to your political enemies, particularly the smart ones”. Me, I would omit the word “political”, but close enough.

The book is mostly pictures, so if you are short of money, you do have the option of just reading it in the bookstore. See, I am making trouble already! ;–)

December 23, 2017 08:46 PM

December 21, 2017

Matthew Garrett: When should behaviour outside a community have consequences inside it?

Free software communities don't exist in a vacuum. They're made up of people who are also members of other communities, people who have other interests and engage in other activities. Sometimes these people engage in behaviour outside the community that may be perceived as negatively impacting communities that they're a part of, but most communities have no guidelines for determining whether behaviour outside the community should have any consequences within the community. This post isn't an attempt to provide those guidelines, but aims to provide some things that community leaders should think about when the issue is raised.

Some things to consider

Did the behaviour violate the law?

This seems like an obvious bar, but it turns out to be a pretty bad one. For a start, many things that are commonly accepted behaviour in various communities may be illegal (eg, reverse engineering work may contravene a strict reading of US copyright law), and taking this to an extreme would result in expelling anyone who's ever broken a speed limit. On the flipside, refusing to act unless someone broke the law is also a bad threshold - much behaviour that communities consider unacceptable may be entirely legal.

There's also the problem of determining whether a law was actually broken. The criminal justice system is (correctly) biased to an extent in favour of the defendant - removing someone's rights in society should require meeting a high burden of proof. However, this is not the threshold that most communities hold themselves to in determining whether to continue permitting an individual to associate with them. An incident that does not result in a finding of criminal guilt (either through an explicit finding or a failure to prosecute the case in the first place) should not be ignored by communities for that reason.

Did the behaviour violate your community norms?

There's plenty of behaviour that may be acceptable within other segments of society but unacceptable within your community (eg, lobbying for the use of proprietary software is considered entirely reasonable in most places, but rather less so at an FSF event). If someone can be trusted to segregate their behaviour appropriately then this may not be a problem, but that's probably not sufficient in all cases. For instance, if someone acts entirely reasonably within your community but engages in lengthy anti-semitic screeds on 4chan, it's legitimate to question whether permitting them to continue being part of your community serves your community's best interests.

Did the behaviour violate the norms of the community in which it occurred?

Of course, the converse is also true - there's behaviour that may be acceptable within your community but unacceptable in another community. It's easy to write off someone acting in a way that contravenes the standards of another community but wouldn't violate your expected behavioural standards - after all, if it wouldn't breach your standards, what grounds do you have for taking action?

But you need to consider that if someone consciously contravenes the behavioural standards of a community they've chosen to participate in, they may be willing to do the same in your community. If pushing boundaries is a frequent trait then it may not be too long until you discover that they're also pushing your boundaries.

Why do you care?

A community's code of conduct can be looked at in two ways - as a list of behaviours that will be punished if they occur, or as a list of behaviours that are unlikely to occur within that community. The former is probably the primary consideration when a community adopts a CoC, but the latter is how many people considering joining a community will think about it.

If your community includes individuals that are known to have engaged in behaviour that would violate your community standards, potential members or contributors may not trust that your CoC will function as adequate protection. A community that contains people known to have engaged in sexual harassment in other settings is unlikely to be seen as hugely welcoming, even if they haven't (as far as you know!) done so within your community. The way your members behave outside your community is going to be seen as saying something about your community, and that needs to be taken into account.

A second (and perhaps less obvious) aspect is that membership of some higher profile communities may be seen as lending general legitimacy to someone, and they may play off that to legitimise behaviour or views that would be seen as abhorrent by the community as a whole. If someone's anti-semitic views (for example) are seen as having more relevance because of their membership of your community, it's reasonable to think about whether keeping them in your community serves the best interests of your community.

Conclusion

I've said things like "considered" or "taken into account" a bunch here, and that's for a good reason - I don't know what the thresholds should be for any of these things, and there doesn't seem to be even a rough consensus in the wider community. We've seen cases in which communities have acted based on behaviour outside their community (eg, Debian removing Jacob Appelbaum after it was revealed that he'd sexually assaulted multiple people), but there's been no real effort to build a meaningful decision making framework around that.

As a result, communities struggle to make consistent decisions. It's unreasonable to expect individual communities to solve these problems on their own, but that doesn't mean we can ignore them. It's time to start coming up with a real set of best practices.

comment count unavailable comments

December 21, 2017 10:09 AM

December 14, 2017

Matthew Garrett: The Intel ME vulnerabilities are a big deal for some people, harmless for most

(Note: all discussion here is based on publicly disclosed information, and I am not speaking on behalf of my employers)

I wrote about the potential impact of the most recent Intel ME vulnerabilities a couple of weeks ago. The details of the vulnerability were released last week, and it's not absolutely the worst case scenario but it's still pretty bad. The short version is that one of the (signed) pieces of early bringup code for the ME reads an unsigned file from flash and parses it. Providing a malformed file could result in a buffer overflow, and a moderately complicated exploit chain could be built that allowed the ME's exploit mitigation features to be bypassed, resulting in arbitrary code execution on the ME.

Getting this file into flash in the first place is the difficult bit. The ME region shouldn't be writable at OS runtime, so the most practical way for an attacker to achieve this is to physically disassemble the machine and directly reprogram it. The AMT management interface may provide a vector for a remote attacker to achieve this - for this to be possible, AMT must be enabled and provisioned and the attacker must have valid credentials[1]. Most systems don't have provisioned AMT, so most users don't have to worry about this.

Overall, for most end users there's little to worry about here. But the story changes for corporate users or high value targets who rely on TPM-backed disk encryption. The way the TPM protects access to the disk encryption key is to insist that a series of "measurements" are correct before giving the OS access to the disk encryption key. The first of these measurements is obtained through the ME hashing the first chunk of the system firmware and passing that to the TPM, with the firmware then hashing each component in turn and storing those in the TPM as well. If someone compromises a later point of the chain then the previous step will generate a different measurement, preventing the TPM from releasing the secret.
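
The measurement chain described here is the TPM's PCR “extend” operation: each new measurement is hashed into a running value, so no later step can overwrite an earlier one. A minimal sketch in Python (the component names are invented for illustration, and a SHA-256 PCR bank is assumed):

```python
import hashlib

def extend(pcr: bytes, component: bytes) -> bytes:
    # TPM PCR extend: new_pcr = H(old_pcr || H(component))
    return hashlib.sha256(pcr + hashlib.sha256(component).digest()).digest()

pcr = bytes(32)  # SHA-256 PCR reset value: all zeros
for blob in (b"first firmware chunk", b"bootloader", b"kernel"):
    pcr = extend(pcr, blob)

# Tamper with any link in the chain and the final value changes,
# so the TPM will not release the sealed disk encryption key.
tampered = bytes(32)
for blob in (b"first firmware chunk", b"evil bootloader", b"kernel"):
    tampered = extend(tampered, blob)

assert pcr != tampered
```

Because the old value is folded into every subsequent hash, the only way to reproduce a “good” final value over tampered firmware is to control the code doing the measuring in the first place.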

However, if the first step in the chain can be compromised, all these guarantees vanish. And since the first step in the chain relies on the ME to be running uncompromised code, this vulnerability allows that to be circumvented. The attacker's malicious code can be used to pass the "good" hash to the TPM even if the rest of the firmware has been tampered with. This allows a sufficiently skilled attacker to extract the disk encryption key and read the contents of the disk[2].

In addition, TPMs can be used to perform something called "remote attestation". This allows the TPM to provide a signed copy of the recorded measurements to a remote service, allowing that service to make a policy decision around whether or not to grant access to a resource. Enterprises using remote attestation to verify that systems are appropriately patched (eg) before they allow them access to sensitive material can no longer depend on those results being accurate.
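
A service consuming remote attestation is, in effect, checking a signature over the recorded measurements against an expected set. A toy sketch of that check (an HMAC with a shared secret stands in for the TPM's asymmetric attestation key; all names here are invented for illustration):

```python
import hashlib
import hmac

ATTESTATION_KEY = b"demo-shared-secret"  # a real TPM uses an asymmetric attestation key

def quote(pcr_values: list) -> tuple:
    """What the TPM would produce: a digest of the PCRs plus a signature over it."""
    digest = hashlib.sha256(b"".join(pcr_values)).digest()
    return digest, hmac.new(ATTESTATION_KEY, digest, hashlib.sha256).digest()

def verify(digest: bytes, sig: bytes, expected_pcrs: list) -> bool:
    # The service checks both that the signature is genuine and that the
    # reported measurements match the expected (patched, untampered) set.
    ok_sig = hmac.compare_digest(sig, hmac.new(ATTESTATION_KEY, digest, hashlib.sha256).digest())
    ok_val = digest == hashlib.sha256(b"".join(expected_pcrs)).digest()
    return ok_sig and ok_val

good = [bytes(32), b"\x01" * 32]
d, s = quote(good)
assert verify(d, s, good)                          # matching measurements pass
assert not verify(d, s, [bytes(32), b"\x02" * 32]) # mismatched measurements fail
```

The PTT problem is visible even in this sketch: anyone who extracts the signing key can produce a valid signature over any digest they like, so the verifier can no longer distinguish the real machine from an impersonator.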

Things are even worse for people relying on Intel's Platform Trust Technology (PTT), which is an implementation of a TPM that runs on the ME itself. Since this vulnerability allows full access to the ME, an attacker can obtain all the private key material held in the PTT implementation and, effectively, adopt the machine's cryptographic identity. This allows them to impersonate the system with arbitrary measurements whenever they want to. This basically renders PTT worthless from an enterprise perspective - unless you've maintained physical control of a machine for its entire lifetime, you have no way of knowing whether it's had its private keys extracted and so you have no way of knowing whether the attestation attempt is coming from the machine or from an attacker pretending to be that machine.

Bootguard, the component of the ME that's responsible for measuring the firmware into the TPM, is also responsible for verifying that the firmware has an appropriate cryptographic signature. Since that can be bypassed, an attacker can reflash modified firmware that can do pretty much anything. Yes, that probably means you can use this vulnerability to install Coreboot on a system locked down using Bootguard.

(An aside: The Titan security chips used in Google Cloud Platform sit between the chipset and the flash and verify the flash before permitting anything to start reading from it. If an attacker tampers with the ME firmware, Titan should detect that and prevent the system from booting. However, I'm not involved in the Titan project and don't know exactly how this works, so don't take my word for this)

Intel have published an update that fixes the vulnerability, but it's pretty pointless - there's apparently no rollback protection in the affected 11.x MEs, so while the attacker is modifying your flash to insert the payload they can just downgrade your ME firmware to a vulnerable version. Version 12 will reportedly include optional rollback protection, which is little comfort to anyone who has current hardware. Basically, anyone whose threat model depends on the low-level security of their Intel system is probably going to have to buy new hardware.

This is a big deal for enterprises and any individuals who may be targeted by skilled attackers who have physical access to their hardware, and entirely irrelevant for almost anybody else. If you don't know that you should be worried, you shouldn't be.

[1] Although admins should bear in mind that any system that hasn't been patched against CVE-2017-5689 considers an empty authentication cookie to be a valid credential

[2] TPMs are not intended to be strongly tamper resistant, so an attacker could also just remove the TPM, decap it and (with some effort) extract the key that way. This is somewhat more time consuming than just reflashing the firmware, so the ME vulnerability still amounts to a change in attack practicality.


December 14, 2017 01:31 AM

December 12, 2017

Matthew Garrett: Eben Moglen is no longer a friend of the free software community

(Note: While the majority of the events described below occurred while I was a member of the board of directors of the Free Software Foundation, I am no longer. This is my personal position and should not be interpreted as the opinion of any other organisation or company I have been affiliated with in any way)

Eben Moglen has done an amazing amount of work for the free software community, serving on the board of the Free Software Foundation and acting as its general counsel for many years, leading the drafting of GPLv3 and giving many forceful speeches on the importance of free software. However, his recent behaviour demonstrates that he is no longer willing to work with other members of the community, and we should reciprocate that.

In early 2016, the FSF board became aware that Eben was briefing clients on an interpretation of the GPL that was incompatible with the FSF's. He later released this position publicly with little coordination with the FSF, and Canonical used it to justify shipping ZFS in a GPL-violating way. He had provided similar advice to Debian, who were confused by the apparent conflict between the FSF's position and Eben's.

This situation was obviously problematic - Eben is clearly free to provide whatever legal opinion he holds to his clients, but his very public association with the FSF caused many people to assume that these positions were held by the FSF and the FSF were forced into the position of publicly stating that they disagreed with legal positions held by their general counsel. Attempts to mediate this failed, and Eben refused to commit to working with the FSF on avoiding this sort of situation in future[1].

Around the same time, Eben made legal threats towards another project with ties to FSF. These threats were based on a license interpretation that ran contrary to how free software licenses had been interpreted by the community for decades, and was made without any prior discussion with the FSF (2017-12-11 update: page 126 of this document includes the email in which Eben asserts that the Software Freedom Conservancy is engaging in plagiarism by making use of appropriately credited material released under a Creative Commons license). This, in conjunction with his behaviour over the ZFS issue, led to him stepping down as the FSF's general counsel.

Throughout this period, Eben disparaged FSF staff and other free software community members in various semi-public settings. In doing so he harmed the credibility of many people who have devoted significant portions of their lives to aiding the free software community. At Libreplanet earlier this year he made direct threats against an attendee - this was reported as a violation of the conference's anti-harassment policy.

Eben has acted against the best interests of an organisation he publicly represented. He has threatened organisations and individuals who work to further free software. His actions are no longer to the benefit of the free software community and the free software community should cease associating with him.

[1] Contrary to the claim provided here, Bradley was not involved in this process.

(Edit to add: various people have asked for more details of some of the accusations here. Eben is influential in many areas, and publicising details without the direct consent of his victims may put them at professional risk. I'm aware that this reduces my credibility, and it's entirely reasonable for people to choose not to believe me as a result. I will add that I said much of this several months ago, so I'm not making stuff up in response to recent events)


December 12, 2017 05:59 AM

December 05, 2017

Pete Zaitcev: Marcan: Debugging an evil Go runtime bug

Fascinating, and a few reactions spring to mind.

First, I have to admit, the resolution simultaneously blew me away and was very nostalgic. Forgetting that some instructions are not atomic is just the sort of mistake I saw people commit in kernel architecture support (I don't remember whether I ever made it myself; it's quite possible, even on sun4c).

Also, my (former) colleague DaveJ (who's now consumed by Facebook -- I remember complaints about useful people "gone to Google and never heard from again", but Facebook is the same hole nowadays) once said, approximately: "Everyone loves to crap on Gentoo hackers for silly optimizations and being otherwise unprofessional, but when it's something interesting it's always (or often) them." Gentoo crew is underrated, including their userbase.

And finally:

Go also happens to have a (rather insane, in my opinion) policy of reinventing its own standard library, so it does not use any of the standard Linux glibc code to call vDSO, but rather rolls its own calls (and syscalls too).

Usually you hear about this when their DNS resolver blows up, but it can be elsewhere, as in this case.

(h/t to a chatter in #animeblogger)

UPDATE: CKS adds that some UNIXen officially require applications to use libc.

December 05, 2017 08:20 PM

November 28, 2017

Matthew Garrett: Potential impact of the Intel ME vulnerability

(Note: this is my personal opinion based on public knowledge around this issue. I have no knowledge of any non-public details of these vulnerabilities, and this should not be interpreted as the position or opinion of my employer)

Intel's Management Engine (ME) is a small coprocessor built into the majority of Intel CPU chipsets[0]. Older versions were based on the ARC architecture[1] running an embedded realtime operating system, but from version 11 onwards they've been small x86 cores running Minix. The precise capabilities of the ME have not been publicly disclosed, but it is at minimum capable of interacting with the network[2], display[3], USB, input devices and system flash. In other words, software running on the ME is capable of doing a lot, without requiring any OS permission in the process.

Back in May, Intel announced a vulnerability in the Advanced Management Technology (AMT) that runs on the ME. AMT offers functionality like providing a remote console to the system (so IT support can connect to your system and interact with it as if they were physically present), remote disk support (so IT support can reinstall your machine over the network) and various other bits of system management. The vulnerability meant that it was possible to log into systems with AMT enabled using an empty authentication token, without knowing the configured password.

This vulnerability was less serious than it could have been for a couple of reasons - the first is that "consumer"[4] systems don't ship with AMT, and the second is that AMT is almost always disabled (Shodan found only a few thousand systems on the public internet with AMT enabled, out of many millions of laptops). I wrote more about it here at the time.

How does this compare to the newly announced vulnerabilities? Good question. Two of the announced vulnerabilities are in AMT. The previous AMT vulnerability allowed you to bypass authentication, but restricted you to doing what AMT was designed to let you do. While AMT gives an authenticated user a great deal of power, it's also designed with some degree of privacy protection in mind - for instance, when the remote console is enabled, an animated warning border is drawn on the user's screen to alert them.

This vulnerability is different in that it allows an authenticated attacker to execute arbitrary code within the AMT process. This means that the attacker shouldn't have any capabilities that AMT doesn't, but it's unclear where various aspects of the privacy protection are implemented - for instance, if the warning border is implemented in AMT rather than in hardware, an attacker could duplicate that functionality without drawing the warning. If the USB storage emulation for remote booting is implemented as a generic USB passthrough, the attacker could pretend to be an arbitrary USB device and potentially exploit the operating system through bugs in USB device drivers. Unfortunately we don't currently know.

Note that this exploit still requires two things - first, AMT has to be enabled, and second, the attacker has to be able to log into AMT. If the attacker has physical access to your system and you don't have a BIOS password set, they will be able to enable it - however, if AMT isn't enabled and the attacker isn't physically present, you're probably safe. But if AMT is enabled and you haven't patched the previous vulnerability, the attacker will be able to access AMT over the network without a password and then proceed with the exploit. This is bad, so you should probably (1) ensure that you've updated your BIOS and (2) ensure that AMT is disabled unless you have a really good reason to use it.
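If you want a quick, non-authoritative check of whether AMT appears to be provisioned on a machine, its web management interface listens on a small set of well-known TCP ports. The sketch below just probes those ports from the network side; a closed port is not proof that AMT is disabled (and an open port is worth investigating), so treat this as triage, not verification.

```python
import socket

# Ports used by AMT's web/management interface when provisioned:
# 16992/16993 (HTTP/HTTPS web UI), 16994/16995 (redirection services).
AMT_PORTS = (16992, 16993, 16994, 16995)

def amt_ports_open(host: str, timeout: float = 2.0) -> list:
    """Return the subset of AMT ports accepting TCP connections on host."""
    open_ports = []
    for port in AMT_PORTS:
        try:
            with socket.create_connection((host, port), timeout=timeout):
                open_ports.append(port)
        except OSError:
            # Connection refused or timed out: port not reachable.
            pass
    return open_ports

if __name__ == "__main__":
    print(amt_ports_open("127.0.0.1", timeout=0.5))
```

This tells you only what is reachable over the network; actually disabling AMT (and confirming the CVE-2017-5689 fix) has to be done through the firmware setup or Intel's provisioning tools.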

The AMT vulnerability applies to a wide range of versions, everything from version 6 (which shipped around 2008) and later. The other vulnerability that Intel describe is restricted to version 11 of the ME, which only applies to much more recent systems. This vulnerability allows an attacker to execute arbitrary code on the ME, which means they can do literally anything the ME is able to do. This probably also means that they are able to interfere with any other code running on the ME. While AMT has been the most frequently discussed part of this, various other Intel technologies are tied to ME functionality.

Intel's Platform Trust Technology (PTT) is a software implementation of a Trusted Platform Module (TPM) that runs on the ME. TPMs are intended to protect access to secrets and encryption keys and record the state of the system as it boots, making it possible to determine whether a system has had part of its boot process modified and denying access to the secrets as a result. The most common usage of TPMs is to protect disk encryption keys - Microsoft Bitlocker defaults to storing its encryption key in the TPM, automatically unlocking the drive if the boot process is unmodified. In addition, TPMs support something called Remote Attestation (I wrote about that here), which allows the TPM to provide a signed copy of information about what the system booted to a remote site. This can be used for various purposes, such as not allowing a compute node to join a cloud unless it's booted the correct version of the OS and is running the latest firmware version. Remote Attestation depends on the TPM having a unique cryptographic identity that is tied to the TPM and inaccessible to the OS.
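The "record the state of the system as it boots" part works through the TPM's extend operation: each boot component is hashed into a Platform Configuration Register before it runs, and a PCR can only ever be extended, never set directly. A minimal sketch of the standard construction (new PCR = H(old PCR || H(event))):

```python
import hashlib

def extend(pcr: bytes, measurement: bytes) -> bytes:
    """TPM-style PCR extend: new PCR = SHA256(old PCR || SHA256(event))."""
    return hashlib.sha256(pcr + hashlib.sha256(measurement).digest()).digest()

# PCRs start zeroed at power-on; each boot stage is measured in order.
pcr = b"\x00" * 32
for stage in (b"firmware", b"bootloader", b"kernel"):
    pcr = extend(pcr, stage)

# Swapping any stage yields a different final value, so a signed copy
# of the PCRs lets a remote verifier detect boot-chain tampering.
tampered = b"\x00" * 32
for stage in (b"firmware", b"evil-bootloader", b"kernel"):
    tampered = extend(tampered, stage)

assert pcr != tampered
```

Because each value folds in the previous one, there is no sequence of further extends that can make a tampered chain converge back to the expected value, which is what gives the final PCR its evidentiary weight.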

PTT allows manufacturers to simply license some additional code from Intel and run it on the ME rather than having to pay for an additional chip on the system motherboard. This seems great, but if an attacker is able to run code on the ME then they potentially have the ability to tamper with PTT, which means they can obtain access to disk encryption secrets and circumvent Bitlocker. It also means that they can tamper with Remote Attestation, "attesting" that the system booted a set of software that it didn't or copying the keys to another system and allowing that to impersonate the first. This is, uh, bad.

Intel also recently announced Intel Online Connect, a mechanism for providing the functionality of security keys directly in the operating system. Components of this are run on the ME in order to avoid scenarios where a compromised OS could be used to steal the identity secrets - if the ME is compromised, this may make it possible for an attacker to obtain those secrets and duplicate the keys.

It's also not entirely clear how much of Intel's Software Guard Extensions (SGX) functionality depends on the ME. The ME does appear to be required for SGX Remote Attestation (which allows an application using SGX to prove to a remote site that it's the SGX app rather than something pretending to be it), and again if those secrets can be extracted from a compromised ME it may be possible to compromise some of the security assumptions around SGX. Again, it's not clear how serious this is because it's not publicly documented.

Various other things also run on the ME, including stuff like video DRM (ensuring that high resolution video streams can't be intercepted by the OS). It may be possible to obtain encryption keys from a compromised ME that allow things like Netflix streams to be decoded and dumped. From a user privacy or security perspective, these things seem less serious.

The big problem at the moment is that we have no idea what the actual process of compromise is. Intel state that it requires local access, but don't describe what kind. Local access in this case could simply require the ability to send commands to the ME (possible on any system that has the ME drivers installed), could require direct hardware access to the exposed ME (which would require either kernel access or the ability to install a custom driver) or even the ability to modify system flash (possible only if the attacker has physical access and enough time and skill to take the system apart and modify the flash contents with an SPI programmer). The other thing we don't know is whether it's possible for an attacker to modify the system such that the ME is persistently compromised or whether it needs to be re-compromised every time the ME reboots. Note that even the latter is more serious than you might think - the ME may only be rebooted if the system loses power completely, so even a "temporary" compromise could affect a system for a long period of time.

It's also almost impossible to determine if a system is compromised. If the ME is compromised then it's probably possible for it to roll back any firmware updates but still report that it's been updated, giving admins a false sense of security. The only way to determine for sure would be to dump the system flash and compare it to a known good image. This is impractical to do at scale.
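The flash-dump comparison itself is trivial once you have an externally-acquired image (e.g. read out with an SPI programmer, so a compromised ME can't lie about the contents); the hard part is obtaining the dump and a trustworthy reference. A sketch of the comparison step, with hypothetical file paths:

```python
import hashlib

def flash_digest(path: str, chunk: int = 1 << 20) -> str:
    """SHA-256 of a firmware image, read in chunks to handle large dumps."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while block := f.read(chunk):
            h.update(block)
    return h.hexdigest()

# Hypothetical paths: a dump taken with an external programmer and a
# known-good image for the same board/firmware version.
# if flash_digest("dump.bin") != flash_digest("reference.bin"):
#     print("flash contents differ from known-good image")
```

Note that a byte-for-byte mismatch isn't automatically a compromise (vendor images can differ in embedded configuration or NVRAM regions), which is part of why doing this at scale is impractical.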

So, overall, given what we know right now it's hard to say how serious this is in terms of real world impact. It's unlikely that this is the kind of vulnerability that would be used to attack individual end users - anyone able to compromise a system like this could just backdoor your browser instead with much less effort, and that already gives them your banking details. The people who have the most to worry about here are potential targets of skilled attackers, which means activists, dissidents and companies with interesting personal or business data. It's hard to make strong recommendations about what to do here without more insight into what the vulnerability actually is, and we may not know that until this presentation next month.

Summary: Worst case here is terrible, but unlikely to be relevant to the vast majority of users.

[0] Earlier versions of the ME were built into the motherboard chipset, but as portions of that were incorporated onto the CPU package the ME followed. (Edit: apparently I was wrong and it's still on the chipset)
[1] A descendant of the SuperFX chip used in Super Nintendo cartridges such as Star Fox, because why not
[2] No OS involvement is needed for wired ethernet, or for wireless networks configured in the system firmware; wireless access after the OS drivers have loaded requires OS support
[3] Assuming you're using integrated Intel graphics
[4] "Consumer" is a bit of a misnomer here - "enterprise" laptops like Thinkpads ship with AMT, but are often bought by consumers.


November 28, 2017 03:45 AM

November 26, 2017

Michael Kerrisk (manpages): Next Linux/UNIX System Programming course in Munich, 5-9 February, 2018

There are still some places free for my next 5-day Linux/UNIX System Programming course to take place in Munich, Germany, for the week of 5-9 February 2018.

The course is intended for programmers developing system-level, embedded, or network applications for Linux and UNIX systems, or programmers porting such applications from other operating systems (e.g., proprietary embedded/realtime operating systems or Windows) to Linux or UNIX. The course is based on my book, The Linux Programming Interface (TLPI), and covers topics such as low-level file I/O; signals and timers; creating processes and executing programs; POSIX threads programming; interprocess communication (pipes, FIFOs, message queues, semaphores, shared memory); and network programming (sockets).
     
The course has a lecture+lab format, and devotes substantial time to working on some carefully chosen programming exercises that put the "theory" into practice. Students receive printed and electronic copies of TLPI, along with a 600-page course book that includes all slides presented in the course. A reading knowledge of C is assumed; no previous system programming experience is needed.

Some useful links for anyone interested in the course:

Questions about the course? Email me via training@man7.org.

November 26, 2017 07:38 PM

Michael Kerrisk (manpages): man-pages-4.14 is released

I've released man-pages-4.14. The release tarball is available on kernel.org. The browsable online pages can be found on man7.org. The Git repository for man-pages is available on kernel.org.

This release resulted from patches, bug reports, reviews, and comments from 71 contributors. Nearly 400 commits changed more than 160 pages. In addition, 4 new manual pages were added.

Among the more significant changes in man-pages-4.14 are the following:

November 26, 2017 07:14 PM