Kernel Planet

February 27, 2020

James Morris: Linux Security Summit North America 2020: CFP and Registration

The CFP for the 2020 Linux Security Summit North America is currently open, and closes on March 31st.

The CFP details are here: https://events.linuxfoundation.org/linux-security-summit-north-america/program/cfp/

You can register as an attendee here: https://events.linuxfoundation.org/linux-security-summit-north-america/register/

Note that the conference this year has moved from August to June (24-26).  The location is Austin, TX, and we are co-located with the Open Source Summit as usual.

We’ll be holding a 3-day event again, after the success of last year’s expansion, which provides time for tutorials and ad-hoc break out sessions.  Please note that if you intend to submit a tutorial, you should be a core developer of the project or otherwise recognized leader in the field, per this guidance from the CFP:

Tutorial sessions should be focused on advanced Linux security defense topics within areas such as the kernel, compiler, and security-related libraries.  Priority will be given to tutorials created for this conference, and those where the presenter is a leading subject matter expert on the topic.

This will be the 10th anniversary of the Linux Security Summit, which was first held in 2010 in Boston as a one day event.

Get your proposals for 2020 in soon!

February 27, 2020 09:46 PM

February 26, 2020

Linux Plumbers Conference: Videos for microconferences

The videos for all the talks in microconferences at the 2019 edition of Linux Plumbers are now linked to the schedule. Clicking on the link titled “video” will take you to the right spot in the microconference video. Hopefully, watching all of these talks will get you excited for the 2020 edition, which we are busy preparing! Watch out for our call for microconferences and for our refereed track, both of which are to be released soon. So now’s the time to start thinking about all the exciting problems you want to discuss and solve.

February 26, 2020 03:09 PM

February 20, 2020

Matthew Garrett: What usage restrictions can we place in a free software license?

Growing awareness of the wider social and political impact of software development has led to efforts to write licenses that prevent software being used to engage in acts that are seen as socially harmful, with the Hippocratic License being perhaps the most discussed example (although the JSON license's requirement that the software be used for good, not evil, is arguably an earlier version of the theme). The problem with these licenses is that they're pretty much universally considered to fall outside the definition of free software or open source licenses due to their restrictions on use, and there's a whole bunch of people who have very strong feelings that this is a very important thing. There's also the more fundamental underlying point that it's hard to write a license like this where everyone agrees on whether a specific thing is bad or not (eg, while many people working on a project may feel that it's reasonable to prohibit the software being used to support drone strikes, others may feel that the project shouldn't have a position on the use of the software to support drone strikes and some may even feel that some people should be the victims of drone strikes). This is, it turns out, all quite complicated.

But there is something that many (but not all) people in the free software community agree on - certain restrictions are legitimate if they ultimately provide more freedom. Traditionally this was limited to restrictions on distribution (eg, the GPL requires that your recipient be able to obtain corresponding source code, and for GPLv3 must also be able to obtain the necessary signing keys to be able to replace it in covered devices), but more recently there's been some restrictions that don't require distribution. The best known is probably the clause in the Affero GPL (or AGPL) that requires that users interacting with covered code over a network be able to download the source code, but the Cryptographic Autonomy License (recently approved as an Open Source license) goes further and requires that users be able to obtain their data in order to self-host an equivalent instance.

We can construct examples of where these prevent certain fields of endeavour, but the tradeoff has been deemed worth it - the benefits to user freedom that these licenses provide are greater than the corresponding cost to what you can do. How far can that tradeoff be pushed? So, here's a thought experiment. What if we write a license that's something like the following:

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

1. All permissions granted by this license must be passed on to all recipients of modified or unmodified versions of this work
2. This work may not be used in any way that impairs any individual's ability to exercise the permissions granted by this license, whether or not they have received a copy of the covered work


This feels like the logical extreme of the argument. Any way you could use the covered work that would restrict someone else's ability to do the same is prohibited. This means that, for example, you couldn't use the software to implement a DRM mechanism that the user couldn't replace (along the lines of GPLv3's anti-Tivoisation clause), but it would also mean that you couldn't use the software to kill someone with a drone (doing so would impair their ability to make use of the software). The net effect is along the lines of the Hippocratic license, but it's framed in a way that is focused on user freedom.

To be clear, I don't think this is a good license - it has a bunch of unfortunate consequences like it being impossible to use covered code in self-defence if doing so would impair your attacker's ability to use the software. I'm not advocating this as a solution to anything. But I am interested in seeing whether the perception of the argument changes when we refocus it on user freedom as opposed to an independent ethical goal.

Thoughts?

Edit:

Rich Felker on Twitter had an interesting thought - if clause 2 above is replaced with:

2. Your rights under this license terminate if you impair any individual's ability to exercise the permissions granted by this license, even if the covered work is not used to do so

how does that change things? My gut feeling is that covering actions that are unrelated to the use of the software might be a reach too far, but it gets away from the idea that it's your use of the software that triggers the clause.


February 20, 2020 01:33 AM

February 19, 2020

Kees Cook: security things in Linux v5.4

Previously: v5.3.

Linux kernel v5.4 was released in late November. The holidays got the best of me, but better late than never! ;) Here are some security-related things I found interesting:

waitid() gains P_PIDFD
Christian Brauner has continued his pidfd work by adding a critical mode to waitid(): P_PIDFD. This makes it possible to reap child processes via a pidfd, and completes the interfaces needed for the bulk of programs performing process lifecycle management. (i.e. a pidfd can come from /proc or clone(), and can be waited on with waitid().)
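
As a rough userspace sketch (mine, not from the post), reaping a child through a pidfd might look like the following; the fallback defines cover systems whose libc headers predate these interfaces:

#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <sys/wait.h>

#ifndef P_PIDFD
#define P_PIDFD 3                /* waitid() idtype added in v5.4 */
#endif
#ifndef SYS_pidfd_open
#define SYS_pidfd_open 434       /* pidfd_open() syscall number */
#endif

int main(void)
{
        pid_t child = fork();

        if (child == 0)
                _exit(42);       /* child: exit with a recognizable status */

        /* Obtain a pidfd for the child; older glibc has no wrapper,
         * so invoke the syscall directly. */
        int pidfd = syscall(SYS_pidfd_open, child, 0);
        if (pidfd < 0) {
                perror("pidfd_open");
                exit(1);
        }

        siginfo_t info = { 0 };
        if (waitid(P_PIDFD, pidfd, &info, WEXITED) < 0) {
                perror("waitid(P_PIDFD)");
                exit(1);
        }
        printf("reaped pid %d, exit status %d\n", info.si_pid, info.si_status);
        close(pidfd);
        return 0;
}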

kernel lockdown
After something on the order of 8 years, Linux can now draw a bright line between “ring 0” (kernel memory) and “uid 0” (highest privilege level in userspace). The “kernel lockdown” feature, which has been an out-of-tree patch series in most Linux distros for almost as many years, attempts to enumerate all the intentional ways (i.e. interfaces, not flaws) userspace might be able to read or modify kernel memory (or execute in kernel space), and disable them. While Matthew Garrett made the internal details controllable in a fine-grained way, the basic lockdown LSM can be set to either disabled, “integrity” (kernel memory can be read but not written), or “confidentiality” (no kernel memory reads or writes). Beyond closing the many holes between userspace and the kernel, if new interfaces are added to the kernel that might violate kernel integrity or confidentiality, there is now a place to put the access control to make everyone happy, and there doesn’t need to be a rehashing of the age-old fight between “but root has full kernel access” and “not in some system configurations”.

tagged memory relaxed syscall ABI
Andrey Konovalov (with Catalin Marinas and others) introduced a way to enable a “relaxed” tagged memory syscall ABI in the kernel. This means programs running on hardware that supports memory tags (or “versioning”, or “coloring”) in the upper (non-VMA) bits of a pointer address can use these addresses with the kernel without things going crazy. This is effectively teaching the kernel to ignore these high bits in places where they make no sense (i.e. mathematical comparisons) and keeping them in place where they have meaning (i.e. pointer dereferences).

As an example, if a userspace memory allocator had returned the address 0x0f00000010000000 (VMA address 0x10000000, with, say, a “high bits” tag of 0x0f), and a program used this range during a syscall that ultimately called copy_from_user() on it, the initial range check would fail if the tag bits were left in place: “that’s not a userspace address; it is greater than TASK_SIZE (0x0000800000000000)!”, so they are stripped for that check. During the actual copy into kernel memory, the tag is left in place so that when the hardware dereferences the pointer, the pointer tag can be checked against the expected tag assigned to the referenced memory region. If there is a mismatch, the hardware will trigger the memory tagging protection.
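
To make that concrete, here is a small illustrative sketch (not the kernel's actual code) of an untagged_addr()-style mask applied to the example address above, with the top byte treated as the tag:

#include <stdint.h>
#include <stdio.h>

#define TAG_SHIFT 56
#define TAG_MASK  (0xffULL << TAG_SHIFT)
#define TASK_SIZE 0x0000800000000000ULL    /* the limit quoted above */

/* Drop the tag ("color") bits for range checks only. */
static inline uint64_t untag_addr(uint64_t addr)
{
        return addr & ~TAG_MASK;
}

int main(void)
{
        uint64_t tagged = 0x0f00000010000000ULL;  /* tag 0x0f, VMA 0x10000000 */

        /* The range check sees the untagged address... */
        printf("range check on %#llx: %s\n",
               (unsigned long long)untag_addr(tagged),
               untag_addr(tagged) < TASK_SIZE ? "ok" : "rejected");

        /* ...while the eventual dereference keeps the full tagged pointer,
         * so the hardware can compare the pointer tag with the memory tag. */
        printf("dereference would use %#llx\n", (unsigned long long)tagged);
        return 0;
}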

Right now programs running on Sparc M7 CPUs with ADI (Application Data Integrity) can use this for hardware tagged memory, ARMv8 CPUs can use TBI (Top Byte Ignore) for software memory tagging, and eventually there will be ARMv8.5-A CPUs with MTE (Memory Tagging Extension).

boot entropy improvement
Thomas Gleixner got fed up with poor boot-time entropy and trolled Linus into coming up with a reasonable way to add entropy on modern CPUs, taking advantage of timing noise, cycle counter jitter, and perhaps even the variability of speculative execution. This means that there shouldn’t be mysterious multi-second (or multi-minute!) hangs at boot when some systems don’t have enough entropy to service getrandom() syscalls from systemd or the like.
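
The hangs in question come from getrandom() blocking until the CRNG is initialized; a tiny illustration of that behavior (not from the original post):

#include <errno.h>
#include <stdio.h>
#include <sys/random.h>

int main(void)
{
        unsigned char buf[16];

        /* GRND_NONBLOCK fails with EAGAIN where a plain getrandom()
         * would block waiting for the entropy pool to initialize. */
        if (getrandom(buf, sizeof(buf), GRND_NONBLOCK) < 0 && errno == EAGAIN)
                printf("CRNG not ready; a plain getrandom() would block here\n");
        else
                printf("CRNG initialized; getrandom() returns immediately\n");
        return 0;
}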

userspace writes to swap files blocked
From the department of “how did this go unnoticed for so long?”, Darrick J. Wong fixed the kernel to not allow writes from userspace to active swap files. Without this, it was possible for a user (usually root) with write access to a swap file to modify its contents, thereby changing memory contents of a process once it got paged back in. While root normally could just use CAP_PTRACE to modify a running process directly, this was a loophole that allowed lesser-privileged users (e.g. anyone in the “disk” group) without the needed capabilities to still bypass ptrace restrictions.

limit strscpy() sizes to INT_MAX
Generally speaking, if a size variable ends up larger than INT_MAX, some calculation somewhere has overflowed. And even if not, it’s probably going to hit code somewhere nearby that won’t deal well with the result. As already done in the VFS core and vsprintf(), I added a check to strscpy() to reject sizes larger than INT_MAX.
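
A userspace sketch of just the added bounds check (the in-kernel strscpy() does rather more, such as word-at-a-time copying and reporting truncation):

#include <errno.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Illustration only: reject sizes above INT_MAX before doing any work,
 * on the theory that such a size is almost certainly an overflow. */
static long strscpy_sketch(char *dest, const char *src, size_t count)
{
        if (count == 0 || count > INT_MAX)
                return -E2BIG;

        size_t len = strnlen(src, count - 1);  /* leave room for the NUL */
        memcpy(dest, src, len);
        dest[len] = '\0';
        return (long)len;                      /* characters copied */
}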

ld.gold support removed
Thomas Gleixner removed support for the gold linker. While this isn’t providing a direct security benefit, ld.gold has been a constant source of weird bugs. Specifically where I’ve noticed, it had been a pain while developing KASLR, and has more recently been causing problems while stabilizing building the kernel with Clang. Having this linker support removed makes things much easier going forward. There are enough weird bugs to fix in Clang and ld.lld. ;)

Intel TSX disabled
Given the use of Intel’s Transactional Synchronization Extensions (TSX) CPU feature by attackers to exploit speculation flaws, Pawan Gupta disabled the feature by default on CPUs that support disabling TSX.

That’s all I have for this version. Let me know if I missed anything. :) Next up is Linux v5.5!

© 2020, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.

February 19, 2020 12:37 AM

February 09, 2020

Michael Kerrisk (manpages): man-pages-5.05 is released

I've released man-pages-5.05. The release tarball is available on kernel.org. The browsable online pages can be found on man7.org. The Git repository for man-pages is available on kernel.org.

This release resulted from patches, bug reports, reviews, and comments from more than 40 contributors. The release includes approximately 110 commits that change around 50 pages.

February 09, 2020 04:55 PM

February 01, 2020

David Sterba: Btrfs hilights in 5.5: 3-copy and 4-copy block groups

A bit more detailed overview of a btrfs update that I find interesting, see the pull request for the rest.

New block group profiles RAID1C3 and RAID1C4

There are two new block group profiles enhancing the capabilities of the RAID1 type with more copies than 2. A brief overview of the profiles is in the table below; for a table with all profiles see the manual page of mkfs.btrfs, also available on the wiki.

Profile    Copies   Utilization   Min devices
RAID1      2        50%           2
RAID1C3    3        33%           3
RAID1C4    4        25%           4

The way all the RAID1 types work is that there are 2 / 3 / 4 exact copies spread over all available devices. The terminology is different from Linux MD RAID, which can do any number of copies. We decided not to do that in btrfs to keep the implementation simple. Another point in favor of simplicity is the users’ perspective: that RAID1C3 provides 3 copies is clear from the type. Even after adding a new device and not doing a balance, the guarantees about redundancy still hold. Newly written data will use the new device together with 2 devices from the original set.

Compare that with a hypothetical RAID1CN, on a filesystem with M devices (N <= M). When the filesystem starts with 2 devices, equivalent to RAID1, adding a new one will have mixed redundancy guarantees after writing more data. Old data with RAID1, new with RAID1C3 – but all accounted under RAID1CN profile. A full re-balance would be required to make it a reliable 3-copy RAID1. Add another device, going to RAID1C4, same problem with more data to shuffle around.

The allocation policy would depend on the number of devices, making it hard for the user to know the redundancy level. This is already the case for RAID0/RAID5/RAID6. For the striped profile RAID0 it’s not much of a problem, as there’s no redundancy. For the parity profiles it’s been a known problem, and a new balance filter, stripes, has been added to support fine-grained selection of block groups.

Speaking of RAID6, there’s an elephant in the room: the write hole. The lack of resiliency against the loss of 2 devices has been bothering all of us because of the known write hole problem in the RAID6 implementation. How this is going to be addressed is a topic for another post, but for now the newly added RAID1C3 profile is a reasonable substitute for RAID6.

How to use it

On a freshly created filesystem it’s simple:

# mkfs.btrfs -d raid1c3 -m raid1c4 /dev/sd[abcd]

The command combines both new profiles for the sake of demonstration; you should always consider the expected use and required guarantees and choose the appropriate profiles.

Changing the profile later on an existing filesystem works as usual, you can use:

# btrfs balance start -mconvert=raid1c3 /mnt/path

Provided there are enough devices and enough space to do the conversion, this will go through all metadata block groups, and after it finishes all of them will be of the desired type.

Backward compatibility

The new block groups are not understood by old kernels and can’t be mounted, not even in read-only mode. To prevent that, a new incompatibility bit called raid1c34 is introduced. Its presence on a device can be checked by btrfs inspect-internal dump-super in the incompat_flags. On a running system the incompat features are exported in sysfs, /sys/fs/btrfs/UUID/features/raid1c34.

Outlook

There is no demand for RAID1C5 at the moment (I asked more than once). The space utilization is already low, and RAID1C4 survives 3 dead devices, so IMHO this is enough for most users. Extending resilience to more devices should perhaps take a different route.

With more copies there’s potential for parallelization of reads from multiple devices. Up to now this is not optimal; the decision logic is semi-random, based on the process ID of the btrfs worker threads or the process submitting the IO. A better load balancing policy is a work in progress and could appear in 5.7 at the earliest (because 5.6 development is now in fixes-only mode).

Look back

The history of the patchset is a bit bumpy. There was enough motivation and there were requests for the functionality, so I started analyzing what needed to be done. Several cleanups were necessary to unify the code and to make it easily extendable to more copies while using the same mirroring code. In the end it came down to changing a few constants and being done.

Testing followed: I tried a simple mkfs and conversions, and that worked well. Then scrub, overwriting some blocks and letting the auto-repair do the work. No hiccups. The remaining and important part was device replace, as the expected use case was to substitute for RAID6 by replacing a missing or damaged disk. I wrote the test script: replace 1 missing device, replace 2 missing devices. And it did not work. While the filesystem was mounted, everything seemed OK. Unmount, check again, and the devices were still missing. Not cool, right?

Due to lack of time before the upcoming merge window (a code freeze before the next development cycle), I had to declare it not ready and put it aside. This was in late 2018. For a highly requested feature this was not an easy decision. Had it been something less important, the development cycle between rc1 and the final release would have provided enough time to fix things up. But due to the demands of the maintainer role I was not confident that I could find enough time to debug and fix the remaining problem. Also, nobody offered to help continue the work, but that’s how it goes.

In late 2019 I had some spare time and looked at the pending work again. I enhanced the test script with more debugging messages and more checks. The code worked well; the test script was subtly broken. Oh well, what a blunder. That cost a year, but on the other hand releasing a highly requested feature that lacked an important part was not an appealing option.

The patchset was added to the 5.5 development queue just before the freeze; the final 5.5 release happened a week ago.

February 01, 2020 11:00 PM

January 28, 2020

Matthew Garrett: Avoiding gaps in IOMMU protection at boot

When you save a large file to disk or upload a large texture to your graphics card, you probably don't want your CPU to sit there spending an extended period of time copying data between system memory and the relevant peripheral - it could be doing something more useful instead. As a result, most hardware that deals with large quantities of data is capable of Direct Memory Access (or DMA). DMA-capable devices are able to access system memory directly without the aid of the CPU - the CPU simply tells the device which region of memory to copy and then leaves it to get on with things. However, we also need to get data back to system memory, so DMA is bidirectional. This means that DMA-capable devices are able to read and write directly to system memory.

As long as devices are entirely under the control of the OS, this seems fine. However, this isn't always true - there may be bugs, the device may be passed through to a guest VM (and so no longer under the control of the host OS) or the device may be running firmware that makes it actively malicious. The third is an important point here - while we usually think of DMA as something that has to be set up by the OS, at a technical level the transactions are initiated by the device. A device that's running hostile firmware is entirely capable of choosing what and where to DMA.

Most reasonably recent hardware includes an IOMMU to handle this. The CPU's MMU exists to define which regions of memory a process can read or write - the IOMMU does the same but for external IO devices. An operating system that knows how to use the IOMMU can allocate specific regions of memory that a device can DMA to or from, and any attempt to access memory outside those regions will fail. This was originally intended to handle passing devices through to guests (the host can protect itself by restricting any DMA to memory belonging to the guest - if the guest tries to read or write to memory belonging to the host, the attempt will fail), but is just as relevant to preventing malicious devices from extracting secrets from your OS or even modifying the runtime state of the OS.

But setting things up in the OS isn't sufficient. If an attacker is able to trigger arbitrary DMA before the OS has started then they can tamper with the system firmware or your bootloader and modify the kernel before it even starts running. So ideally you want your firmware to set up the IOMMU before it even enables any external devices, and newer firmware should actually do this automatically. It sounds like the problem is solved.

Except there's a problem. Not all operating systems know how to program the IOMMU, and if a naive OS fails to remove the IOMMU mappings and asks a device to DMA to an address that the IOMMU doesn't grant access to then things are likely to explode messily. EFI has an explicit transition between the boot environment and the runtime environment triggered when the OS or bootloader calls ExitBootServices(). Various EFI components have registered callbacks that are triggered at this point, and the IOMMU driver will (in general) then tear down the IOMMU mappings before passing control to the OS. If the OS is IOMMU aware it'll then program new mappings, but there's a brief window where the IOMMU protection is missing - and a sufficiently malicious device could take advantage of that.

The ideal solution would be a protocol that allowed the OS to indicate to the firmware that it supported this functionality and request that the firmware not remove it, but in the absence of such a protocol we're left with non-ideal solutions. One is to prevent devices from being able to DMA in the first place, which means the absence of any IOMMU restrictions is largely irrelevant. Every PCI device has a busmaster bit - if the busmaster bit is disabled, the device shouldn't start any DMA transactions. Clearing that seems like a straightforward approach. Unfortunately this bit is under the control of the device itself, so a malicious device can just ignore this and do DMA anyway. Fortunately, PCI bridges and PCIe root ports should only forward DMA transactions if their busmaster bit is set. If we clear that then any devices downstream of the bridge or port shouldn't be able to DMA, no matter how malicious they are. Linux will only re-enable the bit after it's done IOMMU setup, so we should then be in a much more secure state - we still need to trust that our motherboard chipset isn't malicious, but we don't need to trust individual third party PCI devices.

This patch just got merged, adding support for this. My original version did nothing other than clear the bits on bridge devices, but this did have the potential for breaking devices that were still carrying out DMA at the moment this code ran. Ard modified it to call the driver shutdown code for each device behind a bridge before disabling DMA on the bridge, which in theory makes this safe but does still depend on the firmware drivers behaving correctly. As a result it's not enabled by default - you can either turn it on in kernel config or pass the efi=disable_early_pci_dma kernel command line argument.
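
For the curious, the busmaster bit mentioned above is bit 2 of a PCI function's command register (config-space offset 0x04). A rough userspace sketch of peeking at it through sysfs follows; the device path is hypothetical and the write that would actually clear it is left commented out:

#define _DEFAULT_SOURCE
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

#define PCI_COMMAND        0x04
#define PCI_COMMAND_MASTER 0x04    /* bit 2: allow the device to initiate DMA */

int main(void)
{
        /* Hypothetical bridge/root port; substitute a real BDF. */
        const char *path = "/sys/bus/pci/devices/0000:00:1c.0/config";
        int fd = open(path, O_RDWR);
        if (fd < 0) {
                perror("open");
                return 1;
        }

        uint16_t cmd = 0;
        if (pread(fd, &cmd, sizeof(cmd), PCI_COMMAND) != sizeof(cmd)) {
                perror("pread");
                return 1;
        }
        printf("command=0x%04x busmaster=%s\n", cmd,
               (cmd & PCI_COMMAND_MASTER) ? "on" : "off");

        cmd &= ~PCI_COMMAND_MASTER;
        /* pwrite(fd, &cmd, sizeof(cmd), PCI_COMMAND);
         * ...would stop DMA from being forwarded below this bridge until
         * a driver re-enables bus mastering. */

        close(fd);
        return 0;
}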

In combination with firmware that does the right thing, this should ensure that Linux systems can be protected against malicious PCI devices throughout the entire boot process.


January 28, 2020 11:19 PM

January 27, 2020

Pete Zaitcev: Too Real

From CKS:

The first useful property Python has is that you can't misplace the source code for your deployed Python programs.

January 27, 2020 04:46 PM

January 26, 2020

Paul E. Mc Kenney: Confessions of a Recovering Proprietary Programmer, Part XVII

One of the gatherings I attended last year featured a young man asking if anyone felt comfortable doing git rebase “without adult supervision”, as he put it. He seemed as surprised to see anyone answer in the affirmative as I was to see only a very few people so answer. This seems to me to be a suboptimal state of affairs, and thus this post describes how you, too, can learn to become comfortable doing git rebase “without adult supervision”.

Use gitk to See What You Are Doing

The first trick is to be able to see what you are doing while you are doing it. This is nothing particularly obscure or new, and is in fact why screen editors are preferred over line-oriented editors (remember ed?). And gitk displays your commits and shows how they are connected, including any branching and merging. The current commit is evident (yellow rather than blue circle) as is the current branch, if any (branch name in bold font). As with screen editors, this display helps avoid inevitable errors stemming from you and git disagreeing on the state of the repository. Such disagreements were especially common when I was first learning git. Given that git always prevailed in these sorts of disagreements, I heartily recommend using gitk even when you are restricting yourself to the less advanced git commands.

Note that gitk opens a new window, which may not work in all environments. In such cases, the --graph --pretty=oneline arguments to the git log command will give you a static ASCII-art approximation of the gitk display. As such, this approach is similar to using a line-oriented editor, but printing out the local lines every so often. In other words, it is better than nothing, but not as good as might be hoped for.

Fortunately, one of my colleagues pointed me at tig, which provides a dynamic ASCII-art display of the selected commits. This is again not as good as gitk, but it is probably as good as it gets in a text-only environment.

These tools do have their limits, and other techniques are required if you are actively rearranging more than a few hundred commits. If you are in that situation, you should look into the workflows used by high-level maintainers or by the -stable maintainer, who commonly wrangle many hundreds or even thousands of commits. Extreme numbers of commits will of course require significant automation, and many large-scale maintainers do in fact support their workflows with elaborate scripting.

Doing advanced git work without being able to see what you are doing is about as much a recipe for success as chopping wood in the dark. So do yourself a favor and use tools that allow you to see what you are doing!

Make Sure You Can Get Back To Where You Started

A common git rebase horror story involves a mistake made while rebasing, but with the git garbage collector erasing the starting point, so that there is no going back. As the old saying goes, “to err is human”, so such stories are all too plausible. But it is dead simple to give this horror story a happy ending: Simply create a branch at your starting point before doing git rebase:

git branch starting-point
git rebase -i --onto destination-commit base-commit rebase-branch
# The rebased commits are broken, perhaps misresolved conflicts?
git checkout starting-point # or maybe: git checkout -B rebase-branch starting-point


Alternatively, if you are using git in a distributed environment, you can push all your changes to the master repository before trying the unfamiliar command. Then if things go wrong, you can simply destroy your copy, re-clone the repository, and start over.

Whichever approach you choose, the benefit of ensuring that you can return to your starting point is the ability to repeat the git rebase as many times as needed to arrive at the desired result. Sort of like playing a video game, when you think about it.

Practice on an Experimental Repository

On-the-job training can be a wonderful thing, but sometimes it is better to create an experimental repository for the sole purpose of practicing your git commands. But sometimes, you need a repository with lots of commits to provide a realistic environment for your practice session. In that case, it might be worthwhile to clone another copy of your working repository and do your practicing there. After all, you can always remove the repository after you have finished practicing.

And there are some commands that have such far-reaching effects that I always do a dry-run on a sacrificial repository before trying it in real life. The poster boy for such a command is git filter-branch, which has impressive power for both good and evil.

 

In summary, to use advanced git commands without adult supervision, first make sure that you can see what you are doing, then make sure that you can get back to where you started, and finally, practice makes perfect!

January 26, 2020 01:22 AM

January 21, 2020

Linux Plumbers Conference: Welcome to the 2020 Linux Plumbers Conference blog

Planning for the 2020 Linux Plumbers Conference is well underway. The planning committee will be posting various informational blurbs here, including information on hotels, microconference acceptance, evening events, scheduling, and so on. Next up will be a “call for microconferences” that should appear soon.

LPC this year will be held at the Marriott Harbourfront Hotel in Halifax, Nova Scotia from 25-27 August.

January 21, 2020 10:27 PM

January 20, 2020

David Sterba: BLAKE3 vs BLAKE2 for BTRFS

Ironic, isn’t it? The paint on BLAKE2 as the BTRFS checksum algorithm hasn’t dried yet, with 1-2 weeks to go, but there’s already a successor to it. Faster, yet still supposed to be strong. For a second or two I considered ripping out all the work and … no, not really, but I do admit the excitement.

Speed and strength are competing goals for a hash algorithm. The speed can be evaluated by anyone; not so the strength. I am no cryptographer, and in that area I rely on the expertise and opinion of others. That BLAKE was a SHA3 finalist is a good indication; BLAKE2 is its successor, weakened but not weak. BLAKE3 is yet another step trading off strength for speed.

Regarding BTRFS, BLAKE2 is going to be the faster of the strong hashes for now (the other one is SHA256). The argument I have for it now is the test of time. It’s been deployed in many projects (even cryptocurrencies!), there are optimized implementations, and there are ports to various languages.

Looking ahead regarding more checksums, the plan is to revisit them in about 5 years. Hopefully by that time there will be deployments, real workload performance evaluations and overall user experience that will back future decisions.

Maybe new strong yet fast hashes will be developed by then. During my research I learned about Kangaroo 12, which is a reduced version of SHA3 (Keccak). The hash is constructed in a different way; perhaps there might be a Kangaroo 2π one day on par with BLAKE3. Or something else. Why not EDON-R, which is #1 in many of the cr.yp.to/hash benchmarks? Another thing I learned during the research is that hash algorithms are twelve in a dozen, IOW too many to choose from. That Kangaroo 12 is internally of a different construction might be a point in favor of selecting it, to get a wider range of “building block types”.

Quick evaluation

For BTRFS I have a micro benchmark, repeatedly hashing a 4 KiB block and using cycles per block as a metric.

Hash             Total cycles   Cycles/iteration   Perf vs BLAKE3   Perf vs BLAKE2b
BLAKE3 (AVX2)    111260245256   11126              1.0              0.876 (-13%)
BLAKE2b (AVX2)   127009487092   12700              1.141 (+14%)     1.0
BLAKE2b (AVX)    166426785907   16642              1.496 (+50%)     1.310 (+31%)
BLAKE2b (ref)    225053579540   22505              2.022 (+102%)    1.772 (+77%)
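
For reference, the general shape of such a cycles-per-block loop is sketched below; hash_block() is a hypothetical stand-in for the implementation under test, and the TSC-based timing is x86-specific:

#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <x86intrin.h>

#define BLOCK_SIZE 4096
#define ITERATIONS 1000000

/* Hypothetical hash under test; swap in BLAKE2b/BLAKE3 here. */
static void hash_block(const uint8_t *buf, size_t len, uint8_t out[32])
{
        memset(out, 0, 32);
        for (size_t i = 0; i < len; i++)   /* xor-fold placeholder */
                out[i % 32] ^= buf[i];
}

int main(void)
{
        static uint8_t block[BLOCK_SIZE];
        uint8_t digest[32];

        uint64_t start = __rdtsc();
        for (long i = 0; i < ITERATIONS; i++) {
                hash_block(block, sizeof(block), digest);
                block[0] ^= digest[0];  /* loop-carried dependency so the
                                           call cannot be hoisted out */
        }
        uint64_t cycles = __rdtsc() - start;

        printf("total cycles: %llu, cycles/block: %llu\n",
               (unsigned long long)cycles,
               (unsigned long long)(cycles / ITERATIONS));
        return 0;
}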

Right now there’s only the reference Rust implementation and a derived C implementation of BLAKE3, claimed not to be optimized, but from my experience the compiler can do a good job of optimizing programmers’ ideas away. There’s only one BLAKE3 entry, with the AVX2 implementation, the best hardware support my testing box provides. As I had the other BLAKE2 results at hand, they’re in the table for comparison, but the most interesting pair is the AVX2 versions anyway.

The improvement is 13-14%. Not much, is it, and way less than the announced 4+x speedup over BLAKE2b. Well, it’s always important to interpret the results of a benchmark with respect to the measurement environment and the tested parameters.

For the BTRFS filesystem the block size is always going to be in kilobytes. I can’t find what input size the official benchmark results used; the bench.rs script iterates over various sizes, so I assume it’s an average. Short input buffers can skew the results as the setup/output overhead can be significant, while for long buffers the compression phase is significant. I don’t have an explanation for the difference and won’t draw conclusions about BLAKE3 in general.

One thing that I dare to claim is that I can sleep well, because based on the above evaluation, BLAKE3 won’t bring a notable improvement if used as a checksum hash.

Personal addendum

During the evaluations, now and in the past, I’ve found it convenient when implementations in various languages are on offer. That, e.g., the Keccak project pages do not point me directly to a C implementation slightly annoyed me, and since the reference implementation in C++ was worse than BLAKE2, I did not take the next step of comparing the C version, wherever I would find it.

BLAKE3 is fresh, and the Rust implementation seems to be the only thing that has been improved since the initial release. A plain C implementation without any warning-not-optimized labels would be good. I think that C versions will appear eventually; even though Rust is now the new language hotness, there are projects that have not yet reached the “let’s rewrite it in Rust” stage. Please bear with us.

January 20, 2020 11:00 PM

Matthew Garrett: Verifying your system state in a secure and private way

Most modern PCs have a Trusted Platform Module (TPM) and firmware that, together, support something called Trusted Boot. In Trusted Boot, each component in the boot chain generates a series of measurements of the next component of the boot process and relevant configuration. These measurements are pushed to the TPM where they're combined with the existing values stored in a series of Platform Configuration Registers (PCRs) in such a way that the final PCR value depends on both the value and the order of the measurements it's given. If any measurements change, the final PCR value changes.

Windows takes advantage of this with its Bitlocker disk encryption technology. The disk encryption key is stored in the TPM along with a policy that tells it to release it only if a specific set of PCR values is correct. By default, the TPM will release the encryption key automatically if the PCR values match and the system will just transparently boot. If someone tampers with the boot process or configuration, the PCR values will no longer match and boot will halt to allow the user to provide the disk key in some other way.

Unfortunately the TPM keeps no record of how it got to a specific state. If the PCR values don't match, that's all we know - the TPM is unable to tell us what changed to result in this breakage. Fortunately, the system firmware maintains an event log as we go along. Each measurement that's pushed to the TPM is accompanied by a new entry in the event log, containing not only the hash that was pushed to the TPM but also metadata that tells us what was measured and why. Since the algorithm the TPM uses to calculate the hash values is known, we can replay the same values from the event log and verify that we end up with the same final value that's in the TPM. We can then examine the event log to see what changed.
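
To make the replay step concrete, here is a small sketch (not from the post) of recomputing a SHA-256 PCR from a list of event digests, using OpenSSL's SHA256(); real event logs carry per-event metadata and possibly multiple digest algorithms, which are omitted here:

/* build with: cc pcr_replay.c -lcrypto */
#include <openssl/sha.h>
#include <stdio.h>
#include <string.h>

/* TPM extend operation: new PCR = H(old PCR || measurement digest). */
static void pcr_extend(unsigned char pcr[SHA256_DIGEST_LENGTH],
                       const unsigned char digest[SHA256_DIGEST_LENGTH])
{
        unsigned char buf[2 * SHA256_DIGEST_LENGTH];

        memcpy(buf, pcr, SHA256_DIGEST_LENGTH);
        memcpy(buf + SHA256_DIGEST_LENGTH, digest, SHA256_DIGEST_LENGTH);
        SHA256(buf, sizeof(buf), pcr);
}

int main(void)
{
        unsigned char pcr[SHA256_DIGEST_LENGTH] = { 0 };  /* PCRs start zeroed */
        unsigned char ev1[SHA256_DIGEST_LENGTH], ev2[SHA256_DIGEST_LENGTH];

        /* Stand-ins for digests read out of the event log. */
        SHA256((const unsigned char *)"first component", 15, ev1);
        SHA256((const unsigned char *)"second component", 16, ev2);

        pcr_extend(pcr, ev1);   /* order matters: swapping these two */
        pcr_extend(pcr, ev2);   /* extends yields a different final value */

        for (int i = 0; i < SHA256_DIGEST_LENGTH; i++)
                printf("%02x", pcr[i]);
        printf("\n");           /* compare against the (signed) PCR value */
        return 0;
}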

Unfortunately, the event log is stored in unprotected system RAM. In order to be able to trust it we need to compare the values in the event log (which can be tampered with) with the values in the TPM (which are much harder to tamper with). Unfortunately if someone has tampered with the event log then they could also have tampered with the bits of the OS that are doing that comparison. Put simply, if the machine is in a potentially untrustworthy state, we can't trust that machine to tell us anything about itself.

This is solved using a procedure called Remote Attestation. The TPM can be asked to provide a digital signature of the PCR values, and this can be passed to a remote system along with the event log. That remote system can then examine the event log, make sure it corresponds to the signed PCR values and make a security decision based on the contents of the event log rather than just on the final PCR values. This makes the system significantly more flexible and aids diagnostics. Unfortunately, it also means you need a remote server and an internet connection and then some way for that remote server to tell you whether it thinks your system is trustworthy and also you need some way to believe that the remote server is trustworthy and all of this is, well, not ideal if you're not an enterprise.

Last week I gave a talk at linux.conf.au on one way around this. Basically, remote attestation places no constraints on the network protocol in use - while the implementations that exist all do this over IP, there's no requirement for them to do so. So I wrote an implementation that runs over Bluetooth, in theory allowing you to use your phone to serve as the remote agent. If you trust your phone, you can use it as a tool for determining if you should trust your laptop.

I've pushed some code that demos this. The current implementation does nothing other than tell you whether UEFI Secure Boot was enabled or not, and it's also not currently running on a phone. The phone bit of this is pretty straightforward to fix, but the rest is somewhat harder.

The big issue we face is that we frequently don't know what event log values we should be seeing. The first few values are produced by the system firmware and there's no standardised way to publish the expected values. The Linux Vendor Firmware Service has support for publishing these values, so for some systems we can get hold of this. But then you get to measurements of your bootloader and kernel, and those change every time you do an update. Ideally we'd have tooling for Linux distributions to publish known good values for each package version and for that to be common across distributions. This would allow tools to download metadata and verify that measurements correspond to legitimate builds from the distribution in question.

This does still leave the problem of the initramfs. Since initramfs files are usually generated locally, and depend on the locally installed versions of tools at the point they're built, we end up with no good way to precalculate those values. I proposed a possible solution to this a while back, but have done absolutely nothing to help make that happen. I suck. The right way to do this may actually just be to turn initramfs images into pre-built artifacts and figure out the config at runtime (dracut actually supports a bunch of this already), so I'm going to spend a while playing with that.

If we can pull these pieces together then we can get to a place where you can boot your laptop and then, before typing any authentication details, have your phone compare each component in the boot process to expected values. Assistance in all of this extremely gratefully received.


January 20, 2020 12:53 PM

Paul E. Mc Kenney: The Old Man and His Macbook

I received a MacBook at the same time I received the smartphone. This was not my first encounter with a Mac, in fact, I long ago had the privilege of trying out a Lisa. I occasionally made use of the original Macintosh (perhaps most notably to prepare a resume when applying for a job at Sequent), and even briefly owned an iMac, purchased to run some educational software for my children. But that iMac was my last close contact with the Macintosh line, some 20 years before the MacBook: Since then, I have used Windows and more recently, Linux.

So how does the MacBook compare? Let's start with some positives:



There are of course some annoyances:

Overall impression? It is yet another laptop, with its own advantages, quirks, odd corners, and downsides. I can see how people who grew up on Macbook and who use nothing else could grow to love it passionately. But switching back and forth between MacBook and Linux is a bit jarring, though of course MacBook and Linux have much more in common than did the five different systems I switched back and forth between in the late 1970s.

My current plan is to stick with it for a year (nine months left!), and decide where to go from there. I might continue sticking with it, or I might try moving to Linux. We will see!

January 20, 2020 01:52 AM

January 19, 2020

Paul E. Mc Kenney: Other weighty matters

I used to be one of those disgusting people who could eat whatever he wanted, whenever he wanted, and as much as he wanted—and not gain weight.

In fact, towards the end of my teen years, I often grew very tired of eating. You see, what with all my running and growing, in order to maintain weight I had to eat until I felt nauseous. I would feel overstuffed for about 30 minutes and then I would feel fine for about two hours. Then I would be hungry again. In retrospect, perhaps I should have adopted hobbit-like eating habits, but then again, six meals a day does not mesh well with school and workplace schedules, to say nothing of with family traditions.

Once I stopped growing in my early 20s, I was able to eat more normally. Nevertheless, I rarely felt full. In fact, on one of those rare occasions when I did profess a feeling of fullness, my friends not only demanded that I give it to them in writing, but also that I sign and date the resulting document. This document was rendered somewhat less than fully official due to its being written on a whiteboard.

And even by age 40, eating what most would consider to be a normal diet caused my weight to drop dramatically and abruptly.

However, my metabolism continued to slow down, and my body's ability to tolerate vigorous exercise waned as well. But these changes took place slowly, and so the number on the scale crept up almost imperceptibly.

But so what if I am carrying a little extra weight? Why should I worry?

Because I have a goal: Should I reach age 80, I would very much like to walk under my own power. And it doesn't take great powers of observation to conclude that carrying extra weight is not consistent with that goal. Therefore, I must pay close attention to the scale.

But life flowed quickly, so I continued failing to pay attention to the scale, at least not until a visit to an airport in Florida. After passing through one of the full-body scanners, I was called out for a full-body search. A young man patted me down quite thoroughly, but wasn't able to find whatever it was that he was looking for. He called in a more experienced colleague, who quickly determined that what had apparently appeared to be an explosive device under my shirt was instead an embarrassingly thick layer of body fat. And yes, I did take entirely too much satisfaction from the fact that he chose to dress down his less-experienced colleague, but I could no longer deny that I was a good 25-30 pounds overweight. And in the poor guy's defense, the energy content of that portion of my body fat really did rival that of a small bomb. And, more to the point, the sheer mass of that fat was in no way consistent with my goal to be able to walk under my own power at age 80.

So let that be a lesson to you. If you refuse to take the hint from your bathroom scale, you might well find yourself instead taking it from the United States of America's Transportation Security Administration.

Accepting the fact that I was overweight was one thing. Actually doing something about it was quite another. You see, my body had become a card-carrying member of House Stark, complete with their slogan: “Winter is coming.” And my body is wise in the ways of winter. It knows not only that winter is coming, but also that food will be hard to come by, especially given my slowing reflexes and decreasing agility. Now, my body has never actually seen such a winter, but countless generations of natural selection have hammered a deep and abiding anticipation of such winters into my very DNA. Furthermore, my body knows exactly one way to deal with such a winter, and that is to eat well while the eating is good.

However, I have thus far had the privilege of living in a time and place where the eating is always good and where winter never comes, at least not the fearsome winters that my body is fanatically motivated to prepare for.

This line of thought reminded me of a piece written long ago by the late Isaac Asimov, in which he suggested that we should stop eating before we feel full. (Shortly after writing this, an acquaintance is said to have pointed out that Asimov could stand to lose some weight, and Asimov is said to have reacted by re-reading his own writing and then successfully implementing its recommendation.) The fact that I now weighed in at more than 210 pounds provided additional motivation.

With much effort, I was able to lose more than ten pounds, but then my weight crept back up again. I was able to keep my weight to about 205, and there it remained for some time.

At least, there it remained until I lost more than ten pounds due to illness. I figured that since I had paid the price of the illness, I owed it to myself to take full advantage of the resulting weight loss. Over a period of some months, I managed to get down to 190 pounds, which was a great improvement over 210, but significantly heavier than my 180-pound target weight.

But my weight remained stubbornly fixed at about 190 for some months.

Then I remembered the control systems class I took decades ago and realized that my body and I comprised a control system designed to maintain my weight at 190. You see, my body wanted a good fifty pounds of fat to give me a good chance of surviving the food-free winter that it knew was coming. So, yes, I wanted my weight to be 180. But only when the scale read 190 or more would I panic and take drastic action, such as fasting for a day, inspired by several colleagues' lifestyle fasts. Below 190, I would eat normally, that is, I would completely give in to my body's insistence that I gain weight.

As usual, the solution was simple but difficult to implement. I “simply” slowly decreased my panic point from 190 downwards, one pound at a time.

One of the ways that my body convinces me to overeat is through feelings of anxiety. “If I miss this meal, bad things will happen!!!” However, it is more difficult for my body to convince me that missing a meal would be a disaster if I have recently fasted. Therefore, fasting turned out to be an important component of my weight-loss regimen. A fast might mean just skipping breakfast, it might mean skipping both breakfast and lunch, or it might be a 24-hour fast. But note that a 24-hour fast skips first dinner, then breakfast, and finally lunch. Skipping breakfast, lunch, and then dinner results in more than 30 hours of fasting, which seems a bit excessive.

Of course, my body is also skilled at exploiting any opportunity for impulse eating, and I must confess that I do not yet consistently beat it at this game.

Exercise continues to be important, but it also introduces some complications. You see, exercise is inherently damaging to muscles. The strengthening effects of exercise are not due to the exercise itself, but rather to the body's efforts to repair the damage and then some. Therefore, in the 24 hours or so after exercise, my muscles suffer significant inflammation due to this damage, which results in a pound or two of added water weight (but note that everyone's body is different, so your mileage may vary). My body is not stupid, and so it quickly figured out that one of the consequences of a heavy workout was reduced rations the next day. It therefore produced all sorts of reasons why a heavy workout would be a bad idea, and with a significant rate of success.

So I allow myself an extra pound the day after a heavy workout. This way my body enjoys the exercise and gets to indulge the following day. Win-win! ;–)

There are also some foods that result in added water weight, with corned beef, ham, and bacon being prominent among them. The amount of water weight seems to vary based on I know not what, but sometimes ranges up to three pounds. I have not yet worked out exactly what to do about this, but one strategy might be to eat these types of food only on the day of a heavy workout. Another strategy would be to avoid them completely, but that is crazy talk, especially in the case of bacon.

So after two years, I have gotten down to 180, and stayed there for several months. What does the future hold?

Sadly, it is hard to say. In my case it appears that something like 90% of the effort required to lose weight is required to keep that weight off. So if you really do want to know what the future holds, all I can say is “Ask me in the future.”

But the difficulty of keeping weight off should come as no surprise.

After all, my body is still acutely aware that winter is coming!

January 19, 2020 11:53 PM

January 14, 2020

Pete Zaitcev: Nobody is Google

A little while ago I went to a talk by Michael Verhulst of Terminal Labs. He made a career of descending into Devops disaster zones and righting things up, and shared some observations. One of his aphorisms was:

Nobody Is Google — Not Even Google

If I understood him right, he meant to say that scalability costs money and businesses flounder on buying more scalability than they need. Or, people think they are Google, but they are not. Even inside Google, only a small fraction of services operate at Google scale (aka planetary scale).

Apparently this sort of thing happens quite often.

January 14, 2020 08:05 PM

January 08, 2020

Paul E. Mc Kenney: Parallel Programming: December 2019 Update

There is a new release of Is Parallel Programming Hard, And, If So, What Can You Do About It?.

This release features a number of formatting and build-system improvements by the indefatigable Akira Yokosawa. On the formatting side, we have listings automatically generated from source code, clever references, selective PDF hyperlink highlighting, and finally settling the old after-period one-space/two-space debate by mandating newline instead. On the build side, we improved checks for incompatible packages, SyncTeX database file generation (instigated by Balbir Singh), better identification of PDFs, build notes for recent Fedora releases, fixes for some multiple-figure page issues, improved font handling, and a2ping workarounds for ever-troublesome Ghostscript. In addition, the .bib file format was dragged kicking and screaming out of the 1980s, prompted by Stamatis Karnouskos. The new format is said to be more compatible with modern .bib-file tooling.

On the content side, the “Hardware and its Habits”, “Tools of the Trade”, “Locking”, “Deferred Processing”, “Data Structures”, and “Formal Verification” chapters received some much needed attention, the latter by Akira, who also updated the “Shared-Variable Shenanigans” section based on a recent LWN article. SeongJae Park, Stamatis, and Zhang Kai fixed a large quantity of typos and addressed numerous other issues. There is now a full complement of top-level section epigraphs, and there are a few scalability results up to 420 CPUs, courtesy of a system provided by my new employer.

On the code side, there have been a number of bug fixes and updates from ACCESS_ONCE() to READ_ONCE() or WRITE_ONCE(), with significant contributions from Akira, Junchang Wang, and Slavomir Kaslev.

A full list of the changes since the previous release may be found here, and as always, git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/perfbook.git will be updated in real time.

January 08, 2020 05:02 AM

December 27, 2019

Paul E. Mc Kenney: Exit Libris, Take Two

Still the same number of bookshelves, and although I do have a smartphone, I have not yet succumbed to the ereader habit. So some books must go!

December 27, 2019 07:04 PM

Matthew Garrett: Wifi deauthentication attacks and home security

I live in a large apartment complex (it's literally a city block big), so I spend a disproportionate amount of time walking down corridors. Recently one of my neighbours installed a Ring wireless doorbell. By default these are motion activated (and the process for disabling motion detection is far from obvious), and if the owner subscribes to an appropriate plan these recordings are stored in the cloud. I'm not super enthusiastic about the idea of having my conversations recorded while I'm walking past someone's door, so I decided to look into the security of these devices.

One visit to Amazon later and I had a refurbished Ring Video Doorbell 2™ sitting on my desk. Tearing it down revealed it uses a TI SoC that's optimised for this sort of application, linked to a DSP that presumably does stuff like motion detection. The device spends most of its time in a sleep state where it generates no network activity, so on any wakeup it has to reassociate with the wireless network and start streaming data.

So we have a device that's silent and undetectable until it starts recording you, which isn't a great place to start from. But fortunately wifi has a few, uh, interesting design choices that mean we can still do something. The first is that even on an encrypted network, the packet headers are unencrypted and contain the address of the access point and whichever device is communicating. This means that it's possible to just dump whatever traffic is floating past and build up a collection of device addresses. Address ranges are allocated by the IEEE, so it's possible to map the addresses you see to manufacturers and get some idea of what's actually on the network[1] even if you can't see what they're actually transmitting. The second is that various management frames aren't encrypted, and so can be faked even if you don't have the network credentials.

The most interesting one here is the deauthentication frame that access points can use to tell clients that they're no longer welcome. These can be sent for a variety of reasons, including resource exhaustion or authentication failure. And, by default, they're entirely unprotected. Anyone can inject such a frame into your network and cause clients to believe they're no longer authorised to use the network, at which point they'll have to go through a new authentication cycle - and while they're doing that, they're not able to send any other packets.

So, the attack is to simply monitor the network for any devices that fall into the address range you want to target, and then immediately start shooting deauthentication frames at them once you see one. I hacked airodump-ng to ignore all clients that didn't look like a Ring, and then pasted in code from aireplay-ng to send deauthentication packets once it saw one. The problem here is that wifi cards can only be tuned to one frequency at a time, so unless you know the channel your potential target is on, you need to keep jumping between frequencies while looking for a target - and that means a target can potentially shoot off a notification while you're looking at other frequencies.

But even with that proviso, this seems to work reasonably reliably. I can hit the button on my Ring, see it show up in my hacked up code and see my phone receive no push notification. Even if it does get a notification, the doorbell is no longer accessible by the time I respond.

There are a couple of ways to avoid this attack. The first is to use 802.11w, which protects management frames. A lot of hardware supports this, but it's generally disabled by default. The second is to just ignore deauthentication frames in the first place, which is a spec violation but also you're already building a device that exists to record strangers engaging in a range of legal activities so paying attention to social norms is clearly not a priority in any case.

Finally, none of this is even slightly new. A presentation from Def Con in 2016 covered this, demonstrating that Nest cameras could be blocked in the same way. The industry doesn't seem to have learned from this.

[1] The Ring Video Doorbell 2 just uses addresses from TI's range rather than anything Ring specific, unfortunately


December 27, 2019 03:26 AM

December 25, 2019

Paul E. Mc Kenney: The Old Man and His Smartphone, 2019 Holiday Season Episode

I used my smartphone as a camera very early on, but the need to log in made it less than attractive for snapshots. Except that I saw some of my colleagues whip out their smartphones and immediately take photos. They kindly let me in on the secret: Double-clicking the power button puts the phone directly into camera mode. This resulted in a substantial uptick in my smartphone-as-camera usage. And the camera is astonishingly good by decade-old digital-camera standards, to say nothing of old-school 35mm film standards.

I also learned how to make the camera refrain from mirror-imaging selfies, but this proved hard to get right. The selfie looks wrong when immediately viewed if it is not mirror imaged! I eventually positioned myself to include some text in the selfie in order to reliably verify proper orientation.

Those who know me will be amused to hear that I printed a map the other day, just from force of habit. But in the event, I forgot to bring not only both the map and the smartphone, but also the presents that I was supposed to be transporting. In pleasant contrast to a memorable prior year, I remembered the presents before crossing the Columbia, which was (sort of) in time to return home to fetch them. I didn't bother with either the map or the smartphone, but reached my destination nevertheless. Cautionary tales notwithstanding, sometimes you just have to trust the old neural net's direction-finding capabilities. (Or at least that is what I keep telling myself!)

I also joined the non-exclusive group who uses a smartphone to photograph whiteboards prior to erasing them. I still have not succumbed to the food-photography habit, though. Taking a selfie with the non-selfie lens through a mirror is possible, but surprisingly challenging.

I have done a bit of ride-sharing, and the location-sharing features are quite helpful when meeting someone—no need to agree on a unique landmark, only to find the hard way that said landmark is not all that unique!

The smartphone is surprisingly useful for browsing the web while on the go, with any annoyances over the small format heavily outweighed by the ability to start and stop browsing very quickly. But I could not help but feel a pang of jealousy while watching a better equipped smartphone user type using swiping motions rather than a finger-at-a-time approach. Of course, I could not help but try it. Imagine my delight to learn that the swiping-motion approach was not some add-on extra, but instead standard! Swiping typing is not a replacement for a full-sized keyboard, but it is a huge improvement over finger-at-a-time typing, to say nothing of my old multi-press flip phone.

Recent foreign travel required careful prioritization and scheduling of my sole international power adapter among the three devices needing it. But my new USB-A-to-USB-C adapter allows me to charge my smartphone from my heavy-duty rcutorture-capable ThinkPad, albeit significantly more slowly than via AC adapter, and even more slowly when the laptop is powered off. Especially when I am actively using the smartphone. To my surprise, I can also charge my MacBook from my ThinkPad using this same adapter—but only when the MacBook is powered off. If the MacBook is running, all this does is extend the MacBook's battery life. Which admittedly might still be quite useful.

All in all, it looks like I can get by with just the one international AC adapter. This is a good thing, especially considering how bulky those things are!

My smartphone's notifications are still a bit annoying, though I have gotten it a bit better trained to only bother me when it is important. And yes, I have missed a few important notifications!!! When using my laptop, which also receives all these notifications, my de facto strategy has been to completely ignore the smartphone. Which more than once has had the unintended consequence of completely draining my smartphone's battery. The first time this happened was quite disconcerting because it appeared that I had bricked my new smartphone. Thankfully, a quick web search turned up the unintuitive trick of simultaneously depressing the volume-down and power buttons for ten seconds.

But if things go as they usually do, this two-button salute will soon become all too natural!

December 25, 2019 12:11 AM

December 23, 2019

Paul E. Mc Kenney: Weight-Training Followup

A couple of years ago I posted on my adventures with weightlifting, along with the dangers that disorganized and imbalanced weight-lifting regimes posed to the simple act of coughing.  A colleague pointed me to the book “Getting Stronger” by Bill Pearl.  I was quite impressed by the depth and breadth of this book's coverage of weightlifting, including even a section entitled “Training Program for Those over 50”.  Needless to say, I started with that section.

Any new regime will entail some discomfort, but I was encouraged by the fact that the discomfort caused by this book's “Over 50 Program” coincided with the muscles that had objected so strenuously during my coughing fits.  I continued this program until it became reasonably natural, and then progressed through the book's three General Conditioning programs.  Most recently I have been alternating three sets of the third program with one set of either of the first two programs, with between one and two weight workouts per week.  I fill in with stationary bicycle, elliptical trainer, or rowing machine for between three and five workouts per week.

I did note that my strength was sometimes inexplicably and wildly inconsistent.  On one exercise, where you kneel and pull down to lift a weight, I found that sometimes 160 lbs was quite easy, while other times 100 lbs was extremely difficult.  I could not correlate this variation in strength with anything in particular.  At least not until I made a closer examination of the weight machines.  It turned out that of the six stations, four provided a two-to-one mechanical advantage, so that when having randomly selected one of those four, I only needed to exert 80 lbs of force to lift the 160 lbs.  Mystery solved!  In another memorable instance, I was having great difficulty lifting the usual weight, then noticed that I had mistakenly attached to the barbell a pair of 35 lb weights instead of my customary pair of 25s.  In another case where an unexpected doubling of strength surprised me, I learned that the weights were in pounds despite being in a country that has long used the metric system.  Which is at least quite a bit less painful than making the opposite mistake!  All in all, my advice to you is to check weight machines carefully and also to first use small weights so as to double-check the system of measurement.

I initially ignored the book's section on stretching, relying instead on almost five decades of my own experience with stretching.  As you might guess, this proved to be a mistake:  Decades of experience stretching for running does not necessarily carry over to weight lifting.  Therefore, as is embarrassingly often the case, I advise you to do what I say rather than what I actually did.  So do yourself a favor and do the book's recommended set of stretches.

One current challenge is calf cramps while sleeping.  These are thankfully nowhere near as severe as those of my late teens, in which I would sometimes wake up flying through the air with one or the other of my legs stubbornly folded double at the knee.  Back then, I learned to avoid these by getting sufficient sodium and potassium, by massaging my plantar fascia (for example, by rolling it back and forth over a golf ball), by (carefully!!!) massaging the space between my achilles tendon and tibia/fibula, and of course by stretching my calf.

Fortunately, my current bouts of calf cramps can be headed off simply by rotating my feet around my ankle, so that my big toe makes a circle in the air.  This motion is of course not possible if the cramp has already taken hold, but rocking my foot side to side brought immediate relief on the last bout, which is much less painful than my traditional approach that involves copious quantities of brute force and awkwardness.  Here is hoping that this technique continues to be effective!

December 23, 2019 01:19 AM

December 22, 2019

Brendan Gregg: BPF Theremin, Tetris, and Typewriters

For my AWS re:Invent talk on BPF Performance Analysis at Netflix, I began with a demo of "BPF superpowers" (aka eBPF). The video is on [youtube] and the demo starts at [0:50] (the slides are on [slideshare]):

I'm demonstrating a tool I developed to turn my laptop's wifi signal strength into audio. I first developed it as this [bpftrace] one-liner:
bpftrace -e 'k:__iwl_dbg /str(arg4) == "Rssi %d, TSF %llu\n"/ { printf("strength: %d\n", arg5);  }'
Attaching 1 probe...
strength: -46
strength: -46
strength: -45
strength: -45
[...]
This only works for the iwl driver. In the video I explained how I arrived at tracing __iwl_dbg() in this way, and how you can follow a similar approach for tracing unfamiliar code. Since I wanted to emit audio, I then switched to [BCC] and rewrote it in Python so I could use an audio library. This is not my best code, since I hacked it in a hurry, but here it is:
#!/usr/bin/python
#
# iwlstrength.py	iwl wifi strength as audio
#			For Linux, uses BCC, eBPF. Embedded C.
#
# Tone generation from: https://gist.github.com/nekozing/5774628
#
# 29-Apr-2019	Brendan Gregg	Created this.

from __future__ import print_function
from bcc import BPF
import pygame, numpy, pygame.sndarray

# Sound setup
sampleRate = 44100
pygame.mixer.pre_init(sampleRate, -16, 1) 
pygame.init()
note = {}
for f in range(-100, 20):
	# pregenerate lookup table of frequency to note
	# this avoids the CPU cost at runtime
	# please, we need a better tone generation library (if it exists I didn't find it)
	freq = 1700 + (f * 15)
	peakvol = 4096
	note[f] = numpy.array([peakvol * numpy.sin(2.0 * numpy.pi * freq * x / sampleRate) for x in range(0, sampleRate / 5)]).astype(numpy.int16)
sound = pygame.sndarray.make_sound(note[0])

# BPF
b = BPF(text="""
#include <linux/ptrace.h>

int kprobe____iwl_dbg(struct pt_regs *ctx, int a, int b, int c, int d, const char *fmt, int val0) {
	bpf_trace_printk("%d %s\\n", val0, fmt);
	return 0;
}
""")

last=0
while 1:
	try:
		(task, pid, cpu, flags, ts, msg) = b.trace_fields()
		(val0, fmt) = msg.split(' ', 1)
		if (fmt == "Rssi %d, TSF %llu"):
			signal = int(val0);
			for c in range(0, (100 + signal) / 2):
				print("*", end="")
			print("")
			if (last != signal):
				sound.stop()
				sound = pygame.sndarray.make_sound(note[signal])
				sound.play(-1)
			last = signal
	except KeyboardInterrupt:
		exit()
	except ValueError:
		continue

sound.stop()
Remember how this was a one-liner in bpftrace? If you want to develop your own BPF observability tools, you should start with bpftrace as I did here. My [BPF Performance Tools] book has plenty of examples. This is the culmination of five years of work: the BPF kernel runtime, C support, LLVM and Clang support, the BCC front-end, and finally the bpftrace language. Starting with other interfaces is like writing your first Java program in [JVM bytecode]. You can...but if you're looking for an educational challenge, your time is better spent using BPF tools to find performance wins (as I do in the book.) bpftrace became even more powerful on Linux 5.3, which raised the BPF instruction limit from four thousand instructions to effectively unlimited (one million instructions.) Masanori Misono has already ported [Tetris to bpftrace] (originally by Tomoki Sekiyama):

It reads keystrokes by tracing the pty_write() kernel function. On the topic of tracing keystrokes, David S. Miller (networking maintainer) recently asked if I still used my "noisy typewriter", which I had turned on a little too loud during a [LISA 2016] demo. Since I switched laptops it no longer works, so I just wrote it using BPF: [bpf-typewriter]. The source is:
#!/usr/local/bin/bpftrace --unsafe
#include <linux/input-event-codes.h>

kprobe:kbd_event
/arg1 == EV_KEY && arg3 == 1/
{
	if (arg2 == KEY_ENTER) {
		system("aplay -q clink.au >/dev/null 2>&1 &");
	} else {
		system("aplay -q click.au >/dev/null 2>&1 &");
	}
}
You can check it out [here]. Requires bpftrace. For many more bpftrace examples, check out my latest [book], which is arriving in time for Christmas (the paper version).

[JVM bytecode]: https://medium.com/@davethomas_9528/writing-hello-world-in-java-byte-code-34f75428e0ad
[youtube]: https://youtu.be/16slh29iN1g
[0:50]: https://youtu.be/16slh29iN1g?t=50
[slideshare]: https://www.slideshare.net/brendangregg/reinvent-2019-bpf-performance-analysis-at-netflix
[bpftrace]: https://github.com/iovisor/bpftrace
[BCC]: https://github.com/iovisor/bcc
[BPF Performance Tools]: /bpf-performance-tools-book.html
[Tetris to bpftrace]: https://github.com/mmisono/bpftrace-tetris
[noisy typewriter]: http://www.brendangregg.com/specials.html#typewriter
[bpf-typewriter]: http://www.brendangregg.com/specials.html#typewriter
[here]: http://www.brendangregg.com/specials.html#typewriter
[LISA 2016]: https://www.youtube.com/watch?v=GsMs3n8CB6g
[book]: /bpf-performance-tools-book.html

December 22, 2019 08:00 AM

December 17, 2019

Pete Zaitcev: Fedora versus Lulzbot

I selected Lulzbot Mini as my 3D printer in large part because of the strong connection between true open source and the company. It came with some baggage: not the cheapest, stuck in the world of 3mm filament, Cura generates mediocre support. But I could tolerate all that.

However, some things happened.

One, the maker of the printer, Aleph Objects, collapsed and sold itself to FAME 3D.

Two, Fedora shipped a completely, utterly busted Cura twice: both Fedora 30 and Fedora 31 came out with the package that just cannot be used.

I am getting a tiny bit concerned for the support for my Lulzbot going forward. Since printers expend things like cleaning strips, availability of parts is a big problem. And I cannot live with software being unusable. Last time it happened, Spot fixed the bug pretty quickly, in a matter of days. So, only a tiny bit concerned.

But that bit is enough to consider alternatives. Again, FLOSS-friendly is paramount. The rest is not that important, but I need it to work.

December 17, 2019 09:34 PM

December 10, 2019

David Sterba: Btrfs highlights in 5.3

A bit more detailed overview of the btrfs updates that I find interesting; see the pull request for the rest.

CRC32C uses accelerated versions on other architectures

… than just Intel-based ones. There was a hard-coded check for the Intel SSE feature providing the accelerated instruction, but this totally skipped other architectures. A brief check in linux.git/arch/*/crypto for files implementing the accelerated versions revealed that there’s a bunch of them: ARM, ARM64, MIPS, PowerPC, s390 and SPARC. I don’t have enough hardware to show the improvements, though.
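
A quick way to see which implementation a running kernel actually picked is to look at /proc/crypto; a minimal check (the driver name is architecture-specific, e.g. crc32c-intel on x86 with SSE 4.2):

# List the registered crc32c implementations; the highest-priority driver wins
grep -A5 'name.*: crc32c' /proc/crypto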

Automatically remove incompat bit for RAID5/6

While this is not the fix everybody is waiting on, it demonstrates the user-developer-user cycle of how things can be improved. A filesystem created with RAID5 or -6 profiles sets an incompatibility bit. That’s supposed to be there as long as there’s any chunk using the profiles. User expectation is that the bit should be removed once the chunks are, e.g., balanced away. This is what got implemented and serves as a prior example for any future feature that might get removed on a given filesystem. (Note from the future: I did that for the RAID1C34 as well).

Side note: for the chunk profile it’s easy because the runtime check of presence is quick and requires only going through a list of chunk profile types. An example of the worst case would be dropping the incompat bit for LZO after there are no compressed files using that algorithm. You can easily see that this would require either enumerating all extent metadata (one time check) or keeping track of the count since mkfs time. This is IMHO not a typical request and can eventually be done using the btrfstune tool on an unmounted filesystem.
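
As a concrete example of the cycle described above, balancing away from RAID5 and then checking the on-disk flags might look something like this (mount point, device and target profile are placeholders, not a recommendation):

# Convert data and metadata chunks away from RAID5
btrfs balance start -dconvert=raid1 -mconvert=raid1 /mnt/data
# With 5.3, the RAID56 incompat bit should drop off once no such chunks remain
btrfs inspect-internal dump-super /dev/sdb1 | grep -i incompat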

December 10, 2019 11:00 PM

Pete Zaitcev: ARM Again in 2019 or 2020

I have been at this topic since the NetWinder was a thing, and yet again I have started looking at having an ARM box in the house. Arnd provided me with the following tips when I asked about CuBox:

2GB of RAM [in the best CuBox] is not great, but doesn't stop you from building kernels. In this case, the CPU is going to be the real bottleneck.

Of course it's not SBSA, since this is an embedded 32 bit SoC. Almost no 64 bit machines are SBSA either (most big server chips claim compliance but are lacking somewhat), but generally nobody actually cares.

HoneyComb LX2K is probably what you want if you can spare the change and wait for the last missing bug fixes.

The RK3399 chip is probably a good compromise between performance and cost, there are various boards with 4GB of RAM and an m.2 slot for storage, around $100.

Why does it have to be like this? All I wanted was to make usbmon work on ARM.

The assertion that nobody cares about SBSA is rather interesting. Obviously, nobody in the embedded area does. They just fork Linux, clone a bootloader, flash it and ship, and then your refrigerator sends spam and your TV is used to attack your printer, while they move on to the next IoT product. But I do care. I want to download Fedora and run it, like I can on x86. Is that too much to ask?

The HoneyComb LX2K is made by the same company as CuBox, but the naked board costs $750.

December 10, 2019 09:17 PM

Daniel Vetter: Upstream Graphics: Too Little, Too Late

Unlike the tradition of my past few talks at Linux Plumbers or Kernel conferences, this time around in Lisboa I did not start out with a rant proposing to change everything. Instead I celebrated roughly 10 years of upstream graphics progress and finally achieving paradise. But that was all just prelude to a few bait-and-switches, to later fulfill expectations on what’s broken this time around in upstream, totally, and what needs to be fixed and changed, maybe.

The LPC video recording is now released, slides are uploaded. If neither of that is to your taste, read below the break for the written summary.

Mission Accomplished

10 or so years ago upstream graphics was essentially a proof of concept for what was promised to come. Kernel display modeset had just landed, finally bringing a somewhat modern display driver userspace API to Linux. And GEM, the graphics execution manager, had landed, bringing proper GPU memory management and multi-client rendering. Realistically a lot still needed to be done, from rendering drivers for all the various SoCs, to an atomic display API that can expose all the features, not just what was needed to light up a linux desktop back in the day. And lots of work to improve the codebase and make it much easier and quicker to write drivers.

There’s obviously still a lot to do, but I think we’ve achieved that - for full details, check out my ELCE talk about everything great for upstream graphics.

Now despite all this justified celebrating, there is one sticking point still:

NVIDIA

The trouble with team green from an open source perspective - for them it’s a great boon - is that they own the GPU software stack in two crucial ways:

Together these create a huge software moat around the high margin hardware business. All an open stack would achieve is filling in that moat and inviting competition to eat the nice surplus. In other words, stupid to even attempt, vendor lock-in just pays too well.

Now of course the reverse engineered nouveau driver still exists. But if you have to pay for reverse engineering already, then you might as well go with someone else’s hardware, since you’re not going to get any of the CUDA/GL goodies.

And the business case for open source drivers indeed exists so much that even paying for reverse engineering a full stack is no problem. The result is a vibrant community of hardware vendors, customers, distros and consulting shops who pay the bills for all the open driver work that’s being done. And in userspace even “upstream first” works - releases happen quickly and often enough, with sufficiently smooth merge process that having a vendor tree is simply not needed. Plus customer’s willingness to upgrade if necessary, because it’s usually a well-contained component to enable new hardware support.

In short without a solid business case behind open graphics drivers, they’re just not going to happen, viz. NVIDIA.

Not Shipping Upstream

Unfortunately the business case for “upstream first” on the kernel side is completely broken. Not for open source, and not for any fundamental reasons, but simply because the kernel moves too slowly, is too big, drivers aren’t well contained enough and therefore customers will not or even cannot upgrade. For some hardware upstreaming early enough is possible, but graphics simply moves too fast: By the time the upstreamed driver is actually in shipping distros, it’s already one hardware generation behind. And missing almost a year of tuning and performance improvements. Worse, it’s not just new hardware, but also GL and Vulkan versions that won’t work on older kernels due to missing features, fragmenting the ecosystem further.

This is entirely unlike the userspace side, where refactoring and code sharing in a cross-vendor shared upstream project actually pays off. Even in the short term.

There’s a lot of approaches trying to paper over this rift with the linux kernel:

Also, there just isn’t a single LTS kernel. Even upstream has multiple, plus every distro has their own flavour, plus customers love to grow their own variety trees too. Often they’re not even coordinated on the same upstream release. Cheapest way to support this entire madness is to completely ignore upstream and just write your own subsystem. Or at least not use any of the helper libraries provided by kernel subsystems, completely defeating the supposed benefit of upstreaming code.

No matter the strategy, they all boil down to paying twice - if you want to upstream your code. And there’s no added return for the doubled bill. In conclusion, upstream first needs a business case, like the open source graphics stack in general. And that business case is very much real, except for upstreaming, it’s only real in userspace.

In the kernel, “upstream first” is a sham, at least for graphics drivers.

Thanks to Alex Deucher for reading and commenting on drafts of this text.

December 10, 2019 12:00 AM

December 03, 2019

Daniel Vetter: ELCE Lyon: Everything Great About Upstream Graphics

At ELC Europe in Lyon I held a nice little presentation about the state of upstream graphics drivers, and how absolutely awesome it all is. Of course with a big focus on SoC and embedded drivers. Slides and the video recording

Key takeaways for the busy:

In other words, world domination is assured and progressing according to plan.

December 03, 2019 12:00 AM

December 02, 2019

Brendan Gregg: BPF: A New Type of Software

At Netflix we have 15 BPF programs running on cloud servers by default; Facebook has 40. These programs are not processes or kernel modules, and don't appear in traditional observability tools. They are a new type of software, and make a fundamental change to a 50-year old kernel model by introducing a new interface for applications to make kernel requests, alongside syscalls. BPF originally stood for Berkeley Packet Filter, but has been extended in Linux to become a generic kernel execution engine, capable of running a new type of user-defined and kernel-mode applications. This is what BPF is really about, and I described this for the first time in my [Ubuntu Masters] keynote. The video is on [youtube]:
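
As a minimal illustration of what one of these programs can look like (not from the keynote itself; it assumes bpftrace is installed), counting syscalls per process is a one-liner, with the aggregation happening inside the kernel:

# Count syscalls by process name until Ctrl-C; the map is maintained in kernel context
bpftrace -e 'tracepoint:raw_syscalls:sys_enter { @[comm] = count(); }'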

And the slides are on [slideshare]:
My [BPF Performance Tools] book was just released as an eBook, and covers just one use case of BPF: observability. I'm also speaking about this topic at [re:Invent] this week (see you there). BPF is the biggest operating systems change I've seen in my career, and it's thrilling to be a part of it. Thanks to Canonical for inviting me to speak about it at the first Ubuntu Masters event.

[youtube]: https://www.youtube.com/watch?v=7pmXdG8-7WU&feature=youtu.be
[slideshare]: https://www.slideshare.net/brendangregg/um2019-bpf-a-new-type-of-software
[Ubuntu Masters]: https://ubuntu.com/blog/the-masters-speak-forward-thinking-ubuntu-users-gather-to-share-their-experiences
[BPF Performance Tools]: http://www.brendangregg.com/bpf-performance-tools-book.html
[re:Invent]: https://www.portal.reinvent.awsevents.com/connect/search.ww?#loadSearch-searchPhrase=OPN303&searchType=session&tc=0&sortBy=abbreviationSort&p=

December 02, 2019 08:00 AM

November 27, 2019

Arnaldo Carvalho de Melo: sudo is fast again

A big hammer solution:

[root@quaco ~]# rpm -e fprintd fprintd-pam
[error] [/etc/nsswitch.conf] is not a symbolic link!
[error] [/etc/nsswitch.conf] was not created by authselect!
[error] Unexpected changes to the configuration were detected.
[error] Refusing to activate profile unless those changes are removed or overwrite is requested.
Unable to disable feature [17]: File exists
[root@quaco ~]#

The warnings are not that reassuring, and trying to use authselect to check the config doesn’t bring extra confidence either:

[root@quaco ~]# authselect check
[error] [/etc/nsswitch.conf] is not a symbolic link!
[error] [/etc/nsswitch.conf] was not created by authselect!
Current configuration is not valid. It was probably modified outside authselect.

The fprintd is still in the config files:

[root@quaco ~]# grep fprintd /etc/pam.d/system-auth
auth sufficient pam_fprintd.so
[root@quaco ~]#

But since it is not installed, I get my fast sudo again, back to work.

November 27, 2019 07:28 PM

Arnaldo Carvalho de Melo: What is ‘sudo su -‘ doing?

Out of the blue sudo started taking a long time to ask for my password, so I sleeptyped:

$ strace sudo su -

sudo: effective uid is not 0, is /usr/bin/sudo on a file system with the 'nosuid' option set or an NFS file system without root privileges?
$

Oops, perhaps it would be a good time for me to try using ‘perf trace’, so I tried:

perf trace --duration 5000 --call-graph=dwarf

To do system wide syscall tracing looking for syscalls taking more than 5 seconds to complete, together with DWARF callchains.

And after tweaking that --duration parameter and using --filter-pids to exclude some long-timeout processes that seemed unrelated, and even without using '-e \!futex' to exclude some syscalls that take that long to complete and, again, looked unrelated to sudo being stuck, I got the clue I needed from this entry:


12345.846 (25024.785 ms): sudo/3571 poll(ufds: 0x7ffdcc4376a0, nfds: 1, timeout_msecs: 25000) = 0 (Timeout)
__GI___poll (inlined)
[0x30dec] (/usr/lib64/libdbus-1.so.3.19.11)
[0x2fab0] (/usr/lib64/libdbus-1.so.3.19.11)
[0x176cb] (/usr/lib64/libdbus-1.so.3.19.11)
[0x1809f] (/usr/lib64/libdbus-1.so.3.19.11)
[0x1518b] (/usr/lib64/libdbus-glib-1.so.2.3.4)
dbus_g_proxy_call (/usr/lib64/libdbus-glib-1.so.2.3.4)
pam_sm_authenticate (/usr/lib64/security/pam_fprintd.so)
[0x41f1] (/usr/lib64/libpam.so.0.84.2)
pam_authenticate (/usr/lib64/libpam.so.0.84.2)
[0xb703] (/usr/libexec/sudo/sudoers.so)
[0xa8f4] (/usr/libexec/sudo/sudoers.so)
[0xc754] (/usr/libexec/sudo/sudoers.so)
[0x24a83] (/usr/libexec/sudo/sudoers.so)
[0x1d759] (/usr/libexec/sudo/sudoers.so)
[0x6ef3] (/usr/bin/sudo)
__libc_start_main (/usr/lib64/libc-2.29.so)
[0x887d] (/usr/bin/sudo)

So it's about PAM, authentication using some fprintd module, and sudo polling with a timeout of 25000 msecs; no wonder that when I first tried with --failure, to ask just for syscalls that returned some error, I wasn’t getting anything…

Let's see what this thing is:

[root@quaco ~]# rpm -qf /usr/lib64/security/pam_fprintd.so
fprintd-pam-0.9.0-1.fc30.x86_64
[root@quaco ~]# rpm -q --qf "%{description}\n" fprintd-pam
PAM module that uses the fprintd D-Bus service for fingerprint
authentication.
[root@quaco ~]

I don’t recall enabling this and from a quick look this t480s doesn’t seem to have any fingerprint reader, lets see how to disable this on this Fedora 30 system…

November 27, 2019 07:15 PM

November 26, 2019

Pete Zaitcev: Greg K-H uses mutt

Greg threw a blog post that contains a number of interesting animations of his terminal sessions. Fascinating!

I do not use mutt because I often need to copy-paste, and I hate dealing with line breaks that the terminal inevitably foists upon me. So, many years ago I migrated to Sylpheed, and from there to Claws (because of the support of anti-aliased fonts).

Greg's approach differs in that it avoids the copy-paste problem by manipulating the bodies of messages programmatically. So, the patches travel between messages, files, and git without being printed to a terminal.

November 26, 2019 07:04 AM

November 21, 2019

Kees Cook: experimenting with Clang CFI on upstream Linux

While much of the work on kernel Control Flow Integrity (CFI) is focused on arm64 (since kernel CFI is available on Android), a significant portion is in the core kernel itself (and especially the build system). Recently I got a sane build and boot on x86 with everything enabled, and I’ve been picking through some of the remaining pieces. I figured now would be a good time to document everything I do to get a build working in case other people want to play with it and find stuff that needs fixing.

First, everything is based on Sami Tolvanen’s upstream port of Clang’s forward-edge CFI, which includes his Link Time Optimization (LTO) work, which CFI requires. This tree also includes his backward-edge CFI work on arm64 with Clang’s Shadow Call Stack (SCS).

On top of that, I’ve got a few x86-specific patches that get me far enough to boot a kernel without warnings pouring across the console. Along with that are general linker script cleanups, CFI cast fixes, and x86 crypto fixes, all in various states of getting upstreamed. The resulting tree is here.

On the compiler side, you need a very recent Clang and LLD (i.e. “Clang 10”, or what I do is build from the latest git). For example, here’s how to get started. First, checkout, configure, and build Clang (and include a RISC-V target just for fun):

# Check out latest LLVM
mkdir -p $HOME/src
cd $HOME/src
git clone https://github.com/llvm/llvm-project.git
mkdir llvm-build
cd llvm-build
# Configure
cmake -DCMAKE_BUILD_TYPE=Release \
      -DLLVM_ENABLE_PROJECTS='clang;lld' \
      -DLLVM_EXPERIMENTAL_TARGETS_TO_BUILD="RISCV" \
      ../llvm-project/llvm
# Build!
make -j$(getconf _NPROCESSORS_ONLN)
# Install cfi blacklist template (why is this missing from "make" above?)
mkdir -p $(echo lib/clang/*)/share
cp ../../llvm-project/compiler-rt/lib/cfi/cfi_blacklist.txt $(echo lib/clang/*/share)/cfi_blacklist.txt

Then checkout, configure, and build the CFI tree. (This assumes you’ve already got a checkout of Linus’s tree.)

# Check out my branch
cd ../linux
git remote add kees https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git
git fetch kees
git checkout kees/kspp/cfi/x86 -b test/cfi
# Use the above built Clang path first for the needed binaries: clang, ld.lld, and llvm-ar.
PATH="$HOME/src/llvm-build/bin:$PATH"
# Configure (this uses "defconfig" but you could use "menuconfig"), but you must
# include CC and LD in the make args or your .config won't know about Clang.
make defconfig CC=clang LD=ld.lld
# Enable LTO and CFI.
scripts/config \
     -e CONFIG_LTO \
     -e CONFIG_THINLTO \
     -d CONFIG_LTO_NONE \
     -e CONFIG_LTO_CLANG \
     -e CONFIG_CFI_CLANG \
     -e CONFIG_CFI_PERMISSIVE \
     -e CONFIG_CFI_CLANG_SHADOW
# Enable LKDTM if you want runtime fault testing:
scripts/config -e CONFIG_LKDTM
# Build!
make -j$(getconf _NPROCESSORS_ONLN) CC=clang LD=ld.lld

Do not be alarmed by various warnings, such as:

ld.lld: warning: cannot find entry symbol _start; defaulting to 0x1000
llvm-ar: error: unable to load 'arch/x86/kernel/head_64.o': file too small to be an archive
llvm-ar: error: unable to load 'arch/x86/kernel/head64.o': file too small to be an archive
llvm-ar: error: unable to load 'arch/x86/kernel/ebda.o': file too small to be an archive
llvm-ar: error: unable to load 'arch/x86/kernel/platform-quirks.o': file too small to be an archive
WARNING: EXPORT symbol "page_offset_base" [vmlinux] version generation failed, symbol will not be versioned.
WARNING: EXPORT symbol "vmalloc_base" [vmlinux] version generation failed, symbol will not be versioned.
WARNING: EXPORT symbol "vmemmap_base" [vmlinux] version generation failed, symbol will not be versioned.
WARNING: "__memcat_p" [vmlinux] is a static (unknown)
no symbols

Adjust your .config as you want (but, again, make sure the CC and LD args are pointed at Clang and LLD respectively). This should(!) result in a happy bootable x86 CFI-enabled kernel. If you want to see what a CFI failure looks like, you can poke LKDTM:

# Log into the booted system as root, then:
cat <(echo CFI_FORWARD_PROTO) >/sys/kernel/debug/provoke-crash/DIRECT
dmesg

Here’s the CFI splat I see on the console:

[   16.288372] lkdtm: Performing direct entry CFI_FORWARD_PROTO
[   16.290563] lkdtm: Calling matched prototype ...
[   16.292367] lkdtm: Calling mismatched prototype ...
[   16.293696] ------------[ cut here ]------------
[   16.294581] CFI failure (target: lkdtm_increment_int$53641d38e2dc4a151b75cbe816cbb86b.cfi_jt+0x0/0x10):
[   16.296288] WARNING: CPU: 3 PID: 2612 at kernel/cfi.c:29 __cfi_check_fail+0x38/0x40
...
[   16.346873] ---[ end trace 386b3874d294d2f7 ]---
[   16.347669] lkdtm: Fail: survived mismatched prototype function call!

The claim of “Fail: survived …” is due to CONFIG_CFI_PERMISSIVE=y. This allows the kernel to warn but continue with the bad call anyway. This is handy for debugging. In a production kernel that would be removed and the offending kernel thread would be killed. If you run this again with the config disabled, there will be no continuation from LKDTM. :)
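
Flipping that single option off and rebuilding is all it should take to see the enforcing behavior, reusing the scripts/config approach from above:

# Turn CFI violations from warnings into kills, then rebuild
scripts/config -d CONFIG_CFI_PERMISSIVE
make -j$(getconf _NPROCESSORS_ONLN) CC=clang LD=ld.lld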

Enjoy! And if you can figure out before me why there is still CFI instrumentation in the KPTI entry handler, please let me know and help us fix it. ;)

© 2019 – 2020, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.

November 21, 2019 05:09 AM

November 19, 2019

Michael Kerrisk (manpages): man-pages-5.04 is released

I've released man-pages-5.04. The release tarball is available on kernel.org. The browsable online pages can be found on man7.org. The Git repository for man-pages is available on kernel.org.

This release resulted from patches, bug reports, reviews, and comments from 15 contributors. The release includes approximately 80 commits that change just under 30 pages.

The most notable of the changes in man-pages-5.04 are the following:

Another small but important change is the addition of documentation for the P_PIDFD idtype in the waitid(2) manual page. This feature, added in Linux 5.4, allows a parent process to wait on a child process that is referred to by a PID file descriptor, and constitutes the final cornerstone in the pidfd API.

November 19, 2019 09:04 PM

November 18, 2019

Matthew Garrett: Extending proprietary PC embedded controller firmware

I'm still playing with my X210, a device that just keeps coming up with new ways to teach me things. I'm now running Coreboot full time, so the majority of the runtime platform firmware is free software. Unfortunately, the firmware that's running on the embedded controller (a separate chip that's awake even when the rest of the system is asleep and which handles stuff like fan control, battery charging, transitioning into different power states and so on) is proprietary and the manufacturer of the chip won't release data sheets for it. This was disappointing, because the stock EC firmware is kind of annoying (there's no hysteresis on the fan control, so it hits a threshold, speeds up, drops below the threshold, turns off, and repeats every few seconds - also, a bunch of the Thinkpad hotkeys don't do anything) and it would be nice to be able to improve it.

A few months ago someone posted a bunch of fixes, a Ghidra project and a kernel patch that lets you overwrite the EC's code at runtime for purposes of experimentation. This seemed promising. Some amount of playing later and I'd produced a patch that generated keyboard scancodes for all the missing hotkeys, and I could then use udev to map those scancodes to the keycodes that the thinkpad_acpi driver would generate. I finally had a hotkey to tell me how much battery I had left.

But something else included in that post was a list of the GPIO mappings on the EC. A whole bunch of hardware on the board is connected to the EC in ways that allow it to control them, including things like disabling the backlight or switching the wifi card to airplane mode. Unfortunately the ACPI spec doesn't cover how to control GPIO lines attached to the embedded controller - the only real way we have to communicate is via a set of registers that the EC firmware interprets and does stuff with.

One of those registers in the vendor firmware for the X210 looked promising, with individual bits that looked like radio control. Unfortunately writing to them does nothing - the EC firmware simply stashes that write in an address and returns it on read without parsing the bits in any way. Doing anything more with them was going to involve modifying the embedded controller code.
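
(If you want to poke at the EC's register space from userspace to see this sort of behavior for yourself, the stock ec_sys module is sufficient; the offset below is purely a placeholder rather than the actual register:)

# Expose the EC register file via debugfs, with writes enabled (requires CONFIG_ACPI_EC_DEBUGFS)
modprobe ec_sys write_support=1
# Dump the 256-byte register space
hexdump -C /sys/kernel/debug/ec/ec0/io
# Write a byte to a placeholder offset (0x40 here) and read it back
printf '\x01' | dd of=/sys/kernel/debug/ec/ec0/io bs=1 seek=$((0x40)) conv=notrunc
hexdump -C -s $((0x40)) -n 1 /sys/kernel/debug/ec/ec0/io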

Thankfully the EC has 64K of firmware and is only using about 40K of that, so there's plenty of room to add new code. The problem was generating the code in the first place and then getting it called. The EC is based on the CR16C architecture, which binutils supported until 10 days ago. To be fair it didn't appear to actually work, and binutils still has support for the more generic version of the CR16 family, so I built a cross assembler, wrote some assembly and came up with something that Ghidra was willing to parse except for one thing.

As mentioned previously, the existing firmware code responded to writes to this register by saving it to its RAM. My plan was to stick my new code in unused space at the end of the firmware, including code that duplicated the firmware's existing functionality. I could then replace the existing code that stored the register value with code that branched to my code, did whatever I wanted and then branched back to the original code. I hacked together some assembly that did the right thing in the most brute force way possible, but while Ghidra was happy with most of the code it wasn't happy with the instruction that branched from the original code to the new code, or the instruction at the end that returned to the original code. The branch instruction differs from a jump instruction in that it gives a relative offset rather than an absolute address, which means that branching to nearby code can be encoded in fewer bytes than going further. I was specifying the longest jump encoding possible in my assembly (that's what the :l means), but the linker was rewriting that to a shorter one. Ghidra was interpreting the shorter branch as a negative offset, and it wasn't clear to me whether this was a binutils bug or a Ghidra bug. I ended up just hacking that code out of binutils so it generated code that Ghidra was happy with and got on with life.

Writing values directly to that EC register showed that it worked, which meant I could add an ACPI device that exposed the functionality to the OS. My goal here is to produce a standard Coreboot radio control device that other Coreboot platforms can implement, and then just write a single driver that exposes it. I wrote one for Linux that seems to work.

In summary: closed-source code is more annoying to improve, but that doesn't mean it's impossible. Also, strange Russians on forums make everything easier.


November 18, 2019 08:44 AM

November 15, 2019

Kees Cook: security things in Linux v5.3

Previously: v5.2.

Linux kernel v5.3 was released! I let this blog post get away from me, but it’s up now! :) Here are some security-related things I found interesting:

heap variable initialization
In the continuing work to remove “uninitialized” variables from the kernel, Alexander Potapenko added new “init_on_alloc” and “init_on_free” boot parameters (with associated Kconfig defaults) to perform zeroing of heap memory either at allocation time (i.e. all kmalloc()s effectively become kzalloc()s), at free time (i.e. all kfree()s effectively become kzfree()s), or both. The performance impact of the former under most workloads appears to be under 1%, if it’s measurable at all. The “init_on_free” option, however, is more costly but adds the benefit of reducing the lifetime of heap contents after they have been freed (which might be useful for some use-after-free attacks or side-channel attacks). Everyone should enable CONFIG_INIT_ON_ALLOC_DEFAULT_ON=1 (or boot with “init_on_alloc=1“), and the more paranoid system builders should add CONFIG_INIT_ON_FREE_DEFAULT_ON=1 (or “init_on_free=1” at boot). As workloads are found that cause performance concerns, tweaks to the initialization coverage can be added.
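
For reference, turning this on looks something like the following, either via the Kconfig defaults at build time or on the kernel command line at boot (the scripts/config invocations assume you are in a kernel source tree):

# Build-time defaults
scripts/config -e CONFIG_INIT_ON_ALLOC_DEFAULT_ON
scripts/config -e CONFIG_INIT_ON_FREE_DEFAULT_ON
# Or at boot time, append "init_on_alloc=1 init_on_free=1" to the kernel command line,
# then confirm on the running system:
cat /proc/cmdline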

pidfd_open() added
Christian Brauner has continued his pidfd work by creating the next needed syscall: pidfd_open(), which takes a pid and returns a pidfd. This is useful for cases where process creation isn’t yet using CLONE_PIDFD, and where /proc may not be mounted.

-Wimplicit-fallthrough enabled globally
Gustavo A.R. Silva landed the last handful of implicit fallthrough fixes left in the kernel, which allows for -Wimplicit-fallthrough to be globally enabled for all kernel builds. This will keep any new instances of this bad code pattern from entering the kernel again. With several hundred implicit fallthroughs identified and fixed, something like 1 in 10 were missing breaks, which is way higher than I was expecting, making this work even more well justified.

x86 CR4 & CR0 pinning
In recent exploits, one of the steps for making the attacker’s life easier is to disable CPU protections like Supervisor Mode Access (and Execute) Prevention (SMAP and SMEP) by finding a way to write to CPU control registers to disable these features. For example, CR4 controls SMAP and SMEP, where disabling those would let an attacker access and execute userspace memory from kernel code again, opening up the attack to much greater flexibility. CR0 controls Write Protect (WP), which when disabled would allow an attacker to write to read-only memory like the kernel code itself. Attacks have been using the kernel’s CR4 and CR0 writing functions to make these changes (since it’s easier to gain that level of execute control), but now the kernel will attempt to “pin” sensitive bits in CR4 and CR0 to avoid them getting disabled. This forces attacks to do more work to enact such register changes going forward. (I’d like to see KVM enforce this too, which would actually protect guest kernels from all attempts to change protected register bits.)

additional kfree() sanity checking
In order to avoid corrupted pointers doing crazy things when they’re freed (as seen in recent exploits), I added additional sanity checks to verify kmem cache membership and to make sure that objects actually belong to the kernel slab heap. As a reminder, everyone should be building with CONFIG_SLAB_FREELIST_HARDENED=1.
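
A quick way to confirm a kernel has this, assuming the distro ships its build config in the usual place:

# In a kernel source tree:
scripts/config -e CONFIG_SLAB_FREELIST_HARDENED
# On a running system (the config path varies by distro):
grep SLAB_FREELIST_HARDENED /boot/config-$(uname -r)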

KASLR enabled by default on arm64
Just as Kernel Address Space Layout Randomization (KASLR) was enabled by default on x86, now KASLR has been enabled by default on arm64 too. It’s worth noting, though, that in order to benefit from this setting, the bootloader used for such arm64 systems needs to either support the UEFI RNG function or provide entropy via the “/chosen/kaslr-seed” Device Tree property.
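
On a device-tree based arm64 system, a rough way to check whether the bootloader is providing a seed is to look for the property itself (it may read back as zeros, since the kernel wipes the value after consuming it); the dmesg check is only a hint, as not every kernel version logs KASLR status:

# Presence of the property means the bootloader supplied (or at least reserved) a seed
ls -l /proc/device-tree/chosen/kaslr-seed
# Recent kernels may also log whether KASLR ended up enabled
dmesg | grep -i kaslr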

hardware security embargo documentation
As there continues to be a long tail of hardware flaws that need to be reported to the Linux kernel community under embargo, a well-defined process has been documented. This will let vendors unfamiliar with how to handle things follow the established best practices for interacting with the Linux kernel community in a way that lets mitigations get developed before embargoes are lifted. The latest (and HTML rendered) version of this process should always be available here.

Those are the things I had on my radar. Please let me know if there are other things I should add! Linux v5.4 is almost here…

© 2019 – 2020, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.

November 15, 2019 01:36 AM

November 01, 2019

Pete Zaitcev: Linode Object Storage

Linode made their Object Storage product generally available. They dropped the Swift API that they offered in the beta, and only rolled with S3.

They are careful not to mention what implementation underpins it, but I suspect it's Ceph RGW. They already provide Ceph block storage as public service. Although, I don't know for sure.

November 01, 2019 03:20 PM

October 29, 2019

Pete Zaitcev: Samsung shutting down CPU development in Austin

An acquaintance of mine was laid off from Samsung. He was a rank-and-file ASIC designer and worked on the FPU unit for Samsung's new CPU. Another acquaintance, a project manager in the silicon field, relayed that ARM supposedly developed new CPUs that are so great that all competitors gave up and folded their own CPU development, resulting in the layoffs. The online sources have details.

Around the same time they gave up on in-house cores, Samsung announced the Exynos 990, a standalone version of the 980, based on the successors of the Cortex family, developed by ARM, of course.

As someone said on Glassdoor, "great place to work until you're laid off".

October 29, 2019 07:58 PM

Paul E. Mc Kenney: The Old Man and His Smartphone

I recently started using my very first smartphone, and it was suggested that I blog about the resulting experiences.  So here you go!

October 29, 2019 01:57 PM

Paul E. Mc Kenney: The Old Man and His Smartphone, Episode VII

The previous episode speculated about the past, so this episode will make some wild guesses about the future.

There has been much hue and cry about the ill effects of people being glued to their smartphones.  I have tended to discount this viewpoint due to having seen a great many people's heads buried in newspapers, magazines, books, and television screens back in the day.  And yes, there was much hue and cry about that as well, so I guess some things never change.

However, a few years back, the usual insanely improbable sequence of events resulted in me eating dinner with the Chief of Police of a mid-sized but prominent city, both of which will go nameless.  He called out increased smartphone use as having required him to revamp his training programs.  You see, back in the day, typical recruits could reasonably be expected to have the social skills required to defuse a tense situation, using what he termed "verbal jiujitsu".  However, present-day recruits need to take actual classes in order to master this lost art.

I hope that we can all agree that it is far better for officers of the law to maintain order through use of vocal means, perhaps augmented with force of personality, especially given that the alternative seems to be the use of violence.  So perhaps the smartphone is responsible for some significant social change after all.  Me, I will leave actual judgment on this topic to psychologists, social scientists, and of course historians.  Not that any of them are likely to reach a conclusion that I would trust.  Based on past experience, far from it!  The benefit of leaving such judgments to them is instead that it avoids me wasting any further time on such judgments.  Or so I hope.

It is of course all too easy to be extremely gloomy about the overall social impact of smartphones.  One could easily argue that people freely choose spreading misinformation over accessing vast stores of information, bad behavior over sweetness and light, and so on and so forth.

But it really is up to each and every one of us.  After all, if life were easy, I just might do a better job of living mine.  So maybe we all need to brush up on our social skills.  And to do a better job of choosing what to post, to say nothing of what posts to pass on.  Perhaps including the blog posts in this series!

Cue vigorous arguments on the appropriateness of these goals, or, failing that, the best ways to accomplish them.  ;-)

October 29, 2019 01:50 PM

October 27, 2019

Pete Zaitcev: ?fbclid

As some of you might have noticed, Facebook started adding a tracking token to all URLs as a query string "?fbclid=XXXXXXXXX". I don't know how it works, exactly. Perhaps it rattles to FB when people re-inject these links into FB after they cycle through other social media. Either way, today I found a website that innocently fails to work when shared on FB: Whomp. If one tries to share a comic strip, such as "It's Training Men", FB appends its token, and makes the URL invalid.

October 27, 2019 03:30 AM

October 25, 2019

Paul E. Mc Kenney: The Old Man and His Smartphone, Episode VI

A common science-fiction conceit is some advanced technology finding its way into a primitive culture, so why not consider what might happen if my smartphone were transported a few centuries back in time?

Of course, my most strongly anticipated smartphone use, location services, would have been completely useless as recently as 30 years ago, let alone during the 1700s.  You see, these services require at least 24 GPS satellites in medium Earth orbit, which didn't happen until 1993.

My smartphone's plain old telephony functionality would also have been useless surprisingly recently, courtesy of its need for large numbers of cell towers, to say nothing of the extensive communications network interconnecting them.  And I am not convinced that my smartphone would have been able to use the old analog cell towers that were starting to appear in the 1980s, but even if it could, for a significant portion of my life, my smartphone would have been completely useless as a telephone.

Of course, the impressive social-media capabilities of my smartphone absolutely require the huge networking and server infrastructure that has been in place only within the past couple of decades.

And even though my smartphone's battery lifetime is longer than I expected, extended operation relies on the power grid, which did not exist at all until the late 1800s. So any wonderment generated by my transported-back-in-time smartphone would be of quite limited duration.  But let's avoid this problem through use of a solar-array charger.

My smartphone probably does not much like water, large changes in temperature, or corrosive environments.  Its dislike of adverse environmental conditions would have quickly rendered it useless in a couple of my childhood houses, to say nothing of almost all buildings in existence a couple of centuries ago.  This means that long-term use would require confining my smartphone to something like a high-end library located some distance from bodies of salt water and from any uses of high-sulfur coal.  This disqualifies most 1700s Western Hemisphere environments, as well as many of the larger Eastern Hemisphere cities, perhaps most famously London with its pea-soup "fogs".  Environmental considerations also pose interesting questions regarding exactly how to deploy the solar array, especially during times of inclement weather.

So what could my smartphone do back in the 1700s?

I leave off any number of uses that involve ferrying information from 2019 back to the 1700s.  Even an accurate seacoast map could be quite useful and illuminating, but this topic has been examined quite closely by several generations of science-fiction writers, so I would not expect to be able to add anything new.  However, it might be both useful and entertaining to review some of this genre.  Everyone will have their favorites, but I will just list the three that came to mind most quickly:  Heinlein's "The Door into Summer", Zemeckis's and Gale's "Back to the Future" movies, and Rowling's "Harry Potter and the Cursed Child."  So let us proceed, leaving back-in-time transportation of information to today's legion of science-fiction writers.

The upshot of all this is that if my smartphone were transported back to the 1700s, it would be completely unable to provide its most commonly used 2019 functionality.  However, given a solar array charger and development environment, and given a carefully controlled environment, it might nevertheless be quite impressive to our 1700s counterparts.

In fact, it would be quite impressive much more recently.  Just imagine if, at the end of the Mother of all Demos, Douglas Engelbart had whipped out a ca-2019 smartphone.

But the hard cold fact is that in some ways, a 2019 smartphone would actually have been a step backwards from Engelbart's famous demo.  After all, Engelbart's demo allowed shared editing of a document.  Lacking both wifi and a cell-phone network, and given the severe range limitations of NFC, my smartphone would be utterly incapable of shared editing of documents.

In short, although my smartphone might be recognized as a very impressive device back in the day, the hard cold fact is that it is but the tiniest tip of a huge iceberg of large-scale technology, including forests of cell towers, globe-girdling fiber-optic cables, vast warehouses stuffed with servers, and even a constellation of satellites.  Thus, the old adage "No one is an island" was never more true than it is today.  However, the bridges between our respective islands seem to be considerably more obscure than they were in the old days.  Which sooner or later will call into question whether these bridges will be properly maintained, but that question applies all too well to a depressingly broad range of infrastructure on which we all depend.

Which leads to another old adage: "The more things change, the more they stay the same".  :-)

October 25, 2019 08:51 PM

Paul E. Mc Kenney: The Old Man and His Smartphone, Episode V

So after many years of bragging about my wristwatch's exemplary battery lifetime (years, I tell you, years!!!), I find myself a member of that very non-exclusive group that worries about battery lifetime.  But I have been pleasantly surprised to find that my smartphone's battery lasts quite a bit longer than I would have expected, in fact, it is not unusual for the battery to last through two or three days of normal usage.  And while that isn't years, it isn't at all bad.

That is, battery lifetime isn't at all bad as long as I have location services (AKA GPS) turned off.

With GPS on, the smartphone might or might not make it through the day. Which is a bit ironic given that GPS was the main reason that I knew I would eventually be getting a smartphone.  And the smartphone isn't all that happy about my current policy of keeping the GPS off unless I am actively using it.  The camera in particular keeps whining about how it could tag photos with their locations if only I would turn the GPS on.  However, I have grown used to this sort of thing, courtesy of my television constantly begging me to connect it to Internet.  Besides, if my smartphone were sincerely concerned about tagging my photos with their locations, it could always make use of the good and sufficient location data provided by the cell towers that I happen to know that it is in intimate contact with!

GPS aside, I am reasonably happy with my smartphone's current battery lifetime.  Nevertheless, long experience with other battery-powered devices leads me to believe that it will undergo the usual long slow decay over time.  But right now, it is much better than I would have expected, and way better than any of my laptops.

Then again, I am not (yet) in the habit of running rcutorture on my smartphone...

October 25, 2019 12:05 AM

October 23, 2019

Paul E. Mc Kenney: The Old Man and His Smartphone, Episode IV

I took my first GPS-enabled road trip today in order to meet a colleague for lunch.  It was only about an hour drive each way, but that was nevertheless sufficient to highlight one big advantage of GPS as well as to sound a couple of cautionary notes.

The big advantage didn't seem particularly advantageous at first.  The phone had announced that I should turn left at Center Street, but then inexplicably changed its mind, asking me to instead turn left on a road with a multi-syllabic vaguely Germanic name.  On the return trip, I learned that I had actually missed the left turn onto Center Street courtesy of that road's name changing to Wyers Road at the crossing.  So I saw a sign for Wyers Road and sensibly (or so I thought) elected not to turn left at that point.  The phone seamlessly autocorrected, in fact so seamlessly that I was completely unaware that I had missed the turn.

The first cautionary note involved the phone very quickly changing its mind on which way I should go.  It initially wanted me to go straight ahead for the better part of a mile, but then quickly and abruptly asked me to instead take a hard right-hand turn.  In my youth, this abrupt change might have terminally annoyed me, but one does (occasionally) learn patience with advancing age.

That and the fact that following its initial advice to go straight ahead would have taken me over a ditch, through a fence, and across a pasture.

The second cautionary note was due to the Fall colors here in Upstate New York, which caused me to let the impatient people behind me pass, rather than following my usual practice of taking the presence of tailgaters as a hint to pick up the pace.  I therefore took a right onto a side road, intending to turn around in one of the conveniently located driveways so that I could continue enjoying the Fall colors as I made my leisurely way up the highway.  But my smartphone instead suggested driving ahead for a short way to take advantage of a loop in the road.  I figured it knew more about the local geography than I did, so I naively followed its suggestion.

My first inkling of my naivete appeared when my smartphone asked me to take a right turn onto a one-lane gravel road.  I was a bit skeptical, but the gravel road appeared to have been recently graveled and also appeared to be well-maintained, so why not?  A few hundred yards in, the ruts became a bit deeper than my compact rental car would have preferred, but it is easy to position each pair of wheels on opposite sides of the too-deep rut and continue onwards.

But then I came to the stream crossing the road.

The stream covered perhaps 15 or 20 feet of the road, but on the other hand, it appeared to be fairly shallow in most places, which suggested that crossing it (as my smartphone was suggesting) might be feasible.  Except that there were a few potholes of indeterminate depth filled with swiftly swirling water, with no clear way to avoid them.  Plus the water had eroded the road surface a foot or two below its level elsewhere, which suggested that attempting to drive into the stream might leave my rental car high-centered on the newly crafted bank, leaving my poor car stuck with its nose down in the water and its rear wheels spinning helplessly in the air.

Fortunately, rental cars do have a reverse gear, but unfortunately my body is less happy than it might once have been to maintain the bent-around posture required to look out the rear window while driving backwards several hundred yards down a windy gravel road.  Fortunately, like many late-model cars, this one has a rear-view camera that automatically activates when the car is put into the reverse gear, but unfortunately I was unable to convince myself that driving several hundred yards backwards down a narrow and windy gravel road running through a forest was a particularly good beginner's use of this new-age driving technique.  (So maybe I should take the hint and practice driving backwards using the video in a parking lot?  Or maybe not...)

Which led to the next option, that of turning the car around on a rutted one-lane gravel road.  Fortunately the car is a compact, so this turned out to be just barely possible, and even more fortunately there were no other cars on the road waiting for me to complete my multipoint-star turn-around maneuver.  (Some of my acquaintances will no doubt point out that had I been driving a large pickup, crossing the stream would have been a trivial event unworthy of any notice.  This might be true, but I was in fact driving a compact.)

But all is well that ends well.  After a few strident but easily ignored protests, my phone forgave my inexplicable deviation from its carefully planned and well-crafted route and deigned to guide me the rest of the way to my destination.

And, yes, it even kept me on paved roads.

October 23, 2019 10:02 PM

October 18, 2019

Matthew Garrett: Letting Birds scooters fly free

(Note: These issues were disclosed to Bird, and they tell me that fixes have rolled out. I haven't independently verified)

Bird produce a range of rental scooters that are available in multiple markets. With the exception of the Bird Zero[1], all their scooters share a common control board described in FCC filings. The board contains three primary components - a Nordic NRF52 Bluetooth controller, an STM32 SoC and a Quectel EC21-V modem. The Bluetooth and modem are both attached to the STM32 over serial and have no direct control over the rest of the scooter. The STM32 is tied to the scooter's engine control unit and lights, and also receives input from the throttle (and, on some scooters, the brakes).

The pads labeled TP7-TP11 near the underside of the STM32 and the pads labeled TP1-TP5 near the underside of the NRF52 provide Serial Wire Debug, although confusingly the data and clock pins are the opposite way around between the STM and the NRF. Hooking this up via an STLink and using OpenOCD allows dumping of the firmware from both chips, which is where the fun begins. Running strings over the firmware from the STM32 revealed "Set mode to Free Drive Mode". Challenge accepted.

Working back from the code that printed that, it was clear that commands could be delivered to the STM from the Bluetooth controller. The Nordic NRF52 parts are an interesting design - like the STM, they have an ARM Cortex-M microcontroller core. Their firmware is split into two halves, one the low level Bluetooth code and the other application code. They provide an SDK for writing the application code, and working through Ghidra made it clear that the majority of the application firmware on this chip was just SDK code. That made it easier to find the actual functionality, which was just listening for writes to a specific BLE attribute and then hitting a switch statement depending on what was sent. Most of these commands just got passed over the wire to the STM, so it seemed simple enough to just send the "Free drive mode" command to the Bluetooth controller, have it pass that on to the STM and win. Obviously, though, things weren't so easy.

It turned out that passing most of the interesting commands on to the STM was conditional on a variable being set, and the code path that hit that variable had some impressively complicated looking code. Fortunately, I got lucky - the code referenced a bunch of data, and searching for some of the values in that data revealed that they were the AES S-box values. Enabling the full set of commands required you to send an encrypted command to the scooter, which would then decrypt it and verify that the cleartext contained a specific value. Implementing this would be straightforward as long as I knew the key.

Most AES keys are 128 bits, or 16 bytes. Digging through the code revealed 8 bytes worth of key fairly quickly, but the other 8 bytes were less obvious. I finally figured out that 4 more bytes were the value of another Bluetooth variable which could be simply read out by a client. The final 4 bytes were more confusing, because all the evidence made no sense. It looked like it came from passing the scooter serial number to atoi(), which converts an ASCII representation of a number to an integer. But this seemed wrong, because atoi() stops at the first non-numeric value and the scooter serial numbers all started with a letter[2]. It turned out that I was overthinking it and for the vast majority of scooters in the fleet, this section of the key was always "0".
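To make that concrete, here is a minimal sketch (in Python, using pycryptodome) of how the 16-byte key described above would be assembled and used. Every concrete value, the AES mode, the padding and the command framing are placeholders or assumptions for illustration; the post deliberately doesn't publish the real key material.

# Hypothetical sketch of assembling the 16-byte AES key described above.
# All concrete byte values are placeholders, and ECB plus the payload layout
# are assumptions, not Bird's actual scheme.
from Crypto.Cipher import AES          # pycryptodome
from Crypto.Util.Padding import pad

firmware_half = bytes(8)               # 8 bytes dug out of the STM32 firmware (placeholder)
ble_attr = bytes(4)                    # 4 bytes read from a client-readable BLE attribute (placeholder)

def atoi(s):
    """C-style atoi(): parse leading digits, stop at the first non-digit."""
    digits = ""
    for c in s:
        if not c.isdigit():
            break
        digits += c
    return int(digits) if digits else 0

serial = "AB1234567"                   # serials start with a letter, so atoi() yields 0
serial_part = atoi(serial).to_bytes(4, "little")   # 4-byte little-endian encoding is a guess

key = firmware_half + ble_attr + serial_part       # 16 bytes in total
assert len(key) == 16

# Encrypt a command whose cleartext contains the magic value the STM checks for.
cleartext = b"MAGIC:free-drive-mode"               # placeholder magic value
ciphertext = AES.new(key, AES.MODE_ECB).encrypt(pad(cleartext, AES.block_size))
print(ciphertext.hex())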

At that point I had everything I needed to write a simple app to unlock the scooters, and it worked! For about 2 minutes, at which point the network would notice that the scooter was unlocked when it should be locked and send a lock command to force disable the scooter again. Ah well.

So, what else could I do? The next thing I tried was just modifying some STM firmware and flashing it onto a board. It still booted, indicating that there was no sort of verified boot process. Remember what I mentioned about the throttle being hooked through the STM32's analogue to digital converters[3]? A bit of hacking later and I had a board that would appear to work normally, but about a minute after starting the ride would cut the throttle. Alternative options are left as an exercise for the reader.

Finally, there was the component I hadn't really looked at yet. The Quectel modem actually contains its own application processor that runs Linux, making it significantly more powerful than any of the chips actually running the scooter application[4]. The STM communicates with the modem over serial, sending it an AT command asking it to make an SSL connection to a remote endpoint. It then uses further AT commands to send data over this SSL connection, allowing it to talk to the internet without having any sort of IP stack. Figuring out just what was going over this connection was made slightly difficult by virtue of all the debug functionality having been ripped out of the STM's firmware, so in the end I took a more brute force approach - I identified the address of the function that sends data to the modem, hooked up OpenOCD to the SWD pins on the STM, ran OpenOCD's gdb stub, attached gdb, set a breakpoint for that function and then dumped the arguments being passed to that function. A couple of minutes later and I had a full transaction between the scooter and the remote.

The scooter authenticates against the remote endpoint by sending its serial number and IMEI. You need to send both, but the IMEI didn't seem to need to be associated with the serial number at all. New connections seemed to take precedence over existing connections, so it would be simple to just pretend to be every scooter and hijack all the connections, resulting in scooter unlock commands being sent to you rather than to the scooter or allowing someone to send fake GPS data and make it impossible for users to find scooters.

In summary: secrets that are stored on hardware that attackers can run arbitrary code on probably aren't secret; not having verified boot on safety-critical components isn't ideal; and devices should have meaningful cryptographic identity when authenticating against a remote endpoint.

Bird responded quickly to my reports, accepted my 90 day disclosure period and didn't threaten to sue me at any point in the process, so good work Bird.

(Hey scooter companies I will absolutely accept gifts of interesting hardware in return for a cursory security audit)

[1] And some very early M365 scooters
[2] The M365 scooters that Bird originally deployed did have numeric serial numbers, but they were 6 characters of type code followed by a / followed by the actual serial number - the number of type codes was very constrained and atoi() would terminate at the / so this was still not a large keyspace
[3] Interestingly, Lime made a different design choice here and plumb the controls directly through to the engine control unit without the application processor having any involvement
[4] Lime run their entire software stack on the modem's application processor, but because of [3] they don't have any realtime requirements so this is more straightforward


October 18, 2019 11:44 AM

October 15, 2019

Brendan Gregg: Two kernel mysteries and the most technical talk I've ever seen

If you start digging into Linux kernel internals, like function disassembly and profiling, you may run into two mysteries of kernel engineering, as I did:

1. What is this "__fentry__" stuff at the start of _every_ kernel function? Profiling? Why doesn't it massively slow down the Linux kernel?
2. How can Ftrace instrument _all_ kernel functions almost instantly, and with low overhead?

To show what I mean, here's a screenshot for (1) (I used ElfMaster's [kdress] to turn my vmlinuz into vmlinux):

# gdb ./vmlinux /proc/kcore
[...]
(gdb) disas schedule
Dump of assembler code for function schedule:
   0xffffffff819b6530 <+0>:	callq  0xffffffff81a01cd0 <__fentry__>
   0xffffffff819b6535 <+5>:	push   %rbp
[...]
That's at the start of _every_ kernel function on Linux. What the heck! And for (2), using my [perf-tools]:
# ~/Git/perf-tools/bin/funccount '*'
Tracing "*"... Ctrl-C to end.
^C
FUNC                              COUNT
aa_af_perm                            1
[...truncated...]
_raw_spin_lock                  1811959
__mod_node_page_state           1856254
_cond_resched                   1921320
rcu_all_qs                      1924218
__accumulate_pelt_segments      2883418
decay_load                      8684460

Ending tracing...
Despite this tracing every kernel function on my laptop, I didn't notice any slowdown. How?? At a Linux conference in Düsseldorf, 2014, I saw a talk by kernel maintainer [Steven Rostedt] that answered both questions. It was and is the most technical talk I've ever seen. Unfortunately, it wasn't videoed. Fortunately, he just gave an updated version at Kernel Recipes in Paris, 2019, titled "ftrace: Where modifying a running kernel all started". You can now learn the answers to these mysteries and answer many more questions about kernel internals. Steven covers things that aren't documented elsewhere. The video is on [youtube]:

The slides are on [slideshare]:

Thanks to Steven for updating and delivering the talk, and Kernel Recipes organizers for sharing this and other talks on video. I'll need to come back to a future Kernel Recipes! I'd also recommend watching:

- BPF at Facebook by Alexei Starovoitov (youtube)
- The ubiquity but also the necessity of eBPF as a technology to keep the kernel relevant by David S. Miller

[kdress]: https://github.com/elfmaster/kdress
[Steven Rostedt]: https://twitter.com/srostedt
[youtube]: https://www.youtube.com/watch?v=93uE_kWWQjs
[slideshare]: https://www.slideshare.net/ennael/kernel-recipes-2019-ftrace-where-modifying-a-running-kernel-all-started
[perf-tools]: https://github.com/brendangregg/perf-tools

October 15, 2019 07:00 AM

October 14, 2019

Paul E. Mc Kenney: The Old Man and His Smartphone, Episode III

I still haven't installed many apps, but I have already come across lookalike apps that offer interesting services that are less pertinent to my current mode of existence than the apps I was actually looking for.  So perhaps app spam will become as much an issue as plain old email spam.

I also took my first-ever selfie, thus learning that using the camera on the same side of the smartphone as the screen gets you a mirror-imaged photograph.  I left the selfie that way on my obligatory Facebook post for purposes of historical accuracy and as a cautionary tale, but it turns out to be quite easy to flip photos (both horizontally and vertically) from the Android Gallery app. It is also possible to change brightness and contrast, add captions, add simple graphics, scrawl over the photo in various colors, adjust perspective, and so on.  An application popped up and offered much much much more (QR code scanning! OCR! Other stuff I didn't bother reading!), but only if I would agree to the license.  Which I might do some time in the future.

I have not yet worked out how I will carry the smartphone long term.  For the moment, it rests in the classic nerd position in my shirt pocket (truth in advertising!!!).

My wife was not all that impressed with the smartphone, which is not too surprising given that grade-school students commonly have them.  She did note that if someone broke into it, they could be taking pictures of her without her knowledge.  I quickly overcame that potential threat by turning the smartphone the other side up, so that any unauthorized photography would be confined to the inside of my shirt pocket.  :-)

October 14, 2019 02:46 PM

October 12, 2019

Paul E. Mc Kenney: The Old Man and His Smartphone, Episode II

At some point in the setup process, it was necessary to disable wifi.  And I of course forgot to re-enable it.  A number of apps insisted on downloading new versions.  Eventually I realized my mistake, and re-enabled wifi, but am still wondering just how severe a case of sticker shock I am in for at the end of the month.

Except that when I re-enabled wifi, I did so in my hotel room.  During the last day of my stay at that hotel.  So I just now re-enabled it again on the shuttle.  I can clearly see that I have many re-enablings of wifi in my future, apparently one per hotspot that I visit.  :-)

Some refinement of notifications is still required.  Some applications notify me, but upon opening the corresponding app, there is no indication of any reason for notification.  I have summarily disabled notifications for some of these, and will perhaps learn the hard way why this was a bad idea.  Another issue is that some applications have multiple devices on which they can notify me.  It would be really nice if they stuck to the device I was actively using at the time rather than hitting them all, but perhaps that is too much to ask for in this hyperconnected world.

My new smartphone's virtual keyboard represents a definite improvement over the multipress nature of text messaging on my old flip phone, but it does not hold a candle to a full-sized keyboard.  However, even this old man must confess that it is much faster to respond to the smartphone than to the laptop if both start in their respective sleeping states.  There is probably an optimal strategy in there somewhere!  :-)

October 12, 2019 12:32 AM

October 11, 2019

Michael Kerrisk (manpages): man-pages-5.03 is released

I've released man-pages-5.03. The release tarball is available on kernel.org. The browsable online pages can be found on man7.org. The Git repository for man-pages is available on kernel.org.

This release resulted from patches, bug reports, reviews, and comments from 45 contributors. The release includes over 200 commits that change around 80 pages.

The most notable of the changes in man-pages-5.03 are the following:

October 11, 2019 08:59 PM

October 10, 2019

Paul E. Mc Kenney: The Old Man and His Smartphone, Episode I

I have long known that I would sooner or later be getting a smartphone, and this past Tuesday it finally happened.  So, yes, at long last I am GPS-enabled, and much else besides, a little of which I actually know how to use.

It took quite a bit of assistance to get things all wired together, so a big "Thank You" to my fellow bootcamp members!  I quickly learned that simply telling applications that they can't access anything is self-defeating, though one particular application reacted by simply not letting go of the screen.  Thankfully someone was on hand to tell me about the button in the lower right, and how to terminate an offending application by sweeping up on it. And I then quickly learned that pressing an app button spawns a new instance of that app, whether or not an instance was already running.  I quickly terminated a surprising number of duplicate app instances that I had spawned over the prior day or so.

Someone also took pity on me and showed me how to silence alerts from attention-seeking apps, with one notable instance being an app that liked to let me know when spam arrived for each instance of spam.

But having a portable GPS receiver has proven handy a couple of times already, so I can already see how these things could become quite addictive.  Yes, I resisted for a great many years, but my smartphone-free days are now officially over.  :-)

October 10, 2019 08:47 PM

October 07, 2019

David Sterba: Selecting the next checksum for btrfs

The only checksum algorithm currently implemented in btrfs is crc32c, with a 32-bit digest. It has served well over the years but has some weaknesses, so we’re looking for a replacement and also at enhancing the use cases based on possibly better hash algorithms.

The advantage of crc32c is its simplicity of implementation, the various optimized versions that exist, and hardware CPU instruction-level support. The error detection strength is not great though, and collisions are easy to generate. Note that crc32c has not been used in cryptographically sensitive contexts (eg. deduplication).

Side note: the collision generation weakness is used in the filesystem image dump tool to preserve hashes of directory entries while obscuring the real names.

Future use cases

The natural use case is still to provide checksumming for data and metadata blocks. With strong hashes, the same checksums can be used to aid deduplication or verification (HMAC instead of a plain hash). The requirements differ (namely speed vs. strength), so one hash algorithm cannot possibly satisfy all of them. Or can it?
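As a quick illustration of the difference between the two uses (not btrfs code, just the general idea): a plain hash detects accidental corruption, while keying the same primitive, or wrapping it in HMAC, also authenticates the block against deliberate modification. A minimal Python sketch using candidate algorithms available in the standard library:

# Illustration only -- not btrfs code.  Plain hashing vs keyed verification
# of a 4KiB block, using BLAKE2b-256 and SHA256 since both are candidates.
import hashlib
import hmac
import os

block = os.urandom(4096)                 # stand-in for a 4KiB data block

# Plain checksum: detects accidental corruption, fits the 32-byte csum slot.
csum = hashlib.blake2b(block, digest_size=32).digest()

# Keyed verification: an attacker without the key cannot forge a matching
# digest.  blake2b supports keying natively; SHA256 is wrapped in HMAC.
key = os.urandom(32)
keyed_csum = hashlib.blake2b(block, digest_size=32, key=key).digest()
hmac_csum = hmac.new(key, block, hashlib.sha256).digest()

print(len(csum), len(keyed_csum), len(hmac_csum))  # all 32 bytes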

The criteria

The frequency of checksumming is high, every data block needs that, every metadata block needs that.

During the discussions in the community, several candidate hash algorithms were proposed, and as it goes, users want different things, but we developers want to keep the number of features sane or at least maintainable. I think and hope the solution will address that.

The first hash type is to replace crc32c, with a focus on speed and not necessarily strength (ie. collision resistance).

The second type’s focus is strength, in the cryptographic sense.

In addition to the technical aspects of the hashes, there are some requirements that would allow free distribution and use of the implementations:

Constraints posed by btrfs implementation:

For the enhanced use case of data verification (using HMAC), there’s a requirement that might not interest everybody but still covers a lot of deployments. And this is standard compliance and certification:

And maybe last but not least, use something that is in wide use already, proven by time.

Speed

Implementation of all algorithms should be performant on common hardware, ie. 64bit architectures and hopefully not terrible on 32bit architectures or older and weaker hardware. By hardware I mean the CPU, not specialized hardware cards.

The crypto API provided by the Linux kernel can automatically select the best implementation of a given algorithm, eg. an optimized implementation in assembly, on multiple architectures.

Strength

For the fast hash, collisions can still be generated, but hopefully not as easily as for crc32c. For the strong hash it’s obvious that finding a collision would be the jackpot.

In case of the fast hash the quality can be evaluated using the SMHasher suite.

The contenders

The following list of hashes has been mentioned and considered or evaluated:

xxhash

criteria: license OK, implementation OK, digest size OK, not standardized but in wide use

The hash is quite fast as it tries to exploit the CPU features that allow instruction parallelism. The SMHasher score is 10, which is great. The Linux kernel implementation landed in 5.3.

Candidate for fast hash.

XXH3

Unfortunately the hash is not yet finalized and cannot be in the final round, but for evaluation of speed it was considered. The hash comes from the same author as xxhash.

SipHash

This hash is made for hash tables and hashing short strings but we want 4KiB or larger blocks. Not a candidate.

CRC64

Similar to crc32, it is among the contenders only because it was easy to evaluate, but it is otherwise not in the final round. It was shown to be slow in the microbenchmark.

SHA256

criteria: license OK, implementation OK, digest size OK, standardized in FIPS

The SHA family of hashes is well known, has decent support in CPU and is standardized. Specifically, SHA256 is the strongest that still fits into the available 32 bytes.

Candidate for strong hash.

SHA3

criteria: license OK, implementation OK, digest size OK, standardized in FIPS

It won the 2012 hash contest, so we can’t leave it out. From the practical perspective of checksumming, the speed is bad even for the strong hash type. One of the criteria stated above is performance without special hardware, unlike what was preferred during the SHA3 contest.

Candidate for strong hash.

BLAKE2

criteria: license OK, implementation OK, digest size OK, not standardized

From the BLAKE family that participated in the 2012 SHA contest, the ‘2’ provides a trade-off between speed and strength. More and more projects are adopting it.

Candidate for strong hash.

Final round

I don’t blame you if you skipped all the previous paragraphs. The (re)search for the next hash was quite informative and fun, so it would be a shame not to share it, and also to document the selection process for some transparency. This is a committee-driven process though.

Selecting two hashes for the strong type is a compromise: we get a fast-but-strong hash and also something that’s standardized.

The specific version of BLAKE2 is going to be the ‘BLAKE2b-256’ variant, ie. optimized for 64bit (2b) but with 256bit digest.

Microbenchmark

A microbenchmark gives more details about performance of the hashes:

Block: 4KiB (4096 bytes), Iterations: 100000

Hash                Total cycles    Cycles/iteration
NULL-NOP                56888626                 568
NULL-MEMCPY             60644484                 606
CRC64                 3240483902               32404
CRC32C                 174338871                1743
CRC32C-SW              174388920                1743
XXHASH                 251802871                2518
XXH3                   193287384                1932
BLAKE2b-256-arch      1798517566               17985
BLAKE2b-256           2358400785               23584
BLAKE2s-arch          2593112451               25931
BLAKE2s               3451879891               34518
SHA256               10674261873              106742
SHA3-256             29152193318              291521

Machine: Intel(R) Xeon(R) CPU E5-1620 v3 @ 3.50GHz (AVX2)

Hash implementations are the reference ones in C.

There aren’t optimized versions for all hashes, so for a fair comparison the unoptimized reference implementations should be used. As BLAKE2 is my personal favourite, I added the other variants and optimized versions to observe the relative improvements.

Evaluation

CRC64 was added by mere curiosity how does it compare to the rest. Well, curiosity satisfied.

SHA3 is indeed slow on a CPU.
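If you want a rough feel for the relative ordering on your own machine, Python's hashlib is enough for a ballpark comparison of the cryptographic candidates. This is not comparable to the reference-C cycle counts above (and says nothing about crc32c or xxhash as used in the kernel); it is only a quick sanity check of the ordering:

# Rough ballpark timing of the strong-hash candidates over 4KiB blocks.
import hashlib
import time

BLOCK = bytes(4096)
ITERS = 100000

def bench(name, fn):
    start = time.perf_counter()
    for _ in range(ITERS):
        fn(BLOCK).digest()
    print(f"{name:12s} {time.perf_counter() - start:.3f}s for {ITERS} x 4KiB blocks")

bench("sha256", hashlib.sha256)
bench("sha3-256", hashlib.sha3_256)
bench("blake2b-256", lambda b: hashlib.blake2b(b, digest_size=32))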

What isn’t here

There are non-cryptographic hashes like CityHash, FarmHash, Murmur3 and more that were found unsuitable or that did not meet some of the basic criteria. Others, like FNV or the Fletcher checksum used in ZFS, are of comparable strength to crc32c, so they would not be progress.

Merging

All the preparatory work in btrfs landed in version 5.3. Hardcoded assumptions of crc32c were abstracted away, the Linux crypto API was wired in, and additional cleanups and refactoring were done. With that in place, adding a new hash is a matter of a few lines of code adding the specifier string for the crypto API.

The work on btrfs-progs is following the same path.

Right now, version 5.4 is in development and new features can’t be added, so the target for the new hashes is 5.5. The BLAKE2 algorithm family still needs to be submitted and merged; hopefully it will make it into 5.5 as well.

One of my merge process requirements was to do a call for public testing, so we’re going to do that once all the kernel and progs code is ready for testing. Stay tuned.

October 07, 2019 10:00 PM

October 05, 2019

James Bottomley: Why Ethical Open Source Really Isn’t

A lot of virtual ink has been expended debating the practicalities of the new push to adopt so called ethical open source licences. The two principal arguments are that it’s not legally enforceable and that it’s against the Open Source Definition. Neither of these seems to be hugely controversial and the proponents of ethical licences even acknowledge the latter by starting a push to change the OSD itself. I’m not going to rehash these points but instead I’m going to examine the effects injecting this form of ethics would have on Open Source Communities and society in general. As you can see from the title I already have an opinion but I hope to explain in a reasoned way how that came about.

Ethics is Absolute, Ethical Positions are Mostly Relative

Ethics itself is the actual process by which philosophical questions of human morality are resolved. The job of Ethics is to give moral weight to consequences in terms of good and evil (or ethical and unethical). However, ethics also recognizes that actions have indivisible compound consequences of which often some would be classified as unethical and some as ethical. There are actually very few actions where all compound consequences are wholly Ethical (or Unethical). Thus the absolute position that all compound consequences must be ethical rarely exists in practice and what people actually mean when they say an action is “ethical” is that in their judgment the unethical consequences are outweighed by the ethical ones. Where and how you draw this line of ethical being outweighed by unethical is inherently political and can vary from person to person.

To give a concrete example tied to the UN Declaration of Human Rights (since that seems to be being held up as the pinnacle of unbiased ethics): The right to bear arms is enshrined in the US constitution as Amendment 2 and thus is protected under the UNDHR Article 8. However, the UNDHR also recognizes under Article 3 the right to life, liberty and security of person, and it’s arguable that flooding the country with guns, precipitating mass shootings, violates this article. Thus restricting guns in the US would violate 8 and support 3, and not restricting them does the opposite. Which is more important is essentially a political decision, and where you fall depends largely on whether you see yourself as Republican or Democrat. The point being this is a classical ethical conundrum where there is no absolute ethical position because it depends on the relative weights you give to the ethical and unethical consequences. The way out of this is negotiation between both sides to achieve a position that each side may not support wholeheartedly but which each side can live with.

The above example shows the problem of ethical open source because there are so few wholly ethical actions as to make conditioning a licence on this alone pointlessly ineffective and to condition it on actions with mixed ethical consequences effectively injects politics because the line has to be drawn somewhere, which means that open source under this licence becomes a politicized process.

The Relativity of Protest

Once you’ve made the political determination that a certain mixed-consequence thing is unethical, there’s still the question of what you do about it. For the majority, expressing their preference through the ballot box every few years is sufficient. For others, the gravity is so great that some form of protest is required. However, what forms of protest you choose to adhere to and what you choose not to is also an ethically relative choice. For instance, a lot of the people pushing ethical open source would support the #NoTechForICE political movement. However, if you look at locations on twitter, most of them are US based and thus pay taxes to the US government that supports and funds the allegedly unethical behaviour of ICE. Obviously they could protest this by withdrawing their support via taxation, but they choose not to because the personal consequences would be too devastating. Instead they push ethical licences and present this as a simple binary choice when it isn’t at all: the decision about whether to force a political position via a licence is one which may have fewer personally devastating consequences, but which people who weigh the ethical consequences are still entitled to think might be devastating for open source itself and thus an incorrect protest choice.

Community, Discrimination and Society

One of the great advances Open Source Communities have made over the past few years is the attempts to eliminate all forms of discrimination either by the introduction of codes of conduct or via other means. What this is doing is making Open Source more inclusive even as society at large becomes more polarized. In the early days of open source, we realized that simple forms of inclusion, talking face to face, had huge advantages in social terms (the face on the end of the email) and that has been continued into modern times and enhanced with the idea that conferences should be welcoming to all people and promote unbiased discussion in an atmosphere of safety. If Society itself is ever to overcome the current political polarization it will have to begin with both sides talking to each other presumably in one of the few remaining unpolarized venues for such discussion and thus keeping Open Source Communities one of these unpolarized venues is a huge societal good. That means keeping open source unpoliticized and thus free from discrimination against people, gender, sexual orientation, political belief or field of endeavour; the very things our codes of conduct mostly say anyway.

It is also somewhat ironic that the very people who claim to be champions against discrimination in open source now find it necessary to introduce discrimination to further their own supposedly ethical ends.

Conclusion

I hope I’ve demonstrated that ethical open source is really nothing more than co-opting open source as a platform for protest, and as such will lead to the politicization of open source and its allied communities, causing huge societal harm by removing more of our much needed unpolarized venues for discussion. It is my ethical judgement that this harm outweighs the benefits of using open source as a platform for protest and is thus ethically wrong. With regard to the attempts to rewrite the OSD to be more reflective of modern society, I contend that instead of increasing our ability to discriminate by removing the fields of endeavour restriction, we should instead be tightening the anti-discrimination clauses by naming more things that shouldn’t be discriminated against, which would make Open Source and the communities which are created by it more welcoming to all manner of contributions and keep them as neutral havens where people of different beliefs can nevertheless observe first hand the utility of mutual collaboration, possibly even learning to bridge the political, cultural and economic divides as a consequence.

October 05, 2019 10:19 PM

October 04, 2019

James Morris: Linux Security Summit North America 2019: Videos and Slides

LSS-NA for 2019 was held in August in San Diego.  Slides are available at the Schedule, and videos of the talks may now be found in this playlist.

LWN covered the following presentations:

The new 3-day format (as previously discussed) worked well, and we’re expecting to continue this next year for LSS-NA.

Details on the 2020 event will be announced soon!

Announcements may be found on the event twitter account @LinuxSecSummit, on the linux-security-module mailing list, and via this very blog.

October 04, 2019 08:09 PM

Matthew Garrett: Investigating the security of Lime scooters

(Note: to be clear, this vulnerability does not exist in the current version of the software on these scooters. Also, this is not the topic of my Kawaiicon talk.)

I've been looking at the security of the Lime escooters. These caught my attention because:
(1) There's a whole bunch of them outside my building, and
(2) I can see them via Bluetooth from my sofa
which, given that I'm extremely lazy, made them more attractive targets than something that would actually require me to leave my home. I did some digging. Limes run Linux and have a single running app that's responsible for scooter management. They have an internal debug port that exposes USB and which, until this happened, ran adb (as root!) over this USB. As a result, there's a fair amount of information available in various places, which made it easier to start figuring out how they work.

The obvious attack surface is Bluetooth (Limes have wifi, but only appear to use it to upload lists of nearby wifi networks, presumably for geolocation if they can't get a GPS fix). Each Lime broadcasts its name as Lime-12345678 where 12345678 is 8 digits of hex. They implement Bluetooth Low Energy and expose a custom service with various attributes. One of these attributes (0x35 on at least some of them) sends Bluetooth traffic to the application processor, which then parses it. This is where things get a little more interesting. The app has a core event loop that can take commands from multiple sources and then makes a decision about which component to dispatch them to. Each command is of the following form:

AT+type,password,time,sequence,data$

where type is one of either ATH, QRY, CMD or DBG. The password is a TOTP derived from the IMEI of the scooter, the time is simply the current date and time of day, the sequence is a monotonically increasing counter and the data is a blob of JSON. The command is terminated with a $ sign. The code is fairly agnostic about where the command came from, which means that you can send the same commands over Bluetooth as you can over the cellular network that the Limes are connected to. Since locking and unlocking is triggered by one of these commands being sent over the network, it ought to be possible to do the same by pushing a command over Bluetooth.
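For concreteness, a command in that format could be assembled along these lines. This is a sketch only: the TOTP derivation from the IMEI, the exact time and sequence formats, and the JSON schema are not spelled out in the post, so the helpers below are stand-ins rather than Lime's real implementation.

# Sketch of the "AT+type,password,time,sequence,data$" framing described above.
import json
import time

def totp_from_imei(imei):
    """Placeholder for the IMEI-derived TOTP; the real derivation is not public."""
    return "000000"

def build_command(cmd_type, imei, sequence, data):
    assert cmd_type in ("ATH", "QRY", "CMD", "DBG")
    password = totp_from_imei(imei)
    now = time.strftime("%Y-%m-%d %H:%M:%S")      # time format is a guess
    payload = json.dumps(data, separators=(",", ":"))
    return "AT+{},{},{},{},{}$".format(cmd_type, password, now, sequence, payload).encode()

print(build_command("CMD", "123456789012345", 1, {"unlock": True}))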

Unfortunately for nefarious individuals, all commands sent over Bluetooth are ignored until an authentication step is performed. The code I looked at had two ways of performing authentication - you could send an authentication token that was derived from the scooter's IMEI and the current time and some other stuff, or you could send a token that was just an HMAC of the IMEI and a static secret. Doing the latter was more appealing, both because it's simpler and because doing so flipped the scooter into manufacturing mode, at which point all other command validation was also disabled (bye bye having to generate a TOTP). But how do we get the IMEI? There are actually two approaches:

1) Read it off the sticker that's on the side of the scooter (obvious, uninteresting)
2) Take advantage of how the scooter's Bluetooth name is generated

Remember the 8 digits of hex I mentioned earlier? They're generated by taking the IMEI, encrypting it using DES and a static key (0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77, 0x88), discarding the first 4 bytes of the output and turning the last 4 bytes into 8 digits of hex. Since we're discarding information, there's no way to immediately reverse the process - but IMEIs for a given manufacturer are all allocated from the same range, so we can just take the entire possible IMEI space for the modem chipset Lime use, encrypt all of them and end up with a mapping of name to IMEI (it turns out this doesn't guarantee that the mapping is unique - for around 0.01%, the same name maps to two different IMEIs). So we now have enough information to generate an authentication token that we can send over Bluetooth, which disables all further authentication and enables us to send further commands to disconnect the scooter from the network (so we can't be tracked) and then unlock and enable the scooter.

(Note: these are actual crimes)
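Translating the name-derivation recipe above into code, the precomputed name-to-IMEI table might look roughly like the sketch below (again Python with pycryptodome). The static DES key is the one quoted above, but the cipher mode, padding and IMEI formatting are assumptions, so the exact hex suffixes this produces may not match real scooters; it only illustrates the brute-force mapping idea.

# Sketch of the Bluetooth-name derivation and name -> IMEI mapping.
from Crypto.Cipher import DES          # pycryptodome

STATIC_KEY = bytes([0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77, 0x88])

def name_suffix(imei):
    des = DES.new(STATIC_KEY, DES.MODE_ECB)       # ECB is a guess
    data = imei.encode().ljust(16, b"\x00")       # pad the 15-digit IMEI to a block multiple (guess)
    ct = des.encrypt(data)
    return ct[-4:].hex()                          # discard the leading bytes, keep 4 as 8 hex digits

def build_mapping(imei_prefix, count):
    """Precompute name -> IMEI over a slice of the manufacturer's IMEI range."""
    mapping = {}
    width = 15 - len(imei_prefix)
    for i in range(count):
        imei = imei_prefix + str(i).zfill(width)
        mapping.setdefault(name_suffix(imei), []).append(imei)
    return mapping

table = build_mapping("86012304", 1000)           # the prefix is a placeholder TAC
print(len(table), "distinct names for 1000 candidate IMEIs")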

This all seemed very exciting, but then a shock twist occurred - earlier this year, Lime updated their authentication method and now there's actual asymmetric cryptography involved and you'd need to engage in rather more actual crimes to obtain the key material necessary to authenticate over Bluetooth, and all of this research becomes much less interesting other than as an example of how other companies probably shouldn't do it.

In any case, congratulations to Lime on actually implementing security!


October 04, 2019 06:10 AM

October 01, 2019

James Bottomley: Retro Engineering: Updating a Nexus One for the modern world

A few of you who’ve met me know that my current Android phone is an ancient Nexus One. I like it partly because of the small form factor, partly because I’ve re-engineered pieces of the CyanogenMod OS it runs to suit me and can’t be bothered to keep up-porting to newer versions and partly because it annoys a lot of people in the Open Source Community who believe everyone should always be using the latest greatest everything. Actually, the last reason is why, although the Nexus One I currently run is the original google gave me way back in 2010, various people have donated a stack of them to me just in case I might need a replacement.

However, the principal problem with running one of these ancient beasts is that they cannot, due to various flash sizing problems, run anything later than Android 2.3.7 (or CyanogenMod 7.1.0) and since the OpenSSL in that is ancient, it won’t run any TLS protocol beyond 1.0, so with the rush to move to encryption and secure the web, more and more websites are disallowing the old (and, let’s admit it, buggy) TLS 1.0 protocol, meaning more and more of the web is steadily going dark to my mobile browser. It’s reached the point where simply to get a boarding card, I have to download the web page from my desktop and transfer it manually to the phone. This started as an annoyance, but it’s becoming a major headache as the last of the websites I still use for mobile service go dark to me. So the task I set myself is to fix this by adding the newer protocols to my phone … I’m an open source developer, I have the source code, it should be easy, right …?

First Problem, the source code and Build Environment

Ten years ago, I did build CyanogenMod from scratch and install it on my phone, so what could be so hard about reviving the build environment today? Firstly there was finding it, but github still has a copy and the AOSP project it links to still keeps old versions, so simply doing a

curl https://dl-ssl.google.com/dl/googlesource/git-repo/repo > ~/bin/repo
repo init -u git://github.com/CyanogenMod/android.git -b gingerbread --repo-url=git://github.com/android/tools_repo.git
repo sync

Actually worked (of course it took days of googling to remember these basic commands). However the “brunch passion” command to actually build it crashed and burned somewhat spectacularly. Apparently the build environment has moved on in the last decade.

The first problem is that most of the prebuilt x86 binaries are 32 bit. This means you have to build the host for 32 bit, and that involves quite a quest on an x86_64 system to make sure you have all the 32 bit build precursors. The next problem is that java 1.6.0 is required, but fortunately openSUSE build service still has it. Finally, the big problem is a load of c++ compile issues which turn out to be due to the fact that the c++ standard has moved on over the years and gcc-7 tries the latest one. Fortunately this can be fixed with

export HOST_GLOBAL_CPPFLAGS=-std=gnu++98

And the build works. If you’re only building the OpenSSL support infrastructure, you don’t need to build the entire thing, but figuring out the pieces you do need is hard, so building everything is a good way to finesse the dependency problem.

Figuring Out how to Upgrade OpenSSL

Unfortunately, this is Android, so you can’t simply drop a new OpenSSL library into the system and have it work. Firstly, the version of OpenSSL that Android builds with (at least for 2.3.7) is heavily modified, so even an android build of vanilla OpenSSL won’t work because it doesn’t have the necessary patches. Secondly, OpenSSL is very prone to ABI breaks, so if you start with 0.9.8, for instance, you’re never going to be able to support TLS 1.2. Fortunately, Android 2.3.7 has OpenSSL 1.0.0a so it is in the 1.0.0 ABI and versions of openssl for that ABI do support later versions of TLS (but only in version 1.0.1 and beyond). The solution actually is to look at external/openssl and simply update it to the latest version the project has (for CyanogenMod this is cm-10.1.2 which is openssl 1.0.1c … still rather ancient but at least supporting TLS 1.2).

cd external/openssl
git checkout cm-10.1.2
mm

And it builds and even works when installed on the phone … great. Except that nothing can use the later ciphers because the java provider (JSSE) also needs updating to support them. Updating the JSSE provider is a bit of a pain but you can do it in two patches:

Once this is done and installed you can browse most websites. There are still some exceptions because of websites that have caught the “can’t use sha1 in any form” bug, but these are comparatively minor. The two patches apply to libcore and once you have them, you can rebuild and install it.

Safely Installing the updated files

Installing new files in android can be a bit of a pain. The ideal way would be to build the entire rom and reflash, but that’s a huge pain, so the simple way is to open the /system partition and dump the files in. Opening the system partition is easy, just do

adb shell
# mount -o remount,rw /system

Uploading the required files is more difficult primarily because you want to make sure you can recover if there’s a mistake. I do this by transferring the files to <file>.new:

adb push out/target/product/passion/system/lib/libcrypto.so /system/lib/libcrypto.so.new
adb push out/target/product/passion/system/lib/libssl.so /system/lib/libssl.so.new
adb push out/target/product/passion/system/framework/core.jar /system/framework/core.jar.new

Now move everything into place and reboot

adb shell
# mv /system/lib/libcrypto.so /system/lib/libcrypto.so.old && mv /system/lib/libcrypto.so.new /system/lib/libcrypto.so
# mv /system/lib/libssl.so /system/lib/libssl.so.old && mv /system/lib/libssl.so.new /system/lib/libssl.so
# mv /system/framework/core.jar /system/framework/core.jar.old && mv /system/framework/core.jar.new /system/framework/core.jar

If the reboot fails, use adb to recover

adb shell
# mount /system
# mv /system/lib/libcrypto.so.old /system/lib/libcrypto.so
...

Conclusions

That’s it. Following the steps above, my Nexus One can now browse useful internet sites like my Airline and the New York times. The only website I’m still having trouble with is the Wall Street Journal, because they disabled all ciphers depending on sha1.

October 01, 2019 11:17 PM

Paul E. Mc Kenney: Announcement: Change of Venue

This week of September 30th marks my last week at IBM, and I couldn't be more excited to be moving on to the next phase of my career by joining a great team at Facebook! Yes, yes, I am bringing with me my maintainership of both Linux-kernel RCU and the Linux-kernel memory model, my editing of "Is Parallel Programming Hard, And, If So, What Can You Do About It?", and other similar items, just in case you were wondering. ;-)

Of course, it is only appropriate for me to express my gratitude and appreciation for the many wonderful colleagues at IBM, before that at Sequent, and more recently at Red Hat. Together with others in the various communities, we in our own modest way have changed the world several times over. It was a great honor and privilege to have worked with you, and I expect and hope that our paths will cross again. For those in the Linux-kernel and C/C++ standards communities, our paths will continue to run quite closely, and I look forward to continued productive and enjoyable collaborations.

I also have every reason to believe that IBM will continue to be a valuable and essential part of the computing industry as it continues through its second century, especially given the recent addition of Red Hat to the IBM family.

But all that aside, I am eagerly looking forward to starting Facebook bootcamp next week. Which is said to involve one of those newfangled smartphones. ;-)

October 01, 2019 10:24 PM

September 27, 2019

Matthew Garrett: Do we need to rethink what free software is?

Licensing has always been a fundamental tool in achieving free software's goals, with copyleft licenses deliberately taking advantage of copyright to ensure that all further recipients of software are in a position to exercise free software's four essential freedoms. Recently we've seen people raising two very different concerns around existing licenses and proposing new types of license as remedies, and while both are (at present) incompatible with our existing concepts of what free software is, they both raise genuine issues that the community should seriously consider.

The first is the rise in licenses that attempt to restrict business models based around providing software as a service. If users can pay Amazon to provide a hosted version of a piece of software, there's little incentive for them to pay the authors of that software. This has led to various projects adopting license terms such as the Commons Clause that effectively make it nonviable to provide such a service, forcing providers to pay for a commercial use license instead.

In general the entities pushing for these licenses are VC backed companies[1] who are themselves benefiting from free software written by volunteers that they give nothing back to, so I have very little sympathy. But it does raise a larger issue - how do we ensure that production of free software isn't just a mechanism for the transformation of unpaid labour into corporate profit? I'm fortunate enough to be paid to write free software, but many projects of immense infrastructural importance are simultaneously fundamental to multiple business models and also chronically underfunded. In an era where people are becoming increasingly vocal about wealth and power disparity, this obvious unfairness will result in people attempting to find mechanisms to impose some degree of balance - and given the degree to which copyleft licenses prevented certain abuses of the commons, it's likely that people will attempt to do so using licenses.

At the same time, people are spending more time considering some of the other ethical outcomes of free software. Copyleft ensures that you can share your code with your neighbour without your neighbour being able to deny the same freedom to others, but it does nothing to prevent your neighbour using your code to deny other fundamental, non-software, freedoms. As governments make more and more use of technology to perform acts of mass surveillance, detention, and even genocide, software authors may feel legitimately appalled at the idea that they are helping enable this by allowing their software to be used for any purpose. The JSON license includes a requirement that "The Software shall be used for Good, not Evil", but the lack of any meaningful clarity around what "Good" and "Evil" actually mean makes it hard to determine whether it achieved its aims.

The definition of free software includes the assertion that it must be possible to use the software for any purpose. But if it is possible to use software in such a way that others lose their freedom to exercise those rights, is this really the standard we should be holding? Again, it's unsurprising that people will attempt to solve this problem through licensing, even if in doing so they no longer meet the current definition of free software.

I don't have solutions for these problems, and I don't know for sure that it's possible to solve them without causing more harm than good in the process. But in the absence of these issues being discussed within the free software community, we risk free software being splintered - on one side, with companies imposing increasingly draconian licensing terms in an attempt to prop up their business models, and on the other side, with people deciding that protecting people's freedom to life, liberty and the pursuit of happiness is more important than protecting their freedom to use software to deny those freedoms to others.

As stewards of the free software definition, the Free Software Foundation should be taking the lead in ensuring that these issues are discussed. The priority of the board right now should be to restructure itself to ensure that it can legitimately claim to represent the community and play the leadership role it's been failing to in recent years, otherwise the opportunity will be lost and much of the activist energy that underpins free software will be spent elsewhere.

If free software is going to maintain relevance, it needs to continue to explain how it interacts with contemporary social issues. If any organisation is going to claim to lead the community, it needs to be doing that.

[1] Plus one VC firm itself - Bain Capital, an investment firm notorious for investing in companies, extracting as much value as possible and then allowing the companies to go bankrupt


September 27, 2019 05:47 PM

September 24, 2019

Pete Zaitcev: Github

A job at LinkedIn in my area includes the following instruction statement:

Don't apply if you can't really write code and don't have a github profile. This is a job for an expert level coder.

I remember the simpler times when LWN authors fretted about the SourceForge monopoly capturing all FLOSS projects.

P.S. For the record, I do have a profile at Github. It is required in order to contribute to RDO, because it is the only way to login in their Gerrit and submit patches for review. Ironically, Fedora offers a single sign-on with FAS and RDO is a Red Hat sponsored project, but nope — it's easier to force contributors into Github.

September 24, 2019 08:06 PM

September 14, 2019

Matthew Garrett: It's time to talk about post-RMS Free Software

Richard Stallman has once again managed to demonstrate incredible insensitivity[1]. There's an argument that in a pure technical universe this is irrelevant and we should instead only consider what he does in free software[2], but free software isn't a purely technical topic - the GNU Manifesto is nakedly political, and while free software may result in better technical outcomes it is fundamentally focused on individual freedom and will compromise on technical excellence if otherwise the result would be any compromise on those freedoms. And in a political movement, there is no way that we can ignore the behaviour and beliefs of that movement's leader. Stallman is driving away our natural allies. It's inappropriate for him to continue as the figurehead for free software.

But I'm not calling for Stallman to be replaced. If the history of social movements has taught us anything, it's that tying a movement to a single individual is a recipe for disaster. The FSF needs a president, but there's no need for that person to be a leader - instead, we need to foster an environment where any member of the community can feel empowered to speak up about the importance of free software. A decentralised movement about returning freedoms to individuals can't also be about elevating a single individual to near-magical status. Heroes will always end up letting us down. We fix that by removing the need for heroes in the first place, not attempting to find increasingly perfect heroes.

Stallman was never going to save us. We need to take responsibility for saving ourselves. Let's talk about how we do that.

[1] There will doubtless be people who will leap to his defense with the assertion that he's neurodivergent and all of these cases are consequences of that.

(A) I am unaware of a formal diagnosis of that, and I am unqualified to make one myself. I suspect that basically everyone making that argument is similarly unqualified.
(B) I've spent a lot of time working with him to help him understand why various positions he holds are harmful. I've reached the conclusion that it's not that he's unable to understand, he's just unwilling to change his mind.

[2] This argument is, obviously, bullshit


September 14, 2019 11:57 AM

September 10, 2019

Davidlohr Bueso: Linux v5.2: Performance Goodies

locking/rwsem: optimize trylocking for the uncontended case

This applies the idea that in most cases a rwsem will be uncontended (single threaded). For example, experimentation showed that page fault paths really expect this. The change itself makes the code basically not read a cacheline in a tight loop over and over. Note however that this can be a double-edged sword, as microbenchmarks have shown performance deterioration with high task counts, albeit mainly in pathological workloads.
[Commit ddb20d1d3aed a338ecb07a33]

lib/lockref: limit number of cmpxchg loop retries

Unbounded loops are rather frowned upon, especially ones doing CAS operations. As such, Linus suggested adding an arbitrary upper bound to the loop to force the slowpath (spinlock fallback), which was seen to improve performance on an ad-hoc testcase on hardware that gets caught up in the loop retry game.
[Commit 893a7d32e8e0]
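
A sketch of the bounded-retry pattern (the 100-iteration limit and function name are illustrative, not the kernel's CMPXCHG_LOOP macro): after a fixed number of failed compare-and-swap attempts, bail out so the caller can fall back to the spinlock-protected slow path.

#include <stdatomic.h>
#include <stdbool.h>

#define MAX_CAS_RETRIES 100     /* arbitrary bound; the point is that one exists */

static bool try_increment_fast(atomic_long *counter)
{
        long old = atomic_load_explicit(counter, memory_order_relaxed);

        for (int retry = 0; retry < MAX_CAS_RETRIES; retry++) {
                if (atomic_compare_exchange_weak_explicit(counter, &old, old + 1,
                                memory_order_acq_rel, memory_order_relaxed))
                        return true;    /* fast path won */
                /* 'old' now holds the fresh value; try again. */
        }

        return false;   /* caller takes the lock and does it the slow way */
}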
 

rcu: avoid unnecessary softirqs when system is idle

On an idle system with no pending callbacks, RCU softirqs to process callbacks were being triggered repeatedly. Specifically, a mismatch between cpu_no_qs and core_needs_qs was addressed.
[Commit 671a63517cf9]

rcu: fix potential cond_resched() slowdowns

When using the jiffies_till_sched_qs kernel boot parameter, a bug left jiffies_to_sched_qs initialized to zero, which negatively impacted cond_resched().
[Commit 6973032a602e]

mm: improve vmap allocation

Doing a vmalloc can be quite slow at times, and since it is done with preemption disabled, it can affect workloads that are sensitive to this. The problem lies in the fact that finding a new VA area means iterating over a list of busy areas until a suitable hole is found between two of them. The changes instead use the always reliable red-black tree to keep free blocks sorted by their offsets, along with a list keeping the free space in order of increasing addresses.
[Commit 68ad4a330433 68571be99f32]
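
A conceptual sketch of how a size-augmented search tree can answer "lowest-address free block of at least N bytes" without scanning a list: every node caches the largest free block in its subtree, so the lookup can steer left whenever the left subtree has room. This is a plain, unbalanced binary tree with invented types, not the kernel's red-black tree code.

#include <stddef.h>
#include <stdio.h>

struct free_block {
        unsigned long start;            /* start address of the free range */
        unsigned long size;             /* size of this free range */
        unsigned long subtree_max;      /* largest free size in this subtree */
        struct free_block *left, *right;
};

static unsigned long subtree_max(const struct free_block *n)
{
        return n ? n->subtree_max : 0;
}

/* Find the lowest-address free block that can hold 'size' bytes. */
static struct free_block *find_fit(struct free_block *node, unsigned long size)
{
        while (node) {
                if (subtree_max(node->left) >= size)
                        node = node->left;      /* lower addresses can fit it */
                else if (node->size >= size)
                        return node;            /* this node is the lowest fit */
                else
                        node = node->right;     /* only higher addresses remain */
        }
        return NULL;
}

int main(void)
{
        /* Tiny hand-built tree, ordered by start address. */
        struct free_block a = { 0x1000, 16,  16,  NULL, NULL };
        struct free_block c = { 0x3000, 256, 256, NULL, NULL };
        struct free_block b = { 0x2000, 64,  256, &a, &c };

        struct free_block *fit = find_fit(&b, 32);
        printf("32-byte hole found at 0x%lx\n", fit ? fit->start : 0UL);
        return 0;
}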


mm/gup: safe usage of get_user_pages_fast() with DAX

Users of get_user_pages_fast() get potential performance benefits over its non-fast cousin by avoiding mmap_sem. However, drivers such as rdma can pin these pages for a significant amount of time, which creates a number of issues for the filesystem: long-lived page references block a number of critical operations and are known to mess up DAX. A new FOLL_LONGTERM flag is added and checked accordingly, which also means that other users such as xdp can now also be converted to gup_fast.
[Commit 932f4a630a69 b798bec4741b 73b0140bf0fe 7af75561e171 9fdf4aa15673 664b21e717cf f3b4fdb18cb5 ]
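
Below is a hedged sketch of how a driver that needs such long-lived pins (an rdma-style user) might call the API; the helper name, flag combination and error handling are illustrative rather than copied from any in-tree driver.

#include <linux/errno.h>
#include <linux/mm.h>

/* Illustrative helper: pin nr_pages of a user buffer for long-lived use.
 * The caller supplies the 'pages' array. */
static int pin_user_buffer(unsigned long uaddr, int nr_pages,
                           struct page **pages)
{
        int pinned;

        /*
         * FOLL_LONGTERM signals that these pages may stay pinned
         * indefinitely, letting the core GUP code reject or avoid
         * mappings (such as fs-DAX) where that would be unsafe.
         */
        pinned = get_user_pages_fast(uaddr, nr_pages,
                                     FOLL_WRITE | FOLL_LONGTERM, pages);
        if (pinned < 0)
                return pinned;          /* hard error from GUP */

        if (pinned != nr_pages) {
                /* Partial pin: drop what we got and let the caller retry. */
                while (pinned--)
                        put_page(pages[pinned]);
                return -EFAULT;
        }

        return 0;
}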

lib/sort: faster and smaller

Because CONFIG_RETPOLINE has made indirect calls much more expensive, these changes reduce the number made by the library sort functions, lib/sort and lib/list_sort. A number of optimizations and clever tricks are used, such as a more efficient bottom-up heapsort and playing nicer with store buffers.
[Commit 37d0ec34d111 22a241ccb2c1 8fb583c4258d 043b3f7b6388 b5c56e0cdd62]

ipc/mqueue: make msg priorities truly O(1)

By keeping the pointer to the tree's rightmost node, the process of consuming a message can be done in constant time, instead of logarithmic.
[Commit a5091fda4e3c]
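
A kernel-style sketch of the caching pattern (the prio_node and prio_queue types are invented; this is not the mqueue code): remember the rightmost node on insert, and on removal let the erased node's predecessor become the new cached maximum.

#include <linux/rbtree.h>
#include <linux/types.h>

struct prio_node {
        int priority;
        struct rb_node node;
};

struct prio_queue {
        struct rb_root tree;
        struct rb_node *rightmost;      /* cached highest-priority node */
};

static void prio_insert(struct prio_queue *q, struct prio_node *new)
{
        struct rb_node **link = &q->tree.rb_node;
        struct rb_node *parent = NULL;
        bool is_rightmost = true;       /* stays true while we only go right */

        while (*link) {
                struct prio_node *cur = rb_entry(*link, struct prio_node, node);

                parent = *link;
                if (new->priority < cur->priority) {
                        link = &(*link)->rb_left;
                        is_rightmost = false;   /* took a left turn: not the max */
                } else {
                        link = &(*link)->rb_right;
                }
        }

        if (is_rightmost)
                q->rightmost = &new->node;      /* O(1) maximum from now on */

        rb_link_node(&new->node, parent, link);
        rb_insert_color(&new->node, &q->tree);
}

/* Remove and return the highest-priority entry; the lookup itself is O(1). */
static struct prio_node *prio_pop_max(struct prio_queue *q)
{
        struct rb_node *max = q->rightmost;

        if (!max)
                return NULL;

        q->rightmost = rb_prev(max);    /* predecessor becomes the new max */
        rb_erase(max, &q->tree);
        return rb_entry(max, struct prio_node, node);
}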

x86/fpu: load FPU registers on return to userland

This is a large, 27-patch cleanup and optimization to only load FPU registers on return to userspace, instead of on every context switch. This means that tasks that remain in kernel space do not load the registers. Accessing the FPU registers in the kernel requires disabling preemption and bottom halves (for the scheduler and softirqs, respectively).
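
For kernel code that does need the FPU, the long-standing idiom is to bracket the use with kernel_fpu_begin()/kernel_fpu_end(), which handle the preemption side and save any live user FPU state; a minimal sketch (the function name and body are placeholders):

#include <asm/fpu/api.h>

static void do_simd_work(void)
{
        /* kernel_fpu_begin() disables preemption and saves the task's FPU
         * registers if they are currently live, so vector instructions can
         * be used safely until kernel_fpu_end(). */
        kernel_fpu_begin();
        /* ... SSE/AVX work would go here ... */
        kernel_fpu_end();
}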

x86/hyper-v: implement EOI optimization

Avoid a vmexit on EOI. This was seen to slightly improve IOPS when testing NVMe disks with RAID and ext4.
[Commit ba696429d290]
 

btrfs: improve performance on fsync of files with multiple hardlinks

A fix for a performance regression, seen in pgbench, where an fsync could turn into a full transaction commit in order to avoid losing hard links and new ancestors of the fsynced inode.
[Commit b8aa330d2acb]

fsnotify: fix unlink performance regression

This restores an unlink performance optimization that avoids take_dentry_name_snapshot().
[Commit 4d8e7055a405]

block/bfq: do not merge queues on flash storage with queuing

Disable queue merging on non-rotational devices with internal queueing, thus boosting throughput on interleaved IO.
[Commit 8cacc5ab3eac]

September 10, 2019 07:26 PM

September 08, 2019

James Bottomley: The Mythical Economic Model of Open Source

It has become fashionable today to study open source through the lens of economic benefits to developers and sometimes draw rather alarming conclusions. It has also become fashionable to assume a business model tie and then berate the open source community, or their licences, for lack of leadership when the business model fails. The purpose of this article is to explain, in the first part, the fallacy of assuming any economic tie in open source at all and, in the second part, go on to explain how economics in open source is situational and give an overview of some of the more successful models.

Open Source is a Creative Intellectual Endeavour

All the creative endeavours of humanity, like art, science or even writing code, are often viewed as activities that produce societal benefit. Logically, therefore, the people who engage in them are seen as benefactors of society, but assuming people engage in these endeavours purely to benefit society is mostly wrong. People engage in creative endeavours because it satisfies some deep need within themselves to exercise creativity and solve problems often with little regard to the societal benefit. The other problem is that the more directed and regimented a creative endeavour is, the less productive its output becomes. Essentially to be truly creative, the individual has to be free to pursue their own ideas. The conundrum for society therefore is how do you harness this creativity for societal good if you can’t direct it without stifling the very creativity you want to harness? Obviously society has evolved many models that answer this (universities, benefactors, art incubation programmes, museums, galleries and the like) with particular inducements like funding, collaboration, infrastructure and so on.

Why Open Source development is better than Proprietary

Simply put, the Open Source model, involving huge freedoms for developers to decide direction and great opportunities for collaboration, stimulates the intellectual creativity of those developers to a far greater extent than when you have a regimented project plan and a specific task within it. The most creatively deadening job for any engineer is to find themselves strictly bound within the confines of a project plan for everything. This, by the way, is why simply allowing a percentage of paid time for participating in Open Source seems to enhance input to proprietary projects: the liberated creativity has a knock-on effect even in regimented development. However, obviously, the goal for any corporation dependent on code development should be to go beyond the knock-on effect and actually employ open source methodologies everywhere high creativity is needed.

What is Open Source?

Open Source has its origin in code sharing models: permissive from BSD and reciprocal from GNU. However, one of its great strengths is that the reasons people do open source today aren't the same reasons the framework was created in the first place. Today Open Source is a framework which stimulates creativity among developers and helps them create communities, provides economic benefits to corporations (provided they understand how to harness them) and produces a great societal good in general in terms of published reusable code.

Economics and Open Source

As I said earlier, the framework of Open Source has no tie to economics, in the same way things like artistic endeavour don't. It is possible for a great artist to make money (as Picasso did), but it's equally possible for a great artist to live their whole life in penury (as van Gogh did). The demonstration of the analogy is that trying to measure the greatness of the art by the income of the artist is completely wrong and shortsighted. Developing the ability to exploit your art for commercial gain is an additional skill an artist can develop (or not, as they choose); it's also an ability they could fail at, and in all cases it bears no relation to the societal good their art produces. In precisely the same way, finding an economic model that allows you to exploit open source (either individually or commercially) is firstly a matter of choice (if you have other reasons for doing Open Source, there's no need to bother) and secondly not a guarantee of success, because not all models succeed. Perhaps the easiest way to appreciate this is through the lens of personal history.

Why I got into Open Source

As a physics PhD student, I'd always been interested in how operating systems functioned, but thanks to the BSD lawsuit and being in the UK, I had no access to the actual source code. When Linux came along as a distribution in 1992, it was a revelation: not only could I read the source code but I could have a fully functional UNIX-like system at home, instead of having to queue for time to write up my thesis in TeX on the limited number of department terminals.

After completing my PhD I was offered a job looking after computer systems in the department, and my first success was shaving a factor of ten off the computing budget by buying cheap Pentium systems running Linux instead of proprietary UNIX workstations. This success was nearly derailed by an NFS bug in Linux, but finding and fixing the bug (and getting it upstream into the 1.0.2 kernel) cemented the budget savings and proved to the department that we could handle this new technology for a fraction of the cost of the old. It also confirmed my desire to poke around in the operating system, which I continued to do even as I moved to America to work on proprietary software.

In 2000 I got my first Open Source break when the product I'd been working on got sold to a Silicon Valley startup, SteelEye, whose business plan was to bring High Availability to Linux. As the only person on the team with an Open Source track record, I became first the Architect and later CTO of the company, with my first job being to make the somewhat eccentric Linux SCSI subsystem work for the shared SCSI clusters LifeKeeper then used. Getting SCSI working led to fun interactions with the Linux community, an invitation to present on fixing SCSI at the Kernel Summit in 2002 and the maintainership of SCSI in 2003. From that point, working on upstream open source became a fixture of my job requirements, and, progressing through Novell, Parallels and now IBM, it also became a quality sought by employers.

I have definitely made some money consulting on Open Source, but it’s been dwarfed by my salary which does get a boost from my being an Open Source developer with an external track record.

The Primary Contributor Economic Models

Looking at the active contributors to Open Source, the primary model is that either your job description includes working on designated open source projects, so you're paid to contribute as your day job, or you were hired because of what you've already done in open source and contributing more is a tolerated use of your employer's time. A third, and by far smaller, group is people who work full-time on Open Source but fund themselves either by shared contributions like Patreon or Tidelift or by actively consulting on their projects. However, these models cover existing contributors and they're not really a route to becoming a contributor: employers like certainty, so they're unlikely to hire someone with no track record to work on open source, and are probably not going to tolerate use of their time for developing random open source projects. This means that the route to becoming a contributor, like the route to becoming an artist, is to begin in your own time.

Users versus Developers

Open Source, by its nature, is built by developers for developers. This means that although the primary consumers of open source are end users, they get pretty much no say in how the project evolves. This lack of user involvement has been lamented over the years, especially in projects like the Linux Desktop, but no real community solution has ever been found. The bottom line is that users often don't know what they want, and even if they do they can't put it in technical terms, meaning that all user-driven product development involves extensive and expensive product research which is far beyond the means of any open source project. However, this type of product research is well within the ability of most corporations, who can also afford to hire developers to provide input and influence into Open Source projects.

Business Model One: Reflecting the Needs of Users

In many ways, this has become the primary business model of open source. The theory is simple: develop a traditional customer-focussed business strategy and execute it by connecting the gathered opinions of customers to the open source project in exchange for revenue from subscriptions, support or even early shipped product. The business value to the end user is simple: it's the business value of the product tuned to their needs, plus the fact that they wouldn't be prepared to develop the skills to interact with the open source developer community themselves. This business model starts to break down if the end users acquire developer sophistication, as happens with Red Hat and Enterprise users. However, this can still be combatted by making sure it's economically unfeasible for a single end user to match the breadth of the offering (the entire distribution). In this case, the ability of the end user to become involved in individual open source projects which matter to them is actually a better and cheaper way of doing product research and feeds back into the synergy of this business model.

This business model entirely breaks down when, as in the case of the cloud service provider, the end user becomes big enough and technically sophisticated enough to run their own distributions and sees doing this as a necessary adjunct to their service business. This means that you can no longer escape the technical sophistication of the end user by pursuing a breadth-of-offerings strategy.

Business Model Two: Drive Innovation and Standardization

Although venture capitalists (VCs) pay lip service to the idea of constant innovation, this isn't actually what they do as a business model: they tend to take an innovation and then monetize it. The problem is this model doesn't work for open source: retaining control of an open source project requires a constant stream of innovation within the source tree itself. Single innovations get attention, but unless they're followed up with another innovation they tend to give the impression your source tree is stagnating, encouraging forks. However, the most useful property of open source is that by sharing a project and encouraging contributions, you can obtain a constant stream of innovation from a well managed community. Once you have a constant stream of innovation to show, forking the project becomes much harder, even for a cloud service provider with hundreds of developers, because they must show they can match the innovation stream in the public tree. Add to that standardization, which in open source simply means getting your project adopted for use by multiple consumers (say two different clouds, or a range of industries). Further, if the project is largely run by a single entity and properly managed, seeing the incoming innovations allows you to recruit the best innovators, thus giving you direct ownership of most of the innovation stream. In the early days, you make money simply by offering user connection services as in Business Model One, but the ultimate goal is likely acquisition for the talent possessed, which is a standard VC exit strategy.

All of this points to the hypothesis that the current VC model is wrong. Instead of investing in people with the ideas, you should be investing in people who can attract and lead others with ideas.

Other Business Models

Although the models listed above have proven successful over time, they’re by no means the only possible ones. As the space of potential business models gets explored, it could turn out they’re not even the best ones, meaning the potential innovation a savvy business executive might bring to open source is newer and better business models.

Conclusions

Business models are optional extras with open source, and just because you have a successful open source project does not mean you'll have an equally successful business model unless you put sufficient thought into constructing and maintaining it. Thus a successful open source start-up requires three elements: a sound business model (or someone who can evolve one), a solid community leader and manager, and someone with technical ability in the problem space.

If you like working in Open Source as a contributor, you don’t necessarily have to have a business model at all and you can often simply rely on recognition leading to opportunities that provide sufficient remuneration.

Although there are several well known business models for exploiting open source, there’s no reason you can’t create your own different one but remember: a successful open source project in no way guarantees a successful business model.

September 08, 2019 09:35 AM

August 30, 2019

Pete Zaitcev: Docker Block Storage... say what again?

Found an odd job posting at the website of Rancher:

What you will be doing

Okay. Since they talk about consistency and replication together, this thing probably provides an actual service, in addition to the necessary orchestration. Kind of like the ill-fated Sheepdog. They may underestimate the amount of work necessary, sure. Look no further than Ceph RBD. Remember how much work it took for a genius like Sage? But a certain arrogance is essential in a start-up, and Rancher only employs 150 people.

Also, nobody is dumb enough to write orchestration in Go, right? So this probably is not just a layer on top of Ceph or whatever.

Well, it's still possible that it's merely an in-house equivalent of OpenStack Cinder, and they want it in Go because they are a Go house and if you have a hammer everything looks like a nail.

Either way, here's the main question: what does block storage have to do with Docker?

Docker, as far as I know, is a container runtime. And containers do not consume block storage. They plug into a Linux kernel that presents POSIX to them, only namespaced. Granted, certain applications consume block storage through Linux, which is why we have O_DIRECT. But to roll out a whole service like this just for those rare applications... I don't think so.

Why would anyone want block storage for (Docker) containers? Isn't it absurd? What am I missing and what is Rancher up to?

UPDATE:

The key to remember here is that while running containers aren't using block storage, Docker containers are distributed as disk images, and they get their own root filesystem by default. Therefore, any time anyone adds a Docker container, they have to allocate a block device and dump the application image into it. So, yes, it is some kind of Docker Cinder they are trying to build.

See Red Hat docs about managing Docker block storage in Atomic Host (h/t penguin42).

August 30, 2019 05:52 PM