Kernel Planet

October 30, 2014

Matthew Garrett: Hacker News metrics (first rough approach)

I'm not a huge fan of Hacker News[1]. My impression continues to be that it ends up promoting stories that align with the Silicon Valley narrative (meritocracy, technology will fix everything, regulation is the cancer killing agile startups) and discouraging stories that suggest that the world of technology is, broadly speaking, awful and we should all be ashamed of ourselves.

But as a good data-driven person[2], wouldn't it be nice to have numbers rather than just handwaving? In the absence of a good public dataset, I scraped Hacker Slide to get just over two months of data in the form of hourly snapshots of stories, their age, their score and their position. I then applied a trivial test:

  1. If the story is younger than any other story
  2. and the story has a higher score than that other story
  3. and the story has a worse ranking than that other story
  4. and at least one of these two stories is on the front page
then the story is considered to have been penalised.

(note: "penalised" can have several meanings. It may be due to explicit flagging, or it may be due to an automated system deciding that the story is controversial or appears to be supported by a voting ring. There may be other reasons. I haven't attempted to separate them, because for my purposes it doesn't matter. The algorithm is discussed here.)
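
Expressed as code, the pairwise test looks roughly like this (an illustrative sketch only; the struct and field names are made up rather than taken from the actual scraper):

#include <stdbool.h>

/* One hourly snapshot of one story, as scraped from Hacker Slide.
 * Field names here are illustrative, not the real dataset's. */
struct story {
	double age_hours;   /* time since submission */
	int    score;       /* points at this snapshot */
	int    rank;        /* 1 = top story; larger numbers = worse position */
	bool   front_page;  /* currently on the front page? */
};

/* True if 'a' appears to have been penalised relative to 'b': it is
 * younger and higher-scoring yet ranked worse, and at least one of
 * the two stories is on the front page. */
static bool penalised_relative_to(const struct story *a, const struct story *b)
{
	return a->age_hours < b->age_hours &&
	       a->score > b->score &&
	       a->rank > b->rank &&
	       (a->front_page || b->front_page);
}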

Now, ideally I'd classify my dataset based on manual analysis and classification of stories, but I'm lazy (see [2]) and so just tried some keyword analysis:
Keyword     Penalised   Unpenalised
Women          13             4
Harass          2             0
Female          5             1
Intel           2             3
x86             3             4
ARM             3             4
Airplane        1             2
Startup        46            26

A few things to note:
  1. Lots of stories are penalised. Of the front page stories in my dataset, I count 3240 stories that have some kind of penalty applied, against 2848 that don't. The default seems to be that some kind of detection will kick in.
  2. Stories containing keywords that suggest they refer to issues around social justice appear more likely to be penalised than stories that refer to technical matters.
  3. There are other topics that are also disproportionately likely to be penalised. That's interesting, but not really relevant - I'm not necessarily arguing that social issues are penalised out of an active desire to make them go away, merely that the existing ranking system tends to result in it happening anyway.

This clearly isn't an especially rigorous analysis, and in future I hope to do a better job. But for now the evidence appears consistent with my innate prejudice - the Hacker News ranking algorithm tends to penalise stories that address social issues. An interesting next step would be to attempt to infer whether the reasons for the penalties are similar between different categories of penalised stories[3], but I'm not sure how practical that is with the publicly available data.

(Raw data is here, penalised stories are here, unpenalised stories are here)


[1] Moving to San Francisco has resulted in it making more sense, but really that just makes me even more depressed.
[2] Ha ha like fuck my PhD's in biology
[3] Perhaps stories about startups tend to get penalised because of voting ring detection triggered by people trying to promote their startup, while stories about social issues tend to get penalised because of controversy detection?


October 30, 2014 06:18 PM

Matthew Garrett: Linux Container Security

First, read these slides. Done? Good.

(Edit: Just to clarify - these are not my slides. They're from a presentation Jerome Petazzoni gave at Linuxcon NA earlier this year)

Hypervisors present a smaller attack surface than containers. This is somewhat mitigated in containers by using seccomp, selinux and restricting capabilities in order to reduce the number of kernel entry points that untrusted code can touch, but even so there is simply a greater quantity of privileged code available to untrusted apps in a container environment when compared to a hypervisor environment[1].
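
To make the mitigation concrete, here is a minimal sketch, not taken from any particular container runtime, of the kind of restriction involved: drop an example capability from the bounding set, then install a default-deny libseccomp policy so that untrusted code can only reach a handful of kernel entry points. The capability and the tiny syscall whitelist are arbitrary placeholders.

#include <errno.h>
#include <linux/capability.h>
#include <seccomp.h>        /* libseccomp */
#include <stdlib.h>
#include <sys/prctl.h>

/* Shrink the kernel attack surface available to a contained process. */
static void restrict_self(void)
{
	/* Drop an unneeded capability from the bounding set (example only). */
	if (prctl(PR_CAPBSET_DROP, CAP_SYS_ADMIN, 0, 0, 0) < 0)
		exit(1);

	/* Default-deny filter: any syscall not whitelisted below fails with EPERM. */
	scmp_filter_ctx ctx = seccomp_init(SCMP_ACT_ERRNO(EPERM));
	if (ctx == NULL)
		exit(1);
	seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(read), 0);
	seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(write), 0);
	seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(exit_group), 0);
	if (seccomp_load(ctx) < 0)
		exit(1);
	seccomp_release(ctx);
}

(Link against libseccomp; real runtimes whitelist a far longer syscall list and combine this with capability dropping, namespaces, cgroups and an LSM policy.)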

Does this mean containers provide reduced security? That's an arguable point. In the event of a new kernel vulnerability, container-based deployments merely need to upgrade the kernel on the host and restart all the containers. Full VMs need to upgrade the kernel in each individual image, which takes longer and may be delayed due to the additional disruption. In the event of a flaw in some remotely accessible code running in your image, an attacker's ability to cause further damage may be restricted by the existing seccomp and capabilities configuration in a container. They may be able to escalate to a more privileged user in a full VM.

I'm not really compelled by either of these arguments. Both argue that the security of your container is improved, but in almost all cases exploiting these vulnerabilities would require that an attacker already be able to run arbitrary code in your container. Many container deployments are task-specific rather than running a full system, and in that case your attacker is already able to compromise pretty much everything within the container. The argument's stronger in the Virtual Private Server case, but there you're trading that off against losing some other security features - sure, you're deploying seccomp, but you can't use selinux inside your container, because the policy isn't per-namespace[2].

So that seems like kind of a wash - there are maybe marginal increases in practical security for certain kinds of deployment, and perhaps marginal decreases for others. We end up coming back to the attack surface, and it seems inevitable that that's always going to be larger in container environments. The question is, does it matter? If the larger attack surface still only results in one more vulnerability per thousand years, you probably don't care. The aim isn't to get containers to the same level of security as hypervisors, it's to get them close enough that the difference doesn't matter.

I don't think we're there yet. Searching the kernel for bugs triggered by Trinity shows plenty of cases where the kernel screws up from unprivileged input[3]. A sufficiently strong seccomp policy plus tight restrictions on the ability of a container to touch /proc, /sys and /dev helps a lot here, but it's not full coverage. The presentation I linked to at the top of this post suggests using the grsec patches - these will tend to mitigate several (but not all) kernel vulnerabilities, but there are tradeoffs in (a) ease of management (having to build your own kernels) and (b) performance (several of the grsec options reduce performance).

But this isn't intended as a complaint. Or, rather, it is, just not about security. I suspect containers can be made sufficiently secure that the attack surface size doesn't matter. But who's going to do that work? As mentioned, modern container deployment tools make use of a number of kernel security features. But there's been something of a dearth of contributions from the companies who sell container-based services. Meaningful work here would include things like:


These aren't easy jobs, but they're important, and I'm hoping that the lack of obvious development in areas like this is merely a symptom of the youth of the technology rather than a lack of meaningful desire to make things better. But until things improve, it's going to be far too easy to write containers off as a "convenient, cheap, secure: choose two" tradeoff. That's not a winning strategy.

[1] Companies using hypervisors! Audit your qemu setup to ensure that you're not providing more emulated hardware than necessary to your guests. If you're using KVM, ensure that you're using sVirt (either selinux or apparmor backed) in order to restrict qemu's privileges.
[2] There's apparently some support for loading per-namespace Apparmor policies, but that means that the process is no longer confined by the sVirt policy
[3] To be fair, last time I ran Trinity under Docker under a VM, it ended up killing my host. Glass houses, etc.


October 30, 2014 01:11 AM

Matthew Garrett: On joining the FSF board

I joined the board of directors of the Free Software Foundation a couple of weeks ago. I've been travelling a bunch since then, so haven't really had time to write about it. But since I'm currently waiting for a test job to finish, why not?

It's impossible to overstate how important free software is. A movement that began with a quest to work around a faulty printer is now our greatest defence against a world full of hostile actors. Without the ability to examine software, we can have no real faith that we haven't been put at risk by backdoors introduced through incompetence or malice. Without the freedom to modify software, we have no chance of updating it to deal with the new challenges that we face on a daily basis. Without the freedom to pass that modified software on to others, we are unable to help people who don't have the technical skills to protect themselves.

Free software isn't sufficient for building a trustworthy computing environment, one that not merely protects the user but respects the user. But it is necessary for that, and that's why I continue to evangelise on its behalf at every opportunity.

However.

Free software has a problem. It's natural to write software to satisfy our own needs, but in doing so we write software that doesn't provide as much benefit to people who have different needs. We need to listen to others, improve our knowledge of their requirements and ensure that they are in a position to benefit from the freedoms we espouse. And that means building diverse communities, communities that are inclusive regardless of people's race, gender, sexuality or economic background. Free software that ends up designed primarily to meet the needs of well-off white men is a failure. We do not improve the world by ignoring the majority of people in it. To do that, we need to listen to others. And to do that, we need to ensure that our community is accessible to everybody.

That's not the case right now. We are a community that is disproportionately male, disproportionately white, disproportionately rich. This is made strikingly obvious by looking at the composition of the FSF board, a body made up entirely of white men. In joining the board, I have perpetuated this. I do not bring new experiences. I do not bring an understanding of an entirely different set of problems. I do not serve as an inspiration to groups currently under-represented in our communities. I am, in short, a hypocrite.

So why did I do it? Why have I joined an organisation whose founder I publicly criticised for making sexist jokes in a conference presentation? I'm afraid that my answer may not seem convincing, but in the end it boils down to feeling that I can make more of a difference from within than from outside. I am now in a position to ensure that the board never forgets to consider diversity when making decisions. I am in a position to advocate for programs that build us stronger, more representative communities. I am in a position to take responsibility for our failings and try to do better in future.

People can justifiably conclude that I'm making excuses, and I can make no argument against that other than to be asked to be judged by my actions. I hope to be able to look back at my time with the FSF and believe that I helped make a positive difference. But maybe this is hubris. Maybe I am just perpetuating the status quo. If so, I absolutely deserve criticism for my choices. We'll find out in a few years.


October 30, 2014 12:45 AM

October 27, 2014

Paul E. Mc Kenney: Lies that firmware tells RCU

One of the complaints that real-time people have against some firmware is that it lies about its age, attempting to cover up cycle-stealing via SMIs by reprogramming the TSC. Some firmware goes farther and lies about the number of CPUs on the system, apparently on the grounds that more is better, regardless of how many of those alleged CPUs actually exist.

RCU used to naively believe the firmware, and would therefore create one set of rcuo kthreads per advertised CPU. On some systems, this resulted in hundreds of such kthreads on systems with only a few tens of CPUs. But RCU can choose to create the rcuo kthreads only for CPUs that actually come online. Problem solved!

Mostly solved, that is.

Yanko Kaneti, Jay Vosburgh, Meelis Roos, and Eric B Munson discovered the “mostly” part when they encountered hangs in _rcu_barrier(). So what is rcu_barrier()?

The rcu_barrier() primitive waits for all pre-existing callbacks to be invoked. This is useful when you want to unload a module that uses call_rcu(), as described in this LWN article. It is important to note that rcu_barrier() does not necessarily wait for a full RCU grace period. In fact, if there are currently no RCU callbacks queued, rcu_barrier() is within its rights to simply return immediately. Otherwise, rcu_barrier() enqueues a callback on each CPU that already has callbacks, and waits for all these callbacks to be invoked. Because RCU is careful to invoke all callbacks posted to a given CPU in order, this guarantees that by the time rcu_barrier() returns, all pre-existing RCU callbacks will have already been invoked, as required.
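
The canonical use case looks something like the following schematic module-unload path (a sketch for illustration, not code from any particular module):

#include <linux/module.h>
#include <linux/rcupdate.h>
#include <linux/slab.h>

struct foo {
	struct rcu_head rcu;
	/* ... payload ... */
};

static void foo_free_rcu(struct rcu_head *head)
{
	kfree(container_of(head, struct foo, rcu));
}

static void foo_delete(struct foo *p)
{
	/* Readers might still hold references, so defer the free. */
	call_rcu(&p->rcu, foo_free_rcu);
}

static void __exit foo_exit(void)
{
	/* ... unpublish all foo objects so no new callbacks get queued ... */

	/* Wait for every callback queued by call_rcu() above to be invoked,
	 * so that none of them runs after the module text is gone. */
	rcu_barrier();
}
module_exit(foo_exit);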

However, it is possible to offload invocation of a given CPU's RCU callbacks to rcuo kthreads, as described in this LWN article. This kthread might well be executing on some other CPU, which means that the callbacks are moved from one list to another as they pass through their lifecycles. This makes it difficult for rcu_barrier() to reliably determine whether or not there are RCU callbacks pending for an offloaded CPU. So rcu_barrier() simply unconditionally enqueues an RCU callback for each offloaded CPU, regardless of that CPU's state.

In fact, rcu_barrier() even enqueues a callback for offloaded CPUs that are offline. The reason for this odd-seeming design decision is that a given CPU might enqueue a huge number of callbacks, then go offline. It might take the corresponding rcuo kthread significant time to work its way through this backlog of callbacks, which means that rcu_barrier() cannot safely assume that an offloaded CPU is callback-free just because it happens to be offline. So, to come full circle, rcu_barrier() enqueues an RCU callback for all offloaded CPUs, regardless of their state.

This approach works quite well in practice.

At least, it works well on systems where the firmware provides the Linux kernel with an accurate count of the number of CPUs. However, it breaks horribly when the firmware over-reports the number of CPUs, because the system will then have CPUs that never ever come online. If these CPUs have been designated as offloaded CPUs, this means that their rcuo kthreads will never ever be spawned, which in turn means that any callbacks enqueued for these mythical CPUs will never ever be invoked. And because rcu_barrier() waits for all the callbacks that it posts to be invoked, rcu_barrier() ends up waiting forever, which can of course result in hangs.

The solution is to make rcu_barrier() refrain from posting callbacks for offloaded CPUs that have never been online, in other words, for CPUs that do not yet have an rcuo kthread.
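
In rough pseudo-kernel-C the change amounts to something like the sketch below; the helper names are hypothetical stand-ins for the kernel's internal checks, not the names used in the actual patch:

/* Illustrative sketch only -- not the real rcu_barrier() internals.
 * cpu_is_offloaded(), cpu_has_rcuo_kthread(), cpu_has_callbacks() and
 * queue_barrier_callback() are hypothetical helpers. */
static void rcu_barrier_enqueue_callbacks(void)
{
	int cpu;

	for_each_possible_cpu(cpu) {
		if (cpu_is_offloaded(cpu)) {
			/* An offloaded CPU that has never come online has no
			 * rcuo kthread, so a callback queued here would never
			 * be invoked and rcu_barrier() would hang. Skip it. */
			if (!cpu_has_rcuo_kthread(cpu))
				continue;
			queue_barrier_callback(cpu);
		} else if (cpu_has_callbacks(cpu)) {
			queue_barrier_callback(cpu);
		}
	}
}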

With some luck, this patch will clear things up. And I did take the precaution of reviewing all of RCU's uses of for_each_possible_cpu(), so here is hoping! ;-)

October 27, 2014 10:09 PM

James Morris: Linux Security Summit 2014 Wrap-Up

The slides from the 2014 Linux Security Summit in August may be found linked at the schedule.

LWN covered both the James Bottomley keynote, and the SELinux on Android talk by Stephen Smalley.

We had an engaging and productive two days, with strong attendance throughout.  We’ll likely follow a similar format next year at LinuxCon.  I hope we can continue to expand the contributor base beyond mostly kernel developers.  We’re doing ok, but can certainly do better.  We’ll also look at finding a sponsor for food next year.

Thanks to those who contributed and attended, to the program committee, and of course, to the events crew at Linux Foundation, who do all of the heavy lifting logistics-wise.

See you next year!

October 27, 2014 12:56 PM

October 23, 2014

Dave Jones: Trinity and pages of random data.

Something trinity uses a lot is pages of random data. They get passed around to syscalls, ioctls, whatever. 5 years ago, before I’d even added multiple children to trinity, this was done using ‘page_rand’: a single page allocated on startup that was passed around and scribbled over by anyone who needed something to scribble over.

After the VM work I did earlier this year, where we recycle successful calls to mmap, and inherit them across children, quite a few places started passing around map structs instead. This was good, because it started shaking out the many many kernel bugs that we had lingering in huge page support.

It kind of sucked that we had two sets of routines for doing things like “get a page”, “dirty a page” etc which were fundamentally the same operations, except one set worked on a pointer, and one on a struct. It also sucked that the page_rand code was actually buggy in a number of ways, which showed up as overruns.

Over time, I’ve been trying to move all the code that used page_rand to using mappings instead. Today I finished that work, and ripped out the last vestiges of page_rand support. The only real remnants of the supporting code were some of the dirtying code. We used to have separate ‘dirty page_rand’ and ‘dirty an mmap’ routines. After today’s work, there’s now a single set of functions for mappings. There’s still a bunch more consolidation and cleanup to do, which I’ll get fixed up and merged over the next week.

The only feature that’s now missing is periodic dirtying of mappings. We did this every 100 syscalls for page_rand. Right now we only dirty mappings after a mmap() call succeeds, or on an mremap(). I plan on getting this done tomorrow.

The motivation for ripping out all this code, and unifying a lot of the support code is that a lot of code paths get simpler, and more importantly, the code in place now takes ‘len’ arguments, so we’re in a better position to make sure we’re not passing buffers that are too small when we do random syscalls.

In other news: while I was happy to report a few days ago that 3.18rc1 fixed up the btrfs bug that had been bothering me for a while, I’ve now managed to discover two new btrfs bugs [1], [2]. Grumble.


October 23, 2014 02:33 AM

October 19, 2014

Pete Zaitcev: Laptop bleg

I'm considering a laptop (actually two). Requirements:

Where it comes from is mostly my wife's Sony Vaio Z. I used to have a Z back in 2001 or so, when they were in 12" format. It was the best laptop ever, but unfortunately it succumbed to a DC-DC converter failure. The modern Z is not like that Z. The most super annoying problem is that the screws holding the battery failed in an interesting way: it is impossible to remove the battery now. Also, the contact between the battery and the motherboard is marginal. I managed to fix the problem by manufacturing a finely shaped wooden wedge that I drove into a gap and thus extended the life of that thing, but man, Sony, this is disappointing.

Unfortunately, I don't remember if it was Kota or Daisuke, but one of the Japanese guys at a recent Swift Hackathon in Boston had a Z of similar vintage, and it looked impeccable. Maybe Sony figured that this is going to be the predominant mode of care that their wares receive, and so why not make the modern Z so much cheaper than the old, indestructible Z. But they still charge exorbitant prices.

Lenovo wins a special notice because I had a T400 for 3 years and swore never to deal with it ever again. The biggest problem is the keyboard layout, because I use my left pinky for the Control key. I could live with their idiotic placement of Escape, but I refuse to deal with 3 years of physical pain again. Also, their famous quality seems to be slipping, as my mouse button broke within 3 years. The battery died, too. However, the T400 had a very good display, and I would like another like that, if possible.

October 19, 2014 04:29 PM

October 18, 2014

Pavel Machek: N900 nfs root

So you'd like to develop on Nokia N900... It has serial port, but with "interesting" connector. It has keyboard, but with "interesting" keyboard map, you mostly need full X to be useful... and it is too small for serious typing, anyway. You could put root filesystem on SD card, but that is disconnected when back cover is removed. And with back cover in place, you can't reset the machine.

Ok, so NFS. Insecure, tricky to set up, but it actually makes development usable. I started with commit 4f3e8d263^ (because that should have working usb networking according to mailing lists), and with the config from the same page. The disadvantage is that video does not work with that configuration... but setting up a system blind should not be that hard, right?

Assembling a minimal system with busybox so I could run the second stage of debootstrap was tricky, and hacking into the resulting Debian was not easy either, but now I have telnet connections and things should only improve.

October 18, 2014 07:43 PM

Dave Jones: Trinity updates

Over a month ago, I posted about some pthreads work I was experimenting with in Trinity, and how that wasn’t really working out. After taking a short vacation, I came back with no real epiphanies, and decided to back-burner that work for now, and instead refocus on fixing up some other annoying problems that I’d stumbled across while doing that experimenting. Some of these problems were actually long-standing bugs in trinity. So that’s pretty much all I’ve been working on for the last month, and I’m now pretty happy with how long it runs for (providing you don’t hit a kernel bug first).

The primary motivation was to fix a problem where trinity’s internal data structures would get corrupted. After a series of debugging patches, I found a number of places where a child process would overrun a buffer it had allocated.

First up: the code that takes syscalls arguments and renders them into a human-readable string. In some cases this would write huge strings past the end of the buffer. One example of this was the instance where trinity would generate a random pathname. It would sometimes generate complete garbage, which was fine until it came to printing it out. Fixed by deleting lots of code in the pathname generator. Stressing the negative dentry case was never that interesting anyway. After fixing up a few other cases in the argument generator I looked at the code that performs rendering to buffers. None of this code took length parameters, or took into account the remaining space in the buffers. Fairly quick rewrite took care of that.

After these bugs were fixed trinity would (on a good kernel) run for a really long time without incident. With longer runtimes, a few more obscure corner cases turned up.

There were 2-3 cases where the watchdog process would hang waiting for a condition that would never be met (due to losing track of how many running child processes there were). I’m still not happy that this can even occur but it is at least a little less likely to hang when it happens now. I’ll investigate the actual cause for this later.

Another fun watchdog bug: we keep track of the time stamp a child performed its last syscall at, and check to make sure 1 second later that it has increased by some small amount. To make sure we haven’t corrupted our own state, there’s also a sanity check that we haven’t jumped into the future. But we also have to compensate for the possibility that adjtimex was the random syscall we did. That takes a maximum offset of 2145. The code checked for that but forgot to also add the one second since the last time we checked.
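
The fix is just arithmetic. Roughly, with made-up names, and taking 2145 as the maximum adjtimex offset mentioned above:

#include <time.h>

/* Illustrative only -- not Trinity's actual source. The "no timestamps
 * from the future" sanity check has to allow for both the interval since
 * the last watchdog pass and the largest skew a random adjtimex() call
 * could have introduced. */
#define MAX_ADJTIMEX_OFFSET	2145	/* per the post */
#define WATCHDOG_INTERVAL	1	/* seconds between watchdog checks */

static int timestamp_is_sane(time_t child_stamp, time_t last_check)
{
	/* The buggy version omitted "+ WATCHDOG_INTERVAL". */
	return child_stamp <= last_check + MAX_ADJTIMEX_OFFSET + WATCHDOG_INTERVAL;
}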

There’s been a bunch of small 1-2 line fixes like this lately, but I’m sitting on a larger set of changes that I’ll start to trickle into git next week, which moves towards cleaning up the “create a random page to pass to syscalls” code, which has been another fun source of corruption bugs.

In kernel news: The only interesting bugs this week that Trinity has shown up, have been two ext4 bugs. Diagnosing those has pointed out some more enhancements that are needed to the post-mortem code in trinity. Once I’ve cleared the current backlog of patches, I’ll work on adding better tracking of fd’s in the logging code. In other news, the btrfs bug trinity hit in August is now fixed in 3.17+ git.


October 18, 2014 03:11 PM

October 16, 2014

Michael Kerrisk (manpages): man-pages-3.75 is released

I've released man-pages-3.75. The release tarball is available on kernel.org. The browsable online pages can be found on man7.org. The Git repository for man-pages is available on kernel.org.

This is quite a small release. The most notable changes in man-pages-3.75 are the following:

October 16, 2014 08:47 AM

October 10, 2014

Grant Likely: git.secretlab.ca is down

For anyone who has been using git.secretlab.ca, the server is currently down and I don’t know when it will be back up. I’ve moved my Linux kernel tree over to kernel.org. The new tree can be found here:

https://git.kernel.org/cgit/linux/kernel/git/glikely/linux.git

The other trees will be back when git.secretlab.ca returns to life.

October 10, 2014 01:53 PM

October 06, 2014

Valerie Aurora: Operating systems war story: How feminism helped me solve one of file systems’ oldest conundrums

[Photo: Valerie Aurora in 2009, wearing a t-shirt with the word "O_PONIES" in Courier font (keep reading to find out why). CC BY-NC-SA Robert Kaye]

Hi, my name is Valerie Aurora, and I am the inventor of a software feature that has prevented billions of unnecessary writes to hard drives, saving energy and making our computers faster. My invention is called “relative atime,” and this is the story of how my feminist approach to computing helped me invent it – and what you can do to support women in open source software. (If you’re already convinced we need more women in open source, here’s a link to donate now to the Ada Initiative’s 2014 fundraising drive. My operating systems war story will still be here when you’re finished!)

Donate now

First, a little background for those of you who don’t live and breathe UNIX file systems performance. Ingo Molnar once called the access time, or “atime” feature of UNIX file systems “perhaps the most stupid Unix design idea of all times.” That’s harsh but fair. See, every time you read a file on a UNIX operating system – which includes OS X, Linux, and Android[1] – it is supposed to update the file to record the last time it was read, or accessed. This is called the access time or atime. Cool, right? You can imagine why it’s helpful to know when was the last time anything read a particular file – you can tell if you have new mail, for example, or figure out which files you haven’t used in a while and can throw away.

The problem with the atime feature is that updating atime requires writing to the disk. So every read to a file creates a tiny disk write – and writes are expensive and slow. (SSDs don’t get rid of this problem; you still don’t want to do unnecessary writes and most of the world’s data is still on spinning disks.) Here’s what Ingo said about this in 2006: “Atime updates are by far the biggest IO performance deficiency that Linux has today. Getting rid of atime updates would give us more everyday Linux performance than all the pagecache speedups of the past 10 years, _combined_.”

So, atime is a terrible idea – why don’t we just turn it off? That’s what many people did, using the “noatime” option that many file systems provide. The problem was that many programs did need to know the atime of a file to work properly. So most Linux distributions shipped with atime on, and it was up to the user to remember to turn it off (if they could). It was a bad situation.

[Image: LinuxChix logo, a cartoon of a woman driving a robot penguin]

In 2006, I was a Linux file systems developer and also an active member of LinuxChix, a group for women who used Linux. LinuxChix existed in part because it was impossible to have technical discussions about Linux on most mailing lists without people insulting and flaming you for asking the simplest questions – and it was ten times worse for people with feminine usernames. Tell a cautionary story about installing RAM correctly, and the response might be a sneering, “Oh, you didn’t let out the magic smoke, did you?” On LinuxChix, that kind of obnoxiousness wasn’t allowed (though we still got a lot of what is now called mansplaining.)

So when I advised several people in LinuxChix to turn off atime, a friend felt safe telling me that hey, performance on her laptop was better, but now Mutt, the email reader we both used, thought she always had new email. This is because in her configuration, Mutt would look at an email file and compare its atime with the file’s last written time to figure out if any new email had arrived since the last time it read the file.
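
The check Mutt was relying on is tiny. In essence (a simplified sketch, not Mutt's actual source):

#include <stdbool.h>
#include <sys/stat.h>

/* Has this mailbox been written to (new mail delivered) since the last
 * time it was read? This is the question atime lets a mail reader answer
 * cheaply. */
static bool has_new_mail(const char *mbox_path)
{
	struct stat st;

	if (stat(mbox_path, &st) != 0)
		return false;

	/* Last write newer than last read: mail arrived since we looked. */
	return st.st_mtime > st.st_atime;
}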

Now, the typical answer to “Mutt doesn’t work with noatime” was “Switch to a slower directory-based method,” or “Use a file size hack that had bugs,” or any number of other unhelpful things. Mostly, people just wouldn’t bother reporting things that broke with noatime. But I was part of a culture – a feminist culture – in which I respected people like my friend and programmers that attempted to use fully defined, useful features of UNIX in order to implement features efficiently.

I decided to look at the problem from a human point of view. What my friend and the Mutt programmers really wanted to know was this: Has this file been written since the last time I read it? They didn’t particularly care about the exact time of the last read, they just wanted to know if it had been read before or after the last write. I had an idea: What if we only updated a file’s atime if it would change the answer to the question, “Has this file been read since the last time it was written?” I called it “relative atime.”[2]

The amazing thing is: it worked! Matthew Garrett (also a known feminist), Ingo Molnar, and Andrew Morton made some changes to the patch, including updating the atime if the current atime was more than 24 hours old. Other than that, this incredibly simple algorithm worked well enough that in 2009, relative atime became the default in the mainline Linux kernel tree. Now, by default, people’s computers were fast and their programs worked.
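
The resulting rule fits in a few lines. Here is a sketch of the decision, modelled on the behaviour described above rather than copied from the kernel source:

#include <stdbool.h>
#include <time.h>

#define ONE_DAY (24 * 60 * 60)

/* Should a read update the on-disk atime under relative atime? Only when
 * the update would change the answer to "has this file been written since
 * it was last read?", or when the existing atime is more than a day stale
 * (the 24-hour amendment mentioned above). */
static bool relatime_needs_update(time_t atime, time_t mtime, time_t now)
{
	if (atime <= mtime)		/* written since the last recorded read */
		return true;
	if (now - atime >= ONE_DAY)	/* cap atime staleness at one day */
		return true;
	return false;
}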

I came up with this idea and the original patch in 2006, when the atime problem had been known for many years. Previous solutions had taken a very file-system-centric point of view, mainly along the lines of buffering up atime updates in memory and writing them out when we ran out of memory. What led me to a creative, simple, and extremely fast solution was being part of a feminist community in which people felt comfortable sharing their technical problems, wanted to help each other, and respected each other’s intelligence. Those are all feminist principles, and they make file systems development better.

I try to take that human-centered, feminist approach with other topics in file systems, including the great fsync()/rename() debate of 2009 (a.k.a “O_PONIES”) in which I argued that file systems developers should strive to make life easier for developers and users, not harder. As recently as 2013, a leading file systems developer was still arguing that file systems didn’t have to save file data reliably by mocking users for playing computer games.

I was working on another human-centered file system feature, union mounts, when I heard that a friend of mine had been groped at an open source conference for the third time in one year. While I loved my file systems work, I felt like stopping sexual harassment and assault of women in open source was more urgent, and that I was uniquely qualified to work on it. (I myself had been groped by another Linux storage developer.) So I quit my job as a Linux kernel developer and co-founded the Ada Initiative, whose mission is supporting women in open technology and culture. Unfortunately, as a result of my work, several more Linux storage developers came out publicly in favor of harassment and assault.

That’s one reason why I’m so excited that Ceph developer Sage Weil challenged the open storage community to raise $8192 for the Ada Initiative by Wednesday, Oct. 8 – and he’ll personally match that amount if we reach the goal! UPDATED: Sage and Mike Perez raised this to $16,384!!! The number of Linux file systems and storage developers who both donated to Sage’s challenge and wanted to be listed publicly as supporters is reminding me that the vast majority of the people I worked with in Linux want women to feel safe and comfortable in their community. I love file systems development, I love writing kernel code, and I miss working with and seeing my Linux friends. And as you can tell by the lack of something like union mounts in the mainline kernel 21 years after the first implementation, Linux file systems and storage does not have enough developers, and can’t afford to keep driving off women developers.

[Photo: Me teaching the Ally Skills Workshop]

The Ada Initiative is capable of changing this situation. In August 2014, I taught the first Ally Skills Workshop at a Linux Foundation-run conference, LinuxCon North America. The Ally Skills Workshop teaches men simple everyday ways to support women in their workplaces and communities, and teaching it is my favorite part of my work. I was happy to see several Linux file systems and storage developers at the workshop. I was still nervous about running into the developers who support harassment and assault, but seeing how excited people were after the Ally Skills Workshop made it all worthwhile.

If you’d like to see more people working on Linux storage and file systems, and especially more women, please join Sage Weil and more than 30 other open storage developers in supporting the Ada Initiative. Donate now:

Donate now

Edited to add 10/6/2014: Sage made his goal, hurray! And here’s my favorite comment from the HN thread about this story, the only one actually flagged into non-existence (plenty of other creepy misogyny elsewhere though):

[Screenshot of the flagged comment]

Also, I had no idea Lennart Poettering planned to post this detailed description of the abuse, harassment, and death threats he’s suffered as an open source developer.

We’re still raising money for Ada Initiative to fight this kind of harassment, so feel free to donate:

Donate now

[1] Yes, Android is Linux too, I’m just naming the brands that non-operating systems experts would recognize.

[2] “Relative atime” isn’t so bad, but the name of the option that you pass to the kernel, “relatime”, showed my usual infelicity with naming things as it looks like a misspelling of “realtime”.


Tagged: ada initiative, feminism, filesystems, kernel, linux

October 06, 2014 09:36 PM

Michael Kerrisk (manpages): man-pages-3.74 is released

I've released man-pages-3.74. The release tarball is available on kernel.org. The browsable online pages can be found on man7.org. The Git repository for man-pages is available on kernel.org.

Aside from various minor changes to many pages, the most notable changes in man-pages-3.74 are the following:

October 06, 2014 06:06 PM

October 03, 2014

Daniel Vetter: Neat drm/i915 stuff for 3.18

Since Dave Airlie moved the feature cut-off of the drm-next tree roughly one month ahead it is already time for our regular look at what's ahead. Even though the 3.17 features aren't even released yet.

On the modeset side of things we now have the final pieces for plane rotation support from Sonika Jindal and Ville. The DisplayPort code has also seen lots of improvements, with updated training values in preparation of the latest eDP standard (Sonika Jindal) and support for DP training pattern 3 (Ville). DSI panels now support burst mode (Shobhit) and hdmi conformance has been improved with some fixes from Clint Taylor.

For eDP panels we also have improved panel power sequencing code, mostly to fix issues on Cherryview, from Ville. Ville has also contributed fixes to the VDD handling code, which is used to temporarily enable panel power. And the backlight code learned to handle the bl_power setting so that the backlight can be turned off completely without upsetting the panel's power sequencing, contributed by Jani.

Chris Wilson has also been fairly busy on the modeset code: 3.18 includes his patches to cache EDIDs for a single probe call - unfortunately the full caching solution to keep the EDID around between multiple probe calls isn't merged yet. And pageflips have now improved error detection and recovery logic: In case something goes wrong we shouldn't end up stuck any longer waiting for a pageflip to complete that has been lost by either the hardware or the driver.

Moving on to platform specific work, there have been lots of preparations for Skylake, most of it from Damien and Sonika. The actual initial platform enabling is delayed for 3.19 though. On the other end of the timeline Ville fixed up i830M modeset support on a rainy w/e in his vacation, and 3.18 now has all that code. And there have been a lot of Cherryview fixes all over.

Cherryview also gained support for power wells and hence runtime pm (Ville). And on the platform agnostic side, a lot of the preparation for DRRS (dynamic refresh rate switching) is merged; hopefully the actual feature patches from Vandana Kannan will land in 3.19.

Moving on to the render side of the driver, there have been a lot of patches to beat the full ppgtt support into shape. The context code has been cleaned up, lifetime handling for ppgtt address spaces is fixed and bad interactions with secure batches are now also rectified. Enabling full ppgtt missed the feature cutoff by a hair though, but it's already being enabled for the following release.

Basic support for execlists command submission from Ben Widawsky, Oscar Mateo and Thomas Daniel was also merged. This is the fancy new way to submit commands available on Gen8 and subsequent platforms. It's not yet enabled by default, but since it's a requirement for a lot of cool new features keep an eye on what's going on here. There is also a lot of work going on underneath to enable all this new code in GEM, like preparing to switch away from sequence numbers to tracking gpu progress more abstractly using the driver's request structures.

And this time around there is also some cool stuff going on in the drm core worthy of a shout-out: The vblank handling code is massively revamped, hopefully plugging all the small races, inconsistencies and inefficiencies in that code. And thanks to David Herrmann it is finally possible to write a drm driver without the drm midlayer getting completely in the way of a proper driver load and unload sequence! Unfortunately i915 can't be converted right away since the legacy usermodesetting code crucially relies on this midlayer functionality. But that's now well deprecated and hopefully can be removed in one of the next releases.

October 03, 2014 03:47 PM

October 02, 2014

Matthew Garrett: Actions have consequences (or: why I'm not fixing Intel's bugs any more)

A lot of the kernel work I've ended up doing has involved dealing with bugs on Intel-based systems - figuring out interactions between their hardware and firmware, reverse engineering features that they refuse to document, improving their power management support, handling platform integration stuff for their GPUs and so on. Some of this I've been paid for, but a bunch has been unpaid work in my spare time[1].

Recently, as part of the anti-women #GamerGate campaign[2], a set of awful humans convinced Intel to terminate an advertising campaign because the site hosting the campaign had dared to suggest that the sexism present throughout the gaming industry might be a problem. Despite being awful humans, it is absolutely their right to request that a company choose to spend its money in a different way. And despite it being a dreadful decision, Intel is obviously entitled to spend their money as they wish. But I'm also free to spend my unpaid spare time as I wish, and I no longer wish to spend it doing unpaid work to enable an abhorrently-behaving company to sell more hardware. I won't be working on any Intel-specific bugs. I won't be reverse engineering any Intel-based features[3]. If the backlight on your laptop with an Intel GPU doesn't work, the number of fucks I'll be giving will fail to register on even the most sensitive measuring device.

On the plus side, this is probably going to significantly reduce my gin consumption.

[1] In the spirit of full disclosure: in some cases this has resulted in me being sent laptops in order to figure stuff out, and I was not always asked to return those laptops. My current laptop was purchased by me.

[2] I appreciate that there are some people involved in this campaign who earnestly believe that they are working to improve the state of professional ethics in games media. That is a worthy goal! But you're allying yourself to a cause that disproportionately attacks women while ignoring almost every other conflict of interest in the industry. If this is what you care about, find a new way to do it - and perhaps deal with the rather more obvious cases involving giant corporations, rather than obsessing over indie developers.

For avoidance of doubt, any comments arguing this point will be replaced with the phrase "Fart fart fart".

[3] Except for the purposes of finding entertaining security bugs


October 02, 2014 11:27 PM

October 01, 2014

Matt Domsch: Spamfighting: updated opendmarc packages, handling DMARC p=reject

I took a few months off from dealing with my spam problems, choosing to stick my head in the sand. Probably not my wisest move…

In the interim, the opendmarc developers have been busy, releasing version 1.3.0, which also adds the nice feature of doing SPF checking internally. This lets me CLOSE WONTFIX the smf-spf and libspf2 packages from the Fedora review process and remove them from my system. “All code has bugs. Unmaintained code with bugs that you aren’t running can’t harm you.” New packages and the open Fedora review are available.

I’ve also had several complaints from friends, all @yahoo.com users, who have been sending mail to myself and family @domsch.com. In most cases, @domsch.com simply forwards the emails on to yet another mail provider – it’s providing a mail forwarding service for a vanity domain name. However, now that Yahoo and AOL are publishing DMARC p=reject rules, after smtp.domsch.com forwarded the mail on to its ultimate home, those downstream servers were rejecting the messages (presumably on SPF grounds – smtp.domsch.com isn’t a valid mail server for @yahoo.com).

My solution to this is a bit awkward, but will work for a while. Instead of forwarding mail from domains with DMARC p=reject or p=quarantine, I now store them and serve them up via POP3/IMAP to their ultimate destination. I’m using procmail to do the forwarding:


DEFAULT="/home/mdomsch/Mail/"
SENDER=`formail -c -x Return-Path`
SENDMAILFLAGS="-oi -f $SENDER"

# forward all mail except dmarc policy reject|quarantine.
:0 H
* ? formail -x'From:' | grep -o '[[:alnum:]+\.\_\-]*@[[:alnum:]+\.\_\-]*' | xargs opendmarc-check | egrep -s 'Domain policy: (reject|quarantine)'
${DEFAULT}

:0
! mdomsch@example.com

This introduces quite a bit of latency (on the order of an hour) for mail delivery from my friends with @yahoo.com addresses, but it keeps them from getting rejected due to their email provider’s lousy choice of policy.

Tim Draegen, the guy behind the excellent dmarcian.com, is chairing a new IETF working group focusing on proper handling of “indirect email flows” such as mailing lists and vanity domain forwarding. I’m hoping to have time to get involved there. If you care, follow along on their mailing lists.

I’m choosing to ignore the fact that domsch.com is getting spoofed 800k times a week (as reported by 8 mail providers and visualized nicely on dmarcian.com), at least for now. I’m hoping the new working group can come up with a method to help solve this.

Do your friends use a mail service publishing DMARC p=reject? Has it caused problems for you? Let me know in the comments below.

October 01, 2014 08:45 PM

Eric Sandeen: Sage’s challenge to the open storage community – support the Ada Iniative!

The Ada Initiative supports women in open technology and culture – looking around my place of work and various conference halls I’ve visited, I think there’s little doubt that we’ve still got an old boy network running here, and I’d like to see that change.  I have daughters who may or may not get deep into tech culture, and I’m glad there are people like Val working to make it a better place for them if they do.

And she’s recently gotten a nice bit of potential help: Sage Weil of Ceph & Dreamhost fame has issued a challenge: Raise $8192 in 8 days, and he’ll match it.  They’re already on their way.  If you can help, please do.  Power-of-two donations encouraged, but not required.  :) Click the counter below to see their donation page.  Thanks!


Donate now

October 01, 2014 08:14 PM

September 26, 2014

Matt Domsch: [REPOST] Who am I?

I’ve started blogging again on the Dell TechCenter site, Enterprise Mobility Management section, along with the rest of my team.

Here’s the intro to my first post, “Who am I?”:

The existential question, asked by everyone and everything throughout their lifetimes – who am I? High school seniors choosing a college, college seniors considering grad school or entering the job market, adults in the midst of their mid-life crisis—the question comes far easier than the answer.

In the world of technology, who you are depends on the technology with which you are interacting. On Facebook, you are your quirky personal self, with pictures of your family and vacations you take. On LinkedIn, you are your professional self, sharing articles and achievements that are aligned with your career.

What about on the myriad devices you carry around? On the smartphone in my pocket, I have several personas—personal, business, gamer (my kids borrow my phone), constantly context-switching between them. In the not-too-distant past, people would carry two phones—one for personal use and one for work, keeping the personas separate via physical separation—two personas, two devices.

Read more…

September 26, 2014 03:47 PM

September 25, 2014

Andy Grover: A program calling command-line tools is the moral equivalent of web scraping.

I gave this talk at LPC 2012. It promotes the idea that programs layered on top of human-centric interfaces are a bad idea.

Download the PDF file.

The timing of this post with the announcement of the most recent bash vulnerability is not entirely coincidental.

September 25, 2014 06:48 PM

September 24, 2014

Ted Tso: “How Google Works” giveaway

The authors of “How Google Works” have given electronic versions of “How Google Works” to all Google employees. Since I had already purchased a copy via pre-order, to make life interesting, I’ve decided to give my Google Play coupon code to someone via an electronic lottery.  (Edit: the coupon code will only work for people with Google accounts in the US; if you live outside of the US, my apologies, but the coupon code will not work for you.)

I will be using the procedure documented by RFC-3797 to select someone from the list of people who have sent email to google-book-giveaway@thunk.org, snapshotted at Noon US/Eastern on Friday, September 26th, 2014, using as inputs into the RFC-3797 algorithm: (1) the daily volume of GOOG on September 26th, 2014 as reported by finance.google.com, (2) the daily volume of GOOGL on September 26th, 2014 as reported by finance.google.com, and (3) the Massachusetts Powerball Lottery Numbers for September 27th, 2014. If any of these values are not available for any reason, the values for the next trading day or lottery draw will be used.

Have fun!

September 24, 2014 02:55 PM

Matthew Garrett: My free software will respect users or it will be bullshit

I had dinner with a friend this evening and ended up discussing the FSF's four freedoms. The fundamental premise of the discussion was that the freedoms guaranteed by free software are largely academic unless you fall into one of two categories - someone who is sufficiently skilled in the arts of software development to examine and modify software to meet their own needs, or someone who is sufficiently privileged[1] to be able to encourage developers to modify the software to meet their needs.

The problem is that most people don't fall into either of these categories, and so the benefits of free software are often largely theoretical to them. Concentrating on philosophical freedoms without considering whether these freedoms provide meaningful benefits to most users risks these freedoms being perceived as abstract ideals, divorced from the real world - nice to have, but fundamentally not important. How can we tie these freedoms to issues that affect users on a daily basis?

In the past the answer would probably have been along the lines of "Free software inherently respects users", but reality has pretty clearly disproven that. Unity is free software that is fundamentally designed to tie the user into services that provide financial benefit to Canonical, with user privacy as a secondary concern. Despite Android largely being free software, many users are left with phones that no longer receive security updates[2]. Textsecure is free software but the author requests that builds not be uploaded to third party app stores because there's no meaningful way for users to verify that the code has not been modified - and there's a direct incentive for hostile actors to modify the software in order to circumvent the security of messages sent via it.

We're left in an awkward situation. Free software is fundamental to providing user privacy. The ability for third parties to continue providing security updates is vital for ensuring user safety. But in the real world, we are failing to make this argument - the freedoms we provide are largely theoretical for most users. The nominal security and privacy benefits we provide frequently don't make it to the real world. If users do wish to take advantage of the four freedoms, they frequently do so at a potential cost of security and privacy. Our focus on the four freedoms may be coming at a cost to the pragmatic freedoms that our users desire - the freedom to be free of surveillance (be that government or corporate), the freedom to receive security updates without having to purchase new hardware on a regular basis, the freedom to choose to run free software without having to give up basic safety features.

That's why projects like the GNOME safety and privacy team are so important. This is an example of tying the four freedoms to real-world user benefits, demonstrating that free software can be written and managed in such a way that it actually makes life better for the average user. Designing code so that users are fundamentally in control of any privacy tradeoffs they make is critical to empowering users to make informed decisions. Committing to meaningful audits of all network transmissions to ensure they don't leak personal data is vital in demonstrating that developers fundamentally respect the rights of those users. Working on designing security measures that make it difficult for a user to be tricked into handing over access to private data is going to be a necessary precaution against hostile actors, and getting it wrong is going to ruin lives.

The four freedoms are only meaningful if they result in real-world benefits to the entire population, not a privileged minority. If your approach to releasing free software is merely to ensure that it has an approved license and throw it over the wall, you're doing it wrong. We need to design software from the ground up in such a way that those freedoms provide immediate and real benefits to our users. Anything else is a failure.

(title courtesy of My Feminism will be Intersectional or it will be Bullshit by Flavia Dzodan. While I'm less angry, I'm solidly convinced that free software that does nothing to respect or empower users is an absolute waste of time)

[1] Either in the sense of having enough money that you can simply pay, having enough background in the field that you can file meaningful bug reports or having enough followers on Twitter that simply complaining about something results in people fixing it for you

[2] The free software nature of Android often makes it possible for users to receive security updates from a third party, but this is not always the case. Free software makes this kind of support more likely, but it is in no way guaranteed.


September 24, 2014 06:59 AM

September 23, 2014

Grant Likely: Don’t fear ACPI on ARM

Before everyone freaks out about Matthew Garrett’s post regarding ACPI on ARM, here are a few things to keep in mind:

First, when we’re talking about Linux and ACPI on ARM, we’re talking about general purpose servers. In the general purpose server market, Linux is already the dominant OS, regardless of the CPU architecture. Servers are designed, built and sold to run Linux. It is already the situation that x86 server vendors build their ACPI tables to work with Linux. Supporting Linux on ARM servers is merely an extension of what vendors are already doing to support Linux on x86. Despite Matthew’s concern, I don’t think we’re entering new territory in this regard.

Second, many of us have bad memories of getting ACPI to work with Linux. However, it is worth remembering that most of our problems have been with machines where the vendor really doesn’t care about Linux – usually desktop or laptop PCs. It’s not surprising that we have problems with these machines since they’ve only been tested with Windows! Server vendors, on the other hand, have a vested interest in ensuring that Linux runs well on their hardware and so they regularly test with Linux. The negative lessons learned in the laptop and desktop markets don’t carry over to machines built to run Linux.

Third, the ACPI world has changed in the last 2 years. It used to be that the ACPI spec was governed in a closed process by 5 companies: HP, Intel, Microsoft, Phoenix, and Toshiba, with nary a Linux person to be seen. Last year ACPI governance was transferred to the UEFI Forum and we’ve got plenty of Linux engineers sitting at the table. In light of that, it is no longer true that ACPI only caters to the needs of Windows, and we have the ability to propose changes to the spec. In fact, if you look at the revision history in version 5.1 of the spec, you’ll find changes that were proposed by Linux engineers to make ARMv8 work.

That said, the issues raised by Matthew are important. There is a big question about how Linux should declare itself to the platform. Claiming to be compatible with “Windows 8” in the ACPI _OSI (Operating System Interface) method obviously isn’t appropriate on ARM. There is some talk about removing _OSI entirely on ARM since the way Linux uses it isn’t actually useful, and the _OSC (Operating System Capability) method has been proposed as a better way to declare what the OS supports. There is also a need to make sure vendors are testing with linux-next and mainline kernels so that we know when breakage happens and we can either do something about it, or work with vendors to fix their firmware.

Both of these are important issues and I think we need to propose solutions before merging ARM ACPI support into the kernel. Some of this work has already started: Linaro is running Canonical’s Firmware Test Suite (FWTS), the ACPI API tests, and the ACPI ASL tests on ARM, and we’re porting the Linux UEFI Verification (LUV) project which packages all the test suites into an easy to use distribution.

While I agree with Matthew that getting the interface between firmware and the OS is hard, I do not see the nightmare scenario he is describing. It certainly hasn’t played out that way on x86 servers where Linux is already the preferred OS. Besides, I really cannot agree with the premise that Linux being the dominant OS is a bad thing! We have a lot more influence than we give ourselves credit for.

September 23, 2014 10:00 PM

September 21, 2014

Michael Kerrisk (manpages): man-pages-3.73 is released

I've released man-pages-3.73. The release tarball is available on kernel.org. The browsable online pages can be found on man7.org. The Git repository for man-pages is available on kernel.org.

The most notable changes in man-pages-3.73 are various new and modified pages describing namespaces in general, and user and PID namespaces in detail:

September 21, 2014 11:39 AM

September 16, 2014

Matthew Garrett: ACPI, kernels and contracts with firmware

ACPI is a complicated specification - the latest version is 980 pages long. But that's because it's trying to define something complicated: an entire interface for abstracting away hardware details and making it easier for an unmodified OS to boot diverse platforms.

Inevitably, though, it can't define the full behaviour of an ACPI system. It doesn't explicitly state what should happen if you violate the spec, for instance. Obviously, in a just and fair world, no systems would violate the spec. But in the grim meathook future that we actually inhabit, systems do. We lack the technology to go back in time and retroactively prevent this, and so we're forced to deal with making these systems work.

This ends up being a pain in the neck in the x86 world, but it could be much worse. Way back in 2008 I wrote something about why the Linux kernel reports itself to firmware as "Windows" but refuses to identify itself as Linux. The short version is that "Linux" doesn't actually identify the behaviour of the kernel in a meaningful way. "Linux" doesn't tell you whether the kernel can deal with buffers being passed when the spec says it should be a package. "Linux" doesn't tell you whether the OS knows how to deal with an HPET. "Linux" doesn't tell you whether the OS can reinitialise graphics hardware.

Back then I was writing from the perspective of the firmware changing its behaviour in response to the OS, but it turns out that it's also relevant from the perspective of the OS changing its behaviour in response to the firmware. Windows 8 handles backlights differently to older versions. Firmware that's intended to support Windows 8 may expect this behaviour. If the OS tells the firmware that it's compatible with Windows 8, the OS has to behave compatibly with Windows 8.

In essence, if the firmware asks for Windows 8 support and the OS says yes, the OS is forming a contract with the firmware that it will behave in a specific way. If Windows 8 allows certain spec violations, the OS must permit those violations. If Windows 8 makes certain ACPI calls in a certain order, the OS must make those calls in the same order. Any firmware bug that is triggered by the OS not behaving identically to Windows 8 must be dealt with by modifying the OS to behave like Windows 8.
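As a rough illustration of the mechanism (this is not the kernel's actual implementation; the table contents and helper names are invented for the example), the OS side of this contract amounts to answering _OSI queries from a list of interface strings it is prepared to stand behind:

#include <stddef.h>
#include <string.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Every string in this table is a promise: "I will behave like that OS". */
static const char * const supported_osi_strings[] = {
    "Windows 2001",   /* XP-era behaviour */
    "Windows 2009",   /* Windows 7 behaviour */
    "Windows 2012",   /* Windows 8 behaviour, including the backlight changes */
    /* deliberately no "Linux": the name alone says nothing about which
       behaviours this particular kernel actually implements */
};

/* Called when AML evaluates _OSI("..."); nonzero means "yes, compatible". */
static int osi_query(const char *interface)
{
    for (size_t i = 0; i < ARRAY_SIZE(supported_osi_strings); i++)
        if (!strcmp(interface, supported_osi_strings[i]))
            return 1;
    return 0;
}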

This sounds horrifying, but it's actually important. The existence of well-defined[1] OS behaviours means that the industry has something to target. Vendors test their hardware against Windows, and because Windows has consistent behaviour within a version[2] the vendors know that their machines won't suddenly stop working after an update. Linux benefits from this because we know that we can make hardware work as long as we're compatible with the Windows behaviour.

That's fine for x86. But remember when I said it could be worse? What if there were a platform that Microsoft weren't targeting? A platform where Linux was the dominant OS? A platform where vendors all test their hardware against Linux and expect it to have a consistent ACPI implementation?

Our even grimmer meathook future welcomes ARM to the ACPI world.

Software development is hard, and firmware development is software development with worse compilers. Firmware is inevitably going to rely on undefined behaviour. It's going to make assumptions about ordering. It's going to mishandle some cases. And it's the operating system's job to handle that. On x86 we know that systems are tested against Windows, and so we simply implement that behaviour. On ARM, we don't have that convenient reference. We are the reference. And that means that systems will end up accidentally depending on Linux-specific behaviour. Which means that if we ever change that behaviour, those systems will break.

So far we've resisted calls for Linux to provide a contract to the firmware in the way that Windows does, simply because there's been no need to - we can just implement the same contract as Windows. How are we going to manage this on ARM? The worst case scenario is that a system is tested against, say, Linux 3.19 and works fine. We make a change in 3.21 that breaks this system, but nobody notices at the time. Another system is tested against 3.21 and works fine. A few months later somebody finally notices that 3.21 broke their system and the change gets reverted, but oh no! Reverting it breaks the other system. What do we do now? The systems aren't telling us which behaviour they expect, so we're left with the prospect of adding machine-specific quirks. This isn't scalable.

Supporting ACPI on ARM means developing a sense of discipline around ACPI development that we simply haven't had so far. If we want to avoid breaking systems we have two options:

1) Commit to never modifying the ACPI behaviour of Linux.
2) Expose an interface that indicates which well-defined ACPI behaviour a specific kernel implements, and bump that whenever an incompatible change is made. Backward compatibility paths will be required if firmware only supports an older interface.

(1) is unlikely to be practical, but (2) isn't a great deal easier. Somebody is going to need to take responsibility for tracking ACPI behaviour and incrementing the exported interface whenever it changes, and we need to know who that's going to be before any of these systems start shipping. The alternative is a sea of ARM devices that only run specific kernel versions, which is exactly the scenario that ACPI was supposed to be fixing.
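To make option (2) concrete, here is a purely hypothetical sketch of what such an exported behaviour level could look like; the string format, the revision number and the function name are all invented for illustration, not an existing interface:

#include <stdio.h>

#define ACPI_BEHAVIOUR_LEVEL 3   /* bumped on every incompatible ACPI-visible change */

/* Answer a firmware query of the form "LinuxACPIBehaviour<n>".  All levels up
   to the current one are acknowledged, so firmware written against an older
   level keeps working. */
static int behaviour_query(const char *interface)
{
    int level;

    if (sscanf(interface, "LinuxACPIBehaviour%d", &level) == 1)
        return level >= 1 && level <= ACPI_BEHAVIOUR_LEVEL;
    return 0;
}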

[1] Defined by implementation, not defined by specification
[2] Windows may change behaviour between versions, but always adds a new _OSI string when it does so. It can then modify its behaviour depending on whether the firmware knows about later versions of Windows.


September 16, 2014 10:51 PM

Andy Grover: Emacs and using multiple C code styles

I primarily work on Linux, so I put this in my Emacs config:

; Linux mode for C
(setq c-default-style
      '((c-mode . "linux") (other . "gnu")))

However, other projects like QEMU have their own style preferences. So here’s what I added to use a different style for that. First, I found the qemu C style defined here. Then, to only use this on some C code, we attach a hook that only overrides the default C style if the filename contains “qemu”, an imperfect but decent-enough test.

(defconst qemu-c-style
  '((indent-tabs-mode . nil)
    (c-basic-offset . 4)
    (tab-width . 8)
    (c-comment-only-line-offset . 0)
    (c-hanging-braces-alist . ((substatement-open before after)))
    (c-offsets-alist . ((statement-block-intro . +)
                        (substatement-open . 0)
                        (label . 0)
                        (statement-cont . +)
                        (innamespace . 0)
                        (inline-open . 0)
                        ))
    (c-hanging-braces-alist .
                            ((brace-list-open)
                             (brace-list-intro)
                             (brace-list-entry)
                             (brace-list-close)
                             (brace-entry-open)
                             (block-close . c-snug-do-while)
                             ;; structs have hanging braces on open
                             (class-open . (after))
                             ;; ditto if statements
                             (substatement-open . (after))
                             ;; and no auto newline at the end
                             (class-close)
                             ))
    )
  "QEMU C Programming Style")

(c-add-style "qemu" qemu-c-style)

(defun maybe-qemu-style ()
  (when (and buffer-file-name
             (string-match "qemu" buffer-file-name))
    (c-set-style "qemu")))

(add-hook 'c-mode-hook 'maybe-qemu-style)

September 16, 2014 01:16 AM

September 12, 2014

Dave Jones: Trinity threading improvements and misc

Since my blogging tsunami almost a month ago, I’ve been pretty quiet. The reason being that I’ve been heads down working on some new features for trinity which have turned out to be a lot more involved than I initially anticipated.

Trinity does all of its work in child processes continually forked off from a main process. For a long time I’ve had “investigate using pthreads” as a TODO item, but after various conversations at kernel summit, I decided to bump the priority of that up a little, and spend some time looking at it. I initially guessed that it would take maybe a few weeks to have something usable, but after spending some time working on it, every time I make progress on one issue, it becomes apparent that there’s something else that is also going to need changing.

I’m taking a week off next week to clear my head and hopefully return to this work with fresh eyes, and make more progress, because so far it’s been mostly frustrating, and there may be an easier way to solve some of the problems I’ve been hitting. Sidenote: In the 15+ years I’ve been working on Linux, this is the first time I recall actually ever using pthreads in my own code. I can’t say I’ve been missing out.

Unrelated to that work, a month or so ago I came up with a band-aid fix for a problem where trinity would corrupt its own structures. That ‘fix’ turned out to break the post-mortem work I implemented a few months prior, so I’ve spent some time this week undoing that, and thinking about how I’m going to fix that properly. But before coming up with a fix, I needed to reproduce the problem reliably, and naturally now that I’ve added debug code to determine where the corruption is coming from, the bug has gone into hiding.

I need this vacation.

Trinity threading improvements and misc is a post from: codemonkey.org.uk

September 12, 2014 08:14 PM

September 07, 2014

Michael Kerrisk (manpages): man-pages-3.72 is released

I've released man-pages-3.72. The release tarball is available on kernel.org. The browsable online pages can be found on man7.org. The Git repository for man-pages is available on kernel.org.

This is a small release; the more notable changes in man-pages-3.72 are the addition of three new pages by Peter Schiffer that document glibc commands used for memory profiling and malloc tracing:

September 07, 2014 01:36 PM

September 06, 2014

Pavel Machek: Fraud attempt from DAD GmbH

Got snail mail from DAD GmbH, Postfach 11 35 68, 20435. They want me to update my business info (which I never gave to them), and by submitting the updated info I would agree to be charged 500 euro (mentioned in a small notice so that you are likely to miss it). I hope they go to jail for this.

September 06, 2014 09:38 PM

September 04, 2014

James Morris: New GPG Key

Just an FYI, I lost my GPG key a few months back during an upgrade, and have created a new one.  This was signed by folk at LinuxCon/KS last month.

The new key ID / fingerprint is: D950053C / 8327 23D0 EF9D D46D 9AC9  C03C AD98 4BBF D950 053C

Please use this key and not the old one!

September 04, 2014 09:38 PM

September 03, 2014

Pavel Machek: Boot shell

Yesterday I got an electric shock. Yes, the device was supposed to be turned off by a remote-control outlet, but I was still stupid enough to play with it.

Have you ever played the "press any key to stop autoboot" game, followed by copying boot commands from your notes, because you wanted to keep the boot loader in its original (early project phases) or final (late project phases) configuration? Have you reached level 2, playing the autoboot game over the internet?

If so, you may want to take a look at the boot shell (bs) from the Not Universal Test System project. In the ideal case, it knows how to power the target off and on, break into autoboot, boot your target in development mode, and log in as root when userland is ready.

September 03, 2014 09:09 AM

August 29, 2014

Daniel Vetter: Review Training Slides

We currently have a large influx of new people contributing to i915 - for the curious just check the git logs. As part of ramping them up I've done a few trainings about upstream review, and a bunch of people I've talked with at KS in Chicago were interested in that, too. So I've cleaned up the slides a bit and dropped the very few references to Intel internal resources. No speaker notes or video recording, but I think this is useful all in itself. And of course if you have comments or see big gaps - feedback is very much welcome:

Upstream Review Training Slides

August 29, 2014 04:14 PM

August 21, 2014

Michael Kerrisk (manpages): man-pages-3.71 is released

I've released man-pages-3.71. The release tarball is available on kernel.org. The browsable online pages can be found on man7.org. The Git repository for man-pages is available on kernel.org.

As well as many smaller fixes to various pages, the more notable changes in man-pages-3.71 are the following:

August 21, 2014 01:27 PM

August 19, 2014

Rusty Russell: POLLOUT doesn’t mean write(2) won’t block: Part II

My previous discovery that poll() indicating an fd was writable didn’t mean write() wouldn’t block led to some interesting discussion on Google+.

It became clear that there is much confusion over read and write; eg. Linus thought read() was like write() whereas I thought (prior to my last post) that write() was like read(). Both wrong…

Both Linux and v6 UNIX always returned from read() once data was available (v6 didn’t have sockets, but they had pipes). POSIX even suggests this:

The value returned may be less than nbyte if the number of bytes left in the file is less than nbyte, if the read() request was interrupted by a signal, or if the file is a pipe or FIFO or special file and has fewer than nbyte bytes immediately available for reading.

But write() is different. Presumably so simple UNIX filters didn’t have to check the return and loop (they’d just die with EPIPE anyway), write() tries hard to write all the data before returning. And that leads to a simple rule.  Quoting Linus:

Sure, you can try to play games by knowing socket buffer sizes and look at pending buffers with SIOCOUTQ etc, and say “ok, I can probably do a write of size X without blocking” even on a blocking file descriptor, but it’s hacky, fragile and wrong.

I’m travelling, so I built an Ubuntu-compatible kernel with a printk() into select() and poll() to see who else was making this mistake on my laptop:

cups-browsed: (1262): fd 5 poll() for write without nonblock
cups-browsed: (1262): fd 6 poll() for write without nonblock
Xorg: (1377): fd 1 select() for write without nonblock
Xorg: (1377): fd 3 select() for write without nonblock
Xorg: (1377): fd 11 select() for write without nonblock

This first one is actually OK; fd 5 is an eventfd (which should never block). But the rest seem to be sockets, and thus probably bugs.

What’s worse is the Linux select() man page:

       A file descriptor is considered ready if it is possible to
       perform the corresponding I/O operation (e.g., read(2)) without
       blocking.
       ... those in writefds will be watched to see if a write will
       not block...

And poll():

	POLLOUT
		Writing now will not block.

Man page patches have been submitted…
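For completeness, here is a minimal sketch (mine, not from the discussion above) of what a write that genuinely won’t block has to look like: the fd is made non-blocking, and short writes and EAGAIN are handled explicitly, because POLLOUT only promises that some progress is possible, not that the whole buffer fits.

#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

/* Write the whole buffer to fd without ever blocking in write() itself.
   Returns 0 on success, -1 on error. */
static int write_all(int fd, const char *buf, size_t len)
{
    size_t off = 0;

    /* The fd must really be non-blocking; otherwise write() may still block
       no matter what poll() said. */
    if (fcntl(fd, F_SETFL, fcntl(fd, F_GETFL) | O_NONBLOCK) < 0)
        return -1;

    while (off < len) {
        ssize_t n = write(fd, buf + off, len - off);

        if (n > 0) {
            off += n;
        } else if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
            /* Only now does POLLOUT mean something: wait until more fits. */
            struct pollfd pfd = { .fd = fd, .events = POLLOUT };

            if (poll(&pfd, 1, -1) < 0 && errno != EINTR)
                return -1;
        } else if (n < 0 && errno != EINTR) {
            return -1;
        }
    }
    return 0;
}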

August 19, 2014 01:57 PM

August 15, 2014

Dave Jones: A breakdown of Linux kernel networking related issues from Coverity scan

For the last of these breakdowns, I’ll focus on fifth place: networking.

Linux supports many different network protocols, so I spent quite a while splitting the net/ tree into per-protocol components. The result looks like this.

Net-802 8
Net-Bluetooth 15
Net-CAIF 9
Net-Core 11
Net-DCCP 5
Net-DECNET 5
Net-IRDA 17
Net-NFC 11
Net-SCTP 18
Net-SunRPC 21
Net-Wireless 9
Net-XFRM 6
Net-bridge 14
Net-ipv4 24
Net-ipv6 16
Net-mac80211 12
Net-sched 5
everything else 124

The networking code has gotten noticeably better over the last year. When I initially introduced these components they were all well into double figures. Now, even crap like DECNET has gotten better (both users will be very happy).

“Everything else” above is actually a screw-up on my part. For some reason around 50 or so netfilter issues haven’t been categorized into their appropriate component. The remaining ~70 are quite a mix, but nearly all small numbers of issues in many components. Things like 9p, atm, ax25, batman, can, ceph, l2tp, rds, rxrpc, tipc, vmwsock, and x25. The Lovecraftian protocols you only ever read about.

So networking is in pretty good shape considering just how much stuff it supports. While there’s 24 issues in a common protocol like ipv4, they tend to be mostly benign things rather than OMG 24 WAYS THE NSA IS OWNING YOUR LINUX RIGHT NOW.

That’s the last of these breakdowns I’ll do for now. I’ll do this again maybe in six months to a year, if things are dramatically different, but I expect any changes to be minor and incremental rather than anything too surprising.

After I get back from kernel summit and recover from travelling, I’ll start a series of posts showing code examples of the top checkers.

A breakdown of Linux kernel networking related issues from Coverity scan is a post from: codemonkey.org.uk

August 15, 2014 09:37 PM

Dave Jones: Breakdown of Linux kernel wireless drivers in Coverity scan

In fourth place on the list of hottest areas of the kernel as seen by Coverity, is drivers/net/wireless.

rtlwifi 96
Atheros 74
brcm80211 67
mwifiex 33
b43 16
iwlwifi 15
everything else 65

I mentioned in my drivers/staging examination that the Realtek wifi drivers stood out as especially problematic. Here we see the same situation. Larry Finger has been working on cleaning up this (and other drivers) for some time, but it apparently still has a long way to go.

It’s worth noting that “Atheros” here is actually a number of drivers (ar5523, ath10k, ath5k, ath6k, ath9k, carl9170, wcn36xx, wil6210). I’ve not had time to break those down into smaller components yet, though a quick look shows that ath9k in particular accounts for a sizable portion of those 74 issues.

I was actually surprised at how low the iwlwifi and b43 counts were. I guess there’s something to be said for ubiquitous hardware.

What of all the ancient wireless drivers? The junky pcmcia/pccard drivers like orinoco and friends?
They’re in with those 65 “everything else” bugs, and make up < 5-6 issues each. Considering their age, and lack of any real maintenance these days, they’re in surprisingly good shape.

Just for fun, here’s how the drivers above compare against the wireless drivers currently in staging.

rtl8821 102 (Staging)
rtlwifi 96
Atheros 74
brcm80211 67
rtl8188eu 42 (Staging)
mwifiex 33
rtl8712 22 (Staging)
rtl8192u 21 (Staging)
rtl8192e 17 (Staging)
b43 16
iwlwifi 15
everything else 65

Breakdown of Linux kernel wireless drivers in Coverity scan is a post from: codemonkey.org.uk

August 15, 2014 09:12 PM

Dave Jones: A breakdown of Linux kernel filesystem issues in Coverity scans

The filesystem code shows up in the number two position of the list of hottest areas of the kernel. Like the previous post on drivers/scsi, this isn’t because “the filesystem code is terrible”, but more that Linux supports so many filesystems that the cumulative effect of issues present in all of them adds up to a figure that dominates the statistics.

The breakdown looks like this.

fs/*.c 77
9P 3
AFS 3
BTRFS 84
CEPH 9
CIFS 32
EXTn 36
GFS2 12
HFS 3
HFSPlus 4
JBD 6
JFFS2 6
JFS 7
NFS 24
NFSD 15
NILFS2 10
NTFS 12
OCFS2 35
PROC 1
Reiserfs 12
UBIFS 21
UDF 14
UFS 6
XFS 33

fs/*.c accounts for the VFS core, AIO, binfmt parsers, eventfd, epoll, timerfd’s, xattr code and a bunch of assorted miscellany. Little wonder it shows up so high: it’s around 62,000 LOC by itself. Of all the entries on the list, this is perhaps the most concerning area given it affects every filesystem.

A little more concerning perhaps is that btrfs is so high on the list. Btrfs is still seeing a lot of churn each release, so many of these issues come and go, but it seems to be holding roughly at the same rate of new incoming issues each release.

EXTn covers ext2, ext3, and ext4 combined. Not too bad considering that’s around 74,000 LOC combined (and another 15K LOC for jbd/jbd2).

The CIFS, NFS and OCFS2 filesystems stand out as something that might be of real concern, especially if any of those issues are triggerable over the wire.

XFS has been improving over the past year. It was around 60-70 when I started doing regular scans, and continues to move downward each release, with few new issues getting added.

The remaining filesystems: not too shabby. Especially considering some of the niche ones don’t get a lot of attention.

A breakdown of Linux kernel filesystem issues in Coverity scans is a post from: codemonkey.org.uk

August 15, 2014 03:40 PM

Dave Jones: A closer look at drivers/scsi Coverity scans.

drivers/scsi showed up in third place in the list of hottest areas of the kernel. Breaking it down into sub-components, it looks like this.

aic7xxx 15
be2iscsi 15
bfa 26
bnx2fc 6
csiostor 10
isci 11
lpfc 38
megaraid 10
mpt2sas 17
mpt3sas 15
pm8001 9
qla2xxx 42
qla4xxx 17
Everything else 152

All these components have been steadily improving over the last year. The obvious stand-out is “Everything else”, which looks like it needs to be broken out into more components.
But drivers/scsi is one area of the kernel where we have a *lot* of legacy drivers, many of them 10-15 years old. (Remarkably, some of these are even still in regular use). Looking over the list of filenames matching the “Everything else” component, pretty much every driver that isn’t broken out into its own component is on the list. 3w-9xxx, NCR5380, aacraid, advansys, aic94xx, arcmsr, atp870, bnx2i, cxgbi, dc395x, dpt_i2o, eata, esas2, fdomain, fnic, gdth, hpsa, imm, ipr, ips, mvsas, mvumi, osst, pmcraid, qla1280, qlogicfas, stex, storvsc_drv, sym53x8xx, tmscsim.
None of these are particularly worse than the others, most averaging less than a half dozen issues each.

Ignoring the problems I currently have adding more components, it’s not particularly helpful to break it down further when the result is going to be components with a half dozen issues. It’s not that there’s a few awful drivers dragging down the average, it’s that there’s so many of them, and they all contribute a little bit of awful.

Something I’d like to component-ize, but can’t easily without crafting and maintaining ugly regexps, is the core scsi functionality and its libraries. The problem is that drivers/scsi/*.c includes both legacy drivers, and also scsi core functionality & library functions. I discussed potentially moving all the old drivers to a “legacy” or “vintage” sub-directory at LSF/MM earlier this year with James, but he didn’t seem overly enthusiastic. So it’s going to continue to be lumped in with “Everything else” for now.

The difficulty with figuring out whether many of these issues are real concerns is that because they are hardware drivers, the scanner has no way of knowing what range of valid responses the HBA will return. So there are a number of issues which are of the form “This can’t actually happen, because if the HBA returned this, then we would have called this other function instead”.
Not a problem unique to SCSI, and something that’s seen across many different parts of the kernel.

And for those ancient 15-year-old drivers? It’s tough to find someone who either remembers how they work at a chip level, or cares enough to go back and revisit them.

A closer look at drivers/scsi Coverity scans. is a post from: codemonkey.org.uk

August 15, 2014 02:59 PM

Dave Jones: drivers/staging under the Coverity microscope.

In my previous post, I mentioned that drivers/staging took the top spot for number of issues in a component.

Here’s a ‘zoomed in’ look at the sub-components under drivers/staging.

bcm 103
comedi 45
iio 13
line6 7
lustre 133
media 10
rtl8188eu 42
rtl8192e 17
rtl8192u 21
rtl8712 22
rtl8821 102
rts5208 19
unisys 14
vt6655 47
vt6656 4
everything else in drivers/staging/ (40 other uncategorized drivers) 95

Some of the sub-components with < 10 issues are likely to have their categories removed soon. When they were initially added, the open issue counts were higher, but over time they’ve improved to the point where they could just be lumped in with “everything else”.

When Lustre was added back in 3.12, it caused a noticeable jump in new issues detected: the largest delta from any one single addition since I’ve been doing regular scans. It’s continuing to make progress, with 20 or so issues being knocked out each release, and few new issues being introduced. Lustre doesn’t suffer from any one issue overly, but has a grab-bag of issues from the many checkers that Coverity has.
Amusingly, Lustre is the only part of the kernel that has Coverity annotations in the code.

Second on the list is the bcm Wimax driver. This has been around in staging for years, and has had a metric shitload of checkpatch-type stylistic changes made to it, but relatively few actual functionality fixes. (Confession: I was guilty of ~30 of those cleanups myself, but I couldn’t bear to look at the 1906-line bcm_char_ioctl function; splitting that up did have a nice side-effect though). A lot of the issues in this driver are duplicates due to a problem in a macro being picked up as a new issue for every instance it gets used.

Something that sticks out in this list is the cluster of rtl* drivers. At the time of writing there are seven drivers for various Realtek wireless chips, all of varying quality. Much of the code between these drivers is cut-and-pasted from previous drivers. It seems each time Realtek revs new silicon, they do another code-drop with a new driver. Worse yet, many of the fixes that went into the kernel variants don’t make it back to the driver they based their new work on. There have been numerous cases where a bug fixed in one driver has been reintroduced in a new variant months later. There’s a ton of work going on here, and a lot more needed.
Somewhat depressingly, even the not-in-staging rtlwifi driver that lives in drivers/net/wireless has ~100 issues, many of them the exact same issues as those in the staging drivers.

As bad as it seems, staging is serving its purpose for the most part, and things have gotten a lot quieter each merge window when the staging tree gets pulled. It’s only when it contains something new and huge like Lustre that it really shows up noticeably in the daily stats after each scan. The number of new issues being added are generally lower than the number being fixed. For the 3.17 pull for example, 67 new issues, 132 eliminated. (Note: Those numbers are kernel wide, not *just* staging, but staging made up the majority of the results change on that day).

Something that bothers me slightly is that a number of drivers have ‘escaped’ drivers/staging into the kernel proper, with a high number of open issues. That said, many of those escapees are no worse than drivers that were added 10+ years ago when standards were lower. More on that in a future post.

drivers/staging under the Coverity microscope. is a post from: codemonkey.org.uk

August 15, 2014 01:50 AM

Dave Jones: Linux kernel Coverity scan ‘hot’ areas.

One of the time-consuming parts of organizing the data generated by Coverity has been sorting it into categories, (or components as Coverity refers to them). A component is a wildcard (or exact filename) that matches a specific subsystem, driver, filesystem etc.

As the Linux kernel has thousands of drivers, it isn’t really practical to add a component per-driver, so I started by generalizing into subsystems, and from there, broke down the larger groupings into per-driver components, while still leaving an “everything else” catch-all for drivers within a subsystem that hadn’t been broken out.

According to discussions I’ve had with Coverity, we are actually one of the more ‘heavy’ users of components, and we’ve hit a few scalability problems as we’ve added more and more of them, which has been another reason I’ve not broken things down beyond the ~150 components we have so far. Also, if a component has fewer than 10 or so issues, it’s really not worth the effort of splitting it up. (I may revise that cut-off even higher at some point just to keep things manageable).

Before the big reveal, some caveats:

Right now, the top ten ‘hot areas’ of the kernel (these include accumulated broken-out drivers), sorted by number of issues are:

drivers/staging 694
fs/ 465
drivers/scsi/ 382
drivers/net/wireless 366
net/ 324
drivers/ethernet/ 285
drivers/media/ 262
drivers/usb/ 140
drivers/infiniband/ 109
arch/x86/ 95
sound/ 89

It should come as no surprise really that the staging drivers take the number one spot. If something had beaten it, I think it would have highlighted a somewhat embarrassing problem in our development methods.

In the next posts, I’ll drill down into each of these categories, and figure out exactly why they’re at the top of the list.

For the impatient: once this series is over, I intend to show breakdowns of the various types of issues being detected, but it’s going to take me a while to get to (probably post kernel summit). There’s an absolute ton of data to dig through, and I’m trying to present as much of it in bite-sized chunks as possible, rather than dozens of pages of info.

Linux kernel Coverity scan ‘hot’ areas. is a post from: codemonkey.org.uk

August 15, 2014 01:34 AM

August 13, 2014

Dave Jones: The first year of Coverity Linux kernel scans.

Next week at kernel summit, I’m going to be speaking about the Coverity scans, and have come up with more material than I have time to present in the short slot, so I’ve decided to turn it into a series of blog posts in a hope to kickstart some discussion ahead of time.

I started doing regular scans against the Linux kernel in July 2013. In that time, I’ve sent a bunch of patches, reported many bugs, and spent hours going through the database categorizing, diagnosing, and closing out issues where possible.

I’ve been doing at least one build per day during each merge window (except obviously on days when there haven’t been any commits), and at least one per -rc once the merge window closes.

A few people have asked me about the config file that I use for the builds.
It’s pretty much an ‘allmodconfig’, except where choices have to be made, I’ve tried to pick the general case that a distribution would select. For some of these, I will occasionally flip between them (e.g. SLAB/SLOB/SLUB, PREEMPT_NONE/PREEMPT_VOLUNTARY/PREEMPT) just for coverage. In total, currently 6955 CONFIG_ options are enabled, 117 disabled. (None by choice; they are all the deselected parts of multi-choice options.)

The builds are done x86-64 only. At this time, it’s the only architecture Coverity scan supports. I do have CONFIG_COMPILE_TEST set, so non-x86 drivers that can be built do get scanned. The architecture-specific code in arch/ and the drivers not covered by COMPILE_TEST are the only parts of the kernel we’re not covering.

Builds take about an hour to build on a 24-core Nehalem. The results are then uploaded to a server which takes another 20 minutes. Then a script kicks something at Coverity to pick up the new tarball and scan it. This can take any number of hours. At best, around 5-6 hours, at worst I’ve seen it take as long as 12 hours. This hopefully answers why I don’t do even more builds, or builds of variant trees. (Although I’m still trying to figure out a way to scan linux-next while having it inherit the results of the issues already marked in Linus tree). Thankfully much of the build/upload/scan process is automated, so I can do other things while I wait for it to finish.

Over the year, the overall defect density has been decreasing.

3.11 0.68
3.12 0.62
3.13 0.59
3.14 0.55
3.15 0.55
3.16 0.53

Moving in the right direction, though things have slowed a little the last few releases. At least in part due to my spending more time on Trinity than going through the Coverity backlog. The good news is that the incoming rate of new bugs each window has also slowed.

Newer issues, when they are introduced, are getting jumped on faster than before. Many developers have signed up for accounts and are looking over their subsystems each release, which is great. It means I have to spend less time sending email :)
Eventually I hope that Coverity implements a feature I asked for allowing each component to have a designated email address that new reports get sent to. With that in place, plus active triage on the backlog, a real dent could be made in the ~4700 outstanding issues.

Throughout the past year Coverity has made a number of improvements server-side, some at the behest of the scans, resulting in fewer false positives being found by some checkers. A good example of this was some additional heuristics being added to spot intentional ‘missing break in switch statement’ situations. I’ve also been in constant communication whenever an interesting bug was found upstream that Coverity didn’t detect, so over time, additional checkers should be added to catch more bugs.

How do we compare against other projects?
I picked a few at random.

FreeBSD 0.54 (~15m LOC) 14655 total, 6446 fixed, 8093 outstanding.
Firefox 0.70 (~5.4m LOC) 9008 total. 5066 fixed. 3786 outstanding.
Linux 0.53 (~9m LOC) 13337 total. 7202 fixed. 4761 outstanding.
Python 0.03 ! (~400k LOC) 1030 total. 895 fixed. 3 outstanding.

(LOC based on C preprocessor output)

FreeBSD’s defect density is pretty much the same as Linux right now, despite having a lot more code. I think they include all their userspace in their scans also, so it’s picked up gcc, sendmail, binutils etc etc.

The Python people have made a big effort to keep their defect density low (afaik, the lowest of all projects in scan). They did however have a lot fewer issues to begin with, and have a much smaller codebase. Firefox by comparison seems to have a lot of the same problems Linux has: a large corpus of pre-existing issues, and a large codebase (probably with few people with ‘global’ knowledge).

In my next post, I’ll go into some detail about where some of the more dense areas of the kernel are for Coverity issues. Much of it should be no surprise (old, unmaintained/neglected code etc), but there are a few interesting cases.

update : added FreeBSD statistics.
update 2 : (hi hackernews!) added blurb about coverity improvements.

The first year of Coverity Linux kernel scans. is a post from: codemonkey.org.uk

August 13, 2014 07:07 PM

Paul E. Mc Kenney: A practitioner at a formal-methods conference

I had the privilege of being asked to present on ordering, RCU, and validation at a joint meeting of the REORDER (Third International Workshop on Memory Consistency Models) and EC2 (7th International Workshop on Exploiting Concurrency Efficiently and Correctly) workshops.

Before my talk, Michael Tautschnig gave a presentation (based on this paper) on an interesting prototype tool (called “mole,” with the name chosen because the gestation period of a mole is about 42 days) that helps identify patterns of usage in large code bases. It is early days for this tool, but one could imagine it growing into something quite useful, perhaps answering questions such as “what are the different ways in which the Linux kernel uses reference counting?” He also took care to call out the many disadvantages of testing, which include not being able to test all paths, all possible races, all possible inputs, or all possible much of anything, at least not in finite time.

I started my talk with an overview of split counters, where each CPU (or task or whatever) updates its own counter, and the aggregate counter is read out by summing all the per-CPU counters. There was some concern expressed by members of the audience about the possibility of inconsistent results. For example, if one CPU adds five, another CPU adds seven, and a third CPU adds 11 to an initially zero counter, then two CPUs reading out the counter might see 12 and 18, respectively, which are inconsistent (they differ by six, and no CPU added six). To their credit, the attendees picked right up on a reasonable correctness criterion. The idea is that the aggregate counter's value varies with time, and that any given reader will be guaranteed to return a value between that of the counter when the reader started and that of the counter when the reader ended: Consistency is neither needed nor provided in a great number of situations.
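A minimal user-space sketch of the split-counter scheme (my illustration, not code from the talk; the slot layout and slot count are assumptions) looks like this, with each updater touching only its own slot and readers summing all slots:

#include <stdatomic.h>

#define NR_SLOTS 64   /* assumed upper bound on the number of CPUs/threads */

static _Atomic long slot[NR_SLOTS];

/* Each CPU/thread adds only to its own slot, so updates never contend. */
static void counter_add(int my_slot, long v)
{
    atomic_fetch_add_explicit(&slot[my_slot], v, memory_order_relaxed);
}

/* A reader sums all slots.  The result is not an instantaneous snapshot, but
   it always lies between the counter's value when the sum started and its
   value when the sum finished, which is exactly the correctness criterion
   described above. */
static long counter_read(void)
{
    long sum = 0;

    for (int i = 0; i < NR_SLOTS; i++)
        sum += atomic_load_explicit(&slot[i], memory_order_relaxed);
    return sum;
}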

I then gave my usual introduction to RCU, and of course it is always quite a bit of fun introducing RCU to people who have never encountered anything like it. There was quite a bit of skepticism initially, as well as a lot of questions and comments.

I then turned to validation, noting the promise of some formal-validation tooling. I ended by saying that although I agreed with the limitations of testing called out by the previous speaker, the fact remains that a number of people have devised tests that had found RCU bugs (thank you, Stephen, Dave, and Fengguang!), but no one has yet devised a hard-core formal-validation tool that has found any bugs in RCU. I also pointed out that this is definitely not because there are no bugs in RCU! (Yes, I have gotten rid of all day-one bugs in RCU, but only by having also gotten rid of all day-one code in RCU.) When asked if I meant bugs in RCU usage or in RCU itself, I replied “Either would be good.” Several people wrote down where to find RCU in the Linux kernel, so it will be interesting to see what they come up with. (Perhaps all too interesting!)

There were several talks on analyzing weakly ordered systems, but keep in mind that for these guys, even x86 is weakly ordered. After all, it allows prior stores to be reordered with later loads.

Another interesting talk was given by Kapil Vaswani on the topic of wait freedom. Recall that in a wait-free algorithm, every process is guaranteed to make some progress in a finite time, even in the presence of arbitrarily long delays for any given process. In contrast, in a lock-free algorithm, only one process is guaranteed to make some progress in a finite time, again, even in the presence of arbitrarily long delays for any given process. It is worth noting that neither of these guarantees is sufficient for real-time programs, which require a specified amount of progress (not merely some progress) in a bounded amount of time (not merely a finite amount of time). Wait-freedom and lock-freedom are nevertheless important forward-progress guarantees, and there are numerous other similar guarantees including obstruction freedom, deadlock freedom, starvation freedom, and many more besides.
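As a toy illustration of the difference (assuming C11 atomics, and hardware where an atomic add is a single instruction rather than a retry loop), consider two ways of bumping a shared counter:

#include <stdatomic.h>

static _Atomic long counter;

/* Wait-free on hardware with a native fetch-and-add: every thread finishes
   its increment in a bounded number of steps, no matter what the others do. */
static void incr_wait_free(void)
{
    atomic_fetch_add(&counter, 1);
}

/* Only lock-free: some thread always wins the compare-and-swap, so the system
   as a whole makes progress, but one unlucky thread can in principle keep
   losing the race indefinitely. */
static void incr_lock_free(void)
{
    long old = atomic_load(&counter);

    while (!atomic_compare_exchange_weak(&counter, &old, old + 1))
        ;   /* old has been refreshed with the current value; retry */
}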

It turns out that most software in production, even in real-time systems, is not wait-free, which has been a source of consternation for many researchers for quite some time. Kapil went on to describe how Alistarh et al. showed that, roughly speaking, given a non-hostile scheduler and crash-free execution, lock-free algorithms have wait-free behavior.

The interesting thing about this is that you can take it quite a bit farther, and those of you who know me well won't be surprised to learn that I did just that in a question to the speaker. If you have a non-hostile scheduler, crash-free execution, FIFO locks, bounded lock-hold times, no lock nesting, a finite number of processes, and so on, you can obtain the benefits of the more aggressive forward-progress guarantees. The general idea is that if you have at most N processes, and if the maximum lock-hold time is T, then you can wait at most (N-1)T time to acquire a given lock. (Those wishing greater rigor should read Bjoern B. Brandenburg's dissertation — Full disclosure: I was on Bjoern's committee.) In happy contrast to the authors of the paper mentioned in the previous paragraph, the speaker and audience seemed quite intrigued by this line of thought.

In all, it was an extremely interesting and thought-provoking time. With some luck, perhaps we will see some powerful software tools introduced by this group of researchers.

August 13, 2014 03:40 AM

August 10, 2014

Matthew Garrett: Birthplace

For tedious reasons, I will at this stage point out that I was born in Galway, Ireland.


August 10, 2014 11:44 PM

August 08, 2014

Dave Jones: Week of kernel bugs in review

With the 3.17 merge window opening up this week, it’s been kinda busy.
I also made a few enhancements to Trinity, so it found some bugs that have been there for a while.

In addition to this, I started pulling together a talk for kernel summit based on all the stuff that Coverity has been finding. I’ll eventually get around to turning those into blog posts too, as there’s a lot of material.

Productive week.

Week of kernel bugs in review is a post from: codemonkey.org.uk

August 08, 2014 07:36 PM

Dave Jones: compiler sanitizers.

I only recently discovered the sanitizer libraries that both gcc and llvm support despite them being a few years old now. (libasan, liblsan, libtsan and my favorite libubsan for undefined behaviour detection). LLVM also has a -fsanitize=memory.

Building code with -fsanitize={address|leak|undefined} has turned up a number of hard to find issues in various userspace code I’ve written. (Unfortunately doing this on something like Trinity produces a lot of false positives, as it deliberately generates undefined behavior in many cases, like creating an mmap, never writing to it, and then passing it to something that reads it).
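To give a feel for what these catch, here’s a tiny, deliberately buggy example of my own (not from any real project). Built with something like "gcc -g -fsanitize=address,undefined demo.c", UBSan reports the signed overflow and ASan reports the out-of-bounds store:

#include <limits.h>
#include <stdlib.h>

int main(void)
{
    int x = INT_MAX;
    int *p = malloc(4 * sizeof(*p));

    if (!p)
        return 1;

    x = x + 1;   /* signed integer overflow: undefined behaviour (UBSan) */
    p[4] = x;    /* heap-buffer-overflow, one element past the end (ASan) */
    free(p);
    return 0;
}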

There’s also a variant of libasan for the kernel which looks interesting. I know that’s found a bunch of issues in concert with fuzzing via Trinity, and expect it’s something we’ll see more of if/when that functionality gets merged.

Today I was reading about the recent gcc meeting, and these slides by the sanitizer developers caught my attention. What I found of particular interest was the “MSan for Chromium” slide, where they mention they rebuilt ~40 libraries to link with the sanitizer.

I’ve been contemplating doing this for a subset of some userspace packages in Fedora that I care about for a while, but I’ve not had spare cycles to even look into it. I dogfood a lot of bleeding edge code on all my machines, and have been curious for some time to see what the fallout looks like from such a rebuild of various network facing daemons. I suspect with Chromium being more focused on the client side, there hasn’t been a huge amount of research into this for server side code. Looking at ASan’s found bugs wiki page, it does seem to support that hypothesis. I’m curious to see what would fall out from a rebuilt Apache, Bind, Sendmail, nginx, etc.
Hopefully the developers of all the network facing code we ship are just as curious.

There are obvious comparisons to valgrind, which doesn’t require rebuilding, but in my experience so far, the sanitizers have found a bunch of issues that valgrind didn’t (or got lost in the noise). Also, just like with fuzzers, different tools tend to find different bugs even if they have the same intent. I think there’s room for both approaches.

compiler sanitizers. is a post from: codemonkey.org.uk

August 08, 2014 06:54 PM

August 07, 2014

Daniel Vetter: Neat stuff for 3.17

So with the 3.16 kernel out of the door it's time to look at what's queued up for the Intel graphics driver in 3.17.

This release features the universal plane support from Matt Roper, already enabled by default. This is prep work for atomic modesetting and pageflipping support: for a while now we have supported additional (overlay) planes in the DRM core and the i915 driver, but there have always been two implicit planes directly attached to the CRTC: the primary plane used by the SetCrtc and PageFlip functions, and the optional cursor support. For the atomic ioctl it's easier to handle everything as an explicit plane, so Matt's patches split these implicit planes away into separate real plane objects. This is a nice cleanup of the kms api in general, since a lot of SoC hardware has unified plane hardware where cursor, primary plane and any overlays are fully interchangeable. So we already expose this to userspace, if it sets the corresponding feature flag.

Another big feature on the display side is the improved PSR support, which is now enabled by default on Haswell and Broadwell. The tricky bit with PSR (and also with FBC), and the reason we didn't enable this by default earlier, is correctly supporting legacy frontbuffer rendering (for example for X). The hardware provides a bit of support to do that, but it doesn't catch all possible frontbuffer rendering and has a lot of other limitations. To finally fix this for real we've added accurate frontbuffer tracking in software. This should finally allow us to enable a lot of display power saving features by default like PSR on Baytrail, FBC (on all platforms) and DRRS (dynamic refresh rate switching).

On actual platform display enabling we have lots of improvements all over: Baytrail MIPI DSI support has greatly stabilized, backlight and power sequencer fixes, mmio based flips to work around issues with stalls and hangs for blitter ring based flips and plenty of other work. The core drm pieces for plane rotation support have also landed, unfortunately the i915 parts didn't make the cut for 3.17.

Another big area, as usual, has been general power management improvements. We now support runtime PM for DPMS Off and not just when the output is completely disabled. This was fairly invasive work since our current modesetting code assumed that a DPMS Off/On cycle will not destroy register state, but that's exactly what runtime PM can do. On the plus side this reorganization greatly cleaned up the code base and prepared the driver for atomic modesetting, which requires a similar separation between state computation and actual hw state updating like this feature.

Jesse Barnes implemented S0ix support for system suspend/resume. Marketing has some crazy descriptions for this, but essentially this means that we use the same power saving knobs for system suspend as for runtime PM - the entire machine is still running, just at a very low power state. Long-term this should simplify our system suspend code a bit since we can just reuse all the code used to implement runtime PM.

Moving on to the render side of the gpu, there have again been improvements to the rps code. Chris Wilson further tuned the rps boost logic, and Ville and Deepak implemented rps support for Cherrytrail.
Jesse contributed ppgtt support for Baytrail, which will be a lot more interesting once we enable full ppgtt again (hopefully in 3.18).

For Broadwell, semaphore support from Ben and Rodrigo was merged, but it looks like we need to disable that again due to stability issues. Oscar Mateo also implemented a large pile of interrupt handling improvements which hopefully address the small races and bugs we've had in the past on some platforms. There are also a lot of refactoring patches to prepare for execlist support from Oscar. Execlists are the new way of submitting work to the gpu, first supported on Broadwell (but not yet mandatory). The key feature compared to legacy ringbuffer submission is that we'll finally be able to preempt gpu tasks.

And as usual there have been tons of bugfixes and improvements all over. Oh and: user mode setting has moved one step further on the path to deprecation and is now fully disabled. If no one complains about this we can finally rip out all that code in one of the next kernel releases.

August 07, 2014 03:36 PM

August 05, 2014

Dave Jones: Linux 3.16 coverity stats

date          rev          outstanding  fixed  defect density
Jun/8/2014    v3.15        4928         6397   0.55
Jun/16/2014   v3.16-rc1    4817         6651   0.53
Jun/23/2014   v3.16-rc2    4815         6653   0.53
Jun/29/2014   v3.16-rc3    4810         6659   0.53
Jul/6/2014    v3.16-rc4    4806         6661   0.53
Jul/14/2014   v3.16-rc5    4801         6663   0.53
Jul/21/2014   v3.16-rc6    4827         7022   0.53
Jul/28/2014   v3.16-rc7    4820         7022   0.53
Aug/4/2014    v3.16        4817         7023   0.53

The 3.16 cycle really started putting a dent in the backlog of older issues. Hundreds of older issues got fixed in -rc1.
There was a small bump at rc5 in new issues being detected, when Coverity upgraded scan.coverity.com to their 7.5.0 release.
Improvements in that upgrade also meant it closed out more issues than it found new (395 new, 409 eliminated).

Many of the new issues detected look to be real problems. 50 or so of them come from a new checker that looks for patterns like

if (condition)
    bar();
else
    bar();

In a lot of drivers however, it seems to be intentional, as these cases come with FIXME comments suggesting that the author doesn’t know what the right thing to do is in the ‘else’ case, or some functionality doesn’t work right yet, so it falls back to doing the same thing in both branches.
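For illustration, the intentional variant usually looks something like this (a made-up driver fragment, not real kernel code):

struct chip {
    int has_fast_path;
};

static void slow_transfer(struct chip *chip)
{
    /* ... program the device the slow, known-good way ... */
    (void)chip;
}

static void start_transfer(struct chip *chip)
{
    if (chip->has_fast_path)
        /* FIXME: fast path not implemented yet, fall back to the slow path */
        slow_transfer(chip);
    else
        slow_transfer(chip);
}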

It’s now been a year since I first started doing regular builds in Coverity. In that time, the detected defect density has dropped from 0.68 to 0.53 today. We used to see upticks in new issues every time the merge window opened. Now, we’re seeing as many (or more) issues closed as we are seeing new ones. As an example: day 1 of the 3.17 merge window yesterday featured 3638 new changes, including all the questionable code in drivers/staging/. Coverity picked up 67 new issues, but 132 got eliminated.

I’m hoping things continue to improve at this rate.

Linux 3.16 coverity stats is a post from: codemonkey.org.uk

August 05, 2014 04:27 PM

Dave Jones: Linux 3.15 coverity stats

date          rev          outstanding  fixed  defect density
Mar/31/2014   v3.14        4811         6126   0.55
Apr/14/2014   v3.15-rc1    4909         6337   0.55
Apr/15/2014   v3.15-rc2    4881         6369   0.55
Apr/21/2014   v3.15-rc3    4878         6375   0.55
Apr/28/2014   v3.15-rc4    4966         6382   0.56
May/9/2014    v3.15-rc5    4960         6389   0.56
May/22/2014   v3.15-rc6    4956         6390   0.56
May/27/2014   v3.15-rc7    4954         6392   0.56
Jun/2/2014    v3.15-rc8    4932         6393   0.55
Jun/8/2014    v3.15        4928         6397   0.55

A belated dump of the statistics coverity gathers on defect density for the 3.15 kernel.
The most interesting thing for this cycle was the bump around rc4. The number of outstanding bugs increased by almost a hundred new defects. This was due to Coverity implementing a new checker for detecting the heartbleed bug in openssl.

After dismissing a bunch of false positives/intentional cases, we ended the cycle with a delta vs the previous release of just over a hundred new outstanding issues, but the overall defect density remained at 0.55.

Linux 3.15 coverity stats is a post from: codemonkey.org.uk

August 05, 2014 04:05 PM

Dave Jones: Vacation over, back to oopses.

First day back after a week off.
Started the day by screwing up a coverity build, because people.fedoraproject.org went away. Fixed up my scripts and started over, running the analysis on 3.16 which came out yesterday. I’ll do another report tomorrow on how things have changed over the last release or two (sudden realization that I haven’t done one since 3.14).

Spent some time in the afternoon beginning some new functionality in trinity, to have some shared files that all threads write/seek/truncate etc. to. It didn’t take much code at all before I was staring at my first oops, when btrfs blew up. Pretty sad, given I’ve not even really started doing anything particularly interesting with this code yet.

Vacation over, back to oopses. is a post from: codemonkey.org.uk

August 05, 2014 03:15 AM

August 02, 2014

Rusty Russell: ccan/io: revisited

There are numerous C async I/O libraries; tevent being the one I’m most familiar with.  Yet, tevent has a very wide API, and programs using it inevitably descend into “callback hell”.  So I wrote ccan/io.

The idea is that each I/O callback returns a “struct io_plan” which says what I/O to do next, and what callback to call.  Examples are “io_read(buf, len, next, next_arg)” to read a fixed number of bytes, and “io_read_partial(buf, lenp, next, next_arg)” to perform a single read.  You could also write your own, such as pettycoin’s “io_read_packet()” which read a length then allocated and read in the rest of the packet.

This should enable a convenient debug mode: you turn each io_read() etc. into synchronous operations and now you have a nice callchain showing what happened to a file descriptor.  In practice, however, debug was painful to use and a frequent source of bugs inside ccan/io, so I never used it for debugging.

And I became less happy when I used it in anger for pettycoin, but at some point you’ve got to stop procrastinating and start producing code, so I left it alone.

Now I’ve revisited it.   820 insertions(+), 1042 deletions(-) and the code is significantly less hairy, and the API a little simpler.  In particular, writing the normal “read-then-write” loops is still very nice, while doing full duplex I/O is possible, but more complex.  Let’s see if I’m still happy once I’ve merged it into pettycoin…

August 02, 2014 06:58 AM

Matt Domsch: Ottawa Linux Symposium needs your help

If you have ever attended the Ottawa Linux Symposium (OLS), read a paper on a technology first publicly suggested at OLS, or use Linux today, please consider donating to help the conference and Andrew Hutton, the conference’s principal organizer since 1999.

I first attended OLS in the summer of 2003. I had heard of this mythical conference in Canada each summer, a long way from Austin yet still considered domestic rather than international for the purposes of business travel authorization, so getting approval to attend wasn’t so hard. I met Val on the walk from Les Suites to the conference center on the first morning, James Bottomley during a storage subsystem breakout the first afternoon, Jon Masters while still in his manic coffee phase, and countless others that first year. Willie organized the bicycle-chain keysigning that helped people put faces to names we only knew via LKML posts. I remember meeting Andrew in the ever-present hallway track, and somehow wound up on the program committee for the following year and the next several.

I went on to submit papers in 2004 (DKMS), 2006 (Firmware Tools), 2008 (MirrorManager). Getting a paper accepted meant great exposure for your projects (these three are still in use today). It also meant an invitation to my first exposure to the party-within-the-party – the excellent speaker events that Andrew organized as a thank-you to the speakers. Scotch-tastings with a haggis celebrated by Stephen Tweedie. A cruise on the Ottawa River. An evening in a cold war fallout shelter reserved for Parliament officials with the most excellent Scotch that only Mark Shuttleworth could bring. These were always a special treat which I always looked forward to.

Andrew, and all the good people who helped organize OLS each year, put on quite a show, being intentional about building the community – not by numbers (though for quite a while, attendance grew and grew) – but providing space to build deep personal connections that are so critical to the open source development model. It’s much harder to be angry about someone rejecting your patches when you’ve met them face to face, and rather than think it’s out of spite, understand the context behind their decisions, and how you can better work within that context. I first met many of the Linux developers face-to-face at OLS that became my colleagues for the last 15 years.

I haven’t been able to attend for the last few years, but always enjoyed the conference, the hallway talks, the speaker parties, and the intentional community-building that OLS represents.

Several economic changes conspired to put OLS into the financial bind it is in today. You can read Andrew’s take about it on the Indiegogo site. I think the problems started before the temporary move to Montreal. In OLS’s growth years, the Kernel Summit was co-located, and preceded OLS. After several years with this arrangement, the Kernel Summit members decided that OLS was getting too big, that the week got really really long (2 days of KS plus 4 days of OLS), and that everyone had been to Ottawa enough times that it was time to move the meetings around. Cambridge, UK would be the next KS venue (and a fine venue it was). But in moving KS away, some of the gravitational attraction of so many kernel developers left OLS as well.

The second problem came in moving the Ottawa Linux Symposium to Montreal for a year. This was necessary, as the conference facility in Ottawa was being remodeled (really, rebuilt from the ground up), which prevented it from being held there. This move took even more of the wind out of the sails. I wasn’t able to attend the Montreal symposium, nor since, but as I understand it, attendance has been on the decline ever since. Andrew’s perseverance has kept the conference alive, albeit smaller, at a staggering personal cost.

Whether or not the conference happens in 2015 remains to be seen. Regardless, I’ve made a donation to support the debt relief, in gratitude for the connections that OLS forged for me in the Linux community. If OLS has had an impact on your career or your friendships, please make a donation yourself to help both Andrew and the conference.

Visit the OLS Indiegogo site to show your respect.

August 02, 2014 03:52 AM

July 30, 2014

Pavel Machek: Friends don't let friends freeze their hard drives

An hour and 15 minutes later, the platters look really frozen... and the heads are leaving watery trails on the hard drive, which clicks. Ok, this is not looking good.

I should not have let it run with water on board -- the outside tracks are physically destroyed.

Next candidate: WD Caviar, WD200, 20GB.

This one is actually pretty impressive. It clearly has room for four (or so) platters, and only one is populated. And this one actually requires its cover for operation; otherwise it produces "interesting sounds" (and no data).

It went into the refrigerator for a few hours, but then I let it thaw before continuing the operation. The disk still works, with a few bad sectors. I overwrote the disk with zeros, and that recovered the bad sectors.

I put a fingerprint on the surface. Bad idea; that killed the disk.

Ok, so we have two pieces of information from the advertising confirmed: freezing can and will kill the disk, and some hard drives need their screws for operation.

July 30, 2014 04:33 PM

Pavel Machek: I have seen the future

...and did not like what I saw. I installed Debian/testing. Now I know why everyone hates systemd: it turned a minor error (missing firmware for the wlan card) into a message storm (of increasing speed) followed by a forkbomb. Only the OOM killer stopped the madness.

Now, I've seen Gnome3 before, and it is unusable -- at least on X60 hardware. So I went directly into Mate, hoping to see a friendly Gnome2-like desktop. Well, it looked familiar but slightly different. After a while I discovered I was actually in Xfce. So log out, log in, and yes, this looks slightly more familiar. Unfortunately, the theme is still different, the window buttons are smaller, and Terminal windows can no longer be resized using the lower-right corner. I also tried to restore my settings (cp -a /oldhome/.[a-z]* .), but it did not have the desired effect.

July 30, 2014 04:27 PM

Dave Airlie: you have a long road to walk, but first you have to leave the house

or why publishing code is STEP ZERO.

If you've been developing code internally for a kernel contribution, you've probably got a lot of reasons not to default to working in the open from the start: you probably don't work for Red Hat or another company with default-to-open policies, or perhaps you are scared of the scary kernel community and want to present a polished gem.

If your company is a pain about legal reviews and the like, you have probably spent (wasted) months of engineering time on internal reviews, and so you think all of this will matter later. Why wouldn't it? You just spent (wasted) a lot of time on it, so it must matter.

So you have your polished codebase; why wouldn't those kernel maintainers love to merge it?

Then you publish the source code.

Oh look, you just left your house. The merging of your code is many, many miles distant, and you just started walking that road: just now, not when you started writing it, not when you started legal review, not when you rewrote it internally for the 4th time. You started it this moment.

You might have to rewrite it externally 6 times, you might never get it merged, it might be something your competitors are also working on, and the kernel maintainers would rather you cooperated with people your management would lose their minds over. That is the kernel development process.

step zero: publish the code. leave the house.

(Lately I've been seeing this problem more and more, so I decided to write it up. It really isn't directed at anyone in particular; I think a lot of vendors are guilty of this.)

July 30, 2014 01:47 AM

July 29, 2014

Rusty Russell: Pettycoin Alpha01 Tagged

As with all software, it took longer than I expected, but today I tagged the first version of pettycoin. Now for lots more polish and features, but at least there's something more than the git repo for others to look at!

July 29, 2014 07:53 AM

July 24, 2014

Pavel Machek: Nowcasting for whole Europe, international travel script

CHMI changed their webpages, so old.chmi.cz no longer worked and I had to adapt nowcast. My first idea was to use radareu.cz, which has nice coverage of the whole of Europe, but the pictures are too big and interpolated... and handling them takes time. So I updated it once more; now it supports the new format of the CHMI pictures. It also means that if you are within the EU and want to play with weather nowcasting, you now can... just be warned it is slightly slow... but very useful, especially in the rainy/stormy weather these days.

Now, I don't know about you, but I always forget something when travelling internationally. Like... power converters, or the fact that the destination is in a different time zone. Is there some tool to warn you about the differences between your home and destination countries? (I'd prefer it to work offline, for privacy reasons, but...) I started a country script, with some data from Wikipedia, but it is quite incomplete and would need a lot of help.

July 24, 2014 01:00 PM

Pavel Machek: More fun with spinning rust

So I took an old 4GB (IBM) drive for a test. Oops, it sounds wrong while spinning up. Perhaps I need to use two USB cables to get enough power?

Let's take a 60GB drive... that one works well. Back to the 4GB one. Bad, clicking sounds.

IBM actually used two different kinds of screws, so I cannot open this one non-destructively... and they actually made the platters out of glass. No one is going to recover data from this one... and I have about 1000 little pieces of glass to collect.

Next candidate: Seagate Barracuda ATA III ST320414A, 20GB.

Nice, about 17MB/sec transfer; the disk is now full of photos. Data recovery firms say that screw torque matters. I made all the screws very loose, then removed them altogether, then found the second hidden screw, and then ran the drive open. It worked ok.

The air filter is not actually secured in any way, and I guess I touched the platters with the cover while opening it. Interestingly, these heads do not stick to the surface, even when moved manually.

Friends do not let friends freeze their hard drives, but this one went into two plastic bags and then into the refrigerator. Have you noticed how the data-recovery firms place the drive there without humidity protection?

So, any bets on whether it will be operational after I remove it from the freezer?

July 24, 2014 12:51 PM

July 23, 2014

Michael Kerrisk (manpages): Linux/UNIX System Programming course scheduled for October 2014

I've scheduled a further 5-day Linux/UNIX System Programming course to take place in Munich, Germany, for the week of 6-10 October 2014.

The course is intended for programmers developing system-level, embedded, or network applications for Linux and UNIX systems, or programmers porting such applications from other operating systems (e.g., Windows) to Linux or UNIX. The course is based on my book, The Linux Programming Interface (TLPI), and covers topics such as low-level file I/O; signals and timers; creating processes and executing programs; POSIX threads programming; interprocess communication (pipes, FIFOs, message queues, semaphores, shared memory); network programming (sockets); and server design.
     
The course has a lecture+lab format, and devotes substantial time to working on some carefully chosen programming exercises that put the "theory" into practice. Students receive a copy of TLPI, along with a 600-page course book containing the more than 1000 slides that are used in the course. A reading knowledge of C is assumed; no previous system programming experience is needed.

Some useful links for anyone interested in the course:

Questions about the course? Email me via training@man7.org.

July 23, 2014 05:49 AM

July 17, 2014

Rusty Russell: API Bug of the Week: getsockname().

A “non-blocking” IPv6 connect() call was, in fact, blocking.  Tracking that down made me realize the IPv6 address was mostly random garbage, which was caused by this function:

bool get_fd_addr(int fd, struct protocol_net_address *addr)
{
   union {
      struct sockaddr sa;
      struct sockaddr_in in;
      struct sockaddr_in6 in6;
   } u;
   socklen_t len = sizeof(len);
   if (getsockname(fd, &u.sa, &len) != 0)
      return false;
   ...
}

The bug: “sizeof(len)” should be “sizeof(u)”.  But when presented with a too-short length, getsockname() truncates, and otherwise “succeeds”; you have to check the resulting len value to see what you should have passed.
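
For reference, here is a minimal, self-contained sketch of the fix (the get_fd_addr_family() helper and the main() harness are illustrative, not pettycoin code): pass the size of the buffer, not the size of the length variable, and check the reported length afterwards.

#include <stdbool.h>
#include <stdio.h>
#include <sys/socket.h>
#include <netinet/in.h>

static bool get_fd_addr_family(int fd, sa_family_t *family)
{
   union {
      struct sockaddr sa;
      struct sockaddr_in in;
      struct sockaddr_in6 in6;
   } u;
   socklen_t len = sizeof(u);      /* the whole union, not sizeof(len) */

   if (getsockname(fd, &u.sa, &len) != 0)
      return false;
   if (len > sizeof(u))            /* address didn't fit: it was truncated */
      return false;
   *family = u.sa.sa_family;
   return true;
}

int main(void)
{
   sa_family_t family;
   int fd = socket(AF_INET6, SOCK_STREAM, 0);

   if (fd >= 0 && get_fd_addr_family(fd, &family))
      printf("socket family: %d\n", family);
   return 0;
}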

Obviously an error return would be better here, but the writable len arg is pretty useless: I don’t know of any callers who check the length return and do anything useful with it.  Provide getsocklen() for those who do care, and have getsockname() take a size_t as its third arg.

Oh, and the blocking?  That was because I was calling “fcntl(fd, F_SETFD, …)” instead of “F_SETFL”!
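
For completeness, a minimal sketch of the non-blocking setup done right (set_nonblocking() is an illustrative name, not from pettycoin): O_NONBLOCK lives in the file status flags (F_GETFL/F_SETFL), while F_GETFD/F_SETFD only manipulate descriptor flags such as FD_CLOEXEC.

#include <fcntl.h>
#include <stdbool.h>

static bool set_nonblocking(int fd)
{
   int flags = fcntl(fd, F_GETFL);                       /* status flags, not FD flags */
   if (flags < 0)
      return false;
   return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == 0;   /* add O_NONBLOCK */
}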

July 17, 2014 03:31 AM

July 16, 2014

Dave Jones: Closure on some old bugs.


July 16, 2014 03:10 AM

July 15, 2014

James Morris: Linux Security Summit 2014 Schedule Published

The schedule for the 2014 Linux Security Summit (LSS2014) is now published.

The event will be held over two days (18th & 19th August), starting with James Bottomley as the keynote speaker.  The keynote will be followed by refereed talks, group discussions, kernel security subsystem updates, and break-out sessions.

The refereed talks are:

Discussion session topics include Trusted Kernel Lock-down Patch Series, led by Kees Cook; and EXT4 Encryption, led by Michael Halcrow & Ted Ts’o. There’ll be kernel security subsystem updates from the SELinux, AppArmor, Smack, and Integrity maintainers. The break-out sessions are open format and a good opportunity to collaborate face-to-face on outstanding or emerging issues.

See the schedule for more details.

LSS2014 is open to all registered attendees of LinuxCon.  Note that discounted registration is available until the 18th of July (end of this week).

See you in Chicago!

July 15, 2014 11:00 PM