Kernel Planet

July 23, 2014

Michael Kerrisk (manpages): Linux/UNIX System Programming course scheduled for October 2014

I've scheduled a further 5-day Linux/UNIX System Programming course to take place in Munich, Germany, for the week of 6-10 October 2014.

The course is intended for programmers developing system-level, embedded, or network applications for Linux and UNIX systems, or programmers porting such applications from other operating systems (e.g., Windows) to Linux or UNIX. The course is based on my book, The Linux Programming Interface (TLPI), and covers topics such as low-level file I/O; signals and timers; creating processes and executing programs; POSIX threads programming; interprocess communication (pipes, FIFOs, message queues, semaphores, shared memory); network programming (sockets); and server design.
     
The course has a lecture+lab format, and devotes substantial time to working on some carefully chosen programming exercises that put the "theory" into practice. Students receive a copy of TLPI, along with a 600-page course book containing the more than 1000 slides that are used in the course. A reading knowledge of C is assumed; no previous system programming experience is needed.

Some useful links for anyone interested in the course:

Questions about the course? Email me via training@man7.org.

July 23, 2014 05:49 AM

July 17, 2014

Rusty Russell: API Bug of the Week: getsockname().

A “non-blocking” IPv6 connect() call was, in fact, blocking.  Tracking that down made me realize the IPv6 address was mostly random garbage, which was caused by this function:

bool get_fd_addr(int fd, struct protocol_net_address *addr)
{
   union {
      struct sockaddr sa;
      struct sockaddr_in in;
      struct sockaddr_in6 in6;
   } u;
   socklen_t len = sizeof(len);
   if (getsockname(fd, &u.sa, &len) != 0)
      return false;
   ...
}

The bug: “sizeof(len)” should be “sizeof(u)”.  But when presented with a too-short length, getsockname() truncates, and otherwise “succeeds”; you have to check the resulting len value to see what you should have passed.
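
For illustration, here's a minimal sketch of the corrected call, including the truncation check the current API forces on callers (the function name is mine, not code from the original post):

#include <stdbool.h>
#include <sys/socket.h>
#include <netinet/in.h>

static bool fd_addr_sane(int fd)
{
   union {
      struct sockaddr sa;
      struct sockaddr_in in;
      struct sockaddr_in6 in6;
   } u;
   socklen_t len = sizeof(u);   /* the fix: size of the buffer, not sizeof(len) */

   if (getsockname(fd, &u.sa, &len) != 0)
      return false;
   /* getsockname() truncates silently; a too-large len is the only clue. */
   return len <= sizeof(u);
}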

Obviously an error return would be better here, but the writable len arg is pretty useless: I don’t know of any callers who check the length return and do anything useful with it.  Provide getsocklen() for those who do care, and have getsockname() take a size_t as its third arg.

Oh, and the blocking?  That was because I was calling “fcntl(fd, F_SETFD, …)” instead of “F_SETFL”!
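
For the record, a minimal sketch of what the non-blocking setup should look like - F_SETFL manipulates file status flags like O_NONBLOCK, while F_SETFD manipulates descriptor flags like FD_CLOEXEC:

#include <fcntl.h>

static int set_nonblock(int fd)
{
   int flags = fcntl(fd, F_GETFL);
   if (flags < 0)
      return -1;
   /* F_SETFL, not F_SETFD: we want the file status flags. */
   return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}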

July 17, 2014 03:31 AM

July 16, 2014

Dave Jones: Closure on some old bugs.

Closure on some old bugs. is a post from: codemonkey.org.uk

July 16, 2014 03:10 AM

July 15, 2014

James Morris: Linux Security Summit 2014 Schedule Published

The schedule for the 2014 Linux Security Summit (LSS2014) is now published.

The event will be held over two days (18th & 19th August), starting with James Bottomley as the keynote speaker. The keynote will be followed by refereed talks, group discussions, kernel security subsystem updates, and break-out sessions.

The refereed talks are:

Discussion session topics include Trusted Kernel Lock-down Patch Series, led by Kees Cook; and EXT4 Encryption, led by Michael Halcrow & Ted Ts’o. There’ll be kernel security subsystem updates from the SELinux, AppArmor, Smack, and Integrity maintainers. The break-out sessions are open format and a good opportunity to collaborate face-to-face on outstanding or emerging issues.

See the schedule for more details.

LSS2014 is open to all registered attendees of LinuxCon.  Note that discounted registration is available until the 18th of July (end of this week).

See you in Chicago!

July 15, 2014 11:00 PM

Dave Jones: catch-up after a brief hiatus.

Yikes, almost a month since I last posted.
In that time, I’ve spent pretty much all my time heads down chasing memory corruption bugs in Trinity, and whacking a bunch of smaller issues as I came across them. Some of the bugs I’ve been chasing have taken a while to reproduce, so I’ve deliberately held off on changing too much at once these last few weeks, choosing instead to dribble changes in a few commits at a time, just to be sure things weren’t actually getting worse. Every time I thought I’d finally killed the last bug, I’d do another run for a few hours, and then see the same corrupted structures. Incredibly frustrating. After a process of elimination (I found a hundred places where the bug wasn’t), I think I’ve finally zeroed in on the problematic code, in the functions that generate random filenames.
I pretty much gutted that code today, which should remove both the bug and a bunch of unnecessary operations that never found any kernel bugs anyway. I’m glad I spent the time to chase this down, because the next bunch of features I plan to implement leverage this code quite heavily, and it would have only caused even more headaches later on.

The one plus side of chasing this bug the last month or so has been all the added debugging code I’ve come up with. Some of it has been committed for re-use later, while some of the more intrusive debug features (like storing backtraces for every locking operation) I opted not to commit, but will keep the diff around in case it comes in handy again sometime.

Spent the afternoon clearing out my working tree by committing all the clean-up patches I’ve done while doing this work. Some of them were a little tangled and needed separating into multiple commits. Next, removing some lingering code that hasn’t really done anything useful for a while.

I’ve been hesitant to start on the newer features until things calmed down, but that should hopefully be pretty close.

July 15, 2014 03:26 PM

July 14, 2014

Pavel Machek: Fun with spinning rust

Got a hard drive that would not spin up, to attempt recovery. Getting the necessary screwdrivers was not easy, but eventually I managed to open the drive (after hitting it a few times in an attempt to unstick the heads). Now, this tutorial does not sound that bad, and yes, I managed to un-stick the heads. The drive now spins up... and keeps seeking, never becoming ready. I tried running the drive open, and the heads only go to the near half of the platters... I assume something is wrong there? I tried various torques on the screws, as some advertising video suggested.

(Also, the heads immediately stick to the platters when I move them manually. I guess that's normal?)

The drive is now in the freezer, and probably beyond repair... but if you have some ideas, or some fun uses for a dead hard drive, I guess I can try them. The data on the disk are not important enough for a platter transplantation.

July 14, 2014 10:00 PM

July 09, 2014

Michael Kerrisk (manpages): man-pages-3.70 is released

I've released man-pages-3.70. The release tarball is available on kernel.org. The browsable online pages can be found on man7.org. The Git repository for man-pages is available on kernel.org.

This is a relatively small release. As well as many smaller fixes to various pages, the more notable changes in man-pages-3.70 are the following:

July 09, 2014 11:24 AM

July 04, 2014

Matthew Garrett: Self-signing custom Android ROMs

The security model on the Google Nexus devices is pretty straightforward. The OS is (nominally) secure and prevents anything from accessing the raw MTD devices. The bootloader will only allow the user to write to partitions if it's unlocked. The recovery image will only permit you to install images that are signed with a trusted key. In combination, these facts mean that it's impossible for an attacker to modify the OS image without unlocking the bootloader[1], and unlocking the bootloader wipes all your data. You'll probably notice that.

The problem comes when you want to run something other than the stock Google images. Step number one for basically all of these is "Unlock your bootloader", which is fair enough. Step number two is "Install a new recovery image", which is also reasonable in that the key database is stored in the recovery image and so there's no way to update it without doing so. Except, unfortunately, basically every third party Android image is either unsigned or is signed with the (publicly available) Android test keys, so this new recovery image will flash anything. Feel free to relock your bootloader - the recovery image will still happily overwrite your OS.

This is unfortunate. Even if you've encrypted your phone, anyone with physical access can simply reboot into recovery and reflash /system with something that'll stash your encryption key and mail your data to the NSA. Surely there's a better way of doing this?

Thankfully, there is. Kind of. It's annoying and involves a bunch of manual processes and you'll need to re-sign every update yourself. But it is possible to configure Nexus devices in such a way that you retain the same level of security you had when you were using the Google keys without losing the freedom to run whatever you want. Here's how.

Note: This is not straightforward. If you're not an experienced developer, you shouldn't attempt this. I'm documenting this so people can create more user-friendly approaches.

First: Unlock your bootloader. /data will be wiped.
Second: Get a copy of the stock recovery.img for your device. You can get it from the factory images available here.
Third: Grab mkbootimg from here and build it. Run unpackbootimg against recovery.img.
Fourth: Generate some keys. Get this script and run it.
Fifth: zcat recovery.img-ramdisk.gz | cpio -id to extract your recovery image ramdisk. Do this in an otherwise empty directory.
Sixth: Get DumpPublicKey.java from here and run it against the .x509.pem file generated in step 4. Replace /res/keys from the recovery image ramdisk with the output. Include the "v2" bit at the beginning.
Seventh: Repack the ramdisk image (find . | cpio -o -H newc | gzip > ../recovery.img-ramdisk.gz) and rebuild recovery.img with mkbootimg.
Eighth: Write the new recovery image to your device.
Ninth: Get signapk from here and build it. Run it against the ROM you want to sign, using the keys you generated earlier. Make sure you use the -w option to sign the whole zip rather than signing individual files.
Tenth: Relock your bootloader.
Eleventh: Boot into recovery mode and sideload your newly signed image.

At this point you'll want to set a reasonable security policy on the image (eg, if it grants root access, ensure that it requires a PIN or something), but otherwise you're set - the recovery image can't be overwritten without unlocking the bootloader and wiping all your data, and the recovery image will only write images that are signed with your key. For obvious reasons, keep the key safe.

This, well. It's obviously an excessively convoluted workflow. A *lot* of it could be avoided by providing a standardised mechanism for key management. One approach would be to add a new fastboot command for modifying the key database, and only permit this to be run when the bootloader is unlocked. The workflow would then be: unlock the bootloader, import the new key, and relock - which seems more straightforward. Long term, individual projects could do the signing themselves and distribute their public keys, making the install process even easier than the current requirement to install an entirely new recovery image.

I'd actually previously criticised Google on the grounds that using custom keys wasn't possible on Android devices. I was wrong. It is, it's just that (as far as I can tell) nobody's actually documented it before. It's important that users not be forced into treating security and freedom as mutually exclusive, and it's great that Google have made that possible.

[1] This model fails if it's possible to gain root on the device. Thankfully this would never hold on what's that over there is that a distraction?

July 04, 2014 10:10 PM

July 03, 2014

Pavel Machek: Web browser limits desktop on low-powered machines

It seems that the web browser is the limiting factor on low-powered machines. Chromium is pretty much unusable with 512MB of RAM, usable with 1GB, and nice with 2GB. Firefox is actually usable with 512MB -- it does not seem to have such a big per-tab overhead -- but feels less responsive.

Anyway, it seems I'll keep using x86 for desktops for now.

July 03, 2014 10:29 AM

June 30, 2014

Paul E. Mc Kenney: Confessions of a Recovering Proprietary Programmer, Part XII

We have all suffered from changing requirements. We are almost done with the implementation, maybe even most of the way through testing, and then a change in requirements invalidates all of our hard work. This can of course be extremely irritating.

It turns out that changing requirements are not specific to software, nor are they anything new. Here is an example from my father's area of expertise, which is custom-built large-scale food processing equipment.

My father's company was designing a new-to-them piece of equipment, namely a spinach chopper able to chop 15 tons of spinach per hour. Given that this was a new area, they built a small-scale prototype with which they ran a number of experiments on “small” batches of spinach (that is, less than a ton of spinach). The customer provided feedback on the results of these experiments, which fed into the design of the full-scale production model.

After the resulting spinach chopper had been in production for a short while, there was a routine follow-up inspection. The purpose of the inspection was to check for unexpected wear and other unanticipated problems with the design and fabrication. While the inspector was there, the chopper kicked a large rock into the air. It turned out that spinach from the field can contain rocks up to six inches (15cm) in diameter, and this requirement had not been communicated during development. This situation of course inspired a quick engineering change, installing a cage that captured the rocks.

Of course, it is only fair to point out that this requirement didn't actually change. After all, spinach from the field has always contained rocks up to six inches in diameter. There had been a failure to communicate an existing requirement, not a change in the actual requirement.

However, from the viewpoint of the engineers, the effect is the same. Regardless of whether there was an actual change in the requirements or the exposure of a previously hidden requirement, redesign and rework will very likely be required.

One other point of interest to those who know me... The spinach chopper was re-inspected some five years after it started operation. Its blades had never been sharpened, and despite encountering the occasional rock, still did not need sharpening. So to those of you who occasionally accuse me of over-engineering, all I can say is that I come by it honestly! ;–)

June 30, 2014 10:36 PM

Pavel Machek: Warning: don't use 3.16-rc1

As Andi found, this should be fixed in the newest -rcs, but I had just done

root@amd:~# mkfs.ext4 -c /dev/mapper/usbhdd

(yes, an obscure 4GB bug, how could it hit me?)

And now I have

root@amd:/# dumpe2fs -b /dev/mapper/usbhdd
dumpe2fs 1.41.12 (17-May-2010)
1011347
2059923
3108499
4157075

and

>>> (2059923-1011347)/1024.
1024.0
>>> (3108499-1011347)/1024.
2048.0
>>>

Yes, badblocks detected an error every 4GB (the reported "bad blocks" are 1048576 4KB blocks apart, i.e., 4GiB).

I'll update now, and I believe the disk errors will mysteriously disappear.

June 30, 2014 08:03 PM

June 22, 2014

Pavel Machek: Feasibility of desktop on ARM cpu

The Thinkpad X60 is an old notebook: Core Duo @ 1.8GHz, 2GB RAM. But it is still a pretty usable desktop machine, as long as Gnome2 is used, the number of Chromium tabs does not grow "unreasonable", and no development is attempted there. But it eats a bit too much power.

The OLPC 1.75 is ARM v7 @ 0.8GHz, 0.5GB RAM. According to my tests, it should be equivalent to a Core Solo @ 0.43GHz. Would that make a usable desktop?

Socrates is a dual ARM v7 @ 1.5GHz, 1GB RAM. It should be equivalent to a Core Duo @ 0.67GHz. Oh, and I'd have to connect the display over USB. Would that be usable?

Ok, let's try. "nosmp mem=512M" makes the Thinkpad fail to boot; "nosmp mem=512M@1G" works a bit better. 26 Chromium tabs make the machine unusable: the mouse lags, and the system is so overloaded that X fails to interpret keystrokes correctly. (Or maybe X and the input subsystem suck so much that they fail to interpret input correctly under moderate system load?)

I limited the CPU clock to 1GHz; that's as low as the Thinkpad can go:
/sys/devices/system/cpu/cpu0/cpufreq# echo 1000000 > scaling_max_freq

The machine feels slightly slower, but usable as long as Chromium is stopped. Even video playback is usable at SD resolution.

With a limited number of tabs (7), the situation is better, but a single complex tab (Facebook) still makes the machine swap and become unusable. And... the slow CPU makes the "unresponsive tab" dialog pop up way too often.

Impressions so far: the Socrates CPU might be enough for a marginally-usable desktop. 512MB of RAM definitely is not. Will retry with 1GB one day.

June 22, 2014 02:59 PM

June 21, 2014

Rusty Russell: Alternate Blog for my Pettycoin Work

I decided to use github for pettycoin, and tested out their blogging integration (summary: it’s not very integrated, but once set up, Jekyll is nice).  I’m keeping a blow-by-blow development blog over there.

June 21, 2014 12:14 AM

June 20, 2014

Dave Jones: Transferring maintainership of x86info.

On Mon Feb 26 2001, I committed the first version of x86info. 13 years later, I’m pretty much done with it.
Despite a surge of activity back in February, when I set a new monthly commit count record, it’s been a project I’ve had little time for over the last few years. Luckily, someone has stepped up to take over. Going forward, Nick Black will be maintaining it here.

I might still manage the occasional commit, but I won’t be feeling guilty about neglecting it any more.

An unfinished splinter project that I started a while back, x86utils, was intended to be a replacement for x86info of sorts, moving it to multiple small utilities rather than the one monolithic blob that x86info is. I’m undecided on whether I’ll continue to work on it, especially given my time commitments to Trinity and related projects.

June 20, 2014 06:34 PM

June 19, 2014

Pavel Machek: debootstrap, olpc, and gnome

The versioned Fedora setup that is the default on the OLPC-1.75 is a bit too strange for me, so I attempted to get Debian/ARM to work there. First, I used multistrap, but that uses separate repositories. debootstrap might be deprecated, but at least it works... after I figured out that I needed to run debootstrap --second-stage inside the chroot. I added firmware, set the user password, installed the system using tasksel, and things started working.

So now I have a working Gnome on the OLPC... with everything a bit too small. I don't know how OLPC solved high-dpi support, but Debian certainly does not do the right thing by default.

June 19, 2014 02:49 PM

June 17, 2014

Dave Jones: Daily log June 16th 2014

Catch up from the last week or so.
Spent a lot of time turning a bunch of code in trinity inside out. After splitting the logging code into “render” and “write out buffer” stages, I had a bunch of small bugs to nail down. While chasing those, I kept occasionally hitting a strange bug where the list of pids would get corrupted. Two weeks on, and I’m still no closer to figuring out the exact cause, but I’ve got a lot more clues (mostly from now knowing what it _isn’t_, thanks to a lot of code rewriting).
Some of that rewriting has been on my mind for a while: the shared ‘shm’ structure between all processes was growing to the point that I was having trouble remembering why certain variables were shared there rather than just being globals in main, inherited at fork time. So I moved a bunch of variables from the shm to another structure named ‘childdata’, an array of pointers to which lives in the shm.

Some of the code then looked a little convoluted, with references that used to be shm->data being replaced with shm->children[childno].data, which I remedied by having each child instantiate a ‘this_child’ variable when it starts up, allowing this_child->data to point to the right thing.
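
A rough sketch of that layout; the type and field names here are illustrative guesses, not trinity's actual code:

#define MAX_CHILDREN 64          /* made-up limit for the sketch */

struct childdata {
   void *data;                   /* formerly a field in the shm */
   /* ... other per-child state moved out of the shm ... */
};

struct shm_s {
   struct childdata *children[MAX_CHILDREN];
   /* ... genuinely shared state stays here ... */
};

/* Each child sets this once at startup, so the rest of the code can
 * say this_child->data instead of shm->children[childno]->data. */
static struct childdata *this_child;

static void child_init(struct shm_s *shm, int childno)
{
   this_child = shm->children[childno];
}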

All of this code churn served in part to refresh my memory on how some of this worked, hoping that I’d stumble across the random scribble that occasionally happens. I also added a bunch of code to things like dump backtraces, which has been handy for turning up 1-2 unrelated bugs.

On Friday afternoon I discovered that the bug only shows up if MALLOC_PERTURB_ is set. Which is curious, because the struct in question that gets scribbled over is never free()’d. It’s allocated once at startup, and inherited by each child when it fork()’s.

Debugging continues..

June 17, 2014 04:11 AM

June 16, 2014

Rusty Russell: Rusty Goes on Sabbatical, June to December

At linux.conf.au I spoke about my pre-alpha implementation of Pettycoin, but progress since then has been slow.  That’s partially due to yak shaving (like rewriting the ccan/io library), partially due to reimplementing parts I didn’t like, and partially due to the birth of my son, but mainly because I have a day job which involves working on POWER8 KVM issues for IBM.  So Alex convinced me to take 6 months off from the day job, and work 4 days a week on pettycoin.

I’m going to be blogging my progress, so expect several updates a week.  The first few alpha releases will be useless for doing any actual transactions, but by the first beta the major pieces should be in place…

June 16, 2014 08:50 AM

June 15, 2014

Michael Kerrisk (manpages): man-pages-3.69 is released

I've released man-pages-3.69. The release tarball is available on kernel.org. The browsable online pages can be found on man7.org. The Git repository for man-pages is available on kernel.org.

Aside from minor improvements and fixes to various pages, the notable changes in man-pages-3.69 are the following:

June 15, 2014 07:33 PM

June 11, 2014

Daniel Vetter: Documentation for drm/i915

So over the past few years the drm subsystem has gained some very nice documentation. And recently we've started to follow suit with the Intel graphics driver. All the kernel documentation is integrated into one big DocBook, and I regularly upload the latest HTML builds of the Linux DRM Developer's Guide. This is built from drm-intel-nightly, so it has (hopefully) slightly fresher documentation than the usual DocBook builds from Linus' main branch, which can be found all over the place. If you want to build these yourself, simply run

$ make htmldocs

For testing we now also have neat documentation for the infrastructure and helper libraries found in intel-gpu-tools. The README in the i-g-t repository has detailed build instructions - gtk-doc is a bit more of a fuss to integrate.

Below the break some more details about documentation requirements relevant for developers.

So from now on I expect reasonable documentation for new, big kernel features and for new additions to the i-g-t library.

For i-g-t the process is simple: add gtk-doc comment blocks to all newly added functions, then install and build with gtk-doc enabled. Done. If the new library is tricky (for example, the pipe CRC support code), a short overview section that references some functions to get people started is useful, but not really required. And with the exception of the still-in-flux kernel modesetting helper library, i-g-t is fully documented, so there are lots of examples to copy from.

For the kernel this is a bit more involved, mostly since kerneldoc sucks more. But we also only just started with documenting the drm/i915 driver itself.

  1. First extract all the code for your new feature into a new file. There's unfortunately no other way to sensibly split up and group the reference documentation with kerneldoc. But at least that will also be a good excuse to review the related interfaces before extracting them.
  2. Create reference kerneldoc comments for the functions used as interfaces to the rest of the driver (see the sketch below). It's always a bit of a judgement call what to document and what not, since compared to the DRM core, where functions must be explicitly exported to drivers, there's no clean separation between the core parts, subsystems, and more mundane platform enabling code. For big and complicated features it's also good practice to have an overview DOC: section somewhere at the beginning of the file.
  3. Note that kerneldoc doesn't have support for markdown syntax (or anything else like that) and doesn't do automatic cross-referencing like gtk-doc. So if your documentation absolutely needs a table or a list, you unfortunately have to do it twice: once as a plain code comment and once as a DocBook marked-up table or list. Long-term we want to improve the kerneldoc markup support, but for now we have to deal with what we have.
  4. As with all documentation don't document the details of the implementation - otherwise it will get stale fast because comments are often overlooked when updating code.
  5. Integrate the new kerneldoc section into the overall DRM DocBook template. Note that you can't go deeper than section2 nesting, otherwise the reference documentation won't be listed and, due to the lack of any autogenerated cross-links, will be inaccessible and useless. Build the HTML docs to check that your overview summary and reference sections have all been pulled in and that the kerneldoc parser is happy with your comments.
A really nice example for how to do this all is the documentation for the gen7 cmd parser in i915_cmd_parser.c.
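
To make the expected format concrete, here's roughly what a kerneldoc comment for an interface function looks like; the function and its parameters are entirely hypothetical:

/**
 * i915_widget_enable - switch on widget frobbing (hypothetical example)
 * @dev_priv: i915 device instance
 * @level: frobbing level, 0 disables widgets entirely
 *
 * Enables widget frobbing on all pipes. Callers must hold a reference
 * on the relevant power domain. Document what the function guarantees,
 * not how it is implemented.
 *
 * Returns: 0 on success, negative error code on failure.
 */
int i915_widget_enable(struct drm_i915_private *dev_priv, int level);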

June 11, 2014 06:51 PM

June 10, 2014

Daniel Vetter: Neat drm/i915 stuff for 3.16

Linus decided to have a bit of fun with the 3.16 merge window and the 3.15 release, so I'm a bit late with our regular look at the new stuff for the Intel graphics driver.
First things first: Baytrail/Valleyview has finally gained support for MIPI DSI panels! This means no more ugly hacks to get machines like the ASUS T100 going for users, and no more promises we can't keep from developers - it landed for real this time around. Baytrail has also seen a lot of polish work in e.g. the infoframe handling, power domain reset, ...

Continuing on the new-hardware front, this release features the first version of our preliminary support for Cherryview. At a very high level this combines a Gen8 render unit derived from Broadwell with a beefed-up Valleyview display block. So a lot of the enabling work boiled down to wiring up existing code, but of course there's also tons of new code to get all the details right. Most of the work has been done by Ville and Chon Ming Lee, with lots of help from other people.

Our modeset code has also seen lots of improvements. The user-visible feature is surely support for large cursors. On high-dpi panels 64x64 simply doesn't cut it, and the kernel (and the latest SNA DDX) now support cursors up to the hardware limit of 256x256. But there have also been a lot of improvements under the hood: more of Ville's infrastructure for atomic pageflips has been merged - slowly all the required pieces, like unified plane updates for modesets, two-stage watermark updates, and atomic sprite updates, are falling into place. Still a lot of work left to do, though. And the modesetting infrastructure has also seen a bit of work with the almost complete removal of the ->mode_set hooks. We need that both for atomic modeset updates and for proper runtime PM support.

On that topic: Runtime power management is now enabled for a bunch of our recent platforms - all the prep work from Paulo Zanoni and Imre Deak in the past few releases has finally paid off. There are still leftovers to be picked up over the coming releases, like proper runtime PM support for DPMS on all platforms, addressing a bunch of crazy corner cases, rolling it out on the newer platforms like Cherryview or Broadwell, and cleaning the code up a bit. But overall we're now ready for what the marketing people call "connected standby", which means that power consumption with all devices turned off through runtime PM should be as low as when doing a full system suspend. It crucially relies upon userspace not sucking and not waking the CPU and devices up all the time, so personally I'm not sure how well this will really work out.

Another piece for proper atomic pageflip support is the universal primary plane support from Matt Roper. Based upon his DRM core work in 3.15, he has now enabled universal primary plane support in i915 properly. Unfortunately the corresponding patches for cursor support missed 3.16, so the universal plane support is still disabled by default. For other atomic modeset work a shout-out goes to Rob Clark, whose locking conversion to wait/wound mutexes for modeset objects has been merged.

On the GEM side Chris Wilson massively improved our OOM handling. We are now much better at surviving a crash against the memory brick wall. And if we don't, and indeed run out of memory, we have much better data to diagnose the reason for the OOM. The top-down PDE allocator from Ben Widawsky better segregates our usage of the GTT and is one of the pieces required before we can enable full ppgtt for production use. And the command parser from Brad Volkin is required for some OpenGL and OpenCL features on Haswell. The parser itself is fully merged and ready, but the actual batch buffer copying to a secure location missed the merge window, and hence it's not yet enabled in permission-granting mode.

The big feature to pop the champagne for, though, is the userptr support from Chris - after years I've finally run out of things to complain about and merged it. This allows userspace to wrap up any memory allocations obtained by malloc() (or anything else backed by normal pages) into a GEM buffer object. Useful for faster uploads and downloads in lots of situations, and currently used by the DDX to wrap X shmem segments. But OpenCL also wants to use it.

We've also enabled a few Broadwell features this time around: eDRAM support from Ben, VEBOX2 support from Zhao Yakui and gpu turbo support from Ben and Deepak S.

And finally there's the usual set of improvements and polish all over the place: GPU reset improvements on gen4 from Ville, prep work for DRRS (dynamic refresh rate switching) from Vandana, tons of interrupt and especially vblank handling rework (from Paulo and Ville) and lots of other things.

June 10, 2014 09:30 AM

June 07, 2014

Rusty Russell: Donation to Jupiter Broadcasting

Chris Fisher’s Jupiter Broadcasting pod/vodcasting started 8 years ago with the Linux Action Show: still their flagship show, and how I discovered them 3 years ago.  Shows like this give access to FOSS to those outside the LWN-reading crowd; community building can be a thankless task, and as a small shop Chris has had ups and downs along the way.  After listening to them for a few years, I feel a weird bond with this bunch of people I’ve never met.

I regularly listen to Techsnap for security news, Scibyte for science with my daughter, and Unfilter to get an insight into the NSA and what the US looks like from the inside.  I bugged Chris a while back to accept bitcoin donations, and when they did I subscribed to Unfilter for a year at 2 BTC.  To congratulate them on reaching the 100th Unfilter episode, I repeated that donation.

They’ve started doing new and ambitious things, like Linux HOWTO, so I know they’ll put the funds to good use!

June 07, 2014 11:45 PM

June 06, 2014

Dave Jones: Monthly Fedora kernel bug statistics – May 2014

                           F19   F20   rawhide   Total
Open:                       88   308       158   (554)
Opened since 2014-05-01:     9   112        20   (141)
Closed since 2014-05-01:    26    68        17   (111)
Changed since 2014-05-01:   49   293        31   (373)

June 06, 2014 04:16 PM

June 05, 2014

Michael Kerrisk (manpages): Linux/UNIX System Programming course scheduled for July

I've scheduled a further 5-day Linux/UNIX System Programming course to take place in Munich, Germany, for the week of 21-25 July 2014.

The course is intended for programmers developing system-level, embedded, or network applications for Linux and UNIX systems, or programmers porting such applications from other operating systems (e.g., Windows) to Linux or UNIX. The course is based on my book, The Linux Programming Interface (TLPI), and covers topics such as low-level file I/O; signals and timers; creating processes and executing programs; POSIX threads programming; interprocess communication (pipes, FIFOs, message queues, semaphores, shared memory); network programming (sockets); and server design.
     
The course has a lecture+lab format, and devotes substantial time to working on some carefully chosen programming exercises that put the "theory" into practice. Students receive a copy of TLPI, along with a 600-page course book containing the more than 1000 slides that are used in the course. A reading knowledge of C is assumed; no previous system programming experience is needed.

Some useful links for anyone interested in the course:

Questions about the course? Email me via training@man7.org.

June 05, 2014 01:38 PM

June 03, 2014

Dave Jones: Daily log June 2nd 2014

Spent the day making some progress on trinity’s “dump buffers after we detect tainted kernel” code, before hitting a roadblock. It’s possible for a fuzzing child process to allocate a buffer to create (for example) a mangled filename. Because this allocation is local to the child, attempts to reconstruct what it pointed to from the dumping process will fail.
I spent a while thinking about this before starting on a pretty severe rewrite of how the existing logging code works. Instead of just writing the description of what is in the registers to stdout/logs, it now writes it to a buffer, which can be referenced directly by any of the other processes. As an unintended consequence, a lot of the output code is actually much easier to read now.
No doubt I’ve introduced a bunch of bugs in the process, which I’ll spend time trying to get on top of tomorrow.
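
A rough sketch of the render-to-buffer idea; the names and sizes are invented for illustration, and the real trinity code differs:

#include <stdarg.h>
#include <stdio.h>

#define OUTPUT_BUF_LEN 4096

/* Lives in shared memory, so the dumping process can read it even
 * after the child that rendered into it is gone. */
struct outputbuf {
   char buf[OUTPUT_BUF_LEN];
   size_t used;
};

static void render(struct outputbuf *o, const char *fmt, ...)
{
   va_list ap;
   int n;

   va_start(ap, fmt);
   n = vsnprintf(o->buf + o->used, sizeof(o->buf) - o->used, fmt, ap);
   va_end(ap);
   /* vsnprintf() returns what it would have written; clamp to the buffer. */
   if (n > 0)
      o->used += (size_t)n < sizeof(o->buf) - o->used
                    ? (size_t)n : sizeof(o->buf) - o->used;
}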

June 03, 2014 03:48 AM

May 28, 2014

Dave Jones: Daily log May 27th 2014

Spent much of the day getting on top of the inevitable email backlog.
Later started cleaning up some of the code in trinity that dumps out syscall parameters to tty/log files.
One of the problems that has come up with trinity is that many people run it with logging disabled, because it’s much faster when you’re not constantly spewing to files or scrolling text. This unfortunately leads to situations where we get an oops and have no idea what actually happened. We have some information that we could dump in that case, but we aren’t dumping it right now.
So I’ve been spending time lately working towards a ‘post-mortem dump’ feature. The two missing pieces are 1. moving the logging functions away from being special-cased to be called from within a child process, so that they can just chew on a syscall record (this work is now almost done), and 2. maintaining a ringbuffer of these records per child (not a huge amount of work once everything else is in place).
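
An illustrative sketch of such a per-child ringbuffer; the names and sizes are made up, not trinity's actual code:

#define RECORDS_PER_CHILD 64

struct syscallrecord {
   unsigned int nr;              /* syscall number */
   unsigned long args[6];
   long retval;
};

struct childrecords {
   struct syscallrecord ring[RECORDS_PER_CHILD];
   unsigned int head;            /* next slot to overwrite */
};

static void record_syscall(struct childrecords *c,
                           const struct syscallrecord *rec)
{
   c->ring[c->head] = *rec;      /* the oldest entry gets overwritten */
   c->head = (c->head + 1) % RECORDS_PER_CHILD;
}

/* On an oops or tainted kernel, the post-mortem dump can walk each
 * child's ring backwards from head to print the most recent syscalls. */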

Hopefully I’ll have this implemented soon, and then I can get back on track with some of the more interesting fuzzing ideas I’ve had for improving.

May 28, 2014 03:11 PM

Michael Kerrisk (manpages): man-pages-3.68 is released

I've released man-pages-3.68. The release tarball is available on kernel.org. The browsable online pages can be found on man7.org. The Git repository for man-pages is available on kernel.org.

Aside from a larger-than-usual set of minor improvements and fixes to various pages, the notable changes in man-pages-3.68 are the following:

May 28, 2014 02:57 PM

May 27, 2014

Rusty Russell: Effects of packet/data sizes on various networks

I was thinking about peer-to-peer networking (in the context of Pettycoin, of course) and I wondered if sending ~1420 bytes of data is really any slower than sending 1 byte on real networks.  Similarly, is it worth going to extremes to avoid crossing over into two TCP packets?

So I wrote a simple Linux TCP ping-pong client and server: the client connects to the server, then loops: it reads until it gets a ’1′ byte, then responds with a single-byte ack.  The server sends data ending in a 1 byte, then reads the response byte, printing out how long it took.  First 1 byte of data, then 101 bytes, all the way up to 9901 bytes.  It does this 20 times, then closes the socket.
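
For reference, the client's inner loop might look something like this; a sketch reconstructed from the description above, not the downloadable source, and fd is assumed to be an already-connected TCP socket:

#include <unistd.h>

static void pingpong_client(int fd)
{
   char c, ack = 0;

   for (;;) {
      /* Read until the server's terminating '1' byte arrives. */
      do {
         if (read(fd, &c, 1) <= 0)
            return;              /* error, or server closed the socket */
      } while (c != '1');
      /* Respond with a single-byte ack; the server times the round trip. */
      if (write(fd, &ack, 1) != 1)
         return;
   }
}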

Here are the results on various networks (or download the source and result files for your own analysis):

On Our Gigabit LAN

Interestingly, we do win for tiny packets, but there’s no real penalty once we’re over a packet (until we get to three packets worth):

[graph: Over the Gigabit LAN]

[graph: Over Gigabit LAN (closeup)]

On Our Wireless LAN

Here we do see a significant decline as we enter the second packet, though extra bytes in the first packet aren’t completely free:

[graph: Wireless LAN (all results)]

[graph: Wireless LAN (closeup)]

Via ADSL2 Over The Internet (Same Country)

Ignoring the occasional congestion from other uses of my home net connection, we see a big jump after the first packet, then another as we go from 3 to 4 packets:

[graph: ADSL over internet in same country]

[graph: ADSL over internet in same country (closeup)]

Via ADSL2 Over The Internet (Australia <-> USA)

Here, packet size is completely lost in the noise; the carrier pigeons don’t even notice the extra weight:

[graph: Wifi + ADSL2 from Adelaide to US]

[graph: Wifi + ADSL2 from Adelaide to US (closeup)]

Via 3G Cellular Network (HSPA)

I initially did this with Wifi tethering, but the results were weird enough that Joel wrote a little Java wrapper so I could run the test natively on the phone.  It didn’t change the resulting pattern much, but I don’t know if this regularity of delay is a 3G or an Android thing.  Here every packet costs, but you don’t win a prize for having a short packet:

[graph: 3G network]

[graph: 3G network (closeup)]

Via 2G Network (EDGE)

This one actually gives you a penalty for short packets!  800 to 2100 bytes is the sweet spot:

[graph: 2G (EDGE) network]

[graph: 2G (EDGE) network (closeup)]

Summary

So if you’re going to send one byte, what’s the penalty for sending more?  Eyeballing the minimum times from the graphs above:

                            Wired LAN  Wireless  ADSL   3G    2G
Penalty for filling packet        30%       15%    5%    0%    0%*
Penalty for second packet         30%       40%   15%   20%    0%
Penalty for fourth packet         60%       80%   25%   40%   25%

* The average for EDGE actually improves by about 35% if you fill the packet

May 27, 2014 10:19 AM

Pete Zaitcev: Next stop, SMT

The little hardware project, mentioned previously, continues to chug along. A prototype is now happily blinking LEDs on a veroboard.

Now it's time to use real technology, so that the thing can be used onboard a car or airplane. Since the heart of the design is a chip in an LGA-14 package, we are looking at soldering a surface-mounted element with a 0.35 mm pitch.

I reached out to members of a local hackerspace for advice, and they suggested I forget what people post to the Internet about wave-soldering in an oven. Instead, buy a cheap 10x microscope, a fine tip for my iron, and some flux and solder paste, then do it all by hand. As Yoda said, there is only do and do not; there is no try.

May 27, 2014 12:50 AM

May 26, 2014

Daniel Vetter: drm/i915 Branches Explained

Since I started maintaining drm/i915 over a year ago, I've fine-tuned the branching model a lot. So I think it's time for an update, especially since Sedat Dilek poked me about it.

There are two main branches I merge patches into, namely drm-intel-fixes and drm-intel-next-queued. drm-intel-fixes contains bugfixes for the current -rc upstream kernels, whereas drm-intel-next-queued (commonly shortened to dinq) is for feature work. The first special thing is that, contrary to other subsystem trees, I don't close dinq while the upstream merge window is open - stalling QA and our own driver feature and review work for usually more than 2 weeks is a bit too disruptive. But the linux-next tree maintained by upstream forbids this, hence why I also have the for-linux-next branch: usually it's a copy of the drm-intel-next-queued branch, except when the merge window is open, when it's a copy of the drm-intel-fixes branch. This way we can keep the process in our team going without upsetting Stephen Rothwell.
Usually once the merge window opens I have a mix of feature work (which then obviously missed the deadline) and bugfixes outstanding (i.e. not yet merged to Dave Airlie's drm branches). Therefore I split out the bugfixes from dinq into -fixes and rebase the remaining feature work in dinq on top of that.

Now for our own nightly QA testing we want a lightweight linux-next just for graphics. Hence there's the drm-intel-nightly branch which currently merges together drm-intel-fixes and -next-queued plus the drm-next and drm-fixes branches from Dave Airlie. We've had some ugly regressions in the past due to changes in the drm core which we've failed to detect for months, hence why we now also include the drm branches.

Unfortunately not all of our QA procedures are fully automated and can be run every night, so roughly every two weeks I try to stabilize dinq a bit and freeze it as drm-intel-next. I also push out a snapshot of -nightly to drm-intel-testing, so that QA has a tree with all the bugfixes included to do the extensive and labour-intensive manual tests. Once that's done (usually takes about a week) and nothing bad popped up I send a pull request to Dave Airlie for that drm-intel-next branch.

There's also still the for-airlied branch around, but since I now use git tags for both -fixes and -next pull requests, I'll probably remove that branch soon.
Finally there's the rerere-cache branch. I have a bunch of scripts which automatically create drm-intel-nightly for me, using git rerere and a set of optional fixup patches. To be able to seamlessly do both at home with my workstation and on the road with my laptop the scripts also push out the latest git rerere data and fixup branches to this branch. When I switch machines I can then quickly sync up all the maintainer branch state to the next machine with a simple script.

Now if you're a developer and wonder on which branch you should base your patches on it's pretty simple: If it's a bugfix which should be merged into the current -rc series it should be based on top of drm-intel-fixes. Feature work on the other hand should be based on drm-intel-nightly - if the patches don't apply cleanly to plain dinq I'll look at the situation and decide whether an explicit backport or just resolving the conflicts is the right approach. Upstream maintainers don't like backmerges, hence I try to avoid them if it makes sense. But that also means that dinq and -fixes can diverge quite a bit and that some of the patches in -fixes would break things on plain dinq in interesting ways. Hence the recommendation to base feature development on top of -nightly.

Addendum: All these branches can be found in the drm-intel git repository.

Update: The drm-intel repository moved, so I've updated the link.

May 26, 2014 07:13 AM

May 25, 2014

Pete Zaitcev: Digital life in 2014

I do not get out often, and my clamshell cellphone incurs a bill of $3/month. So imagine my surprise when, at a dinner with colleagues, everyone pulled out a charger brick and plugged their smartphone into it. Or, actually, one guy did not have a brick with him, so someone else let him tap into a 2-port brick. The same ritual repeated at every dinner!

I can offer a couple of explanations. One is that it is much cheaper to buy a smartphone with a poor battery life and a brick than to buy a cellphone with a decent battery life (decent being >= 1 day, so you can charge it overnight). The other is that market forces are aligned in such a way that there are no smartphones on the market that can last a whole day (anymore).

UPDATE: My wife is a smartphone user and explained it to me. Apparently, sooner or later everyone hits an app which is an absolute energy hog for no good reason, and is unwilling to discard it (in her case it was Pomodoro, but it could be anything). Once that happens, no technology exists to pack enough battery into a smartphone. Or so the story goes. In other words, they buy a smartphone hoping that they will not need the brick, but inevitably they do.

May 25, 2014 03:58 AM

May 23, 2014

Pete Zaitcev: OpenStack Atlanta 2014

The best part, I think, came during the Swift Ops session, when our PTL, John Dickinson, asked for a show of hands. For example -- without revealing any specific proprietary information -- how many people run clusters of less than 10 nodes, 10 to 100, 100 to 1000, etc. He also asked how old the production clusters were, and if anyone deployed Swift in 2014. Most clusters were older, and only 1 man raised his hand. John asked him what were the difficulties setting it up, and the man said: "we had some, but then we paid Red Hat to fix it up for us, and they did, so it works okay now." I felt so useful!

The hardware show was pretty bare, compared to Portland, where Dell and OCP brought out cool things.

HGST showed their Ethernet drive, but they used a chassis so dire, I don't even want to post a picture. OCP did the same this year: they brought a German partner who demonstrated a storage box that looked like it was built in a basement in Kherson, Ukraine, while under siege by Kiev forces.

Here's a poor pic of cute Mellanox Ethernet wares: a switch, NICs, and some kind of modern equivalent of GBICs for fiber.

Interestingly enough, although Mellanox displayed Ethernet only, I heard in other sessions that Infiniband was not entirely dead. Basically if you need to step beyond bonded 10GbE, there's nothing else for your overworked Swift proxies: it's Infiniband or nothing. My interlocutor from SwiftStack implied that a router existed into which you could plug your Infiniband pipe, I think.

May 23, 2014 06:38 PM

Matt Domsch: Web Software Developer role on my Dell Software team

My team in Dell Software continues to grow. I have two critical roles open now, and am looking for fantastic people to join my team. The first, which I posted a few weeks ago, is for a Principal Web Software Engineer, a very senior engineering role. This second one is for a Java web application developer (my team is using JBoss Portal Server / GateIn for the web UI layer). Both give you the chance to enjoy life in beautiful Austin, TX while working on cool projects! Let me know if you are interested.

http://jobs.dell.com/round-rock/engineering/jobid5450194-web-software-engineer-jobs

Web Software Engineer- Round Rock, TX

Dell Software empowers companies of all sizes to experience Dell’s “Power to Do More” by delivering scalable yet simple-to-use solutions to drive value and accelerate results. We are uniquely positioned to address today’s most pressing business and IT challenges with holistic, connected software offerings. Dell Software Group’s portfolio now includes the products and technologies of Quest Software, AppAssure, Enstratius, Boomi, KACE and SonicWALL. For more information, please visit http://software.dell.com/.

This role is for a unique and talented technical contributor in the Dell Software Group responsible for a number of activities required to design, develop, test, maintain and operate web software applications built using J2EE and JBoss GateIn portal.

Role Responsibilities
-Develop server software in the Java environment
-Create, maintain, and manipulate web application content in HTML, CSS, JavaScript, and Java languages
-Unit test web application software on the client and on the server
-Provide guidance and support to application testing and quality assurance teams
-Contribute to process diagrams, wiki pages, and other documentation
-Work in an Agile software development environment
-Work in a Linux environment

Requirements
-Strong technical and creative skills as well as an understanding of software engineering processes, technologies, tools, and techniques
-Red Hat JBoss Enterprise Application Platform (or similar J2EE frameworks), JBoss GateIn, HTML5, CSS, JavaScript, jQuery, SAML, and web security
-3+ years of client or server Java experience
-JBoss Portal Server applications and/or other web applications
-Possess visual web application design skills
-Implementation knowledge in internationalization and localization
-HTML, CSS, JavaScript, RESTful interface design and utilization skills, OAuth or other token-based authentication and authorization

Preferences
-Experience with Atlassian products, continuous integration
-Be willing and able to learn new technologies and concepts
-Demonstrate resourcefulness in resolving technical issues.
-Advanced interpersonal skills; able to work independently and as part of a team.

Life At Dell
Equal Employment Opportunity Policy
Equal Opportunity Employer/Minorities/Women/Veterans/Disabled

Job ID: 14000MXW

May 23, 2014 01:23 PM

Michael Kerrisk (manpages): man-pages-3.67 is released

I've released man-pages-3.67. The release tarball is available on kernel.org. The browsable online pages can be found on man7.org. The Git repository for man-pages is available on kernel.org.

Aside from a larger-than-usual set of minor improvements and fixes to various pages, the notable changes in man-pages-3.67 are the following:

May 23, 2014 07:30 AM

Dave Airlie: DisplayPort MST dock support for Fedora 20 copr repo.

So I've created a copr

http://copr.fedoraproject.org/coprs/airlied/mst/

It contains a kernel + Intel driver that should provide support for DisplayPort MST on Intel Haswell hardware. It doesn't do any of the fancy Dell monitor stuff yet; it's primarily for people who have Lenovo or Dell docks and laptops that currently can't do multihead.

The kernel source is from this branch, which backports a chunk of stuff to v3.14 to support this.

http://cgit.freedesktop.org/~airlied/linux/log/?h=drm-i915-mst-v3.14

It might still have some bugs and crashes, but the basics should in theory work.

May 23, 2014 02:52 AM

Dave Jones: Daily log May 22nd 2014

Found another hugepage VM bug.
This one was VM_BUG_ON_PAGE(PageTail(page), page). The filemap.c bug has gone into hiding since last weekend. Annoying.

On the trinity list, Michael Ellerman reported a couple of bugs that looked like memory corruption since my changes yesterday to make fd registration more dynamic. So I started the day trying out -fsanitize=address, without much success. (I had fixed a number of bugs recently using it.) However, reading the gcc man page, I found out about -fsanitize=undefined, which I was previously unaware of. This is new in gcc 4.9, so it ships with Fedora rawhide, but my test box is running F20. Luckily, clang has supported it even longer.

So I spent a while fixing up a bunch of silly bugs, like (1<<31) where it should have been (1UL<<31), and some not-so-obvious alignment bugs. A dozen or so fixes for bugs that had been there for years. Sadly though, nothing that explained the corruption Michael has been seeing. I spent some more time looking over yesterday’s changes without success. Annoying that I don’t see the corruption when I run it.
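
The shift bug deserves a closer look; here is a minimal sketch of that class of bug, assuming a 32-bit int as on the relevant platforms:

#include <stdio.h>

int main(void)
{
   /* 1 is an int, so 1 << 31 shifts into the sign bit: undefined
    * behaviour, flagged by -fsanitize=undefined. In practice it yields
    * INT_MIN, which then sign-extends into a 64-bit unsigned long. */
   unsigned long bad  = 1 << 31;
   /* 1UL is unsigned long, so the shift is well-defined. */
   unsigned long good = 1UL << 31;

   printf("bad  = %#lx\ngood = %#lx\n", bad, good);
   return 0;
}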

Decided not to take on anything too ambitious for the rest of the day given I’m getting an early start on the long weekend, by taking off tomorrow. Hopefully when I return with fresh eyes on Tuesday I’ll have some idea of what I can do to reproduce the bug.

May 23, 2014 01:46 AM

May 21, 2014

Dave Jones: Daily log May 21st 2014

Fell out of the habit of blogging regularly again.
Recap: Took Monday off to deal with the fallout from a flooded apartment, which happened at 2am Sunday night. Not fun in the least.

Despite this, I’ve found time to do a bunch of trinity work this last week. The biggest highlight is today’s work gutting the generation of file descriptors on startup, moving it away from a nasty tangled mess of gotos and special cases to something more manageable, involving dynamic registration. The code is a lot more readable now, and hopefully as a result more maintainable.

This work will continue over the next few days, extending to the filename cache code. For a while, I’ve felt that maintaining one single filename cache of /dev, /sys & /proc wasn’t so great, so I’ll be moving that to also be a list of caches.

Eventually, the network socket code also needs the same kind of attention, moving it away from a single array of sockets, to per-protocol buckets. But that’s pretty low down my list right now.

Aside from the work mentioned above, there have been lots of commits changing code around to use syscallrecord structs to make things clearer, and making sure functions get passed the parameters they need rather than recomputing them.

May 21, 2014 11:39 PM

Pavel Machek: I love Python

Timetable conversion finally finished. It only ran for 3.5 days...

159645.09user 151296.77system 311825.86 (5197m5.860s) elapsed 99.71%CPU


And as a bonus... did you ever see an ssh man-in-the-middle attack in the wild?

@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@    WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!     @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
Someone could be eavesdropping on you right now (man-in-the-middle
attack)!
It is also possible that the RSA host key has just been changed.
The fingerprint for the RSA key sent by the remote host is
84:ed:69:16:c4:6b:61:49:75:91:84:3a:49:f9:c4:60.
Please contact your system administrator.
Add correct host key in /data/pavel/.ssh/known_hosts to get rid of
this message.
Offending key in /data/pavel/.ssh/known_hosts:137
RSA host key for ....de has changed and you have requested
strict checking.
Host key verification failed.


That was Hotel Kossak in Krakow. Apparently, when your internet connection key expires, they launch a free attack at you as a bonus.

May 21, 2014 10:31 AM

May 19, 2014

Matthew Garrett: The desktop and the developer

I was at the OpenStack Summit this week. The overwhelming majority of OpenStack deployments are Linux-based, yet the most popular laptop vendor (by a long way) at the conference was Apple. People are writing code with the intention of deploying it on Linux, but they're doing so under an entirely different OS.

But what's really interesting is the tools they're using to do so. When I looked over people's shoulders, I saw terminals and a web browser. They're not using Macs because their development tools require them, they're using Macs because of what else they get - an aesthetically pleasing OS, iTunes and what's easily the best trackpad hardware/driver combination on the market. These are people who work on the same laptop that they use at home. They'll use it when they're commuting, either for playing videos or for getting a head start so they can leave early. They use an Apple because they don't want to use different hardware for work and pleasure.

The developers I was surrounded by aren't the same developers you'd find at a technical conference 10 years ago. They grew up in an era that's become increasingly focused on user experience, and the idea of migrating to Linux because it's more tweakable is no longer appealing. People who spend their working day making use of free software (and in many cases even contributing or maintaining free software) won't run a free software OS because doing so would require them to compromise on things that they care about. Linux would give them the same terminals and web browser, but Linux's poorer multitouch handling is enough on its own to disrupt their workflow. Moving to Linux would slow them down.

But even if we fixed all those things, why would somebody migrate? The best we'd be offering is a comparable experience with the added freedom to modify more of their software. We can probably assume that this isn't a hugely compelling advantage, because otherwise it'd probably be enough to overcome some of the functional disparity. Perhaps we need to be looking at this differently.

When we've been talking about developer experience we've tended to talk about the experience of people who are writing software targeted at our desktops, not people who are incidentally using Linux to do their development. These people don't need better API documentation. They don't need a nicer IDE. They need a desktop environment that gives them access to the services that they use on a daily basis. Right now if someone opens an issue against one of their bugs, they'll get an email. They'll have to click through that in order to get to a webpage that lets them indicate that they've accepted the bug. If they know that the bug's already fixed in another branch, they'll probably need to switch to github in order to find the commit that contains the bug number that fixed it, switch back to their issue tracker and then paste that in and mark it as a duplicate. It's tedious. It's annoying. It's distracting.

If the desktop had built-in awareness of the issue tracker then they could be presented with relevant information and options without having to click through two separate applications. If git commits were locally indexed, the developer could find the relevant commit without having to move back to a web browser or open a new terminal to find the local checkout. A simple task that currently involves multiple context switches could be made significantly faster.
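As a rough sketch of what that local lookup could be (purely illustrative: it shells out to git log --grep, which is standard git; a real desktop service would more likely use libgit2 with a persistent index, and this helper itself is hypothetical):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: list local commits whose messages mention a
 * given bug number, by shelling out to git. */
int main(int argc, char **argv)
{
    char cmd[128];

    if (argc != 2 || strspn(argv[1], "0123456789") != strlen(argv[1])) {
        fprintf(stderr, "usage: %s <bug-number>\n", argv[0]);
        return 1;
    }
    /* Only digits reach the shell, so this is injection-safe. */
    snprintf(cmd, sizeof(cmd), "git log --oneline --grep='%s'", argv[1]);
    return system(cmd);
}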

That's a simple example. The problem goes deeper. The use of web services for managing various parts of the development process removes the need for companies to maintain their own infrastructure, but in the process it tends to force developers to bounce between multiple websites that have different UIs and no straightforward means of sharing information. Time is lost to this. It makes developers unhappy.

A combination of improved desktop polish and spending effort on optimising developer workflows would stand a real chance of luring these developers away from OS X with the promise that they'd spend less time fighting web browsers, leaving them more time to get on with development. It would also help differentiate Linux from proprietary alternatives - Apple and Microsoft may spend significant amounts of effort on improving developer tooling, but they're mostly doing so for developers who are targeting their platforms. A desktop environment that made it easier to perform generic development would be a unique selling point.

I spoke to various people about this during the Summit, and it was heartening to hear that there are people who are already thinking about this and hoping to improve things. I'm looking forward to that, but I also hope that there'll be wider interest in figuring out how we can make things easier for developers without compromising other users. It seems like an interesting challenge.


May 19, 2014 02:53 AM

May 18, 2014

Pavel Machek: Python

...easy to program in, but hard to program in when you want your programs to run at a useful speed.

I'm currently fighting with a GTFS converter for timetab (it was running overnight, and that's after I spent yesterday speeding it up 30 times or so; it's still not done). And yes, I realize that Python is probably not the best language for weather nowcasting...

Are there some books / online resources one should read to learn Python? Or is there some different language I should be using? (Yes, I like the language. No, I don't like the whitespace-based syntax. No, I don't like the fact that I have to actually test the application to discover typos in variable names.)

May 18, 2014 06:27 AM

May 15, 2014

Pete Zaitcev: Seagate Kinetic and SMR

In the trivial-things-I-never-noticed department: Kinetic might have technical merit. Due to the long history of vendors selling high-margin snake oil, I was somewhat sceptical of object-addressed storage. However, there was a session with Joe Arnold of SwiftStack and a corporate person from Seagate (with an excessively complicated title), where they mentioned off-hand that all this "intelligent" stuff is actually supposed to help with SMR. As everyone probably knows, shingled drives implement complicated read-modify-write cycles to support the traditional sector-addressed model, and the performance penalty is worse than that of a 4K drive with a 512-byte interface.
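To make the penalty concrete, here is a toy model of a shingled band (the band geometry is invented and real firmware is vastly more elaborate): overwriting one sector forces the drive to read and rewrite everything behind it in the band.

#include <string.h>

#define BAND_SECTORS 16
#define SECTOR_SIZE  512

static char band[BAND_SECTORS][SECTOR_SIZE];

/* Overwriting sector i clobbers the shingled tracks after it, so the
 * drive must read the tail of the band and write it all back. */
static void smr_write(int i, const char *data)
{
    char tail[BAND_SECTORS][SECTOR_SIZE];
    int n = BAND_SECTORS - (i + 1);

    memcpy(tail, &band[i + 1], n * SECTOR_SIZE);   /* read   */
    memcpy(band[i], data, SECTOR_SIZE);            /* modify */
    memcpy(&band[i + 1], tail, n * SECTOR_SIZE);   /* write  */
}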

I cannot help thinking that it would be even better to find a more natural model to expose the characteristics of the drive to the host. Perhaps some kind of "log-structured drive". I bet there are going to be all sorts of overheads in the drive's filesystem that negate the increase in areal density. After all, shingles only give you about 2x. The long-term performance of any object-addressed drive is also in doubt as fragmentation mounts.

BTW, a Seagate guy swore to me that Kinetic is not patent-encumbered and that they really want other drive vendors to jump on the bandwagon.

UPDATE: Jeff Darcy brought up HGST on Twitter. The former-Hitachi guys (owned by WD now) do something entirely different: they allow apps, such as the Swift object server, to run directly on the drive. It's cute, but it does nothing to address the block-addressing API being insufficient for managing a shingled drive. When software runs on the drive, it still has to talk to the rest of the drive somehow, and HGST did not add a different API to the kernel. All it does is kick the can down the road in the hope that a solution comes along.

UPDATE: Wow even Sage.

May 15, 2014 02:35 PM

May 14, 2014

Dave Jones: Daily log May 13th 2014

After yesterday's Trinity release, not too much fallout. Michael Ellerman reported a situation that causes the watchdog to hang indefinitely after the main process has exited. Hopefully this won't affect too many people. It should now be fixed in the git tree anyway; it doesn't seem worth doing another point release over right now.

Thomas Gleixner finally got to the bottom of the futex bugs that Trinity has been hitting for a while.

The only other kernel bugs I’m still seeing over and over are the filemap.c:202 BUG_ON, and a perf bug that has been around for a while.

Sasha seems to be hitting a ton of VM related issues on linux-next. Hopefully some of that stuff gets straightened out before the next merge window.

Daily log May 13th 2014 is a post from: codemonkey.org.uk

May 14, 2014 03:05 AM

May 12, 2014

Dave Jones: Trinity 1.4 release.

As predicted last week, I made the v1.4 tarball release of Trinity today. The run I left going over the weekend seemed OK, so I started prepping a release. Then, at the last minute, I found a bug that would only show up if you had a partial socket cache. This caused Trinity to exit immediately on startup. Why it waited until today to show up is a mystery, but it's fixed now.

There were a few other last-minute "polishing" patches, hushing some of the excessive spew etc., but it's otherwise mostly the same as what I had prepared on Friday.

The weekend test run found the mm/filemap.c:202 BUG_ON again, where we have a page mapped that shouldn’t be. Really need to focus on trying to get a better reproducer for that sometime soon, as it’s been around for a while now.

Trinity 1.4 release. is a post from: codemonkey.org.uk

May 12, 2014 05:20 PM

May 11, 2014

Matthew Garrett: Oracle continue to circumvent EXPORT_SYMBOL_GPL()

Oracle won their appeal regarding whether APIs are copyrightable. There'll be ongoing argument about whether Google's use of those APIs is fair use or not, and perhaps an appeal to the Supreme Court, but that's the new status quo. This will doubtless result in arguments over whether Oracle's implementation of Linux APIs in Solaris 10 was a violation of copyright or not (and presumably Google are currently checking whether they own any code that Oracle reimplemented), but that's not what I'm going to talk about today.

Oracle own some code called DTrace (Wikipedia has a good overview here - what it actually does isn't especially relevant) that was originally written as part of Solaris. When Solaris was released under the CDDL, so was DTrace. The CDDL is a file-level copyleft license with some restrictions not present in the GPL - as a result, combining GPLed code with CDDLed code will (in the absence of additional permission grants) result in a work that is under an inconsistent license and cannot legally be distributed.

Oracle wanted to make DTrace available for Linux as part of their Unbreakable Linux product. Integrating it directly into the kernel would obviously cause legal issues, so instead they implemented it as a kernel module. The copyright status of kernel modules is somewhat unclear. The GPL covers derivative works, but the definition of derivative works is a function of copyright law and judges. Making use of an explicitly exported API may not be sufficient to constitute a derivative work - on the other hand, it might. This is largely untested in court. Oracle appear to believe that their approach is legitimate, and so have added just enough in-kernel code, licensed under the GPL, to support DTrace, while keeping the CDDLed core of DTrace separate.

The kernel actually has two levels of exposed (non-userspace) API - those exported via EXPORT_SYMBOL() and those exported via EXPORT_SYMBOL_GPL(). Symbols exported via EXPORT_SYMBOL_GPL() may only be used by modules that claim to be GPLed, with the kernel refusing to load them otherwise. There is no technical limitation on the use of symbols exported via EXPORT_SYMBOL().

(Aside: this should not be interpreted as meaning that modules that only use symbols exported via EXPORT_SYMBOL() will not be considered derivative works. Anything exported via EXPORT_SYMBOL_GPL() is considered by the author to be so fundamental to the kernel that using it would be impossible without creating a derivative work. Using something exported via EXPORT_SYMBOL() may still result in the creation of a derivative work. Consult lawyers before attempting to release a non-GPLed Linux kernel module.)

DTrace integrates very tightly with the host kernel, and one of the things it needs access to is a high-resolution timer that is guaranteed to monotonically increase. Linux provides one in the form of ktime_get(). Unfortunately for Oracle, ktime_get() is only exported via EXPORT_SYMBOL_GPL(). Attempting to call it directly from the DTrace module would fail.

Oracle work around this in their (GPLed) kernel abstraction code. A function called dtrace_gethrtimer() simply returns the value of ktime_get(). dtrace_gethrtimer() is exported via EXPORT_SYMBOL() and therefore can be called from the DTrace module.
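The pattern looks something like the following minimal sketch. To be clear, this is my reconstruction from the description above, not Oracle's actual source; the return type and the nanosecond conversion are assumptions.

#include <linux/module.h>
#include <linux/ktime.h>

/* Trivial wrapper around the GPL-only ktime_get(), re-exported
 * without the _GPL restriction so a non-GPL module can call it. */
s64 dtrace_gethrtimer(void)
{
    return ktime_to_ns(ktime_get());
}
EXPORT_SYMBOL(dtrace_gethrtimer);

MODULE_LICENSE("GPL");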

So, in the face of a technical mechanism designed to enforce the author's beliefs about the copyright status of callers of this function, Oracle deliberately circumvent that technical mechanism by simply re-exporting the same function under a new name. It should be emphasised that calling an EXPORT_SYMBOL_GPL() function does not inherently cause the caller to become a derivative work of the kernel - it only represents the original author's opinion of whether it would. You'd still need a court case to find out for sure. But if it turns out that the use of ktime_get() does cause a work to become derivative, Oracle would find it fairly difficult to argue that their infringement was accidental.

Of course, as copyright holders of DTrace, Oracle could solve the problem by dual-licensing DTrace under the GPL as well as the CDDL. The fact that they haven't implies that they think there's enough value in keeping it under an incompatible license to risk losing a copyright infringement suit. This might be just the kind of recklessness that Oracle accused Google of back in their last case.


May 11, 2014 03:14 PM

May 09, 2014

Dave Jones: Getting closer to Trinity 1.4

I plan on doing a tarball release of Trinity on Monday. It's been about six months since the last one, and I know I'm overdue for a release because I find myself asking more and more often: "Are you using a build from the tarball, or git?"

Since the last release, it's become a lot better at tickling VM-related bugs, partly due to the tracking and recycling of mmap() results from the child processes, but also thanks to a bunch of related work to improve the .sanitise routines of various VM syscalls. There have also, of course, been a ton of other changes and fixes. I've spent the last week mostly doing last-minute cleanups & fixes, plugging memory leaks, etc., and finally fixing the locking routines to behave as I had originally intended.

So, touch wood, barring any last minute surprises, what’s in git today will be 1.4 on Monday morning.

I'm already looking forward to working on some new features I've had in mind for some time. (Some of these I originally planned to work on as a standalone tool, but after last month's LSF/MM summit, I realised the potential for integrating those ideas into trinity.) More on that stuff next week.

Getting closer to Trinity 1.4 is a post from: codemonkey.org.uk

May 09, 2014 06:00 PM

May 08, 2014

Michael Kerrisk (manpages): man-pages-3.66 is released

I've released man-pages-3.66. The release tarball is available on kernel.org. The browsable online pages can be found on man7.org. The Git repository for man-pages is available on kernel.org.

Aside from a number of minor improvements and fixes to various pages, the notable changes in man-pages-3.66 are the following:

May 08, 2014 08:43 PM

Michael Kerrisk (manpages): man-pages-3.65 is released

I've released man-pages-3.65. The release tarball is available on kernel.org. The browsable online pages can be found on man7.org. The Git repository for man-pages is available on kernel.org.

Aside from a number of minor improvements and fixes to various pages, the notable changes in man-pages-3.65 are the following:

May 08, 2014 08:21 PM

Pete Zaitcev: Mental note

I went through the architecture manual for Ceph and jotted down a few ideas that could be applied to Swift. The biggest one is that we could benefit from some kind of massive proxy, or a PACO setup.

Unfortunately, I see problems with a large PACO. Memcached efficiency will nosedive, for one. But also, how are we going to make sure clients are spread right? There's no cluster map, and thus clients can't know which proxy in a PACO setup is closer to the data's location. In fact, we deliberately prevent them from knowing too much. They don't even know the cluster's partition size.

The reason this matters is that I feel EC will increase the CPU requirements on proxies, which are CPU-bound in most clusters already. Of course, what I feel may not be what actually occurs, so maybe it does not matter.

May 08, 2014 07:25 PM

Rusty Russell: BTC->BPAY gateway (for Australians)

I tested out livingroomofsatoshi.com, which lets you pay any BPAY bill (see explanation from reddit).  Since I’d never heard of the developer, I wasn’t going to send anything large through it, but it worked flawlessly.  At least the exposure is limited to the time between sending the BTC and seeing the BPAY receipt.  Exchange rate was fair, and it was a simple process.

Now I need to convince my wife we should buy some BTC for paying bills…

May 08, 2014 02:55 AM

May 07, 2014

Dave Jones: Random link dump

A random link dump of some videos related to performance, scalability, and code optimization that I found interesting recently.

Random link dump is a post from: codemonkey.org.uk

May 07, 2014 09:48 PM

Daniel Vetter: LinuxTag 2014

Tomorrow I'll be travelling to LinuxTag in Berlin for the first time. It should be pretty cool, and to top it off I'll be giving a presentation about the state of the Intel kernel graphics driver. For those who can't attend, I've already uploaded the slides, and if a video recording is made I'll link to it as soon as it's available.

May 07, 2014 09:45 PM

May 05, 2014

Dave Jones: Coverity, heartbleed, and the kernel.

I was a little surprised last week when I ran the Coverity scan on the 3.15-rc3 kernel and 118 new issues appeared. Usually, once we're past -rc1, the number of new issues between builds is in the low double figures. It turned out to be due to Coverity's new checks, which flag byte swaps as potential sources of tainted data.

There’s a lot to go through, but from what I’ve seen so far, there hasn’t been anything as scary as what happened with OpenSSL. Though my unfamiliarity with some of the drivers may mean they could turn out to be something that needs fixing even if they aren’t exploitable.

What is pretty neat, though, is that the scanner doesn't just detect calls to the usual byte-swapping routines; it even picks up on open-coded variants, such as:

n_blocks = (p_src[0] << 8) | p_src[1];

The tricky part is determining, in various drivers, whether (e.g., in the example above) 'p_src' is actually something a user can control, or whether it's just reading something back from hardware. Quite a few of these seem to be in things like video BIOS parsers, firmware loaders, etc. Even the handling of packets from a joystick involves byte-swapping.

Even though the hardware (should) limit the range of values these routines see, we do seem to do range checking in the kernel in most cases. In the cases where we don't, of those I've seen so far, I'm not sure we care, as the code looks like it would still work correctly.
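For illustration, here's the shape of that pattern as a standalone userspace program; MAX_BLOCKS and the buffer contents are invented for the example.

#include <stdint.h>
#include <stdio.h>

#define MAX_BLOCKS 1024  /* invented upper bound for the example */

int main(void)
{
    uint8_t p_src[2] = { 0x01, 0x34 };  /* pretend this came from userspace */

    /* Open-coded big-endian 16-bit load -- what the scanner flags. */
    uint16_t n_blocks = (uint16_t)((p_src[0] << 8) | p_src[1]);

    /* The range check that makes the tainted value safe to use. */
    if (n_blocks > MAX_BLOCKS) {
        fprintf(stderr, "bogus block count %u\n", (unsigned)n_blocks);
        return 1;
    }
    printf("n_blocks = %u\n", (unsigned)n_blocks);
    return 0;
}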

Somewhat surprising was how many of these new warnings turned up in mostly unmaintained or otherwise crappy parts of the kernel. ISDN. Staging wireless drivers. Megaraid. The list goes on.

Regardless of whether or not these new checks find bugs, re-reviewing some of this code can't really be a bad thing, as a lot of it probably never had a lot of review when it was merged 10-15 years ago.

Coverity, heartbleed, and the kernel. is a post from: codemonkey.org.uk

May 05, 2014 06:24 PM

May 02, 2014

Valerie Aurora: Here’s my favorite operating systems war story, what’s yours?

Val in her Dr Pepper days

When I was working on my operating systems project in university, I stayed up all night for weeks feverishly rebooting my test machine, hoping that THIS time my interrupt handler changes worked. I survived on a diet of raisin bagels, baby carrots, and Dr Pepper, and only left the computer lab to shower and sleep for a few hours before stumbling back in and opening up EMACS again. I loved it.

Julia Evans’ blog posts about writing an operating system in Rust at Hacker School are making me miss the days when I thought everything about operating systems was magical. Julia is (a) hilarious, (b) totally honest, (c) incredibly enthusiastic about learning systems programming. (See “What does the Linux kernel even do?”, “After 5 days, my OS doesn’t crash when I press a key”, and “After 6 days, I have problems I don’t understand at all.”) I’m sure somewhere on Hacker News there is a thread getting upvoted about how Julia is (a) faking it, (b) a bad programmer, (c) really a man, but here in the real world she’s making me and a lot of other folks nostalgic for our systems programming days.

Yesterday’s post about something mysteriously zeroing out everything past about 12K in her binary reminded me of one of my favorite OS debugging stories. Since I’m stuck at home recovering from surgery, I can’t tell it to anyone unless I write a blog post about it.

VME crate (CC-BY-SA Sergio.ballestrero at en.wikipedia)

In 2001, I got a job maintaining the Linux kernel for the (now defunct) Gemini subarchitecture of the PowerPC. The Gemini was an “embedded” SMP board in a giant grey metal VME cage with a custom BIOS. Getting the board in and out of the chassis required brute strength, profanity, and a certain amount of blood loss. The thing was a beast – loud and power hungry, intended for military planes and tanks where no one noticed a few extra dozen decibels.

The Gemini subarchitecture had not had a maintainer, or even been booted, in about 6 months of kernel releases. This did not stop a particularly enthusiastic PowerPC developer from tinkering extensively with the Gemini-specific bootloader code, which was totally untestable without the Gemini hardware. With a sinking heart, I compiled the latest kernel, tftp’d it to the VME board, and told the BIOS to boot it.

It booted! Wow! What are the chances? Flushed with success, I made some minor cosmetic change and rebooted with the new kernel. Nothing, no happy printk’s scrolling down the serial console. Okay, somehow my trivial patch broke something. I booted the old binary. Still nothing. I thought for a while, made some random change, and booted again. It worked! Okay, this time I will reboot right away to make sure it is not a fluke. Reboot. Nothing. I guess it was a fluke. A few dozen reboots later, I went to lunch, came back, and tried again. Success! Reboot. Failure. Great, a non-deterministic bug – my favorite.

Eventually I noticed that the longer the machine had been powered down before I tried to boot, the more likely it was to boot correctly. (I turned the VME cage off whenever possible because of the noise from the fans and the hard disks, which were those old SCSI drives that made a high-pitched whining noise that bored straight through your brain.) I used the BIOS to dump the DRAM (memory) on the machine and noticed that each time I dumped the memory, more and more bits were zeroes instead of ones. Of course I knew intellectually that DRAM loses data when you turn the power off (duh), but I had never followed it through to the realization that the memory would gradually turn to zeroes as the electrons trickled out of their tiny holding pens.

So I used the BIOS to zero out the section of memory where I loaded the kernel, and it booted – every time! After that, it didn’t take long to figure out that the part of the bootloader code that was supposed to zero out the kernel’s BSS section had been broken by our enthusiastic PowerPC developer. The BSS is the part of the binary that contains variables that are initialized to zero at the beginning of the program. To save space, the BSS is not usually stored as a string of zeroes in the binary image, but initialized to zero after the program is loaded but before it starts running. Obviously, it causes problems when variables that are supposed to be zero are something other than zero. I fixed the BSS zeroing code and went on to the next problem.
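For the curious, the fix boils down to a loop like the following sketch (not the original PowerPC code; __bss_start and __bss_stop are the linker-provided symbol names current Linux kernels use for the section boundaries):

/* Linker-provided markers for the BSS section. */
extern char __bss_start[], __bss_stop[];

/* Zero the BSS before C code runs, so zero-initialized variables
 * really are zero regardless of what was left in DRAM. */
static void clear_bss(void)
{
    char *p;

    for (p = __bss_start; p < __bss_stop; p++)
        *p = 0;
}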

This bug is an example of what I love about operating systems work. There’s no step-by-step algorithm to figure out what’s wrong; you can’t just turn on the debugger and step through till you see the bug. You have to understand the computer software and hardware from top to bottom to figure out what’s going wrong and fix it (and sometimes you need to understand quite a bit of electrical engineering and mathematical logic, too).

If you have a favorite operating system debugging story to share, please leave a comment!

Updated to add: Hacker News had a strangely on-topic discussion about this post with lots more great debugging stories. Check it out!


Tagged: hacker school, kernel, linux, programming

May 02, 2014 08:18 PM

Dave Jones: Daily log May 1st 2014

Rough couple of days. Didn’t get much done yesterday due to food poisoning.
Still chasing the futex bug I started hitting a few days ago.

More polishing towards a point release of Trinity. I fixed some code in the locking routines that had never really worked, by rewriting it. After spending an hour or so trying to iron out bugs in my rewritten code, I realized things could be done in a much simpler manner, and in a few hundred lines less code. As a result the locking routines got simpler, and the watchdog code got some long-overdue cleanup.

If things shake out well over the next few days, I’ll call this 1.4 on Monday.

Daily log May 1st 2014 is a post from: codemonkey.org.uk

May 02, 2014 03:40 AM

May 01, 2014

Pete Zaitcev: Get off my lawn

My only contact with erasure codes previously was in the field of data transmission (naturally). While a student at Lomonosov MSU, I worked at a company that developed a network for PCs, called "Micross" and led by Andrey Kinash, IIRC. Originally it ran on top of MS-DOS 3.30, and was ported to 4.01 and later versions over time.

Ethernet was absurdly expensive in the country back in 1985, so the hardware used the built-in serial port. A small box with a relay attached the PC to a ring, around which a software token was circulated. The primary means of keeping the ring's integrity was the relay, controlled by the DTR signal. In case that was not enough, the system implemented a double-ring isolation not unlike FDDI's.

The baud rate was 115.2 kbit/s, and that interfered with MS-DOS. I should note that notionally the system permitted data transfer while the PCs ran usual office applications. But even if a PC was otherwise idle, a hit by the system timer would block out interrupts long enough to drop 2 or 3 characters. The solution was to implement a kind of erasure coding, which recovered not only corrupted but also lost data. The math and the implementation in Modula-2 were done by Mr. Vladimir Roganov.
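Roganov's Modula-2 is long gone, but the flavor of the trick can be shown with the simplest possible erasure code: a single XOR parity byte rebuilding one byte whose position is known to have been dropped. A toy in C, not the actual scheme:

#include <stdio.h>
#include <stdint.h>

int main(void)
{
    uint8_t block[4] = { 'd', 'a', 't', 'a' };
    uint8_t parity = block[0] ^ block[1] ^ block[2] ^ block[3];
    int lost = 2;               /* receiver knows which byte was dropped */
    uint8_t rebuilt = parity;
    int i;

    /* XOR the surviving bytes back out of the parity. */
    for (i = 0; i < 4; i++)
        if (i != lost)
            rebuilt ^= block[i];

    printf("recovered '%c'\n", rebuilt);   /* prints 't' */
    return 0;
}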

I remember that at the time, the ability to share directories over the LAN without a dedicated server was what most impressed the customers, but I thought that Roganov's EC was perhaps the most praiseworthy part, from a nerdiness perspective. Remember that all of that ran on a 12 MHz 80286.

May 01, 2014 06:19 PM

Dave Jones: Monthly Fedora kernel bug statistics – April 2014

                            19    20   rawhide   (total)
Open:                      104   234     153      (491)
Opened since 2014-04-01:     7    81      29      (117)
Closed since 2014-04-01:    15    46      18       (79)
Changed since 2014-04-01:   26   107      42      (175)

Monthly Fedora kernel bug statistics – April 2014 is a post from: codemonkey.org.uk

May 01, 2014 04:50 PM

April 30, 2014

Dave Jones: Daily log April 29th 2014

Started the day by hitting an assertion in the rtmutex-debug code.

Fixed a long-standing bug in Trinity's memory allocation routines.
On occasion, it would spew messages when it failed to allocate even a single page of memory. I've been meaning to dig into this for a while, wondering if there was a memory leak somewhere. The actual cause was a lot dumber: if we randomly happened to call mlockall() in a child process, then after a while we'd fill up memory, because none of those allocations could be paged out. Now that Trinity does multi-megabyte allocations when it tries to map huge pages, this is a problem.
The answer, I decided, was to call munlockall() if the allocator fails, and then retry. It seems to be holding up so far.
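In rough outline, the retry looks like this (a sketch of the approach as described, not Trinity's actual code; the helper name is made up):

#include <stdlib.h>
#include <sys/mman.h>

/* If an allocation fails, assume a child may have called mlockall()
 * and pinned everything; unlock and retry once. */
static void *alloc_with_retry(size_t len)
{
    void *p = malloc(len);

    if (!p) {
        munlockall();
        p = malloc(len);
    }
    return p;
}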

I’m spending the rest of the week wrapping up loose ends like this, and then cutting a new point release, as it’s been six months or so since the last one, and I want to try some more experimental things with file I/O soon, which might upset its ability to trigger some of the existing kernel bugs for a while.

Daily log April 29th 2014 is a post from: codemonkey.org.uk

April 30, 2014 04:12 AM

April 29, 2014

James Morris: Linux Security Summit 2014 (Chicago) -– Call for Participation

The CFP for the 2014 Linux Security Summit is announced.

LSS 2014 will be co-located with LinuxCon North America in Chicago, on the 18th and 19th of August.  We’ll also be co-located with the Kernel Summit this year.

Note that, as always, we’re looking for participation from the general Linux community — not just kernel people, and not just developers.  We’re interested in hearing about feedback from users, and discussing what kinds of security problems we need to be addressing into the future.

This year, we’re looking for discussion topics as well as paper presentations, so if you have anything interesting to talk about, send in a proposal.

The CFP closes on 21st June (extended from the original 6th June deadline).

April 29, 2014 11:41 AM

April 27, 2014

Pete Zaitcev: gEDA


For the next stage, I switched from Fritzing to gEDA and was much happier for it. Certainly, gEDA is old-fashioned. The UI of gschem is really weird, and even has problems dealing with click-to-focus. But it works.

I put the phase onto a Veroboard, as I'm not ready to deal with ordering PCBs yet.

April 27, 2014 01:41 AM

April 26, 2014

Matt Domsch: Job opening on my team in Dell Software

One of the things I’ve started to enjoy as a people manager at Dell, more than as an “individual contributor”, is that I have a lot of say in what skills my team needs, and can look for people I really want to have on my team. I’ve got one such opening posted now, for a Principal Web Software Engineer. Please contact me if you or someone you know would be a good fit.

Principal Web Software Engineer- Round Rock, TX

Dell Software empowers companies of all sizes to experience Dell’s “Power to Do More” by delivering scalable yet simple-to-use solutions to drive value and accelerate results. We are uniquely positioned to address today’s most pressing business and IT challenges with holistic, connected software offerings. Dell Software Group’s portfolio now includes the products and technologies of Quest Software, AppAssure, Enstratius, Boomi, KACE and SonicWALL. For more information, please visit http://software.dell.com/.

Come work on a dynamic team creating software that impacts enterprise customers on a day-to-day basis. This is a great opportunity to work on cutting-edge technology using evolving development tools and methods.

This role is for a unique and talented technical contributor in the Dell Software Group, responsible for a range of activities required to design, develop, test, maintain, and operate web software applications built using J2EE and the JBoss GateIn portal.
The position requires strong technical and creative skills, as well as an understanding of software engineering processes, technologies, tools, and techniques, including: Red Hat JBoss Enterprise Application Platform (or similar J2EE frameworks), JBoss GateIn, HTML5, CSS, JavaScript, jQuery, SAML, and web security.

Role Responsibilities
-Develop high-level and low-level design documents for modules of the web application
-Provide technical leadership to a team of software engineers
-Develop server software in the Java environment
-Create, maintain, and manipulate web application content in HTML5, CSS, and JavaScript
-Unit test web application software on the client and on the server
-Provide guidance and support to application testing and quality assurance teams
-Support field sales on specific customer engagements
-Work in an Agile software development environment
-Work in a Linux environment
-Experience with Atlassian products and continuous integration
-Experience with Integration Platform-as-a-Service (Dell Boomi)

Requirements
-8+ years of experience designing and developing Java-based web applications
-Have a deep understanding of the components and capabilities of J2EE and be able to help make architectural decisions on component choices
-Possess visual web application design skills; implementation knowledge in internationalization and localization, HTML, CSS, JavaScript, RESTful interface design and utilization, and OAuth or other token-based authentication and authorization
-Willing and able to learn new technologies and concepts
-Demonstrate resourcefulness in resolving technical issues
-Advanced interpersonal skills; able to work independently and as part of a team

Job ID: 14000IXI

April 26, 2014 05:43 AM

April 25, 2014

Dave Jones: Weekly Fedora kernel bug statistics – April 25th 2014

                            19    20   rawhide   (total)
Open:                      102   223     148      (473)
Opened since 2014-04-18:     1    25       3       (29)
Closed since 2014-04-18:     3    16       3       (22)
Changed since 2014-04-18:    7    38      15       (60)

Weekly Fedora kernel bug statistics – April 25th 2014 is a post from: codemonkey.org.uk

April 25, 2014 04:29 PM