Kernel Planet

July 21, 2016

Pavel Machek: Something usable for N9/N950?

I have both N9 and N950 here. Unfortunately, both are useless, as developer mode can not be enabled (because Nokia servers are down?).

     
There's a tutorial on how to install Sailfish. I'd love that. The necessary images for N9 are no longer available. N950 images are still there, but... the tutorial needs developer mode, and that's no longer possible.
https://www.youtube.com/watch?v=bRncXK6dmMc&feature=youtu.be
Is there any way out of this for me?
Would it be possible to download image of working N9/N950 from some device, so that I could flash it?

July 21, 2016 08:54 PM

July 20, 2016

Andy Grover: Recording from my talk on Rust features used by Froyo

I spoke at PDXRust this month. A recording of the talk is available here.

July 20, 2016 06:28 PM

July 19, 2016

Dave Airlie: radv: initial hacking on a vulkan driver for AMD VI GPUs

(email sent to mesa-devel list).

I was waiting for an open source driver to appear when I realised I should really just write one myself, some talking with Bas later, and we decided to see where we could get.

This is the point at which we were willing to show it to others, it's not really a vulkan driver yet, so far it's a vulkan triangle demos driver.

It renders the tri and cube demos from the vulkan loader,
and the triangle demo from Sascha Willems demos
and the Vulkan CTS smoke tests (all 4 of them one of which draws a triangle).

There is a lot of work to do, and it's at the stage where we are seeing if anyone else wants to join in at the start, before we make too many serious design decisions or take a path we really don't want to.

So far it's only been run on Tonga and Fiji chips I think, we are hoping to support radeon kernel driver for SI/CIK at some point, but I think we need to get things a bit further on VI chips first.

The code is currently here:
https://github.com/airlied/mesa/tree/semi-interesting

There is a not-interesting branch which contains all the pre-history which might be useful for someone else bringing up a vulkan driver on other hardware.

The code is pretty much based on the Intel anv driver, with the winsys ported from gallium driver,
and most of the state setup from there. Bas wrote the code to connect NIR<->LLVM IR so we could reuse it in the future for SPIR-V in GL if required. It also copies AMD addrlib over, (this should be shared).

Also we don't do SPIR-V->LLVM direct. We use NIR as it has the best chance for inter shader stage optimisations (vertex/fragment combined) which neither SPIR-V or LLVM handles for us, (nir doesn't do it yet but it can).

If you want to submit bug reports, they will only be taken seriously if accompanied by working patches at this stage, and we've no plans to merge to master yet, but open to discussion on when we could do that and what would be required.

July 19, 2016 07:59 PM

Pavel Machek: Google play store playing with GPS?

Microsoft broke my father's computer: it made him update to Windows 10, when Windows 10 can not use two of 3 USB ports. Ouch.
Now it seems Google is trying to break my LG E720 cellphone. I did the right steps: it is running Cyanogen. One method is "we have great update for google Talk". Unfortunately, the update is 20MB or more, basically filling internal flash and making the phone unusable. And you can't disable the update completely, you can only avoid clicking on it.
Now, Google seems to have invented new method of breaking my phone: Google Play Store started activating GPS. It does it on background. I tried downgrading to factory version of the play store, and not accepting its terms and conditions, but somehow I think it will come back.
Google Play Store definitely behaves like malware now. Unfortunately, on Android, it is hard to tell if malware-like behaviour is intentional by Google or if I have some third-party malware on the phone, too.

July 19, 2016 07:32 AM

Daniel Vetter: New Blog Engine!

I finally unlazied and moved my blog away from the Google mothership to something simple, fast and statically generated. It’s built on Jekyll, hosted on github. It’s not quite as fancy as the old one, but with some googling I figured out how to add pages for tags and an archive section, and that’s about all that’s really needed.

Comments are gone too, because I couldn’t be bothered, and because everything seems to add Orwellian amounts of trackers. Ping me on IRC, by mail or on twitter instead. The share buttons are also just plain links now without tracking for Twitter (because I’m there) and G+ (because all the cool kernel hackers are there, but I’m not cool enough).

And in case you wonder why I blather for so long about this change: I need a new blog entry to double check that the generated feeds are still at the right spots for the various planets to pick them up …

July 19, 2016 12:00 AM

July 17, 2016

Michael Kerrisk (manpages): man-pages-4.07 is released

I've released man-pages-4.07. The release tarball is available on kernel.org. The browsable online pages can be found on man7.org. The Git repository for man-pages is available on kernel.org.

This release resulted from patches, bug reports, reviews, and comments from around 50 contributors. The release includes changes to over 140 man pages. Among which the more significant changes in man-pages-4.07 are the following:

July 17, 2016 07:20 PM

Michael Kerrisk (manpages): man-pages-4.06 is released

I've released man-pages-4.06. The release tarball is available on kernel.org. The browsable online pages can be found on man7.org. The Git repository for man-pages is available on kernel.org.

This release resulted from patches, bug reports, reviews, and comments from around 20 contributors. The release includes changes to just over 40 man pages. Among which the more significant changes in man-pages-4.06 are the following:

July 17, 2016 06:27 PM

July 16, 2016

Pavel Machek: Updating userland on phone is... fun.

So I updated debian on my N900, because having different version on phone and PC sucks.
It booted, after I added "rw" to kernel command line, which is small miracle.
But X are without usable mouse, without panel, and without reasonable keymap. In particular, I don't have numbers, or any symbols besides ',.<>'. But I have root shell. Now I need to enable remote access somehow... which will be fun because sshd disappeared.
emacs to the rescue. By editing existing script, I can get access to characters such as '/', and launch telnetd. Installing sshd was easy with nmtui. Good.
Now, it seems something changed in the X land, and "xinput --set-prop --type=int 8 249 0 1" no longer works, which means touchscreen is "upside down". If you know the replacement commands, let me know...

July 16, 2016 06:42 AM

July 13, 2016

Pavel Machek: Ad blocking: yes, its war now

idnes.cz: they put moving advertisements on their website, making browsers unusable -- the ads eat 100% CPU and pages lag when scrolling. They put video ads inside text that appear when you scroll. They have video ads including audio... (The advertisement for the Olympic games is particularly nasty: on a Core Duo, it also raises power consumption by something like 30W.) Then they are surprised by adblock and complain with a popup when they detect one. I guess I am either looking for a better news source, or for the next step in the adblock war...
...hmm. Perhaps time to disable javascript for such pages? No. Javascript is needed for video play back. But this seems to do the trick:

@@||1gr.cz/js/ad$domain=idnes.cz
@@||1gr.cz/js/ad$domain=mobil.idnes.cz
@@||1gr.cz/js/ad$domain=technet.idnes.cz
@@||1gr.cz/js/ad$domain=auto.idnes.cz
g.cz##.advert-fallback.show-fallback

Ok, seems Richard Stallman was right once again, and non-free javascript is a problem. News websites now commonly serve javascript that goes against interests of users.
Now, disabling javascript completely solves the problem, but it also breaks video playback. Is it time to create a whitelist of reasonable javascript code?

July 13, 2016 08:17 AM

July 12, 2016

Pavel Machek: Facebook spyware push and unexpected N900 advantages

Dear Facebook. I'm aware that your Facebook lite is only 1MB. It is also dangerous spyware. You try to push it to me every time I attempt to use m.facebook.com. Would not it be nice if you avoided pushing your spyware to phones that can not handle it? Yes, that's right, your spyware requires too new android, so my phone is actually immune from it.

At least N900 seems to be immune from this problem, you don't even try to push your crap there. Good.

July 12, 2016 08:24 PM

Pavel Machek: Fun with the camera (on N900)

Ivaylo Dimitrov got camera to work on N900. Great. Unfortunately... there's works and works.
Working /dev/video* devices does not mean you get useful photos easily. I managed to get flash and autofocus hardware to work... But that still does not mean useful photos.
It seems that http://fcam.garage.maemo.org/fcamera.html is the best way to get the photos. Now... having phone act as a ssh server on local network, with gcc and development tools installed is very convenient... but sometimes you forget that this is not a powerful machine you normally work at.
Oh and BTW... is someone maintaining fcam-dev? Last update in svn is in 2011... I guess I'm maintaining it now...

Auto exposure is quite tricky to do right, and fcam-dev relied on some kernel patches to provide adequate performance. (If someone has pointer to the patch, that would be nice, BTW.) But... people were taking photos before auto exposure was available. And n900 has light sensor, so we can actually use it to bootstrap the auto exposure, making it slightly faster. Seems to work now. I had auto focus working with extremely ugly hacks... but I guess it will be easier to just let user focus manually. The camera is small enough that it should be feasible.

I played with the camera cover, and if you open it in the dark, it starts the flashlight (because that's what you probably want). If you open it with enough light, it will start camera for you. I feel proud.

N900 is a "little" loud in headphones. And it still plays PCM when mixer is set to 0. Weird. But probably topic for another investigation.

Charging (etc) on N900 is still funny. If you poweroff while charging, it will keep charging. That's probably a Linux bug. Charge counter (battery percent) are "kept" even if you replace the battery. Ouch. (So you replace empty battery with full one and still get empty reading. I guess normal people don't have 3 batteries for their phones?) That may be a hardware bug.
You can set charge voltage to more than 4.2V. That will make battery catch fire. I wonder what it is useful for, I guess when you want to have fun on the airplane? /sys/class/power_supply/bq24150a-0/registers is not exactly one value per file. Not sure about mode. model_name is definitely not one value per file.

dx.com 406496 : USB power meter. Seems to work quite well (I compared its ammeter with another one, and they agreed to within +-10%, so the numbers do not look completely useless), and is very useful for debugging power issues. But it seems that USB power supplies often provide about half of their rated current...

July 12, 2016 08:14 PM

July 10, 2016

Paul E. Mc Kenney: Stupid RCU Tricks: An Early-1970s Example

Up to now, I have been calling out Kung's and Lehman's classic 1980 “Concurrent Manipulation of Binary Search Trees” as the oldest mention of something vaguely resembling RCU.

However, while looking into the history of reference counting, I found Weizenbaum's positively antique 1963 “Symmetric List Processor (SLIP)”, which describes a list-processing library, written in FORTRAN, no less. One of its features was storage reclamation based on reference counting, in which newly reclaimed list items are added to the end of SLIP's list of available space.

And, on page 413 of his 1973 “The Art of Computer Programming: Fundamental Algorithms”, none other than Donald Knuth points out that this add-at-end implementation means that “incorrect programs run correctly for awhile”. Here, “incorrect programs” is presumably referring to readers traversing SLIP lists while failing to increment the needed SLIP reference counters.

So it is that the earliest known mention of RCU-based algorithms dismisses them all as buggy. And rightfully so, given that SLIP has no notion of anything like a grace period. Thus, Kung and Lehman remain the first known implementers of something vaguely resembling RCU—but no longer the first mention!
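Knuth's remark is easy to reproduce with a toy model. The sketch below is not SLIP's code; `Cell`, `Pool` and `reads_before_clobber` are made-up names, and the only property carried over from the description above is that reclaimed cells go to the end of the free list. A buggy reader that keeps a pointer without taking a reference keeps seeing valid data until the allocator has worked its way through every other free cell.

```python
from collections import deque


class Cell:
    """A list cell with a payload and a reference count."""

    def __init__(self, ident):
        self.ident = ident
        self.value = None
        self.refcount = 0


class Pool:
    """Toy allocator: reclaimed cells are appended to the END of the free list."""

    def __init__(self, size):
        self.free = deque(Cell(i) for i in range(size))

    def alloc(self, value):
        cell = self.free.popleft()  # the oldest free cell is reused first
        cell.value = value
        cell.refcount = 1
        return cell

    def release(self, cell):
        cell.refcount -= 1
        if cell.refcount == 0:
            self.free.append(cell)  # add-at-end reclamation


def reads_before_clobber(pool_size):
    """Count the allocations needed before a stale pointer sees new data."""
    pool = Pool(pool_size)
    cell = pool.alloc("original")
    stale = cell        # buggy reader: keeps a pointer, never bumps refcount
    pool.release(cell)  # owner drops its reference; cell joins the free list tail
    allocations = 0
    while stale.value == "original":
        pool.alloc("new data %d" % allocations)
        allocations += 1
    return allocations
```

With a pool of 100 cells the stale reader survives 99 further allocations and is only clobbered by the 100th, which is the "runs correctly for awhile" behaviour. Nothing in this model tells the allocator when readers are done, which is the missing grace period.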

July 10, 2016 09:34 PM

July 09, 2016

Matthew Garrett: "I received a free or discounted product in return for an honest review"

My experiences with Amazon reviewing have been somewhat unusual. A review of a smart switch I wrote received enough attention that the vendor pulled the product from Amazon. At the time of writing, I'm ranked as around the 2750th best reviewer on Amazon despite having a total of 18 reviews. But the world of Amazon reviews is even stranger than that, and the past couple of weeks have given me some insight into it.

Amazon's success is fairly phenomenal. It's estimated that there's over 50 million people in the US paying $100 a year to get free shipping on Amazon purchases, and combined with Amazon's surprisingly customer friendly service there's a lot of people with a very strong preference for choosing Amazon rather than any other retailer. If you're not on Amazon, you're hurting your sales.

And if you're an established brand, this works pretty well. Some people will search for your product directly and buy it, leaving reviews. Well reviewed products appear higher up in search results, so people searching for an item type rather than a brand will still see your product appear early in the search results, in turn driving sales. Some proportion of those customers will leave reviews, which helps keep your product high up in the results. As long as your products aren't utterly dreadful, you'll probably maintain that position.

But if you're a brand nobody's ever heard of, things are more difficult. People are unlikely to search for your product directly, so you're relying on turning up in the results for more generic terms. But if you're selling a more generic kind of item (say, a Bluetooth smart bulb) then there's probably a number of other brands nobody's ever heard of selling almost identical objects. If there's no reason for anybody to choose your product then you're probably not going to get any reviews and you're not going to move up the search rankings. Even if your product is better than the competition, a small number of sales means a tiny number of reviews. By the time that number's large enough to matter, you're probably onto a new product cycle.

In summary: if nobody's ever heard of you, you need reviews but you're probably not getting any.

The old way of doing this was to send review samples to journalists, but nobody's going to run a comprehensive review of 3000 different USB cables and even if they did almost nobody would read it before making a decision on Amazon. You need Amazon reviews, but you're not getting any. The obvious solution is to send review samples to people who will leave Amazon reviews. This is where things start getting more dubious.

Amazon run a program called Vine which is intended to solve this problem. Send samples to Amazon and they'll distribute them to a subset of trusted reviewers. These reviewers write a review as normal, and Amazon tag the review with a "Vine Voice" badge which indicates to readers that the reviewer received the product for free. But participation in Vine is apparently expensive, and so there's a proliferation of sites like Snagshout or AMZ Review Trader that use a different model. There's no requirement that you be an existing trusted reviewer and the product probably isn't free. You sign up, choose a product, receive a discount code and buy it from Amazon. You then have a couple of weeks to leave a review, and if you fail to do so you'll lose access to the service. This is completely acceptable under Amazon's rules, which state "If you receive a free or discounted product in exchange for your review, you must clearly and conspicuously disclose that fact". So far, so reasonable.

In reality it's worse than that, with several opportunities to game the system. AMZ Review Trader makes it clear to sellers that they can choose reviewers based on past reviews, giving customers an incentive to leave good reviews in order to keep receiving discounted products. Some customers take full advantage of this, leaving a giant number of 5 star reviews for products they clearly haven't tested and then (presumably) reselling them. What's surprising is that this kind of cynicism works both ways. Some sellers provide two listings for the same product, the second being significantly more expensive than the first. They then offer an attractive discount for the more expensive listing in return for a review, taking it down to approximately the same price as the original item. Once the reviews are in, they can remove the first listing and drop the price of the second to the original price point.

The end result is a bunch of reviews that are nominally honest but are tied to perverse incentives. In effect, the overall star rating tells you almost nothing - you still need to actually read the reviews to gain any insight into whether the customer actually used the product. And when you do write an honest review that the seller doesn't like, they may engage in heavy handed tactics in an attempt to make the review go away.

It's hard to avoid the conclusion that Amazon's review model is broken, but it's not obvious how to fix it. When search ranking is tied to reviews, companies have a strong incentive to do whatever it takes to obtain positive reviews. What we're left with for now is having to laboriously click through a number of products to see whether their rankings come from thoughtful and detailed reviews or are just a mass of 5 star one liners.

July 09, 2016 07:09 PM

July 07, 2016

Matthew Garrett: Bluetooth LED bulbs

The best known smart bulb setups (such as the Philips Hue and the Belkin Wemo) are based on Zigbee, a low-energy, low-bandwidth protocol that operates on various unlicensed radio bands. The problem with Zigbee is that basically no home routers or mobile devices have a Zigbee radio, so to communicate with them you need an additional device (usually called a hub or bridge) that can speak Zigbee and also hook up to your existing home network. Requests are sent to the hub (either directly if you're on the same network, or via some external control server if you're on a different network) and it sends appropriate Zigbee commands to the bulbs.

But requiring an additional device adds some expense. People have attempted to solve this in a couple of ways. The first is building direct network connectivity into the bulbs, in the form of adding an 802.11 controller. Go through some sort of setup process[1], the bulb joins your network and you can communicate with it happily. Unfortunately adding wifi costs more than adding Zigbee, both in terms of money and power - wifi bulbs consume noticeably more power when "off" than Zigbee ones.

There's a middle ground. There's a large number of bulbs available from Amazon advertising themselves as Bluetooth, which is true but slightly misleading. They're actually implementing Bluetooth Low Energy, which is part of the Bluetooth 4.0 spec. Implementing this requires both OS and hardware support, so older systems are unable to communicate. Android 4.3 devices tend to have all the necessary features, and modern desktop Linux is also fine as long as you have a Bluetooth 4.0 controller.

Bluetooth is intended as a low power communications protocol. Bluetooth Low Energy (or BLE) is even lower than that, running in a similar power range to Zigbee. Most semi-modern phones can speak it, so it seems like a pretty good choice. Obviously you lose the ability to access the device remotely, but given the track record on this sort of thing that's arguably a benefit. There's a couple of other downsides - the range is worse than Zigbee (but probably still acceptable for any reasonably sized house or apartment), and only one device can be connected to a given BLE server at any one time. That means that if you have the control app open while you're near a bulb, nobody else can control that bulb until you disconnect.

The quality of the bulbs varies a great deal. Some of them are pure RGB bulbs and incapable of producing a convincing white at a reasonable intensity[2]. Some have additional white LEDs but don't support running them at the same time as the colour LEDs, so you have the choice between colour or a fixed (and usually more intense) white. Some allow running the white LEDs at the same time as the RGB ones, which means you can vary the colour temperature of the "white" output.

But while the quality of the bulbs varies, the quality of the apps doesn't really. They're typically all dreadful, competing on features like changing bulb colour in time to music rather than on providing a pleasant user experience. And the whole "Only one person can control the lights at a time" thing doesn't really work so well if you actually live with anyone else. I was dissatisfied.

I'd met Mike Ryan at Kiwicon a couple of years back after watching him demonstrate hacking a BLE skateboard. He offered a couple of good hints for reverse engineering these devices, the first being that Android already does almost everything you need. Hidden in the developer settings is an option marked "Enable Bluetooth HCI snoop log". Turn that on and all Bluetooth traffic (including BLE) is dumped into /sdcard/btsnoop_hci.log. Turn that on, start the app, make some changes, retrieve the file and check it out using Wireshark. Easy.
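Wireshark is the sensible way to read the log, but the file is simple enough to pick apart by hand. The sketch below assumes the usual btsnoop layout as I understand it: an 8-byte "btsnoop\0" magic, a 4-byte version and a 4-byte datalink type, followed by records that each carry a 24-byte big-endian header (original length, included length, flags, cumulative drops, 64-bit timestamp) ahead of the packet bytes. `parse_btsnoop` is a made-up name, not part of any library.

```python
import struct

BTSNOOP_MAGIC = b"btsnoop\x00"
FILE_HEADER_LEN = 16
RECORD_HEADER_LEN = 24


def parse_btsnoop(data):
    """Return (datalink_type, [(flags, payload), ...]) for btsnoop file bytes."""
    if data[:8] != BTSNOOP_MAGIC:
        raise ValueError("not a btsnoop file")
    version, datalink = struct.unpack(">II", data[8:FILE_HEADER_LEN])
    if version != 1:
        raise ValueError("unexpected btsnoop version %d" % version)

    records = []
    offset = FILE_HEADER_LEN
    while offset + RECORD_HEADER_LEN <= len(data):
        # Lengths, flags and drop counter are 32-bit; the timestamp is 64-bit.
        orig_len, incl_len, flags, drops, timestamp = struct.unpack(
            ">IIIIq", data[offset:offset + RECORD_HEADER_LEN])
        offset += RECORD_HEADER_LEN
        records.append((flags, data[offset:offset + incl_len]))
        offset += incl_len
    return datalink, records
```

After pulling /sdcard/btsnoop_hci.log off the phone, something like `parse_btsnoop(open("btsnoop_hci.log", "rb").read())` would give you the raw HCI packets to stare at.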

Conveniently, BLE is very straightforward when it comes to network protocol. The only thing you have is GATT, the Generic Attribute Profile. Using this you can read and write multiple characteristics. Each packet is limited to a maximum of 20 bytes. Most implementations use a single characteristic for light control, so it's then just a matter of staring at the dumped packets until something jumps out at you. A pretty typical implementation is something like:

0x56,r,g,b,0x00,0xf0,0x00,0xaa

where r, g and b are each just a single byte representing the corresponding red, green or blue intensity. 0x56 presumably indicates a "Set the light to these values" command, 0xaa indicates end of command and 0xf0 indicates that it's a request to set the colour LEDs. Sending 0x0f instead results in the previous byte (0x00 in this example) being interpreted as the intensity of the white LEDs. Unfortunately the bulb I tested that speaks this protocol didn't allow you to drive the white LEDs at the same time as anything else - setting the selection byte to 0xff didn't result in both sets of intensities being interpreted at once. Boo.
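As a minimal sketch of that layout, the helpers below build the 8-byte payload. The function names are made up, and setting the colour bytes to zero in the white packet is my assumption; the description above only says which byte carries the white intensity.

```python
def _check_byte(value):
    """Reject anything that does not fit in a single byte."""
    if not 0 <= value <= 255:
        raise ValueError("intensity must be in 0..255, got %r" % (value,))
    return value


def build_colour_packet(r, g, b):
    """0x56, r, g, b, unused white byte, 0xf0 selects the colour LEDs, 0x00, 0xaa."""
    return bytes([0x56, _check_byte(r), _check_byte(g), _check_byte(b),
                  0x00, 0xF0, 0x00, 0xAA])


def build_white_packet(intensity):
    """Same frame, but 0x0f makes the bulb read the byte before it as white."""
    # Assumption: the colour bytes are ignored in this mode, so send zeros.
    return bytes([0x56, 0x00, 0x00, 0x00, _check_byte(intensity),
                  0x0F, 0x00, 0xAA])
```

`build_colour_packet(255, 0, 0).hex()` gives `56ff000000f000aa`, which is the form you'd paste into a tool that takes the value as a hex string.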

You can test this out fairly easily using the gatttool app. Run hcitool lescan to look for the device (remember that it won't show up if anything else is connected to it at the time), then do gatttool -b deviceid -I to get an interactive shell. Type connect to initiate a connection, and once connected send commands by doing char-write-cmd handle value using the handle obtained from your hci dump.

I did this successfully for various bulbs, but annoyingly hit a problem with one from Tikteck. The leading byte of each packet was clearly a counter, but the rest of the packet appeared to be garbage. For reasons best known to themselves, they've implemented application-level encryption on top of BLE. This was a shame, because they were easily the best of the bulbs I'd used - the white LEDs work in conjunction with the colour ones once you're sufficiently close to white, giving you good intensity and letting you modify the colour temperature. That gave me incentive, but figuring out the protocol took quite some time. Earlier this week, I finally cracked it. I've put a Python implementation on Github. The idea is to tie it into Ulfire running on a central machine with a Bluetooth controller, making it possible for me to control the lights from multiple different apps simultaneously and also integrating with my Echo.

I'd write something about the encryption, but I honestly don't know. Large parts of this make no sense to me whatsoever. I haven't even had any gin in the past two weeks. If anybody can explain how anything that's being done there makes any sense at all[3] that would be appreciated.

[1] typically via the bulb pretending to be an access point, but also these days through a terrifying hack involving spewing UDP multicast packets of varying lengths in order to broadcast the password to associated but unauthenticated devices and good god the future is terrifying

[2] For a given power input, blue LEDs produce more light than other colours. To get white with RGB LEDs you either need to have more red and green LEDs than blue ones (which costs more), or you need to reduce the intensity of the blue ones (which means your headline intensity is lower). Neither is appealing, so most of these bulbs will just give you a blue "white" if you ask for full red, green and blue

[3] Especially the bit where we calculate something from the username and password and then encrypt that using some random numbers as the key, then send 50% of the random numbers and 50% of the encrypted output to the device, because I can't even

July 07, 2016 10:41 PM

July 02, 2016

Pavel Machek: You know your build machine is too small when...

...you don't have enough memory to run cc1plus and emacs at the same time. Ouch. Development directly on target (N900) is nice, but it has some unexpected pitfalls.

OTOH, I was able to modify fcam-dev to work with new kernel, and I'm slowly moving towards some real photographs.

July 02, 2016 07:34 PM

July 01, 2016

James Morris: Linux Security Summit 2016 Schedule Published

The schedule for the 2016 Linux Security Summit is now published!

The keynote speaker for this year’s event is Julia Lawall.  Julia is a research scientist at Inria, the developer of Coccinelle, and the Linux Kernel coordinator for the Outreachy project.

Refereed presentations include:

See the schedule for the full list of talks.

Also included are updates from Linux kernel security subsystem maintainers, and snacks.

The event this year is co-located with LinuxCon North America in Toronto, and will be held on the 25th and 26th of August.  Standalone registration for the Linux Security Summit is $100 USD: click here to register.

You can also follow updates and news for the event via Twitter:  @LinuxSecSummit
See you there!

July 01, 2016 09:44 AM

June 30, 2016

LPC 2016: Earlybird Registration for Plumbers nearly Full

As of today, we only have 25% of the places remaining out of our quota of 140 for earlybird registration.  Once this fills up, general registration will close until 27 August when we’ll add 50 more slots at the regular rate.

June 30, 2016 10:40 PM

June 28, 2016

Paul E. Mc Kenney: Stupid RCU Tricks: Ordering of RCU Callback Invocation

Suppose a Linux-kernel task registers a pair of RCU callbacks, as follows:


call_rcu(&p->rcu, myfunc);
smp_mb();
call_rcu(&q->rcu, myfunc);

Given that these two callbacks are guaranteed to be registered in order,
are they also guaranteed to be invoked in order?

June 28, 2016 05:58 PM

June 21, 2016

Matthew Garrett: I've bought some more awful IoT stuff

I bought some awful WiFi lightbulbs a few months ago. The short version: they introduced terrible vulnerabilities on your network, they violated the GPL and they were also just bad at being lightbulbs. Since then I've bought some other Internet of Things devices, and since people seem to have a bizarre level of fascination with figuring out just what kind of fractal of poor design choices these things frequently embody, I thought I'd oblige.

Today we're going to be talking about the KanKun SP3, a plug that's been around for a while. The idea here is pretty simple - there's lots of devices that you'd like to be able to turn on and off in a programmatic way, and rather than rewiring them the simplest thing to do is just to insert a control device in between the wall and the device and now you can turn your foot bath on and off from your phone. Most vendors go further and also allow you to program timers and even provide some sort of remote tunneling protocol so you can turn off your lights from the comfort of somebody else's home.

The KanKun has all of these features and a bunch more, although when I say "features" I kind of mean the opposite. I plugged mine in and followed the install instructions. As is pretty typical, this took the form of the plug bringing up its own Wifi access point, the app on the phone connecting to it and sending configuration data, and the plug then using that data to join your network. Except it didn't work. I connected to the plug's network, gave it my SSID and password and waited. Nothing happened. No useful diagnostic data. Eventually I plugged my phone into my laptop and ran adb logcat, and the Android debug logs told me that the app was trying to modify a network that it hadn't created. Apparently this isn't permitted as of Android 6, but the app was handling this denial by just trying again. I deleted the network from the system settings, restarted the app, and this time the app created the network record and could modify it. It still didn't work, but that's because it let me give it a 5GHz network and it only has a 2.4GHz radio, so one reset later and I finally had it online.

The first thing I normally do to one of these things is run nmap with the -O argument, which gives you an indication of what OS it's running. I didn't really need to in this case, because if I just telnetted to port 22 I got a dropbear ssh banner. Googling turned up the root password ("p9z34c") and I was logged into a lightly hacked (and fairly obsolete) OpenWRT environment.

It turns out that there's a whole community of people playing with these plugs, and it's common for people to install CGI scripts on them so they can turn them on and off via an API. At first this sounds somewhat confusing, because if the phone app can control the plug then there clearly is some kind of API, right? Well ha yeah ok that's a great question and oh good lord do things start getting bad quickly at this point.

I'd grabbed the apk for the app and a copy of jadx, an incredibly useful piece of code that's surprisingly good at turning compiled Android apps into something resembling Java source. I dug through that for a while before figuring out that before packets were being sent, they were being handed off to some sort of encryption code. I couldn't find that in the app, but there was a native ARM library shipped with it. Running strings on that showed functions with names matching the calls in the Java code, so that made sense. There were also references to AES, which explained why when I ran tcpdump I only saw bizarre garbage packets.

But what was surprising was that most of these packets were substantially similar. There were a load that were identical other than a 16-byte chunk in the middle. That plus the fact that every payload length was a multiple of 16 bytes strongly indicated that AES was being used in ECB mode. In ECB mode each plaintext is split up into 16-byte chunks and each chunk is encrypted independently with the same key. The same plaintext chunk will always result in the same encrypted output. This implied that the underlying plaintexts were substantially similar and that the encryption key was static.
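That check is easy to automate without knowing the key. A rough sketch, with made-up function names, that takes captured payloads as `bytes` and reports the two tell-tales: every length being a multiple of 16, and 16-byte blocks that show up more than once.

```python
from collections import Counter

BLOCK = 16  # AES block size in bytes


def split_blocks(payload):
    """Cut a payload into consecutive 16-byte chunks (the last may be short)."""
    return [payload[i:i + BLOCK] for i in range(0, len(payload), BLOCK)]


def ecb_evidence(payloads):
    """Return (all_lengths_block_aligned, number_of_distinct_repeated_blocks)."""
    aligned = all(len(p) % BLOCK == 0 for p in payloads)
    counts = Counter(block for p in payloads for block in split_blocks(p))
    repeated = sum(1 for n in counts.values() if n > 1)
    return aligned, repeated
```

With a properly randomised mode, repeated blocks across packets are vanishingly unlikely, so aligned lengths plus a non-zero repeat count is a strong hint of ECB with a fixed key. It is a hint rather than a proof.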

Some more digging showed that someone had figured out the encryption key last year, and that someone else had written some tools to control the plug without needing to modify it. The protocol is basically ascii and consists mostly of the MAC address of the target device, a password and a command. This is then encrypted and sent to the device's IP address. The device then sends a challenge packet containing a random number. The app has to decrypt this, obtain the random number, create a response, encrypt that and send it before the command takes effect. This avoids the most obvious weakness around using ECB - since the same plaintext always encrypts to the same ciphertext, you could just watch encrypted packets go past and replay them to get the same effect, even if you didn't have the encryption key. Using a random number in a challenge forces you to prove that you actually have the key.

At least, it would do if the numbers were actually random. It turns out that the plug is just calling rand(). Further, it turns out that it never calls srand(). This means that the plug will always generate the same sequence of challenges after a reboot, which means you can still carry out replay attacks if you can reboot the plug. Strong work.
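To see why an unseeded generator defeats the challenge, here is a toy model. It is not the plug's code: the generator is modelled on the sample rand() in the C standard, and the plug's actual libc implementation is unknown. The one fact relied on is that C specifies rand() without a prior srand() to behave as if srand(1) had been called.

```rust
/// Toy generator modelled on the sample rand() from the C standard.
/// A stand-in for illustration; the real device's generator may differ.
struct ToyRand {
    state: u32,
}

impl ToyRand {
    /// Models never calling srand(): C defines this as equivalent to srand(1).
    fn unseeded() -> Self {
        ToyRand { state: 1 }
    }

    /// Produce the next pseudo-random value in the range 0..32768.
    fn next_value(&mut self) -> u32 {
        self.state = self.state.wrapping_mul(1_103_515_245).wrapping_add(12_345);
        (self.state / 65_536) % 32_768
    }
}

/// The first `n` challenges a freshly booted device would issue
/// (hypothetical helper).
fn challenges_after_boot(n: usize) -> Vec<u32> {
    let mut rng = ToyRand::unseeded();
    (0..n).map(|_| rng.next_value()).collect()
}
```

Calling challenges_after_boot twice yields the same list, which is the property that makes a recorded response replayable after a reboot.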

But there was still the question of how the remote control works, since the code on github only worked locally. tcpdumping the traffic from the server and trying to decrypt it in the same way as local packets worked fine, and showed that the only difference was that the packet started "wan" rather than "lan". The server decrypts the packet, looks at the MAC address, re-encrypts it and sends it over the tunnel to the plug that registered with that address.

That's not really a great deal of authentication. The protocol permits a password, but the app doesn't insist on it - some quick playing suggests that about 90% of these devices still use the default password. And the devices are all based on the same wifi module, so the MAC addresses are all in the same range. The process of sending status check packets to the server with every MAC address wouldn't take that long and would tell you how many of these devices are out there. If they're using the default password, that's enough to have full control over them.

There's some other failings. The github repo mentioned earlier includes a script that allows arbitrary command execution - the wifi configuration information is passed to the system() command, so leaving a semicolon in the middle of it will result in your own commands being executed. Thankfully this doesn't seem to be true of the daemon that's listening for the remote control packets, which seems to restrict its use of system() to data entirely under its control. But even if you change the default root password, anyone on your local network can get root on the plug. So that's a thing. It also downloads firmware updates over http and doesn't appear to check signatures on them, so there's the potential for MITM attacks on the plug itself. The remote control server is on AWS unless your timezone is GMT+8, in which case it's in China. Sorry, Western Australia.

It's running Linux and includes Busybox and dnsmasq, so plenty of GPLed code. I emailed the manufacturer asking for a copy and got told that they wouldn't give it to me, which is unsurprising but still disappointing.

The use of AES is still somewhat confusing, given the relatively small amount of security it provides. One thing I've wondered is whether it's not actually intended to provide security at all. The remote servers need to accept connections from anywhere and funnel decent amounts of traffic around from phones to switches. If that weren't restricted in any way, competitors would be able to use existing servers rather than setting up their own. Using AES at least provides a minor obstacle that might encourage them to set up their own server.

Overall: the hardware seems fine, the software is shoddy and the security is terrible. If you have one of these, set a strong password. There's no rate-limiting on the server, so a weak password will be broken pretty quickly. It's also infringing my copyright, so I'd recommend against it on that point alone.


June 21, 2016 11:11 PM

June 15, 2016

LPC 2016: Android/Mobile Microconference Accepted into 2016 Linux Plumbers Conference

Android continues to find interesting new applications and problems to solve, both within and outside the mobile arena. Mainlining continues to be an area of focus, as do a number of areas of core Android functionality, including the kernel. Other topics include efficient operation on big.LITTLE systems, support for HiKey in AOSP (and multi-device support in general), and the upcoming migration to Clang for Android builds.

Android continues to be a very exciting and dynamic project, with the above topics merely scratching the surface. For more details, see the Android/Mobile Microconference wiki page.

June 15, 2016 11:31 PM

Paul E. McKenney: Android/Mobile Microconference Accepted into 2016 Linux Plumbers Conference

Android continues to find interesting new applications and problems to solve, both within and outside the mobile arena. Mainlining continues to be an area of focus, as do a number of areas of core Android functionality, including the kernel. Other topics include efficient operation on big.LITTLE systems, support for HiKey in AOSP (and multi-device support in general), and the upcoming migration to Clang for Android builds.

Android continues to be a very exciting and dynamic project, with the above topics merely scratching the surface. For more details, see the Android/Mobile Microconference wiki page.

June 15, 2016 03:25 PM

Pete Zaitcev: Go go go

Can you program in Go without knowing a thing about it? Why, yes. A barrier to entry, where are you?

June 15, 2016 02:57 PM

Rusty Russell: Minor update on transaction fees: users still don’t care.

I ran some quick numbers on the last retargeting period (blocks 415296 through 416346 inclusive) which is roughly a week’s worth.

Blocks were full: median 998k, mean 818k (some miners blind mining on top of unknown blocks). Yet of the 1,618,170 non-coinbase transactions, 48% were still paying dumb, round fees (like 5000 satoshis). Another 5% were paying dumb, round-numbered per-byte fees (like 80 satoshi per byte).

The mean fee was 24051 satoshi (~16c), the mean fee rate 60 satoshi per byte. But if we look at the amount you needed to pay to get into a block (using the second cheapest tx which got in), the mean was 16.81 satoshis per byte, or about 5c.

tl;dr: It’s like a tollbridge charging vehicles 7c per ton, but half the drivers are just throwing a quarter as they drive past and hoping it’s enough. It really shows fees aren’t high enough to notice, and transactions don’t get stuck often enough to notice. That’s surprising; at what level will they notice? What wallets or services are they using?

June 15, 2016 03:00 AM

June 10, 2016

Andy Grover: Why Rust for Low-level Linux programming?

I think Rust is extremely well-suited for low level Linux systems userspace programming — daemons, services, command-line tools, that sort of thing.

Low-level userspace code on Linux is almost universally written in C — until one gets to a certain point where it’s acceptable for Python to be used. Undoubtedly this springs from Linux’s GNU & Unix heritage, but there are also many recent and Linux-specific pieces that are written in C. I think Rust is a better choice for new projects, and here’s why.

Coding is challenging because of mental context-keeping

Coding is hard and distractions are bad because of how much context the developer needs to keep straight as they look at the code. Buffers allocated, locks taken, local variables — these all create little mental things that I need to remember if I’m going to understand a chunk of code, and fix or improve it. Why have proper indentation? Because it helps us keep things straight in our heads. Why keep functions short? Same reason.

Rust reduces the amount of state I need to keep track of in my brain. It checks things that before I depended on myself to check. It gives me tools to express what I want in fewer lines, but still allows maximum control when needed. The same functionality with less code, and checks to ensure it’s better code, these make me more productive and introduce fewer bugs.

Strong types help the compiler help you

Strong typing gives the compiler information it can use to spot errors. This is important as a program grows from a toy into a useful thing. Assumptions within the code change, and strong typing checks the assumptions so that each version of the program globally uses either the old or the new assumptions, but not both.

The key to this is being able to describe to the compiler the intended constraints of our code as clearly as possible.

Expressive types prevent needing type “escape hatches”

One problem with weakly-typed languages is when you need to do something that the type system doesn’t quite let you describe. This leads to needing to use the language’s escape hatches, the “do what I mean!” outs, like casting, that let you do what you need to do, but also inherently create places where the compiler can’t help check things.

Or, there may be ambiguity because a type serves two purposes, depending on the context. One example would be returning a pointer. Is NULL a “good” return value, or is it an error? The programmer needs to know based upon external information (docs), and the compiler can’t help by checking the return value is used properly. Or, say a function returns int. Usually a negative value is an error, but not always. And, negative values are nonzero so they evaluate as true in a conditional statement! It’s just…loose.

Rust distinguishes between valid and error results much more explicitly. It has a richer type system with sum types like Option and Result. These eliminate using a single value for both error and success return cases, and let the compiler help the programmer get it right. A richer type system lets us avoid needing escapes.
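A minimal sketch of the difference, using made-up function names: "nothing found" is an Option, failure is a Result, and a negative number remains an ordinary valid value rather than doubling as an error code.

```rust
use std::num::ParseIntError;

/// Find the first negative number, if any. `None` means "nothing found",
/// which is not an error and cannot be confused with a valid value.
fn first_negative(values: &[i32]) -> Option<i32> {
    values.iter().copied().find(|&v| v < 0)
}

/// Parse a decimal integer. A negative number is a valid success here;
/// failure is carried separately in the `Err` variant.
fn parse_offset(text: &str) -> Result<i32, ParseIntError> {
    text.trim().parse::<i32>()
}

/// The caller has to handle both cases before it can get at the value.
fn describe(text: &str) -> String {
    match parse_offset(text) {
        Ok(n) => format!("offset {}", n),
        Err(e) => format!("bad input: {}", e),
    }
}
```

Leaving out the Err arm in describe is a compile error, which is the kind of check the C int-return convention cannot give you.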

Memory Safety, Lifetimes, and the Borrow Checker

For me, this is another case where Rust is enabling the verification of something that C programmers learned painfully how to do right — or else. In C I’ve had functions that “borrowed” a pointer versus ones that “took ownership”, but this was not enforced in the language, only documented in the occasional comment above the function that it had one behavior or the other. So for me it was like “duh”, yeah, we notate this, have terms that express what’s happening, and the compiler can check it. Having to use ref-counting or garbage collection is great but for most cases it’s not strictly needed. And if we do need it, it’s available.
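As a small illustration of "borrowed" versus "took ownership" being part of the function signature rather than a comment (the function names here are invented for the example):

```rust
/// Borrows the buffer: the caller keeps ownership and can use it afterwards.
fn checksum(buf: &[u8]) -> u32 {
    buf.iter().map(|&b| u32::from(b)).sum()
}

/// Takes ownership: the buffer is moved in and freed when this function returns.
fn consume(buf: Vec<u8>) -> usize {
    buf.len()
}

/// Shows both calling conventions side by side.
fn demo() -> (u32, usize) {
    let data = vec![1u8, 2, 3];
    let sum = checksum(&data); // borrow: `data` is still usable afterwards
    let len = consume(data); // move: `data` is gone after this line
    // checksum(&data); // would not compile: use of moved value
    (sum, len)
}
```

In C the same distinction would live in documentation. Here the commented-out line is rejected by the compiler.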

Cargo and Libraries

Cargo makes using libraries easy. Easy libraries mean your program can focus more on doing its thing, and in turn make it easier for others to use what you provide. Efficient use of libraries reduces duplicated work and keeps lines of code down.

Functional Code, Functional thinking

I like iterators and methods like map, filter, zip, and chain because they make it easier to break down what I’m doing to a sequence into easier to understand fundamental steps, and also make it easier for other coders to understand the code’s intent.
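For instance, a short sketch using zip, filter, map and chain. The data and function names are invented for illustration.

```rust
/// Pair names with sizes, keep the non-empty entries, and format them.
fn summarize(names: &[&str], sizes: &[u64]) -> Vec<String> {
    names
        .iter()
        .zip(sizes.iter())
        .filter(|pair| *pair.1 > 0)
        .map(|(name, size)| format!("{}: {} bytes", name, size))
        .collect()
}

/// Concatenate two lists of paths without writing an explicit loop.
fn all_paths<'a>(system: &[&'a str], user: &[&'a str]) -> Vec<&'a str> {
    system.iter().chain(user.iter()).copied().collect()
}
```

Each step in the chain does one thing, so the intent can be read top to bottom without tracking loop indices or temporary buffers.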

Rewrite everything?

It’s starting to be a cliche. Let’s rewrite everything in Rust! OpenSSL, Tor, the kernel, the browser. Wheeee! Of course this isn’t realistic, but why do people exposed to Rust keep thinking things would be better off in Rust?

I think for two reasons, each from a different group. First, for coders coming from Python, Ruby, and JavaScript, Rust offers some of the higher-level conveniences and productivity they expect. They’re familiar with Rust’s development model and its Cargo-based, GitHub-powered ecosystem. Types and the borrow checker are a learning curve for them, but the result is blazingly fast code, and the ability to do systems-level things, like calling ioctls, for which a C extension would’ve been called for, but these people don’t want to learn C. These people might call for a rewrite in Rust because it brings that component into the realm of things they can hack on.

Second, there are people like me, people working in C and Python on Linux systems-level stuff (the “plumbing”) who are frustrated with low productivity. C and Python have diametrically opposed advantages and disadvantages. C is fast to run but slow to write, and hard to write securely. Python is more productive but too slow and RAM-hungry for something running all the time, on every system. We must deal with getting C components to talk to Python components all the time, and it isn’t fun. Rust is the first language that gives a systems programmer both performance and productivity. These people might see Rust as a chance to increase security, to increase their own productivity, to never have to touch libtool/autoconf ever again, and to solve the C/Python dilemma with a one-language solution.

Incremental Evolution of the Linux Platform

Fun to think about for a coder, but then, what, now we have C, Python, AND Rust that need to interact? “If only everything were Rust… and it would be so easy… how hard could it be?” 🙂 Even in Rust, a huge, not-terribly-fun task. I think Rust has great promise, but success lies in incremental evolution of the Linux platform. We’ve seen service consolidation in systemd and the idea of Linux as a platform distinct from Unix. We have a very useful language-agnostic IPC mechanism — DBus — that gives us more freedom to link things written in new languages. I’m hopeful Rust can find places it can be useful as Linux gains new capabilities, and then perhaps converting existing components may happen as maintainers gain exposure and experience with Rust, and recognize its virtues.

The Future

Rust is not standing still. Recent developments like native debugging support in GDB, and the ongoing MIR work, show that Rust will become even better over time. But don’t wait. Rust can be used to rapidly develop high-quality programs today. Learning Rust can benefit you now, and also yield dividends as Rust and its ecosystem continue to improve.

June 10, 2016 04:00 PM

June 07, 2016

Pete Zaitcev: You are in a maze of twisted little directories, all alike


[root@rhev-a24c-01 go]# make get
go get -t ./...
go install: no install location for directory /root/hail/swift-go/go/bench outside GOPATH
For more details see: go help gopath
[root@rhev-a24c-01 go]# pwd
/root/hail/swift-go/go
[root@rhev-a24c-01 go]# ls -l /root/go/src/github.com/openstack/swift
lrwxrwxrwx. 1 root root 25 Jun 6 21:50 /root/go/src/github.com/openstack/swift -> ../../../../hail/swift-go
[root@rhev-a24c-01 go]# cd /root/go/src/github.com/openstack/swift/go
[root@rhev-a24c-01 go]# pwd
/root/go/src/github.com/openstack/swift/go
[root@rhev-a24c-01 go]# make get
go get -t ./...
[root@rhev-a24c-01 go]#

June 07, 2016 09:21 PM

Matthew Garrett: Be wary of heroes

Inspiring change is difficult. Fighting the status quo typically means being able to communicate so effectively that powerful opponents can't win merely by outspending you. People need to read your work or hear you speak and leave with enough conviction that they in turn can convince others. You need charisma. You need to be smart. And you need to be able to tailor your message depending on the audience, even down to telling an individual exactly what they need to hear to take your side. Not many people have all these qualities, but those who do are powerful and you want them on your side.

But the skills that allow you to convince people that they shouldn't listen to a politician's arguments are the same skills that allow you to convince people that they shouldn't listen to someone you abused. The ability that allows you to argue that someone should change their mind about whether a given behaviour is of social benefit is the same ability that allows you to argue that someone should change their mind about whether they should sleep with you. The visibility that gives you the power to force people to take you seriously is the same visibility that makes people afraid to publicly criticise you.

We need these people, but we also need to be aware that their talents can be used to hurt as well as to help. We need to hold them to higher standards of scrutiny. We need to listen to stories about their behaviour, even if we don't want to believe them. And when there are reasons to believe those stories, we need to act on them. That means people need to feel safe in coming forward with their experiences, which means that nobody should have the power to damage them in reprisal. If you're not careful, allowing charismatic individuals to become the public face of your organisation gives them that power.

There's no reason to believe that someone is bad merely because they're charismatic, but this kind of role allows a charismatic abuser both a great deal of cover and a great deal of opportunity. Sometimes people are just too good to be true. Pretending otherwise doesn't benefit anybody but the abusers.


June 07, 2016 03:33 AM

June 06, 2016

LPC 2016: LPC 2016 registration is now Open

Early bird rate is available until our quota of 140 runs out (or we reach 26 August).  Please click here to register.

June 06, 2016 05:38 PM

Daniel Vetter: Awesome Atomic Advances

Also, silly titles. Atomic has taken off for real: right now there are 17 drivers supporting atomic modesetting merged into the DRM subsystem, and still a pile of them pending review and merging each release. But it’s not just new drivers, there’s also been a steady stream of small improvements over the past year, so I think it’s time for an update.

It seems small, but a big improvement made over the past few months is that most driver callbacks used by the helper libraries are now optional. This means tons and tons of dummy functions and boilerplate code can be removed from drivers, leading to less clutter and easier-to-understand driver code. Aside: Not all drivers have been decluttered; doing that is a great starter project for contributing a first few patches to the DRM subsystem. Many thanks to Boris Brezillon, Noralf Trønnes and many others for making this happen.

A long-standing complaint about DRM kernel mode setting is that it’s too complicated, especially compared to fbdev when all you have is one dumb framebuffer and nothing else. And yes, in that case there’s really no point in having distinct CRTC, plane and encoder objects, and now finally Noralf Trønnes has volunteered to write a helper library for simple display pipelines to hide all that complexity from drivers. It’s not yet merged but I’m positive it’ll land in 4.8. And it will help to make writing DRM drivers for simple hardware easy and the driver code clutter-free.

Another piece many dumb framebuffer drivers need is support for manually uploading new contents to the screen. Often on these simple panels there’s no point in doing page-flipping of entire buffers since a real render engine is nowhere to be seen. And the panels are often behind a really slow bus, making full screen uploads too expensive. Instead it’s all done by directly drawing into the frontbuffer, and then telling the driver what changed so that it can shovel the new bits over the bus to the panel. DRM has full support for this through a dirty interface and IOCTL, and legacy fbdev also has some support for this. But the fbdev emulation helpers in DRM never wired these two bits together, forcing drivers to all type their own boilerplate. Noralf has fixed this by implementing fbdev deferred I/O support for the DRM fbdev helpers.

A related improvement is generic support to disable the fbdev emulation from Archit Taneja, both through a Kconfig option and a module option. Most distributions still expect fbdev to work for the boot splash, recovery console and emergency logging. But some, like ChromeOS, are entirely legacy-free and don’t need any of this. Thus far every DRM driver had to implement support for fbdev emulation, and for optionally disabling it, itself. Now that’s all done in the library using dummy stub functions in the disabled case, again simplifying driver code.

Somehow most ARM-SoC display drivers start out their system suspend/resume support with a dumb register save/restore. I guess because with simple hardware that works, and regmap provides it almost for free. And then everyone learns the hard way why the atomic modeset helpers have a very strict state transition model: Display hardware gets upset extremely easily when things are done in the wrong order, or without the required delays, or without obeying the dependencies between components, and much more. Dumb register restoring does none of that. To fix this Thierry Reding implemented suspend/resume helpers for atomic drivers. Unfortunately not many drivers use this support yet, which is again a nice opportunity to get a kernel patch merged if you have the hardware for such a driver.

Another big gap in the original atomic infrastructure that’s finally getting closed is generic support for nonblocking commits. The tricky part there is getting the dependency tracking between commits on different display parts right, and I secretly hoped that with a few examples it would be easier to implement something that’s useful for most drivers. With 17 examples I’ve finally run out of excuses to postpone this, after more than 1 year.

But even more important than making the code prettier for atomic drivers and removing boilerplate with better helpers and libraries is in my opinion explaining it all, and making sure all drivers work the same. Over the past few months there’s been massive sphinx-based documentation toolchain - the above links are already generated using that for a peek at all the new pretty.

The flip side is testing, and on that front Collabora’s effort to convert all the kernel mode-setting tests in Tomeu Vizoso’s blog on validating changes to KMS drivers.

Finally it’s not all improvements to make it easier to write great drivers, there’s also some new feature work. Lionel Landwerlin added new atomic properties to implement color management support. And there’s work on-going to implement Z-order and blending properties, and lots more, but that’s not yet ready for merging.

June 06, 2016 09:41 AM

June 04, 2016

Daniel Vetter: On Getting Patches Merged

In some projects there’s an awesome process to handle newcomers’ contributions - an autobuilder picks up your pull and runs full CI on it, coding style checkers automatically do basic review, and the functional review load is at least assigned with tooling too.

Then there are projects where utter chaos and ad-hoc process reign, like the Linux kernel or the X.org community, and it’s much harder for new folks to get their foot in the door. Of course there’s documentation trying to bridge that gap, and tools like get_maintainer.pl to figure out whom to ping, but that’s kinda the details. In the end you need someone from the inside to care about what you’re doing and guide you through the maze the first few times.

I’ve been pinged about this a few times recently on IRC, so I figured I’ll type up my recommended best practices.

The crucial bit is that such unstructured developer communities run entirely on mutual trust, and patches get reviewed through a market of favours as in “I review yours and you review my patches”. As a newcomer you have neither. The goal is to build up enough trust and review favours owed to you to get your patches in.

And finally for the flip side, there’s a great write-up from Sarah Sharp about doing review, which applies especially to reviewing newcomers’ patches.

June 04, 2016 11:11 AM

June 03, 2016

LPC 2016: Registration Opening Delayed until Monday 6 June

A while ago, we decided to combine the Kernel Summit Open Day with Linux Plumbers, meaning that Plumbers itself runs now from 1-4 November (as you can see from the updated site banners). Unfortunately, we neglected to test that the registration system was ready for this change, so we’re scrambling now to fix it and we should have registrations open by Monday. Sorry for the delay.

June 03, 2016 03:20 PM

May 30, 2016

Andi Kleen: Literature survey for practical TSX lock elision applications

Introduction

This is a short, non-comprehensive (and somewhat biased) literature survey on how lock elision with Intel TSX can be used to improve the performance of programs by increasing parallelism. The focus is on practical incremental improvements in existing software.

A basic introduction to TSX lock elision is available in Scaling existing lock based applications with Lock elision.

Lock libraries

The papers below are on actual code using lock elision, not just how to implement lock elision itself. The basic rules of how to implement lock elision in a locking library are in chapter 12 of the Intel Optimization manual. Common anti-patterns (mistakes) while doing so are described in TSX Anti-patterns.

Existing lock libraries that implement lock elision using TSX include glibc pthreads, Threading Building Blocks, tsx-tools and ConcurrencyKit.

Databases

Lock elision has been used widely to speed up both production and research databases. A lot of work has been done on the SAP HANA in-memory database, which uses TSX in production today to improve performance of the B+Tree index and the delta tree data structures. This is described in Improving In-Memory Database Index Performance with TSX.

Several research databases went beyond just using TSX to speed up specific data structures, and map complete database transactions to hardware transactions. This requires many more changes to the databases, and careful memory layout, but can also give more gains. It generally goes beyond simple lock elision. This approach has been implemented by Leis in Exploiting HTM in Main memory databases. A similar approach is used in Using Restricted Transactional Memory to Build a Scalable In-Memory Database. Both see large gains on TPC-C style workloads.

For non-in-memory databases, a group at Berkeley used lock elision to improve parallelism in LevelDB, getting a 25% speedup on a 4 core system. Standard LevelDB is essentially a single lock system, so this was a nice speedup with only minor work compared to other efforts that used manual fine-grained locking to improve parallelism in LevelDB. However it required special handling of condition variables, which are used for commit. For simpler key value stores Diegues used an automatic tuner with TSX transactions to get a 2x gain with memcached.

DrTM uses TSX together with RDMA to implement a fast distributed transaction manager using 2PL. TSX provides local isolation, while RDMA (which aborts transactions) provides remote isolation.

Languages

An attractive use of lock elision is to speed up locks implicit in language runtimes. Yoo implemented transparent support for using TSX for Java synchronized sections in Early Experience on Transactional Execution of Java™ Programs. The runtime automatically detects when a synchronized region does not benefit from TSX and disables lock elision then. This works transparently, but using it successfully may still need some changes in the program to minimize conflicts and other aborts, as described by Gil Tene in Understanding Hardware Transactional Memory. This support is available in JDK 8u40 and can be enabled with the -XX:+UseRTMLocking option.

Another interesting case for lock elision is improving parallelism for the Global Interpreter Lock (GIL) used in interpreters. Odaira implemented this for Ruby in Eliminating Global Interpreter Locks in Ruby through HTM. They saw a 4.4x speedup on Ruby NPB, 1.6x in WEBrick and 1.2x speedup in Ruby on Rails on a 12 core system.

Hardware transactions can be used for auto-parallelizing existing loops, even when the compiler cannot prove that iterations are independent, by using the transactions to isolate individual iterations. Odaira implemented TSX loop speculation manually for workloads in SPEC CPU and reports an 11% speedup on a 4 core system. There are some limitations to this technique due to the lazy subscription problem described by Dice, but in principle it can be directly implemented in compilers.

Salamanca used TSX to implement tracing recovery code for Speculative Trace Optimization (STO) for loops. The basic principles are similar to the previous paper, but they implemented an automated prototype. They report 9% improvement on 4 cores with a number of benchmarks.

An older paper, which predates TSX, by Dice describes how to use Hardware Transactional Memory to simplify Work Stealing schedulers.

High Performance Computing

Yoo et al. use TSX lock elision to benchmark a number of HPC applications in Performance Evaluation of Intel TSX for High-Performance Computing. They report an average 41% speedup on a 4 core system. They also report an average 31% improvement in bandwidth when applying TSX to a user space TCP stack.

Hay explored lock elision for improving Parallel Discrete Event Simulations. He reports speedups of up to 28%.

Data structures

Lock elision has been used widely to speed up parallel data structures. Normally applying lock elision to an existing data structure is very simple (elide the lock protecting it), but some tweaking of the data structure can often give better performance. Dementiev explores TSX for general fast scalable hash tables. Li uses TSX to implement scalable Cuckoo hash tables. Using TSX for hash tables is generally very straightforward. For tree data structures one needs to be careful that tree rebalancing does not overwhelm the write set capacity of the hardware transactions. Repetti uses TSX to scale Patricia Tries. Siakavas explores TSX usage for scalable Red-Black Trees, similarly in this paper. Bonnichsen uses HTM to improve BT Trees, reporting a speedup of 2x to 3x compared to earlier implementations. The database papers described above describe the rules needed to successfully elide BTrees.

Calciu uses TSX to implement a more scalable priority queue based on skip list and reports increased parallelism.

Memory allocation and Garbage Collection

One challenge with using garbage collection is that the worst case “stop the world” pauses from parallel garbage collectors limit the total heap size that can be supported in copying garbage collectors. The Chihuahua GC paper implements a prototype TSX based collector for the Jikes research Java VM. They report up to 101% speedup in concurrent copying speed, and show that a simple parallel garbage collector can be implemented with limited effort.

Another dog-themed GC, the Collie garbage collector (original paper predates TSX), presents a production quality parallel collector that minimizes pauses and allows scaling to large heaps. Opdahl has another description of the Collie algorithm. It is presumably deployed now for TSX in Azul’s commercial ZING JVM product, which claims to scale up to 2TB of heap memory.

StackTrack is an efficient algorithm to do automatic memory reclamation for parallel data structures using hardware transactions, outperforming existing techniques such as hazard pointers. It requires recompiling the program with a specially patched gcc compiler, and automatically creates variable-length transactions for functions freeing memory. The technique could potentially be used even without special compilers.

Kuszmaul uses TSX to implement a scalable SuperMalloc and reports good performance combined with relatively simple code. Dice et al. report how a cache-index-aware malloc can improve TSX performance by improving utilization of the L1 cache.

Other usages

Peters uses lock elision to parallelize a microkernel in For a Microkernel, a Big Lock Is Fine, and finds that RTM lock elision outperforms fine-grained locking due to lower single-thread overhead.

May 30, 2016 04:40 AM

May 29, 2016

Pete Zaitcev: Encrypt everything? Please reconsider.

Somehow it became fashionable among site admins to set it up so that accessing over HTTP was immediately redirected to https://. But doing that adds new ways to fail, such as certificate expired:

Notice that Firefox provides no way to ignore the problem and access the website (which was supposed to be accessible over HTTP to begin with). Solution? Use Chrome, which does:

Or, disable NTP and change your PC's clock two days back (be careful with running make while doing so).

This was discussed by CKS previously (of course), and he seems to think that the benefits outweigh the downsides of an occasional fuck-up, like the only website in the world that has the information that I want right now is suddenly unavailable with no recourse.

UPDATE: Chris discussed the problem some more and brought up other examples, such as outdated KVM appliances that use obsolete ciphers.

One thing I'm wondering about is if a redirect from http:// to https:// makes a lot of sense. If you do not support access by plain HTTP, why not return ECONNREFUSED? I'm sure it's an extremely naive idea.

May 29, 2016 05:16 PM

May 26, 2016

Vegard Nossum: Writing a reverb filter from first principles

WARNING/DISCLAIMER: Audio programming always carries the risk of damaging your speakers and/or your ears if you make a mistake. Therefore, remember to always turn down the volume completely before and after testing your program. And whatever you do, don't use headphones or earphones. I take no responsibility for damage that may occur as a result of this blog post!

Have you ever wondered how a reverb filter works? I have... and here's what I came up with.

Reverb is the sound effect you commonly get when you make sound inside a room or building, as opposed to when you are outdoors. The stairwell in my old apartment building had an excellent reverb. Most live musicians hate reverb because it muddles the sound they're trying to create and can even throw them off while playing. On the other hand, reverb is very often used (and overused) on studio vocals because it also has the effect of smoothing out rough edges and imperfections in a recording.

We typically distinguish reverb from echo in that an echo is a single delayed "replay" of the original sound you made. The delay is also typically rather large (think yelling into a distant hill- or mountainside and hearing your HEY! come back a second or more later). In more detail, the two things that distinguish reverb from an echo are:

  1. The reverb inside a room or a hall has a much shorter delay than an echo. The speed of sound is roughly 340 meters/second, so if you're in the middle of a room that is 20 meters by 20 meters, the sound will come back to you (from one wall) after (20 / 2) / 340 = ~0.029 seconds, which is such a short duration of time that we can hardly notice it (by comparison, a 30 FPS video would display each frame for ~0.033 seconds).
  2. After bouncing off one wall, the sound reflects back and reflects off the other wall. It also reflects off the perpendicular walls and any and all objects that are in the room. What's more, the sound has to travel slightly farther to reach the corners of the room (~14 meters instead of 10). All these echoes themselves go on to combine and echo off all the other surfaces in the room until all the energy of the original sound has dissipated.

Intuitively, it should be possible to use multiple echoes at different delays to simulate reverb.

We can implement a single echo using a very simple ring buffer:

    class FeedbackBuffer {
    public:
        unsigned int nr_samples;
        int16_t *samples;

        unsigned int pos;

        FeedbackBuffer(unsigned int nr_samples):
            nr_samples(nr_samples),
            samples(new int16_t[nr_samples]()), /* zero-initialized: start with silence */
            pos(0)
        {
        }

        ~FeedbackBuffer()
        {
            delete[] samples;
        }

        int16_t get() const
        {
            return samples[pos];
        }

        void add(int16_t sample)
        {
            samples[pos] = sample;

            /* If we reach the end of the buffer, wrap around */
            if (++pos == nr_samples)
                pos = 0;
        }
    };

The constructor takes one argument: the number of samples in the buffer, which is exactly how much time we will delay the signal by; when we write a sample to the buffer using the add() function, it will come back after a delay of exactly nr_samples using the get() function. Easy, right?
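To make the delay concrete, here is a minimal sketch of the same ring-buffer idea using std::vector instead of a raw array. The names DelayLine and delayed_output are only illustrative and are not part of the original program. The buffer starts out zeroed, so the first `delay` outputs are silence, and after that each input sample reappears exactly `delay` samples later.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

/* Illustrative stand-in for FeedbackBuffer; the name DelayLine is an assumption. */
class DelayLine {
public:
    /* nr_samples must be at least 1 */
    explicit DelayLine(std::size_t nr_samples):
        samples(nr_samples, 0),   /* start with silence */
        pos(0)
    {
    }

    /* The sample that was written nr_samples calls to add() ago */
    int16_t get() const
    {
        return samples[pos];
    }

    void add(int16_t sample)
    {
        samples[pos] = sample;

        /* If we reach the end of the buffer, wrap around */
        if (++pos == samples.size())
            pos = 0;
    }

private:
    std::vector<int16_t> samples;
    std::size_t pos;
};

/* Feed `input` through a delay of `delay` samples and collect what comes out. */
std::vector<int16_t> delayed_output(const std::vector<int16_t> &input,
                                    std::size_t delay)
{
    DelayLine line(delay);
    std::vector<int16_t> output;
    for (int16_t sample : input) {
        output.push_back(line.get());
        line.add(sample);
    }
    return output;
}
```

Note the order inside the loop: get() is called before add(), because add() overwrites the oldest slot.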

Since this is an audio filter, we need to be able to read an input signal and write an output signal. For simplicity, I'm going to use stdin and stdout for this -- we will read up to 8192 samples (16 KiB) at a time using read(), process that, and then use write() to output the result. It will look something like this:

    #include <cstdio>
    #include <cstdint>
    #include <cstdlib>
    #include <cstring>
    #include <unistd.h>


    int main(int argc, char *argv[])
    {
        while (true) {
            int16_t buf[8192];
            ssize_t in = read(STDIN_FILENO, buf, sizeof(buf));
            if (in == -1) {
                /* Error */
                return 1;
            }
            if (in == 0) {
                /* EOF */
                break;
            }

            for (unsigned int j = 0; j < in / sizeof(*buf); ++j) {
                /* TODO: Apply filter to each sample here */
            }

            if (write(STDOUT_FILENO, buf, in) != in) {
                /* Error or short write */
                return 1;
            }
        }

        return 0;
    }

On Linux you can use e.g. 'arecord' to get samples from the microphone and 'aplay' to play samples on the speakers, and you can do the whole thing on the command line:

    $ arecord -t raw -c 1 -f s16 -r 44100 |\
        ./reverb | aplay -t raw -c 1 -f s16 -r 44100

(-c 1 means 1 channel; -f s16 means "signed 16-bit" which corresponds to the int16_t type we've used for our buffers; -r 44100 means a sample rate of 44100 samples per second; and ./reverb is the name of our executable.)

So how do we use class FeedbackBuffer to generate the reverb effect?

Remember how I said that reverb is essentially many echoes? Let's add a few of them at the top of main():

    FeedbackBuffer fb0(1229);
    FeedbackBuffer fb1(1559);
    FeedbackBuffer fb2(1907);
    FeedbackBuffer fb3(4057);
    FeedbackBuffer fb4(8117);
    FeedbackBuffer fb5(8311);
    FeedbackBuffer fb6(9931);

The buffer sizes that I've chosen here are somewhat arbitrary (I played with a bunch of different combinations and this sounded okay to me). But I used this as a rough guideline: simulating the 20m-by-20m room at a sample rate of 44100 samples per second means we would need delays roughly on the order of 44100 * (20 / 340) = 2594 samples.
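The guideline boils down to the time sound needs to cover a distance (distance divided by the speed of sound), multiplied by the sample rate. A small sketch of that arithmetic follows; the function name delay_in_samples and the constant are only illustrative, and 340 m/s is the rough figure used earlier in the post.

```cpp
#include <cstddef>

/* Speed of sound used in the text, in meters per second */
const double SPEED_OF_SOUND = 340.0;

/* Number of samples that pass while sound travels `distance_m` meters */
std::size_t delay_in_samples(unsigned int sample_rate, double distance_m)
{
    double seconds = distance_m / SPEED_OF_SOUND;
    return static_cast<std::size_t>(sample_rate * seconds);
}
```

For the 20-meter figure at 44100 samples per second this gives 2594, which sits between the smaller and larger buffer sizes chosen above.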

Another thing to keep in mind is that we generally do not want our feedback buffers to be multiples of each other. The reason for this is that it creates a consonance between them and will cause certain frequencies to be amplified much more than others. As an example, if you count from 1 to 500 (and continue again from 1), and you have a friend who counts from 1 to 1000 (and continues again from 1), then you would start out 1-1, 2-2, 3-3, etc. up to 500-500, then you would go 1-501, 2-502, 3-503, etc. up to 500-1000. But then, as you both wrap around, you start at 1-1 again. And your friend will always be on 1 when you are on 1. This has everything to do with periodicity and -- in fact -- prime numbers! If you want to maximise the combined period of two counters, you have to make sure that they are coprime, i.e. that they don't share any common factors. The easiest way to achieve this is to only pick prime numbers to start with, so that's what I did for my feedback buffers above.
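The counting example can be checked numerically: two wrapping counters are both back at their start after the least common multiple of their lengths. Here is a hedged sketch (the helper names gcd_u64 and combined_period are made up for illustration):

```cpp
#include <cstdint>

/* Greatest common divisor, using Euclid's algorithm */
uint64_t gcd_u64(uint64_t a, uint64_t b)
{
    while (b != 0) {
        uint64_t t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* Number of steps before two wrapping counters of lengths a and b
 * are both back at their starting position (the least common multiple) */
uint64_t combined_period(uint64_t a, uint64_t b)
{
    return a / gcd_u64(a, b) * b;
}
```

With lengths 500 and 1000 the combined period is only 1000, whereas the first two buffer sizes above, 1229 and 1559, share no factors and only line up again after 1229 * 1559 = 1916011 samples, which is about 43 seconds at 44100 samples per second.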

Having created the feedback buffers (which each represent one echo of the original sound), it's time to put them to use. The effect I want to create is not simply overlaying echoes at fixed intervals, but to have the echoes bounce off each other and feed back into each other. The way we do this is by first combining them into the output signal... (since we have 8 signals to combine including the original one, I give each one a 1/8 weight)

    float x = .125 * buf[j];
    x += .125 * fb0.get();
    x += .125 * fb1.get();
    x += .125 * fb2.get();
    x += .125 * fb3.get();
    x += .125 * fb4.get();
    x += .125 * fb5.get();
    x += .125 * fb6.get();
    int16_t out = x;

...then feeding the result back into each of them:

    fb0.add(out);
    fb1.add(out);
    fb2.add(out);
    fb3.add(out);
    fb4.add(out);
    fb5.add(out);
    fb6.add(out);

And finally we also write the result back into the buffer. I found that the original signal loses some of its power, so I use a factor 4 gain to bring it roughly back to its original strength; this number is an arbitrary choice by me, I don't have any specific calculations to support it:

    buf[j] = 4 * out;
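One thing to watch: 4 * out is computed as an int and then narrowed back to int16_t, so any sample beyond roughly ±8191 no longer fits and will typically wrap around to a very different value, which is exactly the kind of loud, unpleasant sound the warning at the top is about. A hedged sketch of a clamping variant follows; apply_gain is a hypothetical helper and not part of the original program.

```cpp
#include <cstdint>

/* Multiply a sample by `gain`, clamping to the int16_t range instead of
 * wrapping around. Assumes a small gain, so the product fits in 32 bits. */
int16_t apply_gain(int16_t sample, int gain)
{
    int32_t amplified = static_cast<int32_t>(sample) * gain;
    if (amplified > INT16_MAX)
        return INT16_MAX;
    if (amplified < INT16_MIN)
        return INT16_MIN;
    return static_cast<int16_t>(amplified);
}
```

Under that assumption, the last step would read buf[j] = apply_gain(out, 4); clipping still distorts, but far less violently than wrap-around.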

That's it! 88 lines of code is enough to write a very basic reverb filter from first principles. Be careful when you run it, though, even the smallest mistake could cause very loud and unpleasant sounds to be played.

If you play with different buffer sizes or a different number of feedback buffers, let me know if you discover anything interesting :-)

May 26, 2016 07:05 PM

May 25, 2016

Pete Zaitcev: Russian Joke

In a quick translation from Bash:

XXX: Still writing that profiler?
YYY: Naah, reading books now
XXX: Like what books?
YYY: TCP Illustrated, Understanding the Linux kernel, Linux kernel development
XXX: And I read "The Never-ending Path Of Hatred".
YYY: That's about Node.js, right?

May 25, 2016 07:15 PM

May 22, 2016

Pete Zaitcev: Dell, why u no VPNC

Yo, we heard you liked remote desktops, so we put remote desktop into a remote desktop, now you can remote desktop while remote desktop.

I remember how IBM simply put VPNC interface in their Bladecenter. It was so nice. Unfortunately, vendors never want to be too nice to users, so their next release switched to a Java applet. Dell copied their approach for DRAC5. In theory, this should be future-proof, all hail WORA. In practice, it only worked with a specific version of Java, which was current when Dell shipped R905 ten years ago. You know, back then Windows XP was new and hot.

Fortunately, by the magic of KVM, libvirt, and Qemu, it's possible to create a virtual machine, install Fedora 10 on it, and then run Firefox with the stupid Java applet on it. Also, Firefox and Java have to run in 32-bit mode.

When I did it for the first time, I ran Firefox through X11 redirection. That was quite inconvenient: I had to stop the Firefox running on the host desktop, because one cannot run 2 firefoxes painting to the same $DISPLAY. The reason that happens is, well, Mozilla Foundation is evil, basically. The remote Firefox finds the running Firefox through X11 properties and then some crapmagic happens and everything crashes and burns. So, it's much easier just to hook to the VM with Vinagre and run Firefox with DISPLAY=:0 in there.

Those old Fedoras were so nice, BTW. Funnily enough, that VM with 1 CPU and 1.5 GB starts quicker than the host laptop with the benefit of SystemD and its ability to run tasks in parallel. Of course, the handling of WiFi in Fedora 20+ is light years ahead of nm-applet in Fedora 10. There was some less noticeable progress elsewhere as well. But at the same time, the bloat was phenomenal.

UPDATE: Java does not work. Running the JNLP simply fails after downloading the applets, without any error messages. To set the plugin type of "native", ssh to DRAC, then "racadm config -g cfgRacTuning -o cfgRacTunePluginType 0". No kidding.

May 22, 2016 03:46 AM

May 20, 2016

Matthew Garrett: Your project's RCS history affects ease of contribution (or: don't squash PRs)

GitHub recently introduced the option to squash commits on merge, and even before then several projects requested that contributors squash their commits after review but before merge. This is a terrible idea that makes it more difficult for people to contribute to projects.

I'm spending today reworking some code to integrate with a new feature that was just merged into Kubernetes. The PR in question was absolutely fine, but just before it was merged the entire commit history was squashed down to a single commit at the request of the reviewer. This single commit contains type declarations, the functionality itself, the integration of that functionality into the scheduler, the client code and a large pile of autogenerated code.

I've got some familiarity with Kubernetes, but even then this commit is difficult for me to read. It doesn't tell a story. I can't see its growth. Looking at a single hunk of this diff doesn't tell me whether it's infrastructural or part of the integration. Given time I can (and have) figured it out, but it's an unnecessary waste of effort that could have gone towards something else. For someone who's less used to working on large projects, it'd be even worse. I'm paid to deal with this. For someone who isn't, the probability that they'll give up and do something else entirely is even greater.

I don't want to pick on Kubernetes here - the fact that this GitHub feature exists makes it clear that a lot of people feel that this kind of merge is a good idea. And there are certainly cases where squashing commits makes sense. Commits that add broken code and which are immediately followed by a series of "Make this work" commits also impair readability and distract from the narrative that your RCS history should present, and GitHub presents this feature as a way to get rid of them. But that ends up being a false dichotomy. A history that looks like "Commit", "Revert Commit", "Revert Revert Commit", "Fix broken revert", "Revert fix broken revert" is a bad history, as is a history that looks like "Add 20,000 line feature A", "Add 20,000 line feature B".

When you're crafting commits for merge, think about your commit history as a textbook. Start with the building blocks of your feature and make them one commit. Build your functionality on top of them in another. Tie that functionality into the core project and make another commit. Add client support. Add docs. Include your tests. Allow someone to follow the growth of your feature over time, with each commit being a chapter of that story. And never, ever, put autogenerated code in the same commit as an actual functional change.

People can't contribute to your project unless they can understand your code. Writing clear, well commented code is a big part of that. But so is showing the evolution of your features in an understandable way. Make sure your RCS history shows that, otherwise people will go and find another project that doesn't make them feel frustrated.

(Edit to add: Sarah Sharp wrote on the same topic a couple of years ago)

May 20, 2016 12:06 AM

May 17, 2016

Gustavo F. Padovan: Collabora contributions to Linux Kernel 4.6

Linux Kernel 4.6 was released this week, and a total of 9 Collabora engineers took part in its development, Collabora’s highest number of engineers contributing to a single Linux Kernel release yet. In total Collabora contributed 42 patches.

As part of Collabora’s continued commitment to further increase its participation in the Linux Kernel, Collabora is actively looking to expand its team of core software engineers. If you’d like to learn more, follow this link.

Here are some highlights of Collabora’s participation in Kernel 4.6:

Andrew Shadura fixed the number of buttons reported on the PenMount 6000 USB touchscreen controller, while Daniel Stone enabled BCM283x family devices in the ARM multi_v7_defconfig and Emilio López added module autoloading for a few sunxi devices.

Enric Balletbo i Serra added boot console output to AM335X(Sitara) and OMAP3-IGEP and fixed audio codec setup on AM335X using the right external clock. Martyn Welch added the USB device ID for the GE Healthcare cp210x serial device and renamed the reset reason of the Zodiac Watchdog.

Gustavo Padovan cleaned up the Android Sync Framework on the staging tree for further de-staging of the Sync File infrastructure, which will land in 4.7. Most of the work was removing interfaces that won’t be used in mainline. He also added vblank event support for atomic commits in the virtio DRM driver.

Peter Senna improved an error path and added some style fixes to the sisusbvga driver, while Sjoerd Simons enabled wireless on radxa Rock2 boards, fixed an issue with the brcmfmac sdio driver sometimes timing out with a false positive and fixed some issues with serial output on the Renesas R-Car porter board.

Tomeu Vizoso changed driver_match_device() to return errors and, in case of -EPROBE_DEFER, queue the device for deferred probing. He also provided two fixes to the Rockchip DRM driver as part of his work on making intel-gpu-tools work on other platforms.

Following is a list of all patches submitted by Collabora for this kernel release:

Andrew Shadura (1):

Daniel Stone (1):

Emilio López (4):

Enric Balletbo i Serra (3):

Gustavo Padovan (17):

Martyn Welch (2):

Peter Senna Tschudin (4):

Sjoerd Simons (6):

Tomeu Vizoso (4):

May 17, 2016 06:15 PM

May 12, 2016

Matthew Garrett: Convenience, security and freedom - can we pick all three?

Moxie, the lead developer of the Signal secure communication application, recently blogged on the tradeoffs between providing a supportable federated service and providing a compelling application that gains significant adoption. There's a set of perfectly reasonable arguments around that that I don't want to rehash - regardless of feelings on the benefits of federation in general, there's certainly an increase in engineering cost in providing a stable intra-server protocol that still allows for addition of new features, and the person leading a project gets to make the decision about whether that's a valid tradeoff.

One voiced complaint about Signal on Android is the fact that it depends on the Google Play Services. These are a collection of proprietary functions for integrating with Google-provided services, and Signal depends on them to provide a good out of band notification protocol to allow Signal to be notified when new messages arrive, even if the phone is otherwise in a power saving state. At the time this decision was made, there were no terribly good alternatives for Android. Even now, nobody's really demonstrated a free implementation that supports several million clients and has no negative impact on battery life, so if your aim is to write a secure messaging client that will be adopted by as many people as possible, keeping this dependency is entirely rational.

On the other hand, there are users for whom the decision not to install a Google root of trust on their phone is also entirely rational. I have no especially good reason to believe that Google will ever want to do something inappropriate with my phone or data, but it's certainly possible that they'll be compelled to do so against their will. The set of people who will ever actually face this problem is probably small, but it's probably also the set of people who benefit most from Signal in the first place.

(Even ignoring the dependency on Play Services, people may not find the official client sufficient - it's very difficult to write a single piece of software that satisfies all users, whether that be down to accessibility requirements, OS support or whatever. Slack may be great, but there's still people who choose to use Hipchat)

This shouldn't be a problem. Signal is free software and anybody is free to modify it in any way they want to fit their needs, and as long as they don't break the protocol code in the process it'll carry on working with the existing Signal servers and allow communication with people who run the official client. Unfortunately, Moxie has indicated that he is not happy with forked versions of Signal using the official servers. Since Signal doesn't support federation, that means that users of forked versions will be unable to communicate with users of the official client.

This is awkward. Signal is deservedly popular. It provides strong security without being significantly more complicated than a traditional SMS client. In my social circle there's massively more users of Signal than any other security app. If I transition to a fork of Signal, I'm no longer able to securely communicate with them unless they also install the fork. If the aim is to make secure communication ubiquitous, that's kind of a problem.

Right now the choices I have for communicating with people I know are either convenient and secure but require non-free code (Signal), convenient and free but insecure (SMS) or secure and free but horribly inconvenient (gpg). Is there really no way for us to work as a community to develop something that's all three?

May 12, 2016 02:40 PM

May 10, 2016

Daniel Vetter: Neat drm/i915 Stuff for 4.7

The 4.6 release is almost out of the door, it’s time to look at what’s in store for 4.7.

Let’s first look at the epic saga called atomic support. In 4.7 the atomic watermark update support for Ironlake through Broadwell from Matt Roper, Ville Syrjälä and others finally landed. This took about 3 attempts to get merged because there’s lots of small corner cases that caused regressions each time around, but it’s finally done. And it’s an absolutely key piece for atomic support, since Intel hardware does not support atomic updates of the watermark settings for the display fetch fifos. And if those values are wrong, tearing and other ugly things will result. We still need corresponding support for other platforms, but this is a really big step. But that’s not the only atomic work: Maarten Lankhorst made the hardware state checker atomic, and there’s been tons of smaller things all over to move the driver towards the shiny new.

Another big feature on the display side is color management, implemented by Lionel Landwerlin, and then fixes to make it fully atomic from Maarten. Color management aims for more accurate reproduction of a well defined color space on panels, using a de-gamma table, then a color matrix, and finally a gamma table.

For platform enabling the big thing is support for DSI panels on Broxton from Jani Nikula and Ramalingam C. One fallout from this effort is the cleaned up VBT parsing code, done by Jani. There’s now a clean split between parsing the different VBT versions on all the various platforms, now neatly consolidated, and using that information in different places within the driver. Ville also hooked up upscaling/panel fitting for DSI panels on all platforms.

Looking more at driver internals Ander Conselvan de Oliveira and Ville refactored the entire display PLL code on all platforms, with the goal to reuse it in the DP detection code for upfront link training. This is needed to detect the link configuration in certain situations like USB type C connectors. Shubhangi Shrivastava reworked the DP detection code itself, again to prep for these features. Still on pure display topics Ville fixed lots of underrun issues to appease our CI on lots of platforms. Together with the atomic watermark updates this should shut up one of the largest sources of noise in our test results.

Moving on to power management work the big thing is lots of small fixes for the runtime PM support all over the place from Imre Deak and Ville, with a big focus on the Broxton platform. And while we talk features affecting the entire driver: Imre added fault injection to the driver load paths so that we can start to exercise all that code in an automated way.

Finally looking at the render/GEM side of the driver the short summary is that Tvrtko Ursulin and Chris Wilson worked the code all over the place: cleaned up and tuned forcewake handling code from Tvrtko, fixes for more userptr corner cases from Chris, a new notifier to handle vmap exhaustion and assorted polish in the related shrinker code, cleaned up and fixed handling of gpu reset corner cases, fixes for context related hard hangs on Sandybridge and Ironlake, large-scale renaming of parameters and structures to realign old code with the newish execlist hardware mode, the list goes on. And finally a rather big piece, and one which causes some trouble, is all the work to speed up the execlist code, with a big focus on reducing interrupt handling overhead. This was done by moving the expensive parts of execlist interrupt handling into a tasklet. Unfortunately that uncovered some bugs in our interrupt handling on Braswell, so Ville jumped in and fixed it all up, plus of course removed some cruft and applied some nice polish.

Other work in the GT area includes gpu hang fixes for Skylake GT3 and GT4 configurations from Mika Kuoppala. Mika also provided patches to improve the edram handling on those same chips. Alex Dai and Dave Gordon kept working on making GuC ready for prime time, but it’s not there yet. And Peter Antoine improved the MOCS support to work on all engines.

And of course there’s been tons of smaller improvements, bugfixes, cleanups and refactorings all over the place, as usual.

May 10, 2016 09:52 PM

May 06, 2016

Pete Zaitcev: Dropbox lifts the kimono

Dropbox posted somewhat of a whitepaper about their exabyte storage system, which exceeds the largest Swift cluster by about 2 orders of magnitude. Here's a couple of fun quotes:

The Block Index is a giant sharded MySQL cluster, fronted by an RPC service layer, plus a lot of tooling for database operations and reliability. We’d originally planned on building a dedicated key-value store for this purpose but MySQL turned out to be more than capable.

Kinda like SQLite in Swift.

Cells are self-contained logical storage clusters that store around 50PB of raw data.

And they have dozens of those. Their cell has a master node BTW. Kinda like Ceph's PG, but unlike Swift.

RTWT

May 06, 2016 08:38 PM

May 05, 2016

LPC 2016: Tracing Microconference Accepted into 2016 Linux Plumbers Conference

After taking a break in 2015, Tracing is back at Plumbers this year! Tracing is heavily used throughout the Linux ecosystem, and provides an essential method for extracting information about the underlying code that is running on the system. Although tracing is simple in concept, effective usage and implementation can be quite involved.

Topics proposed for this year’s event include new features in the BPF compiler collection, perf, and ftrace; visualization frameworks; large-scale tracing and distributed debugging; always-on analytics and monitoring; do-it-yourself tracing tools; and, last but not least, a kernel-tracing wishlist.

We hope to see you there!

May 05, 2016 10:21 PM

May 04, 2016

Paul E. McKenney: Tracing Microconference Accepted into 2016 Linux Plumbers Conference

After taking a break in 2015, Tracing is back at Plumbers this year! Tracing is heavily used throughout the Linux ecosystem, and provides an essential method for extracting information about the underlying code that is running on the system. Although tracing is simple in concept, effective usage and implementation can be quite involved.

Topics proposed for this year's event include new features in the BPF compiler collection, perf, and ftrace; visualization frameworks; large-scale tracing and distributed debugging; always-on analytics and monitoring; do-it-yourself tracing tools; and, last but not least, a kernel-tracing wishlist.

We hope to see you there!

May 04, 2016 03:18 PM

April 28, 2016

Pete Zaitcev: OpenStack Swift Proxy-FS by SwiftStack

SwiftStack's Joe Arnold and John Dickinson chose the Austin Summit and a low-key #vBrownBag venue to come out of the closet with PROXY-FS (also spelled as ProxyFS), a tightly integrated addition to OpenStack Swift, which provides POSIX-ish filesystem access to a Swift cluster.

Proxy-FS is basically a peer to a lesser-known feature of Ceph Rados Gateway that permits accessing it over NFS. Both of them are fundamentally different from e.g. Swift-on-file in that the data is kept in Swift or Ceph, instead of a general filesystem.

The object layout is natural in that it takes advantage of SLO by creating a log-structured, manifested object. This way the in-place updates are handled, including appends. Yes, you can create a manifest with a billion 1-byte objects just by invoking write(2). So, don't do that.

In response to my question, Joe promised to open the source, although we don't know when.

Another question dealt with the performance expectations. The small I/O performance of Proxy-FS is not going to be great in comparison to a traditional NFS filer. One of its key features is relative transparency: there is no cache involved and every application request goes straight to Swift. This helps to adhere to the principle of least surprise, as well as achieve the scalability for which Swift is famous. There is no need for any upcalls/cross-calls from the Swift Proxy into Proxy-FS that invalidate the cache, because there's no cache. But it has to be understood that Proxy-FS, as well as NFS mode in RGW, are not intended to compete with NetApp.

Not directly, anyway. But what they could do is to disrupt, in Christensen's sense. His disruption examples were defined as technologies that are markedly inferior to incumbents, as well as dramatically cheaper. Swift and Ceph are both: the filesystem performance sucks balls and the price per terabyte is 1/10th of NetApp (this statement was not evaluated by the Food and Drug Administration). If new applications come about that make use of these properties... You know the script.

April 28, 2016 04:42 PM

Daniel Vetter: X.Org Foundation Election Results

Two questions were up for voting, 4 seats on the Board of Directors and approval of the amended By-Laws to join SPI.

Congratulations to our reelected and new board members Egbert Eich, Alex Deucher, Keith Packard and Bryce Harrington. Thanks a lot to Lucas Stach for running. And also big thanks to our outgoing board member Matt Dew, who stepped down for personal reasons.

On the bylaw changes and merging with SPI, 61 out of 65 active members voted, with 54 voting yes, 4 no and 3 abstained. Which means we’re well past the 2/3rd quorum for bylaw changes, and everything’s green now to proceed with the plan to join SPI!

April 28, 2016 06:37 AM

April 27, 2016

James Bottomley: Unprivileged Build Containers

A while ago, a goal I set myself was to be able to maintain my build and test environments for architecture emulation containers without having to do any of the tasks as root and without creating any suid binaries to do this.  One of the big problems here is that distributions get annoyed (and don’t run correctly) if root doesn’t own most of the files … for instance the installers all check to see that the file got installed with the correct ownership and permissions and fail if they don’t.  Debian has an interesting mechanism, called fakeroot, to get around this using a preload library intercepting the chmod and chown system calls, but it’s getting a bit hackish to try to extend this to work permanently for an emulation container.

The correct way to do this is with user namespaces so the rest of this post will show you how.  Before we get into how to use them, let’s begin with the theory of how user namespaces actually work.

Theory of User Namespaces

A user namespace is the single namespace that can be created by an unprivileged user.  Its job is to map a set of interior (inside the user namespace) uids, gids and projids [1] to a set of exterior (outside the user namespace) ones.

The way this works is that the root user namespace simply has a 1:1 identity mapping of all 2^32 identifiers, meaning it fully covers the space.  However, any new user namespace only need remap a subset of these.  Any id that is not mapped into the user namespace becomes inaccessible to that namespace.  This doesn’t mean completely inaccessible, it just means any resource owned or accessed by an unmapped id treats an attempted access (even from root in the namespace) as though it were completely unprivileged, so if the resource is readable by any id, it can still be read even in a user namespace where its owning id is unmapped.

User namespaces can also be nested but the nested namespace can only map ids that exist in its parent, so you can only reduce but not expand the id space by nesting.  The way the nested mapping works is that it remaps through all the parent namespaces, so what appears on the resource is still the original exterior ids.

User Namespaces also come with an owner (the uid/gid of the process that created the container).  The reason for this is that this owner is allowed to execute setuid/setgid to any id mapped into the namespace, so the owning uid/gid pair is the effective “root” of the container.  Note that this setuid/setgid property works on entry to the namespace even if uid 0 is not mapped inside the namespace, but won’t survive once the first setuid/setgid is issued.

The final piece of the puzzle is that every other namespace also has an owning user namespace, so while I cannot create a mount namespace as unprivileged user jejb, I can as remapped root inside my user namespace:

jejb@jarvis:~> unshare --mount
unshare: unshare failed: Operation not permitted
jejb@jarvis:~> nsenter --user=/tmp/userns
root@jarvis:~# unshare --mount
root@jarvis:~#

And once created, I can always enter this mount namespace provided I’m also in my user namespace.

Setting up Unprivileged Namespaces

Any system user can actually create a user namespace.  However a non-root (meaning not uid zero in the parent namespace) user cannot remap any exterior id except their own.  This means that, because a build container needs a range of ids, it’s not possible to set up the initial remapped namespace without the help of root.  However, once that is done, the user can pretty much do every other operation.

The way remap ranges are set up is via the uid_map, gid_map and projid_map files sitting inside the /proc/<pid> directory.  These files may only be written to once and never updated.

As an example, to set up a build container, I need a remapping for every id that would be created during installation.  Traditionally for Linux, these are ids 0-999.  I want to remap them down to something unprivileged, say 100,000 so my line entry for this is

0 100000 1000

However, I also want an identity mapping for my own id (currently I’m at uid 1000), so I can still use my home directory from within the build container.  This also means I can create the roots for the containers within my home directory.  Finally, the nobody user and nobody,nogroup groups also need to be mapped, so the final uid map entries look like

0 100000 1000
1000 1000 1
65534 101001 1
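To make the three entries concrete, the following Python sketch parses the same text and translates ids in both directions.  The function names are made up for illustration, and None simply stands for "unmapped" here.

```python
UID_MAP_TEXT = """\
0 100000 1000
1000 1000 1
65534 101001 1
"""


def parse_map(text):
    """Turn map text into a list of (inside, outside, count) tuples."""
    entries = []
    for line in text.splitlines():
        if line.strip():
            inside, outside, count = (int(field) for field in line.split())
            entries.append((inside, outside, count))
    return entries


def inside_to_outside(entries, inside_id):
    """Which exterior id does an interior id correspond to?"""
    for inside, outside, count in entries:
        if inside <= inside_id < inside + count:
            return outside + (inside_id - inside)
    return None


def outside_to_inside(entries, outside_id):
    """Which interior id does an exterior id appear as?"""
    for inside, outside, count in entries:
        if outside <= outside_id < outside + count:
            return inside + (outside_id - outside)
    return None


uid_map = parse_map(UID_MAP_TEXT)
print(inside_to_outside(uid_map, 0))       # 100000
print(inside_to_outside(uid_map, 999))     # 100999
print(inside_to_outside(uid_map, 1000))    # 1000 (identity mapping)
print(inside_to_outside(uid_map, 65534))   # 101001
print(inside_to_outside(uid_map, 1001))    # None
print(outside_to_inside(uid_map, 100999))  # 999
print(outside_to_inside(uid_map, 0))       # None: real root is not mapped
```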

For the groups, it’s even more complex because on openSUSE, I’m a member of the users group (gid 100) which sits in the middle of the privileged 0-999 group range, so the gid_map entry I use is

0 100000 100
100 100 1
101 100100 899
65533 101000 2

Which is almost up to the kernel imposed limit of five separate lines.
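Since the map files can only be written once, it can be worth checking a map before handing it to root.  The sketch below is a hypothetical helper, not a kernel interface: it checks the line count against the five line limit mentioned above (passed as a parameter, since that limit is a property of the running kernel) and looks for overlapping ranges on either side of the mapping.

```python
GID_MAP_TEXT = """\
0 100000 100
100 100 1
101 100100 899
65533 101000 2
"""


def check_map(text, max_lines=5):
    """Return a list of problems found in the map text (empty if none)."""
    entries = [
        tuple(int(field) for field in line.split())
        for line in text.splitlines()
        if line.strip()
    ]
    problems = []
    if len(entries) > max_lines:
        problems.append(f"{len(entries)} lines, limit is {max_lines}")
    # Column 0 holds the interior start, column 1 the exterior start.
    for label, column in (("inside", 0), ("outside", 1)):
        ranges = sorted((e[column], e[column] + e[2]) for e in entries)
        for (start1, end1), (start2, end2) in zip(ranges, ranges[1:]):
            if start2 < end1:
                problems.append(
                    f"{label} ranges overlap: "
                    f"{start1}-{end1 - 1} and {start2}-{end2 - 1}"
                )
    return problems


print(check_map(GID_MAP_TEXT))                    # []
print(check_map("0 100000 100\n50 200000 100"))   # one inside overlap
```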

Finally, here’s how to set this up and create a binding for the user namespace.  As myself (I’m uid 1000 user name jejb) I do

jejb@jarvis:~> unshare --user
nobody@jarvis:~> echo $$
20211
nobody@jarvis:~>

Note that I become nobody inside the container because currently the map files are unwritten so there are no mapped ids at all.  Now as root, I have to write the mapping files and bind the entry file to the namespace somewhere

jarvis:/home/jejb # echo 1|awk '{print "0 100000 1000\n1000 1000 1\n65534 101001 1"}' > /proc/20211/uid_map
jarvis:/home/jejb # echo 1|awk '{print "0 100000 100\n100 100 1\n101 100100 899\n65533 101000 2"}' > /proc/20211/gid_map
jarvis:/home/jejb # touch /tmp/userns
jarvis:/home/jejb # mount --bind /proc/20211/ns/user /tmp/userns

Now I can exit my user namespace because it’s permanently bound and the next time I enter it I become root inside the container (although with uid 100000 outside)

jejb@jarvis:~> nsenter --user=/tmp/userns
root@jarvis:~# id
uid=0(root) gid=0(root) groups=0(root)
root@jarvis:~# su - jejb
jejb@jarvis:~> id
uid=1000(jejb) gid=100(users) groups=100(users)

Giving me a user namespace with sufficient mapped ids to create a build container.

Unprivileged Architecture Emulation Containers

Essentially, I can use the user namespace constructed above to bootstrap and enter the entire build container and its mount namespace with one proviso that I have to have a pre-created devices directory because I don’t possess the mknod capability as myself, so my container root also doesn’t possess it.  The way I get around this is to create the initial dev directory as root and then change the ownership to 100000.100000 (my unprivileged ids)

jejb@jarvis:~/containers/debian-amd64/dev> ls -l
total 0
lrwxrwxrwx 1 100000 100000 13 Feb 20 09:45 fd -> /proc/self/fd/
crw-rw-rw- 1 100000 100000 1, 7 Feb 20 09:45 full
crw-rw-rw- 1 100000 100000 1, 3 Feb 20 09:45 null
lrwxrwxrwx 1 100000 100000 8 Feb 20 09:45 ptmx -> pts/ptmx
drwxr-xr-x 2 100000 100000 6 Feb 20 09:45 pts/
crw-rw-rw- 1 100000 100000 1, 8 Feb 20 09:45 random
drwxr-xr-x 2 100000 100000 6 Feb 20 09:45 shm/
lrwxrwxrwx 1 100000 100000 15 Feb 20 09:45 stderr -> /proc/self/fd/2
lrwxrwxrwx 1 100000 100000 15 Feb 20 09:45 stdin -> /proc/self/fd/0
lrwxrwxrwx 1 100000 100000 15 Feb 20 09:45 stdout -> /proc/self/fd/1
crw-rw-rw- 1 100000 100000 5, 0 Feb 20 09:45 tty
crw-rw-rw- 1 100000 100000 1, 9 Feb 20 09:45 urandom
crw-rw-rw- 1 100000 100000 1, 5 Feb 20 09:45 zero
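For reference, a skeleton like the one listed above could be created by real root with something along these lines.  This is only a sketch: the function name is made up, the device numbers and link targets are read off the listing (assuming the trailing slashes there are ls decoration), and it has to run as root outside the user namespace because creating device nodes needs the mknod capability.

```python
import os
import stat

# Character devices from the listing: name -> (major, minor).
CHAR_DEVICES = {
    "full": (1, 7), "null": (1, 3), "random": (1, 8),
    "tty": (5, 0), "urandom": (1, 9), "zero": (1, 5),
}
# Symlinks from the listing: name -> target.
SYMLINKS = {
    "fd": "/proc/self/fd", "ptmx": "pts/ptmx",
    "stderr": "/proc/self/fd/2", "stdin": "/proc/self/fd/0",
    "stdout": "/proc/self/fd/1",
}
DIRECTORIES = ["pts", "shm"]


def create_dev_skeleton(dev_dir, uid=100000, gid=100000):
    """Populate dev_dir and hand every entry to the unprivileged ids.

    Must be run as real root; an unprivileged caller will get a
    permission error from mknod or chown.
    """
    os.makedirs(dev_dir, exist_ok=True)
    for name in DIRECTORIES:
        path = os.path.join(dev_dir, name)
        os.mkdir(path, 0o755)
        os.chown(path, uid, gid)
    for name, target in SYMLINKS.items():
        path = os.path.join(dev_dir, name)
        os.symlink(target, path)
        os.lchown(path, uid, gid)  # change the link itself, not its target
    for name, (major, minor) in CHAR_DEVICES.items():
        path = os.path.join(dev_dir, name)
        os.mknod(path, stat.S_IFCHR | 0o666, os.makedev(major, minor))
        os.chmod(path, 0o666)  # the mode given to mknod is filtered by umask
        os.chown(path, uid, gid)
```

Calling create_dev_skeleton("/home/jejb/containers/debian-amd64/dev") as root would be the equivalent of the manual steps, with that path taken from the prompt in the listing.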

This seems to be a sufficient dev skeleton to function with.  For completeness’ sake, I placed my bound user namespace into /run/build-container/userns and followed the original architecture container emulation post, with modified susebootstrap and build-container scripts.  The net result is that as myself I can now enter, administer and update the build and test architecture emulation container with no suid required

jejb@jarvis:~> nsenter --user=/run/build-container/userns --mount=/run/build-container/ppc64
root@jarvis:/# id
uid=0(root) gid=0(root) groups=0(root)
root@jarvis:/# uname -m
ppc64
root@jarvis:/# su - jejb
jejb@jarvis:~> id
uid=1000(jejb) gid=100(users) groups=100(users)

The only final wrinkle is that root has to set up the user namespace on every boot, but that’s only because there’s no currently defined operating system way of doing this.

April 27, 2016 06:52 PM

April 26, 2016

LPC 2016: Checkpoint-Restore Microconference Accepted into 2016 Linux Plumbers Conference

This year will feature a four-fold deeper dive into checkpoint-restore technology, thanks to participation by people from a number of additional related projects! These are the OpenMPI message-passing library, Berkeley Lab Checkpoint/Restart (BLCR), and Distributed MultiThreaded CheckPointing (DMTCP) (not to be confused with TCP/IP), in addition to the Checkpoint/Restore in Userspace group that has participated in prior years.

Docker integration remains a hot topic, as is post-copy live migration, as well as testing/validation. As you might guess from the inclusion of people from BLCR and OpenMPI, checkpoint-restore for distributed workloads (rather than just single systems) is an area of interest.

Please join us for a timely and important discussion!

April 26, 2016 02:43 PM

April 25, 2016

Paul E. Mc Kenney: Checkpoint-Restore Microconference Accepted into 2016 Linux Plumbers Conference

This year will feature a four-fold deeper dive into checkpoint-restore technology, thanks to participation by people from a number of additional related projects! These are the OpenMPI message-passing library, Berkeley Lab Checkpoint/Restart (BLCR), and Distributed MultiThreaded CheckPointing (DMTCP) (not to be confused with TCP/IP), in addition to the Checkpoint/Restore in Userspace group that has participated in prior years.

Docker integration remains a hot topic, as is post-copy live migration, as well as testing/validation. As you might guess from the inclusion of people from BLCR and OpenMPI, checkpoint-restore for distributed workloads (rather than just single systems) is an area of interest.

Please join us for a timely and important discussion!

April 25, 2016 05:28 PM

April 24, 2016

LPC 2016: Testing & Fuzzing Microconference Accepted into 2016 Linux Plumbers Conference

Testing, fuzzing, and other diagnostics have made the Linux ecosystem much more robust than in the past, but there are still embarrassing bugs. Furthermore, million-year bugs will be happening many times per day across Linux’s huge installed base, so there is clearly a need for even more aggressive validation.

The Testing and Fuzzing Microconference aims to significantly increase the aggression level of Linux-kernel validation, with discussions on tools and test suites including kselftest, syzkaller, trinity, mutation testing, and the 0day Test Robot. The effectiveness of these tools will be attested to by any of their victims, but we must further raise our game as the installed base of Linux continues to increase.

One additional way of raising the level of testing aggression is to document the various ABIs in machine-readable format, thus lowering the barrier to entry for new projects. Who knows? Perhaps Linux testing will be driven by artificial-intelligence techniques!

Join us for an important and spirited discussion!

April 24, 2016 07:43 PM

Daniel Vetter: Should the X.org Foundation join SPI? Vote Now!

In case you missed it: Please vote now on https://members.x.org/login.php!

April 24, 2016 09:14 AM

April 22, 2016

LPC 2016: Device Tree Microconference Accepted into 2016 Linux Plumbers Conference

Device-tree discussions are probably not quite as spirited as in the past; however, device tree is still an active area. In particular, significant issues remain.

This microconference will cover the updated device-tree specification, debugging, bindings validation, and core-kernel code directions. In addition, there will be discussion of what does (and does not) go into the device tree, along with the device-creation and driver binding ordering swamp.

Join us for an important and spirited discussion!

April 22, 2016 01:33 PM

Matthew Garrett: Circumventing Ubuntu Snap confinement

Ubuntu 16.04 was released today, with one of the highlights being the new Snap package format. Snaps are intended to make it easier to distribute applications for Ubuntu - they include their dependencies rather than relying on the archive, they can be updated on a schedule that's separate from the distribution itself and they're confined by a strong security policy that makes it impossible for an app to steal your data.

At least, that's what Canonical assert. It's true in a sense - if you're using Snap packages on Mir (ie, Ubuntu mobile) then there's a genuine improvement in security. But if you're using X11 (ie, Ubuntu desktop) it's horribly, awfully misleading. Any Snap package you install is completely capable of copying all your private data to wherever it wants with very little difficulty.

The problem here is the X11 windowing system. X has no real concept of different levels of application trust. Any application can register to receive keystrokes from any other application. Any application can inject fake key events into the input stream. An application that is otherwise confined by strong security policies can simply type into another window. An application that has no access to any of your private data can wait until your session is idle, open an unconfined terminal and then use curl to send your data to a remote site. As long as Ubuntu desktop still uses X11, the Snap format provides you with very little meaningful security. Mir and Wayland both fix this, which is why Wayland is a prerequisite for the sandboxed xdg-app design.

I've produced a quick proof of concept of this. Grab XEvilTeddy from git, install Snapcraft (it's in 16.04), snapcraft snap, sudo snap install xevilteddy*.snap, /snap/bin/xevilteddy.xteddy . An adorable teddy bear! How cute. Now open Firefox and start typing, then check back in your terminal window. Oh no! All my secrets. Open another terminal window and give it focus. Oh no! An injected command that could instead have been a curl session that uploaded your private SSH keys to somewhere that's not going to respect your privacy.

The Snap format provides a lot of underlying technology that is a great step towards being able to protect systems against untrustworthy third-party applications, and once Ubuntu shifts to using Mir by default it'll be much better than the status quo. But right now the protections it provides are easily circumvented, and it's disingenuous to claim that it currently gives desktop users any real security.

April 22, 2016 01:51 AM

April 20, 2016

Paul E. Mc Kenney: Testing & Fuzzing Microconference Accepted into 2016 Linux Plumbers Conference

Testing, fuzzing, and other diagnostics have made the Linux ecosystem much more robust than in the past, but there are still embarrassing bugs. Furthermore, million-year bugs will be happening many times per day across Linux's huge installed base, so there is clearly a need for even more aggressive validation.

The Testing and Fuzzing Microconference aims to significantly increase the aggression level of Linux-kernel validation, with discussions on tools and test suites including kselftest, syzkaller, trinity, mutation testing, and the 0day Test Robot. The effectiveness of these tools will be attested to by any of their victims, but we must further raise our game as the installed base of Linux continues to increase.

One additional way of raising the level of testing aggression is to document the various ABIs in machine-readable format, thus lowering the barrier to entry for new projects. Who knows? Perhaps Linux testing will be driven by artificial-intelligence techniques!

Join us for an important and spirited discussion!

April 20, 2016 04:43 PM

Daniel Vetter: X.org Foundation Election - Vote Now!

It’s election season in X.org land, and it matters: Besides new board seats we’re also voting on bylaw changes and whether to join SPI or not.

Personally, and as the secretary of the board I’m very much in favour of joining SPI. It will allow us to offload all the boring bits of running a foundation, and those are also all the bits we tend to struggle with. And that would give the board more time to do things that actually matter and help the community. And all that for a really reasonable price - running our own legal entity isn’t free, and not really worth it for our small budget mostly consisting of travel sponsoring and the occasional internship.

And bylaw changes need a qualified supermajority of all members, every vote counts and not voting essentially means voting no. Hence please vote, and please vote even when you don’t want to join - this is our second attempt and I’d really like to see a clear verdict from our members, one way or the other.

Thanks.

Voting closes by Apr 26 23:59 UTC, but please don’t cut it short, it’s a computer that decides when it’s over …

April 20, 2016 07:24 AM

April 18, 2016

Paul E. Mc Kenney: Device Tree Microconference Accepted into 2016 Linux Plumbers Conference

Device-tree discussions are probably not quite as spirited as in the past; however, device tree is still an active area. In particular, significant issues remain.

This microconference will cover the updated device-tree specification, debugging, bindings validation, and core-kernel code directions. In addition, there will be discussion of what does (and does not) go into the device tree, along with the device-creation and driver binding ordering swamp.

Join us for an important and spirited discussion!

April 18, 2016 03:21 PM

Matthew Garrett: One more attempt at SATA power management

Around a year ago I wrote some patches in an attempt to improve power management on Haswell and Broadwell systems by configuring Serial ATA power management appropriately. I got a couple of reports of them triggering SATA errors for some users, couldn't reproduce them myself and so didn't have a lot of confidence in them. Time passed.

I've been working on power management stuff again this week, so it seemed like a good opportunity to revisit these. I've made a few changes and pushed a couple of trees - one against master and one against 4.5.

First, these probably only have relevance to users of mobile Intel parts in the U or S range (/proc/cpuinfo will tell you - you're looking for a four-digit number that starts with 4 (Haswell), 5 (Broadwell) or 6 (Skylake) and ends with U or S), and won't do anything unless you have SATA drives (including PCI-based SATA). To test them, first disable anything like TLP that might alter your SATA link power management policy. Then check powertop - you should only be getting to PC3 at best. Build a kernel with these patches and boot it. /sys/class/scsi_host/*/link_power_management_policy should read "firmware". Check powertop and see whether you're getting into deeper PC states. Now run your system for a while and check the kernel log for any SATA errors that you didn't see before.
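The two checks described above (CPU model and link power management policy) can be scripted. The following Python sketch is one way to do it; the function names are made up, and the regular expression assumes the usual "i7-6600U" style of model name in /proc/cpuinfo, which may not hold for every part.

```python
import glob
import re

POLICY_GLOB = "/sys/class/scsi_host/*/link_power_management_policy"
# Four-digit number starting with 4, 5 or 6 and ending in U or S.
MODEL_RE = re.compile(r"\b([456])\d{3}([US])\b")
GENERATIONS = {"4": "Haswell", "5": "Broadwell", "6": "Skylake"}


def cpu_generation(model_name):
    """Return the generation name for a matching model string, else None."""
    match = MODEL_RE.search(model_name)
    return GENERATIONS[match.group(1)] if match else None


def cpuinfo_model(path="/proc/cpuinfo"):
    """Return the first 'model name' value, or None if it cannot be read."""
    try:
        with open(path) as cpuinfo:
            for line in cpuinfo:
                if line.startswith("model name"):
                    return line.split(":", 1)[1].strip()
    except OSError:
        pass
    return None


def read_policies(pattern=POLICY_GLOB):
    """Map each matching policy file to its current contents."""
    policies = {}
    for path in sorted(glob.glob(pattern)):
        with open(path) as policy_file:
            policies[path] = policy_file.read().strip()
    return policies


model = cpuinfo_model()
print("CPU:", model, "->", cpu_generation(model) if model else None)
for path, policy in read_policies().items():
    print(path, "=", policy)
```

With the patched kernel every policy line should read "firmware"; PC state residency still has to be checked in powertop.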

Let me know if you see SATA errors and are willing to help debug this, and leave a comment if you don't see any improvement in PC states.

April 18, 2016 02:15 AM

April 17, 2016

LPC 2016: PCI Microconference Accepted into 2016 Linux Plumbers Conference

Given that PCI was introduced more than two decades ago and that PCI Express was introduced more than ten years ago, one might think that the Linux plumbing already did everything possible to support PCI.

One would be quite wrong.

One issue with current PCI support is that resource allocation is handled on a per-architecture basis, leading to duplicate code, and, worse yet, duplicate bugs. This microconference will therefore look into possible consolidation of this code.

Another issue is integration of PCI’s message-signaled interrupts (MSI) into the kernel’s core interrupt code. Whilst the device tree bindings and relative core code for MSI management have just been integrated, legacy code needs more work, including architecture-specific code and PCI legacy host controller drivers. In addition, attention should be paid to the ACPI bindings, to the related ACPI kernel core code, and to the MSI passthrough usage in virtualized environments.

Advances in Virtualization technologies, System I/O devices and their memory management through IOMMU components are driving features in the PCI Express specifications (Address Translation Services (ATS) and Page Request Interface (PRI)). This microconference will also foster debate on the best way to integrate these features in the kernel in a seamless way for different architectures.

Of course, no discussion of PCI would be complete without considering firmware issues, hardware quirks, power management, and the interplay between device tree and ACPI.

Join us for an important new-to-Plumbers PCI discussion!

April 17, 2016 04:01 PM

April 15, 2016

Pavel Machek: Python is a nice... trap

Python is a nice... language

Easy to program in. No need to recompile, you can hack in as you would in shell. Fun.
Let's see if I can predict the weather. Oh yes, let's hack something in Python. Hey, it works. But it is slow on a PC, and so slow on a phone that the future is already past by the time you predict it.
Then you try to improve the speed. That's a very, very bad idea. You can't optimize Python, but you can spend hours trying. You make a mental note to never ever use Python for computation again.
But Python should be fine for a simple Gtk project, communicating over Dbus, right?
For bigger projects, Python has some support. Modules / object orientation really help there. Lack of compilation really hurts there, but this is a fun language, right? Allowing you to write nice and clean code.
Refactoring is hard, because variables are not declared.
Python is a nice... trap
But then your project grows larger, and you realize it takes 30 seconds to start up. Because many modules need gtk, and importing gtk takes time. Heck... compiling _and_ running a C based hello world is still faster than running a Python based one.
Ok, C++ was tempting. Gtk Hello world takes 55 seconds to compile. My g++ does not support "auto", so the application starts with Glib::RefPtr<Gtk::Application> app = Gtk::Application::create(argc, argv, "org.gtkmm.example"). I think I'll stick with C.
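If you want to see where the start-up time goes, you can time the imports one by one. This is a minimal sketch; the function name is made up, and the two standard library modules are only stand-ins, so substitute whatever your project imports (gtk, dbus, ...).

```python
import importlib
import sys
import time


def time_import(module_name):
    """Return the seconds spent importing module_name.

    A module that is already in sys.modules is not loaded again, so the
    result is close to zero for anything imported earlier in the process.
    """
    start = time.perf_counter()
    importlib.import_module(module_name)
    return time.perf_counter() - start


# Stand-in modules; replace with the ones your application imports.
for name in ("json", "decimal"):
    cached = name in sys.modules
    elapsed_ms = time_import(name) * 1000
    print(f"{name}: {elapsed_ms:.2f} ms (already imported: {cached})")
```

Run it in a fresh interpreter, otherwise the cache hides the real cost.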

Oh, and I already used my main SIM in N900/Debian... and I really used N900/Debian. I needed to tether a network, but not provide wifi hotspot. Done.

April 15, 2016 11:17 AM

Matthew Garrett: David MacKay

The first time I was paid to do software development came as something of a surprise to me. I was working as a sysadmin in a computational physics research group when a friend asked me if I'd be willing to talk to her PhD supervisor. I had nothing better to do, so said yes. And that was how I started the evening having dinner with David MacKay, and ended the evening better fed, a little drunker and having agreed in principle to be paid to write free software.

I'd been hired to work on Dasher, an information-efficient text entry system. It had been developed by one of David's students as a practical demonstration of arithmetic encoding after David had realised that presenting a visualisation of an effective compression algorithm allowed you to compose text without having to enter as much information into the system. At first this was merely a neat toy, but it soon became clear that the benefits of Dasher had a great deal of overlap with good accessibility software. It required much less precision of input, it made it easy to correct mistakes (you merely had to reverse direction in order to start zooming back out of the text you had entered) and it worked with a variety of input technologies from mice to eye tracking to breathing. My job was to take this codebase and turn it into a project that would be interesting to external developers.

In the year I worked with David, we turned Dasher from a research project into a well-integrated component of Gnome, improved its support for Windows, accepted code from an external contributor who ported it to OS X (using an OpenGL canvas!) and wrote ports for a range of handheld devices. We added code that allowed Dasher to directly control the UI of other applications, making it possible for people to drive word processors without having to leave Dasher. We taught Dasher to speak. We strove to avoid the mistakes present in so many other pieces of accessibility software, such as configuration that could only be managed by an (expensive!) external consultant. And we visited Dasher users and learned how they used it and what more they needed, then went back home and did what we could to provide that.

Working on Dasher was an incredible opportunity. I was involved in the development of exciting code. I spoke on it at multiple conferences. I became part of the Gnome community. I visited the USA for the first time. I entered people's homes and taught them how to use Dasher and experienced their joy as they realised that they could now communicate up to an order of magnitude more quickly. I wrote software that had a meaningful impact on the lives of other people.

Working with David was certainly not easy. Our weekly design meetings were, charitably, intense. He had an astonishing number of ideas, and my job was to figure out how to implement them while (a) not making the application overly complicated and (b) convincing David that it still did everything he wanted. One memorable meeting involved me gradually arguing him down from wanting five new checkboxes to agreeing that there were only two combinations that actually made sense (and hence a single checkbox) - and then admitting that this was broadly equivalent to an existing UI element, so we could just change the behaviour of that slightly without adding anything. I took the opportunity to delete an additional menu item in the process.

I was already aware of the importance of free software in terms of developers, but working with David made it clear to me how important it was to users as well. A community formed around Dasher, helping us improve it and allowing us to develop support for new use cases that made the difference between someone being able to type at two words per minute and being able to manage twenty. David saw that this collaborative development would be vital to creating something bigger than his original ideas, and it succeeded in ways he couldn't have hoped for.

I spent a year in the group and then went back to biology. David went on to channel his strong feelings about social responsibility into issues such as sustainable energy, writing a freely available book on the topic. He served as chief adviser to the UK Department of Energy and Climate Change for five years. And earlier this year he was awarded a knighthood for his services to scientific outreach.

David died yesterday. It's unlikely that I'll ever come close to what he accomplished, but he provided me with much of the inspiration to try to do so anyway. The world is already a less fascinating place without him.

April 15, 2016 06:26 AM

April 13, 2016

Matthew Garrett: Skylake's power management under Linux is dreadful and you shouldn't buy one until it's fixed

(Edit to add: this issue is restricted to the mobile SKUs. Desktop parts have very different power management behaviour)

Linux 4.5 seems to have got Intel's Skylake platform (ie, 6th-generation Core CPUs) to the point where graphics work pretty reliably, which is great progress (4.4 tended to lose all my windows every so often, especially over suspend/resume). I'm even running Wayland happily. Unfortunately one of the reasons I have a laptop is that I want to be able to do things like use it on battery, and power consumption's an important part of that. Skylake continues the trend from Haswell of moving to an SoC-type model where clock and power domains are shared between components that were previously entirely independent, and so you can't enter deep power saving states unless multiple components all have the correct power management configuration. On Haswell/Broadwell this manifested in the form of Serial ATA link power management being involved in preventing the package from going into deep power saving states - setting that up correctly resulted in a reduction in full-system power consumption of about 40%[1].

I've now got a Skylake platform with a nice shiny NVMe device, so Serial ATA policy isn't relevant (the platform doesn't even expose a SATA controller). The deepest power saving state I can get into is PC3, despite Skylake supporting PC8 - so I'm probably consuming about 40% more power than I should be. And nobody seems to know what needs to be done to fix this. I've found no public documentation on the power management dependencies on Skylake. Turning on everything in Powertop doesn't improve anything. My battery life is pretty poor and the system is pretty warm.

The best thing about this is the following statement from page 64 of the 6th Generation Intel® Processor Datasheet for U-Platforms:

Caution: Long term reliability cannot be assured unless all the Low-Power Idle States are enabled.

which is pretty concerning. Without support for states deeper than PC3, Linux is running in a configuration that Intel imply may trigger premature failure. That's obviously not good. Until this situation is improved, you probably shouldn't buy any Skylake systems if you're planning on running Linux.

[1] These patches never went upstream. Someone reported that they resulted in their SSD throwing errors and I couldn't find anybody with deeper levels of SATA experience who was interested in working on the problem. Intel's AHCI drivers for Windows do the right thing, but I couldn't find anybody at Intel who could get any information from their Windows driver team.

April 13, 2016 10:16 PM

Paul E. Mc Kenney: PCI Microconference Accepted into 2016 Linux Plumbers Conference

Given that PCI was introduced more than two decades ago and that PCI Express was introduced more than ten years ago, one might think that the Linux plumbing already did everything possible to support PCI.

One would be quite wrong.

One issue with current PCI support is that resource allocation is handled on a per-architecture basis, leading to duplicate code, and, worse yet, duplicate bugs. This microconference will therefore look into possible consolidation of this code.

Another issue is integration of PCI's message-signaled interrupts (MSI) into the kernel's core interrupt code. Whilst the device tree bindings and relative core code for MSI management have just been integrated, legacy code needs more work, including architecture-specific code and PCI legacy host controller drivers. In addition, attention should be paid to the ACPI bindings, to the related ACPI kernel core code, and to the MSI passthrough usage in virtualized environments.

Advances in Virtualization technologies, System I/O devices and their memory management through IOMMU components are driving features in the PCI Express specifications (Address Translation Services (ATS) and Page Request Interface (PRI)). This microconference will also foster debate on the best way to integrate these features in the kernel in a seamless way for different architectures.

Of course, no discussion of PCI would be complete without considering firmware issues, hardware quirks, power management, and the interplay between device tree and ACPI.

Join us for an important new-to-Plumbers PCI discussion!

April 13, 2016 08:25 PM