| Opened since 2014-04-11 | 3 | 14 | 8 | (25) |
| Closed since 2014-04-11 | 6 | 9 | 6 | (21) |
| Changed since 2014-04-11 | 6 | 29 | 14 | (49) |
Added some code to trinity to use random open flags on the fd’s it opens on startup.
Spent most of the day hitting the same VM bugs as yesterday, or others that Sasha had already reported.
Later in the day, I started seeing this bug after applying a not-yet-merged patch to fix a leak that Coverity had picked up on recently. Spent some time looking into that, without making much progress.
Rounded out the day by trying out latest builds on my freshly reinstalled laptop, and walked into this.
Spent all of yesterday attempting recovery (and failing) on the /home partition of my laptop.
On the weekend, I decided I’d unsuspend it to send an email, and just got a locked up desktop. The disk IO light was stuck on, but it was completely dead to input; I couldn’t even switch to a console. Powered it off and back on. XFS wanted me to run xfs_repair. So I did. It complained that there was pending metadata in the log, and that I should mount the partition to replay it first. I tried. It failed miserably, so I re-ran xfs_repair with -L to zero the log. Pages and pages of scrolly text zoomed up the screen.
Then I rebooted and.. couldn’t log in any more. Investigating as root showed that /home/davej was now /home/lost+found, and within it were a couple dozen numbered directories containing mostly uninteresting files.
So that’s the story about how I came to lose pretty much everything I’ve written in the last month that I hadn’t already pushed to github. I’m still not entirely sure what happened, but I point the finger of blame more at dm-crypt than at xfs at this point, because the non-encrypted partitions were fine.
Ultimately I gave up, reformatted and reinstalled. Kind of a waste of a day (and a half).
Things haven’t been entirely uneventful though:
So there’s still some fun VM/FS horrors lurking. Sasha has been hitting a bunch more huge page bugs too. It never ends.
MITRE gave a presentation on UEFI Secure Boot at SyScan earlier this month. You should read the presentation and paper, because it's really very good.
It describes a couple of attacks. The first is that some platforms store their Secure Boot policy in a run time UEFI variable. UEFI variables are split into two broad categories - boot time and run time. Boot time variables can only be accessed while in boot services - the moment the bootloader or kernel calls ExitBootServices(), they're inaccessible. Some vendors chose to leave the variable containing firmware settings available during run time, presumably because it makes it easier to implement tools for modifying firmware settings at the OS level. Unfortunately, some vendors left bits of Secure Boot policy in this space. The naive approach would be to simply disable Secure Boot entirely, but that means that the OS would be able to detect that the system wasn't in a secure state. A more subtle approach is to modify the policy, such that the firmware chooses not to verify the signatures on files stored on fixed media. Drop in a new bootloader and victory is ensured.
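As context for how an OS sees these variables: on Linux, runtime-accessible UEFI variables are exposed through efivarfs, where each file is a 4-byte little-endian attribute mask followed by the raw variable data. A quick sketch of parsing one (the sample blob here is invented, not read from real firmware):

```python
import struct

# Attribute bits from the UEFI spec's GetVariable/SetVariable interface.
EFI_VARIABLE_NON_VOLATILE = 0x01
EFI_VARIABLE_BOOTSERVICE_ACCESS = 0x02
EFI_VARIABLE_RUNTIME_ACCESS = 0x04

def parse_efivar(blob):
    # efivarfs layout: 4-byte little-endian attribute mask, then data.
    attrs, = struct.unpack_from("<I", blob, 0)
    return {
        # A boot-time-only variable lacks RUNTIME_ACCESS, so the OS
        # can't read or write it after ExitBootServices().
        "runtime_visible": bool(attrs & EFI_VARIABLE_RUNTIME_ACCESS),
        "data": blob[4:],
    }

# Invented sample mimicking a SecureBoot variable: NV+BS+RT attributes
# and a single data byte of 1 ("enabled").
sample = struct.pack("<I", 0x07) + b"\x01"
info = parse_efivar(sample)
```

If a vendor leaves policy variables with RUNTIME_ACCESS set, this is exactly the window the first attack exploits.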
But that's not a beautiful approach. It depends on the firmware vendor having made that mistake. What if you could just rewrite arbitrary variables, even if they're only supposed to be accessible in boot services? Variables are all stored in flash, connected to the chipset's SPI controller. Allowing arbitrary access to that from the OS would make it straightforward to modify the variables, even if they're boot time-only. So, thankfully, the SPI controller has some control mechanisms. The first is that any attempt to enable the write-access bit will cause a System Management Interrupt, at which point the CPU should trap into System Management Mode and (if the write attempt isn't authorised) flip it back. The second is to disable access from the OS entirely - all writes have to take place in System Management Mode.
The MITRE results show that around 0.03% of modern machines enable the second option. That's unfortunate, but the first option should still be sufficient. Except the first option relies on the SMI actually firing. And, conveniently, Intel's chipsets have a bit that allows you to disable all SMI sources, and another bit to disable further writes to the first bit. Except 40% of the machines MITRE tested didn't bother setting that lock bit. So you can just disable SMI generation, remove the write-protect bit on the SPI controller and then write to arbitrary variables, including the SecureBoot enable one.
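That chain of checks can be modelled in a few lines. This is purely illustrative; the bit names and positions are assumptions loosely based on Intel chipset datasheets (BIOS_CNTL and SMI_EN style registers), not a real detection tool:

```python
# Illustrative register bits (assumed positions, check the datasheet).
BIOSWE     = 1 << 0  # BIOS_CNTL: flash write enable
BLE        = 1 << 1  # BIOS_CNTL: SMI fires on attempts to set BIOSWE
SMM_BWP    = 1 << 5  # BIOS_CNTL: writes only allowed from SMM
GBL_SMI_EN = 1 << 0  # SMI_EN: global SMI enable
SMI_LOCK   = 1 << 4  # locks GBL_SMI_EN against further writes

def flash_writable_from_os(bios_cntl, smi_en):
    if bios_cntl & SMM_BWP:
        return False  # option two: all writes must come from SMM
    if bios_cntl & BLE and smi_en & GBL_SMI_EN:
        return False  # option one: SMI traps and flips BIOSWE back
    return True       # attacker can set BIOSWE and write the flash

def smi_can_be_disabled(smi_en):
    # The attack described above: if SMI_LOCK was never set, the
    # attacker clears GBL_SMI_EN first, defeating option one.
    return not (smi_en & SMI_LOCK)
```

On the 40% of tested machines without the lock bit, `smi_can_be_disabled` is true, after which `flash_writable_from_os` degrades to true as well.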
This is, uh, obviously a problem. The good news is that this has been communicated to firmware and system vendors and it should be fixed in the future. The bad news is that a significant proportion of existing systems can probably have their Secure Boot implementation circumvented. This is pretty unsurprising - I suggested that the first few generations would be broken back in 2012. Security tends to be an iterative process, and changing a branch of the industry that's historically not had to care into one that forms the root of platform trust is a difficult process. As the MITRE paper says, UEFI Secure Boot will be a genuine improvement in security. It's just going to take us a little while to get to the point where the more obvious flaws have been worked out.
 Unless the malware was intelligent enough to hook GetVariable, detect a request for SecureBoot and then give a fake answer, but who would do that?
 Impressively, basically everyone enables that.
 Great for dealing with bugs caused by YOUR ENTIRE COMPUTER BEING INTERRUPTED BY ARBITRARY VENDOR CODE, except unfortunately it also probably disables chunks of thermal management and stops various other things from working as well.
| Opened since 2014-04-04 | 3 | 19 | 7 | (29) |
| Closed since 2014-04-04 | 7 | 13 | 6 | (26) |
| Changed since 2014-04-04 | 9 | 32 | 10 | (51) |
I've released man-pages-3.64. The release tarball is available on kernel.org. The browsable online pages can be found on man7.org. The Git repository for man-pages is available on kernel.org.
Aside from a number of minor improvements and fixes to various pages, the notable changes in man-pages-3.64 are the following:
| Opened since 2014-03-28 | 4 | 17 | 9 | (30) |
| Closed since 2014-03-28 | 12 | 17 | 4 | (33) |
| Changed since 2014-03-28 | 15 | 29 | 10 | (54) |
Weekly Fedora kernel bug statistics – April 04 2014 is a post from: codemonkey.org.uk
| Opened since 2014-03-01 | 32 | 92 | 19 | (143) |
| Closed since 2014-03-01 | 156 | 183 | 99 | (438) |
| Changed since 2014-03-01 | 83 | 111 | 31 | (225) |
A post I wrote back in 2012 got linked from a couple of the discussions relating to Brendan Eich being appointed Mozilla CEO. The tldr version is "If members of your community don't trust their leader socially, the leader's technical competence is irrelevant". That seems to have played out here.
In terms of background: in 2008, Brendan donated money to the campaign for Proposition 8, a Californian constitutional amendment that expressly defined marriage as being between one man and one woman. Both before and after that he had donated money to a variety of politicians who shared many political positions, including the definition of marriage as being between one man and one woman.
Mozilla is an interesting organisation. It consists of the for-profit Mozilla Corporation, which is wholly owned by the non-profit Mozilla Foundation. The Corporation's bylaws require it to work to further the Foundation's goals, and any profit is reinvested in Mozilla. Mozilla developers are employed by the Corporation rather than the Foundation, and as such the CEO is responsible for ensuring that those developers are able to achieve those goals.
The Mozilla Manifesto discusses individual liberty in the context of use of the internet, not in a wider social context. Brendan's appointment was very much in line with the explicit aims of both the Foundation and the Corporation - whatever his views on marriage equality, nobody has seriously argued about his commitment to improving internet freedom. So, from that perspective, he should have been a fine choice.
But that ignores the effect on the wider community. People don't attach themselves to communities merely because of explicitly stated goals - they do so because they feel that the community is aligned with their overall aims. The Mozilla community is one of the most diverse in free software, at least in part because Mozilla's stated goals and behaviour are fairly inspirational. People who identify themselves with other movements backing individual liberties are likely to identify with Mozilla. So, unsurprisingly, there's a large number of socially progressive individuals (LGBT or otherwise) in the Mozilla community, both inside and outside the Corporation.
A CEO who's donated money to strip rights from a set of humans will not be trusted by many who believe that all humans should have those rights. It's not just limited to individuals directly affected by his actions - if someone's shown that they're willing to strip rights from another minority for political or religious reasons, what's to stop them attempting to do the same to you? Even if you personally feel safe, do you trust someone who's willing to do that to your friends? In a community that's made up of many who are either LGBT or identify themselves as allies, that loss of trust is inevitably going to cause community discomfort.
The first role of a leader should be to manage that. Instead, in the first few days of Brendan's leadership, we heard nothing of substance - at best, an apology for pain being caused rather than an apology for the act that caused the pain. And then there was an interview which demonstrated remarkable tone deafness. He made no attempt to alleviate the concerns of the community. There were repeated non-sequiturs about Indonesia. It sounded like he had no idea at all why the community that he was now leading was unhappy.
And, today, he resigned. It's easy to get into hypotheticals - could he have compromised his principles for the sake of Mozilla? Would an initial discussion of the distinction between the goals of members of the Mozilla community and the goals of Mozilla itself have made this more palatable? If the board had known this would happen, would they have made the same choice - and if they didn't know, why not?
But that's not the real point. The point is that the community didn't trust Brendan, and Brendan chose to leave rather than do further harm to the community. Trustworthy leadership is important. Communities should reflect on whether their leadership reflects not only their beliefs, but the beliefs of those that they would like to join the community. Fail to do so and you'll drive them away instead.
 For people who've been living under a rock
 Proposition 8 itself was a response to an ongoing court case that, at the point of Proposition 8 being proposed, appeared likely to support the overturning of Proposition 22, an earlier Californian ballot measure that legally (rather than constitutionally) defined marriage as being between one man and one woman. Proposition 22 was overturned, and for a few months before Proposition 8 passed, gay marriage was legal in California.
 Brendan made a donation on October 25th, 2008. This postdates the overturning of Proposition 22, and as such gay marriage was legal in California at the time of this donation. Donating to Proposition 8 at that point was not about supporting the status quo, it was about changing the constitution to forbid something that courts had found was protected by the state constitution.
DisplayPort 1.2 Multi-stream Transport is a feature that allows daisy chaining of DP devices that support MST into all kinds of wonderful networks of devices. It’s been on the TODO list for many developers for a while, but the hw never quite materialised near a developer.
At the start of the week my local Red Hat IT guy asked me if I knew anything about DP MST. It turns out the Lenovo T440s and T540s docks have started to use DP MST, so there is one DP port to the dock, and the dock then has DP->VGA, DP->DVI/DP and DP->HDMI/DP ports on it, all using MST. So when they bought some of these laptops and plugged two monitors into the dock, it fell back to using SST mode and only showed one image. This is not optimal, I'd call it a bug :)
Now I have a damaged-in-transit T440s (the display panel is in pieces) with a dock, and have spent a couple of days with the DP 1.2 spec in one hand (monitor), and a lot of my hair in the other. DP MST has a network topology discovery process that is built on sideband msgs sent over the auxch, which in normal DP is used to read/write a bunch of registers on the plugged-in device. You can then send auxch msgs over the sideband msgs over the auxch to read/write registers on other devices in the hierarchy!
Today I achieved my first goal of correctly encoding the topology discovery message and getting a response from the dock:
[ 2909.990743] link address reply: 4
[ 2909.990745] port 0: input 1, pdt: 1, pn: 0
[ 2909.990746] port 1: input 0, pdt: 4, pn: 1
[ 2909.990747] port 2: input 0, pdt: 0, pn: 2
[ 2909.990748] port 3: input 0, pdt: 4, pn: 3
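Decoded, that reply says port 0 is the upstream input (the link back to the laptop) and ports 1 and 3 have DP-to-legacy converters behind them, while port 2 has nothing attached. Here's a quick sketch of parsing that log format; the pdt (peer device type) meanings are taken from my reading of the DP 1.2 peer-device-type table, so treat them as assumptions:

```python
import re

# pdt values per the DP 1.2 sideband messaging spec (as I read it):
# 0 = no device attached, 1 = source or SST device, 2 = MST branch,
# 3 = SST sink, 4 = DP-to-legacy (DVI/VGA/HDMI) converter.
PDT_NAMES = {0: "none", 1: "source/SST", 2: "MST branch",
             3: "SST sink", 4: "legacy converter"}

LINE = re.compile(r"port (\d+): input (\d), pdt: (\d+), pn: (\d+)")

def parse_ports(log):
    ports = []
    for m in LINE.finditer(log):
        port, inp, pdt, pn = map(int, m.groups())
        ports.append({"port": port, "input": bool(inp),
                      "pdt": PDT_NAMES.get(pdt, "?"), "pn": pn})
    return ports

# The four port lines from the dmesg output above.
log = """port 0: input 1, pdt: 1, pn: 0
port 1: input 0, pdt: 4, pn: 1
port 2: input 0, pdt: 0, pn: 2
port 3: input 0, pdt: 4, pn: 3"""
ports = parse_ports(log)
```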
There are a lot more steps to take before I can produce anything, and along with dealing with the fact that KMS doesn't handle dynamic connectors so well, this should make for a fun tangent away from the job I should be doing, which is finishing virgil.
I've ordered another DP MST hub that I can plug into AMD and nvidia gpus that should prove useful later, also for doing deeper topologies, and producing loops.
Some 4k monitors also use DP MST, as they are really two panels, but I don't have one of those, so unless one appears I'm mostly going to concentrate on the Lenovo docks for now.
So the release of the 3.14 linux kernel has already happened, and I'm a bit late for our regular look at what cool stuff will land in the 3.15 merge window for the Intel graphics driver.
The first big thing is Ben Widawsky's support for per-process address spaces. We've long supported the per-process gtt page tables the hardware provides, but only with one address space. With Ben's work we can now manage multiple address spaces and switch between them, at least on Ivybridge and Haswell. Support for Baytrail and Broadwell for this feature is still in progress. This finally allows multiple different users to use the gpu concurrently without accidentally leaking information between applications. Unfortunately there have been a few issues with the code still, so we had to disable this by default for 3.15.
Another really big feature was the much more fine-grained display power domain handling and a lot of runtime power management infrastructure work from Imre and Paulo. Intel gfx hardware always had lots of automatic clock and power gating, but recent hardware started to have some explicit power domains which need to be managed by the driver. To do that we need to keep track of the power state of each domain, and if we need to switch something on we also need to restore the hardware state. On top of that every platform has its own special set of power domains, and how the logical pieces are split up between them also changes. To make this all manageable we now have an extensive set of display power domains and functions to handle them as an abstraction between the core driver code and the platform specific power management backend. Also a lot of work has happened to allow us to reuse parts of our driver resume/suspend code for runtime power management. Unfortunately the patches merged into 3.15 are all just groundwork, new platforms and features will only be enabled in 3.16.
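The bookkeeping that makes this manageable can be sketched in a few lines. This is a toy model only; the names are illustrative, not the real i915 API:

```python
# Toy model of refcounted display power domains: the platform backend
# is only called on the 0->1 and 1->0 refcount transitions, which is
# where the hardware would actually be powered and its state restored.
class Backend:
    def __init__(self):
        self.log = []
    def power_on(self, name):
        self.log.append(("on", name))   # would also restore hw state
    def power_off(self, name):
        self.log.append(("off", name))

class PowerDomain:
    def __init__(self, name, backend):
        self.name, self.backend, self.refcount = name, backend, 0
    def get(self):
        self.refcount += 1
        if self.refcount == 1:
            self.backend.power_on(self.name)
    def put(self):
        assert self.refcount > 0
        self.refcount -= 1
        if self.refcount == 0:
            self.backend.power_off(self.name)

backend = Backend()
pipe_a = PowerDomain("PIPE_A", backend)
pipe_a.get(); pipe_a.get()   # two users, hardware powered on once
pipe_a.put(); pipe_a.put()   # last user gone, hardware powered off
```

The per-platform differences then live entirely in the backend, while core driver code only ever does get/put on logical domains.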
Another long-standing nuisance with our driver was the take-over from the firmware configuration. With the reworked modesetting infrastructure we could take over the output routing, and with the experimental i915.fastboot=1 option we could eschew the initial modeset. This release Jesse provided another piece of the puzzle by allowing the driver to inherit the firmware framebuffer. There's still more work to do to make fastboot solid and enable it by default, but we now have all the pieces in place for a smooth and fast boot-up.
There have been tons of patches for Broadwell all over the place. And we still have some features which aren't yet fully enabled, so there will be lots more to come. Other smaller features in 3.15 are improved support for framebuffer compression from Ville, with again more work still left, and 5.4GHz DisplayPort support, which is required to get 4k working. Unfortunately most 4k DP monitors seem to expose two separate screens and so need a driver with working MST (multi-stream transport), which we don't yet support. On that topic: the i915 driver now uses the generic DP aux helpers from Thierry Reding. Having shared code to handle all the communication with DP sinks should help a lot in getting MST off the ground. And finally I'd like to highlight the large cursor support, which should be especially useful for high-dpi screens.
And of course there's been fixes and small improvements all over the place, as usual.
The big thing that stands out this cycle is that the defect ratio was going down until we hit around 3.14-rc7, and then we got a few hundred new issues. What happened?
Nothing in the kernel thankfully. This was due to an upgrade server side to a new version of Coverity which has some new checkers. Some of the existing ones got improved too, so a bunch of false positives we had sitting around in the database are no longer reported. The number of new issues unfortunately was greater than the known false positives. In the days following, I did a first sweep through these and closed out the easy ones, bringing the defect density back down.
note: I stopped logging the ‘dismissed’ totals. With Coverity 7.0, the number can go backwards.
If a file gets deleted, the issues against that file that were dismissed also disappear.
Given this happens fairly frequently, the number isn’t really indicative of anything useful.
With the 3.15 merge window now open, I’m hoping a bunch of the queued fixes I sent over the last few weeks get merged, but I’m fully expecting to need to do some resending.
 It was actually worse than this, the ratio went back up to 0.57 right before rc7
It’s been a busy week.
A week ago I flew out to Napa, CA for two days of discussions with various kernel people (ok, and some postgresql people too) about all things VM and FS/IO related. I learned a lot. These short focussed conferences have way more value to me these days than the conferences of years ago with a bunch of tracks, and day after day of presentations.
I gave two sessions relating to testing; there are some good write-ups on lwn. It was more of an extended Q&A than a presentation, so I got a lot of useful feedback (especially afterwards in the hallway sessions). A couple of people asked if trinity was doing certain things yet, which led to some code walkthroughs, and a lot of brainstorming about potential solutions.
By the end of the week I was overflowing with ideas for new things it could be doing, and have started on some of the code for this already. One feature I’d had in mind for a while (children doing root operations) but hadn’t gotten around to writing could be done in a much simpler way, which opens the doors to a bunch more interesting things. I might end up rewriting the current ioctl fuzzing (which isn’t finding a huge amount of bugs right now anyway) once this stuff has landed, because I think it could be doing much more ‘targeted’ things.
It was good to meet up with a bunch of people that I’ve interacted with for a while online and discuss some things. Was surprised to learn Sasha Levin is actually local to me, yet we both had to fly 3000 miles to meet.
Two sessions at LSF/MM were especially interesting outside of my usual work.
The postgresql session where they laid out their pain points with the kernel IO was enlightening, as they started off with a quick overview of postgresql’s process model, and how things interact. The session felt like it went off in a bunch of random directions at once, but the end goal (getting a test case kernel devs can run without needing a full postgresql setup) seemed to be reached the following day.
The second session I found interesting was the “Facebook linux problems” session. As mentioned in the lwn write-up, one of the issues was this race in the pipe code. “This is *very* hard to trigger in practice, since the race window is very small”. Facebook were hitting it 500 times a day. Gave me thoughts on a whole bunch of “testing at scale” problems. A lot of the testing I do right now is tiny in comparison. I do stress tests & fuzz runs on a handful of machines, and most of it is all done by hand. Doing this kind of thing on a bigger scale makes it a little impractical to do in a non-automated way. But given I’ve been buried alive in bugs with just this small number, it has left me wondering “would I find a load more bugs with more machines, or would it just mean the mean time between reproducing issues gets shorter”. (Given the reproducibility problems I’ve had with fuzz testing sometimes, the latter wouldn’t necessarily be a bad thing). More good thoughts on this topic can be found in a post google made a few years ago.
Coincidentally, I’m almost through reading How Google Tests Software, which is a decent book, but without a huge amount of “this is useful, I can apply this” type knowledge. It’s very focussed on the testing of various web-apps, with no real mention of testing of Android, Chrome etc. (The biggest insights in the book aren’t actually testing related, but more the descriptions of Google's internal re-hiring processes when people move between teams).
Collaboration Summit followed from Wednesday onwards. One highlight for me was learning that the tracing code has something coming in 3.15/3.16 that I’ve been hoping for for a while. At last year's kernel summit, Andi Kleen suggested it might be interesting if trinity had some interaction with ftrace to get traces of “what the hell just happened”. The tracing changes landing over the next few months will allow that to be a bit more useful. Right now, we can only do that on a global system-wide basis, but with that moving to be per-process, things can get a lot more useful.
Another interesting talk was the llvmlinux session. I haven’t checked in on this project in a while, so was surprised to learn how far along they are. Apparently all the necessary llvm changes to build the kernel are either merged, or very close to merging. The kernel changes still have a ways to go, but this too has improved a lot since I last looked. Some good discussion afterwards about the crossover between things like clang’s static analysis warnings and the stuff I’m doing with Coverity.
Speaking of, I left early on Friday to head back to San Francisco to meet up with Coverity. Lots of good discussion about potential workflow improvements, false positive/heuristic improvements etc. A good first meeting if only to put faces to names I’ve been dealing with for the last year. I bugged them about a feature request I’ve had for a while (that a few people the days preceding had also nagged me about); the ability to have per-subsystem notification emails instead of the one global email. If they can hook this up, it’ll save me a lot of time having to manually craft mails to maintainers when new issues are detected.
busy busy week, with so many new ideas I felt like my head was full by the time I got on the plane to get back.
Taking it easy for a day or two, before trying to make progress on some of the things I made notes on last week.
For years, I did my best to ignore the problem, but CKS inspired me to blog the curious networking banality, in case anyone has wisdom to share.
The deal is simple: I have a laptop with a VPN client (I use vpnc). The client creates a tun0 interface and some RFC 1918 routes. My home RFC 1918 routes are more specific, so routing works great. The name service does not.
Obviously, if we trust the DHCP-supplied nameserver, it has no work-internal names in it. The stock solution is to let vpnc install an /etc/resolv.conf pointing to work-internal nameservers. Unfortunately this does not work for me, because I have a home DNS zone, zaitcev.lan. Work-internal DNS does not know about that one.
Thus I would like some kind of solution that routes DNS requests according to a configuration. Requests to work-internal namespaces (such as *.redhat.com) would go to nameservers delivered by vpnc (I think I can make it write something like /etc/vpnc/resolv.conf that does not conflict). Other requests go to the infrastructure name service, be it a hotel network or home network. The home network is capable of serving its own private authoritative zones and forwarding the rest. That's the ideal, so how to accomplish it?
I tried a local dnsmasq, but could not figure out whether it can do what I want and, if so, how.
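If dnsmasq can do it, the relevant knob appears to be its server=/domain/address syntax, which forwards queries for a domain suffix to a specific upstream. A minimal sketch of the idea (the nameserver addresses here are invented placeholders):

```
# /etc/dnsmasq.conf - route DNS queries by domain suffix.
# Work-internal names go to the VPN-supplied nameserver
# (e.g. copied out of the file vpnc writes):
server=/redhat.com/10.0.0.53
# The home zone goes to the home router's resolver:
server=/zaitcev.lan/192.168.1.1
# Everything else follows whatever resolver DHCP handed out,
# read from a file that vpnc is told not to overwrite:
resolv-file=/etc/resolv.dnsmasq
```

/etc/resolv.conf would then point at 127.0.0.1 so all lookups pass through dnsmasq.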
For now, I have some scripting that caches work-internal hostnames in /etc/hosts. That works, somewhat. Still, I cannot imagine that nobody has thought of this problem. Surely thousands are on VPNs, and some of them have home networks. And... nobody? (I know that a few people just run the VPN on the home infrastructure; that does not help my laptop, unfortunately).
I published a new article on TSX anti patterns: Common mistakes that people make when implementing TSX lock libraries or writing papers about them. Make sure you don’t fall into the same traps. Enjoy!
Gnome since 3.8 has restricted the Blank Screen time to between 1 and 15 minutes, or “Never”, to disable screen blanking/locking entirely. If this isn’t granular enough, you can set other values like so:
dconf write /org/gnome/desktop/session/idle-delay 1800
gsettings set org.gnome.desktop.session idle-delay 1800
The value is in seconds, so here we set the delay to 30 minutes (60*30=1800). It seems that once you do this, the UI will show “Never”, but the set value is still used correctly.
There is also a “Presentation Mode” shell extension that adds a button to inhibit screen lock, but I still wanted it to lock automatically, just a little more slowly.
EDIT: dconf didn’t actually work! Apparently gsettings is the way to go.
I was awarded the Free Software Foundation Award for the Advancement of Free Software this weekend. I'd been given some forewarning, and I spent a bunch of that time thinking about how free software had influenced my life. It turns out that it's a lot.
I spent most of the 90s growing up in an environment that was rather more interested in cattle than in computers, and had very little internet access during that time. My entire knowledge of the wider free software community came from a couple of CDs that contained a copy of the jargon file, the source code to the entire GNU project and an early copy of the m68k Linux kernel.
But that was enough. Before I'd even got to university, I knew what free software was. I'd had the opportunity to teach myself how an operating system actually worked. I'd seen the benefits of being able to modify software and share those modifications with others. I met other people with the same interests. I ended up with a job writing free software and collaborating with others on integrating it with upstream code. And, from there, I became more and more involved with a wider range of free software communities, finding an increasing number of opportunities to help make changes that benefited both me and others.
Without free software I'd have started years later. I'd have lost the opportunity to collaborate with people spread over the entire world. My first job would have looked very different, as would my entire career since then. Without free software, almost everything I've achieved in my adult life would have been impossible.
To me, free software means I've lived a significantly better life than would otherwise have been the case. But more than that, it means doing what I can to make sure that other people have the same opportunities. I am here because of the work of others. The most rewarding part of my continued involvement is the knowledge that I am part of a countless number of people working to make sure that others can tell the same story in future.
 I'd link to the actual press release, but it contains possibly the worst photograph of me in the entire history of the universe
As my previous post documented, I’ve experimented with localbitcoins.com. Following the arrest of two Miami men for trading on localbitcoins, I decided to seek legal advice on the situation in Australia.
Online research led me to Nick Karagiannis of Kelly and Co, who was already familiar with Bitcoin: I guess it’s a rare opportunity for excitement in financial regulatory circles! This set me back several thousand dollars (in fiat, unfortunately), but the result was reassuring.
They’ve released an excellent summary of the situation, derived from their research. I hope that helps other bitcoin users in Australia, and I’ll post more in future should the legal situation change.
| Opened since 2014-03-14 | 6 | 19 | 2 | (27) |
| Closed since 2014-03-14 | 6 | 133 | 84 | (223) |
| Changed since 2014-03-14 | 18 | 46 | 14 | (78) |
Enabling OpenGL 3.3 on radeonsi required some patches backported to llvm 3.4. I managed to get some time to do this, and rebuilt mesa against the new llvm, so if you have an AMD GPU that is supported by radeonsi you should now see GL 3.3.
For F20 this isn't an option, as backporting llvm is a bit tricky there, though I'm considering doing a copr that has a private llvm build in it; it might screw up some apps, but for most use cases it should be fine.
I bought 10 BTC to play with back in 2011, and have been slowly spending them to support bitcoin adoption. One thing which I couldn’t get reliable information on was how to buy and sell bitcoin within Australia, so over the last few months I decided to sell a few via different methods and report the results here (this also helps my budget, since I’m headed off on paternity leave imminently!).
All options listed here use two-factor authentication, otherwise I wouldn’t trust them with more than cents. And obviously you shouldn’t leave your bitcoins in an exchange for any longer than necessary, since most exchanges over time have gone bankrupt.
Yes, I transferred some BTC into MtGox and sold them. This gave the best price, but after over two months of waiting the bank transfer to get my money hadn’t been completed. So I gave up, bought back into bitcoins (fewer, since the price had jumped) and thus discovered that MtGox was issuing invalid BTC transactions so I couldn’t even get those out. Then they halted transactions altogether blaming TX malleability. Then they went bankrupt. Then they leaked my personal data just for good measure. The only way their failure could be more complete is if my MtGox Yubikey catches on fire and burns my home to the ground.
Volume: Great (5M AUD/month)
Price Premium: $25 – $50 / BTC
According to bitcoincharts.com, localbitcoins is the largest volume method for AUD exchange. It’s not an exchange, so much as a matching and escrow service, though there are a number of professional traders active on the site. The bulk of AUD trades are online, though I sold face to face (and I’ll be blogging about the range of people I met doing that).
localbitcoins.com is a great place for online BTC buyers, since they have been around for quite a while and have an excellent reputation with no previous security issues, and they hold bitcoins in escrow as soon as you hit “buy”. It’s a bit more work than an exchange, since you have to choose the counter-party yourself.
For online sellers, transfers from stolen bank accounts are a real issue. Electronic Funds Transfer (aka “Pay Anyone”) is reversible, so when the real bank account owner realizes their money is missing, the bank tends to freeze the receiving (ie. BTC seller’s) bank account to make sure they can’t remove the disputed funds. This process can take weeks or months, and banks’ anti-fraud departments generally treat bitcoin sellers who get defrauded with hostility (ANZ is reported to be the exception here). A less common scam is fraudsters impersonating the Australian Tax Office and telling the victim to EFT to the localbitcoins seller.
Mitigations for sellers include any combination of:
Many buyers on localbitcoins.com are newcomers, so anticipate honest mistakes for the most part. The golden rule always applies: if someone is offering an unrealistic price, it’s because they’re trying to cheat you.
Volume: Good (1M AUD/month)
Price Premium: $5 – $20 / BTC
Charge: 1% (selling), 0% (buying)
You’ll need to get your bank account checked to use this fairly low-volume exchange, but it’s reasonably painless. Their main issues are lack of exposure (I found out about them through bitcoincharts.com) and lack of volume (about a quarter of the localbitcoins.com volume), but they also trade litecoin if you’re into that. You can leave standing orders, or just place one manually to be matched instantly.
They seem like a small operation, based in Sydney, but my interactions with them have been friendly and fast.
Volume: Low (300k AUD/month)
Price Premium: $0 / BTC
I heard about this site from a well-circulated blog post on Commonwealth Bank closing their bank account last year. I didn’t originally consider them since they don’t promote themselves as an exchange, but you can fill in their form to sell them bitcoins at the spot rate. It’s limited to $4000 per day according to their FAQ.
They have an online ID check, using the usual sources, which didn’t quite work for me due to out-of-date electoral information, but they cleared that manually within a day. They deposit 1c into your bank account to verify it, but that hasn’t worked for me, so I’ve no way to withdraw my money, and they haven’t responded to the query I sent 5 days ago, which leaves me feeling nervous. A search of reddit points to common delays, and the founder’s links to the hacked-and-failed Bitcoinica give me a distinct “magical gathering” feel.
Volume: Unknown (self-reports indicate ~250k/month?)
Price Premium: $0 / BTC
Charge: 1.1% (selling) 2% (buying)
If you trade, I’d love to hear corrections, comments etc. or email me on firstname.lastname@example.org.
I've released man-pages-3.63. The release tarball is available on kernel.org. The browsable online pages can be found on man7.org. The Git repository for man-pages is available on kernel.org.
Most notable among the changes in the release are many new and modified pages related to locales. Among the changes in man-pages-3.63 are the following:
I've released man-pages-3.62. The release tarball is available on kernel.org. The browsable online pages can be found on man7.org. The Git repository for man-pages is available on kernel.org.
Among the more notable changes in man-pages-3.62 are the following:
Linux Storage Filesystem and Memory Management Summit 2014 (LSF/MM 2014) is almost here, with just under a week to go. The schedule is almost filled, but there are still some gaps for last-minute topics if something crops up or attendees feel that something is missing. There will be no printed schedule, to accommodate this, but there should be a QR code link to the live schedule on conference badges. As always, attendees are strongly encouraged to actively participate and post on the lists in advance if they want changes. The range of topics is very large, although personally I'm very much looking forward to the database/kernel topics and the large block interfaces, because of my personal bias towards VM and VM/FS topics.
For those that will not be attending LSF/MM, there will be the usual writeups and some follow-ups at Collaboration Summit. There will be a panel consisting of at least one person from each track on the Thursday morning to cover any topics missed by the writeups, or for attendees with additional comments to make. The database topic at LSF/MM has a limited number of attendees from the database community, and there was more interest than anticipated, so we'll also have a database/kernel interlock topic at Collaboration Summit on Thursday at 3pm. There was an announcement with a summary of the discussion to date, and I'm hoping as many database and kernel community developers as possible will attend.
It's looking like this will be a great LSF/MM, which continues to grow strongly from its humble origins when Nick Piggin organised the first FS/MM meeting (that I'm aware of, at least). It's appreciated that companies continue to fund developers to attend LSF/MM, which I personally consider to be the most important event I attend each year. As always, while companies and foundations fund people to attend, it would not be possible to run the event without the support of the sponsors as well. We have some great sponsors this year to help cover our costs, and a few more have come on board since I wrote the first update. Facebook, NetApp and Western Digital are showing great support by sponsoring us at the platinum level. Google, IBM, Samsung and Seagate are supporting us as Gold Sponsors, and we appreciate the additional support of our Silver Sponsors, Datera, Pure Storage and SUSE. Thanks to all who are going to make LSF/MM 2014 an exceptional event.
|Opened since 2014-03-07||8||18||7||(33)|
|Closed since 2014-03-07||133||20||6||(159)|
|Changed since 2014-03-07||79||35||11||(125)|
High (low?) point of the day was taking delivery of my new remote power switch.
You know that weird PCB chemical smell some new electronics have? Once I got this thing out the box it smelled so strong, I almost gagged. I let it ‘air’ for a little while, hoping it would dissipate. It didn’t. Or if it did, I couldn’t tell because now the whole room stunk. Then I made the decision to plug it in anyway. Within about a minute the smell went away. Well, not so much “went away”. More like, “was replaced with the smell of burning electronics”.
So that’s another fun “hardware destroys itself as soon as I get a hold of it” story, and yet another Amazon return.
(And I guess I’m done with ‘digital loggers’ products).
In non hardware-almost-burning-down-my-house news:
I’ve been trying to chase down the VM crashes I’ve been seeing. I’ve managed to find ways to reproduce some of them a little faster, but not really getting any resolution so far. Hacked up a script to run a subset of random VM related syscalls. (Trinity already has ‘-g vm’, but I wanted something even more fine-grained). Within minutes I was picking up VM traces that so far I’ve only seen Sasha reporting on -next.
I wrote about Thunderbolt on Apple hardware a while ago. Since then Andreas Noever has somehow managed to write a working Thunderbolt stack, which is awesome! But there was still the problem I mentioned of the device not appearing unless you passed acpi_osi="Darwin" on the kernel command line, and a further problem that if you suspended and then resumed it vanished again.
The ACPI _OSI interface is a mechanism for the firmware to determine the OS that the system is running. It turns out that this works fine for operating systems that export fairly static interfaces (Windows, which adds a new _OSI per release) and poorly for operating systems that don't even guarantee any kind of interface stability in security releases (Linux, which claimed to be "Linux" regardless of version until we turned that off). OS X claims to be Darwin and nothing else. As I mentioned before, claiming to be Darwin in addition to Windows was enough to get the Thunderbolt hardware to stay alive after boot, but it wasn't enough to get it powered up again after suspend.
It turns out that there's two sections of ACPI code where this Mac checks _OSI. The first is something like:
if (_OSI("Darwin")) Store 0x2710 OSYS; else if (_OSI("Windows 2009")) Store 0x7D9 OSYS; else…
ie, if the OS claims to be Darwin, all other strings are ignored. This is called from \_SB._INI(), which is the first ACPI method the kernel executes. The check for whether to power down the Thunderbolt controller occurs after this and then works correctly.
The second version is less helpful. It's more like:
if (_OSI("Darwin")) Store 0x2710 OSYS; if (_OSI("Windows 2009")) Store 0x7D9 OSYS; if…
ie, if the OS claims to be both Darwin and Windows 2009 (which Linux will if you pass acpi_osi="Darwin"), the OSYS variable gets set to the Windows 2009 value. This version gets called during PCI initialisation, and once it's run all the other Thunderbolt ACPI calls stop doing anything and the controller gets powered down after suspend/resume. That can be fixed easily enough by special casing Darwin. If the platform requests Darwin before anything else, we'll just stop claiming to be Windows.
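The difference between the two ASL fragments can be modelled in a few lines (a Python sketch of the logic quoted above, purely illustrative; the OSYS values 0x2710 and 0x7D9 are taken from the quoted code):

```python
# Model of the two _OSI-checking code paths. An OS "claims" a set of
# _OSI strings; OSYS ends up 0x2710 (Darwin) or 0x7D9 (Windows 2009).

def osys_first_version(claims):
    # else-if chain: if Darwin is claimed, later strings are never checked
    if "Darwin" in claims:
        return 0x2710
    elif "Windows 2009" in claims:
        return 0x7D9
    return 0

def osys_second_version(claims):
    # independent ifs: the last matching string wins
    osys = 0
    if "Darwin" in claims:
        osys = 0x2710
    if "Windows 2009" in claims:
        osys = 0x7D9
    return osys

# Linux with acpi_osi="Darwin" used to claim both strings:
linux_claims = {"Darwin", "Windows 2009"}
assert osys_first_version(linux_claims) == 0x2710   # treated as OS X
assert osys_second_version(linux_claims) == 0x7D9   # treated as Windows
```

This is why claiming both strings was enough for the first check but not the second, and why dropping the Windows claims when Darwin is requested fixes it.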
Phew. Working Thunderbolt! (Well, almost - _OSC fails and so we disable PCIe hotplug, but that's easy to work around). But boo, no working battery. Apple do something very strange with their ACPI battery interface. If you're running anything that doesn't claim to be Darwin, Apple expose an ACPI Control Method battery. Control Method interfaces abstract the complexity away into either ACPI bytecode or system management traps - the OS simply calls an ACPI method, magic happens and it gets an answer back. If you claim to be Darwin, Apple remove that interface and instead expose the raw ACPI Smart Battery System interface. This provides an i2c bus over which the OS must then speak the Smart Battery System protocol, allowing it to directly communicate with the battery.
Linux has support for this, but it seems that this wasn't working so well and hadn't been for years. Loading the driver resulted in modprobe hanging until a timeout occurred, and most accesses to the battery would (a) take forever and (b) probably fail. It also had the nasty habit of breaking suspend and resume, which was unfortunate since getting Thunderbolt working over suspend and resume was the whole point of this exercise.
So. I modified the sbs driver to dump every command it sent over the i2c bus and every response it got. Pretty quickly I found that the failing operation was a write - specifically, a write used to select which battery should be connected to the bus. Interestingly, Apple implemented their Control Method interface by just using ACPI bytecode to speak the SBS protocol. Looking at the code in question showed that they never issued any writes, and the battery worked fine anyway. So why were we writing? SBS provides a command to tell you the current state of the battery subsystem, including which battery (out of a maximum of 4) is currently selected. Unsurprisingly, calling this showed that the battery we wanted to talk to was already selected. We then asked the SBS manager to select it anyway, and the manager promptly fell off the bus and stopped talking to us. In keeping with the maxim of "If hardware complains when we do something, and if we don't really need to do that, don't do that", this makes it work.
Working Thunderbolt and working battery. We're even getting close to getting switchable GPU support working reasonably, which is probably just going to involve rewriting the entirety of fbcon or something similarly amusing.
A rant in ;login was making rounds recently (h/t @jgarzik), which I thought was not all that relevant... until I remembered that Swift Power Calculator has mysteriously stopped working for me. Its creator is powerless to do anything about it, and so am I.
So, it's relevant all right. We're in big trouble even if Gmail kind of works most of the time. But the rant makes no recommendations, only observations. So it's quite unsatisfying.
BTW, it reminds me about a famous preso by Jeff Mogul, "What's wrong with HTTP and why it does not matter". Except Mogul's rant was more to the point. They don't make engineers like they used to, apparently. Also notably, I think, Mogul prompted development of RESTful improvements. But there's nothing we can do about excessive thickness of our stacks (that I can see). It's just spiralling out of control.
Been hitting a number of VM related bugs the last few days.
The first bug is the one that concerns me most right now, though the 2nd is feasibly something that some non-fuzzer workloads may hit too. Other than these bugs, 3.14rc6 is working pretty well for me.
The first edition of “Is Parallel Programming Hard, And, If So, What Can You Do About It?” is now available in electronic form. The dead-tree version will be available a bit later, or, as they say, Real Soon Now.
Big-animal changes over the early 2013 version include:
I suppose everyone has to pass through a hardware phase, and mine is now, for which I implemented a LED blinker with an ATtiny2313. I don't think it even merits the usual blog laydown. Basically all it took was following tutorials to the letter.
For the initial project, I figured that learning gEDA would take too much, so I unleashed an inner hipster and used Fritzing. Hey, it allows you to plan breadboards, so there. And it was a learning experience and no mistake. Crashes, impossible-to-undo changes, UI elements outside of the screen, everything. Black magic everywhere: I could never figure out how to merge wires, dedicate a ground wire/plane, or edit labels (so all of them are incorrect in the schematic above). The biggest problem was the lack of library support together with an awful parts editor. Editing schematics in Inkscape was so painful that I resigned myself to doing a piss-poor job, evident in all the crooked lines around the ATtiny2313. I understand that Fritzing's main focus is the iPad, but this is just at the level of a typical outsourced Windows application.
Inkscape deserves a special mention due to the way Fritzing requires SVG files to be in a particular format. If you load and edit some of those, the grouping defeats Inkscape's features, so one cannot even select elements at times. And editing the raw XML causes the weirdest effects, so it's not like LyX-on-TeX, edit and visualize. At least our flagship vector graphics package didn't crash.
The avr-gcc is awesome though. 100% turnkey: yum install and you're done. Same for avrdude. No muss, no fuss, everything works.
I've spent a while trying to make GPU switching work more reliably on Apple hardware, and got to the point where my Retina MBP now comes up with X running on Intel regardless of what the firmware wanted to do (test patches here). But in the process I'd introduced an additional GPU switch which added another layer of flicker to the boot process. I spent some time staring at driver code and poking registers trying to figure out how I could let i915 probe EDID from the panel without switching the display over, and made no progress at all.
And then I realised that nouveau already has all the information that i915 wants, and maybe we could just have the Switcheroo code hand that over instead of forcing i915 to probe again. Sigh.
|Opened since 2014-02-28||12||29||9||(50)|
|Closed since 2014-02-28||8||41||8||(57)|
|Changed since 2014-02-28||18||54||16||(88)|
Weekly Fedora kernel bug statistics – March 07 2014 is a post from: codemonkey.org.uk
Looking at a review by Solly today, I saw something deeply disturbing. A simplified version that I tested follows:
import unittest

class Context(object):
    def __init__(self):
        self.func = None

    def kill(self):
        self.func(31)

class TextGuruMeditationMock(object):
    # The .run() normally is implemented in the report.Text.
    def run(self):
        return "Guru Meditation Example"

    @classmethod
    def setup_autorun(cls, ctx, dump_with=None):
        ctx.func = lambda *args: cls.handle_signal(dump_with, *args)

    @classmethod
    def handle_signal(cls, dump_func, *args):
        try:
            res = cls().run()
        except Exception:
            dump_func("Unable to run")
        else:
            dump_func(res)

class TestSomething(unittest.TestCase):
    def test_dump_with(self):
        ctx = Context()

        class Writr(object):
            def __init__(self):
                self.res = ''

            def go(self, out):
                self.res += out

        target = Writr()
        TextGuruMeditationMock.setup_autorun(ctx, dump_with=target.go)
        ctx.kill()
        self.assertIn('Guru Meditation', target.res)
Okay, obviously we're setting a signal handler, which is a little lambda, which invokes the dump_with, which ... is a class method? How does it receive its self?!
I guess that the deep Python magic occurs in how the method target.go is prepared to become an argument. The only explanation I see is that Python creates some kind of activation record for this, which includes the instance (target) and the method, and that record is the object being passed down as dump_with. I knew that Python did it for scoped functions, where we have global dict, local dict, and all that good stuff. But this is different, isn't it? How does it even know that target.go belongs to target? In what part of the Python spec is it described?
UPDATE: Commenters provided hints with the key idea being a "bound method" (a kind of user-defined method).
A user-defined method object combines a class, a class instance (or None) and any callable object (normally a user-defined function).
When a user-defined method object is created by retrieving a user-defined function object from a class, its im_self attribute is None and the method object is said to be unbound. When one is created by retrieving a user-defined function object from a class via one of its instances, its im_self attribute is the instance, and the method object is said to be bound.
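The quoted rules can be seen directly in a few lines (a minimal sketch; the `Writr`-style class is re-created here just for illustration):

```python
class Writr(object):
    def __init__(self):
        self.res = ''

    def go(self, out):
        self.res += out

target = Writr()
bm = target.go            # retrieved via an instance: a *bound* method

# The bound method object carries the instance with it (im_self in py2,
# __self__ in py3), which is how it can be passed around like a plain
# function and still know its self:
assert bm.__self__ is target
bm('hello')               # no explicit self argument needed
assert target.res == 'hello'

# Retrieved via the class instead: no instance is attached (py2 called
# this an unbound method; py3 just gives back the plain function).
unbound = Writr.go
assert getattr(unbound, '__self__', None) is None
```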
Thanks, Josh et al.!
UPDATE: See also Chris' explanation and Peter Donis' comment re. unbound methods gone from py3.
Spent some time chasing down what looks like a race condition in the watchdog code in trinity.
The symptom was a crash on x86-64, where it would try and decode a 32-bit syscall using the 64-bit syscall table. This segfaulted, because the 64-bit table is shorter. I stared at the code for quite a while, and adding debugging printfs at the crash site made the bug disappear. What I think was happening was that the child processes are updating two separate variables (one, a bool that says if we’re doing 32 or 64 bit calls, and two the syscall number), and the watchdog code was reading them in the middle of them being updated. I added some locking code to make sure we don’t read either value before an update is complete.
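The race and the fix can be modelled in a few lines (a Python sketch; trinity itself is C, and all the names below are invented for illustration):

```python
import threading

class ChildState(object):
    """Two logically-paired fields, as in the watchdog bug: a 32/64-bit
    flag and a syscall number that is only valid against the matching
    syscall table."""

    def __init__(self):
        self.lock = threading.Lock()
        self.do32bit = False
        self.syscall_nr = 0

    def set_syscall(self, do32bit, nr):
        # Update both fields under one lock, so a reader can never see
        # the new flag paired with the old number or vice versa.
        with self.lock:
            self.do32bit = do32bit
            self.syscall_nr = nr

    def snapshot(self):
        # The watchdog takes a consistent snapshot under the same lock.
        with self.lock:
            return self.do32bit, self.syscall_nr
```

Without the lock, the watchdog could observe the 64-bit flag together with a 32-bit-only syscall number and index past the end of the shorter 64-bit table, matching the segfault described above.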
I’ve not managed to reproduce the bug since, so I’m really hoping I got it right.
Christoffer Dall led a session today at Linaro Connect discussing standards for portable ARM virtual machines (video). About a week ago, Christoffer posted a draft specification to the linux-arm-kernel, kvm and xen mailing lists which attracted lots of useful feedback. Today we went over the major points of issue, and Christoffer is going to take the feedback to prepare a new draft.
Many of the issues raised boil down to how much reach the spec should have. If it specifies too much, then it will be burdensome for vendors to be compliant, but if it specifies too little then it won’t be useful for making portable disk images. Today we talked about how specific it must be on the topics of required hardware, required virtual interfaces (virtio, xenbus), firmware interface (UEFI) and hardware description (ACPI, FDT).
We also talked about the use-cases covered by this spec. For instance, while there is interest in supporting some hypothetical future version of ARM Windows as either a host or a guest, it is pointless to try and guess what requirements Microsoft will have. For now the focus is on Linux hosts running either Xen, KVM or QEMU, with guests running predominantly Linux (while still supporting any guest OS that conforms). OS vendors should be able to use the spec to design installation and update tools that will work with any compliant virtual machine.
The ARM Server Base System Architecture (SBSA) specification defines the basic requirements for ARM server hardware. Christoffer used the SBSA as a starting point, but quickly realized that the peripheral options described in the SBSA makes little sense in a virtual environment. For instance, a virtual machine can certainly emulate a SATA controller, but it can provide far better performance with an interface designed for virtualization. It was asked if the spec should specify a choice of either virtio or xenbus, but the problem with doing so is it effectively requires OSes to implement support for both in order to be compliant. This isn’t a problem for Linux guests because the kernel already has drivers for both, but it could be a problem for non-Linux guests.
Instead the choice was made to treat virtual buses in exactly the same way we treat real hardware; it is still up to the OS to include driver support for the platform it is running on. OS vendors are strongly encouraged to support both, but the spec does not require them to do so. If only one is supported then the onus is on them to list it in their own requirements.
Particular attention was given to the SBSA serial port requirement. Level 1 of the SBSA requires the platform implement a debug port which is register compatible with ARM’s pl011 UART. Ian Campbell and Stefano Stabellini from Citrix were concerned that implementing full pl011 emulation would perform poorly and would require a lot of work to implement. However, Alexander Graf pointed out that an always-available console device would eliminate a lot of the pain of failed booting without any log output. It was also pointed out that the SBSA does not actually require a full pl011 implementation. DMA and IRQ support are not necessary, which makes emulation trivial, and the virtual UART is only expected to be used during early boot scenarios. Normally console output will be reported first via the UEFI console before ExitBootServices() is called, and then via the VM’s preferred console device. At the close of the discussion we decided to require the SBSA debug port definition in the VM spec.
The requirement of UEFI for the firmware interface was mostly uncontroversial. In the earlier mailing list discussion, Dennis Gilmore did take issue with specifying UEFI over U-Boot, given that UEFI is not in heavy use on 32-bit ARM. U-Boot is also making strides forward in standardizing the boot flow, which would make it more suitable for VM scenarios. Dennis is concerned that UEFI would require a lot of new effort to get working. However, that work has already been completed. There is a 32-bit port of UEFI running under QEMU, mainline GRUB includes ARM UEFI support, and merging kernel support is in progress.
None of the VM developers in the room today seemed concerned about requiring UEFI for virtual firmware, and the UEFI spec covers quite a few standard booting scenarios including removable media, network booting, and booting from a block device. The feeling is that it is important for both 64-bit and 32-bit virtual machines to have the same behaviour, and so the UEFI requirement will remain.
Deciding whether an FDT or an ACPI hardware description is required was more of a concern. Jon Masters from Red Hat has previously stated that Red Hat Enterprise Linux will only support booting with ACPI. There is concern that the specification will not be acceptable to Red Hat if it does not require ACPI. However, ACPI is still a work in progress and we don’t yet know how to implement it in a VM. Since all of the VMs already use FDT, and will continue to do so for the foreseeable future, it was decided to make FDT support mandatory in version 1 of the spec. A future version 2 will allow ACPI to be provided in addition to FDT, with the expectation that an OS vendor can choose to make ACPI support mandatory for their product.
For the next steps, Christoffer is going to take all the comments from the mailing list and today’s meeting and he will post a second draft of the spec. Then after further feedback, the specification will probably get published, possibly as a Linaro whitepaper.
|Opened since 2014-02-01||15||76||32||(123)|
|Closed since 2014-02-01||33||66||29||(128)|
|Changed since 2014-02-01||33||256||36||(325)|
As part of the Lightsaber project, I’ve been looking for a low pin count way to add control since the ATTiny85 that I’m using only has 6 IO pins. For the prototype I connected a button and a potentiometer to a pin each. I’d like to have an accelerometer and another button or two, but that uses up pins pretty quickly. However, if I hang all the controls off an i2c bus, then I only need to use two IO pins.
The Wii Nunchuk just happens to be an i2c bus. It also happens to have 4 inputs built into it: 2 buttons, a 2-axis joystick and a 3 axis accelerometer. That’s pretty close to everything I want. It also aggregates reading all of those sensor inputs into a single i2c transaction which means less work for the ATTiny85 software.
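That single i2c transaction returns a six-byte report, and unpacking it is just bit twiddling (a Python sketch of the commonly documented unencrypted Nunchuk report layout; treat the exact bit positions as an assumption to verify against your own device):

```python
def decode_nunchuk(buf):
    """Decode a raw 6-byte (unencrypted) Nunchuk report.

    Commonly documented layout:
      bytes 0-1: joystick X/Y (8 bits each)
      bytes 2-4: accelerometer X/Y/Z high 8 bits
      byte 5:    Z and C buttons (active low, bits 0 and 1) plus the
                 low 2 bits of each accelerometer axis
    """
    return {
        'joy_x': buf[0],
        'joy_y': buf[1],
        'accel_x': (buf[2] << 2) | ((buf[5] >> 2) & 3),
        'accel_y': (buf[3] << 2) | ((buf[5] >> 4) & 3),
        'accel_z': (buf[4] << 2) | ((buf[5] >> 6) & 3),
        'button_z': not (buf[5] & 0x01),   # active low
        'button_c': not (buf[5] & 0x02),   # active low
    }
```

So the ATTiny85 only needs one i2c read per poll, and everything else is cheap shifting and masking.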
Official Wii Nunchuks aren’t the cheapest things in the world. Even 8 years after the Wii was first released, a genuine Nintendo Nunchuk is £15. That blows my budget for this project. I can however order replica Nunchuks via Aliexpress for a mere £2.95 each including shipping. I ordered a lot of 5 to experiment a couple of weeks ago and they arrived today.
For such a low price I was not expecting much, and indeed, my expectations were met. They work, I’ll say that much for them, but I wouldn’t want to use them for actually playing a game. Button presses don’t always make contact and feel a bit sloppy. I’m not too worried about that though, because I’m going to gut them for the electronics and throw away the plastic.
More troublesome though is that the clones don’t behave in exactly the same way as an official Nintendo Nunchuk. In fact, in the lot of 5 I purchased I seem to have two different variants, each of which behaves differently. Two of them I was able to get working completely by following the instructions in this forum post. The other three are recognized by the new code, but the event reports are still encrypted. I need to do some debugging to figure out what else is needed.
Cracking the case open, it’s clear that the plastic moulding is a direct copy of the Nintendo part. The boards in both versions have exactly the same outline as the Nintendo one and all the plastic looks identical. One of them even uses the Nintendo tri-wing screws. The boards themselves look designed to be as cheap as possible. The main controller is an anonymous die-on-board held with a blob of epoxy. Soldering quality is marginal at best.
Not that any of this bothers me. If I can get it mounted inside the Lightsaber hilt then it will do the job nicely for less than it would cost to buy each button and sensor individually.
|Opened since 2014-02-21||1||23||7||(31)|
|Closed since 2014-02-21||5||25||4||(34)|
|Changed since 2014-02-21||11||232||11||(254)|
I've released man-pages-3.61. The release tarball is available on kernel.org. The browsable online pages can be found on man7.org. The Git repository for man-pages is available on kernel.org.
This was a relatively small release. Among the more notable changes in man-pages-3.61 are the following:
Last year I hacked up a small shell script to test various IO related things like “create a RAID5 array, put an XFS file system on it, create a bunch of files on it”.
Despite its crudeness, it ended up finding a bunch of kernel bugs. Unfortunately many of them were not easily reproducible, and required hours of runtime. There were also some problems with scaling the tests. Every time I wanted to add another test, or another filesystem, the overall runtime grew dramatically. Before my test box with 4 SATA disks died, it would take over 3 hours for a single run.
So I’ve been sketching up ideas for a replacement to address a number of these shortcomings.
Firstly, it’s in C. Shell was fun for coming up with an initial proof of concept, but for some things like better management of threads, it’s just not going to work. Speaking of threads, one of the reasons that the runtime was previously so long was that it never took advantage of idle disks. So if for example, I have 4 disks, and I want to run a 2 disk RAID0 stripe in one test, I should be able to launch additional threads to do something interesting with the other 2 idle disks.
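The idle-disk idea amounts to a small pool scheduler (a Python sketch; the tool itself is C, and these names are invented for illustration):

```python
import threading

class DiskPool(object):
    """Hand out disjoint sets of idle disks to concurrent tests."""

    def __init__(self, disks):
        self.free = set(disks)
        self.cond = threading.Condition()

    def claim(self, n):
        # Block until n disks are idle, then take exclusive ownership.
        with self.cond:
            while len(self.free) < n:
                self.cond.wait()
            return [self.free.pop() for _ in range(n)]

    def release(self, disks):
        # Return disks to the pool and wake any waiting tests.
        with self.cond:
            self.free.update(disks)
            self.cond.notify_all()

# e.g. one test thread claims 2 disks for a RAID0 stripe while another
# test immediately claims the remaining 2, instead of waiting its turn:
pool = DiskPool(['sda', 'sdb', 'sdc', 'sdd'])
raid0_disks = pool.claim(2)
other_disks = pool.claim(2)
pool.release(raid0_disks)
pool.release(other_disks)
```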
The code for this is still very early, and doesn’t do much of anything yet, but it’ll show up on github at some point.
In the meantime, I’ve been trying to put together something to test on. For reasons unexplained, the quad opteron that held all my disks no longer powers up. I spent a couple hours trying to revive it with various spare parts, without luck.
Yesterday the idea occurred to me that I could just use a USB hub and a bunch of old memory sticks for now.
It would have the advantage of being easily portable while travelling. Then I rediscovered just how crap no-name chinese USB hubs are. Devices sometimes showing up, sometimes not. Devices falling off the bus. Sometimes the whole hub disappearing. Sometimes refusing to even power up. I tossed the idea. For now, I’ve got this usb-sata thing connected to an SSD. Portable, fast, and surprisingly, entirely stable.
I’ve got a bunch of other ideas for this tool beyond what the io-tests shell script did, and I suspect after next months VM/FS summit, I’ll have a load more.
Piston, an Openstack-in-a-box vendor, are a sponsor of the Red Hat Summit this year. Last week they briefly ceased to be, for no publicly stated reason, although it's been suggested that this was in response to Piston winning a contract that Red Hat was also bidding on. This situation didn't last for long - Red Hat's CTO tweeted that this was an error and that Red Hat would pay Piston's sponsorship fee for them.
To Red Hat's credit, having the CTO immediately and publicly accept responsibility and offer reparations seems like the best thing they could possibly do in the situation and demonstrates that there are members of senior management who clearly understand the importance of community collaboration to Red Hat's success. But that leaves open the question of how this happened in the first place.
Red Hat is big on collaboration. Workers get copies of the Red Hat Brand Book, an amazingly well-written description of how Red Hat depends on the wider community. New hire induction sessions stress the importance of open source and collaboration. Red Hat staff are at the heart of many vital free software projects. As far as fundamentally Getting It is concerned, Red Hat are a standard to aspire to.
Which is why something like this is somewhat unexpected. Someone in Red Hat made a deliberate choice to exclude Piston from the Summit. If the suggestion that this was because of commercial concerns is true, it's antithetical to the Red Hat Way. Piston are a contributor to upstream Openstack, just as Red Hat are. If Piston can do a better job of selling that code than Red Hat can, the lesson that Red Hat should take away is that they need to do a better job - not punish someone else for doing so.
However, it's not entirely without precedent. The most obvious example is the change to kernel packaging that happened during the RHEL 6 development cycle. Previous releases had included each individual modification that Red Hat made to the kernel as a separate patch. From RHEL 6 onward, all these patches are merged into one giant patch. This was intended to make it harder for vendors like Oracle to compete with RHEL by taking patches from upcoming RHEL point releases, backporting them to older ones and then selling that to Red Hat customers. It obviously also had the effect of hurting other distributions such as Debian who were shipping 2.6.32-based kernels - bugs that were fixed in RHEL had to be separately fixed in Debian, despite Red Hat continuing to benefit from the work Debian put into the stable 2.6.32 point releases.
It's almost three years since that argument erupted, and by and large the community seems to have accepted that the harm Oracle were doing to Red Hat (while giving almost nothing back in return) justified the change. The parallel argument in the Piston case might be that there's no reason for Red Hat to give advertising space to a company that's doing a better job of selling Red Hat's code than Red Hat are. But the two cases aren't really equal - Oracle are a massively larger vendor who take significantly more from the Linux community than they contribute back. Piston aren't.
Which brings us back to how this could have happened in the first place. The Red Hat company culture is supposed to prevent people from thinking that this kind of thing is acceptable, but in this case someone obviously did. Years of Red Hat already having strong standing in a range of open source communities may have engendered some degree of complacency and allowed some within the company to lose track of how important Red Hat's community interactions are in perpetuating that standing. This specific case may have been resolved without any further fallout, but it should really trigger an examination of whether the reality of the company culture still matches the theory. The alternative is that this kind of event becomes the norm rather than the exception, and it takes far less time to lose community goodwill than it takes to build it in the first place.
 And, in the spirit of full disclosure, a competitor to my current employer
 Furthering the spirit of full disclosure, a former employer
|Opened since 2014-02-14||0||3||23||3||(29)|
|Closed since 2014-02-14||0||10||14||7||(31)|
|Changed since 2014-02-14||0||11||40||7||(58)|
I've scheduled two further public iterations of my Linux/UNIX System Programming course in Munich, Germany, in April and June, and I hope to announce a San Francisco date soon as well. (I'm also available to give on-demand tailored versions of the course onsite at customer premises, in Europe, the US, and further afield.)
The 5-day course is intended for programmers developing system-level, embedded, or network applications for Linux and UNIX systems, or programmers porting such applications from other operating systems (e.g., Windows) to Linux or UNIX. The course is based on my book, The Linux Programming Interface (TLPI), and covers topics such as low-level file I/O; signals and timers; creating processes and executing programs; POSIX threads programming; interprocess communication (pipes, FIFOs, message queues, semaphores, shared memory); network programming (sockets); and server design.
The course has a lecture+lab format, and devotes substantial time to working on some carefully chosen programming exercises that put the "theory" into practice. Students receive a copy of TLPI, along with a 600-page course book containing the more than 1000 slides that are used in the course. A reading knowledge of C is assumed; no previous system programming experience is needed.
Some useful links for anyone interested in the course:
Some work today on trinity to rid it of some hard-coded limits on the number of child processes. Now, if you have some ridiculously overpowered machine with hundreds of processors, it should run at least one child process per thread instead of maxing out at 64 like before. (Overriding the maximum number of running children with the -C parameter still works as before, except there is now no upper bound other than the memory needed to allocate all the arrays.)
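Trinity itself is C, but the new sizing logic boils down to something like this Python sketch (the function name and the minimum-of-one fallback are mine, for illustration only):

```python
import os

OLD_CAP = 64  # the old hard-coded ceiling on running children

def default_child_count(ncpus, cap=None):
    """Pick a default number of trinity child processes.

    The old behaviour clamped the count at OLD_CAP; the new behaviour
    runs at least one child per CPU thread, with no upper bound beyond
    what memory allows for the child arrays.
    """
    ncpus = max(ncpus, 1)
    if cap is not None:
        return min(ncpus, cap)
    return ncpus

# Old vs. new defaults on a hypothetical 256-thread machine:
print(default_child_count(256, cap=OLD_CAP))  # 64 under the old scheme
print(default_child_count(256))               # 256 under the new scheme
print(default_child_count(os.cpu_count()))    # default on this machine
```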
Aside from that, some digging through coverity, and some abortive attempts at cleaning up some more “big function” drivers. Some of them are such a mess that they need bigger changes than simply hoisting code out into functions. There’s a point though where I start to feel uncomfortable changing them without hardware to test on.
I've released man-pages-3.59. The release tarball is available on kernel.org. The browsable online pages can be found on man7.org. The Git repository for man-pages is available on kernel.org.
The sole change in this release is the conversion of various man pages that contained non-ASCII characters, as well as the changelog, to UTF-8 encoding, a task completed thanks largely to some scripts provided by Peter Schiffer.
Update, 2014-02-18: turns out that a couple of section 7 pages had encoding errors added in man-pages-3.59. So, I've decided to make a quick small 3.60 release that fixes those issues, and includes a few other unrelated minor fixes.
|Opened since 2014-02-07||0||5||16||16||(37)|
|Closed since 2014-02-07||1||4||12||14||(31)|
|Changed since 2014-02-07||0||12||32||20||(64)|
For reasons not entirely clear, I felt the urge to hack some on x86info again. Ended up committing dozens of changes cleaning up old crap. A lot of that code was written over 10 years ago, and I like to believe that my skills have improved somewhat in that time. Lots of it really wants gutting and rewriting from scratch, but the effort really isn’t worth it, given that better tools (lscpu, for example) now exist for many of its features. It’s likely that 1.31 will be the final release. After that, I’m thinking of stripping out some parts of it to standalone tools, and throwing away the rest.
Chasing the xfs overflow continued.
Looked over some coverity bugs. One of the features it offers is a view that shows functions with a high cyclomatic complexity. What this typically translates to is “really long functions that are in dire need of splitting up”.
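For a flavour of the metric, a crude cyclomatic complexity can be approximated by counting branch points over a syntax tree. Coverity of course analyses C and counts control-flow decision points properly; this Python sketch over Python source is purely illustrative:

```python
import ast

# Branching AST node types counted by the crude metric below.
_BRANCHES = (ast.If, ast.For, ast.While, ast.BoolOp,
             ast.ExceptHandler, ast.IfExp)

def cyclomatic_complexity(source):
    """Crude cyclomatic complexity: 1 + number of branch points.

    Real tools count decision edges in the control-flow graph; this
    approximation is still enough to flag "really long functions that
    are in dire need of splitting up".
    """
    tree = ast.parse(source)
    return 1 + sum(isinstance(node, _BRANCHES)
                   for node in ast.walk(tree))

flat = "x = a + b"
branchy = """
if a:
    for i in range(10):
        while b:
            pass
"""
print(cyclomatic_complexity(flat))     # 1
print(cyclomatic_complexity(branchy))  # 4
```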
After those four, the next dozen or so items on the list were still quite long functions, but I somehow quickly lost the urge to look at reiserfs and isdn.
A lot of this kind of ‘cleaning’ is pretty mind-numbingly boring work, but sometimes when I see something that horrific, I just can’t walk away.
So as usual FOSDEM was a blast and as usual the hallway track was great too. The beer certainly helped with that ... Big props to the entire conference crew for organizing a stellar event and specifically to Luc and the other graphics devroom people.
I've also uploaded the slides for my Kernel GPU Driver Testing talk. My OCD even made me remove that additional dot and resurrect the missing 'e' in one of the slides. FOSDEM also had live-streaming, and rendered videos should eventually show up; I'll add the link as soon as it's there.
Update: FOSDEM staff published the video of the talk!
The 2014 Linux Security Summit will be held on the 18th and 19th of August, co-located with LinuxCon in Chicago, IL, USA. The Kernel Summit and several other events will also be co-located there this year.
The Call for Participation will be announced later via the LSM mailing list.
A belated statistics dump of how Coverity looked during 3.13 development.
Things kinda slowed down over the xmas break, but the overall trend from 3.12 -> 3.13 is ~200 open issues lower, even after including new issues being introduced.
Which is what 3.11 -> 3.12 showed too. Just 25 more releases, and we’re done :-)
The new issues are getting jumped on pretty quickly too, which is good to see.
Hit an interesting bug yesterday without really trying. What looks like a case of interrupts being disabled when they shouldn’t be is believed to actually be a stack overrun. But rather than crashing, we’re corrupting the state of the irq flags. I’m a little sceptical still that this is the actual cause, but right now it’s the best answer being offered. As such, people are starting to look at the amount of stack used during some of the IO paths.
As can be seen in the linked stack trace above, the callchain can get pretty deep, and seems to be getting worse over time.
Fixed up a handful of small trinity bugs that caused children to segfault. There’s still a few remaining. I’ve held off on fixing them for now, because the current state of trinity segv’ing with certain parameters is a useful reproducer for the bug mentioned above.
Continued chipping away at the coverity backlog.
It’s great to look out and see all the ingenious things people have built with the nifty WS2812b LED pixels from WorldSemi, but I’m quite surprised that I’ve yet to see any project use them in the way they were intended. I speak, of course, of Lightsabers.
Most custom Lightsaber designs are built around either a superbright LED module in the hilt shining up into the blade, or a string of LEDs up through the blade. In both designs it is difficult to emulate the effect of the blade growing up from the hilt when turned on, and then drawn back in when it is turned off. However, WS2812b LEDs are individually addressable, so a small microcontroller can easily animate the blade growing and shrinking effect. Blade shimmer and flash effects are equally easy to implement and it can be any colour you like. Therefore it is obvious that this is what the WS2812b was designed for.
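The grow/shrink animation really does reduce to generating successive frames with one more (or one fewer) pixel lit. Here is a minimal Python simulation of that logic (on the real saber this runs on the microcontroller, pushing each frame to the strip; the function name and frame representation are mine):

```python
def blade_frames(num_pixels, colour, extend=True):
    """Yield successive LED-strip frames animating a lightsaber blade
    growing out from the hilt (or retracting, if extend=False).

    Each frame is a list of per-pixel colours; unlit pixels are None.
    With individually addressable pixels (e.g. WS2812b) each frame is
    simply written out to the strip in turn, with a short delay between
    frames to set the extension speed.
    """
    steps = range(num_pixels + 1)
    for lit in (steps if extend else reversed(steps)):
        yield [colour if i < lit else None
               for i in range(num_pixels)]

frames = list(blade_frames(4, "green"))
print(frames[0])   # [None, None, None, None] -- blade off
print(frames[-1])  # ['green', 'green', 'green', 'green'] -- fully extended
```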
After finishing a few simple projects, my children and I have been looking for something new to build. When I suggested building Lightsabers they jumped all over the idea. Our friend Mandy, who is also a huge Star Wars nut, was going to be visiting us, so we made sure to start the project while she was in town.
Building Lightsabers at home is by no means a new idea. Plans, kits and guides are all over the Internet. I spent a lot of time reading about what other people have done to prepare for this project and have gotten a lot of good suggestions about what to do. There are some incredibly ambitious projects out there even to the point of precision machining the hilt, but we’re not going to be nearly so ambitious.
Most of our materials will come from the plumbing section of our local B&Q, electronics from Proto-PIC and Aliexpress, and polycarbonate tube from theplasticshop.co.uk. Right now my prototype is running at about £46 in parts which includes £12.50 for LEDs and £11.50 for electronics, but I’d like to get the total down to about £25.[1]
For the prototype we’re using lengths of ABS waste water pipe which is easy to work with. Hopefully we can find paint which will adhere to it well when sanded. The blade is a length of 25mm polycarbonate tubing and a half-sphere polycarbonate cap for the tip. The LEDs are inside a second length of smaller 16mm polycarbonate tube which keeps the LEDs straight and holds them in the centre of the blade.
Polycarbonate tube is transparent and the individual LEDs can be seen right through the tube, so the light needs to be diffused in some way. Two suggestions I’ve come across are to either sand the inside of the tube, or to use a roll of diffusing film on the inside of the blade. I’ve used the sanding method, which makes it look better, but the individual pixels can still be seen, at least in person. In photos it looks awesome, but only because it saturates the camera sensor.
You can see on the photo on the right the individual pixels showing up in the reflection of the blade even though the blade itself looks like a single beam of light. I’m still experimenting with how to make it look better and I’d love to hear about how others have solved the problem.
To light the blade, I used two lengths of 60 pixels/m WS2812b LED strip and pasted them back-to-back. Originally I merely used one 2m length and folded it in half, but doing it that way requires twice the memory and processor time to drive the pixels.
The controller is a Digispark which uses the attiny85 microcontroller and is Arduino compatible. The Digispark has an on-board 500mA regulator, but that isn’t nearly enough for 110 WS2812b LEDs which can draw up to 60mA per pixel when fully lit. Instead I added a Pololu 5V/3.5A regulator module to provide power to the strip, which is still less than the theoretical maximum of 6.6A for all the LEDs, but for how they are being used, somewhere around 2.5A will be the typical draw. Finally, I’m merely using 6xAA batteries to provide power, which is fine for now, but I will be replacing it with a rechargeable pack soon.
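As a sanity check on those numbers, here is a rough current estimate, assuming the commonly quoted ~20mA-per-colour-channel worst case for a WS2812b (real draw varies between batches, so treat this as ballpark only):

```python
MA_PER_CHANNEL = 20  # assumed WS2812b worst case: ~20 mA per colour channel

def strip_current_ma(num_pixels, r, g, b):
    """Estimate strip current draw in mA for one uniform 8-bit colour.

    Each pixel draws up to ~20 mA per channel at full brightness,
    i.e. ~60 mA per pixel when lit fully white.
    """
    per_pixel = sum(ch / 255 * MA_PER_CHANNEL for ch in (r, g, b))
    return num_pixels * per_pixel

# 110 pixels fully white: the 6.6 A theoretical maximum from the text.
print(strip_current_ma(110, 255, 255, 255))  # 6600.0 mA
# A single saturated colour at full brightness draws a third of that.
print(strip_current_ma(110, 255, 0, 0))      # 2200.0 mA
```

This is why the ~2.5A typical draw is plausible: a single-colour blade at less than full brightness sits well below the all-white maximum.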
Control is provided via a button and a potentiometer wired to GPIO pins on the attiny. The firmware itself is trivial.[2] I’m using the NeoPixel library from Adafruit to drive the control signal. Pressing the button toggles the saber on and off. Colour is set based on the potentiometer voltage reading. I’ve yet to add any other effects.
Mechanically, the potentiometers I used have turned out to be a bad choice. I’ve damaged two so far while mounting them inside the case. What happened is that I ended up flexing the base so that the wiper no longer makes solid contact, which causes the colour to flicker badly. I’m considering replacing it entirely with the sensor board from a Wii Nunchuck. Cheap replicas can be found on Aliexpress for about £3 each, which I can tear down for parts. That would give me two buttons, a 2-axis joystick and a 3-axis accelerometer, all readable in a single i2c transaction.
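For reference, the Nunchuck's 6-byte i2c report can be unpacked as in the sketch below. This is the commonly documented reverse-engineered layout; cheap clones sometimes deviate (and some need a different init sequence), so treat it as a starting point rather than gospel:

```python
def decode_nunchuk(report):
    """Decode one 6-byte Wii Nunchuck i2c report.

    Commonly documented layout (verify against your clone):
      byte 0: joystick X        byte 1: joystick Y
      bytes 2-4: accelerometer X/Y/Z high 8 bits
      byte 5: bit 0 = Z button, bit 1 = C button (both active low),
              bits 2-3/4-5/6-7 = accel X/Y/Z low 2 bits
    """
    jx, jy, ax, ay, az, tail = report
    return {
        "joy": (jx, jy),
        "accel": (
            (ax << 2) | ((tail >> 2) & 0x3),
            (ay << 2) | ((tail >> 4) & 0x3),
            (az << 2) | ((tail >> 6) & 0x3),
        ),
        "button_z": not (tail & 0x01),  # active low
        "button_c": not (tail & 0x02),  # active low
    }

# Joystick centred, Z pressed (active low), C released:
sample = bytes([0x80, 0x80, 0x40, 0x40, 0x40, 0x02])
print(decode_nunchuk(sample))
```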
The next thing I need to do is get everything to fit inside the hilt. It’s mostly there, but the battery pack is a little large and the controls are interfering with the circuit boards. I may replace the Digispark and power regulator boards with a single custom board to keep the size down. I would also like to add sound effects. A speaker could be attached to one of the attiny85 PWM pins, but generating the sounds may be asking a bit too much of it. It only has 8K of flash and 512 bytes of SRAM, after all. I could switch to a more capable microcontroller, but it is rather fun to see how much I can do with a small device.
[1] I’m blithely ignoring shipping costs here. Once the design is sorted out I’ll purchase enough parts to build about a dozen sabers for my son’s birthday party. Buying that many at once spreads the shipping costs somewhat. ↩
[2] Eventually I’ll post a git repo with the source. ↩
I published an introductory article on practical lock elision with Intel TSX at ACM Queue/CACM: Scaling existing lock-based applications with lock elision. Enjoy!
I've released man-pages-3.58. The release tarball is available on kernel.org. The browsable online pages can be found on man7.org. The Git repository for man-pages is available on kernel.org.
This was a relatively small release. Among the more notable changes in man-pages-3.58 are the following:
First day back after taking a week off. Which of course meant being buried alive in email. I deleted a ton of stuff, so if you were expecting a reply from me and didn’t get one, resend.
Started poking at trinity bugs. There’s a case where the watchdog got hung in circumstances where the main process had quit (ironically, it got stuck in the routine that checked whether it was still alive). Fixed that up, but I'm still uncertain why that path was ever being taken. The new dropprivs code has shaken out a few more corner cases, such as fork() failures that weren’t being checked for. Hopefully that’s a bit more robust after today’s changes.
Hit a weird irda/lockdep bug. Not sure what’s going on there, and everyone is reluctant to dig into irda too much. Can’t say I blame them tbh, it’s pretty grotty and unmaintained.
Spent some time looking over the recent coverity issues, and dismissed a bunch. Requested a bunch more components for some of the more ‘bug heavy’ parts of the kernel.
I realized I never did a statistics dump for 3.13 like I did for 3.12. I’ll sort that out tomorrow.
I spent last week looking into the firmware configuration protocol used on current IBM system X servers. IBM provide a tool called ASU for configuring firmware settings, either in-band (ie, running on the machine you want to reconfigure) or out of band (ie, running on a remote computer and communicating with the baseboard management controller - IMM in IBM-speak). I'm not a fan of using vendor binaries for this kind of thing. They tend to be large (ASU is a 20MB executable) and difficult to script, so I'm gradually reimplementing them in Python. Doing that was fairly straightforward for Dell and Cisco, both of whom document their configuration protocol. IBM, on the other hand, don't. My first plan was to just look at the wire protocol, but it turns out that it's using IPMI 2.0 to communicate and that means that the traffic is encrypted. So, obviously, time to spend a while in gdb.
The most important lesson I learned last time I did this was "Check whether the vendor tool has a debug option". ASU didn't mention one in its help text and there was no sign of a getopt string, so this took a little longer - but nm helpfully showed a function called DebugLog and setting a breakpoint on that in gdb showed it being called a bunch of the time, so woo. Stepping through the function revealed that it was checking the value of a variable, and looking for other references to that variable in objdump revealed a -l argument. Setting that to 255 gave me rather a lot of output. Nearby was a reference to --showsptraffic, and passing that as well gave me output like:
BMC Sent: 2e 90 4d 4f 00 06 63 6f 6e 66 69 67 2e 65 66 69
BMC Recv: 00 4d 4f 00 21 9e 00 00
which was extremely promising. 2e is IPMI-speak for an OEM specific command, which seemed pretty plausible. 4d 4f 00 is 20301 in decimal, which is the IANA enterprise number for IBM's system X division (this is required by the IPMI spec, it wasn't an inspired piece of deduction on my behalf). 63 6f 6e 66 69 67 2e 65 66 69 is ASCII for config.efi. That left 90 and 06 to figure out. 90 appeared in all of the traces, so it appears to be the command to indicate that the remaining data is destined for the IMM. The prior debug output indicated that we were in the QuerySize function, so 06 presumably means… query the size. And this is supported by the response - 00 (a status code), 4d 4f 00 (the IANA enterprise number again, to indicate that the responding device is speaking the same protocol as you) and 21 9e 00 00 - or, in rational endianness, 40481, an entirely plausible size.
Once I'd got that far the rest started falling into place fairly quickly. 01 rather than 06 indicated a file open request, returning a four byte file handle of some description. 02 was read, 03 was write and 05 was close. Hacking this together with pyghmi meant I could open a file and read it. Success!
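Putting the framing above into code, a sketch of building the request data and parsing the QuerySize response (helper names and constant names are mine, not IBM's; in practice these bytes ride inside pyghmi's raw IPMI command interface rather than being sent standalone):

```python
import struct

IBM_IANA = b"\x4d\x4f\x00"   # IANA enterprise number 20301 (IBM system X)
IMM_CMD = 0x90               # "remaining data is destined for the IMM"

# Sub-commands as deduced from the ASU debug traces:
OP_OPEN, OP_READ, OP_WRITE, OP_CLOSE, OP_QUERYSIZE = 0x01, 0x02, 0x03, 0x05, 0x06

def build_request(op, payload=b""):
    """Build the data portion of the OEM (netfn 0x2e) IPMI request."""
    return bytes([IMM_CMD]) + IBM_IANA + bytes([op]) + payload

def parse_size_response(data):
    """Parse a QuerySize response: status, IANA echo, little-endian size."""
    status, iana, size = data[0], data[1:4], data[4:8]
    if status != 0x00 or iana != IBM_IANA:
        raise ValueError("unexpected response")
    return struct.unpack("<I", size)[0]

# The trace from above: querying the size of config.efi.
req = build_request(OP_QUERYSIZE, b"config.efi")
print(req.hex(" "))  # 90 4d 4f 00 06 63 6f 6e 66 69 67 2e 65 66 69
print(parse_size_response(bytes.fromhex("004d4f00219e0000")))  # 40481
```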
Well, kind of. I was getting back a large blob of binary. The debug trace showed calls to an EfiDecompress function, so on a whim I tried just decompressing it using the standard UEFI compression format. Shockingly, it worked and now I had a 345K XML blob and presumably many more problems than I'd previously had. Parsing the XML was fairly straightforward, and now I could retrieve the full set of config options, along with the default, current and possible values for them.
Making changes was pretty much just the reverse of this. A small XML blob containing the new values is compressed and written to asu_update.efi. One of the elements is a random identifier, which will then appear in another file called config_log along with a status. Just keep re-reading this until the value changes to CM_DONE and we're good.
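The write-then-poll step might look something like this sketch (the helper names and polling interface are illustrative, not part of any IBM API; read_config_log stands in for whatever fetches and parses the remote config_log file):

```python
import time

def wait_for_cm_done(read_config_log, ident, timeout=60.0, poll=1.0):
    """Poll config_log until our random identifier's status is CM_DONE.

    read_config_log is a callable returning {identifier: status}.
    Returns True on completion, False if the timeout expires first.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = read_config_log().get(ident)
        if status == "CM_DONE":
            return True
        time.sleep(poll)
    return False

# With a fake log reader that completes on the second poll:
states = iter([{"abc123": "CM_PENDING"}, {"abc123": "CM_DONE"}])
print(wait_for_cm_done(lambda: next(states), "abc123", poll=0.01))  # True
```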
The in-band configuration appears to be identical, but rather than sending commands over the wire they're sent through the system's IPMI controller directly. Yes, this does mean that it's sending compressed XML through a single IO port. Yes, I'd rather be drinking.
I've documented the protocol here and hope to be able to release this code before too long - right now I'm trying to nail down the interface a little, but it's broadly working.
|Opened since 2014-01-31||0||4||17||10||(31)|
|Closed since 2014-01-31||37||12||11||5||(65)|
|Changed since 2014-01-31||0||11||53||17||(81)|