July 03, 2009

Die HAL Die

Time for a Stalinist purge

The purge is complete

As of a week ago or so, HAL is no longer required by either NetworkManager or ModemManager.  This helps streamline the hardware detection process and cleans up that code a lot.  It was a fun ride and a lot of other great stuff came along with the udev port, because rewriting everything to use udev pretty much required cleaning up a bunch of other stuff.  The udev parts were a lot easier than I thought they would be; what was complex was rewriting a ton of ModemManager to be more flexible and work better with multi-port modems on the one hand, and really stupid quirky hardware on the other.

For everyone in the US, have a wonderful 4th of July.  To everyone who’s not, have fun at the Desktop Summit.  I had prior plans, so I couldn’t attend, but I’m sure the Red Hat team will honor my absence by spreading the love and drinking all the liquor.  Rock on, GNOME.

July 02, 2009

wordpress I hate you

After seeing that I possibly might have had some exploits run on my site again, I upgraded to WordPress 2.8.

While reading up on hardening WordPress, I saw that the official site mentions AskApache, a plugin that helps with hardening. I’m not too sure about it yet, because it wants to write .htaccess files in my directories, and for that I have to open up more than I would want. But hey, let’s give it a go.

At some point it creates a username and password that you choose. I go on and configure stuff, not knowing very well which of its many modules I’m supposed to activate, or why.

I forget about it, and ten minutes later I check my mail. I have a mail from AskApache. With my login details. And the password in plaintext.

Is the WordPress security model just fundamentally broken?

5luendo birthday party

Yesterday was cause for celebration. We got together to celebrate five years of the Fluendo Group!

[Photo: 01072009]

The picture quality is bad, and not everyone is in it, but I just took it on a whim after marveling at how many people were there. I didn’t even know all of them – yes, it’s gotten to that point. 67 months ago I arrived in Barcelona without the company even being created…

We celebrated with mountains of cheese and rivers of wine, which in the first year would have lasted us a few weeks and now lasted only an hour.

As magical accidents sometimes happen, today is also the day Fluendo received the certification confirmation from Dolby for our DVD player. It didn’t take long to land in the webshop, so finally our DVD player is up for sale! So you know what to get us for our birthday – a shop checkout with the DVD player in your cart.

Good timing – that means that at this year’s GUADEC/Desktop Summit I know what the answer will be to one of the most asked questions I get.

This is the first GUADEC I’m going to with Kristien in tow, I hope she can manage. I’ll be there from Monday through Friday, because the week is bookended by two weddings. Looking forward to a GStreamer summit on Thursday discussing 1.0…

radeon DDX has initial KMS support

So we've had an -ati DDX with KMS support in a branch for quite a while, but it was starting to grow into a very big mess, and some of the hacks in it were quite unmaintainable.

So I started cleaning it up and pushing the bits to master.

Step 1 was adding macros in all the places in the accel code to abstract away the different command submission methods, and adding some ifs to do KMS specific things. Once this was done, in theory the accel code wouldn't functionally regress and wouldn't require any more changes.

Step 2 was bringing over the kms and DRI2 support files from the branch.

Step 3 was deciding between having if (kms) / if (!kms) blocks all over radeon_driver.c in all the various functions, or having a nearly completely separate KMS/DRI2 driver file. The original code took the if approach and it was an unmaintainable nightmare, so I opted for approach 2 and it definitely is the best. At driver probe time in radeon_probe.c, I now do the KMS check if the pciaccess probe is called (kms without pciaccess is probably not going to matter). If KMS is supported, the driver picks a nearly completely different set of functions for PreInit/ScreenInit etc. I now have a separate radeon_kms.c file which has all the DDX interface in it. Of course, any changes may need to be made in two places, but it's a lot cleaner than it was in the other codebase.

I also ported the KMS code to use the libdrm_radeon buffer management code which is shared with mesa, instead of the DDX having its own buffer manager code base. This code is well tested via mesa, however it hasn't got all the features/optimisations that I've added to the DDX bufmgr over the last while.

So what's left:
Missing optimisation from old buffer manager:
1. Buffer in VRAM? Track whether the buffer has ever explicitly been validated to VRAM; this allows for an optimisation on download from screen.
2. Download from screen: with driver pixmaps we don't know if the buffer is in VRAM or GTT. With (1) we can either blit if it is in VRAM, or just memcpy if it is in GART.
3. Force a buffer to validate in GTT, or force a buffer to stay in fast CPU access space. This was really useful for some sw fallbacks where a buffer would end up in VRAM and then get used by the CPU from there. It probably only really takes the place of fixing EXA properly so the pixmap scoring is separate from the offscreen memory, and having a XA that works with driver pixmaps a lot better.
4. Bugs and crashes: I appear to be hitting a realloc crash at some point where glibc reenters itself and fails.

July 01, 2009

Tuesday

left work around 38 degrees C, got a haircut, went for some great tapas on my own reading Darkly Dreaming Dexter, went to a bar, met up with friends, an impromptu bbq plan was hatched, went to a lovely atico at Portal De L’Angel, barbecued in a soothing summer breeze, rode home on the back of a motorcycle hanging on for dear life. All in all a typical Barcelona summer Tuesday.

All my talk slides are now on Slideshare

I've uploaded the slides from essentially all of the talks I've given to Slideshare. This is likely more useful than my previous strategy of dumping them in a directory and leaving the rest up to search engine bots.

Click here for the full list of slides. They are all published under the Creative Commons attribution share-alike license.

One interesting slide title, which I'd forgotten about, is Kernel Security for 2.8, from the 2004 Kernel Summit. This was from when we were still expecting a 2.7 development kernel leading to a 2.8 stable kernel -- I think Linus announced the change in development model at that summit.

Included in this set of slides are several introductory and deeper technical overviews of SELinux; I hope they are useful for people who are looking for information for themselves, or who are making their own slides. As the license suggests, please feel free to copy and extend them (but note that the older ones are going to be more out of date).

June 30, 2009

GIMP and the value of standard window controls

The individual who came up with the current "toolband" UI for GIMP should be made to use it on a netbook, while an eagle pecks on his liver. For crying out loud I'm having trouble placing windows on a 1440x900 screen.

Repeating the cycle, time to kill rhpl

Continuing in the historical vein, once upon a time there was a package included in Red Hat Linux called pythonlib. One of the things I helped do was to finish killing it off. We went along and then, a few releases later, wanted to share some python code again. Thus was born rhpl – the Red Hat Python Library. It started out simply enough: some wrappers for translation stuff and one or two other little things. And then it began to grow, as these things do over time. Some of the things made sense, some less so. Over time, pieces have moved around into other things (including rhpxl, the Red Hat Python Xconfig library).

Fast-forward to today and it’s a bit of a mess with things contributed by various people and used in one config tool (or two) and barely maintained. Also a lot of the things being wrapped have gotten a lot better in the python standard library. The gettext module is leaps and bounds better than the one from python 1.5 and also the subprocess module is awesome for spawning processes.

Therefore, I think it’s time to continue the cycle and kill off rhpl for Fedora 12. I’m starting to make patches and file them for packages using rhpl to transition them over. Help much appreciated from anyone that wants to join in.

For the rhpl.translate -> gettext case, you generally want to replace the import of _ and N_ from rhpl.translate with something like

import gettext
domain = "yourpackage"  # placeholder: use your package's translation domain
_ = lambda x: gettext.ldgettext(domain, x)
N_ = lambda x: x


June 29, 2009

Increasing testing of unreleased kernels.

This past weekend I’ve been thinking of reviving an idea that has come up countless times: producing RPM builds of the rawhide kernel for our already released Fedoras. The reason for not doing them so far has come down to bandwidth (in terms of build system throughput, disk space, mirroring, and people bandwidth).

What I’m toying with doing is some devel kernels for Fedora 11 that are built outside of the Fedora build system. The Fedora kernel team now has enough build bandwidth for x86-[64] that we can actually get builds for those architectures done faster than koji.

Disk space – I’m thinking of just keeping the last 2-3 builds available.

Mirroring – Instead of having these be part of Fedora proper, I think an external repo on something like fedorapeople.org will suffice.

Which just leaves people bandwidth. For the most part the work is going to be just regularly syncing the devel/ branch with a CVS branch of F-11/. For some of this work, some scripting could be done to alleviate some of the pain. Also the frequency at which we push out these builds will determine the pain point. Perhaps every -git isn’t particularly valuable anyway. One build every handful of -git’s should be sufficient for bisecting.

There does remain one additional barrier. Occasionally we introduce something in rawhide builds which just won’t work on F11. For example, the kernel modesetting patches are tied closely to Xorg packages. Sometimes upstream changes require changes in mkinitrd or udev or some other ‘plumbing’. Some of these are regressions, and hopefully by identifying them sooner we can get them reverted/fixed upstream. Sometimes however, things get deprecated, and we need to change these packages. I’m not sure how to cope with this yet in a devel-for-F11 scenario.

One other thing that might be fun to throw into this would be the generation of -vanilla packages. The only reason we don’t do these as part of the regular kernel builds is the various bandwidth concerns above. The specfile copes with spitting out RPMs with very little work needed. Josh Boyer has been occasionally doing these builds, though there hasn’t been a huge uptake. It’s unclear if this is due to lack of interest, or just a lack of publicity.

Another question to be answered is whether we go the route of enabling debugging in all builds as we do in rawhide, or do separate -debug builds. I’m leaning towards the latter.

I’m not committing anything to this for sure just yet, but it’s something I’ve been giving quite a bit of thought. There are still a bunch of unanswered questions.

Post from: codemonkey.org.uk


Rebuilding VLC in Rawhide

It looks like I have to rebuild VLC for F12 Rawhide, since it's holding up yum update and its dependencies are too expansive to be worked around with --skip-broken. I kept hoping that Nicolas would do his duty, but apparently it was in vain, while everything else is being built, even Pidgin. The downside of this is that I have better things to do than compensating for an AWOL maintainer, but an upside exists too. It's past due that I learned all the fancy tools Fedora grew over the years, such as Mock. Also, if I somehow work this into Koji, the fruits of my labour will be available to everyone.

UPDATE: It turns out I need to be a registered developer at RPMfusion before I can use the tools, so I just built it from an src.rpm, and so I'm in possession of vlc-core-1.0.0-0.11rc3.fc12.x86_64.rpm. It was a rather painless process, and remember that Gentoo users do it every day before breakfast.

Programming contest

Jan and Arek entered this year’s ICFP programming contest. It’s a three day programming contest, so this morning they asked if they could swap their Friday project day for today to finish the contest. They seem to be in the top third at the moment.

Arek’s never been a fan of long meetings, but today’s standup meeting was particularly amusing with Arek urging everyone to keep focused and get out there quickly. They had less than two hours left on the clock.

[Photo: 29062009]

Spot the seven differences

Amusingly, today they came to work with almost the same shirt on, by accident! I can only assume there is a big clothes factory in Poland where they have huge stock of the same fabric…

90 minutes left, knock them dead, guys!

SELinux for Humans

I mean, SLUGs...

Paul Wayper gave a couple of talks on SELinux at this week's SLUG meeting, and includes links to a couple of very useful slide decks:


The sysadmin slides look particularly useful, as they focus on solving common issues such as running FTP/SAMBA/Apache servers, and provide some very useful general tips, such as looking in the audit log and using policy booleans for high-level policy tweaking.

These slides may be the best short introduction for sysadmins on the topic that I've seen. It's a difficult thing to get right.

notes on OOo and ARM

To build OOo under ARM (an apparently massively popular hobby in China at the moment, for some reason or other), try downloading the normal source from www.openoffice.org and…

./configure
make

that should go a long way. Though for < 3.2 you may need (ooo#100469) applied.

Of course, given that ARM isn’t generally fast, then there are likely cross-compilers, qemu-user hackery, or other stuff involved, but nothing specific to OOo, so if you can, say, build gedit with your environment, then OOo should generally build fine too. Try the same hackery you’d try with a simpler project with OOo.

And simply building something is not porting it. You actively have to change something before bandying around the term porting.

Kernel Conf Australia - ! all OSOL honest.

So there is a kernel conference on in Brisbane next month, being run by Sun.

Now when this was announced initially it was proposed as an all kernel hackery type get together for folks in the region, no matter which kernel they cared about.

(I did propose to talk about kernel graphics at this and got refused - so maybe I'm just being bitchy, however this is my blog).

20 speakers break down as follows:
12 Sun
1 Intel - Sun manager
1 RH
1 OpenBSD
1 FreeBSD
4 misc.

2 of the misc talks are in some way OSOL related. The OpenBSD talk is about networking and pf, the RH talk is about security, and the FreeBSD talk is about storage.

Now really, if you aren't into OSOL or ZFS (3 slots are OSOL FS related), why would you go? This conference is local to me and I still couldn't justify paying the signup fee/taking the time to my manager at all. Now if one of the main kernel and X.org hackers who lives in Brisbane can't be bothered to go, I do wonder why anyone who isn't into OSOL kernels might be tempted.

There was talk of a .au unconference at one point, which, once the whole swine flu escapade is over, might actually be a useful meetup for the aussie open source community.

pidgin-2.5.8 several important bug fixes

http://kojipkgs.fedoraproject.org/packages/pidgin/2.5.8/
pidgin-2.5.8 builds for Fedora 9, 10, 11 and 12.  Please let me know if anything became worse.

June 28, 2009

Out of the Office

I have a son! Emil Au Kristensen, born June 25th, 8lbs 12oz, 22 inches.

Day 3 of FUDCon Berlin

It's day 3 here at FUDCon Berlin. Day 2 was a whirlwind of presentations, talks, conversations, greetings, and collaboration.

I attended Steven's talk about using Koji at CERN which I must say was really awesome. CERN is using their own instance of a koji build system to build rpms that are then deployed across the grid they have around the world. He talked about how much better koji is than their old build system, but also had some questions about how to improve his use of it and their strategy around it. They also need to produce .debs and would like to do so via koji so that should be interesting.

I also went to, and participated in, the Git for Hackers talk put on by Yaakov and Jeroen. I even got to get up and show people some of my favorite things to do with git. I learned some new things too, which will make my development efforts just that much easier.

Josh Bressers talked to us a bit about security, and the RHT security team and how they interact with Fedora. He was amazed at how responsive the Fedora project is to security issues, often fixing them before the team he works on is able to check in on Fedora. He is looking for more ways to get involved though. I set him off thinking about what kind of code review we can do in an automated way as code is checked into packages.

Finally I gave two talks (linked to in an earlier blog), one on the future of the Fedora Development Cycle, and one on Automated QA.

We had a rather adventurous dinner in which I got to sit at a table with folks from Romania, Italy, Germany, The Netherlands, and more. It was a fun evening of comparing cultures and food and sports and life styles. The great conversation made up for the fact that food took over an hour to show up (the drinks didn't hurt either).

Today is a hackfest day, and I'm working more on the autotest packages for autoqa. I also got to learn about Fedora FEL which is a fantastic agent of change, not just a simple remix of packages. Chitlesh really has something special going on and I'm glad to see that Fedora was able to provide the platform to launch his efforts. I look forward to seeing more success out of his project. Later he asked me about a problem he was having with his Fedora Hosted site, and I'm looking into updating the git plugin to help out. I also got to meet and talk to Phil Knirsh, mostly about the Fedora on s390 effort. They're ready to start composing images which means I'm going to have to be ready to start writing or reviewing pungi packages as I'm certain there are changes necessary to make it work for s390. I look forward to it, but please, don't send me an s390 (: (I wouldn't mind a beagleboard for playing with Fedora arm...)

I'm going to head out a little early this afternoon and try to see some more sights, like the DDR museum (no not that DDR) and maybe a few other places.

Tomorrow begins my ~36 hour Monday. Boy will that be fun. After I get home and a quick night of doing laundry, I'll be taking a few days off with my wife and son and we'll be staying at my uncle's lake house in Lake Chelan, where I'll be on a boat!

June 27, 2009

mach 0.9.5 ‘MMM…’ released

mach allows you to set up clean roots from scratch for any distribution or distribution variation supported.

This release of mach contains fixes for Python 2.6, and adds Fedora 10 and 11, while fixing the archived Fedora locations.

Get it from the mach project page.

moap vcs bisect

Next step on this weekend’s yakshave: a first implementation of moap vcs bisect!

The interface is lifted from git, obviously, since that’s where most people will know the feature from.

I implemented it first with CVS, so I could fix this pychecker bug which was blocking Fedora from bumping the pychecker version from 0.8.17 (3 years old) to 0.8.18. And sure enough, it picked out the commit I broke.

While implementing and while dealing with CVS’s idea of how it stores CVS revisions and dates and so on, I googled and was amused to find this first hit on google for the words cvs and bisect. Clever Andy! And he cleverly sidestepped the problem I wrestled with by making the user specify two dates at the start instead of trying to figure it out from the checkout. And all in lisp too!

Then, to test that my VCS interface was sane, I implemented it for Subversion as well. That took about 15 minutes, since Subversion is much more sane than CVS. I tried the following command on a flumotion checkout:

moap vcs bisect reset; moap vcs bisect start; svn up; moap vcs bisect good; svn up -r 3000; moap vcs bisect bad; MOAP_DEBUG=4 moap vcs bisect run ./test.sh

With test.sh containing
test -e flumotion/component/consumers/gdp/gdp.py

(In other words, look for the commit that added this file.)
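The search that a bisect run performs can be sketched in a few lines. This is only an illustration of the idea, not moap's implementation: the function name bisect_revisions is made up, revisions are assumed to be plain integers as in Subversion, and the starting "good" revision of 7000 is a stand-in for whatever the up-to-date checkout was at.

```python
def bisect_revisions(good, bad, test):
    """Narrow down to two adjacent revisions, one good and one bad.

    good and bad are integer revisions with test(good) True and
    test(bad) False; they may be in either order. test plays the role
    of the script handed to "bisect run".
    """
    while abs(good - bad) > 1:
        mid = (good + bad) // 2  # always strictly between the two ends
        if test(mid):
            good = mid
        else:
            bad = mid
    return good, bad


# Stand-in for test.sh: the file exists from revision 6909 onwards.
def file_exists(revision):
    return revision >= 6909


# 7000 is a hypothetical "current" revision; 3000 is the known-bad one.
print(bisect_revisions(7000, 3000, file_exists))
```

Each round halves the range, so a few thousand revisions need only about a dozen checkouts.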

Sure enough, it picked out this commit:


[moap-trunk] [thomas@ana flumotion]$ moap vcs bisect diff
Index: /home/thomas/tmp/flumotion/configure.ac
===================================================================
--- /home/thomas/tmp/flumotion/configure.ac (revision 6909)
+++ /home/thomas/tmp/flumotion/configure.ac (revision 6908)
@@ -212,7 +212,6 @@
flumotion/component/combiners/switch/Makefile
flumotion/component/consumers/Makefile
flumotion/component/consumers/disker/Makefile
-flumotion/component/consumers/gdp/Makefile
flumotion/component/consumers/httpstreamer/Makefile
flumotion/component/consumers/preview/Makefile
flumotion/component/consumers/shout2/Makefile
Index: /home/thomas/tmp/flumotion/flumotion/component/consumers/Makefile.am
===================================================================
--- /home/thomas/tmp/flumotion/flumotion/component/consumers/Makefile.am (revision 6909)
+++ /home/thomas/tmp/flumotion/flumotion/component/consumers/Makefile.am (revision 6908)
@@ -11,7 +11,6 @@

SUBDIRS = \
disker \
- gdp \
httpstreamer \
preview \
shout2
Index: /home/thomas/tmp/flumotion/ChangeLog
===================================================================
--- /home/thomas/tmp/flumotion/ChangeLog (revision 6909)
+++ /home/thomas/tmp/flumotion/ChangeLog (revision 6908)
@@ -1,16 +1,5 @@
2008-06-20 Thomas Vander Stichele <thomas at apestaart dot org>

- * configure.ac:
- * flumotion/component/consumers/Makefile.am:
- * flumotion/component/consumers/gdp (added):
- * flumotion/component/consumers/gdp/gdp.py (added):
- * flumotion/component/consumers/gdp/__init__.py (added):
- * flumotion/component/consumers/gdp/Makefile.am (added):
- * flumotion/component/consumers/gdp/gdp.xml (added):
- Add a GDP consumer.
-
-2008-06-20 Thomas Vander Stichele <thomas at apestaart dot org>
-
* flumotion/component/producers/gdp/gdp.py:
Add error for http://bugzilla.gnome.org/show_bug.cgi?id=532364

So, the feature is ready for testing. It could use some more documenting, and some additional goodies like accepting arguments to moap vcs bisect start for example.

Feedback appreciated!

jhbuild for python

Still on the yak shave expedition.

I’ve written some simple scripts and files to set up and build python 2.3, 2.4, and 2.5 in separate prefixes to be able to test my software against these versions.

If you’re interested, in theory it should be really simple:

As the README says, this should go on to build all versions of python, and install some scripts.

After that, you just run py-2.3 to go into a shell with Python 2.3 on your path.

Don’t say I never did anything for you.

Python disassembly

As part of this weekend’s yakshave, I’m trying to implement a handler for STORE_MAP in pychecker. STORE_MAP is a new opcode in Python 2.6, which speeds up dict building.

So, for the first time I went under the hood of Python and figured out just enough to understand this problem. It was a lot less scary than I thought it was going to be!

It seems that using dis.dis(), one can easily disassemble any python function into its opcodes. This shows clearly where the behaviour is different between python 2.5 and python 2.6.

Given the following function:
f = lambda: {'a': 1, 'b': 2}

Python 2.6 gives:
  1           0 BUILD_MAP                2
              3 LOAD_CONST               0 (1)
              6 LOAD_CONST               1 ('a')
              9 STORE_MAP
             10 LOAD_CONST               2 (2)
             13 LOAD_CONST               3 ('b')
             16 STORE_MAP
             17 RETURN_VALUE

I couldn’t find a good description of the output of dis.dis, but in my naivety I am guessing the following:

  • The first 1 maps to the line number in the code object where the function is found.
  • The second column is the offset of the opcode and its arguments
  • The third column is the opcode name
  • The next column is the arguments for the opcode; in the case of LOAD_CONST, it shows the index as well as the const object indexed

I am assuming each opcode takes one address location, and each argument takes two; that matches the offsets shown in front of the opcodes.
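To poke at these columns programmatically rather than reading the printed listing, here is a minimal sketch. It assumes a Python 3 interpreter, where dis.get_instructions is available (it does not exist in 2.5 or 2.6), and opcode_rows is a made-up helper name. The opcodes and offsets a modern interpreter reports will differ from the listings in this post.

```python
import dis

f = lambda: {'a': 1, 'b': 2}


def opcode_rows(func):
    """Return (offset, opname, argrepr) for every instruction in func."""
    return [
        (ins.offset, ins.opname, ins.argrepr)
        for ins in dis.get_instructions(func)
    ]


# One row per instruction: byte offset, opcode name, readable argument.
for offset, opname, argrepr in opcode_rows(f):
    print(offset, opname, argrepr)
```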

The opcodes are all documented.

So, in human terms:

  • we start with BUILD_MAP, saying that we’ll create a new dictionary on the stack, with 2 entries.
  • we load the constant with index 0 onto the stack (which happens to be the integer object ‘1’, the value of the ‘a’ key)
  • we load the constant with index 1 onto the stack (which happens to be the string object ‘a’, the key for the ‘1’ value)
  • STORE_MAP pops the key and the value off the stack, storing them in the dict. Note that the key was indeed loaded on the stack after the value. Now only the dictionary is left on the stack.
  • Repeat LOAD_CONST, LOAD_CONST and STORE_MAP for the next set
  • RETURN_VALUE returns the current value on the stack to the caller

Pretty simple, when you look at it twice.

For the same code, python 2.5 gives:
>>> dis.dis(f)
  1           0 BUILD_MAP                0
              3 DUP_TOP
              4 LOAD_CONST               1 ('a')
              7 LOAD_CONST               2 (1)
             10 ROT_THREE
             11 STORE_SUBSCR
             12 DUP_TOP
             13 LOAD_CONST               3 ('b')
             16 LOAD_CONST               4 (2)
             19 ROT_THREE
             20 STORE_SUBSCR
             21 RETURN_VALUE

This code is slightly longer and more complicated. Basically, what is now LOAD_CONST, LOAD_CONST, STORE_MAP was implemented with DUP_TOP, LOAD_CONST, LOAD_CONST, ROT_THREE, STORE_SUBSCR.

It looks like DUP_TOP was needed because STORE_SUBSCR consumes the dictionary off the stack, and ROT_THREE is needed because the arguments are pushed on the stack in the wrong order.
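To convince myself that both sequences really build the same dictionary, they can be replayed on a toy stack machine. This is a sketch for illustration only, not how the interpreter is written: run is a made-up name, only the opcodes from the two listings are handled, and LOAD_CONST carries the constant itself instead of an index into the constants table.

```python
def run(program):
    """Replay (opname, argument) pairs on a list used as the value stack."""
    stack = []
    for opname, arg in program:
        if opname == "BUILD_MAP":
            stack.append({})
        elif opname == "LOAD_CONST":
            stack.append(arg)
        elif opname == "DUP_TOP":
            stack.append(stack[-1])
        elif opname == "ROT_THREE":
            # Move the top item down to third position.
            top = stack.pop()
            stack.insert(len(stack) - 2, top)
        elif opname == "STORE_MAP":
            # Key is on top, value below it; the dict stays on the stack.
            key = stack.pop()
            value = stack.pop()
            stack[-1][key] = value
        elif opname == "STORE_SUBSCR":
            # Key on top, container below, value third; all three are popped.
            key = stack.pop()
            container = stack.pop()
            value = stack.pop()
            container[key] = value
        elif opname == "RETURN_VALUE":
            return stack.pop()
        else:
            raise ValueError("unhandled opcode: %s" % opname)


PY26 = [("BUILD_MAP", 2),
        ("LOAD_CONST", 1), ("LOAD_CONST", "a"), ("STORE_MAP", None),
        ("LOAD_CONST", 2), ("LOAD_CONST", "b"), ("STORE_MAP", None),
        ("RETURN_VALUE", None)]

PY25 = [("BUILD_MAP", 0),
        ("DUP_TOP", None), ("LOAD_CONST", "a"), ("LOAD_CONST", 1),
        ("ROT_THREE", None), ("STORE_SUBSCR", None),
        ("DUP_TOP", None), ("LOAD_CONST", "b"), ("LOAD_CONST", 2),
        ("ROT_THREE", None), ("STORE_SUBSCR", None),
        ("RETURN_VALUE", None)]

print(run(PY26) == run(PY25))
```

The 2.5 sequence needs the extra DUP_TOP because STORE_SUBSCR pops the container along with the key and value, which is exactly the bookkeeping STORE_MAP avoids.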

Seems like a nice and obvious improvement once you understand it. An exercise for the reader is to profile whether this change actually makes things faster in practice.

So, where does this leave me for pychecker? It now looks deceptively simple. STORE_MAP simply pops two items off the stack. There is nothing to check for, since we’re in a dictionary context. So all my implementation needs to do is to pop 2 items off the stack, and that’s it.

And thus it was committed to pychecker CVS. Popping one item off the yak stack!
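For the curious, the effect of such a handler can be shown with a toy stack-depth tracker. None of this is pychecker's actual code: the tables, the UnknownOpcode exception and final_depth are hypothetical names for the sake of the sketch. It also shows why an opcode the checker does not know about is a problem: if it is skipped, the tracked stack falls out of step with the real one.

```python
class UnknownOpcode(Exception):
    """Raised when the tracker meets an opcode it has no handler for."""


# Net change in stack depth per opcode (hypothetical table).
EFFECTS_WITH_STORE_MAP = {
    "BUILD_MAP": +1,     # pushes the new dict
    "LOAD_CONST": +1,    # pushes one constant
    "STORE_MAP": -2,     # pops key and value, the dict stays
    "RETURN_VALUE": -1,  # pops the value being returned
}

# The same table as a checker written before Python 2.6 might have had it.
EFFECTS_WITHOUT_STORE_MAP = {
    name: effect
    for name, effect in EFFECTS_WITH_STORE_MAP.items()
    if name != "STORE_MAP"
}


def final_depth(opnames, effects, strict=True):
    """Return the tracked stack depth after walking all opcodes."""
    depth = 0
    for name in opnames:
        if name not in effects:
            if strict:
                raise UnknownOpcode(name)
            continue  # skipped: the tracked depth now drifts from reality
        depth += effects[name]
    return depth


OPS = ["BUILD_MAP", "LOAD_CONST", "LOAD_CONST", "STORE_MAP",
       "LOAD_CONST", "LOAD_CONST", "STORE_MAP", "RETURN_VALUE"]

print(final_depth(OPS, EFFECTS_WITH_STORE_MAP))
print(final_depth(OPS, EFFECTS_WITHOUT_STORE_MAP, strict=False))
```

With the STORE_MAP entry the depth ends at 0; without it, and with unknown opcodes skipped, four stale items are left on the tracked stack.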

Presentations from LinuxTag / Fudcon Berlin 2009 available; Tomorrow: Hackfests

My presentations at LinuxTag and Fudcon Berlin 2009 went mostly well. Biggest problem when looking back now: Seems it was more than enough stuff to talk for 90 or 100 minutes, but I only had an hour in total :-/.

Want to look at the slides? Follow these links:

Another problem: There are so many people here at FUDCon and I more and more get the impression that the list with people I "want to meet, shake hands with and talk to for a while, to get a better connection between faces, email addresses, nicks, and (those sometimes overrated) real names" isn't getting shorter as fast as it should to meet everyone in the remaining time. Seems I need to speed up somehow...

Is it the same for you? If yes, and if you ever wanted to talk to the crazy troublemaker that did or does a lot of work for Fedora, EPEL and Livna/RPM Fusion, then just grab me and start to talk to me when you see me walking by or sitting somewhere. The photo on the right shows what I look like, just imagine some mostly inconspicuous glasses on the nose. Ohh, and the hair is a bit shorter right now (I guess nature will do its best to fix that over time).

BTW, again some information for the FUDConners: I plan a small "RPM Fusion" hackfest for tomorrow *if* people are interested in one. There is no real plan what to do exactly besides the general idea: Do some work to improve RPM Fusion, and work out the details on the fly. Want to join? Then just watch the wiki and the usual places at FUDCon for details!

This weekend’s yak shave

The yak shave started yesterday evening. The yak stack is actually a forked one this time, both of the forks involving pychecker.

I might not remember everything in order, but in a nutshell the stack is something like this:

  • The original goal for this weekend at the beginning of the week was to release my cd ripper, morituri
  • Something in its pychecker run did not run with pychecker 0.8.17 (the version still in Fedora, from 2006), and worked with pychecker 0.8.18 (my own build). Fork point 1.
  • While releasing moap this week, I realized that Freshmeat changed their remote API, which I should fix before I do another release of anything. Fork point 2.

Fork point 1 continues here:

  • I mailed Fedora’s pychecker maintainer, offering my help, and sending a patch to update to 0.8.18, which I built and installed locally. He mailed back informing me about this pychecker bug with anaconda which was blocking the upgrade. Looking at that bug, it looked suspiciously like a bug triggered by code I added to pychecker last year.
  • However, I’d like to confirm that in the easiest way. git’s got this great feature, git bisect, and wouldn’t it be nice if I could do that on pychecker now? Hey, why not add bisection to moap?
  • CVS is actually not a very manageable VCS if you want to do fancy stuff. It cost me a few hours to figure out how I should get a more-or-less usable date from a CVS checkout. The final solution is similar to the reply to my stackoverflow question
  • moap vcs bisect run is now implemented and finds the commit that broke anaconda’s pychecking. STACK POINTER IS CURRENTLY HERE.

Fork point 2 continues here:

  • moap’s make check didn’t work because pychecker complained about the following code:
    def func():
        d = { 'a': 1, 'b': 2}
        print d.keys()

which triggers, in python 2.6, the following warning:
Object (d) has no attribute (keys)

  • I can’t let myself change code in moap without a working make check, so on to figuring out what’s wrong in pychecker
  • After lots of debugging and print statements, I figure out that pychecker dispatches Python opcodes, and it silently drops the ones it doesn’t know about. Python 2.6 added a new opcode, STORE_MAP, and so pychecker doesn’t properly handle the stack since it ignores the opcode. I should error out on those opcodes. STACK POINTER IS CURRENTLY HERE
  • Before I can fix that though, I decide I should make pychecker’s test suite error out on unknown opcodes.
  • Of course, this will error out differently for different python versions, so I need different python versions on this machine.
  • I could do that by hand, but I’d also like Twisted’s trial to run the testsuite, which also needs zope-interface in recent versions, which needs setuptools. So hey, why not set up jhbuild stuff to build all these python versions ?
  • Python 2.3 on my 64 bit machine doesn’t work with setuptools, so the newest Twisted without it is 1.3.0, and it takes a while to figure out why trial doesn’t run my testcases (it ends up being because 1.3.0 trial expects subclasses of twisted.trial.unittest.TestCase, not unittest.TestCase)
  • I’ll blog about the useful products of my yak shave separately, for those who don’t enjoy descriptions of yak shavings, only outcomes.

    In general, I actually enjoy yak shaves. They’re massive treasure hunts, you learn a lot, and you end up fixing a nice bunch of things all over the stack if you persevere. But it’s probably more a mentality thing than anything else, and I really only indulge myself in these in my spare time.

    FUDCon Berlin slides

    I pitched two barcamp sessions for this FUDCon. The first one is Fedora Development Cycle 3.0, which reviews the recent Fedora Activity Day I ran recently and explains where it is we're trying to go with our proposals. The second is a review of the AutoQA project, where I will explain what the project is, where we're at, where we're going, and where we're looking for help.

    I've uploaded the slides to my fedorapeople page. Enjoy!

    Book review: The tipping point

    I’m not much of a book reader. I often tend to start books, and never finish them. Occasionally I’ll pick them back up with the aim of finishing them, but end up starting over, and then abandoning them again at usually the same point. It’s actually been quite a while since I found a book that I couldn’t stop reading until I finished it. Recently, I found one such book: Malcolm Gladwell’s “The tipping point”.

    The book explains why some ideas ‘tip’ or become really successful, while others never get off the ground. A lot of the book explains the psychological reasons why people behave the way they do in certain scenarios, from well-known theories such as groupthink to defining types of people and the roles they play in the dissemination of ideas. I found Gladwell’s categorisation of people into connectors/mavens/salesmen intriguing, and spent a while thinking of various people I know and trying to categorise them accordingly. (It’s possible for someone to be in more than one category).

    There are also a lot of anecdotes in there to back up most of his points, coming from some wide and varied scenarios (the studies done by the Sesame Street researchers in order to create the perfect ‘sticky’ educational TV show, for example). In some cases they do drag on a bit. In some cases there are multiple examples where one would have sufficed, but overall, it wasn’t tedious whilst hammering home the point.

    My only gripe with the book was that there was no mention of the opposite scenario. There are lots of examples of successful ideas that ‘tipped’, and some ideas that didn’t, but there are no examples or dissection of “Why don’t bad ideas die?”. Some ideas, no matter how many times they get shot down, seem to bubble back up to the surface every so often. I have some of my own theories why these zombie ideas never go away, but I would have liked to have read the author’s take on it.

    Anyway, rambling… Good book. Recommended. Wikipedia also has a pretty decent summary of all the points covered in the book.

    Post from: codemonkey.org.uk

    Stress, sickness, productivity

    The summer semester has been a bit stressful so far — supply chain taking six to nine hours a week just for class has left me with little time to think or breathe, but luckily that ends next week.  As a result, I think my body decided it had had enough and didn’t really fight off whatever the summer flu going around is.  So to add to the busy factor, I was pretty worn down and sick for a few days this week.

    Today, I finally started feeling back to myself and got a lot of productive stuff done. Finally caught up with a lot of bug stuff, got around to updating the machine that I host everything on past Fedora 9 (!), and even sat down tonight to wrap the handlebars on my CAAD9 with new bar tape. Hadn’t done a bar wrapping job before and I think that it came out okay. There are definitely places it could be better and I learned a few things as I went to use next time, but it seems like it’ll work just fine. And as an added bonus, I’m now fairly comfortable that I can do it myself and not have to always get it done at the bike shop.

    Looking forward to getting out tomorrow for a ride — I only commuted one day this week and other than that, it’s been a week since I’ve been on the bike. Longer than I’d choose usually, but I also know when not to push with getting back on the bike to avoid staying sicker longer.

    June 26, 2009

    FUDCon Berlin 2009 Day 1

    It's day one of FUDCon Berlin 2009. Max gave a good intro talk, working in just enough subtle Michael Jackson references to set the mood. Today is hackfests, as well as pre-arranged talks in our area that were put on the LinuxTag agenda. Hackfests are a little light; I think too many people are still trying to take part in the pre-arranged talks and the other LinuxTag offerings. However, the wireless folks seem to be making loads of progress by all being in the same room. I sat in with the Red Hat Security team and discussed some issues regarding security and package signing in Fedora. It was very helpful to have them all there. Loads of other people have also taken advantage of face-to-face time to have quick, high-bandwidth discussions, which has been very valuable. I feel that we've talked about and reached understanding of things that would have taken weeks or longer via email/IRC.

    I'm putting some final touches on the talks I wish to pitch at our barcamp kickoff this evening, and then bracing for the FUDPub impact. Tomorrow will be all FUDCon all day and should be interesting to see how many new faces we see here at the event. This being my first non-North America FUDCon I have no idea what to expect.

    As mentioned before, I have a flickr set for LinuxTag and FUDCon, and there is a fudcon tag that others are using too. I'm also microblogging over at identi.ca for more frequent activity.

    MS XML decryption

    Based on Bill Seddon’s rather excellent notes I’ve implemented decrypting MS XML documents, though only tested it on .xlsx files protected with the default VelvetSweatshop password.

    MSO new-format encrypted documents aren’t in .zip format. Instead, the .zip is encrypted and bundled as a stream into a classic OLE2 structured storage container, which has some other streams that describe how it’s encrypted. Our own encrypted documents, by contrast, remain .zips with encrypted streams.
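As a rough illustration of the container difference, the two formats can be told apart by their leading bytes. The sketch below assumes the usual signatures, `PK\x03\x04` for a .zip whose first record is a local file header and `D0 CF 11 E0 A1 B1 1A E1` for an OLE2 compound file. The function names are made up for this example, and it does no decryption at all.

```python
ZIP_MAGIC = b"PK\x03\x04"                          # zip local file header
OLE2_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"   # OLE2 compound file header


def container_kind(header):
    """Classify a file by its first bytes: 'ole2', 'zip' or 'unknown'."""
    if header.startswith(OLE2_MAGIC):
        return "ole2"
    if header.startswith(ZIP_MAGIC):
        return "zip"
    return "unknown"


def sniff_file(path):
    """Read just enough of a file to classify its container."""
    with open(path, "rb") as handle:
        return container_kind(handle.read(len(OLE2_MAGIC)))
```

An .xlsx that reports `ole2` here is a candidate for the encrypted wrapper described above, while a plain or natively encrypted .zip-based document would report `zip`.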

    Security subsystem changes in the 2.6.30 kernel

    Here's an update on the major changes to the kernel security subsystem for the 2.6.30 kernel.

    • TOMOYO
      The TOMOYO security framework from NTT was merged. This is the first significant LSM scheme to be merged since SELinux in 2003. TOMOYO is characterized by its targeting of non-technical users, where security policy is automatically generated with a "learning mode" tool. This scheme utilizes pathnames for determining access to filesystem objects. Another interesting feature is that a domain, i.e. an active subject which acts on objects, is defined as a history of process invocations, rather than a single process. This allows policy to be applied to a particular branch of processes in the system. For example, an access permitted for init->httpd->perl may not be permitted for init->httpd->bash. Sample policy may be found here.


    • IMA
      IBM's Integrity Measurement Architecture was also merged. This uses the TPM to verify and store cryptographic checksums of files used by the system, i.e. measurement. If a measured file has been modified on disk, this will be detected and stored permanently in the TPM. The aim is to help detect attacks based on modifying files—such as executable binaries or configuration files—although it cannot detect runtime attacks, and requires checksums to be known in advance for the full system startup chain. Files to be measured may be specified in a policy loadable via securityfs.


    • Remove Old SELinux Network Controls
      The original SELinux network controls were deprecated by the iptables-based Secmark system several years ago, although they remained available via the compat_net option for the likely few people who were using them. The old code has now been removed entirely, and users should transition to Secmark: Paul Moore has written a detailed guide for this.
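The TOMOYO idea of a domain as a history of process invocations can be sketched as a policy table keyed by the whole invocation chain rather than by the last program alone. This is only a toy model: the table layout, the `is_allowed` helper and the file path are invented for illustration and bear no relation to TOMOYO's real policy syntax.

```python
# Hypothetical policy: each domain is the full chain of process invocations.
POLICY = {
    ("init", "httpd", "perl"): {("read", "/var/www/data.txt")},
    ("init", "httpd", "bash"): set(),
}


def is_allowed(history, action, path, policy=POLICY):
    """Allow an access only if it is listed for this exact invocation history."""
    return (action, path) in policy.get(tuple(history), set())
```

The same access is granted for the `init->httpd->perl` branch and refused for `init->httpd->bash`, even though both branches share the same parent processes.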

    The remaining changes were primarily bugfixes and enhancements across most parts of the security subsystem, including SELinux, SMACK, and keys.

    Paul and I are finalizing the schedule for the security microconf at the upcoming Linux Plumbers Conference. It's looking like a great line-up at this stage—stay tuned for more details soon.

    June 25, 2009

    I've had hard drives die before, but this one seems to be doing it in an especially pathological way. It started with the kernel throwing irq timeout errors and then stepped up into actual read errors, culminating in a corrupt journal, a read-only block device and a forced fsck. It's been behaving a little better since then, though occasional io stalls (without any kernel error) suggest that it's having to repeatedly retry some sectors. SMART says it's all fine, so obviously I'm backing it all up now before a new disk arrives tomorrow and I can sort out access to the data centre. Email might be a bit spotty for me until then.

    Turns out that running a 2.5" PATA drive for approximately 4 straight years may not have been the best of ideas. Who'd have guessed?

    Multi-process Firefox

    Even if Google Chrome goes nowhere, it has done its duty by lighting the fire under Mozilla's butt (as mentioned previously). Remember how Classpath forced Sun to open Java? Yep. Thank you, evil Google overlords.

    NetworkManager and ConnMan

    Lately ofono and ConnMan have been in the news, and that’s sparked some discussion about how these two projects relate to NetworkManager.  I’ve mostly just been ignoring that discussion and focusing on making NetworkManager better.  But at some point the discussion needs to become informed and the facts need to be straightened out.

    So what makes NetworkManager great?

    • Flexibility: NM’s D-Bus interface provides a ton of control and information about the network connections of your machine.  Developers and applications simply don’t take enough advantage of this.  Imagine mail automatically pulled whenever the corporate VPN is up.  Or more restrictive firewalls when connected to public networks.  Yeah, you can do that today with NetworkManager.
    • Works everywhere: from the mainframe to the power desktop to the netbook and lower.  There’s nothing stopping you from running NetworkManager on an s390 or a Palm Pre.
    • Integration: most users like NetworkManager’s distro integration, so it’s on by default (but can be turned off for running bare-metal).  NM will read your distro’s network config files: ifcfg on Fedora, /etc/network/interfaces on Debian, etc.  It doesn’t pretend the rest of the world doesn’t exist, but it can if you tell it to.
    • Connection Sharing: you can share your 3G connection to the wired or the wifi interface, or the other way around.  How you share is completely up to you.
    • VPN: it’s got plugins for Cisco (vpnc), openvpn, openconnect, and pptp.  An ipsec/openswan plugin is being written.  It’s just easy to use the VPN of your choice.
    • Makes Linux better: by not working around stupid vendor drivers or other broken components, NetworkManager drives many improvements in drivers, kernel APIs, the supplicant, and desktop applications.    Five years ago I posted a list of wifi problems, many of which got fixed because NetworkManager users complained about them.  Stuff like WPA capability fixes, hidden SSID fixes, suspend/resume improvements, Ad-Hoc mode fixes, and lots of improvements to wpa_supplicant to name just a few.  By encouraging drivers to be open, by fixing bugs in the open drivers and the stack instead of hacking around them, and by encouraging vendors to work upstream, NetworkManager makes Linux better for you.

    What great stuff is coming next?

    All in all, a lot of great stuff is on the plate.  NetworkManager already works well for a ton of people, but we’d like to make it work better for a lot more people.  And it will.

    So what about ConnMan?

    I recently came across a slide deck about ConnMan which makes both disappointing and inaccurate claims about NetworkManager.  It’s also worth emphasizing the philosophical differences between the two projects.

    First, ConnMan primarily targets embedded devices, netbooks, and MIDs (slide #1).  When ConnMan was first released in early 2008, NetworkManager 0.7 was under heavy development, and NetworkManager 0.6 clearly did not meet the requirements.  But 0.7, released in November 2008, works well for a wide range of use-cases and hardware platforms.

    NetworkManager scales from netbooks, MIDs, and embedded devices with custom-written UIs to desktops to large systems like IBM’s s390.  You get the best of both worlds: from phenomenal cosmic power down to itty-bitty living space.

    ConnMan explicitly doesn’t try to integrate with existing distributions (slide #5), partly due to its requirement to be as light-weight as possible.  But NetworkManager will use your distro’s normal network config and startup scripts if you tell it to (but you don’t have to).  Early in NetworkManager days we tried to ignore the rest of the world too.  Turns out that doesn’t work so well; users demand integration with their distribution.  But ConnMan doesn’t pretend to be general purpose, and due to its embedded focus, it can wave this issue away.

    Both ConnMan and ofono reject well-established technologies like GObject (but still use glib) in favor of re-implementing much of GObject internally anyway. This is a curious decision as GObject is not a memory hog and not a performance drag for these cases.  The NIH syndrome continues with libgppp, libgdhcp, and libgdbus, where instead of improving existing, widely-used tools like dhclient/dhcpcd, pppd, and dbus-glib, ConnMan opts to re-implement them in the name of being more “lightweight”.  With embedded projects that ConnMan targets (like Maemo and Moblin) already using GObject and dhcpcd, I don’t understand why this tradeoff was made.  Perhaps this visceral dislike of GObject and dbus-glib was one reason the project’s creators decided to write their own connection manager instead of helping to improve existing ones.

    NetworkManager in contrast re-uses and helps improve components all over the Linux stack.  Because of that, more people benefit from the fixes and improvements that NetworkManager drives in projects like avahi, wpa_supplicant, the kernel, pppd, glib, dbus-glib, ModemManager, libnl, PolicyKit, udev, etc.

    Taking a look at the deck

    I have things to say about most of the slides, but I’ll concentrate on the most interesting and misinformed ones instead.

    Supposed Words About NetworkManager

    • Very Complex Design: a complete strawman, because it doesn’t say anything.  NetworkManager 0.7 is a mature project with many useful features.  NM is based around a core of objects, each one performing actions based on signals and events from other objects.  It’s modular and flexible.  It’s just not a ConnMan-style box of lego blocks with a rigid plugin API and all the problems that causes.
    • Large Dependency List: NM requires things like wpa_supplicant, udev, dbus, glib, libuuid, libnl, and a crypto library.  pppd and avahi are optional. This list is certainly not large.  When you take ConnMan and its optional dependencies (most of which are needed in a useful system) the list is just about the same.
    • Too Much Decision-making in the UI: Completely bogus and frankly incomprehensible.  The core NM daemon provides a default policy which is in no way connected to the UI, and the rest is up to the user.  nm-applet contains no policy whatsoever.  If the objection is to nm-applet’s desktop-centric interaction model, then it’s important to know there is no lack of applets for different use-cases.
    • Tries to work around distro problems: this is completely a matter of perspective.  Since Intel was creating its own Linux distribution (moblin), they didn’t have to work around any existing issues; these were simply waved away.  Unfortunately NetworkManager lives in the world of reality and not some universe full of ponies.  For users that expect it, NetworkManager integrates with your distro’s existing network config, init scripts, and DNS resolution.  For users that don’t care, NetworkManager can run bare-metal.
    • Too much GNOME-like source code: seriously, what the hell?  I’m not sure where to begin with this one.  The NetworkManager core does not depend on GNOME.  At all. Yeah, the source-code is in the Gnome style, but is that seriously an issue?
    Uninformed diagram of NetworkManager architecture

    (Misinformation shaded blue for your protection)

    The User Settings service is contained in the applet, and it’s completely optional.  The System Settings service has been merged back into the NetworkManager core daemon and is no longer a separate process.  That same commit ported NM from HAL to udev; thus HAL is no longer required.  NetworkManager always used HAL/udev for device detection instead of RTNL (ie, netlink).  NetworkManager also hasn’t used WEXT for a long time; wpa_supplicant handles kernel wireless configuration.  NetworkManager uses distro networking scripts only for service control, as does ConnMan.  The rest of the slide is quite petty and just splits hairs.

    Where to?

    It’s unlikely that either NetworkManager or ConnMan will disappear in the near future.  That means we’ll all have to live with two mutually exclusive connection managers and two completely different network configuration systems.  I think that’s pretty pointless, but I don’t get the last word anyway, since that’s not how Open Source works.  The users will decide which solution works best for them.  And that means NetworkManager will keep getting better, keep getting more useful, and will continue to be the easiest network management solution around.

    thl and knurd @ LinuxTag 2009 and Fudcon Berlin 2009

    I just arrived in Berlin at the fairgrounds where LinuxTag 2009 and Fudcon 2009 are held. I'm staying till Sunday evening, but like so often I have to split my attention and time as I'm partly here for my employer and partly for fun/Fedora (just like last year at Red Hat Summit Boston).

    As usual that's creating some interesting problems where I ideally would like to be in two places at the same time :-/

    Friday will be the most complicated. For example, I can't attend the Fudcon keynote from Jan Wildeboer, as I'm giving a "Kernel-Log" talk as part of the "Kernel Track" of LinuxTag at just the same time. That wasn't planned like that, but it seems Jonathan Corbet canceled his trip and thus can't give his well-known and always interesting "The Kernel report"; three or four weeks ago I was asked to fill the slot and accepted (the presentation is basically finished; luckily Linus closed the merge window a few hours ago). Note that I'm giving that talk in German, not in English, as Jonathan would have done.

    I also have time and space problems for the barcamp pitching session at 5 pm local time: Together with Nils Magnus from LinuxTag e.V. / Linux New Media I'm moderating the Kernel Kwestioning at that time, where the audience can ask some kernel developers on stage a few questions.

    And those are only the hard-scheduled events which I really can't miss. I also want (have?) to hear at least two of the other talks (Union Mounts and Ksplice) from the Kernel track at LinuxTag, so I will likely walk back and forth between the Kernel track and Fudcon a lot on Friday. But I hope to join the Fudcon crowd early for Fudpub in the evening -- I mean, I likely *have to*, as the Fudpub otherwise might run out of pizza and drinks before I arrive (you know how Fedorians are...).

    It remains to be seen how long I can stay at Fudpub, as I still have to prepare my presentation on RPM Fusion which I'm giving at 2 pm on Saturday at Fudcon. I'll give that talk in English -- I'm not completely sure yet if that will work out, but I guess I have to give it a try sooner or later. So remember to bring eggs and tomatoes to throw at the stage if my English is way too bad ;-)

    So, see you at LinuxTag and Fudcon! This is what I look like. But note, I had to buy an add-on a few months ago: now with glasses (often, not always) ;-)

    P.S.: Headline explanation: those who have contributed to Fedora for a long while will remember that I was initially known as "thl" in the Fedora world. But I already was and still am "thl" at work, and a few years ago I decided it might be better to use a different nick in the Fedora world -- after a bit of thinking I chose "knurd". That way it's at least a bit harder (especially on IRC!) to make the connection "Ohh, the thl that writes the kernel-log for c't/heise.de/the-h is also the one that contributes to Fedora", which I think is a good thing. It's not really a secret, though, as the rare blog entry like this makes it obvious. Trying to keep it a real secret would not work anyway, as the name "Thorsten Leemhuis" is quite rare, so it's not that hard to find out.

    Downtime notification

    This Friday, June 26th, we will be moving to our new house. This also means that kernel newbies, the passive spam block list and the other sites will be offline for most of the day.

    June 24, 2009

    moap 0.2.7 released

    moap is a swiss army knife for maintainers and developers.

    This is MOAP 0.2.7, “MMM…”.

    Coverage in 0.2.7: 1424 / 1899 (74 %), 109 python tests, 2 bash tests

    Features added since 0.2.6:
    - Added moap vcs backup, a command to backup a checkout to a tarball that
    can be used later to reconstruct the checkout. Implemented for svn.
    - Fixes for git-svn, git, svn and darcs.
    - Fixes for Python 2.3 and Python 2.6

    I’ve been fixing things left and right for python 2.6, and in the process I noticed that moap hasn’t had a release for over a year. This release contains mostly bug fixes collected over the year, and a new feature that isn’t implemented yet for all VCS’s. Basically it’s an automatic replacement for something I was doing manually every time I removed an old GNOME cvs/svn/git checkout: figure out what’s in that tree that’s not in the repository (diffs, unversioned files, …), so I can delete everything else and free some disk space.

    The only problem with this release is that, after doing the release, I noticed that Freshmeat removed their XML-RPC interface. Apparently they have some new kind of interface they want people to use. Sigh. But that means 0.2.8 is right around the corner!

    Recovering from a lost /var on Fedora/Red Hat/CentOS

    Last week, after upgrading my home desktop to F11, I had palimpsest tell me one of my disks was broken on the desktop machine. The desktop is running on two 250 GB drives in software raid. It was time to get new drives.

    After a weekend of fiddling with new 1 TB disks for my home desktop, trying failure scenarios, making sure the system can boot from each of the two drives, and waiting for the 4 hour resync of the software RAID in between each step, I finally closed up the desktop machine and cleaned up under my desk again, thinking I was done with my half-yearly messing about with broken disks.

    I guess I was tempting fate anyway. Doing a routine operation on my home server after all the configuration stuff I’d done to set up asterisk last week, suddenly an rsync aborted, a journal errored out, a partition changed to being mounted read-only, and the log was full of scary drive errors. Ouch.

    Well, that’s why I keep around a big box of old drives - for when some drive fails and I want to tempt fate even more by reusing an old drive that’s probably going to fail real soon too. And anyway, I had just spent my hard drive piggybank on the new desktop drives.

    Luckily, I seemed to have a 400 GB SATA drive lying around that used to belong to my media center. I don’t remember why I swapped it out, given that the media center has a 160GB drive for the OS (and two 1.5 TB raid drives for the data, of course), but this was a lucky break. I booted with a rescue cd, and tried copying the root filesystem of my CentOS 5.2 home server partition to this new drive. Which worked fine, except that /var was where I triggered an Input/Output error and some more drive errors in the kernel log.

    So, powered off, took out the broken drive, and put it in a USB chassis. The advantage of a USB chassis is that you can easily just replug the drive to try again, instead of locking up your system terribly and having to reboot. Sadly, /var was broken beyond repair. I ran an e2fsck hoping to recover the contents, and that partly worked, but some of the important stuff is missing even from lost+found (apart from the annoying situation where you have to reconstruct file names, which I usually end up not bothering with).

    But really, how important can /var be ? Turns out, rather important. As in, you need it to boot in the first place. And also, it holds your rpm database. Crap.

    Some Googling gave me some posts on how to reconstruct your rpm database from log files (using --justdb --noscripts --notriggers). But to use those, you actually need those log files. Where are those ? On /var as well. Crap. And they’re not in lost+found either.

    Ok, so time to get creative. Here’s what I ended up doing:

    • create /var/lib/rpm, and run rpm --rebuilddb to end up with an empty rpm database
    • Based on the contents of a directory (/etc in the commands below), figure out what packages ought to be installed:
      rpm -qf /etc/* | grep 'not owned' | cut -f2 -d' ' > /tmp/unowned
      yum --enablerepo=c5-media --disablerepo=base --disablerepo=updates --disablerepo=addons --disablerepo=extras whatprovides `cat /tmp/unowned` | cut -f1 -d' ' | sort | uniq > /tmp/missing
      yum --enablerepo=c5-media --disablerepo=base --disablerepo=updates --disablerepo=addons --disablerepo=extras install `cat /tmp/missing`

    This works by first listing all files that are not owned by rpm (on the first run, that’s all of them), then figuring out which packages can provide these files, and finally installing those packages.

    • Repeat the process for other important directories, like /bin, /sbin, /usr/sbin, /usr/lib, /usr/include, …
    • Clean up .rpmnew files that don’t actually contain differences:
      find / -name '*.rpmnew' | sed 's/\.rpmnew$//' > /tmp/rpmnew
      for c in `cat /tmp/rpmnew`; do echo "$c"; diff "$c" "$c.rpmnew" && mv -f "$c.rpmnew" "$c"; done
    • Same for *.rpmorig:
      find / -name '*.rpmorig' | sed 's/\.rpmorig$//' > /tmp/rpmorig
      for c in `cat /tmp/rpmorig`; do echo "$c"; diff "$c" "$c.rpmorig" && mv -f "$c.rpmorig" "$c"; done
    • Inspect the remaining ones, and merge changes.

    While it’s not an experience I hope to repeat any time soon, it worked out surprisingly well!
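For anyone who would rather script the first step of the recovery above, here is a small sketch of the filtering that the grep and cut pipeline does on the output of rpm -qf. It assumes rpm's usual "file PATH is not owned by any package" message. The function name is made up, and like the cut command it will mishandle paths containing spaces.

```python
def unowned_paths(rpm_output):
    """Return the sorted, unique paths that rpm -qf reported as unowned."""
    paths = set()
    for line in rpm_output.splitlines():
        if "not owned" not in line:
            continue  # an owned file: rpm printed the package name instead
        fields = line.split(" ")
        if len(fields) > 1:
            paths.add(fields[1])  # second field, same as cut -f2 -d' '
    return sorted(paths)
```

The resulting list is what would then be handed to yum whatprovides, exactly as in the shell version.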

    Upgrading to F11

    I managed to completely skip updating to F10. All my machines (work desktop, home desktop, laptop, media center) were running F9 without any real problems I worried about.

    But of course I was curious. And, especially with the move to python 2.6, things I care about were bound to break.

    So, last weekend I took the plunge, and after a little over a week here are my first impressions:

    • Overall F11 looks slick. Nice work on the artwork! I particularly liked the GDM background, looking like an ancient brushed metal object, reminding me of how I used to love playing Gods by the Bitmap Brothers.
    • Apparently anaconda now has bugzilla integration, allowing you to file a bug directly from inside anaconda. Luckily for Jeremy (who I assume still maintains it) it has some code in there to look for existing bug entries with the same backtrace. Very nice!
    • Of course, I wouldn’t have found out if I hadn’t run into exceptions in anaconda. I ran into them while setting up two completely new hard disks with two software RAID partitions and LVM on the second one.
    • I first installed my work desktop, as usual putting the new installation on a separate partition, keeping my old one around in case the install goes wrong or F11 just isn’t stable enough for me. For me, that involves having a /mnt/alpha and /mnt/omega partition between which I alternate. At some point I should figure out if other people do this too and if it makes sense for anaconda to support something like this and at least allow me to keep my GRUB configuration for the older installation. For now I do this manually, using a hugeupgrade text file I follow each time I upgrade, accumulating more and more steps each time I perform the procedure.
    • On my home machine, when I booted into F11, as usual my second monitor didn’t work (I have a Radeon GeCube Pro 2400). My own fault really - I should have tried to get a patch upstream into the default radeon driver the same way I sent a patch for the radeonhd driver that I still use. A rebuild later, I at least had the old radeonhd driver rebuilt to get my second screen working again.
    • Having the second screen now made me change my mind completely about the GDM wallpaper. That lion on the right hand side that I didn’t see before completely ruins the style for me. Sorry!
    • Upon logging in to the work desktop, I had no network. Completely puzzled as to why, until I figured out that I had to actually right-click on NetworkManager’s tray icon, choose to configure, and activate eth0 by default. After some browsing it seems that this was a deliberate choice to increase security. While I can possibly sympathize with the motivation for doing so, it really is terrible to change this by default and not provide *any* indication during or after installation. At the very least, the following things could have been done:
      1. provide a clear notice during installation, and allow a user to choose to enable it anyway, assuming the security risk
      2. the same, but during firstboot
      3. after logging in, having the network manager tooltip say ‘the network is disabled by default in this new release, here’s how you enable it’

      I am not entirely sure what the security problems are with enabling the network after installation. The default firewall is pretty locked down, SELinux is enabled by default, and there’s no way I can install updates without the network anyway. But I’m sure that I could find huge bikeshedding threads on fedora-devel about this if I really cared why this was decided.

    • Upon logging in to the home desktop, I was greeted with a tooltip saying that one of my drives was going bad. That was a nice touch! Really good idea to have something like that be monitored by default. This prompted me to ponder to finally replace my desktop’s 250 GB PATA drives with real SATA drives - a story for another post.
    • Various deprecation warnings pop up running various Python programs, including my own. Flumotion needed a patch for running against 2.6 (I rebuilt and pushed to F11). So I have some cleanup ahead, and I should revisit pychecker soon.
    • The first piece of functionality I checked was Evolution’s Google Calendar integration. It still seems a bit shaky, given that I had to restart Evolution a few times as it froze doing stuff with the net, but it does seem to work. That means I will finally be able to accept work invitations done through Outlook and get them on my Google Calendar! Awesome. Now if only I didn’t have to manually configure each of the ten calendars I’m interested in…
    • At work, when I played a video using XVideo, my machine instantly froze. Seems to be a known bug. The intel drivers are being rewritten. I’ve never quite understood why rewriting is an excuse for breaking stuff that worked (I should check if Firewire video finally works reliably now when I have the chance, for example), but all in the name of progress I guess.
    • I don’t know why it’s happening, but once in a while my screens blank. Even in the middle of doing stuff. If I were a gamer I’d be hugely annoyed as my character would be shot through the head in that split instant. The closest bug I can find is this one, where I commented. Hugely annoying bug because I don’t even know how to begin debugging a bug like this that I can’t catch in the act.
    • PulseAudio integration in GDM seems a bit fragile. I have my pulseaudio configured to send audio to my media center pulseaudio server. Sometimes, after choosing a username in GDM, it doesn’t manage to play the audio sample related to that action, and then GDM is stuck there not showing me the password entry dialog. Pretty sure it’s due to blocking on pulseaudio, because when I kill it the password dialog appears. Pretty painful bug for new users though.

    All in all, not a bad first week experience, and seems like a solid release. Now, off to rebuild bits and pieces, and clean up Python 2.6 deprecation warnings…

    I spent last week in the US, during which I discovered that my preconceptions of Albany as a post-industrial wasteland with no redeeming features were incorrect (it's actually a post-industrial wasteland with some decent bars), made minor contributions (mostly in the form of pesto) to a prize winning chalk picture and cycled on the wrong side of the road for the first time in a ridiculous number of years.

    But more relevantly, I spent most of the week trapped in Westford. One of the things I've been looking into lately is USB autosuspend, a kernel feature which allows devices to be powered down when they're not in use. This is especially interesting for USB, since it's a poll-based protocol. If the end device is powered up then bus traffic will be generated, even if everything's idle. USB autosuspend allows this to be avoided and thus saves a worthwhile amount of power.

    However, drivers need to support USB autosuspend before it's useful. Further, devices need to be able to cope with it. Some older kernel releases had autosuspend enabled by default, leading to all kinds of fun as people discovered that their scanners and printers had stopped working. This was fixed by leaving autosuspend disabled by default for most hardware. The first component that I'm working on is support for drivers to indicate whether or not a given piece of hardware supports autosuspend, allowing it to be enabled by default for that kernel. Handling this at a per-driver level means we shouldn't end up with obnoxious white or blacklists. The second is adding support for autosuspend to more drivers. Right now I'm concentrating on hardware that's common in laptops. There's support for autosuspend in the qcserial and uvc drivers, and some experimental (and unmerged) patches for bluetooth and some other modem drivers. Palm have just released their kernel code, which includes autosuspend support for the cdc-acm driver. I'm working on cleaning these up and getting them into a mergeable state, and part of what I was doing in Westford was testing that various pieces of hardware work correctly. The good news is that they seem to, and with a bit of luck we'll try shipping support for a lot of this in F12. If that doesn't end up causing real world problems then it ought to be safe to merge it all to mainline.

    The final component of this is handling autosuspend on devices with userspace drivers. Our only real choice there is for packages to ship udev fragments that enable it for working hardware. I've added support to the fprint package in rawhide, which means that fingerprint readers should now be powered down unless they're opened. Nobody seems to have complained yet, so I'm cautiously optimistic that this can be sent upstream without any problems.

    LinuxTag / FUDCon Berlin 2009 Flickr feed

    I've created a set of photos in my flickr for LinuxTag and FUDCon Berlin 2009. You can follow it here. All the photos should be tagged with fudcon as well.

    Data about Data

    Warning: Long, technical post

    One of the few remaining icky areas of the Nautilus codebase is the metadata store. It’s got some weird inefficient XML file format, the code is pretty nasty and its data is not accessible to other apps. It’s been on my list of things to replace for quite some time, and yesterday I finally got rid of it.

    The new system is actually pretty cool, both in its API and in how it works internally. So, I’m gonna spend a few bits on explaining how it works.

    Let’s start with the requirements and then we can see how to fulfil these. We want:

    • A generic per-file key-value store with string and string list values. (String lists are required by Nautilus for e.g. emblems)
    • All apps should be able to access the store for both writing and reading.
    • Access, in particular read access, needs to be very efficient, even when used in typical I/O fashion (lots of small calls intermixed with other file I/O). Getting the metadata for a file should not be significantly more expensive than a stat syscall.
    • Removable media should be handled in a “sane” way, even if multiple volumes may be mounted in the same place.
    • We don’t require transactional semantics for the database (i.e. no need to guarantee that a returned metadata set is written to stable storage). What we want is something I call “desktop transaction semantics”.
      By this I mean that in case of a crash, it’s fine to lose what you changed in the recent history. However, things that were written a long time ago (even if recently overwritten) should not get lost. You either get the “old” value or the “new” value, but you never ever get neither or a broken database.
    • Homedirs on NFS should work, without risking database corruption if two logins with the same homedir write concurrently. It is fine if doing so may lose some of these writes, as long as the database is not corrupted. (NFS is still used in a lot of places like universities and enterprise corporations.)

    Seems like a pretty tall order. How would you do something like that?

    Performance

    For performance reasons it’s not a good idea to require IPC for reading data, as doing so can block things for a long time (especially when data are contended; compare e.g. with how gconf reads are a performance issue on login). To avoid this we steal an idea from dconf: all reads go through mmapped files.

    These are opened once and the file format in them is designed to allow very fast lookups using a minimal amount of page faults. This means that once things are in a steady state lookup is done without any syscalls at all, and is very fast.

    Writes

    Metadata writes are handled by a single process that ensures that concurrent writes are serialized when writing to disk.

    Clients talk to the metadata daemon via dbus. The daemon is started automatically by dbus when first used, and it may exit when idle.

    Desktop Transaction semantics

    In order to give any consistency guarantees for file writes fsync() is normally used. However this is overkill and in some cases a serious system performance problem (see the recent ext3/4 fsync discussion). Even without the ext3 problem an fsync requires a disk spinup and rotation to guarantee some data on disk before we could return a metadata write call, which is quite costly (on the order of several milliseconds at least).

    In order to solve this I’ve made the file format for a single database be in two files. One file is the “tree” which contains a static, read only, metadata tree. This file is replaced using the standard atomic replace model (write to temp, fsync, rename over).
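    The "write to temp, fsync, rename over" model can be sketched roughly as follows. This is a minimal Python illustration, not the code in the metadata daemon; the function name atomic_replace and the ".tmp-" prefix are made up for the example.

```python
import os
import tempfile


def atomic_replace(path, data):
    """Replace `path` with `data` so readers see the old or the new file, never a mix."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live in the same directory so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # the new contents are on disk before the rename
        os.replace(tmp_path, path)  # atomic rename over the old file
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

    The fsync here is the slow part, which is why this path is only taken for the rarely replaced "tree" file.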

    However, we rarely change this file, instead all writes go to another file, the “journal”. As the name implies this is a journal oriented format where each new operation gets written at the end of the journal. Each entry has a checksum so that we can validate the journal on read (in case of crash) and the journal is never fsynced.

    After a timeout (or when full) the journal is “rotated”, i.e. we create a new “tree” file containing all the info from the journal and a new empty journal. Once something is rotated into the “tree” it is generally safe for long term storage, but this slow operation happens rarely and not when a client is blocking for the result.
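    To make the journal idea concrete, here is a small Python sketch of an append-only journal with per-entry checksums. The on-disk layout (a length and CRC32 header followed by a JSON payload) and all function names are assumptions for illustration only; the real file format is different.

```python
import json
import struct
import zlib

HEADER = struct.Struct("<II")  # payload length, CRC32 of payload (hypothetical layout)


def append_entry(journal_path, key, value):
    """Append one operation to the end of the journal. Deliberately no fsync."""
    payload = json.dumps({"key": key, "value": value}).encode("utf-8")
    with open(journal_path, "ab") as journal:
        journal.write(HEADER.pack(len(payload), zlib.crc32(payload)) + payload)


def read_entries(journal_path):
    """Return all valid entries, stopping at the first truncated or corrupt one."""
    try:
        with open(journal_path, "rb") as journal:
            blob = journal.read()
    except FileNotFoundError:
        return []
    entries, offset = [], 0
    while offset + HEADER.size <= len(blob):
        length, checksum = HEADER.unpack_from(blob, offset)
        start = offset + HEADER.size
        payload = blob[start:start + length]
        if len(payload) != length or zlib.crc32(payload) != checksum:
            break  # a crash left a partial entry; everything before it is still good
        entries.append(json.loads(payload.decode("utf-8")))
        offset = start + length
    return entries


def replay(tree, journal_path):
    """Merge the journal into a copy of the tree, as a rotation would."""
    merged = dict(tree)
    for entry in read_entries(journal_path):
        merged[entry["key"]] = entry["value"]
    return merged
```

    On rotation, the merged result would be written out as the new "tree" with the atomic replace model and the journal started over empty.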

    NFS homedirs

    It turns out that this setup is mostly OK for the NFS homedir case too. All we have to do is put the journal file in a non-NFS location like /tmp so that multiple clients won’t scribble over each other. Once a client rotates the journal, the result will be safely visible to every client (although some clients may lose recent writes in case of concurrent updates).

    There is one detail with atomic replace on NFS that is problematic. Due to the stateless nature of NFS an open file may be removed on the server by another client (the server doesn’t know you have the file open), which would later cause an error when we read from the file. Fortunately we can work around this by opening the database file in a specific way[1].

    Removable media

    The current Nautilus metadata database uses a single tree based on pathnames to store metadata. This becomes quite weird for removable media where the same path may be reused for multiple disks and where one disk can be mounted in different places. Looking at the database it seems like all these files are merged into a single directory, causing various problems.

    The new system uses multiple databases. libudev is used to efficiently look up the filesystem UUID and label for a mount and, if that is available, we use it as the database id, storing paths relative to that mount. We also have a standard database for your homedir (not based on UUID etc, as the homedir often migrates between systems, etc) and a fall-back “root” database for everything not matching the previous databases.
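    A rough Python sketch of that database selection follows. The function name, the "uuid-"/"label-" id prefixes, the shape of the mounts list and the choice to check the homedir before the mounts are all assumptions made for the example; the real lookup goes through libudev.

```python
import os


def choose_database(path, home, mounts):
    """Return (database_id, path_stored_in_that_database).

    `mounts` is a hypothetical list of (mount_point, uuid_or_None, label_or_None).
    """
    path = os.path.normpath(path)
    home = os.path.normpath(home)

    def is_under(parent):
        return path == parent or path.startswith(parent.rstrip(os.sep) + os.sep)

    # Assumption: the homedir database wins, since it must not depend on a UUID.
    if is_under(home):
        return "home", os.path.relpath(path, home)

    # Otherwise use the most specific mount that contains the path.
    for mount_point, uuid, label in sorted(mounts, key=lambda m: len(m[0]), reverse=True):
        mount_point = os.path.normpath(mount_point)
        if is_under(mount_point):
            if uuid:
                return "uuid-" + uuid, os.path.relpath(path, mount_point)
            if label:
                return "label-" + label, os.path.relpath(path, mount_point)
            break  # no usable id for this mount: fall back to "root"

    return "root", path
```

    Storing paths relative to the mount is what lets the same disk keep its metadata when it shows up at a different mount point.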

    This means that we should seamlessly handle removable media as long as there are useful UUIDs or labels and have a somewhat ok fall-back otherwise.

    Integration with platform

    All this is pretty much invisible to applications. Thanks to the gio/GVfs split and the extensible gio APIs things are automatically available to all applications without using any new APIs once a new GVfs is installed. Metadata can be gotten with the normal g_file_query_info() calls by requesting things from the “metadata” namespace. Similar standard calls can be used to set metadata.

    Also, the standard gio copy, move and remove operations automatically affect the metadata databases. For instance, if you move a file its metadata will automatically move with it.

    Here is an example:

    $ touch /tmp/testfile
    $ gvfs-info -a "metadata::*" /tmp/testfile
    attributes:
    $ gvfs-set-attribute /tmp/testfile metadata::some-key "A metadata value"
    $ gvfs-info -a "metadata::*" /tmp/testfile
    attributes:
      metadata::some-key: A metadata value
    $ gvfs-copy /tmp/testfile /tmp/testfile2
    $ gvfs-info -a "metadata::*" /tmp/testfile2
    attributes:
      metadata::some-key: A metadata value

    Relation to Tracker

    I think I have to mention this since the Tracker team want other developers to use Tracker as a data store for their applications, and I’m instead creating my own database. I’ll try to explain my reasons and how I think these should cooperate.

    First of all there are technical reasons why Tracker is not a good fit. It uses sqlite which is not safe on NFS. It uses a database, so each read operation is an IPC call that gets resolved to a database query, causing performance issues. It is not impossible to make database use efficient, but it requires a different approach than how file I/O normally looks. You need to do larger queries that do as much as possible in one operation, whereas we instead inject many small operations between the ordinary i/o calls (after each stat when reading a directory of files, after each file copy, move or remove, etc).

    Secondly, I don’t feel good about storing the kind of metadata Nautilus uses in the Tracker database. There are various vague problems here that all interact. I don’t like the mixing of user specified data like custom icons with auto-extracted or generated data. The tracker database is a huge (gigabytes) complex database with information from lots of sources, mostly autogenerated. This risks the data not being backed up. Also, people having problems with tracker are prone to removing the databases and reindexing just to see if that “fixes it”, or due to database format changes on upgrades. Also, the generic database model seems like overkill for the simple stuff we want to store, like icon positions and spatial window geometry.

    Additionally, Tracker is a large dependency, and using it for metadata storage would make it a hard dependency for Nautilus to work at all (to e.g. remember the position of the icons on the desktop). Not everyone wants to use tracker at this point. Some people may want to use another indexer, and some may not want to run Tracker for other reasons. For instance, many people report that system performance suffers when using Tracker. I’m sure this is fixable, but at this point it’s imho not yet mature enough to force upon every Gnome user.

    I don’t want to be viewed like any kind of opponent of Tracker though. I think it is an excellent project, and I’m interested in using it, fixing issues it has and helping them work on it for integration with Nautilus and the new metadata store.

    Tracker already indexes all kinds of information about files (filename, filesize, mtime, etc) so that you can do queries for these things. Similarly it should extract metadata from the metadata store (the size of this pales in comparison to the text indexes anyways, so no worries). To facilitate this I want to work with the Tracker people to ensure tracker can efficiently index the metadata and get updates when metadata changes for a file.

    Where to go from here

    While some initial code has landed in git everything is not finished. There are some loose ends in the metadata system itself, plus we need to add code to import the old nautilus metadata store into the new one.

    We can also start using metadata in other places now. For instance, the file selector could show emblems and custom icons, etc.

    Footnotes

    [1] Remove-safe opening a file on NFS:
    Link the file to a temporary filename, open the temp file, unlink the tempfile. Now the NFS client on your OS will “magically” rename the tempfile to .nfsXXXXX something and will track this fd to ensure this gets removed when the fd is closed. Other clients removing the original file will not cause the .nfsXXXX link on the server to be removed.
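    The link/open/unlink sequence from the footnote looks roughly like this in Python. It is only a sketch: the function name and the ".open-" temp name are invented for the example, and the .nfsXXXX renaming itself is done by the NFS client, not by this code.

```python
import os
import uuid


def open_remove_safe(path):
    """Open `path` through a temporary hard link that is unlinked right away."""
    directory = os.path.dirname(os.path.abspath(path))
    # Hard links cannot cross filesystems, so the temp name lives next to the file.
    tmp_name = os.path.join(directory, ".open-" + uuid.uuid4().hex)
    os.link(path, tmp_name)          # 1. link the file to a temporary filename
    try:
        handle = open(tmp_name, "rb")  # 2. open the temp file
    finally:
        os.unlink(tmp_name)          # 3. unlink the temp file (the fd stays valid)
    return handle
```

    After this, another client replacing or removing the original name does not take the data away from the open handle.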

    10th bugiversary

    Some part of me will always be a QA guy, so it is nice to note that today is the tenth anniversary of my first formal bug filing (and first formal participation in Mozilla, I believe): mozilla bugzilla bug 8749, nested <DL> tags don’t display properly. Happy bugday to me, happy bugday to me… :)

    SELinux Developer Summit: CfP closes 1st July (1 week)

    Just a reminder for SELinux developers and anyone interested in the internals of SELinux that the SELinux Developer Summit CfP closes on July 1st, which is about a week away.

    SELinux logo


    Details of the CfP are here. Don't forget to join the event mailing list if you're attending.

    Proposals for presentations, lightning talks, and development sessions should be submitted via email per the instructions in the CfP. Proposals do not need to be especially detailed: if you have a good idea, simply send it in.

    mystery object


    For reading this, you are rewarded with a mystery object (pictured above). See if you can figure out what it is before clicking on it and reading the comments @ flickr.

    June 23, 2009

    Looking for two roommates to move in on September 1st

    Hey everyone, my two roommates are moving out to be closer to the city. Those who have been to my place know it is an amazing apartment in North Cambridge, MA with a five minute walk to the Alewife T station. If you are looking to move or know someone who needs a place let me know. The lease is for a year and is super inexpensive at $635 a month per person plus utilities.

    Oh and you will have to like dogs since Ty is staying too :)


    It's now been over 6 months since Poulsbo hardware with Intel's GMA500 graphics core started shipping in volume. And we're still utterly lacking in any sort of worthwhile driver. It's an impressive turnaround from the recent days when the straightforward recommendation for mobile Linux hardware was "anything that has lots of Intel stuff in lspci", and while the Poulsbo situation in itself doesn't change that hugely it's potentially symptomatic of a worrying trend within parts of Intel.

    The first thing to realise here is that, like most large companies, Intel consists of a large number of business units with different priorities. Their open-source technology center has historically had responsibility for providing Linux support for hardware, but this obviously depends on other business units cooperating with them. And there's strong evidence that many of those business units don't get it.

    There's been signs of this for some time. Back before the days of the Intel X.org driver gaining native modesetting support, some people ran the Intel embedded graphics driver. This was (is?) a closed X driver that was able to provide native modesetting on platforms that could only otherwise be run at incorrect resolutions. One business unit was shipping a driver that was more functional than the official Intel Linux driver. To the best of my knowledge, none of that code was ever used in the rewritten Intel driver that now provides the same features.

    Poulsbo is another example of this. Intel wanted a low-power mobile graphics chipset and chose to buy in a 3D core from an external vendor. IP issues prevent them from releasing any significant information about that 3D core, so the driver remains closed source. The implication is pretty clear - whichever section of Intel was responsible for the design of Poulsbo presumably had "Linux support" as a necessary feature, but didn't think "Open driver" was a required part of that. There's not a lot any other body inside Intel can do once IP-limiting contracts are signed and the hardware's shipping, but it ends up tarnishing the good reputation that other parts of Intel have built up anyway.

    And while Poulsbo is the most obvious example of this to date, it's not the only one. Intel recently decided to make the EFI development kit discussion lists private. Various drivers for Moorestown (the followup platform to Poulsbo) have been submitted to the Linux kernel, and while they have the advantage of being GPLed they have the disadvantage of being barely above the level of typical vendor code. Objections that chunks of them simply don't integrate into Linux correctly have done little to get these problems fixed - I still have no real idea how the runtime interface to power management on the SD driver is supposed to be used, but I suspect the answer is probably "badly".

    This all makes sense if you assume that there are large groups of people in Intel who don't talk to each other. But to the casual observer it just looks schizophrenic. Explaining to an irate user that the Intel who shipped a closed Linux graphics driver is only barely the same Intel who contribute so much to architectural improvements in the Linux graphics stack doesn't make their hardware work. And while all of this confusion is going on, Intel's competitors are catching up. Atheros are now making significant contributions to the state of Linux wireless. AMD are releasing graphics chipset documentation faster than Intel, and radeon support is improving rapidly.

    Is the future going to be one where we can no longer simply say that Intel hardware will Just Work? Is their work on Moblin (easily the most compelling Linux UI for netbooks) going to be wasted on the broader Linux community because it'll mostly end up running on hardware that's not supported by the mainline Linux kernel? Does Intel have a real commitment to open source, or is that being lost in the face of short-term requirements?

    Intel need to demonstrate that they have a company-wide understanding of what Linux support actually means or risk losing much of what they've earned over the past few years. I'm desperately hoping that Poulsbo and what we've seen so far of Moorestown are the exception, not the future norm.

    June 22, 2009

    ­ëÞ­é݆Ší¢x^jÖڕë(}Êj)Þ

    Monitor plug and play works by storing a descriptor called EDID in the monitor. The most important thing in this descriptor is the list of modes that the monitor supports. You would think this would be a fairly straightforward thing, but it's actually not, because you've only got 128 bytes to work with. So there end up being multiple ways to encode a mode in EDID:

    • the Established Timings fields I and II and Manufacturer's Timings, which is a three-byte bitmap of 17 industry-standard timings, plus seven bits of "manufacturer's mask" which could in principle be more modes but no manufacturer publishes what they mean
    • the Standard Timings field, which is an array of eight two-byte codes, thus: eight bits for width in multiples of 8 with 0x01 meaning 256; two bits for image aspect ratio (with one of the combinations overloaded to mean 16:10 or 1:1 depending on the spec revision); six bits for field refresh rate minus 60
    • Up to four of various 18-byte detailed monitor descriptors, including:
      • Detailed Timing descriptors, which encode more or less every timing parameter including sync intervals and polarity, borders, interlacing, stereo, star sign, etc.
      • Display Range Limit descriptors, which can describe sync range intervals and optionally mark them as supporting the GTF or CVT timing formulas, implying that any mode conforming to those sync ranges and timing formulas is supported
      • arrays of up to six Standard Timing identifiers, encoded as above
      • arrays of up to four 3-byte CVT descriptors, which encode twelve bits for image height, two bits for aspect ratio, a two-bit enum for preferred refresh rate, and a five-bit mask for supported refresh rates
      • the Established Timings III descriptor, which comprises 44 bits of yet more industry-standard timings as published in the DMT spec, and 52 unused bits set to zero
      • potentially any of sixteen manufacturer-specific detailed blocks, none of which are documented, any of which could contain mode support information
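    To give a feel for how terse these encodings are, here is a Python sketch that decodes one two-byte Standard Timing code as described in the list above. The function name is made up; treating revision 1.3 as the point where the overloaded aspect code switches from 1:1 to 16:10, and 0x01 0x01 as the "unused slot" filler, is my reading of the spec and should be checked against it.

```python
def decode_standard_timing(byte1, byte2, edid_revision=3):
    """Decode a two-byte Standard Timing code into (width, height, refresh_hz).

    Returns None for the unused-slot filler value 0x01 0x01.
    """
    if (byte1, byte2) == (0x01, 0x01):
        return None
    # Eight bits of width in multiples of 8, offset so that 0x01 means 256.
    width = (byte1 + 31) * 8
    # Top two bits: aspect ratio. Code 0 is the overloaded one.
    aspect_code = byte2 >> 6
    aspects = {
        0: (16, 10) if edid_revision >= 3 else (1, 1),
        1: (4, 3),
        2: (5, 4),
        3: (16, 9),
    }
    aspect_w, aspect_h = aspects[aspect_code]
    height = width * aspect_h // aspect_w
    # Low six bits: refresh rate minus 60.
    refresh_hz = (byte2 & 0x3F) + 60
    return width, height, refresh_hz
```

    For example, decode_standard_timing(0x81, 0x80) gives (1280, 1024, 60): the height is never stored, only implied by the aspect ratio bits.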
    Not content with the restrictive expressive power of base EDID, an extension mechanism was introduced.  You can have up to 255 128-byte extension blocks, including:
    • the Consumer Electronics Association extension, which lets you specify:
      • Arbitrary numbers of Short Video Descriptors, which are one-byte indices into a list of timings in the CEA spec
      • Arbitrary numbers of detailed timing blocks, encoded as in base EDID
    • the Video Timing Block extension, which lets you specify:
      • Zero to six detailed timing blocks, encoded as in base EDID
      • Zero to forty CVT 3-byte descriptors, encoded almost as in base EDID except with fewer legal aspect ratios
      • Zero to sixty-one Standard Timing descriptors, encoded as in base EDID
    • the Device Information extension, which lets you specify:
      • A bit under the Miscellaneous Display Capabilities section for "VGA/DOS Legacy Timing Modes" support, which is a predefined set of 24 low-resolution modes you remember from EGA
    • Manufacturer-specific extension blocks, which are naturally undocumented and could contain timing info
    Naturally, the display industry found this lack of flexibility entirely unacceptable.  So, soon and very soon, you will also begin to see displays that conform to the newer DisplayID specification.  Fortunately, to minimize time to market and maximize software reuse, DisplayID timing descriptors are entirely compatible with EDID:
    • Type I detailed timings, which are encoded exactly like the detailed timing descriptors in EDID, except for using an additional byte for pixel clock, moving the stereo and interlace bits, adding an aspect ratio field, eliminating borders, and widening the storage for all the horizontal and vertical timing numbers
    • Type II detailed timings, which are encoded exactly like the detailed timing descriptors in EDID, except for being only 11 bytes long, using an additional byte for pixel clock, moving the stereo and interlace bits, eliminating borders, storing all horizontal timing parameters in terms of eight-pixel character cells, and widening the storage for vertical timing numbers
    • Type III short timings, which encode CVT timings in three bytes exactly like CVT 3-byte descriptors in base EDID, but do it as a bit for preferred-timing-or-not, a bit for reduced-blanking-or-not, four bits for aspect ratio, eight bits for horizontal image size in eight-pixel character cells, a bit for interlaced-or-not, and seven bits for refresh rate
    • Type IV timing codes, which are one-byte indices into the DMT timing list, unlike any EDID encoding method
    • Supported VESA Timings data blocks, which is a ten-byte bitfield corresponding to various modes in DMT, but not in the order that they're in the DMT spec, nor in the same order as the Established Timings I through III in base EDID
    • Supported CEA Timings data blocks, which is a seven-byte bitfield corresponding (shockingly) to a superset of the CEA mode list in the most recent version of the spec I have handy, and even in the same order as in the spec
    • Display Range Limit descriptors, which are exactly like the encoding in base EDID, except for widening the storage for pixel clock, shrinking the storage for horizontal and vertical sync limits, adding maxima for horizontal and vertical blanking, and removing support for GTF
    • CEA-defined blocks, which are presumably in a newer version of the spec than I have
    • Manufacturer-specific data blocks
    Did I say compatible?  That other thing.

    So the other great part about DisplayID is that it requires host software updates to work, because - afaict - it's published at the same I2C address as EDID was.  This is kind of okay for laptops, since you know what software is shipping on them.  It's less okay for standalone displays, because now it raises the very real possibility of being able to buy a monitor that is too new to work with your old operating system.

    Even better, drivers using VESA BIOS Extensions (which admittedly are kind of boned already) are about to be left out in the cold.  The VBE spec only defines a call to get EDID, and no legal DisplayID block can possibly be confused for an EDID block, so if your video BIOS actually checks for the EDID block header, there's no way to get it to return DisplayID anyway.  You now have a combination of video card, cable, and monitor, that can not be used together with VBE.  For people running the unfortunate sort of OS you have to pay for, there's the other kind of connector conspiracy too, where you probably can't get a combination of (native driver that supports DisplayID) and (operating system with native driver support) for your hardware either.

    Thanks VESA.

    Mobile Broadband Assistant makes it Easy

    Yay! Mobile broadband with NetworkManager is so simple!

    (credit mandolin davis)

    Easier than your… well, you probably know where I was going with that.  It’s a great leap forward for NetworkManager usability.  Other operating systems either don’t have one, or your network operator gives you the software so of course you don’t have to configure it.  On Linux, we like to work for everyone, so we get to make it easy to get connected to the operator of your choice.

    Antti does the base

    Antti Kaijanmäki did some work last summer (2008) to put together the mobile broadband provider database and write a library and assistant to use that data.  That was a great start, and Ubuntu started shipping it as a patch in 8.10.  Seems to have worked fairly well there, but since we were deep in the middle of getting the NM 0.7 release out at that time, it wasn’t possible to integrate then.  Antti’s patch didn’t get committed to 0.7.1 for mostly licensing and scope-related reasons, but he built the database which the assistant that just hit git uses, and he proved that it was something users wanted.

    Tambet then wrote a compatibly-licensed library to parse the database for network-manager-netbook, which means I didn’t have to, which was nice.

    Implemented with Máirín-induced goodness

    So a few weeks ago I started rewriting the pitiful GSM/CDMA chooser dialog in network-manager-applet into a full GtkAssistant-based helper.  I’m not an interaction expert, so I tricked Máirín Duffy into helping me get the flow and design planned out.  Then we iterated over my implementation and fixed what sucked, and came out with something that works pretty well.  Starting from the user’s perspective is incredibly important, and that’s what we did with the mobile broadband assistant.

    Why do you want this?  (or, WTF is an APN?)

    Because you probably have no idea what a GPRS APN is, or why you need the right one to make things work.  Nor should you have to.  At least CDMA got this right by not having one; they are an interaction nightmare.  Your provider knows exactly what you’re paying for, so they know exactly what to bill you for when you use various services they offer.  But when connecting to GPRS data services, you need to tell your phone or device what APN you’d like to use when connecting, which in turn tells the provider how you’d like to be billed for it.  But this sort of access control is simply at the wrong level, and having it at the GSM level instead of the application level sucks for users.

    There are different APNs for everything; for example T-Mobile USA splits it up as follows:

    • wap.voicestream.com - for T-Zones, the WAP-based walled-garden for dumbphones ($6/mo)
    • internet2.voicestream.com - Unblocked access to anything using a NAT-ed IP address ($20/mo)
    • internet3.voicestream.com - Unblocked access to anything using a public, routable IP address for certain VPN clients (also $20/mo)
    • epc.tmobile.com - new, nobody’s quite sure what it’s for

    Some providers have a separate APN that downsamples JPEGs to save data costs, others have separate APNs for pay-as-you-go versus contract (they already know whether your IMSI is contract or not, so this baffles me), others have separate APNs per region they serve (BSNL India).  It’s a freaking mess.

    Sanity through NetworkManager

    APNs don’t really change that often, so it’s easy to build up a crowdsourced database of current providers and their APNs.  Which is what Antti did, and that worked out really well.  Máirín decided to loosely map the APN to a provider’s billing plan, which usually maps to brand name or service that users actually care about.  So I reorganized the mobile broadband provider database to allow multiple “services” (ie, APN or CDMA) for each provider.  This almost gave me carpal tunnel since it’s not easily scriptable.

    Second, I added all the MCC/MNCs (network identification numbers) that I could find, so that in the future we can read your IMSI off your SIM card and automatically suggest your provider when you plug in your phone or data card the first time.  That’s pretty hot.

    Hot Pics

    When you first insert your card, it shows up in the applet menu.  Ubuntu has a patch that will nag you with a notification that you’ve just plugged in new 3G hardware; they’re welcome to port that to the new code and submit.  For now, it looks like this:

    hotplug

    When you click that, you’ll get the Assistant’s intro page:

    intro

    This page explains some of the information you’ll need.  Hopefully you know what provider you signed up for, but if you don’t, you seriously need to stop getting drunk before 3 in the afternoon.  You probably also know what country you’re in; if not, seriously, get a GPS.  I can’t help you with that.

    country

    Now it gets a little tougher, but since you’re filling a wheelbarrow with your money and dumping it on your provider’s doorstep, you’ll probably also know what provider you signed up with.  But maybe you got shanghaied into signing a contract, I don’t know.

    provider

    But if your provider isn’t listed, we need your help. File a bug in Gnome Bugzilla, tell us your provider name, your country, the common name of your plan, and the APN you use.  We’ll update the provider database with that information, and thank you profusely for making life easier for everyone else too.  We could, in the future, allow users to automatically send their manually entered settings to a server somewhere, and make the provider database update process less manual.  Patches for that greatly appreciated.  Now you get to choose your plan:

    plan

    Again, if your plan (ie, APN) isn’t listed, file a bug.  But since you signed up, and they probably have some sort of FAQ for that sort of thing, you’ll probably be OK.  Lastly, we have:

    summary

    and if all looks well, you hit Apply, and NetworkManager will activate the connection you just set up.  You can also change the APN easily through the connection editor.  Much rejoicing was heard.

    Future Improvements

    There’s a few things we can do in the future…  If the connection fails for some reason, re-display the Assistant.  We can autodetect your provider based on the first 5 or 6 digits of your IMSI on your SIM, skip the Country page, and automatically select that provider in the Provider page, saving you a step or two.  Unfortunately, we can’t autodetect the plan/APN because that’s not stored anywhere (well it is usually preloaded into your phone, but all the APNs are, and there’s no indication of which one is the one you really want).  So there’s room to make it even more awesome.
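    The provider autodetection idea above can be sketched roughly like this. It is only an illustration, not NetworkManager code: the function name and the table entries are made up, and the real provider database has its own format. The sketch assumes the table is keyed by MCC+MNC strings and tries the 6-digit prefix (3-digit MNC) before the 5-digit one (2-digit MNC).

```python
# Hypothetical lookup table keyed by MCC+MNC. These entries are placeholders
# for illustration only, not real contents of the provider database.
PROVIDERS = {
    "00101": "Example Provider A",   # MCC 001 with a 2-digit MNC (01)
    "001002": "Example Provider B",  # MCC 001 with a 3-digit MNC (002)
}


def guess_provider(imsi, providers=PROVIDERS):
    """Return a provider name for the IMSI, or None if nothing matches."""
    if not imsi.isdigit() or len(imsi) < 6:
        return None
    # MNCs are 2 or 3 digits long, so try the longer prefix first
    # (assumption: a 6-digit match should win over a 5-digit one).
    for length in (6, 5):
        name = providers.get(imsi[:length])
        if name is not None:
            return name
    return None
```

    If nothing matches, the Assistant would simply fall back to asking for the country and provider as it does today.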

    Your help is required

    Again, the mobile broadband provider database is incomplete.  Help fix it up for your country and your provider by filing a bug with your information.  Include your provider name, your country, your plan’s marketing name if you know it, and of course the APN you’re using for data.  If you have a CDMA provider, just tell us your country and provider name.  Username and password are generally ignored by the network and the device, so they aren’t useful.  It’s your help that makes this effort work better for everyone.

    Moving house

    We're moving to a new house this coming Friday and Saturday. Having learned from our last move, we started packing a few weeks ago and have almost run out of things to pack. Also, the garage grew a mountain of moving boxes:

    moving boxes

    Open Video Conference an Amazing Step Forward

    The Open Video Conference just ended yesterday. I attended the first two days and just stopped in briefly during the hack-fests yesterday before having brunch with some old high school friends and heading back to my parents’ house where my dog and car were stashed.

    I can say without a doubt the turnout was amazing, and even though not everything I heard all weekend was positive, it was a giant leap forward in the understanding of the importance of Open Video and culture. I won’t put a figure on how many people attended, but some of the upstairs talks were standing room only, and after the first day some of the organizers were lamenting that they needed to get bigger rooms (consequently some of the talks were swapped the next day). Speaking of the organizers, they ran an incredibly smooth ship and should be thanked and praised for their efforts.

    The Good

    Apps

    I was mainly there looking to see what video producers wanted from FOSS application developers and to support the PiTiVi/GStreamer teams on behalf of the GNOME Foundation. It is amazing to see the PiTiVi non-linear video editing app in such a usable state. While Edward Hervey (bilboed on irc) gave his mini presentation on PiTiVi I was busy hacking up a “How To Make Chocolate Truffles” video from pictures and clips I had lying around.

    Afterwards I showed him some of the bugs I encountered in the 0.13.1 release and he just rattled off, fixed in git, fixed in git, fixed in git…etc. Sadly the releases are tied to GStreamer releases (which is a good thing from a development/bugs standpoint but not so good from a user standpoint given the early stages of PiTiVi) so we won’t see an official release soon. I plan on trying to automate a Fedora Repository at some point just to be able to view the progress without breaking my system.

    The point is PiTiVi is about 90% there (and perhaps 100% in git) to be able to support my needs for basic video editing in terms of stability and basic tools. That should be pretty reflective of those who need to do things like screen casting and interview style video blogs. Some advanced features like effects (look at Cheese for some examples of this already working in an app) already exist in GStreamer and just need to be integrated in PiTiVi’s UI and rendering pipeline.

    There was also a show of Cinelerra, but more interesting is the GTK+ fork Lumiera, which unfortunately is not usable yet, but the direction they are going in (GTK+ interface and some GStreamer integration) looks like a great re-start in the case of pro level editing tools.

    Also of interest in the pro level space was Blender, which seems to be the pro project with the most momentum and features for pros. At least that was the initial reaction from some on the Red Hat media team. The devs did admit that the functionality is limited to what they needed during production of Big Buck Bunny (and other productions currently in the queue) but that in those areas it is rock solid. It is interesting to see a UI designed with different usability profiles. For instance, one of Blender’s usability criteria is the avoidance of repetitive strain injury. To combat RSI, mouse clicks are evenly divided between the left and right mouse buttons.

    Bassam Kurdali, one of the Blender developers and animators, came up to me later in the conference and said he had noticed me using PiTiVi to edit my video. He was impressed at the simplicity and slickness of the interface and how far along it is. There is plenty of room for different approaches and a real potential for cross pollination between the pro tools and the every day end user tools.

    What Content Producers Want

    Speaking of end users, we got to hear from a bunch of them who let us know how we could support them. One of the biggest themes was that Windows tools suck and those who taught others couldn’t just tell them to go out and buy a Mac (praises were heaped on iMovie and Final Cut Pro). They really want an easy to use tool, with the unfortunate note that it would have to run on Windows. One really good thing is that a lot of the non-tech content producers understood the need for free codecs. However, in the end they just want a simple way to render down to DVD, YouTube, Daily Motion, iPhone, etc. and don’t want to deal with formats.

    I ended up collecting a bunch of business cards and am toying with the idea of starting a feedback group with content producers which would get them involved in improving GNOME App usability from the perspective of those who are not yet familiar with the GNOME workflow. If we are serious about expanding our reach we need to go beyond our current self selecting internal feedback loops. The goal wouldn’t be to get these people using GNOME (though giving them a way through the apps wouldn’t be a bad thing). It would be more about getting groups outside of GNOME/Linux to be part of the process of improving GNOME. Will it be fruitful? I don’t know, but it is an interesting experiment with a potentially huge payoff for a little bit of effort.

    Sita Sings the Blues

    This good section wouldn’t be complete without the mention of Sita Sings the Blues by Nina Paley which is a feature length (82 minutes) animated film released under the Creative Commons Attribution-Share Alike 3.0 United States License. You have complete rights to watch, screen, remix and redistribute the film as long as you abide by the license. I do suggest you watch it and if you like it buy the DVD or simply donate to encourage more works like this (I bought the DVD for $20). Not only is Nina a content producer but she is heavily involved in advocating her distribution methods, going as far as documenting the process that went into releasing Sita under a creative commons license and in her work with QuestionCopyright.org.

    Mozilla and the Open Video Contest

    I was very impressed with Mozilla’s involvement and their push for Ogg Theora to become a base line codec for the HTML 5 video tag. They are also helping launch the upcoming Open Video Contest which would see the winner flown to the 2010 South by Southwest conference. We should probably run some sort of sister contest to encourage GNOME users to submit entries.

    The Bad

    It wasn’t all roses. While I feel we are reaching independent content producers way more than I would have thought at this point, some of the big companies still don’t get it or are afraid of Open Video implications.

    Adobe

    It must be said that Adobe has been somewhat good at working with the community over long periods of time but that they just never get around to resolving key issues. What really surprised me was when, on one of the industry round tables, the Adobe representative pointed to their release of the Flash documentation as a shining example of this relationship. After checking with a developer of an alternative Flash implementation I was told those documents are pretty much useless. Due to bugs, some of the spec just doesn’t work as written, and other issues make it impossible to write a third party Flash player.

    YouTube/Google

    While reportedly Chrome will ship with Ogg Theora support, their flagship video site YouTube seems afraid to do so. Their rep at the round table stated some pretty audacious things, such as continuing the myth that Theora wasn’t good enough when clearly that argument was directly debunked (the side by side comparisons were even playing on HDTVs at the conference).

    Even more of an issue was the representative’s idea on what Open Video meant. He declared that they would love to support Open Video but that it meant letting anybody do whatever they wanted and that doesn’t work from a business perspective.

    Open Video isn’t about wild west, trample on rights. If anything it is about balancing the rights of content producers, end users and fair use. From what I read, YouTube’s position is that they are the 1000 pound gorilla in video distribution and at the end of the day they only believe in a user’s and content producer’s freedoms if it is walled behind their own servers. “All the world’s video” indeed.

    The solution there is to drive traffic to sites like Daily Motion and Blip.tv which understand the issues involved.

    Conclusion

    Nothing is perfect, but we are off to a really good start. In the end it is up to us to keep the momentum going and eventually produce a better experience within the complete Open Video stack, from content production to delivery. The web was built and exploded around the concept of open technology. Let’s continue to make sure this is the case going forward. The last thing we want is the web to become the domain of a few, with creativity being stifled by restrictions in the non-open parts of the stack.

    multi-process firefox

    Chris Jones has put together a demo video of a super-early stage Gecko-driven browser that’s multi-process.  It’s super-duper early and realizing that we’re only in Phase I of the roadmap is important but it’s great to see such speedy progress. Release early, release often!

    June 21, 2009

    pidgin-2.5.7 fixes Yahoo protocol

    pidgin-2.5.7 was released fixing the sudden Yahoo protocol change.  Builds for Fedora 9, 10, 11 and rawhide are available for testing here.  Please let me know if you see any regressions.  RHEL-4 and RHEL-5 builds are here.

    June 20, 2009

    Yet Another Kit

    A while back I was celebrating the arrival of secure realtime scheduling for the desktop. As it turns out, this was a bit premature, since (mis-)using cgroups for this turned out to be more problematic and messy than I anticipated.

    As a followup I'd now like to point you to this announcement I posted to LAD yesterday, introducing RealtimeKit which should fix the problem for good. It has now entered Rawhide becoming part of the default install (by means of being a dependency of PulseAudio), and I assume the other distros are going to adopt it pretty soon, too.

    Read the full announcement.

    iPhone

    Seriously there is Internet in my pants!

    Posted via LiveJournal.app.

    Photos

    I'm now a flickr..er? My account is "iamjessekeating".

    Posted via LiveJournal.app.

    June 19, 2009

    Fedora 11 released, onwards to F12.

    With the release of Fedora 11 recently, we are now back in the position of looking onwards to the next release. F11 released with a 2.6.29 kernel, and we’re already looking at doing a .30 rebase for it soon. (I was hesitant to type that, because the last time I blogged about doing a rebase, we hit some troubles and ended up skipping the 2.6.28 release for F10). While F11 was stabilising before release, devel/ had continued on, being rebased up to 2.6.30. Yesterday Kyle committed the first rebase to get us back up to Linus’ tree of the day.

    So we’re looking at 2.6.31 for F12. (With various conferences coming up over the next few months, it seems infeasible that .32 will land in time for F12’s release).

    Other planned changes? We’ve talked about dropping the exec shield patch that we’ve been carrying since Fedora Core 1. It’s a pain to have to keep carrying it, and rebasing it, and occasionally fixing it, just to add a poor emulation of a feature that has been in all CPUs for the last five years. The decision isn’t final yet, but it’s something that’s being considered.

    Some of the other patches we’ve been carrying (like modesetting drm) are now starting to find their way upstream too, which is obviously a good thing. The only really big thing we’re still carrying that struggles to get upstream is utrace.

    Post from: codemonkey.org.uk

    Home test, and tftp bits

    After some situations at work this week where I lost time where I really shouldn’t have had to, combined with the observation that I get more useful strategic work done at home in Belgium, and because practically speaking going to Barcelona next week would be silly given that I can only leave on Monday and Wednesday is a day off (which I loathe - Sant Joan, the most dangerous night in Barcelona), I decided to stay home next week and compensate by fixing my phone setup.

    You see, the only really annoying thing is that any conference call I end up in is terrible because I have a really hard time hearing the other side through either my mobile or my fixed phone, as the audio cuts out several times a second.

    So, I spent a few hours yesterday first setting up the VPN, which aside from some minor issues seems to be working fine now. This was apparently a prerequisite for setting up asterisk because asterisk needs a fixed IP address or something I’ve been told.

    After that, I started setting up Asterisk so that I could use the same THOMSON phone we have at work from home and call people in the office over it.

    All of that is not what this post is about though.

    This post is about the TFTP tricks and things I always need to re-learn any time I meddle with tftp. I’m putting them here because Google usually doesn’t find the problems and solutions I come up with, so maybe they’re of use to you if you play with TFTP. They will definitely be of use to me next time I mess with tftp.

    • TFTP runs over UDP on port 69.
    • On Linux, TFTP typically runs from xinetd. Do yourself a favour: edit /etc/xinetd.d/tftpboot and add -v -v -v to the server_args line. The resulting log lines should end up in your /var/log/messages.
    • For some reason xinetd is fidgety with tftp. It doesn’t restart in.tftpd properly when you reload or restart xinetd, and so your verbose changes might not take effect. Check with ps aux. You can kill it, but then xinetd doesn’t seem to start up in.tftpd properly for a while either. Strange stuff - please tell me if you know what’s going on here.
    • Keep a tcpdump running on your tftp server to see whether requests actually make it in: tcpdump | grep tftp
    • Start by trying a tftp transfer on the server to localhost: tftp localhost -c get test. Ideally, you should get Error code 1: File not found back immediately.
    • Now try an nmap from another machine: nmap -sU -p 69 server, which should come back with 69/udp open|filtered tftp. If it doesn’t, you probably didn’t open 69/UDP on your server’s firewall. You can confirm by just turning off your firewall on the server for a quick test.
    • If it shows as open|filtered, try tftp server -c get test. This should error out immediately as well. If it doesn’t, it’s probably because your test machine does not allow tftp in. Confirm simply by turning off your firewall. The simplest way to fix this is to load the tftp connection tracking module: modprobe nf_conntrack_tftp. This makes sure that your machine knows to accept the tftp reply coming in on a random port. On Fedora/RedHat systems you can make this permanent by adding it in /etc/sysconfig/iptables-config to the IPTABLES_MODULES variable. This is the number one thing I keep forgetting when debugging tftp troubles.
    • After that, try with actually existing files. Make sure you have the SELinux context correct; you can run restorecon -vR /tftpboot on the server for that. You can always confirm or deny whether SELinux is giving you trouble by temporarily turning it off. My auditd (the process that logs SELinux violations to /var/log/audit/audit.log) sometimes stops logging properly to the log file, and I need to restart it in that case. It’s easy to spot when auditd is misbehaving because by default it even logs replies to calls like setenforce 0.
    • Be careful with symlinks in /tftpboot if you use them. On your system they should actually be broken, because the tftp server will serve from /tftpboot and treat that as its root, as if it were chroot’d. So, if you have a file /tftpboot/phone/phone.inf, and you want a symlink to that file to exist and work in /tftpboot, you actually need to create a broken symlink like this: ln -sf /phone/phone.inf /tftpboot so that the symlink will work for in.tftpd. This is one of those steps that I completely forget every time too.
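    For what it’s worth, the "ask for a file that doesn’t exist and expect an immediate error" check from the list above can also be scripted. This is a minimal sketch, assuming the plain RFC 1350 packet layout (read request is opcode 1, error is opcode 5); the function names are my own and it is no replacement for the tftp client.

```python
import socket
import struct

OP_RRQ, OP_DATA, OP_ERROR = 1, 3, 5


def build_rrq(filename, mode="octet"):
    """Build a read request: opcode 1, filename, NUL, mode, NUL."""
    return (struct.pack("!H", OP_RRQ) + filename.encode("ascii") + b"\0"
            + mode.encode("ascii") + b"\0")


def parse_reply(packet):
    """Return ('error', code, message), ('data', block, payload) or ('other', opcode, rest)."""
    if len(packet) < 4:
        return ("other", None, packet)
    opcode, value = struct.unpack("!HH", packet[:4])
    if opcode == OP_ERROR:
        # Error packets carry a NUL-terminated message after the error code.
        message = packet[4:].split(b"\0", 1)[0].decode("ascii", "replace")
        return ("error", value, message)
    if opcode == OP_DATA:
        return ("data", value, packet[4:])
    return ("other", opcode, packet[2:])


def probe(server, filename="test", timeout=3.0):
    """Send one read request to UDP port 69; return the parsed reply or None on timeout."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_rrq(filename), (server, 69))
        try:
            # The reply comes from a random port on the server, which is
            # why a client firewall without tftp conntrack drops it.
            packet, _addr = sock.recvfrom(4 + 512)
        except socket.timeout:
            return None
    return parse_reply(packet)
```

    Calling probe("localhost") on the server should give back an error tuple with code 1 right away if the file is missing; a None (timeout) from another machine points at one of the firewall problems described above.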

    Well, that should be it for the next time I have tftp troubles!

    June 18, 2009

    still on vacation but...

    I still found the time today to make new chromium Fedora packages for interested testers:

    http://spot.fedorapeople.org/chromium/

    While you can certainly download the packages and install them manually, you can also set up a yum repo:

    Put this:

    ###

    [chromium]
    name=Chromium Test Packages
    baseurl=http://spot.fedorapeople.org/chromium/F$releasever/
    enabled=1
    gpgcheck=0

    ###

    in /etc/yum.repos.d/chromium.repo

    Then, just: yum install chromium

    (Plus, you'll just get automatic updates as I make newer packages. Theoretically, I might be able to try to do automatic daily builds, but the patches I'm applying seem to break every other day or so.)

    The packages are i386/i586 only (and the i586 chromium is a bit of a lie, it isn't compiled with the correct optflags yet) because chromium depends on v8, which doesn't work on 64bit anything (yet). Also, plugins don't work at the moment and some of the tab functionality doesn't work right, but as a general web browser, it seems functional enough. (And, it seems to pass the Acid3 test, which isn't surprising at all, since WebKit does and Chrome uses WebKit.)

    I'm still on vacation, so I'm not going to go into detail on how painful Chromium is to package or hack on, why I'm having to patch it up so much, or why I'm not sure it will ever make it into Fedora proper; those stories will have to wait for a later date, when I have more time. Now, I will return to eating too much unhealthy food and being as lazy as possible. :)

    A request for some simple testing

    Another thing that’s been on my list to look at, and that I’ve finally had time to sit down with this week, is the new isohybrid support in syslinux. This lets you take an ISO image, post-process it and then be able to either burn the ISO to a CD or write it to a USB stick with dd. Given that we stopped making a disk image form of boot.iso a couple of releases ago to save on duplicated/wasted space, this is obviously kind of cool.

    The problem was that the first time I tested it, it looked like it overwrote the checksums we use for the mediacheck functionality in anaconda. It turns out I just wasn’t thinking — we need to implant the checksum *after* we do the isohybrid modification.
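    The ordering matters enough that it’s worth writing down. Here is a rough sketch of the post-processing order, assuming the two steps are the isohybrid tool from syslinux and implantisomd5 from isomd5sum (the post above doesn’t name the checksum tool, so treat that part and the helper names as my assumption):

```python
import subprocess


def postprocess_commands(iso_path):
    """Return the post-processing commands in the order they must run."""
    return [
        ["isohybrid", iso_path],      # modifies the image first
        ["implantisomd5", iso_path],  # then implant the checksum for mediacheck
    ]


def postprocess(iso_path, run=subprocess.check_call):
    """Run each step in order; check_call raises if a step fails."""
    for command in postprocess_commands(iso_path):
        run(command)
```

    Running the steps the other way around is what made it look like the checksum had been overwritten.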

    So without further ado, I’ve built a test version of the Fedora 11 boot.iso that is usable in this form. Testing of it would be much appreciated!

    How to test

    1. Download the test image
    2. Try to burn it to a CD like you normally would. Ensure that it still boots normally. You don’t have to go through the full install, just boot it. Extra points if you can test mediacheck
    3. Find a USB stick that’s at least 256 megs that doesn’t have any data you care about on it. Now try to write the test image to it using dd (dd if=test-isohybrid-boot.iso of=/path/to/device bs=1M). Again, you don’t have to install, just boot into the installer. Note that we won’t automatically find the second stage and you’ll get asked where to find the installer images.
    4. Let me know the results in the comments (including type of machine).

    Assuming this works, I’ll get the changes in so that we do this by default with boot.iso and then probably also try to make it so that the loader can automatically find the second stage image on either the CD or the USB stick. I’ll also consider doing similar for the livecds, although there’s more value with liveusb-creator / livecd-iso-to-disk there as you also want to set up persistence in a lot of cases.

    June 17, 2009

    rpmfusion-{,non}free-remix-kickstarts -- or -- knurdix

    There was always the idea to build a linux distribution with RPM Fusion packages within the RPM Fusion project -- e.g. live media and installer spins ^w remixes that are basically a distribution like Fedora under a different name and with some packages from RPM Fusion. There is/was Omega 10, which until now never became the official spin for one reason or another. Recently some other work was done to get one step closer to creating an official "Fedora Remix" from RPM Fusion sooner or later:

    For a few weeks now, RPM Fusion has contained the packages rpmfusion-free-remix-kickstarts and rpmfusion-nonfree-remix-kickstarts in the repositories for Fedora 11 and Rawhide. Just like the Fedora package spin-kickstarts, they contain kickstart files that can be used to create your own linux distribution live-image using the Fedora Package Collection and Fedora's livecd-tools. An installer image is still on the todo list and might need some fixes in anaconda and the rpmfusion-{,non}free-release packages afaics.

    The kickstart files in the two RPM Fusion packages are pretty basic: They simply include the kickstart files from Fedora, remove the Fedora branding, add the generic branding and add the repo definitions for RPM Fusion. That's basically it, because the comps groups from RPM Fusion extend the groups that Fedora defines and thus you automatically get all the packages you want depending on what's defined in the RPM Fusion comps files -- that is for example gstreamer-plugins-ugly and gstreamer-ffmpeg in the case of the Gnome groups and xine-lib-extras-freeworld for the KDE remix.

    As a proof of concept I created four remixes (Desktop and KDE for i586 and x86_64) for testing purposes and uploaded them to the web for public testing. But warning: They are basically untested and just meant to show what's possible. Ohh, and due to the lack of a better name I just called them "knurdix" ;-) We'll need to find a different name for the official RPM Fusion images (which might be simply "RPM Fusion" or something else that sounds better).

    Hopefully those packages and the kickstart files in them can help to get some people interested in the idea of an RPM Fusion remix -- maybe enough people to build and maintain official "Fedora remixes" with packages from RPM Fusion in the long term. If you are interested in helping, just join the rpmfusion-developers list and share your ideas. Or get in contact with me directly, but the mailing list is the preferred way.

    Boot tales, woo ooh!

    (Take the title in the context of the theme from Duck Tales and maybe it makes sense?)

    There was a long and rambling discussion last week about the version of GRUB that’s shipped in Fedora and specifically the fact that the support for ext4 did not land in the version we shipped in Fedora 11. Now, as was said on the thread, this is because the patches weren’t reviewed and ready in time for beta (there are a couple of different ones… so which one is right?) and so we didn’t feel comfortable putting them in after beta, especially as with the way GRUB works, the same filesystem code gets used for ext2, ext3 and ext4 with the patches. A little unfortunate? Yes. Would it have been better if we had gotten them in so that you could do an install of Fedora 11 onto a single partition? Sure. But that’s one of the costs of a time-based release schedule.

    In any case, one of the things that came out of the thread was that I gave a history of the version of GRUB in Fedora. For posterity, I’ll repeat that here, with some edits.

    So, the gory history for those who might be interested. Eight years ago (!), we decided that the advantage of not having to rerun lilo after changing the config file, as you can just read the config file off the filesystem with grub, was worthwhile. We had, at that point, been patching lilo for quite a while to have a graphical menu. Therefore, keeping a graphical menu was a branding requirement. Conectiva at the time had a patch to grub that worked. We picked it up, shipped it, and it (mostly) worked. Efforts were made to integrate upstream, but they were largely uninterested. Along the way, significant changes to the graphics patch had to be made as grub evolved and a few other efforts were made to push it upstream. Eventually, the answer was “no, we’ll do something in the next big version of grub after grub 1.0”. Then the main developers went away and we were basically left maintaining a (large at this point) fork. As there is no upstream for grub 0.9x left, we’ve been left in a position of maintaining it and we’ve added some real features that have been needed along the way, as grub 2’s progress has been slow at best and we were initially unhappy with some of the direction taken.

    So, that’s where we are today. We essentially ship a fork of GRUB 0.9x with graphics support, support for a lilo -R type functionality (so you can reboot once into a single kernel), EFI, and some more little things that I’m not thinking of right now.

    With that in mind, I sat down and spent some time with a current snapshot of grub2. Overall, it’s made a lot of progress in the time since I last looked at it (a year ago? maybe a little more?). It was actually able to successfully boot for me in KVM and there’s equivalent graphics support to what we’re carrying in our grub 0.9x package. That said, there are still quite a few things to verify exist before we can switch. And just in my look, there are a number of small things that would need work, especially around the way the config file gets created and updated. And with the very short runway for Fedora 12, I don’t think there’s really time to get it into shape. But I do think that it makes sense to look at for Fedora 13. So I’ve started a feature page to track as some of the things get tested and worked on. Then hopefully we can make the switch pretty painlessly early in the Fedora 13 cycle.
