Sunday, November 20, 2011

Device left behind after umount in Nautilus

Whenever I mount an ISO via the loop device, an entry representing the virtual drive shows up in Nautilus under Ubuntu. It has been like this for ages, and it is the correct behavior. However, during the Lucid (10.04) cycle I noticed that the drive icon is not always removed after unmounting the ISO file.

This turned out to be a kernel issue: the loopback driver fails to emit the change uevent when auto-releasing the device. A patch eventually made its way into the -mm branch, but the behavior under the current Ubuntu release (11.10) is still the same.

My workaround is to emit a change event manually after umount:

udevadm trigger --action=change --subsystem-match=block --sysname-match="$DEVICE_NAME"

I created a script which sets $DEVICE_NAME based on the mount point. For my own convenience I packaged it into a deb file that can be installed under Ubuntu. Basically it's just two scripts, mount.loop and umount.loop; the device lookup they do is sketched below.
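The packaged helpers are plain mount scripts, but the device lookup itself is easy to sketch. Here it is in Java (the language used for the other code on this blog); LoopDeviceName and deviceNameFor are names I made up for illustration, not part of the package:

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

// Sketch only: find the device mounted at a given mount point by scanning
// /proc/mounts and stripping the /dev/ prefix, which yields the name that
// udevadm's --sysname-match expects (e.g. /dev/loop0 -> loop0).
// Note: mount points containing spaces are octal-escaped in /proc/mounts;
// that case is ignored here to keep the sketch short.
public class LoopDeviceName {

    static String deviceNameFor(String mountPoint) throws IOException {
        BufferedReader reader = new BufferedReader(new FileReader("/proc/mounts"));
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                // fields: device, mount point, fs type, options, dump, pass
                String[] fields = line.split("\\s+");
                if (fields.length > 1 && fields[1].equals(mountPoint)) {
                    return fields[0].replaceFirst("^/dev/", "");
                }
            }
        } finally {
            reader.close();
        }
        return null;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(deviceNameFor(args[0]));
    }
}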

The deb can be downloaded from my ppa for Lucid.

Update: The fix made it into Oneiric as an update. Starting with kernel version 3.0.0-16.28, the workaround is no longer needed.

Friday, October 21, 2011

Rsync transfer hangs

SSH works fine, but rsync or scp hangs after a while, making file transfers impossible. It turned out to be an MTU issue: lowering the MTU from the default 1500 to 1492 fixed the rsync/scp hangs.
When it happened to me again, having already taken the first bullet, I decided to figure out what was happening behind the scenes.

Every network link has a maximum packet size called the link's MTU (Maximum Transmission Unit). The full path from one computer to another may travel across many links with different MTUs. The smallest MTU for all the links in a path is the path MTU.

Packets should be as large as the path allows, in other words, fragmented as little as possible. I found an RFC from 1990 about Path MTU Discovery (RFC 1191) which describes how to make packets as large as possible.

We have to find out the smallest MTU along the links. The basic procedure is simple: send the largest packet you can, and if it won't fit through some link, a notification will come back saying what size will fit.

Here comes the catch: the notifications arrive as ICMP (Internet Control Message Protocol) packets known as "fragmentation needed" ICMPs.
Some network and system administrators view all ICMPs as risky and block them all, disabling path MTU discovery, usually without even realizing it. If the default MTU then happens to be larger than the path MTU, we are out of luck unless we change it manually.
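If you want to check which MTU your interfaces are currently using (for example to verify that lowering it to 1492 took effect), the standard library can tell you. A minimal sketch:

import java.net.NetworkInterface;
import java.net.SocketException;
import java.util.Collections;

// Prints the MTU configured on each network interface.
public class MtuCheck {
    public static void main(String[] args) throws SocketException {
        for (NetworkInterface nif : Collections.list(NetworkInterface.getNetworkInterfaces())) {
            System.out.println(nif.getName() + " MTU=" + nif.getMTU());
        }
    }
}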

Saturday, July 30, 2011

WDTV Live Subtitle Issues

I have two WDTV Live media players at home, both running the latest 1.05.04_V firmware. I've been testing them for months, keeping a list of the bugs I found and posting those that others had not yet reported on the official forum. Instead of WD fixing the ever-growing list of bugs, new features keep popping up. I am a simple guy who just wants to watch movies on the player; even on my laptop I hardly ever use Facebook or the other internet services that have recently been added to the player's feature list.
There's a thread about subtitle issues that Techflaws started last year. Having been bitten by those bugs, I decided to stop waiting for WD and, instead of whining on the forums, be more constructive.
Here are my findings about the vobsub-related issues.

Issue 1: Internal Vobsubs (idx/sub) are messed up; apparently some colors of the palette are ignored/replaced

I extracted subtitles from various DVDs, muxed them into mkv files and started analyzing their behavior on the media player. I made a small chart about
  • the 16-color palette
  • the 4 color indexes referring to a color in the palette
  • the alpha (opacity) values for each of those 4 colors
  • how the 4 colors of the image are used - which one is the background, the outline, the pattern and the anti-alias
I came to the conclusion that the media player ignores the opacity value of the last color and sets it to zero, causing that color to be transparent all the time. If that color is used
  • for the outline then subtitles show up without outline
  • for the pattern then you can see through the characters
  • for the anti-alias then you can see through the anti-alias part between the outline and the pattern
  • for the background then everything's fine since the background is already transparent
Though it seems to be traditional to use the first color (with index 0) as the always-transparent background, I did not find any information about this being required. All 4 colors are equivalent and might as well be reordered.
If I can swap a transparent color (which already has its alpha value set to zero) with the last one (where the media player enforces a hardcoded zero alpha value), then the subtitle will look good regardless of whether it is internal or external.

What has to be done (sketched below) is
  • find a color with a zero alpha value (there should be one: the background)
  • swap the index of that transparent color and the last color in the RLE-encoded subtitle stream
  • swap the palette references
  • swap the alpha values
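To make these steps concrete, here is a minimal sketch of the swap. It assumes the four image colors are given as a palette-reference array, an alpha array and an already decoded bitmap of palette indexes; the real BDSup2Sub code works on the RLE-encoded stream, and all names here are mine:

// Sketch of the WDTV workaround: make the color the player forces to be
// transparent (the last one) the color that is supposed to be transparent.
public final class WdtvPaletteWorkaround {

    public static void apply(int[] paletteRefs, int[] alphas, byte[] pixels) {
        final int last = 3;                 // index whose alpha the player hardcodes to zero
        int transparent = -1;
        for (int i = 0; i < alphas.length; i++) {
            if (alphas[i] == 0) {           // a color that is already fully transparent
                transparent = i;
                break;
            }
        }
        if (transparent < 0 || transparent == last) {
            return;                         // nothing to swap (or already in place)
        }
        swap(paletteRefs, transparent, last);          // swap the palette references
        swap(alphas, transparent, last);               // swap the alpha values
        for (int i = 0; i < pixels.length; i++) {      // swap the two indexes in the bitmap
            if (pixels[i] == transparent) {
                pixels[i] = (byte) last;
            } else if (pixels[i] == last) {
                pixels[i] = (byte) transparent;
            }
        }
    }

    private static void swap(int[] a, int i, int j) {
        int tmp = a[i];
        a[i] = a[j];
        a[j] = tmp;
    }
}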
I have modified the BDSup2Sub source to prove my point. If you load your sup or idx/sub file and set the palette mode to "keep existing" then on the Save/Export dialog you will find a checkbox with which you can activate the workaround I described above.
If you're one of those console guys I have some good news for you: I added a command line switch as well. The following command will use en.idx as input and save to the en_exp.idx file. It will keep the resolution and the palette and apply the WDTV workaround.

java -jar bdsup2sub-wdtv.jar en.idx en_exp.idx /res:keep /wdtv+ /palmode:keep

I updated the help page where you can find the description for all the other command line switches as well.


I sorted out the license issues with BDSup2Sub and it can be downloaded from here. Sources are available here.

Note: This is a modified version of BDSup2Sub 4.0.0, and since it's modified please do not contact the original author (0xdeadbeef) if you have issues with it. Only use this version if you need the WDTV Live workaround feature, since otherwise it is identical to the original 4.0.0 version.


Issue 2: dropouts - certain subtitles are not displayed.

I managed to find some subtitle streams that expose this issue and analyzed them. The subtitles that did not show up do not have an end time set; in other words, their duration is indefinite. In this case the player should show the subtitle until the next one comes along.
The workaround I could think of was to set an explicit end time for these sub frames. Luckily BDSup2Sub already does this just by loading and exporting the subtitle. In the log window you can see a warning about the missing subtitle end time: "WARNING: Invalid end sequence offset -> no end time"
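The idea behind the fix is simple. Here is a minimal sketch on a hypothetical frame structure (start and end times in milliseconds, -1 meaning "no end time set"); it is not the actual BDSup2Sub code:

import java.util.List;

// Hypothetical minimal frame representation, for illustration only.
final class SubFrame {
    long start;        // presentation start time in ms
    long end = -1;     // -1 means the end time is not set

    SubFrame(long start) {
        this.start = start;
    }
}

final class EndTimeFixer {

    // Give every open-ended frame an explicit end time: the start of the
    // next subtitle, or a default duration for the very last one.
    static void fixMissingEndTimes(List<SubFrame> frames, long defaultDurationMs) {
        for (int i = 0; i < frames.size(); i++) {
            SubFrame frame = frames.get(i);
            if (frame.end < 0) {
                frame.end = (i + 1 < frames.size())
                        ? frames.get(i + 1).start
                        : frame.start + defaultDurationMs;
            }
        }
    }
}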

The thread lists some other bugs as well, but those are either ones I can live with (e.g. default stream settings are ignored) or ones I can't do much about at this point (e.g. PGS (SUP) subtitles from Blu-rays are only supported in m2ts but not mkv containers).
The two workarounds above satisfy my needs for the moment. I decided to publish my findings hoping that others can also enjoy their hardware more, or that I can help WD find the root cause and fix the issues.
Feel free to leave a comment if you found my findings or the modified BDSup2Sub useful, and let's hope together that WD will eventually fix these issues.

Update 1: PGS (SUP) subtitles from Blu-rays are supported in the latest firmware (1.06.15_V) not only in m2ts but also in mkv containers. It works well if the mkv is made with MKVToolNix and does not work if it is made with MakeMKV. For now I suggest you reprocess your mkv with MKVToolNix, since that creates the most standards-compliant mkv structure.

Update 2: DHStraayer created a tool to help users who have WDTV Live to rip DVDs and Blu-Ray discs to mkv files, especially to deal with bugs in the WDTV Live firmware that interfere with subtitles.
If you have any questions about the tool please contact the original author.

Wednesday, July 27, 2011

Have disciplines

For two years I worked on a project called Resource Certification that is meant to make internet routing safer by certifying internet resources. If you remember the YouTube hijacking, you know what I'm talking about.
This project is based on Public Key Infrastructure (PKI) and therefore involves quite some cryptography. We had drafts and standards (actually we were, and still are, creating them while providing a reference implementation) and we followed them strictly. What happens under the hood is quite complicated, and it's hard to imagine how much logic is involved in a simple operation you perform on the UI. It's like an iceberg: the strength is beneath the surface and hard to examine with the naked eye. We developed it with Test-Driven Development, so everything was tested. We have a lot of tests. We were disciplined enough to resist the temptation of cutting corners. We were confident that if our continuous integration server showed green boxes, the system worked well. I can recall only one occasion when the system failed despite all tests passing. It was because of the way our Hardware Security Module (HSM) works and, of course, we don't use HSMs for unit testing. The bug was caught on the test machine (which does have an HSM) in the next test phase and never made it into production.
During those two years I got used to the confidence that if I saw green boxes, the system was in good shape. We deployed with confidence and our discipline never let us down. The system is really maintainable - we even got a 5-star certificate as the result of an external audit. The stars are assigned for high quality and low maintenance cost.
A few weeks ago I was assigned to another project. I am happy to change every now and then and I was eager to do something different for a while.
How different is my new project? Here it's much more transparent what's happening. There's a shiny User Interface, no cryptography. Just by looking at the UI you can guess what might happen under the hood and your guess will most likely be right.
What I started missing here on the first day was the confidence. I have no trust in the green boxes of this project. The (missing) tests let me down. There's some duplication, too. I change things and the system breaks.
Can this be fixed? Sure, we are on top of it. Our unit test coverage finally went above 80% and we removed a lot of duplicated code. We have automated integration (Selenium) tests for the UI. We are doing static code analysis. It's all getting back to where it should have been, and I am gaining more and more confidence. While I used to be afraid to rename even a simple field (for a reason), I can do bigger surgeries now. After a few weeks, the tests we created already give me confidence to change things.
But why were those tests not created in the first place? Why the duplication? And why was this tolerated?
In my opinion it can all be traced back to the origin of the project. The goal was not quite clear at the beginning, and neither was the way to achieve it. It all started with a spike. Then the result of the spike was not thrown away but turned into production code, though not production-quality code. There was a temptation to move on with what the team had at that point. It was working, after all, though not thoroughly tested. The technical debt kept accumulating and at some point it all started falling back on the team.
What's the moral of this story? No matter how simple the project or your class seems to be - test it. And I mean automated tests. Defend your processes and follow your disciplines at all times. If you work with TDD - keep doing it. Don't ignore tests or stop creating them. If you have a definition of done, check whether it's really done. Technical debt will bite you later.
And one more thing: don't cut corners even if pressure arises. Your disciplines should be followed even in the depths of a crisis. In the end it all pays off.

Sunday, July 24, 2011

The Clean Coder

Yesterday I finished reading The Clean Coder: A Code of Conduct for Professional Programmers by Robert C. Martin. Most of us in the industry know him as Uncle Bob and have read at least one of his books.
In this book Uncle Bob explains what it means to be a professional programmer. He shares his personal experience and since he has been programming for 40 years he has a very strong opinion about professionalism.
He makes very good points and the book itself is easy to read. Having read Clean Code before, I was expecting code snippets, but the book turned out to be more about our attitude towards programming. It was hard to put down, so I ended up reading it on the train each morning and afternoon.
This book came to me at just about the right time. As a junior developer I would never have understood its importance. Having been in the software industry for 9 years, I can now assess what I did right and what I could have done better. It's a good milestone where I can stop and say: I have come a long way, I am on the right track, but I know how far I can still go if I want to consider myself a software professional. And I am desperate to go down that road.

Thursday, June 30, 2011

Defensive copies

One of the building blocks of Domain Driven Design (DDD) is the Value Object: an object that contains attributes but has no conceptual identity. Value Objects should be treated as immutable.

Consider the following class:

public final class Period {
    private final Date start;
    private final Date end;

    public Period(Date start, Date end) {
        this.start = start;
        this.end = end;
    }

    public Date getStart() {
        return start;
    }

    public Date getEnd() {
        return end;
    }
}

This is a common example of a Value Object. What is wrong with this class?

Date start = new Date();
Date end = new Date();
Period p = new Period(start, end);
end.setYear(81); //modifies internals of p!

Just because the Period class has no mutator methods does not mean we cannot change its state. In the example above we kept a reference to a mutable object. If we want to prevent this from happening, we have to make a defensive copy of each mutable parameter to the constructor:


public Period(Date start, Date end) {
    // Date has no copy constructor, so copy via the epoch millisecond value
    this.start = new Date(start.getTime());
    this.end = new Date(end.getTime());
}

Are we there yet? Not really.

Date start = new Date();
Date end = new Date();
Period p = new Period(start, end);
p.getEnd().setYear(96); //modifies internals of p!

The common mistake made here was exposing a mutable field. By returning defensive copies of mutable internal fields, our Value Object is finally fixed:

public Date getStart() {
    return new Date(start.getTime());
}

public Date getEnd() {
    return new Date(end.getTime());
}

Defensive copying can have a performance penalty associated with it. In this particular case my best advice would be to consider Joda-Time's DateTime as an immutable replacement for Date.
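For illustration, this is roughly what the class looks like on top of Joda-Time (a sketch, assuming the Joda-Time library is on the classpath). Since DateTime is immutable, no defensive copies are needed:

import org.joda.time.DateTime;

public final class Period {
    private final DateTime start;
    private final DateTime end;

    public Period(DateTime start, DateTime end) {
        this.start = start;
        this.end = end;
    }

    public DateTime getStart() {
        return start;   // safe to return directly: DateTime cannot be mutated
    }

    public DateTime getEnd() {
        return end;
    }
}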

In general, where possible, you should use immutable objects as components of your own objects so that you don't have to worry about defensive copying. If you do use mutable objects, you should think twice before returning a reference to an internal component. Chances are, you should return a defensive copy instead. The same goes for a mutable object entering your internal data structure through the constructor.
If the class trusts its clients and the performance penalty for defensive copying would be too high, you may accept or return a reference to a mutable object, but be aware: subtle bugs can show up regardless of what you wrote in your class's documentation.

If there is any validation in your constructor it should happen after making the defensive copies in order to eliminate the window of vulnerability between the time the parameters are checked and the time they are copied.


public Period(Date start, Date end) {
    this.start = new Date(start.getTime());
    this.end = new Date(end.getTime());

    // run validation on this.start and this.end (the copies, not the parameters)
}

Thursday, June 23, 2011

Reification in Java

"I still get unchecked warnings!" - a sentence I hear from time to time when I am sitting close to people trying to use generics. Every unchecked warning represents the potential for a ClassCastException at run-time. Do your best to eliminate these warnings if possible - which is not always the case. The more experience you acquire with generics, the fewer warnings you’ll get, but don’t expect newly written code that uses generics to compile cleanly.
Why do we get these warnings? What's wrong with generics? In short the answer is: they are not reified. In Java a type is reifiable if the type is completely represented at runtime, that is, if erasure does not remove any useful information.

A type is not reifiable if it is one of the following:
  • A type variable (such as T)
  • A parameterized type with actual parameters (such as List<Number>, ArrayList<String>, or Map<String, Integer>)
  • A parameterized type with a bound (such as List<? extends Number> or Comparable<? super String>)
Generics are implemented using erasure, in which generic type parameters are simply erased by the compiler and not represented at run-time. That doesn't render generics useless, because you get type checking at compile time based on the generic type parameters, and also because the compiler inserts casts in the code (so that you don't have to) based on the type parameters.


        Map<String, String> countryCodes = new HashMap<String, String>();
        countryCodes.put("US", "United States");
        String country = countryCodes.get("US");

If you compile the source above and then decompile the generated class file you will see that they are not the same:


        Map countryCodes = new HashMap();
        countryCodes.put("US", "United States");
        String country = (String) countryCodes.get("US");

You should note two things in the code above:
  1. the type parameter "<String, String>" is missing
  2. a (String) cast has been added to countryCodes.get("US")
This is done by the Java compiler while compiling Java source code to bytecode. This process is called type erasure.
Generics are implemented using erasure as a response to the design requirement that they support migration compatibility: it should be possible to add generic type parameters to existing classes without breaking source or binary compatibility with existing clients.

While solving one set of problems, erasure adds a set of its own problems.

For a type parameter T, you can't (a common workaround is sketched after this list)
  • write a class literal T.class
  • use instanceof to test if an object is of type T
  • create an array of T
  • write class literals for generic types like List<String>.class
  • use instanceof to test if an object is a List<String>
  • create an array of List<String>
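For the T-related restrictions there is a well-known workaround: pass an explicit Class<T> token into the generic code and use reflection. A minimal sketch (the class and method names are mine):

import java.lang.reflect.Array;

public final class TypeToken<T> {
    private final Class<T> type;

    public TypeToken(Class<T> type) {
        this.type = type;
    }

    // instanceof T is illegal, but the token can check it at run-time
    public boolean isInstance(Object candidate) {
        return type.isInstance(candidate);
    }

    // new T[size] is illegal, but Array.newInstance with the token works
    @SuppressWarnings("unchecked")
    public T[] newArray(int size) {
        return (T[]) Array.newInstance(type, size);
    }
}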
Being unable to create an array of a parameterized type also means that you can get confusing warnings when using varargs methods in combination with generic types. This is because every time you invoke a varargs method, an array is created to hold the varargs parameters. If the element type of this array is not reifiable, you get a warning.


    public static <T> List<T> toList(T... item) {
        List<T> list = new ArrayList<T>();
        for (T element : item) list.add(element);
        return list;
    }

    public static <K> List<K> singleton(K element) {
        return toList(element); // unchecked warning
    }

There is little you can do about these warnings other than to suppress them, and to avoid mixing generics and varargs in your APIs.
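If you have convinced yourself that a particular call is safe - as it is here, since toList only reads the elements of the varargs array - one option is to suppress the warning on the smallest scope that works, for example:

    // The varargs array created for the toList call never escapes in an
    // unsafe way, so the warning can be suppressed for this method.
    @SuppressWarnings("unchecked")
    public static <K> List<K> singleton(K element) {
        return toList(element);
    }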