Fábio Santos's blog: December 2012

Wednesday, 19 December 2012

Bookmarklet compiler

I think I should bookmark this one.

It's a very useful compiler for creating bookmarklets.

I could roll my own, but it's good that there is one already available.

Here is the link. It has source on github.

Unfortunately, it has some problems. It does little or nothing to reduce your script in size, which is a major issue in IE browsers. It apparently should remove comments, but that does not seem to work.

I think I will check out its license, and fork it. I have created good comment-matching regexes for my interesting-c project.

Monday, 17 December 2012

Add christmas to any page

I have devised a simple manner to get christmas on any page! Isn't jQuery animation awesome?

You can look at it here: http://jsfiddle.net/fabiosantoscode/Jw7Hs/

Saturday, 15 December 2012

Python interpreter: Underscore

I noticed that I used this trick in a previous post, and I feel like I should post about it since it's so unspoken of, and useful at the same time.
When in a CPython interpreter session (your plain old python session), the value given to "_" is always the last value automatically printed by the interpreter as the result of an expression.

    >>> 1+1
    2
    >>> _
    2
    >>> x = 3+5
    >>> _
    2
    >>> print "another value"
    another value
    >>> _
    2

As you can see, "_" is always the value that the interpreter automatically prints. If you use variable assignment, you can prevent the shortcut from gaining a value when you want to preserve it for later.
This can be very useful when you don't want to type again the full expression you used to obtain a certain value when you want to perform operations on such value. For example, in a Django shell, performing regex searches, testing generators and lists( list(_) ), etc.
Of course you can always use the Up arrow on your keyboard to repeat the last line of code and then edit it, if your shell supports history. But sometimes you really can't, and it's more practical to do it this way.

    >>> [1,2,3] + [4,5,6]
    [1, 2, 3, 4, 5, 6]
    >>> len([1,2,3] + [4,5,6])
    6
    >>> [1,2,3] + [4,5,6]
    [1, 2, 3, 4, 5, 6]
    >>> len(_)
    6

But don't worry about using "_" as a variable name inside the interpreter shell. It will simply override the behaviour described above.

    >>> _=100
    >>> _
    100
    >>> 4
    4
    >>> _
    100
    >>>

Trivia: the "underscore shortcut" doesn't show up in locals() or dir()

    >>> locals()
    {'__builtins__': , '__name__': '__main__', '__d
    oc__': None, '__package__': None}
    >>> dir()
    ['__builtins__', '__doc__', '__name__', '__package__']
    >>>

Thursday, 13 December 2012

A useful Django Template tag hidden

https://docs.djangoproject.com/en/dev/ref/templates/builtins/?from=olddocs#pprint

It seems to be really useful for debugging and creating those initial quick and dirty templates.

I often want to stay off my template code while programming view code. When I create templates, I make them minimal and just want visual confirmation of success and not creating a full template.

Thursday, 6 December 2012

Wildcards in python (the fnmatch module)

Filtering and matching against wildcard patterns is really easy in python, using the fnmatch module. I found this out while looking in an article by Dan Carrol for a solution to a Django problem of mine.

By using the fnmatch function, one can match strings against a case-insensitive pattern. Use fnmatchcase for case-sensitive matching).

    >>> import fnmatch
    >>> fnmatch.fnmatch('example', 'exampl*')
    True
    >>> fnmatch.fnmatch('example', '*e')
    True
    >>> fnmatch.fnmatch('example', '*es')
    False
    >>> fnmatch.fnmatch('examples', '*es')
    True

There's also a filter function, to filter a list against a pattern.

    >>> files = ['file.py', 'file.txt']
    >>> fnmatch.filter(files, '*.py')
    ['file.py']

Besides *, ? and [] are also available. And that's it. It's a very simple syntax. A moderately powerful syntax which everybody can use.

    >>> fnmatch.fnmatch('a', '?')
    True
    >>> fnmatch.fnmatch('1', '[13579]')
    True
    >>> fnmatch.fnmatch('4', '[13579]')
    False
    >>>

The wildcard characters cannot be escaped with slashes. You can only escape with square brackets. For example:

    >>> fnmatch.fnmatch('Here is a star: *', '*\*')
    False
    >>> fnmatch.fnmatch('Here is a star: *', '* [*]')
    True

At first I thought this way of escaping was impractical, but because of that, you can use unescaped user input without the user ever getting unexpected results. And, if you want to get the original regex (for reusing later) you can always use translate:

    >>> as_regex = fnmatch.translate('m[ae]tch th?is!')
    >>> as_regex
    'm[ae]tch\ th.is\!\Z(?ms)'

Because this module uses regular expressions internally and allows to get the actual regular expression, I can use it as a less error-prone re in some scenarios.

I think this module is great. It gives me a bit less matching power than regular expressions, but then I can empower the user by asking them what and how they want to search for. You could arguably do this with regular expressions, but you would end up wasting time and money in documentation and customer support because regex is error-prone and dangerous.

Check out the docs for more information on this module.

Saturday, 1 December 2012

Phasing subtitles using python

I was trying to watch a film, but the subtitles I had were 1 second behind the actors' lines.
Not content with finding other subtitles on the web, I opened up the python interpreter and loaded the file into lines.
A little code followed

    import datetime
    import re

    lines = open('subs.srt', 'rb').read().splitlines()

    with open('out.srt', 'wb') as outp:
        for line in lines:
            if subtime.findall(line):
               time = datetime.datetime(1,1,1,*map(int, line[:8].split(':')))
               time += datetime.timedelta(seconds=1)
               outp.write('%02d:%02d:%02d%s' % (
                   time.hour, time.minute, time.second, line[8:]))
            else:
               outp.write(line + ' ')

It was just a few lines of code, showing off quite well a lot of the capabilities of python I love most. Text processing is always a cinch.

Explaining the code

The format of the subtitles was:

    [blank line]
    ID
    hh:mm:ss,ms: [text]

This explains why I had to check if the regex findall returned a match. The regex was ^\d\d:\d\d:\d\d.
When this regex found a line with subtitle time written on it, I did the reading, updating and writing the time. Otherwise, I just copied the line verbatim to the output file.
I simply cut the line using slice syntax. [:8] and [8:] got me the line's contents up to the seventh character, and from the eight character onwards, respectively.
I used the first seven characters of the line, split by the colon : character, as arguments to the datetime.datetime constructor, in true functional fashion. I had to map a call to int to turn all these number strings into integers.
To update the seconds correctly, I had to create an instance of datetime.timedelta with seconds set to 1 (which was my estimate of how off the time was), and add it the the time I got from the split string.
Having forgotten how to do date formatting, I just used string formatting against time.hour, time.minute and time.second, and joined in the rest [:8] of the string in the same operation.
It was quite fun, but my friends eventually grew impatient so in the end no film was watched.