Wednesday, 19 December 2012

Bookmarklet compiler

I think I should bookmark this one.

It's a very useful compiler for creating bookmarklets.

I could roll my own, but it's good that there is one already available.

Here is the link. It has source on github.

Unfortunately, it has some problems. It does little or nothing to reduce your script in size, which is a major issue in IE browsers. It apparently should remove comments, but that does not seem to work.

I think I will check out its license, and fork it. I have created good comment-matching regexes for my interesting-c project.

Monday, 17 December 2012

Add christmas to any page

I have devised a simple manner to get christmas on any page! Isn't jQuery animation awesome?

You can look at it here: http://jsfiddle.net/fabiosantoscode/Jw7Hs/

Saturday, 15 December 2012

Python interpreter: Underscore

I noticed that I used this trick in a previous post, and I feel I should write about it, since it's so rarely mentioned and yet so useful.
When in a CPython interpreter session (your plain old python session), the value given to "_" is always the last value automatically printed by the interpreter as the result of an expression.
    >>> 1+1
    2
    >>> _
    2
    >>> x = 3+5
    >>> _
    2
    >>> print "another value"
    another value
    >>> _
    2
As you can see, "_" is always the value that the interpreter automatically prints. If you use variable assignment, you can prevent the shortcut from gaining a value when you want to preserve it for later.
This can be very useful when you want to perform operations on a value without retyping the full expression that produced it. For example in a Django shell, when performing regex searches, or when testing generators and lists (list(_)).
Of course you can always use the Up arrow on your keyboard to repeat the last line of code and then edit it, if your shell supports history. But sometimes you really can't, and it's more practical to do it this way.
    >>> [1,2,3] + [4,5,6]
    [1, 2, 3, 4, 5, 6]
    >>> len([1,2,3] + [4,5,6])
    6
    >>> [1,2,3] + [4,5,6]
    [1, 2, 3, 4, 5, 6]
    >>> len(_)
    6
But don't worry about using "_" as a variable name inside the interpreter shell. It will simply override the behaviour described above.
    >>> _=100
    >>> _
    100
    >>> 4
    4
    >>> _
    100
    >>>
Trivia: the "underscore shortcut" doesn't show up in locals() or dir()
    >>> locals()
    {'__builtins__': <module '__builtin__' (built-in)>, '__name__': '__main__', '__doc__': None, '__package__': None}
    >>> dir()
    ['__builtins__', '__doc__', '__name__', '__package__']
    >>>
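
The reason it doesn't show up is that the interpreter's display hook saves the value on the builtins module, not in your session's namespace. Here is a minimal sketch of that, assuming Python 3 (where the module is called builtins; in the Python 2 sessions above it is __builtin__). It calls sys.__displayhook__ by hand to imitate what the prompt does, so it also works outside a live session:

```python
import builtins
import sys

# The interactive prompt hands every expression result to sys.displayhook.
# Calling the original hook by hand imitates that outside a live session.
sys.__displayhook__(1 + 1)   # prints the repr, like the prompt would

# The hook saved the value as an attribute of the builtins module.
print(builtins._)

# Check whether our own global namespace has the name at all.
print('_' in globals())
```

Assigning to "_" yourself creates an ordinary global that shadows the builtin one, which is why the shortcut stops updating after _=100.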

Thursday, 13 December 2012

A useful Django Template tag hidden

https://docs.djangoproject.com/en/dev/ref/templates/builtins/?from=olddocs#pprint

It seems to be really useful for debugging and creating those initial quick and dirty templates.

I often want to stay off my template code while programming view code. When I create templates, I make them minimal and just want visual confirmation of success and not creating a full template.

Thursday, 6 December 2012

Wildcards in python (the fnmatch module)

Filtering and matching against wildcard patterns is really easy in python, using the fnmatch module. I found this out while looking in an article by Dan Carrol for a solution to a Django problem of mine.

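By using the fnmatch function, one can match strings against a wildcard pattern. Whether the match is case-sensitive depends on the operating system, because both arguments are first normalized with os.path.normcase (so it is case-insensitive on Windows, for instance). Use fnmatchcase when you always want case-sensitive matching.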

    >>> import fnmatch
    >>> fnmatch.fnmatch('example', 'exampl*')
    True
    >>> fnmatch.fnmatch('example', '*e')
    True
    >>> fnmatch.fnmatch('example', '*es')
    False
    >>> fnmatch.fnmatch('examples', '*es')
    True

There's also a filter function, to filter a list against a pattern.

    >>> files = ['file.py', 'file.txt']
    >>> fnmatch.filter(files, '*.py')
    ['file.py']

Besides *, ? and [] are also available. And that's it. It's a very simple syntax. A moderately powerful syntax which everybody can use.

    >>> fnmatch.fnmatch('a', '?')
    True
    >>> fnmatch.fnmatch('1', '[13579]')
    True
    >>> fnmatch.fnmatch('4', '[13579]')
    False
    >>>

The wildcard characters cannot be escaped with backslashes. You can only escape them by wrapping them in square brackets. For example:

    >>> fnmatch.fnmatch('Here is a star: *', '*\*')
    False
    >>> fnmatch.fnmatch('Here is a star: *', '* [*]')
    True

At first I thought this way of escaping was impractical, but because of it you can use unescaped user input without the user ever getting unexpected results. And if you want the equivalent regex (for reusing later), you can always use translate:

    >>> as_regex = fnmatch.translate('m[ae]tch th?is!')
    >>> as_regex
    'm[ae]tch\ th.is\!\Z(?ms)'

Because this module uses regular expressions internally and lets you get at the actual regular expression, I can use it as a less error-prone re in some scenarios.
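
Here is a small sketch of that idea: translate the wildcard pattern once, compile it with re, and reuse the compiled object. The helper name compile_wildcard is made up for this example; fnmatch.translate and re.compile are the only library calls involved.

```python
import fnmatch
import re


def compile_wildcard(pattern):
    """Turn a wildcard pattern into a reusable compiled regex.

    compile_wildcard is a hypothetical helper, not part of fnmatch.
    """
    return re.compile(fnmatch.translate(pattern))


matcher = compile_wildcard('m[ae]tch th?is!')

# The translated regex is anchored at the end, and match() anchors the
# start, so the whole string has to fit the pattern.
print(bool(matcher.match('match thxis!')))
print(bool(matcher.match('mitch thxis!')))
```

Unlike fnmatch.fnmatch, the compiled regex does no case normalization, so it behaves like fnmatchcase.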

I think this module is great. It gives me a bit less matching power than regular expressions, but then I can empower the user by asking them what and how they want to search for. You could arguably do this with regular expressions, but you would end up wasting time and money in documentation and customer support because regex is error-prone and dangerous.

Check out the docs for more information on this module.

Saturday, 1 December 2012

Phasing subtitles using python

I was trying to watch a film, but the subtitles I had were 1 second behind the actors' lines.
Not content with finding other subtitles on the web, I opened up the python interpreter and loaded the file into lines.
A little code followed:

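    import datetime
    import re

    # Matches lines that start with a subtitle time (hh:mm:ss).
    subtime = re.compile(r'^\d\d:\d\d:\d\d')

    lines = open('subs.srt', 'rb').read().splitlines()

    with open('out.srt', 'wb') as outp:
        for line in lines:
            if subtime.findall(line):
                # Parse hh:mm:ss, shift it by one second, write it back.
                time = datetime.datetime(1, 1, 1, *map(int, line[:8].split(':')))
                time += datetime.timedelta(seconds=1)
                outp.write('%02d:%02d:%02d%s\n' % (
                    time.hour, time.minute, time.second, line[8:]))
            else:
                # splitlines() dropped the line endings, so put one back.
                outp.write(line + '\n')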

It was just a few lines of code, showing off quite well a lot of the capabilities of python I love most. Text processing is always a cinch.

Explaining the code

The format of the subtitles was:
    [blank line]
    ID
    hh:mm:ss,ms: [text]
This explains why I had to check if the regex findall returned a match. The regex was ^\d\d:\d\d:\d\d.
When this regex found a line with subtitle time written on it, I did the reading, updating and writing the time. Otherwise, I just copied the line verbatim to the output file.
I simply cut the line using slice syntax. [:8] and [8:] got me the line's first eight characters (hh:mm:ss), and everything from the ninth character onwards, respectively.
I used the first eight characters of the line, split by the colon : character, as arguments to the datetime.datetime constructor, in true functional fashion. I had to map a call to int to turn all these number strings into integers.
To update the seconds correctly, I had to create an instance of datetime.timedelta with seconds set to 1 (which was my estimate of how off the time was), and add it to the time I got from the split string.
Having forgotten how to do date formatting, I just used string formatting against time.hour, time.minute and time.second, and joined in the rest of the string (line[8:]) in the same operation.
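
The time arithmetic on its own can be sketched as a small function. The name shift_timestamp and its seconds parameter are made up for this example; the technique is the same datetime plus timedelta trick as in the script.

```python
import datetime


def shift_timestamp(stamp, seconds=1):
    """Shift an 'hh:mm:ss' string by a number of seconds.

    Hypothetical helper for illustration; it wraps the same
    datetime + timedelta arithmetic used in the script.
    """
    hours, minutes, secs = map(int, stamp.split(':'))
    # Year, month and day are dummies; only the time of day matters.
    moment = datetime.datetime(1, 1, 1, hours, minutes, secs)
    moment += datetime.timedelta(seconds=seconds)
    return '%02d:%02d:%02d' % (moment.hour, moment.minute, moment.second)


print(shift_timestamp('00:01:59'))     # the carry into minutes is handled
print(shift_timestamp('00:59:59', 2))  # and so is the carry into hours
```

Letting datetime do the carrying is the whole point: adding 1 to the seconds field by hand would turn 00:01:59 into 00:01:60.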
It was quite fun, but my friends eventually grew impatient so in the end no film was watched.

Thursday, 29 November 2012

Getting your own (IP) address and port in a javascript web app

While hacking something up using socket.io I needed to know the IP of the server, so I could connect a socket back to it. localhost was the solution at first, but I wanted to access it from other machines in my network.

I didn't want to get the local IP using node. I might want to serve the application from a server with more than one network card. I needed to have the IP fixed.

But I often develop the app while commuting, on my EEEPC. When I get home, I sync my code as I turn on my desktop computer to continue working.

So I found myself needing to have a fixed IP over two different machines: my desktop PC, and my EEEPC. A fixed IP was clearly not an option.

I had the idea of using window.location. I used window.location.hostname to get the IP address or domain name of any server I was connected to. It was such a simple solution I was very positively surprised.

The parts of window.location:

example url: http://192.168.1.7:8080/chat?nickname=F%C3%A1bio#fragment

  • hash (#fragment)
  • host (192.168.1.7:8080)
  • hostname (192.168.1.7)
  • href (http://192.168.1.7:8080/chat?nickname=F%C3%A1bio#fragment)
  • pathname (/chat)
  • port (8080)
  • protocol (http:)
  • search (?nickname=F%C3%A1bio)

Wednesday, 28 November 2012

Common Regex

I have been busy lately, hacking up stuff with socket.io

I've created a table for regular expressions to do common validations or searches.

Anyway, here is the link, I hope you enjoy it! If you think that an important regex is missing, drop me a line.

Tuesday, 13 November 2012

Dividing by zero for fun and profit

It's monday, and I'm back from vacation, feeling like I'm wasting my time and inevitably falling victim to the great cycle of life, money and everything. while (42) { }. Fortunately, I have sweet, sweet sarcasm on my side.

Anyway, this post is supposed to be about dividing by zero.

In python, it's a great way to find out if a certain code path deep inside your call stack is really getting called, and when. You get to write one line which results in a noisy exception, so your pain and confusion are properly turned into a Traceback.

    class ReturnStatement(Statement):
        def __init__(self, returnee):
            1/0
            super(ReturnStatement, self).__init__(returnee, '<return>')

"Oh. This time the exception never fired. I was sure this was supposed to be executed."

That's the kind of thought you are supposed to get, or something along the lines of:

"There, the exception.. Then why is this @!$# method not working if it's being called?"

Anyway, you get a good troubleshooting test just for typing three characters and a new line. Good bargain!

Of course, in JavaScript it's useless. 1/0 doesn't throw anything.

It yields Infinity. That's a discussion for another point in time. Maybe. I may never be inclined again to speak of that matter. Hours and hours of agony because of a number having a completely unpredictable value. Ugh.

In compiled languages, it's mostly useless, too.

Mondays.

Wednesday, 7 November 2012

Valilang

I have started a new project. Its name is valilang.

Create validation rules in a single format, meant to be used in both client and server sides.

Although I have given up my first plan of making valilang a minimal imperative programming language, the name still has the suffix "lang". I have opted to base the language syntax upon JSON, so it will be easier to learn and use.

The format is very easy to write. There are :

  • fields, which correspond to form fields, and
  • rules, which are (mostly premade) short functions taking a value argument and doing an assertion upon that value.

A valilang file will have an object with these keys:

  • fields, a list of fields.
  • fieldValidation, an object mapping field names to a list of rules applied sequentially to these fields.

Here is an example valilang file, for a form with a single field:

    {
        "fields": ["name"],
        "fieldValidation": {
            "name": [
                "required",
                "min-10",
                "max-50"
            ]
        }
    }

In the above object, we can see we have a single field name which has the following rules:

  • It's a required field
  • It takes at least ten characters (min-10)
  • and at most 50 characters (max-50).

These rules (which are actually functions) are executed sequentially, until any function returns null, in which case the validation fails.

Notice how arguments are handled. They are extracted from after the dash in each of the rule strings, provided these strings have a dash.
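
To make the argument handling concrete, here is a rough sketch in Python. It is not valilang's actual code: parse_rule, RULES and validate are names invented for this example, and the rules return plain booleans instead of null on failure.

```python
def parse_rule(rule):
    """Split 'min-10' into ('min', '10'); 'required' gives ('required', None)."""
    name, dash, argument = rule.partition('-')
    return name, (argument if dash else None)


# Hypothetical rule table: each rule takes the value and its argument.
RULES = {
    'required': lambda value, arg: bool(value),
    'min': lambda value, arg: len(value) >= int(arg),
    'max': lambda value, arg: len(value) <= int(arg),
}


def validate(value, rule_strings):
    """Run the rules in order and stop at the first one that fails."""
    for rule in rule_strings:
        name, argument = parse_rule(rule)
        if not RULES[name](value, argument):
            return False
    return True


name_rules = ['required', 'min-10', 'max-50']
print(validate('Guido van Rossum', name_rules))
print(validate('Guido', name_rules))
```

Splitting only on the first dash means a rule's argument could itself contain dashes, which is an assumption of this sketch and not something the format promises.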

On the client side, you just have to include a valilang file and valilang.js.

    <script type="text/x-valilang">
        {
            "fields": ["name"],
            "fieldValidation": {
                "name": [
                    "required",
                    "min-10",
                    "max-50"
                ]
            }
        }
    </script>
    <script type="text/javascript" src="valilang.js"></script>

On the server side, you will load the valilang library for your framework or language, and ask it to validate your fields.

Of course, this all applies once both the client side and the server side are implemented. Remote loading of scripts is not yet supported. The server side hasn't been prepared for any language except javascript (by requiring valilang.js in node and using its API), and there are too few validators (validation functions). Also, there are no unit tests yet for the client or the server.

However, valilang.js is definitely compact, under 5 kb minified, and it is in a nearly usable state. Care to try it out and maybe contribute to its development? Report bugs and fork the github repository.

AAAAAAAAAA

A AAAA AAAAAA AA AAAA AAAA AAA AAAAAAAAAAAAAA.

    >>> def AAAAAAAAAA(s):
    ...     import re
    ...     return re.sub('\w', 'A', s)
    ...
    >>> AAAAAAAAAA('I want to know what Guido eats for breakfast.')
    'A AAAA AA AAAA AAAA AAAAA AAAA AAA AAAAAAAAA.'
    >>> AAAAAAAAAA("I have found a new way to encode messages. But it's irreversible.")
    "A AAAA AAAAA A AAA AAA AA AAAAAA AAAAAAAA. AAA AA'A AAAAAAAAAAAA."
    >>> AAAAAAAAAA("You have a way with words.")
    'AAA AAAA A AAA AAAA AAAAA.'
    >>>

AAAAA AA AAA AA AAAAAAAAAAAAAA AAAA.

And a thousand pointless points are given to whoever decodes this post.

Tuesday, 6 November 2012

Periodic table of HTML5 elements

A very interesting and helpful link from Josh Duck:

http://joshduck.com/periodic-table.html

It contains all HTML5 elements, formatted in a table that looks much like the Periodic Table of the Elements. It is very nice and informative to look at all of them, grouped into categories.

Find out a couple of new elements, and use it as reference, since every element in the table has links to helpful documentation on MDN and the formal declaration in W3C when you click them.

The little form on top lets you count elements used in any published website. Detect divitis!

Monday, 5 November 2012

Practical example of str.split's optional argument. Config file parser

Taking my little fun discovery for a ride. Used it to create a simple config file reader together with itertools.

    >>> lines = iter(open('config.cfg'))
    >>> import itertools
    >>> splitonce = lambda s:s.split('=',1)
    >>> trim = lambda s:s.rstrip('\n\r')
    >>> trimmed_lines_generator = itertools.imap(trim, lines)
    >>> split_lines_generator = itertools.imap(splitonce, trimmed_lines_generator)
    >>> settings = dict(split_lines_generator)
    >>> settings
    {'ConfigSetting3': '  Allow leading+trailing spaces!   ', 'eqsign': '=', 'ConfigItem2': 'Settle for no "=" signs!', 'ConfigItem1': 'Configured'}
    >>>

This of course wouldn't work if I hadn't applied the optional argument to str.split, which I talked about in my previous post. It helps me split the file lines into keys and values by the equal sign, and not worry that the values might include equal signs of their own.

I am getting ahead of myself. In case you are not familiar with CFG files, here is the file I used in the above example. The settings file config.cfg:

    ConfigItem1=Configured
    ConfigItem2=Settle for no "=" signs!
    eqsign==
    ConfigSetting3=  Allow leading+trailing spaces!

Stripped out of the interpreter into cleaner code:

    from itertools import imap

    with open('config.cfg') as fp:
        lines = iter(fp)
        def splitonce(s):
            return s.split('=', 1) # split limit
        def trim(s):
            return s.rstrip('\n\r')

        trimmed_lines_generator = imap(trim, lines)
        split_lines_generator = imap(splitonce,
            trimmed_lines_generator)

        settings = dict(split_lines_generator)

I have used itertools.imap. Using iterators and chaining them together will make larger config files use less memory than if I used list(fp) to get a list of lines. This of course is more of a concern in larger config files.
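
For reference, here is how the same lazy pipeline might look on Python 3, where itertools.imap no longer exists and generator expressions do the same job. The read_settings name and the in-memory CONFIG_TEXT are made up so the sketch runs without a file on disk.

```python
import io

# Stand-in for config.cfg so the example is self-contained.
CONFIG_TEXT = (
    'ConfigItem1=Configured\n'
    'ConfigItem2=Settle for no "=" signs!\n'
    'eqsign==\n'
)


def read_settings(lines):
    """Build a dict from 'key=value' lines, one line at a time."""
    trimmed = (line.rstrip('\n\r') for line in lines)
    # maxsplit=1 keeps any further '=' characters inside the value.
    pairs = (line.split('=', 1) for line in trimmed)
    return dict(pairs)


settings = read_settings(io.StringIO(CONFIG_TEXT))
print(settings['eqsign'])
```

Any iterable of lines works here, so an open file object can be passed in directly instead of the StringIO.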

Saturday, 3 November 2012

Timetable selection snippet.

I have put together a simple timetable snippet for helping keep track of time, or to do time allocation of some resource.

Check it out on jsfiddle:

This is a jQuery plugin. It works by converting a table with checkboxes inside each td into a time planner. It does this by moving the checkboxes away from their td containers into a div inserted below the table.

Here is that DIV's HTML. As you can see it has all the checkboxes from every cell.

    <div style="display: none;" class="timeplanner-checkbox-dump">
        <input type="checkbox" checked="checked" name="0-0">
        <input type="checkbox" name="1-0">
        <input type="checkbox" name="2-0">
        <input type="checkbox" che...

Then it routes the events of each td to the checkbox it used to contain. This is done by using an anonymous function for each td. In a jQuery.each loop I declare var checkbox_here (so it is different for every closure) and then just do $(checkbox_here).click() in the click event handler for the td.

Python's string formatting syntax

As you are surely aware, you can use python's % operator to do string formatting. It's one of the reasons why the language is so good at text processing.

The left operand is the format string, the right operand is a single format argument, or a tuple of format arguments. In the format string, you can use %s to insert a string, %d to insert an integer, %f for a float, etc. It's much like C's printf family of functions.

    >>> '%s %d' % ('a string', 123)
    'a string 123'

This you almost surely know. You might not know that the string can take a %r argument which calls repr on the arguments. Now that's useful!

    >>> 'a repr: %r' % ['', ""]
    "a repr: ['', '']"
    >>> 

Or that the format parameters can be in a dictionary.

    >>> '%(a_string)s %(an_int)d' % {'a_string':'a string', 'an_int': 123}
    'a string 123'
    >>> 

If the parameters are passed as a dictionary, extra keys that the format string doesn't use will not raise exceptions. So you can pass locals() to your (trusted) format string, and format it easily. Inside a method, self.__dict__ would also be useful.

    >>> a_number = 3
    >>> '%(a_number)d' % locals()
    '3'
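
As a sketch of the self.__dict__ idea, here is a made-up Point class whose __str__ formats itself from its own attributes. vars(self) returns the same dictionary as self.__dict__.

```python
class Point(object):
    """Hypothetical example class; only the formatting trick matters."""

    def __init__(self, x, y, label='unused'):
        self.x = x
        self.y = y
        self.label = label  # extra key, ignored by the format string

    def __str__(self):
        # vars(self) is self.__dict__: attribute names become format keys.
        return 'Point(%(x)d, %(y)d)' % vars(self)


print(str(Point(1, 2)))
```

The label attribute never appears in the format string, and that is fine: unused dictionary keys are simply ignored.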

String formatting parameters will also take their own parameters. The parameters are decimal characters (and an optional separation dot) between the % sign and the character indicating the type.

Here is an ugly syntax representation. Bear with me.

% [0|-][leading/trailing character count][An optional dot][Truncation](Type character)

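That's all, I think. Be careful to add no spaces. In this simplified picture, the only characters between % and the type indicator are [0-9\.\-] (the full syntax has a few more flags, such as + and #, which I won't cover here).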

  • First, the usual percentage sign.

  • Then, add a single zero if you want to fill the string with zeroes (you specify how many characters you would like to fill the string with in the next argument). If you want the string to have trailing spaces instead of leading spaces, add a minus sign.

  • After that, add the number of characters you want to fill the string with. You can skip this.

  • If you want to use the truncation argument (explained below), add a dot here.

  • Add the truncation argument, which is an integer.

For obvious reasons, you can't have trailing zeroes in anything.

The "Truncation" argument is used to cut a string down to its first characters (that many of them), to grow (only grow) an integer string by zero-filling the left to that size, or to use that many decimal digits in a float.

Here are some examples with %s. %s doesn't let you put leading zeroes on your string. You can use str.zfill() for that.

    >>> s = '1234'
    >>> '%.1s' % s
    '1'
    >>> '%.4s' % s
    '1234'
    >>> '%.6s' % s
    '1234'
    >>> '%10s' % s
    '      1234'
    >>> '%10.2s' % s
    '        12'
    >>> '%10.4s' % s
    '      1234'
    >>> '%10.6s' % s
    '      1234'
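
The minus sign mentioned in the list above has no example yet, so here is a short sketch of it with %s. The trailing | is only there to make the padding visible.

```python
s = '1234'

# Minus flag: pad on the right instead of the left.
left = '%-10s|' % s

# Minus flag plus truncation: cut to two characters, then pad on the right.
left_cut = '%-10.2s|' % s

print(left)
print(left_cut)
```

Both results are eleven characters long: ten for the padded field and one for the marker.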

%f examples:

    >>> f = 10.123
    >>> '%f' % f
    '10.123000'
    >>> '%.4f' % f
    '10.1230'
    >>> '%4f' % f
    '10.123000'
    >>> '%20f' % f
    '           10.123000'
    >>> '%20.2f' % f
    '               10.12'
    >>> '%020.2f' % f
    '00000000000000010.12'
    >>> 

%d can also be told to have leading zeroes or leading spaces. As mentioned above the "Truncation" part of the parameter can only make it grow. It wouldn't make sense to let it shrink, since that would shrink the number's value.

    >>> i = 10
    >>> '%d' % i
    '10'
    >>> '%.1d' % i
    '10'
    >>> '%.4d' % i
    '0010'
    >>> '%4d' % i
    '  10'
    >>> '%4.1d' % i
    '  10'
    >>> '%4.3d' % i
    ' 010'
    >>> '%4d' % i
    '  10'
    >>> '%04d' % i
    '0010'
    >>> 

As you know, in python, characters are indeed strings with len() of 1, so if you want to represent a character you just use %s. But you can also use %c. %c will raise an exception if the input string is not a character, and the extra assertion might prove a little useful. You can add leading and trailing spaces to this parameter too.
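
A quick sketch of %c in action, including the exception. The variable names are only for this example.

```python
single = '%c' % 'a'            # a one-character string passes through
from_code = '%c' % 97          # an integer is treated as a character code
padded_left = '%5c' % 'a'      # leading spaces, total width 5
padded_right = '%-5c|' % 'a'   # trailing spaces; | marks the end

try:
    '%c' % 'ab'                # more than one character is rejected
    rejected = False
except TypeError:
    rejected = True

print(single, from_code, repr(padded_left), repr(padded_right), rejected)
```

So %c behaves like %s with a built-in length check, which is the "extra assertion" mentioned above.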

Finally, it's worth noting that python has a newer formatting syntax (str.format). I haven't seen it used much, and don't use it either. I don't find it as practical as the one I described.

More information on string formatting can be found in the python documentation.

This has hopefully been a thorough dissection of python's string formatting syntax. Now go out and enjoy the sun!

Monday, 15 October 2012

Python Discovery: Named Tuples

I had known of the named tuple idiom for ages. It is used sparingly in some python APIs, and I feel it is very pythonic, practical, and simple to read and code with.

It's used like this (this example is for the timetuple in datetime.datetime, which is a namedtuple, or looks a lot like one):

>>> tt = datetime.datetime.now().timetuple()
>>> tt
time.struct_time(tm_year=2012, tm_mon=10, tm_mday=15, tm_hour=20, ...)
>>> tt.tm_year
2012
>>> tt[0]
2012
>>> tt[3]
20
>>> tt.tm_mon
10
>>> 

And it's iterable.

>>> list(tt)
[2012, 10, 15, 20, 24, 8, 0, 289, -1]

It's basically a tuple which has attributes accessible through dot notation.

The named tuple idiom is perfect for API design, and for those times where a class is what you want, but a tuple would be okay too. For example, 2D dimensions. You could store them in a class with height, width attrs, or in a tuple.

I thought it was just some pythonic pattern, but today I needed to automate it because I found myself making dumb __init__ methods like these:

def __init__(self, left, right):
    self.left = left
    self.right = right

It is a useful pattern. It's pythonic. It's implementable in pure python with just a little metaprogramming. It could be in the standard library, I thought.

I opened up my python interpreter and entered import collections and dir(collections) looking for NamedTuple or something. namedtuple was there. It wasn't just a coincidence.

collections.namedtuple is a factory for creating namedtuple types, which you can then use. The resulting classes include methods like _asdict, index and _replace, plus a _fields attribute listing the field names.

>>> a = collections.namedtuple('a', 'b, c')
>>> a(1,2)._asdict()
OrderedDict([('b', 1), ('c', 2)])
>>> a(1,2)._fields
('b', 'c')
>>> a(1,2).index(1)
0
>>> a(1,2).index(2)
1
>>> a(1,2)._replace(b=3)
a(b=3, c=2)
>>> 

It is hashable, unlike a dict, and pickleable.

Here is how you create one:

>>> collections.namedtuple('SomeNamedTuple', 't1 t2')

The first argument is the name of the output class, and the second argument is the list of field names, whitespace- and/or comma-delimited.

This creates a named tuple with the __name__ SomeNamedTuple, and the attributes t1 and t2.
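
As a sketch of how this replaces the dumb __init__ from earlier, here is a made-up Pair type with the same left and right fields.

```python
import collections

# Hypothetical type standing in for the class with the dumb __init__.
Pair = collections.namedtuple('Pair', 'left right')

pair = Pair(left=1, right=2)

# Attribute access, indexing and unpacking all work on the same object.
left, right = pair
print(pair.left, pair[1], left + right)
print(pair)
```

The generated class also comes with a readable repr and tuple equality for free, which the hand-written __init__ version would not have.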

Extend it

When you need more stuff out of your namedtuple you should really subclass. Keep in mind that it is immutable, so any method that would change it should return a modified copy instead (for example via _replace).

class Dimensions(collections.namedtuple('DimensionsTuple', 'height width')):
    pass

>>> Dimensions(10, 30).height
10
>>> Dimensions(10, 30).width
30
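
Here is a sketch of such a method, on a variant of the class above. The scaled method and the ScalableDimensions name are inventions for this example; _replace is the real namedtuple method that builds the copy.

```python
import collections


class ScalableDimensions(
        collections.namedtuple('DimensionsTuple', 'height width')):
    """Hypothetical subclass showing a copy-returning method."""

    __slots__ = ()  # keep instances as light as plain tuples

    def scaled(self, factor):
        # The tuple is immutable, so build and return a changed copy.
        return self._replace(height=self.height * factor,
                             width=self.width * factor)


small = ScalableDimensions(10, 30)
big = small.scaled(2)
print(small, big)
```

The original object is untouched after the call, so it is safe to share it or use it as a dictionary key.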

It is rather useful sometimes, so don't forget about the named tuple! You might need it sooner or later.

Thursday, 27 September 2012

jQuery.showWhen

I've created a small jQuery plugin to allow for more responsive forms. Sometimes you only want to show parts of a form when other parts are complete. This helps you simplify forms and only go into detail when the user has already filled in some field. Here is an example, and an example with a checkbox instead.
Think about a survey:
  1. Do you smoke?
    1. Yes
    2. No (skip to 4.)
  2. How often do you smoke?
    1. Every hour
    2. Every two hours
  3. Do you realize ...
    1. Not really
  4. Is question four a question?
    1. Yes
    2. No
In this survey, we could follow the standard of interactive and intuitive forms and hide questions number 2. and 3. when they don't apply.
This could be done in an unobtrusive way: If the "(skip to 4.)" text was inside its own span with its own CSS class, it would be trivial to add data-skip-questions to that span, and have jQuery look for span[data-skip-questions], and use String.split with jQuery.add to register with this plugin.
html code:
<ol class="questions">
    <li>Do you smoke?
        <ol>
            <li><input type="radio" name="q1" value="y" />Yes</li>
            <li><input type="radio" name="q1" value="n" />No <span data-skip-questions="2 3">(skip to 4.)</span></li>
        </ol>
    </li>
    <li>How often do you smoke?
        <ol>
            <li><input type="radio" name="q2" value="1" /> Every hour</li>
            <li><input type="radio" name="q2" value="2" /> Every two hours</li>
        </ol>
    </li>
    <li>Do you realize ...
        <ol>
            <li><input type="radio" name="q3" value="n" /> Not really...</li>
        </ol>
    </li>
    <li>Is question four a question?
        <ol>
            <li><input type="radio" name="q4" value="y" /> Yes</li>
            <li><input type="radio" name="q4" value="n" /> No</li>
        </ol>
    </li>
</ol>
javascript code:
$('span[data-skip-questions]').each(function() {
    var questionsToSkip = $(this).data('skip-questions').split(' '),
        q = $(null),
        watchedInput = $(this).parents('li:first').find('input[type="radio"]');

    $.each(questionsToSkip, function(i, val) {
        q = q.add($('ol.questions > li')
              .eq(val - 1));
    });

    $(q).hide().showWhen(watchedInput, true);

})
// Hide the span, since the user doesn't have to skip the questions anymore.
.hide();
Here's a fiddle for the survey example.

Edit:
The plugin is now on github. It was released under the WTFPL. Feel free to fork and use!

Sunday, 23 September 2012

Python: "or" and "and" operators yield more stuff than bool

Python's and and or operators don't yield bool. They don't calculate the result of the expressions into booleans, and most certainly do not give us True or False.

They return one of the objects we put in. So they are not useful only in if statements, and can't convert to a boolean by themselves (without bool(), that is).

You might be familiar with using or like this since it's a common alternative to ... if ... else .... Take this __init__ method for example:

    def __init__(self, form=None):
        self.form = form or self.make_form()

It is straightforward, readable, and short. Pythonic indeed.

If the right operand is an expression which would throw an error, as long as the left operand is true, you don't have to worry. The right expression will not be evaluated. This is not new. We have seen it in if statements in the C language. I have used it countless times in Java to avoid NullPointerException. It wouldn't make any sense for a machine or virtual machine to evaluate both expressions if the first one already evaluates to true.

>>> def get_sth():
...     raise NotImplementedError
... 
>>> [1] or get_sth()
[1]
>>> [] or get_sth()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in get_sth
NotImplementedError


A very good use of this is to tentatively get a resource in the manner which is more likely to succeed, and then try a riskier or less desirable fallback if the first manner yields nothing.

This is awesome for creating intuitively usable classes. For example, a make_form implementation could be a single-line "raise NotImplementedError", so client code can initialize the class without passing in a form object every time, but if they want to use the defaults they just have to inherit the class and make their own make_form method. It's very practical, intuitive and informative behavior just for a single expression, isn't it?
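
Here is a sketch of that class design. FormView, DefaultFormView and the dictionary standing in for a form object are all made up for this example.

```python
class FormView(object):
    """Hypothetical base class: pass a form in, or override make_form."""

    def __init__(self, form=None):
        # make_form() only runs when no (truthy) form was passed in.
        self.form = form or self.make_form()

    def make_form(self):
        raise NotImplementedError('pass a form or override make_form')


class DefaultFormView(FormView):
    def make_form(self):
        return {'name': ''}  # stand-in for a real form object


print(FormView(form={'email': ''}).form)
print(DefaultFormView().form)

try:
    FormView()
except NotImplementedError as error:
    print('no form and no default:', error)
```

One thing to watch for: an empty but valid form object that evaluates to False would also trigger make_form, so this pattern assumes real forms are truthy.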

Here is the or expression behavior, described:
  • When one of the operands evaluates to True, returns the one which evaluates to True.
    • >>> 1 or 0
      1
      >>> 0 or 1
      1
  • When both evaluate to True, returns the left one (python won't even look at the second operand, so you can rest assured no methods will be called, if any.)
    • >>> 1 or [1]
      1
      >>> [1] or 1
      [1]
  • When both evaluate to False, returns the right one.
    • >>> 0 or []
      []
      >>> [] or 0
      0
We can use the last characteristic, too. Sometimes two values are False, but we want to use one over the other. (Standard falsy values are the empty list, the empty tuple, 0, False, None, and the empty string.)

The and operator is rather interesting. It gives you:
  • The right operand, when both evaluate to True:
    • >>> True and 1
      1
      >>> 1 and True
      True 
  • the operand which evaluates to False, when one of them is falsy:
    • >>> 19 and 0
      0
      >>> 0 and 19
      0
  • The left operand, when both evaluate to False
    • >>> 0 and False
      0
      >>> False and 0
      False
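
A common use of this behavior is guarding an operation that would fail on an empty or missing value. A small sketch, with a made-up first_word function:

```python
def first_word(text):
    """Return the first word of text, or text itself when it is falsy.

    Hypothetical helper: `and` skips the right side for '' and None.
    """
    return text and text.split()[0]


print(repr(first_word('hello world')))
print(repr(first_word('')))
print(repr(first_word(None)))
```

Note that the falsy input comes back unchanged, so the caller gets '' or None instead of a boolean. That is exactly the "returns one of the objects we put in" rule at work.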

Thursday, 20 September 2012

Interesting-c

I'm creating a new programming language!

This language will be created through transcompilation to C. Inspired by the concept of CoffeeScript, and including a lot of Python philosophies and ideas, I will try to create a language which is really easy to use, and yet fast as C.

One of the most important features will be reflection and metaprogramming facilities. This will help developers by describing the structure of their programs at runtime. This makes for good ORMs, serializers, etc.

It will not use classes for encapsulation, but instead a new concept I have created. It will build upon the capabilities of classes and the concept of "is-a" relationships. More on these soon.

interesting-c aims to be able to interpret 99% of existing C programs as interesting-c. So a C program will (almost) always be an interesting-c program. This allows developers to convert gradually to interesting-c.

interesting-c will have a module system. modules will be able to import other modules as well as pure C source or header files. The syntax will be something like this:

    import "c_module.c";
    import icmodule;

When importing, interesting-c will just add a #include directive to the compiled source, but it will simulate the concept and behavior of a namespace. It will parse the #included file and look for identifiers which the current module can use. Optionally, users will be able to use import module_name as alias syntax to assign the module to an identifier so its namespace can be isolated from the local module namespace. This poses a new problem: C has no concept of a namespace and will not allow the programmer to choose between foo in module1.c and foo in module2.c. It's unclear how I will solve this.

interesting-c will discourage (but still support) preprocessor directives. But there will be a lot more space for safer, more interesting pre-compile-time magic as well as runtime magic. And yet, you will still be able to shoot yourself in the foot and use macros as much as you want.

Early work

I have started creating interesting-c, and it really gives me new interesting problems to solve.

I am using PLY (python-lex-yacc)'s lex to create the lexer. My challenge right now is to preserve as much whitespace as possible, as well as most of the code's appearance. It's important for users to be able to opt in and out of interesting-c at any time, so that it is easy to adopt.
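
To give an idea of what "preserving whitespace" means for a lexer, here is a minimal sketch that treats whitespace and comments as ordinary tokens, so the source can be rebuilt exactly from the token list. It uses the standard re module rather than PLY, and the token names and patterns are simplified assumptions, not the real interesting-c lexer:

```python
import re

# Token patterns, tried in order. Whitespace and comments are real tokens
# here, so nothing from the source is thrown away.
TOKEN_SPEC = [
    ("COMMENT", r"/\*.*?\*/"),
    ("WHITESPACE", r"\s+"),
    ("NUMBER", r"\d+(?:\.\d+)?[fF]?"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OTHER", r"."),  # catch-all: any single character not matched above
]
TOKEN_RE = re.compile(
    "|".join("(?P<%s>%s)" % pair for pair in TOKEN_SPEC), re.DOTALL
)


def tokenize(source):
    """Return a list of (kind, text) pairs covering every character."""
    return [(m.lastgroup, m.group()) for m in TOKEN_RE.finditer(source)]


source = "int main(){\n\treturn 0;  /* done */\n}\n"
tokens = tokenize(source)
# Joining the token texts gives back the input character for character.
round_trip = "".join(text for _, text in tokens)
```

With PLY's lex, the equivalent would presumably be to give whitespace its own token rule instead of listing it in t_ignore.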

I have been creating small C programs to test C language features. Programs like these help me test whether and how the language supports something.

    /* Test whether floating point numbers can be octal */

    int main(int argc, char* argv[]){
        float oct = 01.1f;
        return 0;
    }

Monday, 10 September 2012

Python: str.split has a "limit" argument

I have recently made this little discovery. It seems that str.split has a maxsplit argument, which tells it to only split into a certain number of parts. This could be really useful for text parsing.
In the past I have run into some (rare) situations where I needed to do this, but didn't know of the maxsplit parameter, and ended up using str.join and slices to recreate the rest of the string with the delimiters.

It's a little boring to do, and it is ugly.
>>> url = '/posts/blog-1/10251/'
>>>
>>> #problem: split the URL into two parts
... #such that first_part == 'posts' and second_part == 'blog-1/10251'
... #first solution: split and join with slices.
...
>>> first_part = url.strip('/').split('/')[0]
>>> second_part = '/'.join(url.strip('/').split('/')[1:])
>>> first_part, second_part
('posts', 'blog-1/10251')
However, if we do this using the split limit argument, it becomes much more readable.
>>> #second solution: use unpacking, and str.split() with the limit argument
...
>>> first_part, second_part = url.strip('/').split('/',1)
>>> first_part, second_part
('posts', 'blog-1/10251')
>>>
The "limit" argument asks you how many splits you would like, not how many fragments you would like. So specify n-1 when you want n fragments.
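
If the off-by-one bothers you, it can be hidden in a tiny wrapper. This is just a sketch, and the name split_into is made up:

```python
def split_into(text, sep, fragments):
    """Split text on sep into at most `fragments` pieces."""
    if fragments < 1:
        raise ValueError("fragments must be at least 1")
    # n fragments need n - 1 splits.
    return text.split(sep, fragments - 1)


url = '/posts/blog-1/10251/'
one = split_into(url.strip('/'), '/', 1)
two = split_into(url.strip('/'), '/', 2)
three = split_into(url.strip('/'), '/', 3)
```

Asking for one fragment means zero splits, so the string comes back whole inside a one-item list.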

What about splitting by whitespace?

Splitting by whitespace is one of the most powerful features in str.split(). I usually invoke it by calling "".split() without any arguments, so I was worried that the limit argument, being positional-only, could not be combined with whitespace splitting. It can: pass None as the separator, as in "".split(None, 1).
This is nice since the exact whitespace that used to be there would be impossible to recover with the above tactic (since it's not just a delimiter character).
>>> 'OneFragment TwoFragments ThreeFragments'.split()
['OneFragment', 'TwoFragments', 'ThreeFragments']
>>> 'OneFragment TwoFragments ThreeFragments'.split(maxsplit=1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: split() takes no keyword arguments
>>> 'OneFragment TwoFragments ThreeFragments'.split(None, 1)
['OneFragment', 'TwoFragments ThreeFragments']

Split by whitespace, and preserve it.

When you split by whitespace, str.split splits by spaces, tabs, carriage returns and newlines. There are many whitespace characters, and sometimes you want to preserve this information. When using str.split and joining the result back, you have no way of getting that information back. It's gone. However, the maxsplit argument allows you to preserve the existing whitespace in the part that is left unsplit.
>>> 'And together we fled.\r\nWe said:\r\n\t"Hello!"'.split(None, 1)
['And', 'together we fled.\r\nWe said:\r\n\t"Hello!"']
>>> print _[1]
together we fled.
We said:
        "Hello!"

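As a side note, if you need every fragment and every run of whitespace, not just the unsplit tail, another option is re.split with a capturing group, which keeps the separators in the result. This is a different technique from the maxsplit one above:

```python
import re

text = 'And together we fled.\r\nWe said:\r\n\t"Hello!"'

# The parentheses make re.split keep each whitespace run in the result.
pieces = re.split(r'(\s+)', text)
words = pieces[0::2]       # even positions: the fragments
whitespace = pieces[1::2]  # odd positions: the separators

# Joining everything back restores the original string exactly.
rebuilt = ''.join(pieces)
```
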
Friday, 7 September 2012

Python: Unpacking works with one item!

One of the advantages of Python is that it is very practical to unpack variables out of short lists and tuples. Some pieces of code which would otherwise be repetitive and ugly (a = lst[0]; b = lst[1]) end up clean, short and easy to read.

>>> a,b = [1,2]
>>> a
1
>>> b
2
>>> 
It's what makes Python's multiple return values practical. When we write code that other modules are going to use, returning several values opens up a lot of possibilities, and makes the client code easier to write and understand.
>>> #create_monkey will use the monkey count in its calculations.
... #This value is very useful for client code, but it is rather expensive to obtain.
... monkey, new_monkey_count = create_monkey()
>>>

My discovery today is that you can "unpack" a list or tuple even when it contains only one item. Check it out:

ActivePython 2.7.2.5 (ActiveState Software Inc.) based on
Python 2.7.2 (default, Jun 24 2011, 12:22:14) [MSC v.1500 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> a, = [1]
>>> a
1
>>> a, = []
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: need more than 0 values to unpack
>>> 

This opens up some possibilities. If we are one hundred percent sure that the list contains a single value, we can unpack that into a variable instead of using [0], which is inherently ugly.

When you are not sure whether the container is empty (or holds more than one item), you can always catch the resulting ValueError. For the empty case alone, you can also use this:

>>> (a,) = [] or ['default']
>>> a
'default'
>>> 
However, the comma is very subtle and future code readers (including the smartass who decided to use this obscure thang) will probably not notice it. This effectively makes refactoring affected code tricky.

Furthermore, someone may think that it is a syntax error, edit the comma out, and TypeErrors and ValueErrors start popping up everywhere. Subtle bug hunting fun!

There is a little workaround for these readability issues, which is to use full tuple syntax:
>>> (a,) = [1]
>>> a
1
>>>
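
The ValueError route mentioned earlier can be wrapped in a small helper that uses the full tuple syntax. This is only a sketch, and the name only is made up. It covers both the empty case and the case with more than one item:

```python
def only(items, default=None):
    """Return the single item in items, or default if there is not exactly one."""
    try:
        (item,) = items
    except ValueError:
        # Raised both for too few and for too many values.
        return default
    return item


a = only([1])
b = only([], default='default')
c = only([1, 2], default='default')
```

Note that the "[] or ['default']" trick only covers the empty case. A list with two items would still raise ValueError there.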
So it seems like I can use this single-unpacking to do good and not just evil.
It seems to be really interesting, but I am not sure whether I should use it in real code. Seems like I have some meditation to do.
In the meantime, I can show off a little.
>>> a, = 1,
>>> a
1
>>>  

Saturday, 25 August 2012

django-model-permissions

I have created a small library to help create access control in Django projects. It uses a "lock" paradigm, where you can lock certain methods in Django Models (right now it actually works on every class, but I may be adding features that need tighter integration).

The purpose is to take access control to the Model layer, and out of the Controller (view) layer.

Here's how you use it (taken from the readme)

First, apply locks to some model methods:
class Example(models.Model, LockingMixin):
    class Meta:
        permissions = [
            ('example_can_get_id', 'Can get ID')
        ]

    name = models.CharField(max_length=33)

    @locks.permission('app.example_can_get_id')
    def get_id(self):
        return self.id
Then, instantiate it and use it. When you try to access a locked method, modelpermissions will throw an exception.
model = Example.objects.create(name='name')

try:
    model.get_id()
    assert False #remove this
except PermissionDenied:
    'django.core.exceptions.PermissionDenied was thrown'

But now let's unlock this model instance and try again

model.unlock(user)

model.get_id()
'no exception thrown'
 
Locks aren't limited to permission locks. You can use @locks.conditional and pass it a function that receives the model instance and do more flexible stuff with it.
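
To illustrate the idea of a conditional lock, here is a plain-Python sketch with no Django involved. It is not the library's actual implementation. The conditional decorator, the stand-in PermissionDenied class and the Document class below are all hypothetical and only show the general shape of "call a function with the instance, and refuse if it says no":

```python
import functools


class PermissionDenied(Exception):
    """Stand-in for django.core.exceptions.PermissionDenied."""


def conditional(predicate):
    """Lock a method unless predicate(instance) returns a true value."""
    def decorator(method):
        @functools.wraps(method)
        def wrapper(self, *args, **kwargs):
            if not predicate(self):
                raise PermissionDenied(method.__name__)
            return method(self, *args, **kwargs)
        return wrapper
    return decorator


class Document(object):
    def __init__(self, public):
        self.public = public

    # Hypothetical rule: only public documents may be read.
    @conditional(lambda doc: doc.public)
    def read(self):
        return "contents"


public_doc = Document(public=True)
private_doc = Document(public=False)

try:
    private_doc.read()
    denied = False
except PermissionDenied:
    denied = True
```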

And the future looks nice, too. Instead of just raising PermissionDenied, I feel that this could do more interesting things, like falling back to an alternate method when the user doesn't have enough permissions (or even returning a different Proxy model from the lock() method, which will start to return the model instance itself for chainability), and offering a different LockingMixin that locks things only when you explicitly call lock() on the Model.

It's available on github. Try it out!

Tuesday, 14 August 2012

Django Test Coverage

django-test-coverage (https://github.com/srosro/django-test-coverage) is an exciting project that leverages coverage.py in Django projects.
It is most helpful in figuring out whether your project is being comprehensively covered by your automated tests. When you run ./manage.py test, django-test-coverage outputs coverage information. One annoyance is that the coverage information prints even when tests fail, which makes you scroll back further in your command line session, but this might be fixable.
I was looking for a way to check test coverage for a Django application, and noticed that the project didn't have its own setup.py, nor was it ready for Django 1.4 because of a python package issue. I forked the project, created an appropriate python package, and moved the files in. Then I created a setup.py.
Not a big change, but it allowed me to test my application with coverage on Django 1.4, as well as install django-test-coverage much more cleanly, using pip.
My own fork is here: https://github.com/fabiosantoscode/django-test-coverage. You can report any bugs, suggest stuff to be added, fork, pull request...
Django becomes an even greater and more useful framework because of small plugins like these. It's fascinating how a few lines of code mostly wrapping external functionality can make such a big difference.

Wednesday, 6 June 2012

The most useless thing in the world?

One day, I will need something like this.

>>> class CallableInt(int):
...    def __call__(self):
...       return self
...
>>> a = CallableInt(1)
>>> a()
1
>>>

Won't I?

Tuesday, 15 May 2012

Leaner CSS and a couple of tricks.

I have been exploring and using LESS CSS for about a month now, and I must say it's an awesome package. Makes writing stylesheets a lot of fun and adjusting styles suddenly becomes 'less' of a drag.
The ability to add mixins and variables to your CSS workflow makes it a lot easier to concentrate on the bigger picture, instead of adjusting a load of rules.
Easily splitting your sheets into multiple files is another big plus.
LESS has a small undocumented feature that might be very useful for developing your LESS CSS libraries: mixins can have nested children. For example:
.some-flowed-text-helper-mixin(){
    font-size: 1.1em;
    line-height: 1.5em;
    
    hr{
        margin: 10px 20px;
    }
    
    img{
        float:left;
    }
}

p, .para{
    .some-flowed-text-helper-mixin();
}
becomes:
p, .para {
  font-size: 1.1em;
  line-height: 1.5em;
}
p hr, .para hr {
  margin: 10px 20px;
}
p img, .para img {
  float: left;
}
This will be very helpful when creating your own LESS CSS libraries, since the client LESS code would be very, very simple:
your library:
.some-typography-elements(@size-multiplier, @text-color, @heading-color, @link-color){
    p{
        font-size: 1em * @size-multiplier;
        color: @text-color;
    }
    h2{
        font-size: 1.3em * @size-multiplier;
        color: @heading-color;
    }
    a{
        color: @link-color;
    }
    hr{
        margin: 20px*@size-multiplier 10px*@size-multiplier;
    }
}
client LESS CSS:
@import "somelibrary/typography.less";

#content{
    // Multiplier of font-size and line-height, text color, heading color and link color
    .some-typography-elements(1.0, #222, #444, #22e);
}
#content-2{
    .some-typography-elements(0.8, #999, #666, #ccc);
}
The mixin would then add the rules for p, span, hr, h1, h2, etc. Very practical if you had flowed text in several places with different colors. Another use would be picture gallery widgets (they need to style the child images).
This might look like it's nothing (and it probably isn't much anyway), but there is more! You can also use mixins IN THE OPEN. You can declare new classes outside any parent by using mixins!
Take the above example, and let's suppose only one kind of text color and size is needed in the whole page. It wouldn't make sense to nest the text information inside an ID. Too much bloat in the output CSS for no good reason. All you have to do is:
.some-typography-elements(0.8, #999, #666, #ccc);

EDIT: I have decided to call this hidden featurette "global-level mixins".

This feature can be very useful, but it's not something we use every day. So it's good to stay mindful of it. I've used it to style nested menus in Joomla without incurring confusing nesting and indentation:
.level1(){
    /* ... */
    a{/* ... */}
}
.level2(){
    /* ... */
}
.level3(){
    /* ... */
}

ul.menu{
    .level1();
    ul{
        .level2();
        ul{
            .level3();
        }
    }
}
It's a good idea to use this on a CSS reset. It's seldom necessary to reset heading, nav, section, article, span, form, fieldset, table, td, tr, input, textarea, and every other element in the HTML5 spec. Instead, just importing the CSS reset and applying the mixins for the elements needed will spare a lot of CSS. Too many rules and selectors equals too much bloat, making my CSS larger and making my pages render more slowly.
Another use for this would be to avoid importing CSS classes when importing a LESS library. I might just want to use the variables, calculations and mixins, not include the pure CSS classes. Also, including these files several times results in multiple copies of the classes being placed in the CSS. If the library is static (no mixins), the CSS classes can be wrapped in a large mixin which adds the classes, to avoid unnecessary importing.
Have fun playing around!

Tuesday, 13 March 2012

Creating my new employer's website

Been a while since I posted.

What I've been doing:

Both at work and in my spare time, I've started messing around with the Android SDK, and am planning a small mobile app. I'm usually very concerned with the UI aspect, so I believe I'll make something good and usable.

I've changed jobs, and my previous bosses at António José Moreira decided that I wouldn't get paid for my work in the month of January. They state that the work was being done on a per-objective basis and that they haven't been satisfied with my work. However, they always paid a constant monthly salary.

Working at António José Moreira was a challenge, but I was always aware of their ruthless nature. As a company, they never cared for the well-being or satisfaction of their customers, or even cared to pay their suppliers or service providers at all.

I doubt that they will pay my salary willingly. However I have earned my pay with hard work, and I'm determined to get it.

My new company and their new website

I work at a company called Descontel Consulting now. They are a small company with fewer than 20 employees (not sure of the exact size though).

I've already made their new website. I used Joomla! CMS for the first time, and I must say it was fun customizing a template and making it work with all the components.

First I created the above image (grayscale helps me focus on the organisation of content rather than on what the page looks like).


Then I was asked to take Twitter off the top of the page, so I decided to make the above 3-column layout. The black thing to the right was a screenshot of the Twitter plugin I thought I was going to use.


After that, I colored it and started coding the HTML


No programming was involved, since everything was handled by Joomla. My work there was just creating the template, planning and organizing content, and a lot of HTML and CSS. I used this great jQuery plugin for the slider at the top. The site currently loads the jQuery library twice, though, because of the Twitter module. I'm going to have to optimize that.

I had to support Internet Explorer 6, and I did so using a second stylesheet and a conditional comment in the page header.

Like this:

<!--[if lt IE 7]>
<link rel="stylesheet" href="<?php echo $this->baseurl ?>/templates/<?php echo $this->template?>/css/style_ie6.css" type="text/css" />
<![endif]-->

It does make the browser download another stylesheet, but it's not a big one, and I didn't have the time to support IE 6 properly.