Conifer Productions

From ideas to apps. From mobile to global.

Book review: Data Science at the Command Line

No matter how handy graphical user interfaces are, the good old command line remains a useful tool for performing various low-level data manipulation and system administration tasks. It is the fallback when you need to do something that no graphical control exists for. Being much more expressive and open-ended than a predefined set of controls, the command shell is the ultimate control environment for your computer. Data science has become one of the most intensely practised computer applications, so it is no wonder that it also benefits greatly from the hands-on control approach of the command line shell. Read more →

LCD-like banners in Python

Back in 1998 or so, I wrote a CD player application for Microsoft Windows in Borland Delphi. It was for a magazine tutorial article, and I wanted a cool LCD-like display to show track elapsed and remaining time. There was a good one available for Delphi, called LCDLabel, written by Peter Czidlina (if you’re reading this, thanks once more for your cooperation). I’ve thought about doing a modern version of the LCD display component several times over the years, and I even got pretty far with one for OS X in 2010, but then abandoned it because of other projects. Read more →

Thinking of Learning Python? Start here!

Python is one of the friendliest general-purpose programming languages out there. It is free to use, well supported and used by many big companies. Since its introduction in 1991, it may not have taken the world by storm, but it has gained a huge share of programmers’ interest. As of this writing (November 2014), Python is number 8 on the TIOBE Index. Recently I have been studying bioinformatics, and in the course of my studies I have met many people who are learning to program for the first time, and doing it with Python. Read more →

Unicode character dump in Python

Sometimes you just need to see what characters are lurking inside a Unicode-encoded text file. Your garden variety dump utility (like the venerable od in UNIX systems, or the Windows standard hex dump, though I don’t think there is one) only shows you the plain bytes, so you have to head over to unicode.org to find out what they mean. But first you need to decode UTF-8 to get the actual code points, or grok UTF-16 LE or BE, and so on. It’s fun, but it’s not for everyone.

The udump utility shows you a nice list of character names, together with their offsets in the file. Currently it only handles UTF-8, so the offset is calculated based on the UTF-8 length of the character.
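
To give a rough idea of the approach, here is a minimal sketch of how such a dump could be produced with Python's standard unicodedata module. It is not the actual udump source: the function name dump_characters, the output format and the "<unnamed>" placeholder are assumptions made up for this illustration. The offset is advanced by the UTF-8 length of each character, as described above.

```python
import unicodedata


def dump_characters(data: bytes):
    """Yield (offset, code point, name) for each character in UTF-8 data.

    Hypothetical helper for illustration; not part of udump itself.
    """
    offset = 0
    for ch in data.decode("utf-8"):
        # Control characters have no Unicode name, so supply a placeholder.
        name = unicodedata.name(ch, "<unnamed>")
        yield offset, ord(ch), name
        # Advance by the number of bytes this character takes in UTF-8.
        offset += len(ch.encode("utf-8"))


# Sample input: LATIN CAPITAL LETTER A, e with acute, euro sign.
sample = "A\u00e9\u20ac".encode("utf-8")
for offset, code_point, name in dump_characters(sample):
    print(f"{offset:08d}: U+{code_point:04X} {name}")
```

With the three-character sample, the offsets come out as 0, 1 and 3, because the characters take one, two and three bytes in UTF-8 respectively.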

Read more →