Basic Python (Short Version)

This is just to give you a glimpse of what Python can do. We selected only the subset of features we think will be useful for doing analysis. We also leave links in various places in case you want to study further on your own, and there is an extended edition if you want to learn more.

For more complete coverage, you can look at the official Python documentation or a book like Think Python.

In this tutorial we will be using IPython. To execute the current cell and go to the next cell, press Shift+Enter. If things go wrong, you can restart the kernel either by clicking Kernel in the top bar and choosing Restart, or by pressing Ctrl+M followed by . (yes, press the dot key too).

Hello World

In [1]:
#press shift+enter to execute this
print 'Hello world'
Hello world
In [2]:
#IPython automatically shows the representation of
#the return value of the last command
1+1
Out[2]:
2

Data Type(Usual Stuff)

In [3]:
x = 1 #integer
y = 2.0 #float
t = True #boolean (the other value is False)
s = 'hello' #string
s2 = "world" #double quotes work too
#there are also triple quotes; google "python triple quotes"
n = None #None is Python's null-like value
In [4]:
print x+y #you can refer to previously assigned variable.
3.0
In [5]:
s+' '+s2
Out[5]:
'hello world'
In [6]:
#boolean operations
x>1 and (y>=3 or not t) and not s=='hello' and n is None
Out[6]:
False
In [7]:
#Bonus: The only language I know that can do this
0 < x < 10
Out[7]:
True

Bonus: String formatting

One of the best implementations. There are a couple of ways to do string formatting in Python. This is the one I personally like.

In [8]:
'x is %d. y is %f'%(x,y)
Out[8]:
'x is 1. y is 2.000000'
In [9]:
#even more advanced stuff
#locals() returns a dictionary of
#local variables, which you can then use
#in formatting by name
'x is %(x)d. y is %(y)f'%locals() #easier to read
Out[9]:
'x is 1. y is 2.000000'
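
Since the text says there are a couple of ways to format strings, here is a minimal sketch of one alternative, the built-in str.format method (available since Python 2.6). The variable names by_position and by_name are only illustrative. The print calls take a single parenthesized argument so that the snippet should run on both Python 2 and Python 3.

```python
# Alternative to the % operator: str.format
x = 1
y = 2.0

# Fields can be referenced by position...
by_position = 'x is {0}. y is {1:.2f}'.format(x, y)
# ...or by name, which is often easier to read.
by_name = 'x is {x}. y is {y:.2f}'.format(x=x, y=y)

print(by_position)  # x is 1. y is 2.00
print(by_name)      # x is 1. y is 2.00
```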

List, Set, Tuple, Dictionary, Generator

List

Think of it as std::vector++

In [10]:
l = [1, 2, 3, 4, 5, 6, 7]
print l #[1, 2, 3, 4, 5, 6, 7]
print l[2] #3
print len(l) # list length
print l[-1] #7 negative index works from the back (-1)
l2 = [] #want an empty list?
print l2
[1, 2, 3, 4, 5, 6, 7]
3
7
7
[]
In [11]:
#elements don't need to have the same type,
#but mixing types is not recommended. You will just get confused
bad_list = ['dog','cat',1,1.234]
In [12]:
l[1] = 10 #assignment
l
Out[12]:
[1, 10, 3, 4, 5, 6, 7]
In [13]:
l.append(999) #append an element to the list
l
Out[13]:
[1, 10, 3, 4, 5, 6, 7, 999]
In [14]:
#sort the list in place
l.sort()
l
Out[14]:
[1, 3, 4, 5, 6, 7, 10, 999]
In [15]:
#searching a list is O(N); use a set for O(1) average lookup
#http://docs.python.org/2/tutorial/datastructures.html#sets
10 in l
Out[15]:
True
In [16]:
11 not in l
Out[16]:
True
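
As a rough sketch of the set alternative mentioned in the search cell: a set is backed by a hash table, so a membership test does not have to scan every element. The names numbers, lookup and unique are made up for this illustration.

```python
numbers = [1, 3, 4, 5, 6, 7, 10, 999]

# Build the set once (this costs O(N)), then reuse it for many lookups.
lookup = set(numbers)

found = 10 in lookup        # hash lookup, O(1) on average
missing = 11 not in lookup

# A set also drops duplicates; sorted() turns it back into an ordered list.
unique = sorted(set([3, 1, 3, 2, 1]))

print(found)    # True
print(missing)  # True
print(unique)   # [1, 2, 3]
```

Building the set only pays off when you test membership many times; for a single lookup, `in` on the list is fine.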
In [17]:
#useful list function
range(10) #build it all in memory
Out[17]:
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
In [18]:
#List Comprehension
#we will get to for loops later, but for simple cases
#a list comprehension is much more readable
print l
my_list = [2*x for x in l]
print my_list
my_list = [ (2*x,x) for x in range(10)]
print my_list
my_list = [3*x for x in range(10) if x%2==0]
print my_list
[1, 3, 4, 5, 6, 7, 10, 999]
[2, 6, 8, 10, 12, 14, 20, 1998]
[(0, 0), (2, 1), (4, 2), (6, 3), (8, 4), (10, 5), (12, 6), (14, 7), (16, 8), (18, 9)]
[0, 6, 12, 18, 24]
In [19]:
#This might come in handy
[1]*10
Out[19]:
[1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
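
Slicing is referred to later in this tutorial (when copying a list) but is not shown in this short version, so here is a small sketch of the basic forms. The list name values and the result names are chosen only for this example.

```python
values = [1, 2, 3, 4, 5, 6, 7]

first_three = values[:3]   # from the start up to (not including) index 3
middle = values[2:5]       # indices 2, 3 and 4
last_two = values[-2:]     # negative indices count from the back
every_other = values[::2]  # a third number is the step
whole_copy = values[:]     # a new list with the same elements

print(first_three)  # [1, 2, 3]
print(middle)       # [3, 4, 5]
print(last_two)     # [6, 7]
print(every_other)  # [1, 3, 5, 7]
```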

Bonus: Python Autocomplete

In [20]:
#in this cell try my_ and press tab
#IPython knows about local variables and
#can do autocomplete (remember locals()?)
In [21]:
#try typing len(<TAB> here
#IPython can show you documentation, function signatures, etc.

Tuple

Think of it as an immutable list.

In [22]:
tu = (1,2,3) #tuple: an immutable list
print tu
tu2 = tuple(l) #convert list to tuple
print tu2
tu3 = 4,5,6 #parentheses are actually optional but make it more readable
print tu3
(1, 2, 3)
(1, 3, 4, 5, 6, 7, 10, 999)
(4, 5, 6)
In [23]:
#access
tu[1]
#you can't assign to it
Out[23]:
2
In [24]:
#tuple expansion
print tu
x, y, z = tu
print x #1
print y #2
print z #3
print x, y, z #print accepts several comma-separated values too

x, y, z = 10, 20, 30 #parentheses are actually optional
print z, y, x #any order
(1, 2, 3)
1
2
3
1 2 3
30 20 10
In [25]:
#useful for returning multiple values
def f(x,y):
    return x+y, x-y #parenthesis is implied
a, b = f(10,5)
print a #15
print b #5
print a, b #works too
15
5
15 5

Dictionary

Think of it as something like std::map, although it is actually a hash table. There is also OrderedDict if you care about ordering.

In [26]:
d = {'a':1, 'b':10, 'c':100}
print d #{'a': 1, 'c': 100, 'b': 10}
d2 = dict(a=2, b=20, c=200) #using named argument
print d2 #{'a': 2, 'c': 200, 'b': 20}
d3 = dict([('a', 3),('b', 30),('c', 300)]) #list of tuples
print d3 #{'a': 3, 'c': 300, 'b': 30}
d4 = {x:2*x for x in range(10)}#comprehension (key doesn't have to be string)
print d4 #{0: 0, 1: 2, 2: 4, 3: 6, 4: 8, 5: 10, 6: 12, 7: 14, 8: 16, 9: 18}
d5 = {} #empty dict
print d5 #{}
{'a': 1, 'c': 100, 'b': 10}
{'a': 2, 'c': 200, 'b': 20}
{'a': 3, 'c': 300, 'b': 30}
{0: 0, 1: 2, 2: 4, 3: 6, 4: 8, 5: 10, 6: 12, 7: 14, 8: 16, 9: 18}
{}
In [27]:
print d['a'] #access
print len(d) #count elements
d['d'] = 1000 #insert
print d #{'a': 1, 'c': 100, 'b': 10, 'd': 1000}
del d['c'] #remove
print d #{'a': 1, 'b': 10, 'd': 1000}
print 'c' in d #does the key exist?
1
3
{'a': 1, 'c': 100, 'b': 10, 'd': 1000}
{'a': 1, 'b': 10, 'd': 1000}
False
In [28]:
#use dictionary in comprehension
#in Python 2, d.items() returns a list of (key, value) tuples
#k,v in d.items() does tuple expansion in disguise
new_d = {k:2*v for k,v in d.items()}
print new_d #{'a': 2, 'b': 20, 'd': 2000}
{'a': 2, 'b': 20, 'd': 2000}
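
The Dictionary section mentions OrderedDict for when ordering matters. A minimal sketch, assuming Python 2.7 or later where collections.OrderedDict is part of the standard library; the names od and keys_in_order are illustrative only.

```python
from collections import OrderedDict

# An OrderedDict remembers the order in which keys were inserted.
od = OrderedDict()
od['a'] = 1
od['c'] = 100
od['b'] = 10

keys_in_order = list(od.keys())
print(keys_in_order)  # ['a', 'c', 'b']

# It otherwise behaves like a normal dictionary.
print(od['c'])        # 100
print('b' in od)      # True
```

Note that in Python 3.7 and later a plain dict also preserves insertion order, so OrderedDict matters mostly on older versions such as the Python 2 used in this tutorial.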

Control Flow

if else elif

Indentation in Python is meaningful. There are NO braces for delimiting blocks in Python.

The recommended indentation is 4 spaces, not a tab. Tabs work but are not recommended; set your text editor to use soft tabs. PEP8 lists all the recommended style rules: spacing, commas, indentation, new lines, comments, etc. Fun read.

In [29]:
x = 20
if x>10: #colon
    print 'greater than 10' #don't forget the indentation
elif x>5: #parentheses are not really needed
    print 'greater than 5'
else:
    print 'not greater than 5'
x+=1 #continue your execution with lower indentation
print x
greater than 10
21
In [30]:
#shorthand if
y = 'oh yes' if x>100 else 'oh no' #no colon
print y
oh no
In [31]:
#since indentation matters, sometimes we need a placeholder statement
if x>10:
    print 'yes'
else:
    pass #pass keyword means do nothing
x+=1
print x
yes
22
In [32]:
#why is there no bracket??
from __future__ import braces # easter egg
  File "<ipython-input-32-25218e719ec0>", line 2
    from __future__ import braces # easter egg
SyntaxError: not a chance

For loop, While loop, Generator, Iterable

There is actually no for(i=0;i<10;i++) in Python; a for loop always iterates over an iterable. A list is one example of an iterable.

In [33]:
#iterate over a range of numbers
for i in xrange(5): # xrange produces its values lazily instead of building a list
    print i # again indentation is meaningful
print '------'
for i in xrange(20,25): # lazy as well; see extended version if you wonder
    print i # again indentation is meaningful
0
1
2
3
4
------
20
21
22
23
24
In [34]:
#looping over a list
l = ['a','b','c']
for x in l:
    print x
a
b
c
In [35]:
#if you need index
l = ['a','b','c']
for i,x in enumerate(l):
    print i,x
0 a
1 b
2 c
In [36]:
#looping over a dictionary
d = {'a':1,'b':10,'c':100}
#in Python 2, items() returns a list of (key, value) tuples
#k,v in d.items() is tuple expansion in disguise
for k,v in d.items():
    print k,v
a 1
c 100
b 10
In [37]:
#looping over multiple lists together
lx = [1,2,3]
ly = [2*x+1 for x in lx]
print lx, ly
for x,y in zip(lx,ly): #there is also itertools.izip, which returns an iterator instead of a list
    print x,y
[1, 2, 3] [3, 5, 7]
1 3
2 5
3 7
In [38]:
#to complete the list of loops: the while loop
x = 0
while x<5:
    print x
    x+=1
0
1
2
3
4
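
The section title mentions generators, which this short version does not otherwise show. As a minimal sketch: a generator produces values one at a time, on demand, instead of building a whole list. The function name countdown and the other names are hypothetical, chosen for this example.

```python
def countdown(n):
    # 'yield' hands back one value and pauses here until the next one is requested
    while n > 0:
        yield n
        n -= 1

# A for loop (or list()) pulls the values out one by one.
values = list(countdown(3))
print(values)  # [3, 2, 1]

# A generator expression looks like a list comprehension with parentheses.
squares = (x * x for x in range(4))
total = sum(squares)
print(total)   # 14
```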

See Also

For more complex looping you can look at itertools
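
To give a taste of itertools, here is a small sketch using three of its functions (product, chain and islice with count). All of them return lazy iterators, so list() is used here only to show the results.

```python
import itertools

# product: every combination, like a nested for loop
pairs = list(itertools.product([1, 2], ['a', 'b']))
print(pairs)        # [(1, 'a'), (1, 'b'), (2, 'a'), (2, 'b')]

# chain: loop over several iterables as if they were one
chained = list(itertools.chain([1, 2], [3, 4]))
print(chained)      # [1, 2, 3, 4]

# count is an endless counter; islice takes just the first few items
first_three = list(itertools.islice(itertools.count(10), 3))
print(first_three)  # [10, 11, 12]
```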

Function

Functions in Python are first-class objects (except in a very few cases).

In [39]:
def f(x, y): #remember the colon
    print 'x =',x #again indentation
    print 'y =',y
    return x+y
f(10,20)
x = 10
y = 20
Out[39]:
30
In [40]:
#Python is a dynamically typed language
#specifically it uses duck typing (Wikipedia it. Fun read.)
#this means as long as the arguments support the operations used
#Python doesn't care
f('hello','world')
x = hello
y = world
Out[40]:
'helloworld'
In [41]:
#you can pass it by name too
#this is useful since you can't always remember the order
#of the arguments
f(y='y',x='x') # notice I put y before x
x = x
y = y
Out[41]:
'xy'
In [42]:
#default/keyword arguments
def g(x, y, z='hey'):
    #one of the most useful functions
    print locals() # returns a dictionary of all local variables
g(10,20)
g(10,20,30) # can do it positionally
{'y': 20, 'x': 10, 'z': 'hey'}
{'y': 20, 'x': 10, 'z': 30}
In [43]:
g(10,z='ZZZZ',y='YYYY') #or using keyword
{'y': 'YYYY', 'x': 10, 'z': 'ZZZZ'}
In [44]:
def myfunc(x,y,z, long_keyword="111000"):
    return None
In [45]:
#IPython knows about keyword argument names; try typing this
#myfunc(x, y, z, lon<TAB>
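
Here is a short sketch of what it means for functions to be first-class objects: a function can be passed as an argument or stored in a data structure like any other value. The names apply_twice, add_three and operations are invented for this illustration.

```python
def apply_twice(func, value):
    # func is just another argument; call it like any function
    return func(func(value))

def add_three(n):
    return n + 3

result = apply_twice(add_three, 10)
print(result)   # 16

# Functions can also be stored in a dictionary and looked up by name.
operations = {'add_three': add_three, 'negate': lambda n: -n}
negated = operations['negate'](5)
print(negated)  # -5
```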

Be careful

In [46]:
#at some point in your programming life
#you might be tempted to put a mutable object like a list
#as a default argument. Just don't
def f(x,y,z=[]): #Don't do this
    pass
def f(x,y,z=None):
    z = [] if z is None else z

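This happens because the default value is evaluated only once, when the function is defined, so every call shares the same list. If you wonder why, you can read “Least Astonishment” in Python: The Mutable Default Argument.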
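
A minimal sketch of the pitfall, using two hypothetical functions, append_bad and append_good. The default list is created once, when the def statement runs, and that same list is then reused by every call that does not pass its own.

```python
def append_bad(item, bucket=[]):   # Don't do this
    bucket.append(item)
    return bucket

def append_good(item, bucket=None):
    bucket = [] if bucket is None else bucket  # fresh list on every call
    bucket.append(item)
    return bucket

# list(...) takes a snapshot, because append_bad keeps returning the same list
first_bad = list(append_bad(1))
second_bad = list(append_bad(2))
print(first_bad)    # [1]
print(second_bad)   # [1, 2]  <- the 1 from the previous call is still there

first_good = append_good(1)
second_good = append_good(2)
print(first_good)   # [1]
print(second_good)  # [2]
```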

Bonus

This might come in handy.

In [47]:
#arbitrary number of arguments (like C's va_arg)
def h(x,y,*arg,**kwd):
    print locals()
h(10,20,30,40,50,custom_kwd='hey')
{'y': 20, 'x': 10, 'kwd': {'custom_kwd': 'hey'}, 'arg': (30, 40, 50)}
In [48]:
#Bonus: more cool stuff.
#argument expansion
def g(x, y, z):
    print locals()
t = (1,2,3)
g(*t)
{'y': 2, 'x': 1, 'z': 3}
In [49]:
#If you know lambda calculus
f = lambda x: x+1
f(3)
Out[49]:
4
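
One place where a lambda is commonly useful is as the key argument of the built-in sorted function. A small sketch, with example data made up for the illustration:

```python
words = ['banana', 'fig', 'apple']

# key is called on each element; the results decide the order
by_length = sorted(words, key=lambda w: len(w))
print(by_length)  # ['fig', 'apple', 'banana']

pairs = [(1, 'b'), (2, 'a')]
# sort tuples by their second element
by_second = sorted(pairs, key=lambda p: p[1])
print(by_second)  # [(2, 'a'), (1, 'b')]
```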

Classes, Object etc.

Think of a Python variable as a pointer to an object in C. This answers many questions about whether arguments are passed by reference or by value, and whether an assignment copies. Internally, it actually is a C pointer to a struct.

In [50]:
#define a class
class MyClass:
    x = 1 #you can define a field like this
    
    #first argument is called self. It refers to the instance itself
    #think of it as the this keyword in C++
    def __init__(self, y): #constructor
        self.y = y #or define it here
    
    def do_this(self, z):
        return self.x + self.y + z
In [51]:
a = MyClass(10)
print a.do_this(100)
111
In [52]:
#press a.<TAB> here for IPython autocomplete
In [53]:
#you can even add a field to it
a.z = 'haha'
print a.z
haha
In [54]:
#remember when I said think of it as C pointer??
b = a
b.x = 999 #change b
print a.x #printing a.x not b.x
999
In [55]:
#you may think you won't encounter it but...
a = [1,2,3]
b = a
b[1]=10
print a
[1, 10, 3]
In [56]:
#shallow copy is easy
a = [1,2,3]
b = a[:] #remember slicing? it creates a new list
b[1] = 10
print a, b
[1, 2, 3] [1, 10, 3]
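
A shallow copy only copies the outer list, so nested lists are still shared. As a sketch of the difference, the standard library's copy.deepcopy copies the nested objects as well. The names nested, shallow and deep are illustrative.

```python
import copy

nested = [[1, 2], [3, 4]]
shallow = nested[:]           # new outer list, but the same inner lists
deep = copy.deepcopy(nested)  # inner lists are copied too

shallow[0][0] = 99            # modifies an inner list shared with 'nested'

print(nested)  # [[99, 2], [3, 4]]
print(deep)    # [[1, 2], [3, 4]]
```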

Inheritance

Python supports multiple inheritance, which is often used for mixins. You can read about it here. We won't need it in this tutorial. The basic syntax is the following.

In [57]:
class Parent:
    x = 10
    y = 20
    
class Child(Parent):
    x = 30
    z = 50

p = Parent()
c = Child()
print p.x
print c.x
10
30
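
As a rough sketch of the mixin idea mentioned above (not needed for the rest of the tutorial): a child class can list more than one parent and picks up methods from all of them. The classes Base, LoudMixin and Derived are hypothetical names for this example; they inherit from object so that the sketch behaves the same on Python 2 and Python 3.

```python
class Base(object):
    x = 10

    def describe(self):
        return 'x is %d' % self.x

class LoudMixin(object):
    # a mixin adds behaviour and relies on the other parent for describe()
    def shout(self):
        return self.describe().upper() + '!'

class Derived(LoudMixin, Base):  # multiple inheritance
    x = 30                       # overrides Base.x

obj = Derived()
described = obj.describe()
shouted = obj.shout()
print(described)  # x is 30
print(shouted)    # X IS 30!
```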