Monday, 18 August 2014

Python for data analysis

numpy

Array creation functions

array: Convert input data (list, tuple, array, or other sequence type) to an ndarray either by inferring a dtype or explicitly specifying a dtype. Copies the input data by default.
asarray: Convert input to ndarray, but do not copy if the input is already an ndarray
arange: Like the built-in range but returns an ndarray instead of a list (Python 2) or a range object (Python 3).
ones, ones_like: Produce an array of all 1’s with the given shape and dtype. ones_like takes another array and produces a ones array of the same shape and dtype.
zeros, zeros_like: Like ones and ones_like but producing arrays of 0’s instead
empty, empty_like: Create new arrays by allocating new memory, but, unlike ones and zeros, do not populate them with any values
eye, identity: Create a square N x N identity matrix (1’s on the diagonal and 0’s elsewhere)
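As a small illustrative sketch of the creation functions above (the variable names are made up for this example), the difference between array and asarray shows up when the input is already an ndarray:

```python
import numpy as np

data = [1, 2, 3]
arr = np.array(data)           # builds a new ndarray from the list
same = np.asarray(arr)         # input is already an ndarray: no copy
copy = np.array(arr)           # array() copies by default

print(same is arr)             # True
print(copy is arr)             # False

counting = np.arange(5)        # 0, 1, 2, 3, 4 as an ndarray
zeros = np.zeros((2, 3))       # 2 x 3 array of 0.0
ones = np.ones_like(arr)       # same shape and dtype as arr, filled with 1
ident = np.eye(2)              # 2 x 2 identity matrix
```

Note that empty and empty_like return whatever happened to be in memory, so their contents should be overwritten before use.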

Unary ufuncs

abs, fabs: Compute the absolute value element-wise for integer, floating point, or complex values. Use fabs as a faster alternative for non-complex-valued data
sqrt: Compute the square root of each element. Equivalent to arr ** 0.5
square: Compute the square of each element. Equivalent to arr ** 2
exp: Compute the exponential e^x of each element
log, log10, log2, log1p: Natural logarithm (base e), log base 10, log base 2, and log(1 + x), respectively
sign: Compute the sign of each element: 1 (positive), 0 (zero), or -1 (negative)
ceil: Compute the ceiling of each element, i.e. the smallest integer greater than or equal to each element
floor: Compute the floor of each element, i.e. the largest integer less than or equal to each element
rint: Round elements to the nearest integer, preserving the dtype
modf: Return the fractional and integral parts of an array as separate arrays
isnan: Return boolean array indicating whether each value is NaN (Not a Number)
isfinite, isinf: Return boolean array indicating whether each element is finite (non-inf, non-NaN) or infinite, respectively
cos, cosh, sin, sinh, tan, tanh: Regular and hyperbolic trigonometric functions
arccos, arccosh, arcsin, arcsinh, arctan, arctanh: Inverse trigonometric functions
logical_not: Compute truth value of not x element-wise. Equivalent to ~arr for boolean arrays.
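A short sketch of a few unary ufuncs applied to one made-up array; each call works element-wise and returns a new array:

```python
import numpy as np

arr = np.array([-1.5, 0.0, 2.25])

absolute = np.abs(arr)       # 1.5, 0.0, 2.25
signs = np.sign(arr)         # -1.0, 0.0, 1.0
ceiling = np.ceil(arr)       # -1.0, 0.0, 3.0
flooring = np.floor(arr)     # -2.0, 0.0, 2.0

# modf returns two arrays: fractional parts first, integral parts second
frac, whole = np.modf(arr)   # frac: -0.5, 0.0, 0.25   whole: -1.0, 0.0, 2.0

mask = np.array([True, False])
negated = np.logical_not(mask)   # False, True (same as ~mask)
```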

Binary universal functions

add: Add corresponding elements in arrays
subtract: Subtract elements in second array from first array
multiply: Multiply array elements
divide, floor_divide: Divide or floor divide (discarding the remainder, rounding the quotient toward negative infinity)
power: Raise elements in first array to powers indicated in second array
maximum, fmax: Element-wise maximum. fmax ignores NaN
minimum, fmin: Element-wise minimum. fmin ignores NaN
mod: Element-wise modulus (remainder of division)
copysign: Copy sign of values in second argument to values in first argument
greater, greater_equal, less, less_equal, equal, not_equal: Perform element-wise comparison, yielding boolean array. Equivalent to infix operators >, >=, <, <=, ==, !=
logical_and, logical_or, logical_xor: Compute element-wise truth value of logical operation. Equivalent to infix operators &, |, ^ on boolean arrays
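The following sketch, using made-up arrays, illustrates the NaN handling of maximum versus fmax and the rounding behaviour of floor_divide and mod with negative values:

```python
import numpy as np

x = np.array([1.0, np.nan, 3.0])
y = np.array([2.0, 2.0, 2.0])

largest = np.maximum(x, y)        # 2.0, nan, 3.0 (NaN propagates)
largest_skip = np.fmax(x, y)      # 2.0, 2.0, 3.0 (NaN ignored)

n = np.array([7, -7])
quotient = np.floor_divide(n, 2)  # 3, -4 (rounds toward negative infinity)
remainder = np.mod(n, 3)          # 1, 2  (result takes the divisor's sign)

a = np.array([True, True, False])
b = np.array([True, False, False])
both = np.logical_and(a, b)       # True, False, False (same as a & b)
either = np.logical_xor(a, b)     # False, True, False (same as a ^ b)
```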

Basic array statistical methods

sum: Sum of all the elements in the array or along an axis. Zero-length arrays have sum 0.
mean: Arithmetic mean. Zero-length arrays have NaN mean.
std, var: Standard deviation and variance, respectively, with optional degrees of freedom adjustment (default denominator n).
min, max: Minimum and maximum.
argmin, argmax: Indices of minimum and maximum elements, respectively.
cumsum: Cumulative sum of elements starting from 0
cumprod: Cumulative product of elements starting from 1
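A brief sketch of these methods on a small made-up 2 x 3 array, showing the effect of the axis argument and of the degrees-of-freedom adjustment (ddof):

```python
import numpy as np

arr = np.array([[1, 2, 3],
                [4, 5, 6]])

total = arr.sum()              # 21, over all elements
col_sums = arr.sum(axis=0)     # 5, 7, 9 (down the rows, one value per column)
row_means = arr.mean(axis=1)   # 2.0, 5.0 (one value per row)
flat_argmax = arr.argmax()     # 5, index into the flattened array
running = arr.cumsum()         # 1, 3, 6, 10, 15, 21

pop_var = arr.var()            # denominator n (the default)
sample_var = arr.var(ddof=1)   # denominator n - 1, equals 3.5 here
```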

Array set operations

unique(x): Compute the sorted, unique elements in x
intersect1d(x, y): Compute the sorted, common elements in x and y
union1d(x, y): Compute the sorted union of elements
in1d(x, y): Compute a boolean array indicating whether each element of x is contained in y
setdiff1d(x, y): Set difference, elements in x that are not in y
setxor1d(x, y): Set symmetric differences; elements that are in either of the arrays, but not both
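A minimal sketch of the set operations on two made-up arrays. One assumption to flag: the membership test below uses np.isin, which recent NumPy releases recommend in place of in1d; for 1D input the result is the same boolean array.

```python
import numpy as np

x = np.array([3, 1, 2, 3, 1])
y = np.array([2, 4, 3])

uniq = np.unique(x)             # 1, 2, 3
common = np.intersect1d(x, y)   # 2, 3
union = np.union1d(x, y)        # 1, 2, 3, 4
only_x = np.setdiff1d(x, y)     # 1
sym_diff = np.setxor1d(x, y)    # 1, 4

# One boolean per element of x: is it contained in y?
member = np.isin(x, y)          # True, False, True, True, False
```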

Linear Algebra

diag: Return the diagonal (or off-diagonal) elements of a square matrix as a 1D array, or convert a 1D array into a square matrix with zeros on the off-diagonal
dot: Matrix multiplication
trace: Compute the sum of the diagonal elements
det: Compute the matrix determinant
eig: Compute the eigenvalues and eigenvectors of a square matrix
inv: Compute the inverse of a square matrix
pinv: Compute the Moore-Penrose pseudo-inverse of a matrix
qr: Compute the QR decomposition
svd: Compute the singular value decomposition (SVD)
solve: Solve the linear system Ax = b for x, where A is a square matrix
lstsq: Compute the least-squares solution to y = Xb
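A short sketch with a made-up 2 x 2 system. Note that diag, dot and trace live in the top-level numpy namespace, while det, eig, inv, pinv, qr, svd, solve and lstsq are in numpy.linalg:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([3.0, 5.0])

x = np.linalg.solve(A, b)      # solves A x = b, here x is about (0.8, 1.4)
A_inv = np.linalg.inv(A)

print(np.allclose(A.dot(x), b))               # True
print(np.allclose(A.dot(A_inv), np.eye(2)))   # True

tr = np.trace(A)               # 2 + 3 = 5.0
det = np.linalg.det(A)         # 2*3 - 1*1 = 5.0, up to rounding

q, r = np.linalg.qr(A)         # A equals q times r, with r upper triangular
```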

Random Number Generation

seed: Seed the random number generator
permutation: Return a random permutation of a sequence, or return a permuted range
shuffle: Randomly permute a sequence in place
rand: Draw samples from a uniform distribution
randint: Draw random integers from a given low-to-high range
randn: Draw samples from a normal distribution with mean 0 and standard deviation 1 (MATLAB-like interface)
binomial: Draw samples from a binomial distribution
normal: Draw samples from a normal (Gaussian) distribution
beta: Draw samples from a beta distribution
chisquare: Draw samples from a chi-square distribution
gamma: Draw samples from a gamma distribution
uniform: Draw samples from a uniform [0, 1) distribution
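As an illustrative sketch (the seed value and sizes are arbitrary choices for this example), these functions are reached through numpy.random, and reseeding with the same value reproduces the same draws:

```python
import numpy as np

np.random.seed(12345)
first = np.random.randn(3)
np.random.seed(12345)
second = np.random.randn(3)
print(np.array_equal(first, second))   # True: same seed, same samples

samples = np.random.normal(loc=10.0, scale=2.0, size=(2, 3))  # 2 x 3 Gaussian draws
rolls = np.random.randint(1, 7, size=10)   # integers from 1 to 6; the high end is exclusive
perm = np.random.permutation(5)            # a shuffled copy of 0..4
unit = np.random.uniform(0.0, 1.0, size=4) # values in [0, 1)

items = [10, 20, 30]
np.random.shuffle(items)                   # shuffles in place and returns None
```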

Taken from Python for Data Analysis by Wes McKinney

Sunday, 17 August 2014

IPython debugging

Debugger commands

  • h (help) Display command list
  • help command Show documentation for command
  • c (continue) Resume program execution
  • q (quit) Exit debugger without executing any more code
  • b (break) number Set breakpoint at line number in current file
  • b path/to/file.py:number Set breakpoint at line number in specified file
  • s (step) Step into function call
  • n (next) Execute current line and advance to next line at current level
  • u (up) / d (down) Move up/down in function call stack
  • a (args) Show arguments for current function
  • debug statement Invoke statement in a new (recursive) debugger
  • l (list) statement Show current position and context at current level of stack
  • w (where) Print full stack trace with context at current position

Post-mortem debugging

%debug

Entering %debug immediately after an exception has occurred drops you into the stack frame where the exception was raised
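To make "the stack frame where the exception was raised" concrete, here is a rough sketch in plain Python, not IPython's implementation. It walks a traceback to its innermost entry, which is the frame a post-mortem debugger starts in. The functions inner and outer are hypothetical examples.

```python
import sys

def inner(values, index):
    return values[index]        # raises IndexError for a bad index

def outer():
    data = [1, 2, 3]
    return inner(data, 10)

try:
    outer()
except IndexError:
    tb = sys.exc_info()[2]
    # Follow tb_next to the last entry: the frame that actually raised.
    while tb.tb_next is not None:
        tb = tb.tb_next
    frame = tb.tb_frame
    raised_in = frame.f_code.co_name        # name of the raising function
    local_index = frame.f_locals["index"]   # its local variables are still available

print(raised_in, local_index)   # inner 10
```

From that frame, the u and d commands move through the callers (here, outer and the module level).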

Utility functions

Poor man’s breakpoint

import sys

def set_trace():
    from IPython.core.debugger import Pdb
    # Start the debugger in the caller's frame, not inside set_trace itself
    Pdb(color_scheme='Linux').set_trace(sys._getframe().f_back)

Putting set_trace() anywhere in your code drops you into the debugger when that line is executed.

Interactive function debugging

def debug(f, *args, **kwargs):
    from IPython.core.debugger import Pdb
    pdb = Pdb(color_scheme='Linux')
    return pdb.runcall(f, *args, **kwargs)

Passing a function to debug will drop you into the debugger for an arbitrary function call.

debug(fn, arg1, arg2, arg3, kwarg1=foo, kwarg2=bar)
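The runcall pattern can be tried without an interactive session. The sketch below is an assumption-laden stand-in: it uses the standard library's pdb.Pdb rather than IPython's Pdb, and feeds it scripted commands (a, then c) through a StringIO instead of the keyboard. The function scale is a hypothetical example.

```python
import io
import pdb

def scale(value, factor=2):
    result = value * factor
    return result

# Scripted debugger input: show the arguments (a), then continue (c).
commands = io.StringIO("a\nc\n")
transcript = io.StringIO()

# readrc=False skips any ~/.pdbrc; nosigint=True leaves signal handlers alone.
debugger = pdb.Pdb(stdin=commands, stdout=transcript,
                   nosigint=True, readrc=False)

# runcall stops at the first line of scale, runs the commands, then finishes.
returned = debugger.runcall(scale, 21, factor=2)

print(returned)                # 42
print(transcript.getvalue())   # includes "value = 21" and "factor = 2"
```

In an interactive session the stdin and stdout arguments are simply omitted, and the commands are typed at the debugger prompt.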

Interactive script debugging

Executing a script via %run with -d will start the script in the debugger

%run -d ./my_script.py

Specifying a line number with -b starts the script with a breakpoint already set

%run -d -b20 ./my_script.py # sets a breakpoint on line 20

Taken from Python for Data Analysis by Wes McKinney