Sunday 15 March 2015

Pandas: merge 2 csv files of market data and plot the spread

import pandas as pd
import numpy as np
from datetime import datetime

%matplotlib inline

# load the csv files into pandas
def read_csv(filename):
    return pd.read_csv(
        filename, 
        dtype = {
            'bid_vol'  : np.float64, 
            'bid_price': np.float64, 
            'ask_vol'  : np.float64, 
            'ask_price': np.float64
        }, 
        na_values=['nan'], 
        index_col='time', 
        parse_dates=['time'], 
        # %f accepts at most 6 fractional digits, so keep only the first 15 characters (HH:MM:SS.ffffff)
        date_parser=lambda x: datetime.strptime(x[:15], "%H:%M:%S.%f"))
df1 = read_csv('mkt1.best')
df2 = read_csv('mkt2.best')

# since both dataframes have the same column names for price data we
#   need to create a multiindex using the instrument id
def create_time_and_id_index(dfs):
    # concatenate all the dataframes into one big dataframe
    #   (DataFrame.append was removed in pandas 2.0; pd.concat also leaves the caller's list intact)
    out = pd.concat(dfs)
    # add 'id' to the index
    out = out.set_index('id', append=True)
    # sort on timestamp (DataFrame.sort was replaced by sort_index)
    out = out.sort_index()
    # pivot instr_id from the row index to the column index, leaving only timestamp as the row index
    out = out.unstack()
    # the timestamps in the dataframes may not line up, so after combining them pandas fills the
    #  other instrument's columns with NaN; forward fill to carry the last known values
    out = out.ffill()
    # reshuffle the column index so that instr_id is the top level, and the other column labels are the second level
    out = out.swaplevel(0, 1, axis=1)
    # re-sort the column labels so that columns are grouped by instr_id
    out = out.sort_index(axis=1)
    return out

best = create_time_and_id_index([df1, df2])

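# create spread columns
# mkt1 and mkt2 hold the instrument ids found in the 'id' column
#   (placeholder values here, adjust them to match your data)
mkt1, mkt2 = 'mkt1', 'mkt2'
best['sell_spread'] = best[mkt2].ask_price - best[mkt1].bid_price
best['buy_spread']  = best[mkt1].ask_price - best[mkt2].bid_price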

# plot the results!
best.plot(y='sell_spread', figsize=(20, 8))

Bash tab completion example

#!/bin/bash

_apps()
{
    echo $(cat ${APPS} | awk '{print $1}' | grep -v -e '^#\|^$')
}

_servers()
{
    echo $(cat ${SERVERS} | awk '{print $2}' | sort -u | cut -f2 -d@)
}

_options()
{
    echo "--help --verbose --validate --quiet --server"
}

_commands()
{
    echo "status start stop restart kill version config"
}

_contains()
{
    local e
    for e in ${@:2}; do
        if [[ "$e" == "$1" ]]; then
            echo 1
            return 0
        fi
    done
    echo 0
    return 1
}

_complete()
{
    local prev_cmd="${COMP_WORDS[COMP_CWORD-1]}"
    local curr_cmd="${COMP_WORDS[COMP_CWORD]}"

    if [[ ${prev_cmd} == "--server" ]]; then
        COMPREPLY=( $(compgen -W "$(_servers)" -- ${curr_cmd}) )
        return 0
    fi

    if [[ ${curr_cmd} == -* ]]; then
        COMPREPLY=( $(compgen -W "$(_options)" -- ${curr_cmd}) )
        return 0
    fi

    # previous command was an app name, so show commands
    if [[ $(_contains "${prev_cmd}" "$(_apps)") -eq 1 ]]; then
        COMPREPLY=( $(compgen -W "$(_commands)" -- ${curr_cmd}) )
        return 0
    fi

    # otherwise try match an app name
    COMPREPLY=( $(compgen -W "$(_apps)" -- ${curr_cmd}) )
}

_main()
{
    complete -F _complete cmd
}
_main

Tuesday 10 March 2015

C++11 - Unevaluated operands

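Operands of sizeof, decltype and noexcept are never evaluated. The operand of typeid is also unevaluated, unless it is a glvalue of a polymorphic class type, in which case it is evaluated at run time.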

We therefore only need a declaration, not the definition, to use a function or object's name in these contexts
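
As a minimal sketch of this point (scale is a hypothetical function, invented for illustration), the function below is declared but never defined. Because it only appears inside unevaluated operands, nothing is called and the linker never asks for a definition:

```cpp
#include <type_traits>

// Hypothetical function: declared here, defined nowhere.
double scale(int);

// decltype only inspects the declaration to find the return type.
static_assert(std::is_same<decltype(scale(1)), double>::value,
              "decltype sees the declared return type");

// sizeof asks for the size of the result type, without calling scale.
static_assert(sizeof(scale(1)) == sizeof(double),
              "sizeof uses the declared return type");

// noexcept reports whether the call could throw, again without calling it.
static_assert(!noexcept(scale(1)),
              "scale is not declared noexcept");
```

All three checks happen at compile time, so a translation unit containing them builds without scale ever being implemented.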

std::declval<T>()  returns T&&
std::declval<T&>() returns T&

decltype( foo(std::declval<T>()) ) yields foo's return type when foo is called with an rvalue of type T

declval lets us name a value of type T without constructing one, so T does not need an accessible constructor. It is declared but never defined, which means it can only be used in an unevaluated context - useful for SFINAE etc
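
A small sketch of the declval rules above, assuming a made-up Sensor type and describe overloads (neither comes from the original notes). Sensor has no default constructor and none of its members are defined, yet every expression below is fine because it is never evaluated:

```cpp
#include <type_traits>
#include <utility>

// Hypothetical type with no default constructor; members are only declared.
struct Sensor {
    explicit Sensor(int id);
    double read() const;
};

// Sensor().read() would not compile, but declval needs no constructor.
using reading_t = decltype(std::declval<Sensor>().read());
static_assert(std::is_same<reading_t, double>::value, "read() returns double");

// declval<T>() is an rvalue (T&&); declval<T&>() is an lvalue (T&).
static_assert(std::is_same<decltype(std::declval<Sensor>()), Sensor&&>::value, "");
static_assert(std::is_same<decltype(std::declval<Sensor&>()), Sensor&>::value, "");

// Hypothetical overloads: the value category selects which one decltype sees.
int  describe(Sensor&);   // chosen for lvalues
char describe(Sensor&&);  // chosen for rvalues
static_assert(std::is_same<decltype(describe(std::declval<Sensor&>())), int>::value, "");
static_assert(std::is_same<decltype(describe(std::declval<Sensor>())), char>::value, "");
```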

Example: testing for copy-assignability

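#include <type_traits>
#include <utility>

template<class T>
class is_copy_assignable
{
    // only viable when the expression U& = const U& is well-formed
    template<class U, class = decltype(std::declval<U&>() = std::declval<const U&>())>
    static std::true_type try_assignment(U&&);

    // catch-all fallback; deliberately not a template, since U could not be deduced from (...)
    static std::false_type try_assignment(...);

public:
    using type = decltype(try_assignment(std::declval<T>()));
};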

How this works:

try_assignment(...) will match anything, but an ellipsis conversion always ranks as the worst match in overload resolution, so if the other try_assignment is viable, it will be chosen.

type will be the return type of the chosen try_assignment overload, which will be either true_type or false_type

the true_type overload is only viable if the expression U& = const U& is valid - ie: if U is copy assignable

We use a second template parameter to allow SFINAE to kick in. It is unnamed because we only use it for SFINAE.
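
To see the mechanism at work, here is a self-contained sketch of the trait with a few test types. The types Plain, NoCopyAssign and HasConstMember and the value member are additions for illustration only. The fallback is written as a plain (non-template) function so that it stays viable without having to deduce U from the ellipsis:

```cpp
#include <type_traits>
#include <utility>

template<class T>
class is_copy_assignable
{
    // SFINAE: dropped from the overload set if U& = const U& is ill-formed
    template<class U, class = decltype(std::declval<U&>() = std::declval<const U&>())>
    static std::true_type try_assignment(U&&);

    // worst possible match, picked only when the overload above is gone
    static std::false_type try_assignment(...);

public:
    using type = decltype(try_assignment(std::declval<T>()));
    static constexpr bool value = type::value;  // convenience, not in the notes
};

// Hypothetical test types
struct Plain {};
struct NoCopyAssign { NoCopyAssign& operator=(const NoCopyAssign&) = delete; };
struct HasConstMember { const int x; };  // copy assignment implicitly deleted

static_assert(is_copy_assignable<int>::value, "");
static_assert(is_copy_assignable<Plain>::value, "");
static_assert(!is_copy_assignable<NoCopyAssign>::value, "");
static_assert(!is_copy_assignable<HasConstMember>::value, "");
```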

Example: testing for copy-assignability, and requiring an lvalue reference return type

The above example doesn't force a requirement on the copy assignment returning an lvalue reference.

If we give the type of the assignment expression a name using an alias template:

template<class T>
using copy_assignment_t = decltype(std::declval<T&>() = std::declval<const T&>());

We can then check whether that is a T& in a SFINAE specialisation

// void_t is std::void_t from C++17 onwards; in C++11/14 you need to define it yourself
template<class T, class=void>
struct is_copy_assignable
    : std::false_type {};

template<class T>
struct is_copy_assignable<T, void_t<copy_assignment_t<T>>>
    : std::is_same<copy_assignment_t<T>,T&> {};
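
Put together, the void_t version might look like the sketch below. Since std::void_t only arrived in C++17, this assumes a hand-rolled void_t built on a helper struct (make_void is a name chosen here, not from the notes), and the three test types are invented for illustration:

```cpp
#include <type_traits>
#include <utility>

// C++11-friendly void_t: maps any list of well-formed types to void
template<class...> struct make_void { using type = void; };
template<class... Ts> using void_t = typename make_void<Ts...>::type;

// type of the expression T& = const T&, if that expression is valid
template<class T>
using copy_assignment_t = decltype(std::declval<T&>() = std::declval<const T&>());

// primary template: used when copy_assignment_t<T> is ill-formed
template<class T, class = void>
struct is_copy_assignable : std::false_type {};

// specialisation: used when the assignment is valid; true only if it yields T&
template<class T>
struct is_copy_assignable<T, void_t<copy_assignment_t<T>>>
    : std::is_same<copy_assignment_t<T>, T&> {};

// Hypothetical test types
struct Regular {};
struct ReturnsVoid { void operator=(const ReturnsVoid&); };
struct Deleted { Deleted& operator=(const Deleted&) = delete; };

static_assert(is_copy_assignable<Regular>::value, "");
static_assert(!is_copy_assignable<ReturnsVoid>::value, "");  // valid, but not T&
static_assert(!is_copy_assignable<Deleted>::value, "");      // not valid at all
```

ReturnsVoid is the interesting case: the assignment compiles, so the specialisation is selected, but is_same then rejects it because the result is void rather than ReturnsVoid&.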