Petaflop: 2011

Wednesday, 14 December 2011

dynamically linked libraries

The ELF file format has an RPATH entry (DT_RPATH, in the dynamic section) that lists hardcoded paths to search for libraries

You can find the RPATH that an application has:
readelf -d app_name | grep RPATH

Hardcoding paths into an application is not very elegant. It is better to allow the Linux dynamic linker to locate the libraries for you.

Fedora has /etc/ld.so.conf.d/, a directory of config files that list library paths. You can add foo-x86_64.conf containing the library path and then run ldconfig (as root) to rebuild the linker cache. The dynamic linker (ld.so) will then search that path whenever it needs to locate libfoo.so

Tuesday, 13 December 2011

Fedora 16: enable core files

/proc/sys/kernel/core_pattern is used to specify a core dumpfile pattern name.
If the first character of the pattern is a '|', the kernel will treat the rest of the pattern as a command to run. The core dump will be written to the standard input of that program instead of to a file.

The default core_pattern is:
|/usr/libexec/abrt-hook-ccpp %s %c %p %u %g %t e

To get a plain core file instead, write core.%p into core_pattern (as root):
echo "core.%p" > /proc/sys/kernel/core_pattern

Then put ulimit -c unlimited into your .bashrc so the shell's core file size limit does not suppress the dump.

Monday, 12 December 2011

Fedora 16: Yum and package management

list installed packages
yum list installed

list available packages (not yet installed)
yum list available

list all packages (installed and available)
yum list all

list all package groups
yum grouplist

list all repos
yum repolist

info on a package
yum info package_name

install a package
yum install package_name

install a package group
yum groupinstall "Group package name"

uninstall a package
yum remove package_name

uninstall a package group
yum groupremove "Group package name"

list the files installed by a package (shows where it was installed)
rpm -ql package_name

Sunday, 11 December 2011

gcc inline assembler

Register Naming: Register names are prefixed with %, so that registers are %eax, %cl.
Ordering of operands: the order of operands is source(s) first and destination last. "mov %edx, %eax" means move the contents of the edx register to eax.
Operand Size: size of memory operands is determined from the last character of the op-code name. The suffix is b for (8-bit) byte, w for (16-bit) word, and l for (32-bit) long. For example, the correct syntax for the above instruction would have been "movl %edx, %eax".
Immediate Operand: Immediate operands are marked with a $ prefix, as in "addl $5, %eax", which means add the immediate long value 5 to register %eax.
Memory Operands: Missing operand prefix indicates it is a memory-address; hence "movl $bar, %ebx" puts the address of variable bar into register %ebx, but "movl bar, %ebx" puts the contents of variable bar into register %ebx.
Indexing: Indexing or indirection is done by enclosing the index register or indirection memory cell address in parentheses. For example, "movl 8(%ebp), %eax" (moves the contents at offset 8 from the cell pointed to by %ebp into register %eax).

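inline assembler is invoked with __asm__("instruction; instruction; instruction"); the instructions go inside one string, separated by semicolons or newlines (adjacent string literals are concatenated, as in the third example below)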

__asm__ ("movl %ebx, %eax"); // moves the contents of ebx register to eax
__asm__ ("movb %ch, (%ebx)"); // moves the byte from ch to the memory pointed by ebx

// Add 10 and 20 and store result into register %eax
__asm__ ( "movl $10, %eax;"
          "movl $20, %ebx;"
          "addl %ebx, %eax;" );

Can optionally specify the operands. It allows us to specify the input registers, output registers and a list of clobbered registers.

__asm__ ( "assembly code"
           : output operands                  // optional
           : input operands                   // optional
           : list of clobbered registers );   // optional

If there are no output operands but there are input operands, we must place two consecutive colons surrounding the place where the output operands would go.

The clobber list is syntactically optional, but GCC does not analyse the assembly text. Any register the code modifies that is not listed as an output operand must appear in the clobber list, otherwise GCC may assume its value is unchanged.

Distilled from here: http://www.codeproject.com/KB/cpp/edujini_inline_asm.aspx

More info: http://ibiblio.org/gferg/ldp/GCC-Inline-Assembly-HOWTO.html

Monday, 5 December 2011

Fedora 16: yum install c++ programming libraries

sudo yum install \
gcc.x86_64 \
gcc-c++.x86_64 \
boost.x86_64 \
boost-devel.x86_64 \
glibc.x86_64 \
glibc-common.x86_64 \
glibc-debuginfo.x86_64 \
glibc-debuginfo-common.x86_64 \
glibc-devel.x86_64 \
glibc-headers.x86_64 \
glibc-static.x86_64 \
glibc-utils.x86_64 \
libstdc++.x86_64 \
libstdc++-devel.x86_64 \
libstdc++-docs.x86_64 \
libstdc++-static.x86_64

.bashrc

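# colour prompt (escape sequences are wrapped in \[ \] so bash does not count them in the prompt width):
export PS1="\[\e[0;35m\]\!\[\e[m\] \[\e[0;36m\]\u@\h\[\e[m\] \[\e[0;32m\]\w \$ \[\e[m\]"

\e[x;ym<PS1>\e[m

\e[ Start colour sequence
x;ym Colour
<PS1> prompt sequence
\e[m End colour sequence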

0;35 = magenta

0;36 = cyan

0;32 = green

\! = history number

\u = user

\h = hostname

\w = full current path

# set xterm title bar

function title()
{
PREFIX=$@;
export PROMPT_COMMAND='echo -ne "\033]0;${PREFIX} [${USER}@${HOSTNAME}: ${PWD/$HOME/~}]\007"'
}

export HISTCONTROL=ignoredups

# display a tree of current directory and below...

alias tree="ls -R | grep ':$' | sed -e 's/:$//' -e 's/[^-][^\/]*\//--/g' -e 's/^/ /' -e 's/-/|/'"

.vimrc

set autoindent
set cindent
set shiftwidth=4
set tabstop=4
set expandtab

" turn off auto indent in order to paste code
set pastetoggle=<F2>

" syntax highlighting
colorscheme desert

" enable syntax highlighting in bash vi mode [http://stackoverflow.com/questions/7115324/syntax-highlighting-in-bash-vi-input-mode]
if expand('%:t') =~?'bash-fc-\d\+'
setfiletype sh
endif

" allow backspacing over everything in insert mode
set backspace=indent,eol,start
" hides buffers instead of closing them (allows opening a new file without writing unsaved changes to current file)
set hidden
" highlight search terms
set hlsearch
" show search matches as you type
set incsearch

" don't create temporary files
set nobackup
set noswapfile

" remap ';' to ':' - increases the speed of most commands
nnoremap ; :

" clear highlighted search terms by pressing ,/
nmap <silent> ,/ :nohlsearch<CR>

" netrw file explorer - show tree view directory listing style
let g:netrw_liststyle=3

Saturday, 3 December 2011

OSX: enable nfs

Turn off automount
add AUTOMOUNT=-NO- to /etc/hostconfig

create mount point in fs
sudo mkdir /mnt/raid

add mount command to /etc/fstab
nas:/mnt/raid /mnt/raid nfs noauto,rw,async

restart

Fedora 16: NFS Server

In /etc/exports, add the path you want to share and the IP address or range you want to give access to, followed by some options

/mnt/raid 192.168.1.0/255.255.255.0(rw,no_subtree_check,async)

rw: read-write access

no_subtree_check: speeds up access by turning off the check that each requested file lies inside the exported subtree of the volume

async: speeds up access by letting the server reply before changes have been committed to disk (data can be lost if the server crashes)

start the requisite services

systemctl start rpcbind.service

systemctl start nfs-server.service

systemctl start nfs-lock.service

systemctl start nfs-idmap.service

enable automatic restart

systemctl enable rpcbind.service

systemctl enable nfs-server.service

systemctl enable nfs-lock.service

systemctl enable nfs-idmap.service

display exported directories
showmount --exports

Friday, 2 December 2011

Fedora 16: Install Squeezebox Server

Obtain repository
rpm -Uvh http://repos.slimdevices.com/yum/squeezecenter/squeezecenter-repo-1-6.noarch.rpm

yum install squeezeboxserver

sudo ln -sf /usr/lib/perl5/vendor_perl/Slim /usr/lib64/perl5/vendor_perl/Slim

yum install perl-CPAN
perl -MCPAN -e'install Log::Log4perl'
perl -MCPAN -e'install CGI::Cookie'

For some reason the CGI::Cookie install failed. If that happens, start the cpan shell and force the install:

cpan
force install CGI::Cookie

sudo systemctl start squeezeboxserver.service
sudo systemctl status squeezeboxserver.service

configure it - browse to localhost:9000

Wednesday, 30 November 2011

Fedora 16 - enable vnc server

install vnc server
$ sudo yum install tigervnc-server

configure vnc server
$ sudo su
# cp /lib/systemd/system/vncserver@.service /etc/systemd/system/vncserver@:1.service

edit the configuration to specify the user and set the required geometry
User=steve
ExecStart=/usr/bin/vncserver %i -geometry 2560x1440

# systemctl daemon-reload

set the user's vnc password
# su - steve
# vncpasswd
# exit

launch
# systemctl start vncserver@:1.service

enable for restart
# systemctl enable vncserver@:1.service

Fedora 16: mount software raid

change to root
$ sudo su

create a mount point for the raid
# mkdir /mnt/raid

save old mdadm config (just in case)
# mv /etc/mdadm.conf /etc/mdadm.conf.bak

copy raid details into mdadm config
# mdadm --detail --scan > /etc/mdadm.conf
> creates a file somewhat like this:
> ARRAY /dev/md/NAS:0 metadata=1.2 name=NAS:0 UUID=4dc53f9d:f0c55279:a9cb9592:a59607c9

add the raid configuration to your /etc/fstab file (notice the 1st field is the device, which is the same as the 2nd field returned from mdadm --detail --scan)
/dev/md/NAS:0 /mnt/raid ext4 defaults 1 2

mount the raid
# mount -a

check its status
# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md127 : active raid5 sdb1[2] sde1[1] sdd1[4] sda1[0]
5860538880 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/4] [UUUU]

unused devices: <none>

Fedora 16: Enable ssh

start ssh service
$ sudo systemctl start sshd.service

enable ssh service to restart upon reboot
$ sudo systemctl enable sshd.service

make sure port 22 is open
$ system-config-firewall

enable public key exchange
generate keys
$ ssh-keygen -t rsa
create ssh directory on server
$ ssh srvr_hostname "mkdir .ssh; chmod 0700 .ssh"
copy public key to server (this overwrites any existing authorized_keys file; authorized_keys2 is a deprecated name)
$ scp .ssh/id_rsa.pub srvr_hostname:.ssh/authorized_keys

Monday, 28 November 2011

Lightweight unit test framework

https://github.com/philsquared/Catch

Wednesday, 23 November 2011

Lean architecture philosophy

Dependency injection (http://en.wikipedia.org/wiki/Dependency_injection) allows system behaviour customisation without binary changes

However, allowing for this hypothetical need violates YAGNI (http://en.wikipedia.org/wiki/You_ain%27t_gonna_need_it).

It is better to adopt a lean architecture, and decide as late as possible (http://en.wikipedia.org/wiki/Lean_software_development#Decide_as_late_as_possible)

"Always implement things when you actually need them, never when you just foresee that you need them".

This is also how Unix succeeded with its "KISS" principle and rule of simplicity (http://en.wikipedia.org/wiki/Unix_philosophy).

MTS malloc

http://www.newcodeinc.com/

C++11 Lambdas

Lambdas are unnamed function object classes which define a function call operator

lambda-introducer
In a lambda such as [](int n) { cout << n << " "; }, the [] is the lambda-introducer for the stateless lambda, and it tells the compiler that a lambda expression is beginning

lambda-parameter-declaration
The (int n) is the lambda-parameter-declaration, which tells the compiler what the unnamed function object class's function call operator should take. This syntactically consists of: ( lambda-parameter-declaration-list_opt ) mutable_opt exception-specification_opt lambda-return-type-clause_opt

lambda-compound-statement

The { cout << n << " "; } is the lambda-compound-statement which serves as the body of the unnamed function object class's function call operator. By default, the unnamed function object class's function call operator returns void.

lambda-return-type-clause: an optional return type
If a lambda's compound-statement is { return expression; } , then the lambda's return type will be automatically deduced to be the type of expression

eg:

transform(v.begin(), v.end(), front_inserter(d), [](int n) { return n * n * n; });

If the lambda-compound-statement is anything other than a single return statement, C++11 requires you to state the return type explicitly, with -> type (C++14 later relaxed this rule)

eg:

transform(v.begin(), v.end(), front_inserter(d), [](int n) -> double {
    if (n % 2 == 0) {
        return n * n * n;
    } else {
        return n / 2.0;
    }
});

Passing state into the lambda

If you want to pass state into your lambda, you must "capture" local variables. The empty lambda-introducer [] says "I am a stateless lambda", but within the lambda-introducer you can specify a capture-list:

eg:

v.erase(remove_if(v.begin(), v.end(), [x, y](int n) { return x < n && n < y; }), v.end());

The compound-statement { return x < n && n < y; } serves as the body of the function call operator within that class. Although the compound-statement is lexically within the scope of main(), it is conceptually outside the scope of main(), so you can't use local variables from main() without capturing them within the lambda.

Note that

(a) the captured copies can't be modified within the lambda, because by default the function call operator is const

(b) some objects are expensive to copy

(c) updates to the local variables will not be reflected in the captured copies (You can clearly see that the captures are "by value")

Passing all state into the lambda

You can also "capture everything by value". The syntax for this is the lambda-introducer [=] (the capture-default = is supposed to make you think of assignment or copy-initialization Foo foo = bar;)

eg:

v.erase(remove_if(v.begin(), v.end(), [=](int n) { return x < n && n < y; }), v.end());

When the compiler sees x and y mentioned within the lambda, it captures them from the surrounding scope by value.

Modifying the captured state

By default, a lambda's function call operator is const, but you can make it non-const by saying mutable:

eg:

for_each(v.begin(), v.end(), [=](int& r) mutable {
    const int old = r;
    r *= x * y;
    x = y;
    y = old;
});

Modifying external state

Capture by reference. The syntax for doing this is the lambda-introducer [&x, &y]

eg:

for_each(v.begin(), v.end(), [&x, &y](int& r) {
    const int old = r;
    r *= x * y;
    x = y;
    y = old;
});

Passing all state into the lambda by reference

You can also "capture everything by reference". The syntax for this is the lambda-introducer [&]

Modifying both the captured and external state

You can capture by reference and make the lambda mutable

Mix capture by value and capture by reference

You can capture some (or all: '=') variables by value, and some by reference

eg:

for_each(v.begin(), v.end(), [=, &sum, &product](int& r) mutable {
    sum += r;
    if (r != 0) {
        product *= r;
    }
    const int old = r;
    r *= x * y;
    x = y;
    y = old;
});

The opposite lambda-introducer [&, x, y] would produce exactly the same result (capture everything by reference, except x and y by value).

Note that only local variables can be captured. Global and static variables cannot be captured (and do not need to be: they can be used directly), and class member variables cannot be captured by name.

Note that this can be captured like a local variable, so you can pass [this] to a lambda, which allows you to access member variables implicitly (without having to dereference this)

Passing everything by value [=] will implicitly capture this.

eg:

for_each(v.begin(), v.end(), [this](int n) {
    cout << "If you gave me " << n << " toys, I would have " << n + m_toys << " toys total." << endl;
});

Nullary lambdas

If you want a nullary lambda (taking no arguments), you can elide the lambda-parameter-declaration entirely.

eg:

generate_n(back_inserter(v), 10, [&] { return i++; });

Note that if you want to say mutable or -> ReturnType, you need empty parentheses between that and the lambda-introducer. (you can't elide the arguments).

Storing lambdas

You can store lambdas using the auto construct

eg:

auto g = [](int n) { cout << n * n * n << " "; };

g(5);

You can also store lambdas in a matching std::function object (std::tr1::function in TR1 implementations), but this comes with overhead

eg:

std::tr1::function<void (int)> g = [](int n) { cout << n * n * n << " "; };

Disruptor Concurrent Programming Framework

http://code.google.com/p/disruptor/

Interactive Linux Kernel Map

http://www.makelinux.net/kernel_map.shtml

Tuesday, 22 November 2011

Page faults

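Major page faults: the page is not in physical memory (for example it was swapped out) and has to be read back in from disk
Minor page faults: the page is already in physical memory but is not yet mapped into the process's page tables, so the kernel only has to update the mapping

mlockall causes all pages mapped by a process to remain resident in memory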

#include <sys/mman.h>

int mlockall(int flags);
int munlockall(void);

flags:
MCL_CURRENT: Lock all of the pages currently mapped into the address space of the process.
MCL_FUTURE: Lock all of the pages that become mapped into the address space of the process in the future, when those mappings are established.

Prefaulting the stack prevents minor page faults later on.
At the start of a thread, push and then pop a large block of data onto the stack (for example by touching every page of a large local buffer).

Spinlocks

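With worker threads pinned to cores, spinlocks are an effective synchronisation mechanism.

The declarations below are the in-kernel API; user-space threads would use pthread_spin_lock() and friends from <pthread.h> instead.

#include <linux/spinlock.h>

DEFINE_SPINLOCK(my_lock); /* replaces the deprecated SPIN_LOCK_UNLOCKED initialiser */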
void spin_lock(spinlock_t *lock);
void spin_unlock(spinlock_t *lock);

Real time scheduling

Main worker threads will have been pinned to cores. Now we need to prevent them from being preempted.

Set the scheduler type to SCHED_FIFO, and set the thread priority to maximum

#include <sched.h>

int sched_setscheduler(pid_t pid, int policy,
const struct sched_param *param);

struct sched_param {
...
int sched_priority;
...
};

int sched_get_priority_max(int policy);

http://linux.die.net/man/2/sched_setscheduler
http://linux.die.net/man/2/sched_get_priority_max

Pinning threads to cores

Enforcing cache locality and allowing a deterministic view of where threads are running.

#define _GNU_SOURCE
#include <pthread.h>
int pthread_setaffinity_np(pthread_t th,
size_t size,
const cpu_set_t *cpuset);
int pthread_getaffinity_np(pthread_t th,
size_t size,
cpu_set_t *cpuset);
int pthread_attr_setaffinity_np(
pthread_attr_t *at,
size_t size,
const cpu_set_t *cpuset);
int pthread_attr_getaffinity_np(
pthread_attr_t *at,
size_t size,
cpu_set_t *cpuset);

Only specific worker threads should be pinned to cores.
There should be a maximum of n-1 worker threads, where n is the number of cores.
Supplementary threads should be left unpinned. They will most likely run on the nth core.

http://www.kernel.org/doc/man-pages/online/pages/man3/pthread_setaffinity_np.3.html

Saturday, 12 November 2011

io

epoll
eventfd for inter-thread comms
epollfd for socket comms

#include <sys/epoll.h>

int epoll_create (int size);

A successful call to epoll_create() instantiates a new epoll context, and returns a file descriptor associated with the instance.
The size parameter was originally a hint to the kernel about the number of file descriptors that are going to be monitored; it is not the maximum number. Since Linux 2.6.8 the value is ignored, but it must still be greater than zero.

once done, release the resources with close(fd)

int epoll_ctl (int epfd, int op,
int fd,
struct epoll_event *event);

The epoll_ctl() system call can be used to add file descriptors to and remove file descriptors from a given epoll context

struct epoll_event
{
__u32 events; /* events */
union
{
void *ptr;
int fd;
__u32 u32;
__u64 u64;
} data;
};

https://www.kernel.org/doc/man-pages/online/pages/man2/eventfd.2.html
http://stackoverflow.com/search?q=epoll
http://linux.derkeiler.com/Mailing-Lists/Kernel/2006-03/msg00084.html
http://linux.die.net/man/2/epoll_ctl
http://en.wikipedia.org/wiki/Epoll
http://www.kegel.com/c10k.html
http://bulk.fefe.de/scalability/
http://pl.atyp.us/content/tech/servers.html
http://www.citi.umich.edu/projects/linux-scalability/reports/accept.html
http://www.devshed.com/c/a/BrainDump/Linux-Files-and-the-Event-Poll-Interface/
