Compiling XDR

rpcgen compiler
rpcgen -C square.x

square_svc.c => server stub and skeleton
– main => registration / housekeeping
– square_prog_1
=> internal code, request parsing, arg marshaling
=> _1 == version 1
– square_proc_1_svc => actual procedure: must be impl by developer

Summarizing RPC Development
From .x => header, stubs…
Developer
-server code
impl of square_proc_1_svc
– client side
call square_proc_1()

rpcgen -C square.x => not thread safe!
y = square_proc_1(&x, client_handle)
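
A minimal sketch of the server-side procedure the developer has to write, and of why the default output is not thread safe. The struct definitions and the opaque `struct svc_req` are stand-ins (assumptions) for what the generated header and the RPC runtime would provide in a real build; the point is the `static` result, which every call shares.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the types that would come from the generated square.h
   (assumption: declared locally so the sketch compiles on its own). */
typedef struct { int arg1; } square_in;
typedef struct { int res1; } square_out;
struct svc_req;  /* opaque here; supplied by the RPC runtime in a real build */

/* Developer-written procedure. The result lives in static storage so it is
   still valid after the function returns, which also means concurrent
   callers overwrite each other's results: not thread safe. */
square_out *square_proc_1_svc(square_in *argp, struct svc_req *rqstp)
{
    static square_out result;
    (void)rqstp;                 /* unused in this sketch */
    assert(argp != NULL);
    result.res1 = argp->arg1 * argp->arg1;
    return &result;
}
```

Each call returns a pointer to the same static `result`, so a second call replaces the value the first caller is still reading.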

RPC daemon == portmapper
/sbin/portmap (needs sudo privileges)
Query with rpcinfo -p
/usr/sbin/rpcinfo -p
program id, version, protocol(tcp, udp)
socket port number, service name…

Binding

client determines…
Which server should it connect to?
-service name, version number…
How will it connect to that server?
-IP address, network protocol,…

Registry == database of available services
-search for service name to find service (which) and contact details (how)
-distributed
any RPC service can register
-machine-specific
for services running on same machine
client must know machine address
-> registry provides port number needed for connection
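
A machine-specific registry can be pictured as a small table keyed by program id, version, and protocol. The sketch below is illustrative only: the entry layout, the `registry_lookup` name, and the port numbers are assumptions, not the portmapper's actual data structures.

```c
#include <assert.h>
#include <stddef.h>

enum { PROTO_TCP = 6, PROTO_UDP = 17 };   /* IP protocol numbers */

struct reg_entry {
    unsigned long  prog;    /* program id, e.g. 0x31230000 */
    unsigned long  vers;    /* version number */
    int            proto;   /* PROTO_TCP or PROTO_UDP */
    unsigned short port;    /* port the service listens on */
};

/* Hypothetical registrations; the port numbers are made up. */
static const struct reg_entry registry[] = {
    { 0x31230000UL, 1, PROTO_TCP, 40001 },
    { 0x31230000UL, 1, PROTO_UDP, 40002 },
};

/* Returns the registered port, or -1 if no matching service is registered. */
int registry_lookup(unsigned long prog, unsigned long vers, int proto)
{
    assert(proto == PROTO_TCP || proto == PROTO_UDP);
    for (size_t i = 0; i < sizeof registry / sizeof registry[0]; i++) {
        if (registry[i].prog == prog && registry[i].vers == vers &&
            registry[i].proto == proto)
            return registry[i].port;
    }
    return -1;
}
```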

Applications use binding & registries

procedure interface: foo(int, int*)
in local calls: foo(x, y) => ok
in remote calls: foo(x, y) => ???
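
The problem with foo(x, y) in a remote call is that y is an address in the caller's address space and means nothing on the server. One common way out (an assumption here, the notes leave it open) is to copy the pointed-to value into the message instead of the pointer. A sketch with hypothetical helper names; byte order is ignored, whereas a real encoding such as XDR fixes it.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Client side: serialize x and the VALUE that y points to (not the pointer).
   Returns the number of bytes written into buf. */
size_t marshal_foo_args(int32_t x, const int32_t *y, unsigned char *buf)
{
    assert(y != NULL && buf != NULL);
    memcpy(buf, &x, sizeof x);
    memcpy(buf + sizeof x, y, sizeof *y);
    return sizeof x + sizeof *y;
}

/* Server side: rebuild the arguments; y_copy is server-local storage that
   the server's foo() can receive a pointer to. */
void unmarshal_foo_args(const unsigned char *buf, int32_t *x, int32_t *y_copy)
{
    assert(buf != NULL && x != NULL && y_copy != NULL);
    memcpy(x, buf, sizeof *x);
    memcpy(y_copy, buf + sizeof *x, sizeof *y_copy);
}
```

If foo modifies *y, the updated value has to travel back in the reply the same way.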

Handling partial failure
when a client hangs...
– server down? service down? network down? message lost?
– timeout and retry => no guarantees
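
Timeout and retry can be sketched as a bounded loop around one attempt. The names `call_with_retry` and `flaky_attempt` are hypothetical, and an attempt returning nonzero stands in for a timeout. Exhausting the retries tells the client nothing about why the call failed, which is the "no guarantees" part.

```c
#include <assert.h>
#include <stddef.h>

/* One attempt at the remote call: returns 0 on success, nonzero on timeout. */
typedef int (*rpc_attempt_fn)(void *ctx);

/* Returns the number of attempts used on success, or -1 once max_tries
   attempts have all timed out. */
int call_with_retry(rpc_attempt_fn attempt, void *ctx, int max_tries)
{
    assert(attempt != NULL && max_tries > 0);
    for (int i = 0; i < max_tries; i++) {
        if (attempt(ctx) == 0)
            return i + 1;
    }
    return -1;  /* server down? network down? message lost? unknown */
}

/* Demo stand-in for a remote call: times out until the counter hits 0. */
int flaky_attempt(void *ctx)
{
    int *failures_left = ctx;
    if (*failures_left > 0) {
        (*failures_left)--;
        return -1;
    }
    return 0;
}
```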

struct square_in {
	int arg1;
};
struct square_out {
	int res1;
};

program SQUARE_PROG {
	version SQUARE_VERS {
		square_out SQUARE_PROC(square_in) = 1;
	} = 1;
} = 0x31230000;

Remote Procedure Calls

– remote procedure calls
– “implementing remote procedure calls”

client-server, create and init sockets, allocate and populate buffers, include ‘protocol’ info, copy data into buffers(file name, file)

RPC == intended to simplify the development of cross-address-space & cross-machine interactions

Benefits of RPC

higher-level interface for data movement & communication
error handling
hiding complexities of cross-machine interactions

1. client/server interactions
2. Procedure call interface => RPC
– sync. call semantics

Interface specification with IDL
an IDL used to describe the interface the server exports
– procedure name, arg & result types
– version #

struct data_in {
	string vstr<128>;
};

struct data_out {
	string vstr<128>;
};

program MY_PROG {
	version MY_VERS {
		data_out MY_PROC(data_in) = 1;
	} = 1;
} = 0x31230000;

public interface Hello extends Remote {
	public String sayHello(String s) throws RemoteException;
} // Java RMI

Memory Virtualization Full

Full virtualization
– all guests expect contiguous physical memory, starting at 0
– virtual vs. physical vs. machine addresses and page frame numbers

option 1
-guest page table: VA -> PA
-hypervisor: PA -> MA

option 2
-guest page tables: VA -> PA
-hypervisor shadow PT: VA -> MA
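
A shadow page table can be seen as the composition of the guest's VA -> PA mapping with the hypervisor's PA -> MA mapping. The sketch below uses tiny arrays indexed by page number; the array representation, the `NPAGES` size, and the -1 "unmapped" marker are assumptions for illustration, not how a real hypervisor stores page tables.

```c
#include <assert.h>

#define NPAGES 4   /* toy address space: 4 pages */

/* guest_pt[v] = guest-physical page for virtual page v (-1 if unmapped)
   p2m[p]      = machine page backing guest-physical page p
   shadow[v]   = machine page for virtual page v (-1 if unmapped) */
void build_shadow(const int guest_pt[NPAGES], const int p2m[NPAGES],
                  int shadow[NPAGES])
{
    assert(guest_pt && p2m && shadow);
    for (int v = 0; v < NPAGES; v++) {
        int pa = guest_pt[v];
        shadow[v] = (pa >= 0 && pa < NPAGES) ? p2m[pa] : -1;
    }
}
```

Whenever the guest changes its own page table, the corresponding shadow entry has to be rebuilt.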

device access control split between
-front-end driver in guest VM (device API)
-back-end driver in service VM(or host)
-modified guest drivers
=> i.e., limited to paravirtualized guests

eliminate emulation overhead

Virtualization

Virtualization allows concurrent execution of multiple OSs (and their applications) on the same physical machine.
virtual resources == each OS thinks that it “owns” hardware resources
-virtual machine (VM) == OS + applications
+ virtual resources (guest domain)
-virtualization layer == management of physical hardware (virtual machine monitor, hypervisor)

A virtual machine … is an efficient, isolated duplicate of the real machine
supported by a virtual machine monitor (VMM)

Bare-metal Virtualization
Xen (open source or Citrix XenServer)
– dom0 and domUs
– drivers in dom0
ESX(VMware)
– many open APIs
– drivers in VMM
– used to have Linux control core, now remote APIs

Hosted
– host os owns all hardware
– special VMM module provides hardware interfaces to VMs and deals with VM context switching
Example: KVM
(kernel-based VM)
– based on Linux
– KVM kernel module + QEMU for hardware virtualization
– leverages Linux open-source community

Hardware Protection Levels
commodity hardware has more than 2 protection levels
e.g., x86 has 4 protection levels(rings)
ring 3: lowest privilege(apps)
ring 0: highest privilege(os)
-non-root: VMs
-root:

x86 pre 2005
– 4 rings, no root/non-root modes yet
– hypervisor in ring 0, guest OS in ring
But: 17 privileged instructions do not trap; they fail silently!
hypervisor doesn’t know, so it doesn’t try to change settings
os doesn’t know, so assumes change was successful

Binary Translation
-goal: full virtualization == guest OS not modified
-approach: dynamic binary translation
1. inspect code block to be executed
2. if needed, translate to alternate instruction sequence
-e.g., to emulate desired behavior, possibly even avoiding trap
3. otherwise, run at hardware speeds
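
A toy sketch of the three steps: scan a block, rewrite only the instructions that need it, and leave everything else untouched. The `enum op` instruction set and the `translate_block` name are hypothetical; real translators work on machine code and cache translated blocks.

```c
#include <assert.h>
#include <stddef.h>

/* Toy instruction set. OP_SENSITIVE stands for an instruction that would
   fail silently instead of trapping when run outside ring 0. */
enum op { OP_ADD, OP_LOAD, OP_SENSITIVE, OP_EMULATE_CALL };

/* Copies the block from in to out, replacing each sensitive instruction
   with a call into the emulation code. Returns how many instructions were
   rewritten; 0 means the block can run unmodified at hardware speed. */
size_t translate_block(const enum op *in, enum op *out, size_t n)
{
    assert(in != NULL && out != NULL);
    size_t rewritten = 0;
    for (size_t i = 0; i < n; i++) {
        if (in[i] == OP_SENSITIVE) {
            out[i] = OP_EMULATE_CALL;
            rewritten++;
        } else {
            out[i] = in[i];
        }
    }
    return rewritten;
}
```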

I/O Devices as Files

The following Linux commands all perform the same operation on an I/O device(represented as a file)
– cp file /dev/lp
– cat file > /dev/lp
– echo "Hello, world" > /dev/lp
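
The same operation from a C program is an ordinary open/write/close on the device file. A minimal sketch; `write_to_device` is a hypothetical helper, and which path to pass (for example /dev/lp, or /dev/null to try it safely) depends on the system.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Writes msg to the device file at path.
   Returns the number of bytes written, or -1 on error. */
ssize_t write_to_device(const char *path, const char *msg)
{
    assert(path != NULL && msg != NULL);
    int fd = open(path, O_WRONLY);
    if (fd < 0)
        return -1;
    ssize_t n = write(fd, msg, strlen(msg));
    close(fd);
    return n;
}
```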


Direct Memory Access(DMA)
NIC, data == network packet
-write command to request packet transmission
-configure DMA controller with in-memory address and size of packet buffer

Typical Device Access
user process -> kernel -> driver -> device
-system call
-in-kernel stack
-driver invocation
-device request configuration

OS Bypass
-device regs/data
directly accessible
-OS configures, then stays out of the way
“user-level driver”
-relies on device features
-sufficient registers

processes use files => logical storage unit
kernel file system
– where, how to find and access file
– os specifies interface
generic block layer
– os standardized block interface
device driver

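// ...
#include <fcntl.h>      // open, O_RDONLY
#include <linux/fs.h>   // BLKGETSIZE
#include <sys/ioctl.h>  // ioctl
#include <unistd.h>     // close
// ...
int fd;
unsigned long numblocks = 0;

fd = open(argv[1], O_RDONLY);
ioctl(fd, BLKGETSIZE, &numblocks); // device size in 512-byte sectors
close(fd);
// ...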

I/O Management

have protocols
– interfaces for device I/O

have dedicated handlers
– device drivers, interrupt handlers…

decouple I/O details from core processing

input: keyboard, microphone
output: speaker, display
both of them: network interface card, flash card, hard disk drive

Basic I/O device features
control register
-command, data transfers, status
internal: microcontroller, memory, other hardware-specific chips

microcontroller == device’s CPU
on-device memory
other logic e.g. analog to digital converters

Peripheral Component Interconnect (PCI)
-PCI Express (PCIe) (> PCI-X > PCI)

Other types of interconnects
-scsi bus
-peripheral bus
-bridges handle differences

Device Drivers
-one per device type
-responsible for device access, management and control
-provided by device manufacturers per os/version

Queueing Lock Implementation

init:
	flags[0] = has-lock;
	flags[1..p-1] = must-wait;
	queuelast = 0; // global variable

lock:
	myplace = read_and_increment(queuelast); // atomically get ticket
	// spin
	while(flags[myplace mod p] == must-wait);
	// now in C.S.
	flags[myplace mod p] = must-wait;

unlock:
	flags[(myplace+1) mod p] = has-lock;

Under high loads
– queue best (most scalable), test_and_test_and_set worst
– static better than dynamic, ref better than release (avoids extra invalidations)

Using Reader/Writer Locks

#include <linux/spinlock.h>
rwlock_t m;
read_lock(&m);      // shared access
	// critical section
read_unlock(&m);

write_lock(&m);     // exclusive access
	// critical section
write_unlock(&m);

-rwlock support in Windows (.NET), Java, POSIX…
-read/write == shared/exclusive

semantic differences
-recursive read_lock … -> what happens on read_unlock

Monitors specify…
-shared resource
-entry procedure
-possible condition variables
On entry
-lock, check…
On exit
-unlock, check, signal

Monitors==high-level synchronization construct
-Mesa by Xerox PARC
-Java

Monitors == programming style
– enter/exit_critical_section in threads
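
C has no monitor construct, so the monitor becomes a programming style: every entry procedure locks on entry, checks its condition, and unlocks (and signals) on exit. A sketch with a mutex and a condition variable; the `slot_monitor` type and the procedure names are hypothetical.

```c
#include <assert.h>
#include <pthread.h>

/* Shared resource plus the lock and condition variable that guard it. */
struct slot_monitor {
    pthread_mutex_t lock;
    pthread_cond_t  nonzero;   /* signaled when count becomes > 0 */
    int             count;
};

#define SLOT_MONITOR_INIT \
    { PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0 }

/* Entry procedure: lock on entry; signal and unlock on exit. */
void monitor_put(struct slot_monitor *m)
{
    assert(m != NULL);
    pthread_mutex_lock(&m->lock);
    m->count++;
    pthread_cond_signal(&m->nonzero);
    pthread_mutex_unlock(&m->lock);
}

/* Entry procedure: lock, check the condition (waiting if needed), unlock. */
void monitor_take(struct slot_monitor *m)
{
    assert(m != NULL);
    pthread_mutex_lock(&m->lock);
    while (m->count == 0)
        pthread_cond_wait(&m->nonzero, &m->lock);
    m->count--;
    pthread_mutex_unlock(&m->lock);
}
```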

More Synchronization Constructs
spinlock, monitor, rwlock, condition variable, mutex, path expression, serializers

spinlock => basic sync construct
-alternative implementations of spinlocks
– generalize techniques

spinlock_init(lock):
	lock = free;

spinlock_lock(lock):
	spin:
		if(lock == free){ lock = busy; }
		else { goto spin; }

spinlock_unlock(lock):
	lock = free;

caches
– hide memory latency; memory “further away” due to contention

Shared memory design considerations

-different APIs/mechanisms for synchronization
-os provides shared memory, and is out of the way
-data passing/sync protocols are up to the programmer

large segment => manager for allocating/freeing mem from shared segment
many small segments => use pool of segments, queue of segment ids
=> communicate segment IDs among processes
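
For the single large segment case, the "manager" can be as small as a bump allocator over the segment. In this sketch a plain buffer stands in for the shared segment, and `seg_mgr`/`seg_alloc` are hypothetical names; freeing is left out, since a real manager would also need a free list and its own synchronization.

```c
#include <assert.h>
#include <stddef.h>

struct seg_mgr {
    unsigned char *base;   /* start of the (shared) segment */
    size_t         size;   /* total bytes in the segment */
    size_t         used;   /* bytes handed out so far */
};

/* Hands out n bytes, rounded up to a multiple of 8, from the segment.
   Returns NULL when the segment cannot satisfy the request. */
void *seg_alloc(struct seg_mgr *m, size_t n)
{
    assert(m != NULL && m->base != NULL && m->used <= m->size);
    if (n > (size_t)-1 - 7)
        return NULL;                        /* rounding would overflow */
    size_t rounded = (n + 7) & ~(size_t)7;
    if (rounded > m->size - m->used)
        return NULL;
    void *p = m->base + m->used;
    m->used += rounded;
    return p;
}
```

Processes would normally exchange offsets into the segment rather than raw pointers, since each process may map the segment at a different address.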

synchronization is like … waiting for a coworker to finish so you can continue working
-may repeatedly check to continue
-may wait for a signal to continue
-waiting hurts performance

mutexes
condition variables
why more?
error prone/correctness/ease-of-use
lack of expressive power

Low level support
-hardware atomic instructions

Spinlocks(basic sync construct)
spinlock is like a mutex
-mutual exclusion
-lock and unlock(free)

spinlock_lock(s);
	// critical section
spinlock_unlock(s);

semaphores
-common sync construct in OS kernels
-like a traffic light: STOP and GO
-similar to a mutex… but more general

on init
-assigned a max value(positive int)
on try(wait)
– if non-zero => decrement and proceed
if initialized with 1
– semaphore == mutex (binary semaphore)
on exit(post)
– increment

#include <semaphore.h>

sem_t sem;
int sem_init(sem_t *sem, int pshared, unsigned int value);
int sem_wait(sem_t *sem);
int sem_post(sem_t *sem);
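
A usage sketch of the binary-semaphore case: initialized with 1, the semaphore behaves like a mutex around a critical section. The `balance` variable and the function names are illustrative; `pshared` is 0 because the semaphore is shared between threads of one process only.

```c
#include <assert.h>
#include <semaphore.h>

static sem_t sem;
static int balance = 0;

/* Initial value 1 => at most one thread in the critical section. */
int account_init(void)
{
    return sem_init(&sem, 0, 1);
}

void deposit(int amount)
{
    int rc = sem_wait(&sem);   /* try: non-zero => decrement and proceed */
    assert(rc == 0);
    (void)rc;
    balance += amount;         /* critical section */
    sem_post(&sem);            /* exit: increment */
}
```

Initializing with a larger value would admit that many threads at once.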