/usr/web/sources/contrib/steve/Blog - Plan 9 from Bell Labs

Fri Feb 15 11:42:19 GMT 2008
	bought a PPC Mac.

	Attacked webphoto again and made some progress,
	but photofs (the exif expander) needs a rework
	before it's finished - the design had to change again;
	I never realised HTML was quite so badly specified.

	------

	Been thinking about cryptfs again.

	There are two separate problems:
		encrypting filenames
		encrypting data

	As we are using CBC mode we want to ensure filenames with the
	same prefix don't have the same encrypted prefix, so we add a
	salt of a known length to the start of the filename and pad it
	with random data.  The salt can be random bytes or even the
	checksum (adler being my favorite) of the name.

	The filename is then replaced by the base64 (with '/' replaced
	by some other character) of the encrypted filename.  However,
	this lengthens the filename, which may cause problems if the
	native filesystem holding the file has restrictions on
	filename length.  For example you may run cryptfs on top of a
	u9fs session onto an MSDOS box with 8.3 filenames - not likely
	but it could happen; of course adding a salt further
	exacerbates this problem.
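
	A minimal sketch of the salt-and-base64 scheme above, to show how
	quickly names grow.  The 4-byte adler32 salt, the length byte, the
	16-byte block size and the do-nothing placeholder_encrypt() are all
	assumptions made for illustration, not what cryptfs actually does.

```python
import base64
import os
import zlib

BLOCK = 16  # assumed cipher block size in bytes


def placeholder_encrypt(data: bytes) -> bytes:
    """Stand-in for the real CBC encryption; returns the data unchanged."""
    return data


def encode_name(name: str, encrypt=placeholder_encrypt) -> str:
    """Salt, pad, 'encrypt' and base64 a filename (names up to 255 bytes)."""
    raw = name.encode("utf-8")
    salt = zlib.adler32(raw).to_bytes(4, "big")   # salt derived from the name
    plain = salt + bytes([len(raw)]) + raw         # length byte lets us strip the pad
    plain += os.urandom(-len(plain) % BLOCK)       # random pad to a whole block
    # urlsafe base64 uses '-' and '_' in place of '+' and '/'
    return base64.urlsafe_b64encode(encrypt(plain)).decode("ascii")


def encoded_length(name_len: int) -> int:
    """Length of the stored name for a plaintext name of name_len bytes."""
    padded = -(-(4 + 1 + name_len) // BLOCK) * BLOCK
    return 4 * -(-padded // 3)
```

	Under these assumptions a 12 character 8.3 name such as
	autoexec.bat becomes 44 characters, which no 8.3 filesystem will hold.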

	Alternatively filenames could be stored in the start of the
	data section of the file and the filename replaced by a short
	random ASCII name.

	Encrypting the data is not too hard, though some additional
	entropy would be nice to make the system more secure - some
	more random data.

	I could use the filename but that would mean that every time
	the filename changed I would have to re-encrypt the file.  I
	could use the salt from the filename but I would like to have
	something reasonably big - 16 bytes feels good (wild guess I
	admit) and that would trigger the problems above.

	I could prepend the data file with a block of random data
	which is the salt for the filename and for the file data.
	This feels like a good solution but does grow the file a bit.

	The file should be encrypted block-by-block and reading or
	writing any part of the file means I must read or write a
	whole block.  I can also use the block index in the file to
	ensure that two blocks with the same data (e.g.  a file full
	of zeros) will encrypt to different data.
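
	The block-index idea can be sketched as below.  The block size, the
	use of SHA-256 to mix the per-file salt with the index, and the
	function names are assumptions for illustration only.

```python
import hashlib

BSIZE = 8192  # assumed cryptfs block size in bytes


def block_iv(file_salt: bytes, index: int) -> bytes:
    """Derive a 16-byte IV that differs for every block of every file."""
    return hashlib.sha256(file_salt + index.to_bytes(8, "big")).digest()[:16]


def blocks_touched(offset: int, count: int, bsize: int = BSIZE) -> range:
    """Block indices that must be read (and rewritten) for one read or write."""
    if count <= 0:
        return range(0)
    first = offset // bsize
    last = (offset + count - 1) // bsize
    return range(first, last + 1)
```

	A 4 byte write at offset 8190 touches blocks 0 and 1, so both must
	be read, decrypted, modified, re-encrypted and written back.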

	The problem comes if two people are accessing the same data
	via two separate invocations of cryptfs: then we have a race
	for a given block of an encrypted file.  If the file is held
	on plan9 I can set the exclusive bit on the file to provide
	some locking, but it could be on a remote ftp server via ftpfs;
	I can make no assumptions.

	If the host server is plan9 I could use Qid.version to determine
	if two users are accessing the same file and thus invalidating
	our cache, though this would mean a stat() on every read & write
	and it is a very coarse indicator - the blocks may well not overlap.

	I could use RC4 which is a stream cypher and can encrypt a
	single byte at a time, thus there is no need for a
	read-modify-write cycle on a block and so no race.  I had
	planned to offer a selection of algorithms: aes, des,
	triple-des etc.  There are some other stream cyphers but again
	I am straying from known territory and I am worried about
	weakening the security of cryptfs; I would be happiest using a
	tried and tested cypher.

	I do know that RC4 can be weak if the first few Kbytes of the
	stream are used (see the problems WEP has had) but provided
	this first part is discarded I believe it is pretty strong.
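
	A sketch of RC4 with the start of the keystream thrown away.  The
	default of 1024 dropped bytes and the function names are
	assumptions; this is for illustration, not a vetted implementation.

```python
def rc4_stream(key: bytes, drop: int = 1024):
    """Yield RC4 keystream bytes, discarding the first `drop` of them."""
    s = list(range(256))
    j = 0
    for i in range(256):                      # key scheduling
        j = (j + s[i] + key[i % len(key)]) & 0xFF
        s[i], s[j] = s[j], s[i]
    i = j = 0
    produced = 0
    while True:                               # keystream generation
        i = (i + 1) & 0xFF
        j = (j + s[i]) & 0xFF
        s[i], s[j] = s[j], s[i]
        k = s[(s[i] + s[j]) & 0xFF]
        if produced >= drop:
            yield k
        produced += 1


def rc4_crypt(key: bytes, data: bytes, drop: int = 1024) -> bytes:
    """Encrypt or decrypt (the operation is symmetric) by XOR with the keystream."""
    ks = rc4_stream(key, drop)
    return bytes(b ^ next(ks) for b in data)
```

	Because each byte is XORed independently, a single byte can be
	rewritten without touching its neighbours, which is the property
	that avoids the read-modify-write race.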

Tue Nov 13 16:12:46 GMT 2007
	Some success writing a spelling corrector, now in my contrib - suggest.c.bz2
	use it as
		spell document | suggest

	It permutes the word by alternately:
		deleting a letter,
		substituting a letter by any other,
		transposing a pair of letters,
		inserting any letter in any position.
	This gives it a set of possibilities.

	It then generates a soundex value for these words and compares
	that to the soundexes of all the words in the dictionary.  It
	prints the first few words found which have a short edit
	distance (i.e.  the number of deletions, insertions,
	transpositions, or substitutions) from the word we started with.
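
	The approach can be sketched roughly as follows.  This is not
	suggest.c: the function names, the limits and the particular
	soundex and edit-distance variants are assumptions for illustration.

```python
import string

LETTERS = string.ascii_lowercase
CODES = {ch: d for letters, d in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                                  ("l", "4"), ("mn", "5"), ("r", "6"))
         for ch in letters}


def permutations(word):
    """Every string one deletion, substitution, transposition or insertion away."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in splits if b]
    substitutes = [a + c + b[1:] for a, b in splits if b for c in LETTERS]
    transposes = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
    inserts = [a + c + b for a, b in splits for c in LETTERS]
    return set(deletes + substitutes + transposes + inserts)


def soundex(word):
    """Four character soundex code, e.g. Robert -> R163."""
    word = word.lower()
    out = word[0].upper()
    prev = CODES.get(word[0], "")
    for ch in word[1:]:
        digit = CODES.get(ch, "")
        if digit and digit != prev:
            out += digit
        if ch not in "hw":          # h and w do not separate equal codes
            prev = digit
    return (out + "000")[:4]


def edit_distance(a, b):
    """Deletions, insertions, substitutions and adjacent transpositions."""
    d = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        d[i][0] = i
    for j in range(len(b) + 1):
        d[0][j] = j
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = a[i - 1] != b[j - 1]
            d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1, d[i - 1][j - 1] + cost)
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


def suggest(word, dictionary, limit=5, max_distance=2):
    """Dictionary words that sound like a permutation and are a short edit away."""
    wanted = {soundex(p) for p in permutations(word) if p}
    hits = [w for w in dictionary
            if soundex(w) in wanted and edit_distance(word, w) <= max_distance]
    return sorted(hits, key=lambda w: (edit_distance(word, w), w))[:limit]
```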

	This is basically what GNU aspell does (it stores the
	metaphone dictionary in a dbm database, I calculate them every
	time), and produces a tool which is quick enough to be
	useful on modern hardware.

	I added another level of validation to reduce disk access,
	rejecting modifications that produce invalid trigrams based on
	those I found in the dictionary.  I was hoping this would
	speed up the program hugely - sadly it gave a 20% win which
	just about pays for itself.
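
	The trigram filter might look something like this.  Marking the
	word boundaries with '^' and '$' is an assumption of this sketch,
	as are the function names.

```python
def trigrams(word):
    """Set of three-letter sequences in a word, including its boundaries."""
    padded = f"^{word}$"
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def build_trigram_set(dictionary):
    """Collect every trigram that occurs in some dictionary word."""
    seen = set()
    for word in dictionary:
        seen |= trigrams(word)
    return seen


def plausible(candidate, seen):
    """Reject a candidate as soon as it contains a trigram never seen in the dictionary."""
    return trigrams(candidate) <= seen
```

	Candidates failing plausible() can be dropped before any
	dictionary lookup is attempted.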

Tue Oct 10 12:19:39 GMT 2007
	Had a play at generating a photo gallery automatically from a
	directory of thumbnail jpegs.  The important point for me is
	to require no configuration files and no initial "generate
	small images" script.  Modern digital cameras store quite a
	bit of metadata in their jpegs (the EXIF standard), and this
	usually includes a built-in thumbnail image.

	Spent an unreasonable amount of time understanding html to get
	this to work, and I should know more really - I am sure the HTML
	style police will be after me.

	Keeping the pages very simple (clean in my terms) - struggled
	to find some nice navigation arrows.  I got some from gimp
	which are OK but hopefully I will get some better ones later.

Wed Sep 19 12:59:46 BST 2007
	9win progress stopped - children eat all your time!

	Just some other thoughts: it would be good to have a plot(1)
	driver that generated a plan9 bitmap rather than only the one
	that talks to /dev/draw.  I know you can generate a bitmap
	from your window but this becomes much more difficult if you
	want to generate graphs offline (via cron) for inclusion in
	web pages.  Should be very easy really, in fact there may be
	something similar on comp.sources.unix or the like.

	Also I would love to write a samterm replacement which opens
	two or three files and does a two-way or three-way merge
	between them (by three-way I mean the two files have a known
	common ancestor).  This would be a GUI app which would allow
	you to take the text from either of two windows to be fed to
	the output or edit the output by hand.  sort of a GUI idiff(1)
	really.

Tue Aug 14 14:35:15 BST 2007
	Discovered to my amazement that Bruce Ellis is also working on
	something like 9win (he calls it 9ee).  I hope we can
	share some work.  I always planned to pick up the RangBoom
	windows IFS driver to allow all windows programs to access the
	9win/9ee files I create - i.e. both microsoft word and 9win's
	ls(1) will see /dev/cons /proc /fd /net /cron etc.  This is
	important as the biggest argument against Russ Cox's 9pm code
	was the mental shift necessary when moving between windows
	apps and plan9 ones.

Tue Jul 10 14:14:00 BST 2007
	Busy time of it lately, became a father which takes up more time than I
	thought possible, also very busy at work releasing products.

	9win is progressing, libc is now pretty much complete, all utf8
	compatible and uses no microsoft C runtime DLLs, only direct
	windows API calls, I only need to finish exec() and rfork(). 

	Exec is not too bad once you accept that you will have to run a thread
	to tidy up orphans which are not going to be waited for, and also to
	relay on Notes for the real child - as the plan9 pid (Windows Tid) will
	change when the process execs. There is also the fun of quoting for the
	execed command line args but this has been done before.

	rfork is more painful. Most flags can be simply implemented but rfork(RFPROC)
	is more difficult. The cygwin way is to CreateProcess() of yourself with
	the new process suspended and then modify its environment to be a mirror
	of your own; this is a good (albeit slow) fork emulation.
	
	The book Windows 2000 Native API Reference has an intriguing snippet
	which almost does a native fork; unfortunately cloning our .data segment
	also appears to clone the CSRSS's initialised state and so far the only
	way to force it to reconnect is to call an unpublished entrypoint (ugh!).

	A third alternative I have seen is to pretend we are doing a vfork() and
	switch internal context to the child process and return. Then on exec(),
	exit(), or abort() we longjmp back to the saved parent's thread. This assumes
	we only use rfork(RFPROC) as a prelude to exec() which may or may not be
	good enough for the plan9 apps...

	rfork(RFPROC) is also a bit knotty, I think I will just use a thread,
	copying the stack of the parent. This should work OK, though on return
	from the fork the app must not make assumptions about the addresses
	of automatics. This is very kludgy but I think I may get away with
	it given how most plan9 code is written, but I will have to experiment to
	find out.
	
	----------------

	Hackers again, I just got annoyed and blocked all ftp access from China
	in my firewall and the problems pretty much went away. This is a real shame
	but I don't want to waste time with these people.

Fri May 18 11:32:15 BST 2007
	did a bit more work on ps, reworked it to print some more useful info
	even if you don't have SE_DEBUG privilege. I also dropped win16 support
	(we don't really use it any more) and added an executable-type field
	in ps's usual state field.

	had another hack at 9win - getting closer, main stumbling block seems to
	be crt0 which I can pinch/hack from mingw for now (lcc might also have
	some good ideas). Thanks to russ for a gcc-ised libc/port/pool.c free
	of kenc-isms.

Wed May  9 11:01:42 BST 2007
	had a happy hacker trying to bruteforce my ftp account, no real
	danger of them succeeding as they are using the account name 
	Administrator. Interestingly if you google for their IP address
	it is listed on a couple of anti-brute-force IP lists.

	I modified ftpd to add an exponentially increasing delay on failed
	passwords and limit the number of attempts allowed. I also included
	a delay on the startup time of ftpd based on the number of failed
	sessions from that IP address. I submitted this as a patch but geoff
	quite rightly chopped it down - I was tired and got over excited.
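
	The delay policy described above could be sketched like this.  The
	constants and function names are invented for illustration and are
	not taken from the ftpd patch.

```python
BASE_DELAY = 1.0     # seconds after the first failure (assumed)
MAX_DELAY = 64.0     # cap so the delay cannot grow without bound (assumed)
MAX_ATTEMPTS = 5     # failed passwords allowed per session (assumed)


def failure_delay(failures: int) -> float:
    """Seconds to wait before replying, doubling with each failed password."""
    if failures <= 0:
        return 0.0
    return min(BASE_DELAY * 2 ** (failures - 1), MAX_DELAY)


def allow_attempt(failures: int) -> bool:
    """False once the session has used up its attempts."""
    return failures < MAX_ATTEMPTS
```

	The same failure_delay() could be applied at startup, keyed on the
	number of failed sessions seen from the connecting IP address.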

	I may have a go at adding similar functionality but in the
	/rc/bin/service scripts, based on a new exit status string from
	telnetd, sshserve, and ftpd.

Wed May  2 10:16:20 BST 2007
	Had a play at running Plan9 (386) binaries unchanged under Windows.

	Not as difficult as it sounds, you can memory map (CreateFileMapping())
	the executable file, MapViewOfFileEx() the text as executable
	(FILE_MAP_EXECUTE) and the data and bss as copy-on-write (FILE_MAP_COPY).
	The stack you can VirtualAllocEx() to the appropriate address.

	Windows executables are relocated on load, but you can force them to
	a known (fixed) address with linker options so the emulator could
	force itself to a location outside a normal plan9 executable's address space.

	Then you can use SetUnhandledExceptionFilter() to catch the "trap 64"s
	that represent plan9 system calls (and there are only 50 odd of those to
	emulate).

	I can even put a placeholder segment under and over the stack
	and after the heap so I can trap some memory access bugs.

	Unfortunately I now discovered that the pagesize on Plan9 x86 is 4kbytes
	whilst on Windows mappings can only be placed on 64k boundaries (the
	allocation granularity). This means that the text, data/bss/heap, and
	stack must start on 64k boundaries, whereas in plan9 executables they
	are placed on 4k boundaries. I can get round this easily enough using
	8l(1)'s -R option to specify a different alignment (rounding) for the
	data segment but that means relinking all plan9 executables for windows,
	which is what I was trying not to do.
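
	The arithmetic behind the problem, as a small sketch.  The end-of-text
	address used in the note below is hypothetical; only the 4k and 64k
	roundings come from the text above.

```python
PLAN9_ROUND = 0x1000      # 4k rounding used when plan9 binaries are linked
WINDOWS_ROUND = 0x10000   # 64k boundary required for mappings on Windows


def round_up(addr: int, align: int) -> int:
    """Round addr up to the next multiple of align (align a power of two)."""
    return (addr + align - 1) & ~(align - 1)


def data_start(text_end: int, align: int) -> int:
    """Where the data segment begins for a given end of text and rounding."""
    return round_up(text_end, align)


def mappable_on_windows(addr: int) -> bool:
    """True if a segment starting at addr could be mapped in place."""
    return addr % WINDOWS_ROUND == 0
```

	With text ending at a hypothetical 0x12345, the data segment of a
	stock binary starts at 0x13000, which cannot be mapped in place;
	relinked with 64k rounding it would start at 0x20000.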

	Also, I was very suspicious that no-one had tried to do this for Linux
	or MacOS binaries running under windows. Why hasn't anyone tried
	this before? Maybe it's more difficult than I thought?

	So this project is a dead end, shame really.

Fri Mar 23 13:14:48 GMT 2007
	Distracted with trying to bugfix cifs and aquarela, with some success
	but it is time-consuming work, especially the latter as I don't
	even know the code.

	Also looking at 9pm, trying to see how I could build a plan9port-like
	environment for windows. At present I think it would be best to build a central
	plan9-like kernel with a mount driver, devfs, devcons etc (like Russ's
	example and drawterm) rather than emulate at the level of libc like
	9pm did; it's not a clear decision though.

	I am starting by hacking drawterm which appears to be a very clean
	environment, i.e. only minimally changed from the plan9 kernel.

	
Tue Mar 13 08:35:35 GMT 2007
	I found an example of an animated gif which demonstrates the bug:

	hget http://www.quintile.net/broken-animated-image.gif | gif

	The problem appears to be that the image is incorrectly being
	erased between displaying each page of the animated sequence.

Fri Jan 26 11:06:18 GMT 2007

	Very nice detail about how Google Maps works,
	http://www.codeproject.com/useritems/googlemap.asp
	it would be fun to write a client for Plan9 if I only had the time.

	Been playing with the Win32 API learning a bit more,
	u9cpu is getting a little closer.

Previous work
~~~~~~~~~~~~~

Wed Jan 17 23:13:17 GMT 2007

u9cpu
	A CPU server for Win32 would make my life quite a bit easier, this is not
	too difficult if I port the TLS code from p9p (or perhaps the code in
	secstore). The pain comes in supporting Windows and allowing the server to
	automount network drives (hopefully by examining /mnt/term/dev/ns to discover
	which are in use). I also need to add another authentication method which passes the
	user's username and password for re-authentication on windows (sad but true).
	I may be able to get the windows mount driver code from friends which
	would be great but they have commercial interests so it may not be possible.

cifs
	DFS implementation is a bodge, it only works for a single
	server with dfs redirections between its shares (what we have) -
	it is NOT able to connect to a second machine.  It should be
	implemented by binding a different part of cifs's own
	namespace or by starting another cifs instance (if the
	redirection is to a remote machine).

	MAC signing doesn't work.  All the code is there but the
	signatures are wrong.  I thought this was required for Win2k3
	but it turns out MAC signing is optional so it is not a
	requirement.  As part of the work to get MAC signing going the
	factotum keys are user/pass rather than MSCHAP/MSCHAPv2; the
	mschap auth being done in the cifs app.  This is a temporary
	measure. When MAC signing works the MSCHAP and MSCHAPv2
	fields passed between cifs and factotum (and pop and IMAP I
	think) will have to be modified to add the MAC signing key which
	is not present currently.

	Shares are enumerated using RAP (Remote Administration
	Protocol).  This has quite a few restrictions, most notably
	that share names cannot be more than 13 chars long.  The
	proper solution to this is to use the MS-RPC (Microsoft Remote
	Procedure Call) interface.  This has many more admin features
	and no 13 character restriction on share names.  I have not
	started MS-RPC and have a fear that it may require SPNEGO auth
	or perhaps kerberos, I can find no definitive answer as yet.

	No support for changing passwords exists at present, a nice
	simple synthetic file into which you write your old password
	and new password and read back either a blank or an error
	message should be enough.  This could be implemented with RAP
	and should not be a problem.  I now have password ageing enabled
	at work (sigh) so my need for this is increasing so I may
	implement it soon (Dec 06).

	I feel guilty about adding all the extra synthetic files for
	Users and Shares etc into the main cifs hierarchy.  These
	should really go in a parallel hierarchy which could be
	mounted using a different attach specifier, something like
	"admin".

	cifs could abstract out sessions completely so the attach
	specifier could describe the host to attach to and the
	username to use.  Thus dfs would become simply an auto-mount
	or an auto-bind.

	Setting/resetting the rdonly flag affects mtime - it shouldn't.

	rap and rpc could (should) be separate programs which generate
	their own synthetic filesystems based on messages read/written
	to a synthetic file in cifs's filesystem; the protocols layer
	quite cleanly so why not the filesystem implementations?

	More info on RPC here:
	http://www.hsc.fr/ressources/articles/win_net_srv
	http://www.xfocus.net/projects/Xcon/2003/Xcon2003_kkqq.pdf

ncpfs
	All development work stopped, my last NetWare server has gone.
	Ncpfs works fine but doesn't support NDS, only Bindery access,
	so you must attach to each server you want to mount individually
	rather than being able to attach to an NDS Tree like you can
	from windows. One recent change - tiny edit as ``reject'' is now a
	keyword in our compilers (C99 I guess).

cvsfs
	No outstanding problems (well none that can be fixed given the CVS protocol).
	There is a possibility of making it a bit quicker to connect by using
	additional features of NTcvs (NTcvs is not limited to Windows and implements
	an extended feature-set). I haven't looked into this.

cryptfs
	Mostly done but not finished as I couldn't decide how I wanted it to work.
	Russ produced a set of patches for kfs(1) to encrypt data
	(see /n/sources/contrib/rsc/cryptfs) but I intended cryptfs to be a
	pass-through file server encrypting data and filenames on a remote file-server.

	The pass-thru file server is done and supports AES, DES, DES3, and IDEA
	(IDEA lifted from inferno), however it stalled there.

pptfs
	Done, works, bit clunky as the page numbers seem to be a bit random (no idea why)
	and sometimes unnecessary gubbins appears in some of the text files.
	pptfs cracks the powerpoint stream which itself is cracked from the OLE FAT filesystem
	by aux/olefs (part of the plan9 distribution).
	Basically you want to do:
		aux/olefs mypresentation.ppt
		pptfs
		cat /mnt/doc/ppt/*/*.txt
		page /mnt/doc/ppt/*/*jpg* /mnt/doc/ppt/*/*png* 

ndb/dns
	I submitted a patch which attempts to prevent dns poisoning which
	was sorry'ed as it was incomplete, I should have another go; see
	/n/sources/patch/sorry/dnstcp-norecursion for more info.

	Still an outstanding bug where looking up an MX record on a non-cached
	DNS record gives "failed" rather than "resource not known", this is easily
	provable as doing a lookup of the A record on the name (to force it into the
	cache) results in MX lookups returning "resource not known" correctly.

	This is only seen when using upas/smtp with SMTP gateways which don't
	have MX records but which are passed on the "-h host" option. If
	"-g gateway" is used then upas/smtp works reliably.

sqlfs
	Started but haven't got very far, look at odbc.c in inferno for an
	appropriate file interface; it would be very nice if sqlfs was
	compatible with this.

chatfs
	Got code snippets and docs to support MSIM, TOC(yahoo), and IRC
	(thanks Russ).

	Basic framework done and works for IRC (I use it to chat), it needs 
	the other drivers added and the file interface done
	
	Also, would be nice to attach a faces-like client showing who has
	been speaking in the last (say) 10 mins


Bugs and "missing" features
~~~~~~~~~~~~~~~~~~~~~~~~~~~

hget & webfs
	should support:
		 digest auth and proxy auth

		compressed (gzip) data transfers

		caching (bought the book...)
	
upas/fs
	add a "realname" synthetic file for each message.

	This would make it easy to run a script on a message to
	pull the users picture from ldap and feed it through mug(1)
	and write it to $home/lib/face

mug(1)
	should understand colour - maybe it is replaced by the image
	editing tool below.

marshal
	bind the current mail item over /mail/fs/mbox/current so
	scripts can find it easily

bug in pr
	when printing two columns some lines in the first column
	get lost, dependent on their length - needs tested again.

9grid
	encrypted link between auth servers, using public keys, or
	replace p9sk1 with p9pk1 (public key auth for 9p)

	resource limits:
		core memory limitation
		I/O operations limitation
		fair share scheduler controlled by /lib/ndb/auth
		network bandwidth restriction controlled by /lib/ndb/auth
		disk quotas ??!?!?
	I wonder if these could be done by interspersing a fairly transparent
	filesystem behind the user's rc(1), this could simply poll to see I/O use
	etc? Probably not good enough but worth a punt I guess.

	I started to write a paper on cross domain authentication and the arguments
	for adding Public key authentication. This needs to be finished.

tojpg(1)
	we really should have tojpg in this day and age, though
	perhaps topng(1) makes it irrelevant?

png(1)
	png(1) cannot read some of the images created by topng(1),
	this is very sad.

gif(1)
	animation doesn't work properly with optimised gifs (where the
	previous frame is supposed to show through depending on
	Alpha blending).

Vaopur ware - it would be nice but I'al probably never get round to it...
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

image processing GUI
	A simple program that displays an image like page but allows you to
	select a region of interest (ROI) and push that through a list of filters.
	The list of filters should have a macro language to describe them and their
	mandatory args, this list should live in /lib/imged.rules or the like.
	Thus local tools could be added.

	There would also need to be some standard widgets like the pick-a-colour widget
	which would be invoked if the tool you wanted needed it, e.g.

	# /lib/imged.rules
	#name  		cmdline...
	#
	crop		crop -b $colour -r $rect
	# this would cause the choose-a-rect and pick-a-colour widgets to be invoked.
	#
	resample	resample -x $percent %
	# here we just resample the image with a percentage change
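
	A sketch of how such a rules file might be read, with each $name in
	the command line taken as a request for a widget.  The parser, its
	names and the returned structure are assumptions; only the file
	format comes from the example above.

```python
import re

SAMPLE_RULES = """\
# /lib/imged.rules
#name       cmdline...
#
crop        crop -b $colour -r $rect
resample    resample -x $percent %
"""


def parse_rules(text):
    """Map each tool name to its command line and the parameters it asks for."""
    rules = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue                      # skip blanks and comments
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue                      # a name with no command line is ignored
        name, cmdline = parts
        rules[name] = {
            "cmdline": cmdline,
            "params": re.findall(r"\$(\w+)", cmdline),  # each one needs a widget
        }
    return rules
```

	The GUI would then raise one widget per entry in "params" before
	running the command on the selected region.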

	We also need a load more tools, glg (gamma, lift, gain) - see mug(1)
	for a very elegant way to offer these parameters. These should also be
	available in red, green, blue or cyan, magenta, yellow space.

	Not sure if this model will work (toolkit of external commands which when invoked
	request a widget to get their parameters) but at least it's clean.

	There are loads of interesting tools which are quite simple to do:
		gain
		lift
		gamma
		median-filter	noise reduction
		rotate		fast for 90° increments, 3 shear algo for degrees
		histeq		auto gain/luminance
		fft2d		windowed 2D FFT
		project 	image by a field of vectors
		enhance		hf gain/loss, linear
		crispen		nonlinear sharpening
		etc.


Video processing toolkit

	A little language which would allow you to process a stream of video
	by piping it through commands and also using low level (pixel level)
	operators. This should look as much like a procedural language as
	possible but be able to be compiled into a set of dataflow operations
	and perhaps the pixel level ops should be JIT compiled to machine code.

	The important features are high efficiency, low latency, a toolkit approach
	like the image editor above, and that video streams are arbitrarily long: they are
	not sequences of images, it is video.

	the system must respect interlace, 3:2, 4:5, and other strange field cadences

	data should be passed to child programs via pipes but the header of the file
	may contain a shared memory segment address at which the image data may be
	simply read. see: Eric Grosse, "How shall we connect our software tools"

	video must be held in files, the header of which contains simple, ascii token=value
	assignments, these are terminated by a blank line.

	pixel streams which are sampled in the same matrix may be interleaved but
	those sampled differently must be put in separate files, e.g. R,G,B can be held
	in a single file, as can H,S,L but Y must be separate from U and V as they are
	sampled at half the rate of Y.

	copyright='Cable News Network'
	title='The JFK shooting - new evidence'
	lines=1080		# active lines per frame
	pixels=1920		# active pixels per line
	cadence=1:1		# ie video
	clock=74.25e6		# 74.25 MHz sample clock
	chans=y			# luma channel only
	aspect=16:9		# display on a 16:9 monitor
	colour=SMPTE274		# colourspace defined by SMPTE274 specs
	...

	Note: there is NO specification of the length of the sequence, you just
	read until EOF (or forever if your input is /dev/tv). the chans are marked
	as 'y' only so the 'uv' channel must be in a separate file.
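
	Reading such a header could be as simple as the sketch below.  It
	assumes shell-style quoting and '#' comments as in the example, and
	treats the stream as text for brevity; a real tool would read bytes.
	The function name is made up.

```python
import io
import shlex


def read_header(stream):
    """Read token=value lines up to the first blank line; pixels follow it."""
    header = {}
    for line in stream:
        if not line.strip():
            break                                   # blank line ends the header
        for token in shlex.split(line, comments=True):
            key, sep, value = token.partition("=")
            if sep:                                 # ignore anything without '='
                header[key] = value
    return header


SAMPLE = (
    "copyright='Cable News Network'\n"
    "lines=1080\t\t# active lines per frame\n"
    "pixels=1920\t\t# active pixels per line\n"
    "chans=y\t\t\t# luma channel only\n"
    "\n"
    "PIXELS"
)
```

	After read_header() returns, the stream is positioned at the first
	byte of video, which is simply read until EOF.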

webdavfs
	bought the book, nothing more

update X11 port
	make it work fast on all plan9 displays and make
	it offer 8, 16, and 32 bit pixmaps.

fossil
	allow it to snap a vac score so files already absorbed
	can be amalgamated into venti's dump filesystem.

libder.a
	a canonical library for use in snmpfs, ldapfs, kerberos etc

kerberos
	kerberos support in factotum, no pressing need, just it "should" be there.

ldapfs
	useful in a Windows environment to find people's phone numbers/email addresses

tsv
	Microsoft terminal services client - ISODE has a lot to answer for,
	the only documentation is some ISO docs and the source of the linux
	rdesktop program - see sourceforge.