Plan 9 from Bell Labs’s /usr/web/sources/contrib/sascharetzki/audiofs.txt





audio-input            \
                        >  pcm-multiplexer? (audio/fs?)
midi  ->  synthesizer  /



audio/fs: thread(2)

- some kind of sequencer and PCM multiplexer
- creates an Ioproc to write to /dev/audio (see the sketch after this list)
- reads/expects config in ./ctl
- creates a subdir: project/.
- Clients are external programs whose stdin/stdout are available as files. They can be
anything: synthesizers (expecting MIDI), an echo effect (expecting PCM), etc.
Clients can be attached via ctl:
	echo client subdirname commandname [arguments] >>ctl
As ctl comes before project, tar(1) can be used to save/restore projects as a whole.
Tar will never have to mkdir, as right after ctl has been untarred, it is interpreted and
all clients are started and a generic file hierarchy is generated for every single one.
Actually, clients should be started in the last step, so that tar does not 'overtake' our
fork()/exec() process. We may also poll, but for various reasons, as stated below,
tar should continue and we should "proxy" the input. (./out will be empty)
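
A minimal sketch of the Ioproc idea, assuming the multiplexer runs as a thread
inside the file server proc (audioinit(), audiowrite() and their globals are
names made up here):

#include <u.h>
#include <libc.h>
#include <thread.h>

Ioproc *aio;	/* performs the actual I/O */
int afd;

void
audioinit(void)
{
	afd = open("/dev/audio", OWRITE);
	if(afd < 0)
		sysfatal("open /dev/audio: %r");
	aio = ioproc();
}

/* called from the multiplexer thread with one second of mixed PCM;
 * iowrite() blocks only the calling thread, not the whole fs proc */
void
audiowrite(uchar *pcm, long n)
{
	if(iowrite(aio, afd, pcm, n) != n)
		fprint(2, "audiofs: short write to /dev/audio: %r\n");
}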

A client thus looks like this:
subdirname/
	- in	(rw-, r for tar(1). MIDI/PCM data is stored here)
	- out (r--)
	- sched	(rw-, stores the realtime instructions. You could write to the client's
	  /proc file directly, but audiofs would not notice; so we act as a little proxy,
	  we know about this information, and tar(1) can be used)


Each client's stdin/stdout is thus available, via the Channel (procexec()'s first argument?).
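
A sketch of how that could look, assuming the pid really does arrive on
procexec()'s Channel; startclient() and Exec are made up here, and procrfork()
gets RFFDG so the fd juggling stays private to the new proc (Plan 9 pipes are
bidirectional, so each side just keeps one end):

#include <u.h>
#include <libc.h>
#include <thread.h>

typedef struct Exec Exec;
struct Exec {
	char *cmd;
	char **argv;
	int fd0;	/* becomes the client's stdin */
	int fd1;	/* becomes the client's stdout */
	Channel *pidc;	/* procexec() reports the pid (or -1) here */
};

static void
execproc(void *v)
{
	Exec *e;

	e = v;
	dup(e->fd0, 0);
	dup(e->fd1, 1);
	close(e->fd0);
	close(e->fd1);
	procexec(e->pidc, e->cmd, e->argv);
	threadexits("exec failed");	/* reached only if the exec failed */
}

/* start one client; returns its pid, and our ends of its
 * stdin/stdout pipes in *in and *out */
int
startclient(char *cmd, char **argv, int *in, int *out)
{
	int infd[2], outfd[2], pid;
	Exec *e;

	if(pipe(infd) < 0 || pipe(outfd) < 0)
		return -1;
	e = mallocz(sizeof *e, 1);
	e->cmd = cmd;
	e->argv = argv;
	e->fd0 = infd[1];
	e->fd1 = outfd[1];
	e->pidc = chancreate(sizeof(ulong), 0);
	procrfork(execproc, e, 8192, RFFDG);	/* child gets its own fd table */
	pid = recvul(e->pidc);
	chanfree(e->pidc);
	free(e);
	close(infd[1]);
	close(outfd[1]);
	*in = infd[0];		/* audiofs writes MIDI/PCM here */
	*out = outfd[0];	/* audiofs reads the result here */
	return pid;
}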



How, in this model, can clients be connected together? One cannot make chains with this.

If stdin/stdout, how do clients actually interact in a chain?


/dev/midi  \
            >  read by audio/synth -> audio/passfilter -low -> audio/fx -echo bl0b
bla.mid    /
(some sequencer...)

Remember: bind(1) does not help here - audio/fs would not know about the bind.


It works well on the command line:
term% cat /dev/midi | audio/synth | audio/passfilter -low | audio/fx -echo bl0b >/dev/audio


Solution:
We split the line on '|' (getfields(2); tokenize(2) only splits on whitespace). Then we create for each program a list entry in the style of:

typedef struct f00 f00;
struct f00 {
	char *name;	/* directory name, derived from cmd (see below) */
	char *cmd;
	char **argv;
	int pid;
	int stdin;
	int stdout;
	int status;

	f00 *next;
	f00 *prev;
};

We need to remember the pid to kill the process if the user wants us to disable it. stdin/stdout store the fds returned by a pipe(). current->stdin may be (set to) current->prev->stdout and current->stdout may be (set to) current->next->stdin, but in case we disable/kill a process, we still achieve that data is passed from proc1 over proc2 to proc3 (if proc2 is to be disabled, we just override the integers in the list, kill everything and repeat). That is why we need the status integer: we need to know whether the process is actually to be created.
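
A sketch of the rewiring, using the f00 struct from above; the status values
are made up. Whenever an entry is enabled or disabled, everything is killed
and this runs again:

#include <u.h>
#include <libc.h>

enum { Disabled, Enabled };	/* values for f00.status (made up here) */

/* wire consecutive Enabled entries together with fresh pipes;
 * Disabled ones are simply skipped, so data flows from proc1
 * over (around) proc2 to proc3 */
void
wirechain(f00 *head)
{
	f00 *f, *prev;
	int fd[2];

	prev = nil;
	for(f = head; f != nil; f = f->next){
		if(f->status != Enabled)
			continue;
		if(prev != nil){
			if(pipe(fd) < 0)
				sysfatal("pipe: %r");
			prev->stdout = fd[0];	/* prev's output end */
			f->stdin = fd[1];	/* becomes f's fd 0 on exec */
		}
		prev = f;
	}
	/* only now fork()/exec() every Enabled entry with its fds */
}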

The controller is supposed to export the following file hierarchy to make all of that possible for the user:

subdirname/
	- synth/
		- in		(search for f00->name == dirname; write data to its stdin)
		- out	(search for f00->name == dirname; data read here comes from its stdout)
		- sched
	- passfilter/
		- in
		- out
		- sched
	- fx/
		- in
		- out	(if f00->next == nil, out is supposed to be returned to the upper layer [X1])
		- sched
	- in
	- out
	- sched (defunct, permission --- <- I think it's better to simply delete that file from the file server completely)

So one can provide input for the pipe (cat /dev/midi > subdirname/in), or insert it at any point in the pipe (cat test.pcm >subdirname/passfilter/in), in which case the pipe is continued from there; that is, test.pcm is lowpass-filtered and then sent to audio/fx.


X1:
The upper layer should know where to look for the N-byte memory segment. The controller must read the last stdout anyway, so no fast pointer arithmetic here. The controller writes to a global area (?) known to the upper layer. Every second, that segment is read, mixed with the others, and written to /dev/audio or similar. Everyone knows how big N is: it's the total size of one second, samplerate * channels * bytes per sample (e.g. 44100 * 2 * 2 = 176400 bytes for 16-bit stereo at 44.1 kHz).


Back on topic: as commands will probably have directory names in them, only the last element should be used. In cases where there are two identical last elements, the complete path should be shown, but with '/' substituted by '_', that is:

subdirname/
	- audio_synth/
	- test_synth/

audio/fx is special. You will create chains with several fx/-dirs. If the complete-path rewrite function recognizes two identical names, the first command line argument is to be included, via '-'. If those are identical, too, the second one is, and so on (a sketch of this follows the listing). That looks like this:

subdirname/
 	- synth/
	- fx-echo-300/
	- fx-echo-0.43/
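
A sketch of that rewrite, using the f00 struct from above; taken() and the
buffer size are made up, and a leading '-' of an argument is dropped so the
result reads fx-echo-300 rather than fx--echo-300:

#include <u.h>
#include <libc.h>

/* is name already used by another list entry? */
static int
taken(f00 *head, f00 *self, char *name)
{
	f00 *g;

	for(g = head; g != nil; g = g->next)
		if(g != self && g->name != nil && strcmp(g->name, name) == 0)
			return 1;
	return 0;
}

void
setname(f00 *f, f00 *head)
{
	char buf[128], *p, *a;
	int i;

	/* start with the last path element */
	p = strrchr(f->cmd, '/');
	snprint(buf, sizeof buf, "%s", p != nil? p+1: f->cmd);
	if(taken(head, f, buf)){
		/* complete path, '/' substituted by '_' */
		snprint(buf, sizeof buf, "%s", f->cmd);
		for(p = buf; *p != '\0'; p++)
			if(*p == '/')
				*p = '_';
	}
	/* still clashing: append argv[1], argv[2], ... one by one */
	for(i = 1; taken(head, f, buf) && f->argv[i] != nil; i++){
		a = f->argv[i];
		if(*a == '-')
			a++;
		seprint(buf+strlen(buf), buf+sizeof buf, "-%s", a);
	}
	f->name = strdup(buf);
}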



Special notes: How does 9grid work? What if the command is to be run on a CPU server? We must catch such events and redirect sched input to the remote procfs, instead of trying to write to ours. Furthermore, PID discovery over cpu(1) (probably) works completely differently - the pid fork() returned is cpu(1)'s, not the one of the remote process.


The audiofs ctl command thus supports:
echo 'subdirname cmd1 args | cmd2 args | cmd3 args' >>ctl

which is parsed, and the chain is created.
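
A sketch of the parsing, assuming the subdirname has already been split off
the front of the line; getfields(2) does the '|' split (tokenize(2) only
splits on whitespace, but builds each argv nicely, quotes included):

#include <u.h>
#include <libc.h>

f00*
parsechain(char *line)
{
	char *seg[32], *av[64];
	int i, j, n, ac;
	f00 *head, *tail, *f;

	head = tail = nil;
	n = getfields(line, seg, nelem(seg), 1, "|");
	for(i = 0; i < n; i++){
		ac = tokenize(seg[i], av, nelem(av)-1);
		if(ac < 1)
			continue;
		f = mallocz(sizeof *f, 1);
		f->cmd = strdup(av[0]);
		f->argv = malloc((ac+1)*sizeof(char*));
		for(j = 0; j < ac; j++)
			f->argv[j] = strdup(av[j]);
		f->argv[ac] = nil;
		f->pid = -1;
		f->prev = tail;
		if(tail != nil)
			tail->next = f;
		else
			head = f;
		tail = f;
	}
	return head;
}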


pros:
- module-based: sound effects, synthesizers and filters can be written as (real-time aware) external programs, which do not even have to take care of serving files. Programs may be used outside of audiofs.
- tar(1) works. audiofs just has to proxy initial writes to ./sched and ./in, because the fork()/exec() process may not be completed yet.
- If I want to introduce a status file, I just change audiofs.
- The chain can be debugged - if the complete chain does not produce the expected results, you can input your own data at any point.
- For the clients, we are not "limited" to Plan9-C, except for the realtime scheduling, see the proc man pages (for(;;) to tell the scheduler things, etc). That means any ANSI/POSIX C code may be used as long as it follows some simple rules.
- procs are said to be cheap on Plan 9; I tend to believe that.




How are the various stdins/stdouts connected? Can stdin/stdout be mapped into our memory (and then mapped to their counterparts' stdin/stdout again, to create a 'chain')? If that works, does it break if the user does cat test.pcm >passfilter/in?



Other features:
- A delete of the dir kills the process and takes it out of the chain. A chmod u-rw takes it out of the chain and stops the process, but does not kill it. On chmod u+rw, the process is reactivated and hopefully ready again when the chain arrives there. That way, elements in the chain can be switched on/off dynamically without much hassle (see the wstat sketch after this list).
- Maybe we put a ./cmd file in there, which stores the command that has been executed, including all arguments. It could be rw, so that you can change elements in the chain by changing the command line args. Cmdline args are the only way to communicate with the process (in this model), so changing options via this should be there. FIXME: or the toplevel ctl!?
- There is no mixer in this model yet: every client (or chain of clients) outputs at 100% loudness, or the loudness it actually produces. As we are going to mix things together after all the subdirname/ clients have produced PCM data anyway, each subdirname/ may get a 'mixer' file. A global 'mixer' file should also be introduced.
- audiofs may collect all data that has been written out in an 'out' file. This option should be off by default, as a lot of data may be collected over time, and RAM isn't endless, astonishingly. Furthermore, realtime scheduling could get a switch, too: in case someone wants to create a PCM file from the project, that is, "export the project to a file", you don't want real-time scheduling because you want every single bit from every client.
- The project/ dir may get toplevel files like 'comments' (a 1024-byte file to be filled with human-readable information, not parsed).
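
A sketch of the chmod handling from the first item, using lib9p's wstat hook;
pauseclient()/resumeclient() and the File-aux convention are made up here:

#include <u.h>
#include <libc.h>
#include <fcall.h>
#include <thread.h>
#include <9p.h>

static void
fswstat(Req *r)
{
	f00 *f;

	f = r->fid->file->aux;	/* assumes the f00 hangs off the File */
	if(f != nil && r->d.mode != (ulong)~0){	/* ~0 means "unchanged" */
		if((r->d.mode & 0600) == 0)
			pauseclient(f);		/* stop the proc, route around it */
		else
			resumeclient(f);	/* restart it, rewire the chain */
	}
	respond(r, nil);
}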




Sequencing
The solution is simple: all in-files are proxies. Data is not instantly written to stdin, but stored. Playback is then activated globally via the global ctl file, or per client, which means we want a ctl file in each client dir. Does this substitute sched? The scheduling information is some kind of 'ctl thingie' too. There may be a 'realtime' flag for each client to indicate whether we want to skip frames or save them. The point is that the input may either be a MIDI/PCM file or a device (/dev/audio, /dev/midi). The latter would mean we endlessly try to proxy the incoming data.


Each client is controlled by a process which, at the start of playback, rendezvous(2)es with us. That way, we are as much in sync as we can be. The realtime scheduling does the rest for us.
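
A sketch of that handshake. rendezvous(2) pairs exactly two processes per
call, so the master calls it once per waiting controller; the tag is made up
(any address both sides know works, as libthread procs share memory):

#include <u.h>
#include <libc.h>

static int playtag;	/* its address serves as the shared tag */

/* each controller calls this before feeding the first byte;
 * it blocks until the master releases it */
void
waitforplay(void)
{
	rendezvous(&playtag, nil);
}

/* the master calls this when the user starts playback;
 * one rendezvous(2) per waiting controller */
void
startplay(int nclients)
{
	int i;

	for(i = 0; i < nclients; i++)
		rendezvous(&playtag, nil);
}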



Seeking
For music creation it is essential to listen to just a portion of the project over and over again, for 'debugging'. I would like to introduce terms used in common debuggers, like breakpoints, single-stepping, tracing, etc. right here.
breakpoints - a single start point and a single end point between which to loop forever.
single-steps - split up the complete song into several single steps, and just play one step.
tracing - do not actually multiplex into a single PCM stream; offer the application (the GUI, for instance) all streams (how? Use PCM channels and hand it a channel map?). That way, the GUI could visualize which waves come from which client.


The main problem is to tell the clients what to do - we need a lot of information for that to succeed. We need to know whether the input is actually MIDI or PCM, for example. Then we have to tell each "controller" (that thing which execs the client, feeds data in and reads the results out, handing them to the upper layer, the "PCM multiplexer") to seek to a time (a real one, a global one) and to stop at another time.

PCM is expected/converted to the same values, so specifying ranges is not that much of an issue. MIDI works, too - at least MIDI files do. Other things cannot be seeked into anyway.


MIDI
----
Event-ID         Codes (hex)          Arguments               Description
---------------------------------------------------------------------------
-                0x00 - 0x7F          -                       Keys from C2 to G8
NOTE_OFF         0x8n* kk vv          Key, release velocity   0-127, 0-127
NOTE_ON**        0x9n kk vv           Key, attack velocity    0-127, 0-127
POLY_AFTERTOUCH  0xAn kk vv           Key, velocity           Describes velocity changes of a single key
CONTROLLCH       0xBn cc vv           Controller, value       Changes controller values (see below)
CHPROG           0xCn pp              Program                 Changes to a program. Are there programs in audiofs?
MONO_AFTERTOUCH  0xDn vv              Velocity                Describes velocity changes for all keys
PITCH_BEND       0xEn vv [vv]         Value1, opt. Value2     7, 8 or 14 bit argument lengths. Oh lord.
SPECIAL          0xFn ll [...]        Length, others          Device-specific commands. Those things can be used
                                                              for a lot of cool things.

libmidi also specifies some more enums that are essential for audiofs etc. to work:
TEMPO            0xFF 51 03 tt tt tt  3 time bytes (NOT BPM)  See ***. 0x51 is the meta-event type; the
                                                              length is the 0x03, i.e. 3 bytes, not 81.


*: In all status bytes, the low nibble 'n' selects the channel.
**: A NOTE_ON with velocity 0 may be sent to describe a NOTE_OFF.
***: http://www.borg.com/~jglatt/tech/midifile/tempo.htm

Controller-specific events seem to be pretty uninteresting for us - we'll see. For now, CONTROLLCH is to be ignored.
Are they? See http://home.snafu.de/sicpaul/midi/midi1.htm#cc
(http://home.snafu.de/sicpaul/midi/midiy.htm)




BPM
----
(taken from midiguy.com:)
BPM divided by 8 (2 bars of 4/4 time) or 6 (2 bars of 3/4 time) = n
60 divided by n = Loop Length in seconds

60 divided by Loop Length in Seconds = n
n * 8 or 6 == BPM
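
A quick check: at 120 BPM in 4/4, n = 120/8 = 15, so the loop length is 60/15 = 4 seconds - and indeed, two bars of 4/4 are 8 beats, which at 120 beats per minute last 8/120 minutes = 4 seconds.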



So MIDI knows about timing, that's cool. At least the file format does, and nothing else needs to be taken care of, as /dev/midi, if it exists, will be realtime anyway.

So the sequencer, which helps the user construct a MIDI file, writes the complete MIDI file into the in-file. In case of breakpoints (looping a portion of the song over and over again), audiofs can tell the synthesizer client (no, see below) the position in seconds, and the application can go there by knowing the tempo (if no TEMPO was there, it is 120 BPM, iirc).

Care must be taken because MIDI files apparently can simply change the BPM in the middle of the stream, so in that case the complete MIDI stream must be parsed: !TEMPO events must be skipped, TEMPO must be interpreted and taken into account to get the right position in the stream. The same thing applies to the end point of the loop, as there might be TEMPOs in the middle of the selected area.
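
A sketch of that scan over a made-up in-memory event list (Ev; 'division' is
the ticks-per-quarter value from the SMF header). It returns the tick position
corresponding to a wanted time in seconds:

#include <u.h>
#include <libc.h>

typedef struct Ev Ev;
struct Ev {
	ulong delta;		/* ticks since the previous event */
	int istempo;		/* a 0xFF 51 03 meta event? */
	ulong usperquarter;	/* its value, if so */
	Ev *next;
};

/* walk the stream, switching tempo at every TEMPO event, until
 * 'want' seconds are reached; the default tempo is 120 BPM,
 * i.e. 500000 microseconds per quarter note */
uvlong
seektick(Ev *e, int division, double want)
{
	ulong usq;
	double t;
	uvlong tick;

	usq = 500000;
	t = 0.0;
	tick = 0;
	for(; e != nil; e = e->next){
		t += (double)e->delta * usq / division / 1e6;
		if(t >= want)
			break;
		tick += e->delta;
		if(e->istempo)
			usq = e->usperquarter;	/* !TEMPO events are just skipped over */
	}
	return tick;
}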

The work is done in audiofs, the controller. We have no real way to communicate with the client(s); furthermore, every client would have to link against libmidi and take care of it itself - error-prone.


