I manage the project! You listen to me and do what I say! I am the dictator and not necessarily benevolent!
Stolen from many places, put into practice many times! Heed or die!
- Don't parse. Projects promoting programming in "natural language" are intrinsically doomed to fail. Much CS research has gone into writing languages and protocols in natural human language. Getting computers to understand natural language demands parsing. Computers aren't human, hence the contortions needed to write natural languages, compilers, etc. Parsing is very difficult to get right, even with tools like lex and yacc. Just say no!
- Do as little as possible as root.
- Move separate functions into mutually untrusting programs.
- Sandbox all mutually untrusting programs.
- Use virtualization to mitigate programmer errors.
- Chroot all untrusted programs (and all programs are untrusted ;-) ).
- ulimit/rlimit all virtualized environments.
- Restrict the namespace of all untrusted programs (and all are untrusted).
- Use key-based authentication to maintain capabilities.
- All exposed low level interfaces use the same protocol for communication. In our case, the standard protocol is 9P2000, and all services are exported as filesystems.
- All exposed mid-level interfaces use the same data format for communication. In our case all monitors will use a simple hashtable of string-to-string values organized in a hierarchical data format very similar to the Java Properties object. This format is very easy to serialize and very easy to work with.
- All programmatic access to the hierarchical data format (HDF) will be as a filesystem. This allows standard filesystem operations to be used for traversing and modifying the HDF. It also allows sh to be used to access and modify the HDF, leveraging the Inferno shell's strong string and filesystem manipulation tools and sh's embeddability.
- Export as few functions as possible from a module. Modules are the basic code structuring entity in Inferno. A module can contain a large number of functions but only functions which are included in the export list of the module can be called from outside the module. Seen from the outside the complexity of a module depends upon the number of functions which are exported from the module. A module which exports one or two functions is usually easier to understand than a module which exports dozens of functions. Modules where the ratio of exported/non-exported functions is low are desirable in that a user of the module only needs to understand the functionality of the functions which are exported from the module. In addition, the writer or maintainer of the code in the module can change the internal structure of the module in any appropriate manner provided the external interface remains unchanged.
- Try to reduce inter-module dependencies. A module which calls functions in many different modules will be more difficult to maintain than a module which only calls functions in a few different modules. This is because each time we make a change to a module interface, we have to check all places in the code where this module is called. Reducing the interdependencies between modules simplifies the problem of maintaining these modules. We can simplify the system structure by reducing the number of different modules which are called from a given module. Note also that it is desirable that the inter-module calling dependencies form a tree and not a cyclic graph.
- Don't make assumptions about what the caller will do with the results of a function. Don't make assumptions about why a function has been called or about what the caller of a function wishes to do with the results. For example, suppose we call a routine with certain arguments which may be invalid. The implementer of the routine should not make any assumptions about what the caller of the function wishes to happen when the arguments are invalid.
- Abstract out common patterns of code or behavior. Whenever you have the same pattern of code in two or more places in the code try to isolate this in a common function and call this function instead of having the code in two different places. Copied code requires much effort to maintain. If you see similar patterns of code (i.e. almost identical) in two or more places in the code it is worth taking some time to see if one cannot change the problem slightly to make the different cases the same and then write a small amount of additional code to describe the differences between the two. Avoid "copy" and "paste" programming, use functions!
- Write your program in a top-down fashion, not bottom-up (starting with details). Top-down is a nice way of successively approaching the details of the implementation, ending up with the definition of primitive functions. The code will be independent of representation, since the representation is not known when the higher levels of code are designed.
- Don't optimize code. Don't optimize your code at the first stage. First make it right, then (if necessary) make it fast (while keeping it right).
- Use the principle of "least astonishment". The system should always respond in a manner which causes the "least astonishment" to the user, i.e. a user should be able to predict what will happen when they do something and not be astonished by the result. This has to do with consistency: a consistent system where different modules do things in a similar manner will be much easier to understand than a system where each module does things in a different manner. If you get astonished by what a function does, either your function solves the wrong problem or it has the wrong name.
- Try to eliminate, or at least isolate, side effects.
- Don't allow private data structures or functions to "leak" out of a module.
- Do not program "defensively". A defensive program is one where the programmer does not "trust" the input data to the part of the system they are programming. In general one should not test input data to functions for correctness. Most of the code in the system should be written with the assumption that the input data to the function in question is correct. Only a small part of the code should actually perform any checking of the data. This is usually done when data "enters" the system for the first time; once data has been checked as it enters the system, it should thereafter be assumed correct. The caller is responsible for supplying correct input.
- Do and undo things in the same function.
- Separate error handling and normal case code. Don't clutter code for the "normal case" with code designed to handle exceptions. As far as possible you should only program the normal case. If the code for the normal case fails, your process should report the error and crash as soon as possible. Don't try to fix up the error and continue. The error should be handled in a different process. Clean separation of error recovery code and normal case code should greatly simplify the overall system design. The error logs which are generated when a software or hardware error is detected will be used at a later stage to diagnose and correct the error. A permanent record of any information that will be helpful in this process should be kept.
- Implement a process in one module. Code for implementing a single process should be contained in one module. A process can call functions in any library routines but the code for the "top loop" of the process should be contained in a single module. The code for the top loop of a process should not be split into several modules, as this would make the flow of control extremely difficult to understand. This does not mean that one should not make use of generic server libraries; these exist to help structure the control flow. Conversely, code for no more than one kind of process should be implemented in a single module. Modules containing code for several different processes can be extremely difficult to understand. The code for each individual process should be broken out into a separate module.
- Use processes for structuring the system. Processes are the basic system structuring elements. But don't use processes and message passing when a function call can be used instead.
- Assign exactly one parallel process to each truly concurrent activity in the system. When deciding whether to implement things using sequential or parallel processes, follow the intrinsic structure of the problem. The main rule is: "Use one parallel process to model each truly concurrent activity in the real world". If there is a one-to-one mapping between the number of parallel processes and the number of truly parallel activities in the real world, the program will be easy to understand.
- Each process should only have one "role". Processes can have different roles in the system, for example in the client-server model. As far as possible a process should only have one role, i.e. it can be a client or a server but should not combine these roles. Other roles which a process might have are:
  - Supervisor: watches other processes and restarts them if they fail.
  - Worker: a normal work process (can have errors).
  - Trusted Worker: not allowed to have errors.
- Use generic functions for servers and protocol handlers wherever possible. In many circumstances it is a good idea to use generic server programs such as the generic server implemented in the standard libraries. Consistent use of a small set of generic servers will greatly simplify the total system structure. The same is possible for most of the protocol handling software in the system.
- Don't write deeply nested code. Nested code is code containing case/if/receive statements within other case/if/receive statements. It is bad programming style to write deeply nested code - the code has a tendency to drift across the page to the right and soon becomes unreadable. Try to limit most of your code to a maximum of two levels of indentation. This can be achieved by dividing the code into shorter functions.
- Don't write very large modules. A module should not contain more than 400 lines of source code. It is better to have several small modules than one large one.
- Don't write very long functions. Don't write functions with more than 15 to 20 lines of code. Split a large function into several smaller ones. Don't solve the problem by writing long lines.
- When a program has nothing surprising to say, it should say nothing.
- When you must fail, fail noisily and as soon as possible.
- Write simple parts connected by clean interfaces.
- Clarity is better than cleverness.
- Design programs to be connected to other programs.
- Separate policy from mechanism; separate interfaces from engines. Example: the X Window System. Fashions in the look and feel of GUI toolkits may come and go, but raster operations and compositing are forever.
- Fold knowledge into data, so program logic can be stupid and robust.
- Process connections and data flow are data structures themselves. They should not be hardwired into the application logic. Components need to be loosely coupled to the point where they cannot make ANY assumptions about how they are going to be used.
- Avoid hand-hacking; write programs to write programs when you can. Example: build, which creates makefiles and does a better job of managing dependencies than mk by itself.
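
The rule above about the hierarchical data format (a string-to-string hashtable that is accessed like a filesystem) can be sketched roughly as follows. This is a minimal Python illustration only, not the project's implementation, which would export the data over 9P2000 from Inferno. The class name `PropertyTree`, its methods, and the sample paths are all hypothetical.

```python
class PropertyTree:
    """Flat string-to-string table whose keys are slash-separated paths.

    Hypothetical sketch: the names here are illustrative, not project API.
    """

    def __init__(self):
        self._table = {}  # e.g. {"monitor/cpu/load": "0.42"}

    def write(self, path, value):
        # Every value is stored as a string, as in a Java Properties object.
        self._table[path.strip("/")] = str(value)

    def read(self, path):
        # A missing path raises KeyError rather than returning a default.
        return self._table[path.strip("/")]

    def ls(self, directory=""):
        """List the immediate children of a directory, as `ls` would."""
        prefix = directory.strip("/")
        prefix = prefix + "/" if prefix else ""
        children = set()
        for key in self._table:
            if key.startswith(prefix):
                # Keep only the first path component below the directory.
                children.add(key[len(prefix):].split("/", 1)[0])
        return sorted(children)

    def serialize(self):
        # One "key=value" line per entry, sorted for stable output.
        return "\n".join(f"{k}={v}" for k, v in sorted(self._table.items()))
```

Because the whole structure is a single flat table of strings, serialization is a one-liner, and the directory view is derived from the keys rather than stored separately.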
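
The rules above on not programming defensively, separating error handling from normal-case code, and the Supervisor role can be illustrated together. The sketch below is a simplified Python approximation: an ordinary function call stands in for a separate supervised process, and the names `parse_reading`, `average`, and `supervise` are hypothetical.

```python
def parse_reading(raw):
    """Boundary check: the only place where incoming data is validated."""
    value = float(raw)  # raises ValueError on malformed input
    if value < 0:
        raise ValueError(f"negative reading: {raw!r}")
    return value


def average(readings):
    # Normal-case code only: assumes a non-empty list of checked floats.
    # Bad input makes this crash instead of being patched up here.
    return sum(readings) / len(readings)


def supervise(worker, args, max_attempts=3):
    """Run the worker; on failure, log the error and start it again.

    Stand-in for a supervisor process. Returns (result, error_log).
    """
    log = []
    for attempt in range(1, max_attempts + 1):
        try:
            return worker(*args), log
        except Exception as exc:
            # Keep a record that can be used later to diagnose the fault.
            log.append(f"attempt {attempt}: {type(exc).__name__}: {exc}")
    raise RuntimeError(f"worker failed {max_attempts} times: {log}")
```

The worker code contains no recovery logic at all; every decision about what to do after a failure lives in `supervise`.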
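
"Fold knowledge into data" is easiest to see in a table-driven function. The following Python sketch is an illustration only; the table `UNIT_FACTORS` and the function `to_bytes` are hypothetical names chosen for the example.

```python
# The knowledge lives in this table; supporting a new unit means
# adding a row, not adding another branch to the code.
UNIT_FACTORS = {"B": 1, "KB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3}


def to_bytes(text):
    """Convert a string such as '4 KB' to a byte count."""
    number, unit = text.split()
    # An unknown unit raises KeyError: fail noisily and as soon as possible.
    return int(number) * UNIT_FACTORS[unit.upper()]
```

The logic is deliberately "stupid": a split, a lookup, and a multiplication, with no special cases per unit.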
And here are the folks that taught me all of the above!
With so many other distributed computing technologies out there, why are we introducing another one, which is probably less well-known than others? There are several reasons:
- Inferno is designed from the ground up for cross-platform distributed computing.
- It is a very mature technology, originating around 20 years ago in Lucent's Bell Labs. It has been used in industrial-quality products for many years and is known to be robust.
- It is free and Open Source. Yes, it has changed!
- It is an operating system, but is typically run as an emulated application in another host OS. In this mode it takes up less than 1 MB of RAM, so it is extremely lightweight (considering its background in embedded systems, this is not surprising).
- On all of these host platforms, Inferno provides an identical environment for its applications (i.e. true platform independence). There are no exceptions to this.
- *nix and Linux users will already be familiar with many of Inferno's commands.
Currently, source code is managed using CVS in SourceForge's CVS environment. To browse the CVS Source go here.
All documents are managed using SourceForge. Documents for the Mofo project can be accessed here.