From ead841c2f28d17e7916fba601e50763c8b01c122 Mon Sep 17 00:00:00 2001 From: Sol Jerome Date: Sat, 6 Mar 2010 20:39:12 +0000 Subject: doc: Integrate more unsorted documents Signed-off-by: Sol Jerome git-svn-id: https://svn.mcs.anl.gov/repos/bcfg/trunk/bcfg2@5754 ce84e21b-d406-0410-9b95-82705330c041 --- doc/architecture.txt | 354 +++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 354 insertions(+) create mode 100644 doc/architecture.txt (limited to 'doc/architecture.txt') diff --git a/doc/architecture.txt b/doc/architecture.txt new file mode 100644 index 000000000..ef2fe3830 --- /dev/null +++ b/doc/architecture.txt @@ -0,0 +1,354 @@ +.. -*- mode: rst -*- + +.. _architecture: + +=========================== +Detailed Bcfg2 Architecture +=========================== + +Bcfg2 is based on a client-server architecture. The client is responsible +for interpreting (but not processing) the configuration served by the +server. This configuration is literal, so no local process is required. +After completion of the configuration process, the client uploads a set +of statistics to the server. This section will describe the goals and +then the architecture motivated by it. + +Goals +===== + +* **Model configurations using declarative semantics.** + + Declarative semantics maximize the utility of configuration management + tools; they provide the most flexibility for the tool to determine + the right course of action in any given situation. This means that + users can focus on the task of describing the desired configuration, + while leaving the task of transitioning clients states to the tool. + +* **Configuration descriptions should be comprehensive.** + + This means that configurations served to the client should be sufficient + to reproduce all desired functionality. This assumption allows the + use of heuristics to detect extra configuration, aiding in reliable, + comprehensive configuration definitions. + +* **Provide a flexible approach to user interactions.** + + Most configuration management systems take a rigid approach to user + interactions; that is, either the client system is always correct, + or the central system is. This means that users are forced into an + overly proscribed model where the system asserts where correct data + is. Configuration data modification is frequently undertaken on both + the configuration server and clients. Hence, the existence of a single + canonical data location can easily pose a problem during normal tool + use. Bcfg2 takes a different approach. + +The default assumption is that data on the server is correct, however, +the client has the option to run in another mode where local changes are +catalogued for server-side integration. If the Bcfg2 client is run in dry +run mode, it can help to reconcile differences between current client +state and the configuration described on the server. The Bcfg2 client +also searches for extra configuration; that is, configuration that is +not specified by the configuration description. When extra configuration +is found, either configuration has been removed from the configuration +description on the server, or manual configuration has occurred on the +client. Options related to two-way verification and removal are useful +for configuration reconciliation when interactive access is used. + +* Plugins and administrative applications. +* Incremental operations. + +The Bcfg2 Client +================ + +The Bcfg2 client performs all client configuration or reconfiguration +operations. It renders a declarative configuration specification, provided +by the Bcfg2 server, into a set of configuration operations which will, +if executed, attempt to change the client's state into that described by +the configuration specification. Conceptually, the Bcfg2 client serves to +isolate the Bcfg2 server and specification from the imperative operations +required to implement configuration changes. + +This isolation allows declarative specifications to be manipulated +symbolically on the server, without needing to understand the properties +of the underlying system tools. In this way, the Bcfg2 client acts +as a sort of expert system that *knows* how to implement declarative +configuration changes. + +The operation of the Bcfg2 client is intended to be as simple as +possible. The normal configuration process consists of four main steps: + +* **Probe Execution** + + During the probe execution stage, the client connects to the server + and downloads a series of probes to execute. These probes reveal + local facts to the Bcfg2 server. For example, a probe could discover + the type of video card in a system. The Bcfg2 client returns this + data to the server, where it can influence the client configuration + generation process. + +* **Configuration Download and Inventory** + + The Bcfg2 client now downloads a configuration specification from the + Bcfg2 server. The configuration describes the complete target state + of the machine. That is, all aspects of client configuration should + be represented in this specification. For example, all software + packages and services should be represented in the configuration + specification. The client now performs a local system inventory. + This process consists of verifying each entry present in the + configuration specification. After this check is completed, heuristic + checks are executed for configuration not included in the configuration + specification. We refer to this inventory process as 2-way validation, + as first we verify that the client contains all configuration that + is included in the specification, then we check if the client has + any extra configuration that isn't present. This provides a fairly + rigorous notion of client configuration congruence. Once the 2-way + verification process has been performed, the client has built a list of + all configuration entries that are out of spec. This list has two parts: + specified configuration that is incorrect (or missing) and unspecified + configuration that should be removed. + +* **Configuration Update** + + The client now attempts to update its configuration to match the + specification. Depending on options, changes may not (or only partially) + be performed. First, if extra configuration correction is enabled, + extra configuration can be removed. Then the remaining changes + are processed. The Bcfg2 client loops while progress is made in the + correction of these incorrect configuration entries. This loop results + in the client being able to accomplish all it will be able to during + one execution. Once all entries are fixed, or no progress is being + made, the loop terminates. Once all configuration changes that can be + performed have been, bundle dependencies are handled. Bundle groupings + result in two different behaviors. Contained entries are assumed + to be inter-dependent. To address this, the client re-verifies each + entry in any bundle containing an updates configuration entry. Also, + services contained in modified bundles are restarted. + +* **Statistics Upload** + + Once the reconfiguration process has concluded, the client reports + information back to the server about the actions it performed during the + reconfiguration process. Statistics function as a detailed return code + from the client. The server stores statistics information. Information + included in this statistics update includes (but is not limited to): + + * Overall client status (clean/dirty) + * List of modified configuration entries + * List of uncorrectable configuration entries + * List of unmanaged configuration entries + +Architecture Abstraction +------------------------ + +The Bcfg2 client internally supports the administrative tools available +on different architectures. For example, ``rpm`` and ``apt-get`` are +both supported, allowing operation of Debian, Redhat, SUSE, and Mandriva +systems. The client toolset is determined based on the availability of +client tools. The client includes a series of libraries which describe +how to interact with the system tools on a particular platform. + +Three of the libraries exist. There is a base set of functions, which +contain definitions describing how to perform POSIX operations. Support +for configuration files, directories, symlinks, hardlinks, etc., are +included here. Two other libraries subclass this one, providing support +for Debian and rpm-based systems. + +The Debian toolset includes support for apt-get and update-rc.d. These +tools provide the ability to install and remove packages, and to install +and remove services. + +The Redhat toolset includes support for rpm and chkconfig. Any other +platform that uses these tools can also use this toolset. Hence, all +of the other familiar rpm-based distributions can use this toolset +without issue. + +Other platforms can easily use the POSIX toolset, ignoring support for +packages or services. Alternatively, adding support for new toolsets +isn't difficult. Each toolset consists of about 125 lines of python code. + +The Bcfg2 Server +================ + +The Bcfg2 server is responsible for taking a network description and +turning it into a series of configuration specifications for particular +clients. It also manages probed data and tracks statistics for clients. + +The Bcfg2 server takes information from two sources when generating +client configuration specifications. The first is a pool of metadata that +describes clients as members of an aspect-based classing system. That is, +clients are defined in terms of aspects of their behavior. The other is +a file system repository that contains mappings from metadata to literal +configuration. These are combined to form the literal configuration +specifications for clients. + +The Configuration Specification Construction Process +---------------------------------------------------- + +As we described in the previous section, the client connects to the server +to request a configuration specification. The server uses the client's +metadata and the file system repository to build a specification that +is tailored for the client. This process consists of the following steps: + +* **Metadata Lookup** + + The server uses the client's IP address to initiate the metadata + lookup. This initial metadata consists of a (profile, image) tuple. If + the client already has metadata registered, then it is used. If not, + then default values are used and stored for future use. This metadata + tuple is expanded using some profile and class definitions also included + in the metadata. The end result of this process is metadata consisting + of hostname, profile, image, a list of classes, a list of attributes + and a list of bundles. + +* **Abstract Configuration Construction** + + Once the server has the client metadata, it is used to create + an abstract configuration. An abstract configuration contains + all of the configuration elements that will exist in the final + specification **without** any specifics. All entries will be typed + (i.e. the tagname will be one of Package, Path, Action, etc) and will + include a name. These configuration entries are grouped into bundles, + which document installation time interdependencies. + +* **Configuration Binding** + + The abstract configuration determines the structure of the client + configuration, however, it doesn't yet contain literal configuration + information. After the abstract configuration is created, each + configuration entry must be bound to a client-specific value. The Bcfg2 + server uses plugins to provide these client-specific bindings. The Bcfg2 + server core contains a dispatch table that describes which plugins can + handle requests of a particular type. The responsible plugin is located + for each entry. It is called, passing in the configuration entry and + the client's metadata. The behavior of plugins is explicitly undefined, + so as to allow maximum flexibility. The behaviours of the stock plugins + are documented elsewhere in this manual. Once this binding process + is completed, the server has a literal, client-specific configuration + specification. This specification is complete and comprehensive; the + client doesn't need to process it at all in order to use it. It also + represents the totality of the configuration specified for the client. + +The Literal Configuration Specification +======================================= + +Literal configuration specifications are served to clients by the +Bcfg2 server. This is a differentiating factor for Bcfg2; all other +major configuration management systems use a non-literal configuration +specification. That is, the clients receive a symbolic configuration that +they process to implement target states. We took the literal approach +for a few reasons: + +* A small list of configuration element types can be defined, each of + which can have a set of defined semantics. This allows the server to + have a well-formed model of client-side operations. Without a static + lexicon with defined semantics, this isn't possible. This allows the + server, for example, to record the update of a package as a coherent + event. +* Literal configurations do not require client-side processing. Removing + client-side processing reduces the critical footprint of the tool. + That is, the Bcfg2 client (and the tools it calls) need to be + functional, but the rest of the system can be in any state. Yet, + the client will receive a correct configuration. +* Having static, defined element semantics also requires that all + operations be defined and implemented in advance. The implementation + can maximize reliability and robustness. In more ad-hoc setups, these + operations aren't necessarily safely implemented. + +The Structure of Specifications +------------------------------- + +Configuration specifications contain some number of clauses. Two types +of clauses exist. Bundles are groups of inter-dependent configuration +entities. The purpose of bundles is to encode installation-time +dependencies such that all new configuration is properly activated +during reconfiguration operations. That is, if a daemon configuration +file is changed, its daemon should be restarted. Another example of +bundle usage is the reconfiguration of a software package. If a package +contains a default configuration file, but it gets overwritten by an +environment-specific one, then that updated configuration file should +survive package upgrade. The purpose of bundles is to describe services, +or reconfigured software packages. Independent clauses contain groups +of configuration entities that aren't related in any way. This provides a +convenient mechanism that can be used for bulk installations of software. + +Each of these clauses contains some number of configuration entities. A +number of configuration entities exist including Path, Package, Service, +etc. Each of these correspond to the obvious system item. Configuration +specifications can get quite large; many systems have specifications +that top one megabyte in size. An example of one is included in an +appendix. These configurations can be written by hand, or generated by +the server. + +Design Considerations +===================== + +This section will discuss several aspects of the design of Bcfg2, and the +particular use cases that motivated them. Initially, this will consist +of a discussion of the system metadata, and the intended usage model +for package indices as well. + +System Metadata +--------------- + +Bcfg2 system metadata describes the underlying patterns in system +configurations. It describes commonalities and differences between these +specifications in a rigorous way. The groups used by Bcfg2's metadata are +responsible for differentiating clients from one another, and building +collections of allocatable configuration. + +The Bcfg2 metadata system has been designed with several high-level +goals in mind. Flexibility and precision are paramount concerns; no +configuration should be undescribable using the constructs present in +the Bcfg2 repository. We have found (generally the hard way) that any +assumptions about the inherent simplicity of configuration patterns tend +to be wrong, so obscenely complex configurations must be representable, +even if these requirements seem illogical during the implementation. + +In particular, we wanted to streamline several operations that commonly +occurred in our environment. + +* Copying one node's profile to another node. + + In many environments, many nodes are instances of a common configuration + specification. They all have similar roles and software. In our + environment, desktop machines were the best example of this. Other than + strictly per-host configuration like SSH keys, all desktop machines + use a common configuration specification. This trivializes the process + of creating a new desktop machine. + +* Creating a specialized version of an existing profile. + + In environments with highly varied configurations, departmental + infrastructure being a good example, "another machine like X but with + extra software" is a common requirement. For this reason, it must be + trivially possible to inherit most of a configuration specification + from some more generic source, while being able to describe overriding + aspects in a convenient fashion. + +* Compose several pre-existing configuration aspects to create a new profile. + + The ability to compose configuration aspects allows the easy creation + of new profiles based on a series of predefined set of configuration + specification fragments. The end result is more agility in environments + where change is the norm. + + In order for a classing system to be comprehensive, it must be usable in + complex ways. The Bcfg2 metadata system has constructs that map cleanly + to first-order logic. This implies that any complex configuration + pattern can be represented (at all) by the metadata, as first-order + logic is provably comprehensive. (There is a discussion later in the + document describing the metadata system in detail, and showing how it + corresponds to first-order logic) + +These use cases motivate several of the design decisions that we +made. There must be a many to one correspondence between clients and +groups. Membership in a given profile group must imbue a client with +all of its configuration properties. + +Package Management +------------------ + +The interface provided in the Bcfg2 repository for package specification +was designed with automation in mind. The goal was to support an +append only interface to the repository, so that users do not need to +continuously re-write already existing bits of specification. -- cgit v1.2.3-1-g7c22