summaryrefslogtreecommitdiffstats
path: root/doc/architecture.xml
blob: f484b65eb12485ebf385345a6e53934341dfdfdb (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
<chapter>
<title>Bcfg2 Architecture</title>

  <para>
    Bcfg2 is based on a client-server architecture. The client is
    reponsible for interpreting (but not processing) the configuration
    served by the server. This configuration is literal, so no local
    process is required. After completion of the configuration
    process, the client uploads a set of statistics to the
    server. This section will describe the goals and then the
    architecture motivated by it.
  </para>

  <section>
    <title>Goals</title>

    <itemizedlist>
      <listitem>
	<para>
	  Model configurations using declarative
	  semantics. Declarative semantics maximize the utility of
	  configuration management tools; they provide the most
	  flexibility for the tool to determine the right course of
	  action in any given situation. This means that users can
	  focus on the task of describing the desired configuration,
	  while leaving the task of transitioning clients states to
	  the tool. 
	</para>
      </listitem>
      <listitem>
	<para>
	  Configuration descriptions should be comprehensive. This
	  means that configurations served to the client should be
	  sufficent to reproduce all desired functionality. This
	  assumption allows the use of heuristics to detect extra
	  configuration, aiding in reliable, comprehensive
	  configuration definitions.
	</para>
      </listitem>
      <listitem>
	<para>
	  Provide a flexible approach to user interactions. Most
	  configuration management systems take a rigid approach to
	  user interactions; that is, either the client system is
	  always correct, or the central system is. This means that
	  users are forced into an overly proscribed model where the
	  system asserts where correct data is. Configuration data
	  modification is frequently undertaken on both the
	  configuration server and clients. Hence, the existance of a
	  single canonical data location can easily pose a problem
	  during normal tool use.
	</para>
	<para>
	  Bcfg2 takes a different approach. The default assumption is
	  that data on the server is correct, however, the client has
	  option to run in another mode where local changes are
	  catalogued for server-side integration. If the Bcfg2 client is
	  run in dry run mode, it can help to reconcile differences
	  between current client state and the configuration described
	  on the server.
	</para>
	<para>
	  The Bcfg2 client also searches for extra configuration; that
	  is, configuration that is not specified by the configuration
	  description. When extra configuration is found, either
	  configuration has been removed from the configuration
	  description on the server, or manual configuration has
	  occurred on the client. Options related to two-way
	  verification and removal are useful for configuration
	  reconciliation when interactive access is used.
	</para>
      </listitem>
      <listitem>
	<para>
	  Plugins and administrative applications.
	</para>
      </listitem>
      <listitem>
	<para>
	  Imcremental operations.
	</para>
      </listitem>
    </itemizedlist>
  </section>

  <section>
    <title>The Bcfg2 client</title>

    <para>The Bcfg2 client performs all client configuration or
    reconfiguration operations. It renders a declarative configuration
    specification, provided by the Bcfg2 server, into a set of
    configuration operations which will, if executed, attempt to
    change the client's state into that described by the configuration
    specification. Conceptually, the Bcfg2 client serves to isolate
    the Bcfg2 server and specification from the imperative operations
    required to implement configuration changes. This isolation allows
    declarative specifications to be manipulated symbolically on the
    server, without needing to understand the properties of the
    underlying system tools. In this way, the Bcfg2 client acts as a
    sort of expert system that "knows" how to implement declarative
    configuration changes.</para>

    <para>The operation of the Bcfg2 client is intended to be as
      simple as possible. The normal configuration process consists of
      four main steps:</para>

<!--     <procedure> -->
<!--       <title>Bcfg2 Client Execution Process</title> -->

<!--       <step>foo</step> -->
<!--     </procedure> -->

    <orderedlist>
      <listitem>
	<para>Probe Execution</para>
	<para>During the probe execution stage, the client connects to
	  the server and downloads a series of probes to execute. These
	  probes reveal local facts to the Bcfg2 server. For example, a
	  probe could discover the type of video card in a system. The
	  Bcfg2 client returns this data to the server, where it can
	  influence the client configuration generation
	  process. 
	</para>
      </listitem>
      <listitem>
	<para>Configuration Download and Inventory</para>
	<para>
	  The Bcfg2 client now downloads a configuration
	  specification from the Bcfg2 server. The configuration
	  describes the complete target state of the machine. That is,
	  all aspects of client configuration should be represented in
	  this specification. For example, all software packages and
	  services should be represented in the configuration
	  specification. 
	</para>
	  
	<para>
	  The client now performs a local system inventory. This
	  process consists of verifying each entry present in the
	  configuration specification. After this check is completed,
	  heuristic checks for configuration not included in the
	  configuration specification. We refer to this inventory
	  process as 2-way validation, as first we verify that the
	  client contains all configuration that is included in the
	  specification, then we check if the client has any extra
	  configuration that isn't present. This provides a fairly
	  rigorous notion of client configuration congruence.  
	</para>

	<para>
	  Once the 2-way verification process has been
	  performed, the client has built a list of all configuration
	  entries that are out of spec. This list has two parts:
	  specified configuration that is incorrect (or missing) and
	  unspecified configuration that should be removed.
	</para>
      </listitem>
      <listitem>
	<para>Configuration Update</para>
	<para>
	  The client now attempts to update its configuration to
	  match the specification. Depending on options, changes may
	  not (or only partially) be performed. First, if extra
	  configuration correction is enabled, extra configuration can
	  be removed. Then the remaining changes are processed. The
	  Bcfg2 client loops while progress is made in the correction
	  of these incorrect configuration entries. This loop results
	  in the client being able to accomplish all it will be able
	  to during one execution. Once all entries are fixed, or no
	  progress is being made, the loop terminates.
	</para>
	  
	<para>
	  Once all configuration changes that can be performed have
	  been, bundle dependencies are handled. Bundle groupings
	  result in two different behaviors. Contained entries are
	  assumed to be inter-dependant. To address this, the client
	  re-verifies each entry in any bundle containing an updates
	  configuration entry. Also, services contained in modified
	  bundles are restarted.
	</para>
      </listitem>
      <listitem>
	<para>Statistics Upload</para>
	<para>
	  Once the reconfiguration process has concluded, the
	  client reports information back to the server about the
	  actions it performed during the reconfiguration
	  process. Statistics function as a detailed return code from
	  the client. The server stores statistics
	  information. Information included in this statistics update
	  includes (but is not limited to):
	</para>

	<itemizedlist>
	  <listitem>
	    <para>Overall client status (clean/dirty)</para>
	  </listitem>
	  <listitem>
	    <para>List of modified configuration entries</para>
	  </listitem>
	  <listitem>
	    <para>List of uncorrectable configuration entries</para>
	  </listitem>
	</itemizedlist>
      </listitem>
    </orderedlist>

    <section>
      <title>Architecture Abstraction</title>

      <para>The Bcfg2 client internally supports the administrative
	tools available on different architectures. For example, rpm and
	apt-get are both supported, allowing operation of Debian,
	Redhat, SUSE, and Mandriva systems. The client toolset is
	specified in the configuration specification. The client merely
	includes a series of libraries which describe how to interact
	with the system tools on a particular platform.
      </para>

      <para>
	Three of the libraries exist. There is a base set of
	functions, which contain definitions describing how to perform
	POSIX operations. Support for configuration files,
	directories, and symlinks are included here. Two other
	libraries subclass this one, providing support for Debian and
	rpm-based systems.
      </para>

      <para>
	The Debian toolset includes support for apt-get and
	update-rc.d. These tools provide the ability to install and
	remove packages, and to install and remove services.
      </para>

      <para>
	The Redhat toolset includes support for rpm and chkconfig. Any
	other platform that uses these tools can also use this
	toolset. Hence, all of the other familiar rpm-based
	distributions can use this toolset without issue. 
      </para>

      <para>
	Other platforms can easily use the POSIX toolset, ignoring
	support for packages or services. Alternatively, adding
	support for new toolsets isn't difficult. Each toolset
	consists of about 125 lines of python code.
      </para>
    </section>
  </section>
  <section>
    <title>The Bcfg2 Server</title>
    <para>
      The Bcfg2 server is responsible for taking a network description
      and turning it into a series of configuration specifications for
      particular clients. It also manages probed data and tracks
      statistics for clients. 
    </para>

    <para>
      The Bcfg2 server takes information from two sources when
      generating client configuration specifications. The first is a
      pool of metadata that describes clients as members of an
      aspect-based classing system. That is, clients are defined in
      terms of aspects of their behavior. The other is a file system
      repository that contains mappings from metadata to literal
      configuration. These are combined to form the literal
      configuration specifications for clients.
    </para>

    <section>
      <title>The Configuration Specification Construction Process</title>
      <para>
	As we described in the previous section, the client connects
	to the server to request a configuration specification. The
	server uses the client's metadata and the file system
	repository to build a specification that is tailored for the
	client. This process consists of the following steps:
      </para>

      <orderedlist>
	<listitem>
	  <para>Metadata Lookup</para>
	  <para>
	    The server uses the client's IP address to initiate the
	    metadata lookup. This initial metadata consists of a
	    (profile, image) tuple. If the client already has
	    metadata registered, then it is used. If not, then
	    default values are used and stored for future use.
	  </para>
	  
	  <para>
	    This metadata tuple is expanded using some profile and
	    class definitions also included in the metadata. The end
	    result of this process is metadata consisting of
	    hostname, profile, image, a list of classes, a list of
	    attributes and a list of bundles.
	  </para>
	</listitem>
	<listitem>
	  <para>Abstract Configuration Construction</para>
	  <para>
	    Once the server has the client metadata, it is used to
	    create an abstract configuration. An abstract
	    configuration contains all of the configuration elements
	    that will exist in the final specification without any
	    specifics. All entries will be typed (ie the tagname
	    will be one of Package, ConfigurationFile, Service,
	    Symlink, or Directory) and will include a name. These
	    configuration entries are grouped into bundles, which
	    document installation time interdependencies.
	  </para>
	</listitem>
	<listitem>
	  <para>Configuration Binding</para>
	  <para>
	    The abstract configuration determines the structure of
	    the client configuration, however, it doesn't contain
	    literal configuration information. After the abstract
	    configuration is created, each configuration entry must
	    be bound to a client-specific value. The Bcfg2 server
	    uses plugins to provide these client-specific bindings.
	  </para>

	  <para>
	    The Bcfg2 server core contains a dispatch table that
	    describes which plugins can handle requests of a
	    particular type. The responsible plugin is located
	    for each entry. It is called, passing in the
	    configuration entry and the client's metadata. The
	    behavior of plugins is explicitly undefined, so as to
	    allow maximum flexibility. The behaviors of the stock
	    plugins are documented elsewhere in this manual.
	  </para>

	  <para>
	    Once this binding process is completed, the server has a
	    literal, client-specific configuration
	    specification. This specification is complete and
	    comprehensive; the client doesn't need to process it at
	    all in order to use it. It also represents the totality
	    of the configuration specified for the client.	   
	  </para>
	</listitem>
      </orderedlist>
    </section>
  </section>

  <section>
    <title>The Literal Configuration Specification</title>
    
    <para>
      Literal configuration specifications are served to clients by
      the Bcfg2 server. This is a differentiating factor for Bcfg2;
      all other major configuration management systems use a
      non-literal configuration specification. That is, the clients
      receive a symbolic configuration that they process to implement
      target states. We took the literal approach for a few reasons:
    </para>
    
    <itemizedlist>
      <listitem>
	<para>
	  A small list of configuration element types can be defined,
	  each of which can have a set of defined semantics. This
	  allows the server to have a well-formed model of
	  client-side operations. Without a static lexicon with
	  defined semantics, this isn't possible. This allows the
	  server, for example, to record the update of a package as a
	  coherent event. 
	</para>
      </listitem>
      <listitem>
	<para>
	  Literal configurations do not require client-side
	  processing. Removing client-side processing reduces the
	  critical footprint of the tool. That is, the Bcfg2 client
	  (and the tools it calls) need to be functional, but the rest
	  of the system can be in any state. Yet, the client will
	  receive a correct configuration.
	</para>
      </listitem>
      <listitem>
	<para>
	  Having static, defined element semantics also requires that all
	  operations be defined and implemented in advance. The
	  implementation can maximize reliability and robustness. In
	  more ad-hoc setups, these operations aren't necessarily
	  safely implemented.
	</para>
      </listitem>
    </itemizedlist>

    <section>
      <title>The Structure of Specifications</title>
      <para>
	Configuration specifications contain some number of
	clauses. Two types of clauses exist. Bundles are groups of
	inter-dependant configuration entities. The purpose of bundles
	is to encode installation-time dependencies such that all new
	configuration is properly activated during reconfiguration
	operations. That is, if a daemon configuration file is
	changed, its daemon should be restarted. Another example of
	bundle usage is the reconfiguration of a software package. If
	a package contains a default configuration file, but it gets
	overwritten by an environment-specific one, then that updated
	configuration file should survive package upgrade. The purpose
	of bundles is to describe services, or reconfigured software
	packages. Independent clauses contains groups of configuration
	entities that aren't related in any way. This provides a
	convenient mechanism that can be used for bulk installations
	of software.
      </para>

      <para>
	Each of these clauses contains some number of configuration
	entities. Five types of configuration entities exist:
	ConfigurationFile, Package, SymLink, Directory, and
	Service. Each of these correspond to the obvious system item.
	Configuration specifications can get quite large; many systems
	have specifications that top one megabyte in size. An example
	of one is included in an appendix. These configurations can be
	written by hand, or generated by the server. The easiest way
	to start using Bcfg2 is to write small static configurations for
	clients. Once configurations get larger, this process gets
	unwieldy; at this point, using the server makes more sense.
      </para>
    </section>
  </section>

  <section>
    <title>Design Considerations</title>

    <para>
      This section will discuss several aspects of the design of
      bcfg2, and the particular use cases that motivated
      them. Initially, this will consist of a discussion of the system
      metadata, and the intended usage model for package indices as
      well. 
    </para>

    <section>
      <title>System Metadata</title>

      <para>
	Bcfg2 system metadata describes the underlying patterns in
	system configurations. It describes commonalities and
	differences between these specifications in a rigorous way. The
	groups used by bcfg2's metadata are responsible for
	differentiating clients from one another, and building
	collections of allocatable configuration.
      </para>

      <para>
	The Bcfg2 metadata system has been designed with several
	high-level goals in mind. Flexibility and precision are
	paramount concerns; no configuration should be undescribable
	using the constructs present in the bcfg2 repository. We have
	found (generally the hard way) that any assumptions about the
	inherent simplicity of configuration patterns tend to be
	wrong, so obscenely complex configurations must be
	representable, even if these requirements seem illogical
	during the implementation. 
      </para>

      <para>
	In particular, we wanted to streamline several operations that
	commonly occurred in our environment. 
      </para>
      
      <orderedlist>
	<listitem>
	  <para>Copying one node's profile to another
	  node.</para>

	  <para>
	    In many environments, many nodes are instances of a common
	    configuration specification. They all have similar roles
	    and software. In our environment, desktop machines were
	    the best example of this. Other than strictly per-host
	    configuration like SSH keys, all desktop machines use a
	    common configuration specification. This trivializes the
	    process of creating a new desktop machine.
	  </para>
	</listitem>
	<listitem>
	  <para>Creating a specialized version of a
	  currently existing profile. 
	  </para>

	  <para>
	    In environments with highly varied configurations,
	    departmental infrastructure being a good example, "another
	    machine like X but with extra software" is a common
	    requirement. For this reason, it must be trivially
	    possible to inherit most of a configuration specification
	    from some more generic source, while being able to
	    describe overriding aspects in a convenient fashion.
	  </para>
	</listitem>
	<listitem>
	  <para>
	    Compose several pre-existing configuration aspects to
	    create a new profile.
	  </para>

	  <para>
	    The ability to compose configuration aspects allows the
	    easy creation of new profiles based on a series of
	    predefined set of configuration specification
	    fragments. The end result is more agility in environments
	    where change is the norm.
	  </para>
	</listitem>
      </orderedlist>

      <para>
	In order for a classing system to be comprehensive, it must be
	usable in complex ways. The Bcfg2 metadata system has
	constructs that map cleanly to first-order logic. This implies
	that any complex configuration pattern can be represented (at
	all) by the metadata, as first-order logic is provably
	comprehensive. (There is a discussion later in the document
	describing the metadata system in detail, and showing how it
	corresponds to first-order logic)
      </para>

      <para>
	These use cases motivate several of the design decisions that
	we made:
      </para>

      <orderedlist>
	<listitem>
	  <para>
	    There must be a many to one correspondence between clients
	    and groups. Membership in a given profile group must imbue
	    a client with all of its configuration properties. 
	  </para>
	  </listitem>
      </orderedlist>
    </section>

    <section>
      <title>Package Management</title>

      <para>
	The interface provided in the bcfg2 repository for package
	specification was designed with automation in mind. The goal
	was to support an append only interface to the repository, so
	that users do not need to continuously re-write already
	existing bits of specification.
      </para>
    </section>
  </section>
</chapter>