From 98e2821b38a775737e42a2479a6bc65107210859 Mon Sep 17 00:00:00 2001
From: Elliot Kroo
Date: Thu, 11 Mar 2010 15:21:30 -0800
Subject: reorganizing the first level of folders (trunk/branch folders are not the git way :)

---
 infrastructure/ace/easysync-notes.txt | 129 ++++++++++++++++++++++++++++++++++
 1 file changed, 129 insertions(+)
 create mode 100644 infrastructure/ace/easysync-notes.txt

(limited to 'infrastructure/ace/easysync-notes.txt')

diff --git a/infrastructure/ace/easysync-notes.txt b/infrastructure/ace/easysync-notes.txt
new file mode 100644
index 0000000..6808f40
--- /dev/null
+++ b/infrastructure/ace/easysync-notes.txt
@@ -0,0 +1,129 @@
+Goals:
+
+- no unicode (for efficient escaping, sightliness)
+- efficient operations for ACE and collab (attributed text, etc.)
+- good for time-slider
+- good for API
+- line-ending aware
+X more coherent (deleting or styling text merging with insertion)
+- server-side syntax highlighting?
+- unify author map with attribute pool
+- unify attributed text with changeset rep
+- not: reversible
+- force final newline of document to be preserved
+
+- Unicode bad:
+  - ugly (hard to read)
+  - more complex to parse
+  - harder to store and transmit correctly
+  - doesn't save all that much space anyway
+  - blows up in size when string-escaped
+  - embarrassing for API
+
+
+# Attributes:
+
+An "attribute" is a (key,value) pair such as (author,abc123456) or
+(bold,true).  Sometimes an attribute is treated as an instruction to
+add that attribute, in which case an empty value means to remove it.
+So (bold,) removes the "bold" attribute.  Attributes are interned and
+given numeric IDs, so the number "6" could represent "(bold,true)",
+for example.  This mapping is stored in an attribute "pool" which may
+be shared by multiple changesets.
+
+Entries in the pool must be unique, so that attributes can be compared
+by their IDs.  Attribute names cannot contain commas.
+
+A changeset looks something like the following:
+
+Z:5g>1|5=2p=v*4*5+1$x
+
+With the corresponding pool containing these entries:
+
+...
+4 -> (author,1059348573)
+5 -> (bold,true)
+...
+
+This changeset, together with the pool, represents inserting
+a bold letter "x" into the middle of a line.  The string consists of:
+
+- a letter Z (the "magic character" and format version identifier)
+- a series of opcodes (punctuation) and numeric values in base 36 (the
+  alphanumerics)
+- a dollar sign ($)
+- a string of characters used by insertion operations (the "char bank")
+
+If we separate out the operations and convert the numbers to base 10, we get:
+
+Z :196 >1 |5=97 =31 *4 *5 +1 $"x"
+
+Here are descriptions of the operations, where capital letters are variables:
+
+":N" : Source text has length N (must be first op)
+">N" : Final text is N (positive) characters longer than source text (must be second op)
+"0" : Final text is same length as source text
+"+N" : Insert N characters from the bank, none of them newlines
+"-N" : Skip over (delete) N characters from the source text, none of them newlines
+"=N" : Keep N characters from the source text, none of them newlines
+"|L+N" : Insert N characters from the bank, containing L newlines.  The last
+         character inserted MUST be a newline, but not the (new) document's final newline.
+"|L-N" : Delete N characters from the source text, containing L newlines.  The last
+         character deleted MUST be a newline, but not the (old) document's final newline.
+"|L=N" : Keep N characters from the source text, containing L newlines.  The last character
+         kept MUST be a newline, and the final newline of the document is allowed.
+"*I" : Apply attribute I from the pool to the following +, =, |+, or |= command.
+       In other words, any number of * ops can come before a +, =, or | but not
+       between a | and the corresponding + or =.
+       If +, text is inserted having this attribute.
+       If =, text is kept but with the attribute applied as an attribute addition or removal.
+       Consecutive attributes must be sorted lexically by (key,value) with key
+       and value taken as strings.  It's illegal to have duplicate keys
+       for (key,value) pairs that apply to the same text.  It's illegal to
+       have an empty value for a key in the case of an insertion (+); the
+       pair should just be omitted.
+
+Characters from the source text that aren't accounted for are assumed to be kept
+with the same attributes.
+
+Additional Constraints:
+
+- Consecutive +, -, and = ops of the same type that could be combined are not allowed.
+  Whether combination is possible depends on the attributes of the ops and whether
+  each is multiline or not.  For example, two multiline deletions can never be
+  consecutive, nor can any insertion come after a non-multiline insertion with the
+  same attributes.
+- "No-op" ops are not allowed, such as deleting 0 characters.  However, attribute
+  applications that don't have any effect are allowed.
+- Characters at the end of the source text cannot be explicitly kept with no changes;
+  if the change doesn't affect the last N characters, those "keep" ops must be left off.
+- In any consecutive sequence of insertions (+) and deletions (-) with no keeps (=),
+  the deletions must come before the insertions.
+- The document text before and after will always end with a newline.  This policy avoids
+  a lot of special-casing of the end of the document.  If a final newline is
+  always added when importing text and removed when exporting text, then the
+  changeset representation can be used to process text files that may or may not
+  have a final newline.
+
+Attribution string:
+
+An "attribution string" is a series of inserts with no deletions or keeps.
+For example, "*3+8|1+5" describes the attributes of a string of length 13,
+where the first 8 chars have attribute 3 and the next 5 chars have no
+attributes, with the last of these 5 chars being a newline.
+Constraints similar to those on changesets apply, except that the restriction
+on inserting the new document's final newline is lifted.
+
+Attributes in an attribution string cannot be empty, like "(bold,)"; they should
+instead be absent.
+
+
+
+
+
+-------
+Considerations:
+
+- composing changesets/attributions with different pools
+- generalizing "applyToAttribution" to make "mutateAttributionLines" and "compose"
--
cgit v1.2.3-1-g7c22
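
The changeset layout described in the notes (header, base-36 ops, "$", char bank) can be illustrated with a small parser. This is a minimal sketch, not the ACE/easysync implementation: the names Op, parse_changeset and apply_to_text are hypothetical, only the ">N" form of the second op is handled (the bare "0" form is left out), attributes are parsed but ignored when applying to plain text, and none of the "Additional Constraints" are validated.

```python
import re
from typing import NamedTuple


class Op(NamedTuple):
    """One parsed operation (hypothetical helper type, not from the notes)."""
    opcode: str     # '+', '-' or '='
    chars: int      # N: number of characters affected
    lines: int      # L: newlines covered, 0 for a non-multiline op
    attribs: tuple  # attribute IDs given by the preceding '*I' ops


# Optional run of *I, optional |L, then the opcode and N (all numbers base 36).
OP_RE = re.compile(r"((?:\*[0-9a-z]+)*)(?:\|([0-9a-z]+))?([-+=])([0-9a-z]+)")
# Assumption: only the "Z:N>N" header is supported in this sketch.
HEADER_RE = re.compile(r"Z:([0-9a-z]+)>([0-9a-z]+)")


def parse_changeset(cs):
    """Return (source_length, length_delta, ops, char_bank)."""
    head, sep, bank = cs.partition("$")
    header = HEADER_RE.match(head)
    if header is None or not sep:
        raise ValueError("not a changeset this sketch understands")
    old_len = int(header.group(1), 36)
    delta = int(header.group(2), 36)
    ops = []
    pos = header.end()
    while pos < len(head):
        m = OP_RE.match(head, pos)
        if m is None:
            raise ValueError("bad op at offset %d" % pos)
        attribs = tuple(int(a, 36) for a in m.group(1).split("*") if a)
        lines = int(m.group(2), 36) if m.group(2) else 0
        ops.append(Op(m.group(3), int(m.group(4), 36), lines, attribs))
        pos = m.end()
    return old_len, delta, ops, bank


def apply_to_text(cs, text):
    """Apply the changeset to plain text, ignoring attributes."""
    old_len, delta, ops, bank = parse_changeset(cs)
    if len(text) != old_len:
        raise ValueError("source text has the wrong length")
    out, src, used = [], 0, 0
    for op in ops:
        if op.opcode == "=":    # keep N characters from the source
            out.append(text[src:src + op.chars])
            src += op.chars
        elif op.opcode == "-":  # skip over (delete) N source characters
            src += op.chars
        else:                   # '+': take N characters from the bank
            out.append(bank[used:used + op.chars])
            used += op.chars
    out.append(text[src:])      # unaccounted-for characters are kept
    result = "".join(out)
    if len(result) != old_len + delta:
        raise ValueError("result length does not match the header")
    return result
```

Calling parse_changeset("Z:5g>1|5=2p=v*4*5+1$x") yields the same decomposition as the base-10 breakdown in the notes: a multiline keep, a plain keep, and a one-character insert carrying attribute IDs 4 and 5, which a caller would then look up in the pool.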