=======================
Tonel File and Directory Format Specification, version 1.0 draft 1
Martin McClure, principal author
====================
About draft 1:
This draft is an attempt to specify what was agreed upon between GemTalk (Martin McClure, Dale Henrichs) and Pharo (Esteban Lorenzano) in 2017 and early 2018, which I refer to as Tonel v1.0. Pharo 6 and 7 use a somewhat earlier version of Tonel, for which no specification exists AFAIK. Tonel as practiced by Pharo 6 and 7 may be informally referred to as "Tonel v0". Issues of conversion and compatibility between Tonel v0 and Tonel v1.0 are certainly important, but this specification does not address them. They may or may not be addressed in future drafts, depending on feedback from the community of Tonel implementors.
Existing Tonel 0 implementations will need some changes to move from Tonel 0 to Tonel 1.0. These changes are relatively minor. However, since there are *some* required changes, it gives us an opportunity to make more changes, if we as a community so desire. Personally, I am more interested in having a standard that covers our needs and is robust for moving into the future than I am in minimizing the changes from Tonel 0. I also like simplicity, and there are some minor ways to simplify the Tonel grammar by deviating slightly from the Tonel 1.0 agreed upon in 2017-2018.
In this draft, I have marked places where I have questions with '***'. Some of these are places where more community input is needed to produce a complete specification, and others are places where I see an opportunity to simplify or improve and would like feedback.
Thanks for reading and commenting,
-Martin
====================
Introduction
------------
Tonel is a set of formats for storing and sharing Smalltalk code in disk directories and files.
Goals of Tonel include
* Interchange between Smalltalk implementations
* Management of Smalltalk code in Git or another text-file-based code versioning system (for example, Subversion). Whenever the term "Git" is used in this standard, it should be understood to apply equally to any similar versioning system.
* Ability to edit Smalltalk code in plain-text editors, as well as within a Smalltalk IDE
* Meaningful use of third-party diffing tools for Smalltalk code
Declarative
Traditional Smalltalk fileins are imperative -- they execute code as the file is parsed.
Tonel is declarative. In a typical implementation, Tonel files are parsed into a graph of definition objects (class definition, method definition, etc.) and these definition objects are then analyzed and either applied to the system or delivered to tools that can manipulate the definitions directly.
A Tonel format is used to store a package definition. A package definition is a set of class definitions and class extension definitions, and other information about the package.
Tonel is not concerned with code organization at granularities larger than a package. How multiple packages might be stored and relate to each other is beyond the scope of the current Tonel specification. This specification may be expanded in the future to include inter-package structures used by the Rowan package manager.
Similarly, although the Tonel formats fully describe the units that make up a package, Tonel does not specify what any of those units would look like outside the context of the entire package.
About this specification
This specification defines the format of the files and directories that are to be produced by an abstract software entity called a Tonel Writer, and to be recognized by an abstract software entity called a Tonel Reader. Each implementation that complies with this specification will have its own way of making concrete these abstract entities. Since one of the goals of Tonel is to be human-readable and, when necessary, editable by humans in a normal text editor, it must be made clear that a human is not a Tonel Reader, and a text editor is not a Tonel Writer -- these terms are reserved for software that understands the Tonel format.
Where not otherwise specified herein, the rule is that Tonel Writers must strictly follow the format, and Tonel Readers should be permissive (where possible) of slight deviations from the precise format. For example, the exact whitespace characters to be used by writers are specified, but any whitespace sequences should be accepted by readers. Relaxed reading is necessary to allow humans to edit Tonel files without getting every last detail right. Strict writing is necessary to prevent identical code written by different Tonel implementations from showing differences in Git.
To conform to this specification, a Tonel Writer and Tonel Reader must both be implemented. A third software entity may also be implemented, a Tonel Verifier. A Tonel Verifier is similar to a Tonel Reader, but strictly enforces the Tonel format. Its primary purpose is the testing of Tonel Writers.
Terms: Must, should, may:
An implementation of a Tonel Reader or Tonel Writer does not conform with this specification unless its behavior matches every part of this specification that uses the word "must." If the word "should" is used to describe a behavior, that behavior is highly recommended, but not mandatory for conformance. If the word "may" is used, the described behavior will not prevent conformance. If none of the qualifiers "must," "should," or "may" appears, the described behavior is mandatory, as if the word "must" were used.
==============
Overview
Three formats are specified:
1) File-per-method, a format with the smallest granularity
2) File-per-class, a format with medium granularity
3) File-per-package, a format with large granularity
These three formats are semantically identical, so anything represented in one of the formats can be converted to any of the other formats with no information loss. The developer is free to work in any of the three formats, depending on their needs and circumstances.
Rationale for choosing these formats (this section is non-normative):
File-per-method
This format is the easiest to merge in Git, since file merges are only required if multiple developers change the same method. It is also easier to track the history of an individual method using standard Git tools. However, on a typical platform a file takes a minimum disk space of several thousand bytes, and Smalltalk methods take much less than this, so file-per-method often takes an order of magnitude more space on disk than the actual source code.
It is anticipated that this format might be used for areas of code that are under active development. Large projects might use File-per-method for only a portion of their entire code base, switching to one of the other formats when an area becomes relatively stable.
File-per-class
This format is anticipated to be the most-used format. Files with class granularity waste much less space than using method granularity, but merging and tracking problems are not as intrusive as with file-per-package.
File-per-package
This format, unlike the others, contains *all* of the information about a package within a single file. It is anticipated that this will be used as a way to package a stable release version of a package for download. It is less desirable during development.
--------
Conceptually understanding the relationship between the formats
This specification tries to keep the relationships between the three formats as simple as possible. Every Tonel file contains a short file header, in most cases followed by a file body. Each format of larger granularity is formed by grouping together files from the smaller-granularity format and concatenating their bodies together in a well-defined order to form the body of one new file.
Thus, a class file in file-per-class format is a new header followed by the bodies of the class definition file and all of the method definition files from the file-per-method format. Similarly, a package file in file-per-package format is a new header followed by the bodies of the package definition file and all of the class files from the file-per-class format.
There are a small number of exceptions to this simple rule, detailed below.
<end non-normative section>
==================================================================================
Things in common between the formats
==================================================================================
The directory properties file
-----------------------------
Each directory in the Tonel file-per-method and Tonel file-per-class formats contains a file with the filesystem name 'properties.st'. This file allows a Tonel Reader to determine what information is represented in the directory. The fact that this file has the same name in every directory makes it easy for a Tonel Reader to find. The contents of this file differ by the purpose of the directory, and are covered below.
File and directory names
------------------------
Each Tonel file and each Tonel directory has both a canonical name and a filesystem name. The canonical name is the logical, ideal, name for the directory or file, as specified below. The canonical name of a Tonel file or directory can be arbitrarily long, and can contain any Unicode character.
The filesystem name is the name actually found in the filesystem. Whenever possible, the canonical name and the filesystem name must be identical. However, there are several reasons why this might not always be possible:
* Platform restrictions on filename character set
* Platform restriction on filename length
* Name collision due to canonical names that differ only by case in the same directory of a case-insensitive filesystem.
* Name collision due to two canonical names of "properties.st" in the same directory.
* Possibly other reasons
When the filesystem name cannot be made identical to the canonical filename, the filesystem name must be chosen by the Tonel Writer to be as close as is practical to the canonical name.
*** We should standardize how the filesystem name is related to the canonical filename for known cases where they must be different. What cases are known?
The canonical name of a Tonel file is found in the header of the file itself. The canonical name of a Tonel directory can be determined by the contents of that directory's properties.st file, as detailed below.
The canonical name of the properties.st file is 'properties.st'. The filesystem name must also be 'properties.st'. The name 'properties.st' is valid in all currently known filesystems. If a class or method file has the canonical name 'properties.st', its filesystem name must not be 'properties.st' since that is reserved for the directory properties file.
*** This is unnecessarily awkward. Should we change the suffix of properties files to avoid possible name collisions? Perhaps "properties.tonel"? If not, we need to specify what the filesystem names of a class or method file whose canonical name is 'properties.st' would be.
The filesystem name of a Tonel file appears only in the filesystem, not within the files themselves.
When reading Tonel files and directories, Tonel Readers must treat each file or directory according to its canonical name, regardless of what its filesystem name is. Tonel Readers and Tonel Writers must examine the headers of each file in a directory to determine what Tonel content is actually present. They cannot rely on filenames.
Character Set and Encoding
--------------------------
All Tonel files are encoded in UTF-8.
Tonel files, and the canonical names of Tonel files and directories, can contain any Unicode character.
Structure
---------
The information within Tonel consists of these parts:
1) The directory structure itself conveys information. For instance, in file-per-class format, the information of which package defines a class is conveyed by the class file being inside a directory that belongs to the package.
2) Within each file, information takes these forms:
a) Class or package comments, in double quotes.
b) Smalltalk method source code, in square brackets.
c) Tonel STON. The STON (Smalltalk Object Notation) permitted in a Tonel file is a subset of standard STON, though the semantics are slightly different. The syntax of standard STON can be found at https://github.com/svenvc/ston/blob/master/ston-spec.md. The syntax of Tonel STON is specified below, and will remain constant even if the syntax of standard STON changes. In this standard, "STON" refers to Tonel STON unless otherwise indicated.
Write Strictly, Read Permissively
---------------------------------
In order to use Git to determine the changes between two versions of the same package, it is vital that the only changes to the files and directories representing that package reflect actual semantic differences. Therefore, all Tonel Writers conforming to this specification must produce character-for-character identical output for semantically identical packages.
However, one goal of Tonel is to be readable by humans, and to be edited by humans using plain text editors. To reduce the burden of correctness on human editors of Tonel files, Tonel Readers must allow variations that do not alter the semantic content.
Two areas where this principle of "write strictly, read permissively" applies are ordering and whitespace.
Ordering
--------
Tonel Writers must write the elements of all files in a strict order as specified below. Some sorting is specified below as "code point order." This means to sort in a manner equivalent to the following steps:
* Perform Unicode normalization of the strings to be sorted to Normalization Form Canonical Composition (NFC).
* Sort the normalized strings in order of increasing Unicode code point, leftmost character the most significant. When sorting two normalized strings, one of which is the leading substring of the other, the shorter string must sort before the longer. For example, 'fire' sorts before 'firefighter'.
Using code point order avoids file differences arising from collation differences in different locales.
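The two steps above can be sketched in a few lines of Python. This is only an illustration, not part of the specification. The function name code_point_sorted and the sample names are hypothetical. Python compares strings code point by code point and sorts a leading substring before the longer string, so after NFC normalization the built-in sort already has the required behavior.

```python
import unicodedata


def code_point_sorted(names):
    """Return the names sorted in code point order.

    Each name is normalized to NFC for comparison only; the returned
    list contains the original, unmodified strings.
    """
    return sorted(names, key=lambda s: unicodedata.normalize("NFC", s))


# "e\u0301clair" uses a combining accent; "\u00e9cole" uses a precomposed one.
# After normalization both start with U+00E9 and sort after the ASCII names.
names = ["firefighter", "fire", "Zebra", "apple", "e\u0301clair", "\u00e9cole"]
print(code_point_sorted(names))
```

Note that uppercase ASCII letters have lower code points than lowercase ones, so 'Zebra' sorts before 'apple'. This differs from most locale-aware collations, which is the point of using code point order.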
In cases where the order of elements is not semantically significant (for instance, keys in a dictionary or methods in a class) Tonel Readers must accept those elements in any order. This makes it easier for humans using normal text editors to edit Tonel files to create files that will be accepted by Tonel Readers.
Whitespace
----------
Line ending conventions
In all Tonel files, line ends must use the convention of the local operating system environment. This means CRLF for Windows, LF for Unix/Mac. It is recommended that users of Tonel configure their Git repositories to use LF line endings internally, and let Git handle the translation on the way in and out. A single LF (Unix) or CRLF sequence (Windows) is referred to below as a "newline."
Tonel files can also contain multi-line Smalltalk string literals. Tonel Writers and Tonel Readers must use the local OS environment's line ending convention in these cases, converting if necessary from the internal line end used by the local Smalltalk implementation.
Where whitespace is specified herein, Tonel Writers must write the exact sequence and number of spaces, tabs, and newlines specified herein.
Tonel Readers must allow any number and sequence of spaces, tabs, newlines, and optionally other whitespace characters (e.g. non-breaking space) wherever the syntax specifies "whitespace."
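As an illustration of the line-ending rule, the following Python sketch converts text from an internal line-end convention to the local one before writing. The function name to_local_newlines is hypothetical. Treating a lone CR as an internal line end is an assumption that fits Smalltalk implementations that use CR internally; an implementation with a different internal convention would adjust the first step.

```python
import os


def to_local_newlines(text, linesep=os.linesep):
    """Convert every line end in text to the local platform convention."""
    # Normalize CRLF and lone CR to LF first, so that mixed input
    # cannot produce sequences such as CR CR LF.
    normalized = text.replace("\r\n", "\n").replace("\r", "\n")
    return normalized.replace("\n", linesep)


# Explicit linesep arguments make the result independent of the host OS.
print(repr(to_local_newlines("a\rb\r\nc\nd", linesep="\r\n")))
print(repr(to_local_newlines("a\rb\r\nc\nd", linesep="\n")))
```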
Transience of variations
------------------------
Because of the principle of reading permissively, a Tonel Reader may successfully read a file that does not strictly conform to Tonel format (because it was written by a human or a faulty Tonel Writer implementation). Reading such a file results in objects representing code components such as methods, classes, and packages. When a Tonel Writer re-writes these components, it must write them in the strict format. Any variations present when reading must *not* be preserved when written by a Tonel Writer. The recommended development practice is to re-write any hand-edited files after they are read, to clean up any non-standard variations in order, whitespace, etc., before committing the files to Git.
==============
Details
=======
Syntax
------
Syntax in this specification is presented in PEG format (see https://en.wikipedia.org/wiki/Parsing_expression_grammar and
https://pdos.csail.mit.edu/~baford/packrat/popl04/peg-popl04.pdf).
Tonel files contain information in three main syntaxes: Tonel STON, Comments, and Tonel Method Source. All three of these use a common whitespace syntax. Wherever a syntax references a production whose name starts with 'Ws', Tonel Writers must write exactly these characters, but Tonel Readers must accept the production 'Whitespace'. For example, "WsTabTab" is defined as a sequence of two tab characters. At a point in the grammar where this production appears, Tonel Writers must write exactly these two bytes, but Tonel Readers must accept any sequence of whitespace characters.
Whitespace Syntax
-----------------
WsBlank <- '\40' #Space character. PEG does escapes in octal.
WsNl <- '\12' / '\15\12' #Tonel Writers must always use the alternative native to the local platform.
WsTab <- '\11'
WsTabTab <- WsTab WsTab
Whitespace <- ('\40' / '\11' / '\12' / '\15')+
Tonel Readers must accept at least these four characters as whitespace.
Readers may also accept other non-printing Unicode codepoints (such as non-breaking space) as whitespace.
Tonel Writers must never output arbitrary whitespace between elements of Tonel syntax; they must follow the Ws* productions exactly.
Tonel STON
----------
Tonel STON is a simplified version of Smalltalk Object Notation (https://github.com/svenvc/ston). Tonel STON is a subset of standard STON with a Tonel-specific interpretation of class tags. For instance, the class tag "Class" has different content and interpretation than it would in standard STON.
For Tonel STON, each section has mandatory keys and optional keys as specified below. Tonel Writers must write all mandatory keys before any optional keys. Mandatory keys must be written in the order they are specified below. Any optional keys must be written after the mandatory keys. Optional keys must be written sorted in code point order of the key symbol.
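The ordering rule for keys can be sketched as follows. This is an illustrative Python fragment, not normative. The function name ordered_keys is hypothetical, and keys are shown without the leading '#'. Because SimpleSymbol characters are all ASCII, Python's default string sort already gives code point order here, with no normalization needed.

```python
def ordered_keys(present, mandatory):
    """Return the keys in the order a Tonel Writer must write them.

    present   -- the keys the object actually has
    mandatory -- the mandatory keys, in the order the specification lists them
    """
    missing = [key for key in mandatory if key not in present]
    if missing:
        raise ValueError(f"missing mandatory keys: {missing}")
    # Optional keys follow the mandatory ones, sorted by code point.
    optional = sorted(key for key in present if key not in mandatory)
    return list(mandatory) + optional


header_keys = {"license", "name", "tonelVersion", "contentType", "copyright"}
print(ordered_keys(header_keys, ["tonelVersion", "contentType", "name"]))
```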
Optional Keys
-------------
Some Tonel STON objects are specified below with optional keys that, if they appear, must have a specified meaning. Implementations may also define implementation-specific or Smalltalk-dialect-specific keys. These keys must have a short prefix ending with an underscore indicating the implementation or dialect for which they are used. Examples might be 'gs_' for GemStone, 'va_' for VAST, etc.
Because Tonel Writers may write prefixed optional keys not specified herein, Tonel Readers must be prepared to read keys unknown to the implementation, to which they cannot assign a meaning. Tonel Readers must accept these keys and their associated values. Except for the Tonel STON used for file headers (details below), the Tonel Reader must arrange for the optional keys and their associated values to be remembered, and Tonel Writers must write these keys unchanged from when they were read. The specific mechanism used for remembering these key-value pairs belongs in the realm of a package manager, and is therefore beyond the scope of this standard.
Tonel STON Syntax
-----------------
TonelFileObject <- 'TonelFile' WsBlank Map
PackagePropertiesObject <- 'Package' WsBlank Map
ClassPropertiesObject <- 'Class' WsBlank Map
TonelClassExtensionObject <- 'Extension' WsBlank Map
TonelMethodDescriptor <- Map
Map <- SingleEntryMap / MultipleEntryMap
SingleEntryMap <- '{' WsBlank Association WsBlank '}' WsNl
*** It would simplify the syntax to use the MultipleEntryMap format even when there is only one key-value pair, but this
*** is not done in Pharo 7, where method properties are all on one line, and limited to #category.
MultipleEntryMap <- '{' WsNl AssociationLines '}' WsNl
AssociationLines <- NotLastAssociationLine+ LastAssociationLine
NotLastAssociationLine <- WsTab Association ',' WsNl
LastAssociationLine <- WsTab Association WsNl
Association <- SimpleSymbol WsBlank ':' WsBlank (String / List)
List <- '[' WsNl ListElements WsTab ']' WsNl
ListElements <- (ListElement WsNl) / (ListElement ',' WsNl ListElements)
ListElement <- WsTabTab String
String <- "'" StringChars "'"
StringChars <- (StringChar / StringEscape)*
StringChar <- !"'" !'\\' . #Any Unicode character except ' and \
StringEscape <- '\\\\' / "\\'"
(Since all Tonel files are UTF-8, the only necessary string escapes are those for backslash and single quote.
To avoid ambiguity in the representation of strings, Tonel Writers must not use any other STON string escapes.
Tonel Readers may accept other STON string escapes.)
SimpleSymbol <- '#' SimpleSymbolChar+
SimpleSymbolChar <- [A-Z] / [a-z] / [0-9] / '-' / '_' / '.' / '/'
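A writer for the Map and String productions might look like the Python sketch below. It is a minimal illustration under stated assumptions: only String values are handled (List values are omitted), keys are passed without the leading '#' and are assumed to contain only SimpleSymbolChar characters, and the function names ston_string and ston_map are hypothetical.

```python
def ston_string(value):
    """Write a Tonel STON String using only the two permitted escapes."""
    # Escape backslashes first, so that the backslash added when
    # escaping a single quote is not doubled afterwards.
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"


def ston_map(pairs, nl="\n"):
    """Write a Map from (key, value) pairs that are already in writer order."""
    if not pairs:
        raise ValueError("the Map production needs at least one association")
    assocs = ["#" + key + " : " + ston_string(value) for key, value in pairs]
    if len(assocs) == 1:
        # SingleEntryMap: everything on one line.
        return "{ " + assocs[0] + " }" + nl
    # MultipleEntryMap: one tab-indented association per line,
    # with a comma after every line except the last.
    body = ("," + nl).join("\t" + assoc for assoc in assocs)
    return "{" + nl + body + nl + "}" + nl


print(ston_map([("category", "accessing")]), end="")
print(ston_map([("name", "Point"), ("superclass", "Object")]), end="")
```

Passing the newline as a parameter keeps the sketch independent of the host platform; a real writer would supply the local convention.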
======================
File headers
Every Tonel file, in every Tonel format, must start with a Tonel file header. This header has the syntax TonelFileObject.
Mandatory keys in file headers:
#tonelVersion
#contentType
#name
For this version of the Tonel spec, the value associated with #tonelVersion must be the string '1.0'.
The value associated with #name must be the canonical name of the file it is contained in.
#contentType will vary. The standard values of #contentType are specified below. Implementations must ignore, and must not change, any file with a non-standard #contentType that they do not recognize. Implementations writing files with a non-standard #contentType must prefix the content type with the same prefix used for implementation-specific or Smalltalk-dialect-specific keys in Tonel STON.
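A Tonel Reader's handling of an already parsed file header could be sketched as below. This Python fragment is illustrative only. The function name classify_header, the dictionary representation of the header, and the 'read'/'ignore' return values are hypothetical. Rejecting a header whose #tonelVersion is not '1.0' is an assumption of the sketch; the specification only states what writers must write.

```python
MANDATORY_HEADER_KEYS = ("tonelVersion", "contentType", "name")


def classify_header(header, known_content_types):
    """Decide what to do with a file, given its parsed header.

    header              -- dict mapping key names (without '#') to strings
    known_content_types -- content types this implementation understands
    Returns 'read' or 'ignore'; raises ValueError for a malformed header.
    """
    missing = [key for key in MANDATORY_HEADER_KEYS if key not in header]
    if missing:
        raise ValueError(f"missing mandatory header keys: {missing}")
    if header["tonelVersion"] != "1.0":
        # Assumption: this sketch refuses versions it does not know.
        raise ValueError(f"unsupported tonelVersion: {header['tonelVersion']!r}")
    if header["contentType"] not in known_content_types:
        # Unrecognized content types must be ignored and left unchanged.
        return "ignore"
    return "read"


header = {"tonelVersion": "1.0", "contentType": "package", "name": "properties.st"}
print(classify_header(header, known_content_types={"package"}))
```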
Defined optional keys in file headers:
#license
#copyright
Tonel Writers must write the key/value associations for #license and/or #copyright when, and only when, they are defined in the package properties. Whenever written, they must have the same value as the value for the corresponding key in the package properties.
Tonel Readers must ignore #license and #copyright keys in file headers.
Tonel Verifiers must verify that the keys #copyright and #license appear in file headers if and only if the matching key appears in the package properties, and that if the keys appear the corresponding values are identical.
*** Any other mandatory or defined optional keys?
Implementation-specific optional keys in file headers are allowed, but discouraged. Because the file header does not contribute to the semantic content of the package (informally speaking, it's part of the structure and not part of the payload), Tonel implementations are not required to remember and re-write any implementation-specific optional file header keys that they do not recognize, though they are encouraged to do so when possible. When writing a package in a different Tonel format than was read (for instance, reading file-per-class and subsequently writing file-per-package) there may not be a meaningful way to write arbitrary information that was read from file headers.
=====================
Comments
--------
Comments must appear in Tonel files where specified below, and only where specified below. These comments are significant. They are not comments on the Tonel file itself, they are package or class comments. The syntax of a comment is:
Comment <- '"' ((!'"' .) / '""')* '"' WsNl WsNl
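Writing a comment according to this production amounts to doubling any embedded double quotes and appending two newlines. A minimal Python sketch, with the hypothetical function name tonel_comment:

```python
def tonel_comment(text, nl="\n"):
    """Write a class or package comment in Tonel Comment syntax."""
    # A double quote inside the comment is written as two double quotes.
    return '"' + text.replace('"', '""') + '"' + nl + nl


print(tonel_comment('I represent a "point" in 2D space.'), end="")
```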
=======================
Representation of Methods
-------------------------
Tonel has two ways of representing a method. Most methods enclose the source code in square brackets using the syntax SquareBracketsMethod (detailed below), with other information about the method contained in a Tonel STON TonelMethodDescriptor.
For a Tonel Reader to be able to correctly find the right bracket that denotes the end of the method, the method itself must contain properly balanced square brackets. The SquareBracketsMethod syntax compensates for unbalanced square brackets that appear in character literals, string literals, and comments. However, some Smalltalks permit extended syntax, with the result that some methods cannot be represented in SquareBracketsMethod syntax. One example is Smalltalk/X, which allows C code to be embedded in Smalltalk methods.
To allow arbitrary syntax to be expressed in Tonel methods, a method that cannot be represented in SquareBracketsMethod syntax must have its behavior, selector, and source included in the Tonel STON TonelMethodDescriptor.
Every method whose source *can* be represented in SquareBracketsMethod form *must* be so represented. Only methods that cannot be represented in SquareBracketsMethod form may include their source code in the TonelMethodDescriptor.
The syntax for a method is:
DescribedSmalltalkMethod <- TonelMethodDescriptor? SquareBracketsMethod?
The optional TonelMethodDescriptor in the above production is forbidden if there are no keys to put in it, and mandatory otherwise.
The optional SquareBracketsMethod in the above production is forbidden if the source is being included in the TonelMethodDescriptor, mandatory if it is not.
TonelMethodDescriptors
----------------------
Mandatory keys in a TonelMethodDescriptor are:
If a method's source code is being included in a TonelMethodDescriptor, three keys are mandatory. If the source of a method can be represented as a SquareBracketsMethod, these three keys are prohibited for that method:
#selector : The selector as a String (Note: A String, *not* a Symbol. *All* Tonel STON values are Strings.)
#isMeta : The String 'true' if the method is a class-side method, 'false' if instance-side.
#source : The method's full source code as a String. This source *includes* the method pattern.
*** What are the other mandatory and defined optional keys of a TonelMethodDescriptor? Pharo 7 just uses #category. It's not clear to me that this should be mandatory, though it should probably be a defined optional key.
Defined optional keys are:
#category : String category name
SquareBracketsMethod syntax
---------------------------
SquareBracketsMethod <- MethodIdentification WsBlank '[' WsNl SmalltalkCode WsNl ']' WsNl
MethodIdentification <- ClassIdentifier WsBlank '>>' WsBlank MethodPattern
ClassIdentifier <- ClassName (WsBlank 'class')?
ClassName <- (!Whitespace .)+
MethodPattern <- (!(WsBlank '[') .)+
SmalltalkCode <- (SmalltalkStringLiteral / SmalltalkComment / SmalltalkCharacterLiteral / SmalltalkBlock / OtherSmalltalkCode)*
SmalltalkStringLiteral <- "'" (!"'" .)* "'" (note that detecting '' within a string literal is not necessary in Tonel)
SmalltalkComment <- '"' ( !'"' . )* '"'
SmalltalkCharacterLiteral <- '$' .
SmalltalkBlock <- '[' SmalltalkCode ']'
OtherSmalltalkCode <- (!('"' / "'" / '$' / '[') .)*
The full source code of a method described using this syntax is the concatenation of the MethodPattern, a newline, and the SmalltalkCode from the SquareBracketsMethod production.
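One way a Tonel Writer might decide whether a method body can be written in SquareBracketsMethod form is to scan it the way the grammar above does: skip string literals, comments, and character literals, and count the remaining brackets. The Python sketch below illustrates that idea and the source concatenation rule. It is not a full Smalltalk parser, and the function names brackets_balanced and full_source are hypothetical.

```python
def brackets_balanced(code):
    """Return True if '[' and ']' balance outside literals and comments."""
    depth = 0
    i = 0
    while i < len(code):
        ch = code[i]
        if ch == "$":
            i += 2  # character literal: skip '$' and the character after it
            continue
        if ch in ("'", '"'):
            # String literal or comment: skip to the matching delimiter.
            # A doubled '' simply reads as two adjacent string literals.
            end = code.find(ch, i + 1)
            if end == -1:
                return False  # unterminated literal or comment
            i = end + 1
            continue
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
            if depth < 0:
                return False  # closing bracket with no opener
        i += 1
    return depth == 0


def full_source(method_pattern, smalltalk_code, nl="\n"):
    """Concatenate MethodPattern, a newline, and SmalltalkCode."""
    return method_pattern + nl + smalltalk_code


print(brackets_balanced("^ items collect: [:each | each printString , ']']"))
print(brackets_balanced("^ $] value ]"))
```

A method for which brackets_balanced answers False would be a candidate for the TonelMethodDescriptor representation with #selector, #isMeta, and #source.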
*** Dale points out that putting the class name in each MethodIdentification is redundant, and that it would be simpler to omit it.
*** He has had to rename classes by editing Tonel files in a text editor, and this is *much* simpler if the
*** class name is not repeated throughout.
*** We would still need to indicate whether it's a class method or not.
*** Possible replacements for ClassName >> MethodPattern could be 'instance >> MethodPattern' and 'class >> MethodPattern'.
=========================
Representation of Classes
-------------------------
Class definitions
-----------------
Classes are defined in Tonel STON with the syntax of ClassProperties:
ClassProperties <- Comment? ClassPropertiesObject
The optional Comment is the class comment, which should describe the class being defined.
Mandatory keys of the ClassPropertiesObject are:
#name : String class name
#superclass : String class name
*** Should there be any other mandatory keys in a class definition? Which ones?
The value associated with the key #name is the name of the class being defined.
The value associated with the key #superclass is the name of the superclass of the class being defined, or the string 'nil'. Note that the standard STON value nil must not be written by Tonel Writers. Tonel Readers may accept the standard STON value nil as equivalent to the String value 'nil'.
Defined optional keys are:
#category : String category name
#instVars : List of string instance variable names, in order of definition.
*** Any other defined optional keys?
Class extensions
----------------
Classes are extended (given additional methods) in Tonel STON using the syntax of TonelClassExtensionObject.
Mandatory keys are:
#name : String class name
The value associated with the key #name is the name of the class being extended.
==========================
Representation of Packages
--------------------------
The definition and properties of a package use the syntax:
PackageProperties <- Comment? PackagePropertiesObject
The optional Comment is the package comment, which should describe the package.
The mandatory keys of PackagePropertiesObject are:
#name
The value associated with #name is the name of the package.
The defined optional keys of PackagePropertiesObject are:
#copyright
#license
The values associated with #copyright and #license describe the legal status of the code in the package. These keys are optional, but their use is encouraged. When either or both of #copyright and #license are present, Tonel Writers must also write these key/value associations into the file header of every file in the Tonel representation of the package.
*** Should there be any other mandatory keys or defined optional keys for package properties?
*** Should we specify some standard values for #license? For open-source, the SPDX license identifier would seem to be a good choice (see https://spdx.org/ids). For proprietary software, the string "All Rights Reserved" would seem reasonable.
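The propagation rule for #copyright and #license can be sketched as follows. This is an illustration only: it assumes the PackagePropertiesObject and the file header are both modelled as Python dicts keyed by symbol text, and it shows only the #contentType key of the header. The function names are hypothetical.

```python
# Sketch only: package properties and file headers are modelled as dicts.
LEGAL_KEYS = ("#copyright", "#license")


def legal_header_entries(package_props):
    """Return the key/value associations that must be copied into headers."""
    if "#name" not in package_props:
        raise ValueError("missing mandatory key #name")
    return {key: package_props[key] for key in LEGAL_KEYS if key in package_props}


def header_for(content_type, package_props):
    """Build a file header dict for one file of the package.

    Only #contentType and the legal keys are shown; any other header
    keys are outside the scope of this sketch.
    """
    header = {"#contentType": content_type}
    header.update(legal_header_entries(package_props))
    return header
```

For example, a package whose properties contain only #name and #license would have the #license association, but no #copyright association, written into every file header.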
======================================================================================
THE FORMATS
======================================================================================
The file-per-method format
==========================
Directory structure
-------------------
The directory for a package contains a file named "properties.st" and a directory for each class defined or extended in the package.
Package directory
    properties.st file
    Class1 directory
    Class2 directory
    ...
    ClassN directory
Package Directory
-----------------
A package directory's canonical name is the name of the package, as specified in the properties.st file in that directory.
A package directory contains a properties.st file and a directory for each class defined or extended by the package.
The properties.st file of a package directory contains a TonelFile header followed by a package definition. The syntax of this file is:
PackagePropertiesFile <- TonelFileObject WsNl PackageProperties
In this file's TonelFileObject, the value associated with #contentType must be 'package'.
Class Directories
-----------------
A class directory represents either the definition of a class and its methods within a package, or a class extension. A class extension is intended to add methods to a class that is defined in some other package.
Tonel Readers must report an error if a class is defined more than once in the same package, or is extended more than once in the same package, or is both defined and extended in the same package.
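Taken together, the three error cases above mean that a class name may appear at most once per package, whether as a definition or as an extension. A minimal sketch of that check follows, assuming the Reader has already collected one (class name, kind) pair per class directory. The function name and the kind strings 'definition' and 'extension' are hypothetical.

```python
def check_class_entries(entries):
    """Raise ValueError if any class is defined or extended more than once.

    entries: iterable of (class_name, kind) pairs, where kind is
    'definition' or 'extension' (labels chosen for this sketch).
    """
    seen = {}
    for class_name, kind in entries:
        if class_name in seen:
            previous = seen[class_name]
            if previous == kind:
                problem = f"more than one {kind}"
            else:
                problem = "both a definition and an extension"
            raise ValueError(f"class {class_name} has {problem} in this package")
        seen[class_name] = kind
    return seen
```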
Class Definition Directory
--------------------------
A class definition directory's canonical name is the name of the class being defined, followed by '.class'. For example, the class Object would be defined by a directory whose canonical name is 'Object.class'.
A class definition directory contains a properties.st file and two subdirectories, whose canonical names and filesystem names are 'instance' and 'class'. Tonel Readers must ignore, and Tonel Writers must not modify, any other files or subdirectories of a class definition directory.
Tonel Writers must write both the 'instance' and 'class' subdirectories, and the properties.st files within them, even if there are no other contents for those directories. Tonel Readers must interpret the absence of one or both of these subdirectories as equivalent to a directory defining no methods.
The properties.st file of a class definition directory contains a TonelFile header followed by the class definition. The syntax of this file is:
ClassDefinitionPropertiesFile <- TonelFileObject WsNl ClassProperties
In this file's TonelFileObject, the value associated with #contentType must be 'classDefinition'.
The mandatory and optional keys of the ClassPropertiesObject in the ClassProperties are given above in the "Representation of Classes" section.
Class Extension Directory
-------------------------
A class extension directory's canonical name is the name of the class being extended, followed by '.extension'. For example, methods would be added to the class Object by a directory whose canonical name is 'Object.extension'.
A class extension directory contains a properties.st file and two subdirectories, whose canonical names and filesystem names are 'instance' and 'class'. Tonel Readers must ignore, and Tonel Writers must not modify, any other files or subdirectories of a class extension directory.
Tonel Writers must write both the 'instance' and 'class' subdirectories, and the properties.st files within them, even if there are no other contents for those directories. Tonel Readers must interpret the absence of one or both of these subdirectories as equivalent to a directory defining no methods.
The grammar of the properties.st file in a class extension directory is:
ClassExtensionPropertiesFile <- TonelFileObject WsNl TonelClassExtensionObject
In this file's TonelFileObject, the value associated with #contentType must be 'classExtension'.
The mandatory and optional keys of the TonelClassExtensionObject are given above in the "Representation of Classes" section.
Instance and class directories
------------------------------
An instance directory defines the instance methods of a class. A class directory defines the class methods, that is, the methods of the associated metaclass.
Tonel Writers must write the subdirectories 'class' and 'instance' under each class definition directory or class extension directory, and put a valid properties.st file in each directory, even if that directory defines no methods. Tonel Readers must treat a missing 'class' or 'instance' directory as a valid directory that defines no methods.
The properties.st file in each instance or class directory has the syntax TonelFileObject.
In an instance directory's properties.st file, the #contentType must be 'instanceMethods'.
In a class directory's properties.st file, the #contentType must be 'classMethods'.
In addition to the properties.st file, an instance or class directory contains method files, one per method being defined.
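The Reader side of these rules can be sketched as below. The sketch assumes that filesystem names are identical to canonical names and that any entry other than a regular file ending in '.st' is skipped; neither assumption is stated by this specification. The function name method_files is hypothetical.

```python
from pathlib import Path


def method_files(class_dir):
    """List method file names found under a class definition or extension
    directory, keyed by 'instance' and 'class'.

    A missing subdirectory is treated as a directory defining no methods.
    """
    result = {}
    for side in ("instance", "class"):
        subdir = Path(class_dir) / side
        if not subdir.is_dir():
            result[side] = []
            continue
        result[side] = sorted(
            entry.name
            for entry in subdir.iterdir()
            # properties.st holds the header, not a method.
            if entry.is_file()
            and entry.name.endswith(".st")
            and entry.name != "properties.st"
        )
    return result
```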
Method files
------------
The canonical name of a method file is <methodSelector>.st.
*** Do we want to specify the mapping from canonical name to filesystem name for characters such as slash? Do we want to use the same convention as FileTree did for this?
The contents of a method file have the syntax:
TonelMethodFile <- TonelFileObject WsNl DescribedSmalltalkMethod
In each method file's TonelFileObject, the value associated with #contentType must be 'method'.
See the Representation of Methods section for mandatory and optional keys of the DescribedSmalltalkMethod.
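The #contentType requirements of the file-per-method format can be summarized in a single lookup. The following is a sketch under the assumption that each file is identified by its canonical path components relative to the package directory; the function name expected_content_type is hypothetical, and files that the format does not describe yield None.

```python
def expected_content_type(parts):
    """Return the required #contentType for a file in a file-per-method
    package, or None if the format does not describe that file.

    parts: canonical path components relative to the package directory,
    for example ("Object.class", "instance", "size.st").
    """
    parts = tuple(parts)
    if parts == ("properties.st",):
        return "package"
    if not parts:
        return None

    class_dir = parts[0]
    if class_dir.endswith(".class"):
        class_type = "classDefinition"
    elif class_dir.endswith(".extension"):
        class_type = "classExtension"
    else:
        return None

    if parts[1:] == ("properties.st",):
        return class_type
    if len(parts) == 3 and parts[1] in ("instance", "class"):
        if parts[2] == "properties.st":
            return "instanceMethods" if parts[1] == "instance" else "classMethods"
        if parts[2].endswith(".st"):
            return "method"
    return None
```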
==========================================================================================
*** Note that this spec does not prohibit mixing file-per-method and file-per-class formats in
*** the same package, though each class must be one or the other.
*** We should probably either explicitly allow or explicitly prohibit this mixing.
*** The mixing works technically -- I'm interested in hearing reasoning
*** for allowing or prohibiting it.
The file-per-class format
=========================
Directory structure
-------------------
The directory for a package contains a file named "properties.st" and a file for each class defined or extended in the package.
Package directory
    properties.st file
    Class1.class.st file
    Class2.extension.st file
    ...
    ClassN.class.st file
Package Directory
-----------------
A package directory's canonical name is the name of the package.
A package directory contains a properties.st file and a file for each class defined or extended by the package.
The properties.st file of a package directory in the file-per-class format is the same as the properties.st file of a package directory in the file-per-method format.
Class files
-----------
There are two kinds of class files. One kind represents the definition of a class and its methods within a package. The other represents a class extension, intended to add methods to a class that is defined in some other package.
Tonel Readers must report an error if a class is defined more than once in the same package, or is extended more than once in the same package, or is both defined and extended in the same package.
Class definition file
---------------------
A class definition file's canonical name is the name of the class being defined, followed by '.class.st'. For example, the class Object would be defined by a file whose canonical name is 'Object.class.st'.
The syntax of a class definition file is:
ClassDefinitionFile <- TonelFileObject WsNl ClassDefinition
ClassDefinition <- ClassProperties (WsNl SmalltalkMethods)?
SmalltalkMethods <- (DescribedSmalltalkMethod WsNl SmalltalkMethods) / DescribedSmalltalkMethod
In each class definition file's TonelFileObject, the value associated with #contentType must be 'classDefinition'.
Note that it is permitted for a class definition to contain no methods. Any methods present must be ordered. Class methods come first, followed by instance methods. Within each of these two categories, methods must be sorted by selector, in code point sort order.
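The ordering rule can be sketched as a sort key. The sketch assumes each method is modelled as a (selector, is_class_method) pair, which is a representation chosen for this illustration only. Python compares strings by code point, so no custom collation is needed.

```python
def tonel_method_order(methods):
    """Return methods in the order required within a class file.

    methods: iterable of (selector, is_class_method) pairs.
    Class methods come first; within each group, selectors are sorted
    in code point order.
    """
    # False sorts before True, so "not is_class_method" puts class methods first.
    return sorted(methods, key=lambda method: (not method[1], method[0]))
```

Note that code point order places all uppercase ASCII letters before lowercase ones, so a selector beginning with an uppercase letter sorts ahead of 'at:'.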
For every method present and using SquareBracketsMethod syntax, the ClassName in the MethodIdentification must be the same as the value associated with the #name key in the ClassPropertiesObject. Tonel Readers must report an error if this is not the case.
Class extension file
--------------------
A class extension file's canonical name is the name of the class being extended, followed by '.extension.st'. For example, the class Behavior would be extended by a file whose canonical name is 'Behavior.extension.st'.
The syntax of a class extension file is:
ClassExtensionFile <- TonelFileObject WsNl ClassExtension
ClassExtension <- TonelClassExtensionObject WsNl SmalltalkMethods
In each class extension file's TonelFileObject, the value associated with #contentType must be 'classExtension'.
The mandatory and optional keys of the TonelClassExtensionObject are given above in the "Representation of Classes" section.
Note that a class extension must contain at least one method. The methods must be ordered. Class methods come first, followed by instance methods. Within each of these two categories, methods must be sorted by selector, in code point sort order.
For every method present and using SquareBracketsMethod syntax, the ClassName in the MethodIdentification must be the same as the value associated with the #name key in the TonelClassExtensionObject. Tonel Readers must report an error if this is not the case.
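The canonical naming rules for the two kinds of class file can be sketched as a small classifier. The function name classify_class_file and the kind strings are hypothetical; the sketch works on canonical names only and does not address any mapping to filesystem names.

```python
# Suffixes taken from the canonical naming rules for class files.
CLASS_FILE_SUFFIXES = (
    (".class.st", "definition"),
    (".extension.st", "extension"),
)


def classify_class_file(canonical_name):
    """Return (class_name, kind) for a class file, or None for other files."""
    for suffix, kind in CLASS_FILE_SUFFIXES:
        # Require a non-empty class name in front of the suffix.
        if canonical_name.endswith(suffix) and len(canonical_name) > len(suffix):
            return canonical_name[: -len(suffix)], kind
    return None
```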
==========================================================================================
The file-per-package format
===========================
Directory structure
-------------------
There is no specified directory structure; each package is contained in its entirety in a single file.
Package File
------------
A package file represents a package, including information about the package itself, the classes the package defines and extends, and all of the methods defined by the package.
A package file's canonical name is the name of the package being defined, followed by '.package.st'. For instance, the canonical name of the file for the package Collections-Abstract would be 'Collections-Abstract.package.st'.
The syntax of a package file is:
PackageFile <- TonelFileObject WsNl PackageProperties WsNl PackageContents
PackageContents <- (ClassDefinitions WsNl ClassExtensions) / ClassDefinitions / ClassExtensions
ClassDefinitions <- (ClassDefinition WsNl ClassDefinitions) / ClassDefinition
ClassExtensions <- (ClassExtension WsNl ClassExtensions) / ClassExtension
In each package file's TonelFileObject, the value associated with #contentType must be 'package'.
Note that a package file must define or extend at least one class. Class definitions and extensions must be ordered. Any class definitions come before any class extensions. Within class definitions, and within class extensions, classes are ordered in code point order of class name.
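A Reader could verify the ordering of a package file's contents along the following lines. The sketch assumes the contents have been reduced to a list of (kind, class name) pairs in file order, with the hypothetical kind strings 'definition' and 'extension'. It uses a strict comparison, so a class that appears twice with the same kind is also reported as out of order.

```python
KIND_RANK = {"definition": 0, "extension": 1}


def is_ordered_package_contents(entries):
    """Check the ordering rules for the contents of a package file.

    entries: list of (kind, class_name) pairs in file order.
    Returns False for an empty list, because a package file must define
    or extend at least one class.
    """
    if not entries:
        return False
    # Definitions rank before extensions; names compare by code point.
    keys = [(KIND_RANK[kind], name) for kind, name in entries]
    return all(earlier < later for earlier, later in zip(keys, keys[1:]))
```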
For every method present and using SquareBracketsMethod syntax, the ClassName in the MethodIdentification must be the same as the value associated with the #name key in the ClassPropertiesObject of the immediately preceding ClassDefinition, or in the TonelClassExtensionObject of the immediately preceding ClassExtension. Tonel Readers must report an error if this is not the case.