Commit ce50dfd

Merge pull request #255 from common-workflow-language/matt-edits
Matt edits
2 parents 25784a9 + db11e6e commit ce50dfd

6 files changed: 51 additions, 50 deletions

v1.0/CommandLineTool.yml

Lines changed: 10 additions & 11 deletions
Original file line numberDiff line numberDiff line change
@@ -42,15 +42,14 @@ $graph:
4242
Since draft-3, this draft introduces the following changes and additions:
4343
4444
* The [Directory](#Directory) type.
45-
* "id: name" is now just "name"; "class: Classname" is now just
46-
"Classname".
45+
* "id: name" is now "name"; "class: Classname" is now "Classname".
4746
* [InitialWorkDirRequirement](#InitialWorkDirRequirement): list of
4847
files and subdirectories to be present in the output directory prior
4948
to execution.
5049
* Shortcuts for specifying the standard [output](#stdout) and/or
5150
[error](#stderr) streams as a (streamable) File output.
52-
* SoftwareRequirement: a lightweight method of describing software
53-
dependencies
51+
* [SoftwareRequirement](#SoftwareRequirement) for describing software
52+
dependencies of a tool.
5453
5554
## Purpose
5655
@@ -60,12 +59,12 @@ $graph:
6059
languages and execute concurrently on multiple hosts. However, POSIX
6160
does not dictate computer-readable grammar or semantics for program input
6261
and output, resulting in extremely heterogeneous command line grammar and
63-
input/output semantics among program. This a particular problem in
62+
input/output semantics among program. This is a particular problem in
6463
distributed computing (multi-node compute clusters) and virtualized
6564
environments (such as Docker containers) where it is often necessary to
6665
provision resources such as input files before executing the program.
6766
68-
Often this is gap is filled by hard coding program invocation and
67+
Often this gap is filled by hard coding program invocation and
6968
implicitly assuming requirements will be met, or abstracting program
7069
invocation with wrapper scripts or descriptor documents. Unfortunately,
7170
where these approaches are application or platform specific it creates a
@@ -203,8 +202,8 @@ $graph:
203202
items: string
204203
doc: |
205204
Find files relative to the output directory, using POSIX glob(3)
206-
pathname matching. If provided an array, find files that match any
207-
pattern in the array. If provided an expression, the expression must
205+
pathname matching. If an array is provided, find files that match any
206+
pattern in the array. If an expression is provided, the expression must
208207
return a string or an array of strings, which will then be evaluated as
209208
one or more glob patterns. Must only match and return files which
210209
actually exist.
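
Review note: the `glob` behaviour described in this hunk (a single pattern or an array of patterns, returning only paths that exist) can be sketched roughly as follows. This is a minimal illustration, not the reference implementation. The function name `match_outputs` is hypothetical, and Python's `glob` module only approximates POSIX glob(3).

```python
import glob
import os
from typing import List, Union


def match_outputs(outdir: str, patterns: Union[str, List[str]]) -> List[str]:
    """Return paths under outdir that match any of the given glob patterns.

    Hypothetical helper: a single pattern is treated as a one-element array.
    """
    if isinstance(patterns, str):
        patterns = [patterns]
    matches: List[str] = []
    for pattern in patterns:
        # Escape the directory part so only `pattern` is interpreted as a glob.
        full_pattern = os.path.join(glob.escape(outdir), pattern)
        # glob.glob only returns paths that actually exist on disk.
        for path in sorted(glob.glob(full_pattern)):
            if path not in matches:  # a path matched by two patterns is listed once
                matches.append(path)
    return matches
```

An expression that returns a string or an array of strings would be evaluated first, and its result passed in as `patterns`.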
@@ -554,14 +553,14 @@ $graph:
554553
type: int[]?
555554
doc: |
556555
Exit codes that indicate the process failed due to a possibly
557-
temporary condition, where excuting the process with the same
556+
temporary condition, where executing the process with the same
558557
runtime environment and inputs may produce different results.
559558
560559
- name: permanentFailCodes
561560
type: int[]?
562561
doc:
563562
Exit codes that indicate the process failed due to a permanent logic
564-
error, where excuting the process with the same runtime environment and
563+
error, where executing the process with the same runtime environment and
565564
same inputs is expected to always fail.
566565

567566
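
Review note: a rough sketch of how a runner might apply `temporaryFailCodes` and `permanentFailCodes` to a process exit code. The function name and the fallback rule for unlisted codes (0 is success, anything else is a permanent failure) are assumptions made for illustration; the hunk above does not state them.

```python
from typing import Iterable


def classify_exit_code(
    code: int,
    temporary_fail_codes: Iterable[int] = (),
    permanent_fail_codes: Iterable[int] = (),
) -> str:
    """Classify an exit code using the two optional code lists."""
    if code in set(temporary_fail_codes):
        # Re-running with the same environment and inputs may give a different result.
        return "temporaryFail"
    if code in set(permanent_fail_codes):
        # Re-running with the same environment and inputs is expected to fail again.
        return "permanentFail"
    # Assumed fallback for codes that appear in neither list.
    return "success" if code == 0 else "permanentFail"
```

A scheduler could use the `temporaryFail` result to decide whether a retry is worthwhile.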

@@ -574,7 +573,7 @@ $graph:
574573
the image.
575574
576575
If a CommandLineTool lists `DockerRequirement` under
577-
`hints` or `requirements`, it may (or must) be run in the specified Docker
576+
`hints` (or `requirements`), it may (or must) be run in the specified Docker
578577
container.
579578
580579
The platform must first acquire or install the correct Docker image as

v1.0/Process.yml

Lines changed: 8 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -44,7 +44,7 @@ $graph:
4444
- cwl:File
4545
- cwl:Directory
4646
doc:
47-
- "Extends primitive types with the concept of a file as a first class type."
47+
- "Extends primitive types with the concept of a file and directory as a builtin type."
4848
- "File: A File object"
4949
- "Directory: A Directory object"
5050

@@ -77,7 +77,7 @@ $graph:
7777
remote resource (due to unsupported protocol, access denied, or other
7878
issue) it must signal an error.
7979
80-
If the `location' field is not provided, the `contents` field must be
80+
If the `location` field is not provided, the `contents` field must be
8181
provided. The implementation must assign a unique identifier for
8282
the `location` field.
8383
@@ -165,7 +165,7 @@ $graph:
165165
"sha1$ + hexadecimal string" using the SHA-1 algorithm.
166166
- name: size
167167
type: long?
168-
doc: Optional file size.
168+
doc: Optional file size
169169
- name: "secondaryFiles"
170170
type:
171171
- "null"
@@ -186,7 +186,7 @@ $graph:
186186
_type: "@id"
187187
identity: true
188188
doc: |
189-
The format of the file. This must be a IRI of a concept node that
189+
The format of the file: this must be an IRI of a concept node that
190190
represents the file format, preferrably defined within an ontology.
191191
If no ontology is available, file formats may be tested by exact match.
192192
@@ -217,7 +217,7 @@ $graph:
217217
218218
- name: Directory
219219
type: record
220-
docParent: "#CWLType"
220+
docAfter: "#File"
221221
doc: |
222222
Represents a directory to present to a command line tool.
223223
fields:
@@ -362,7 +362,7 @@ $graph:
362362
doc: |
363363
Only valid when `type: File` or is an array of `items: File`.
364364
365-
For input parameters, this must be one or more IRIs of a concept nodes
365+
For input parameters, this must be one or more IRIs of concept nodes
366366
that represents file formats which are allowed as input to this
367367
parameter, preferrably defined within an ontology. If no ontology is
368368
available, file formats may be tested by exact match.
@@ -391,8 +391,8 @@ $graph:
391391
- type: enum
392392
name: Expression
393393
doc: |
394-
Not a real type. Indicates that a field must allow runtime parameter
395-
references. If [InlineJavascriptRequirement](#InlineJavascriptRequirement)
394+
'Expression' is not a real type. It indicates that a field must allow
395+
runtime parameter references. If [InlineJavascriptRequirement](#InlineJavascriptRequirement)
396396
is declared and supported by the platform, the field must also allow
397397
Javascript expressions.
398398
symbols:

v1.0/concepts.md

Lines changed: 23 additions & 23 deletions
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,4 @@
1-
## References to Other Specifications
1+
## References to other specifications
22

33
**Javascript Object Notation (JSON)**: http://json.org
44

@@ -28,7 +28,7 @@ serve as a reference for the behavior of conforming implementations.
2828
The terminology used to describe CWL documents is defined in the
2929
Concepts section of the specification. The terms defined in the
3030
following list are used in building those definitions and in describing the
31-
actions of an CWL implementation:
31+
actions of a CWL implementation:
3232

3333
**may**: Conforming CWL documents and CWL implementations are permitted but
3434
not required to behave as described.
@@ -68,16 +68,16 @@ A **process** is a basic unit of computation which accepts input data,
6868
performs some computation, and produces output data. Examples include
6969
CommandLineTools, Workflows, and ExpressionTools.
7070

71-
An **input object** is an object describing the inputs to a invocation of
72-
process.
71+
An **input object** is an object describing the inputs to an invocation of
72+
a process.
7373

74-
An **output object** is an object describing the output of an invocation of a
75-
process.
74+
An **output object** is an object describing the output resulting from an
75+
invocation of a process.
7676

7777
An **input schema** describes the valid format (required fields, data types)
7878
for an input object.
7979

80-
An **output schema** describes the valid format for a output object.
80+
An **output schema** describes the valid format for an output object.
8181

8282
**Metadata** is information about workflows, tools, or input items.
8383

@@ -87,7 +87,7 @@ CWL documents must consist of an object or array of objects represented using
8787
JSON or YAML syntax. Upon loading, a CWL implementation must apply the
8888
preprocessing steps described in the
8989
[Semantic Annotations for Linked Avro Data (SALAD) Specification](SchemaSalad.html).
90-
A implementation may formally validate the structure of a CWL document using
90+
An implementation may formally validate the structure of a CWL document using
9191
SALAD schemas located at
9292
https://github.com/common-workflow-language/common-workflow-language/tree/master/draft-4
9393

@@ -112,7 +112,7 @@ Another transformation defined in Schema salad is simplification of data type de
112112
Type `<T>` ending with `?` should be transformed to `[<T>, "null"]`.
113113
Type `<T>` ending with `[]` should be transformed to `{"type": "array", "items": <T>}`
114114

115-
## Extensions and Metadata
115+
## Extensions and metadata
116116

117117
Input metadata (for example, a lab sample identifier) may be represented within
118118
a tool or workflow using input parameters which are explicitly propagated to
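
Review note: the type shorthand rules quoted in this hunk's context (`<T>?` and `<T>[]`) can be illustrated with a small sketch. The function name `expand_type` is hypothetical, and handling a combined suffix such as `int[]?` by recursion is an assumption of this sketch rather than something stated in the text above.

```python
from typing import Any


def expand_type(t: Any) -> Any:
    """Expand the `?` and `[]` type shorthands into their long forms."""
    if not isinstance(t, str):
        return t  # already in expanded (list or dict) form
    if t.endswith("?"):
        # `<T>?` becomes [<T>, "null"]
        return [expand_type(t[:-1]), "null"]
    if t.endswith("[]"):
        # `<T>[]` becomes {"type": "array", "items": <T>}
        return {"type": "array", "items": expand_type(t[:-2])}
    return t
```

With this sketch, the `type: long?` field seen in Process.yml expands to `["long", "null"]`.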
@@ -138,13 +138,13 @@ associated datatype or schema. During execution, values are assigned to
138138
parameters to make the input object or output object used for concrete
139139
process invocation.
140140

141-
A **command line tool** is a process characterized by the execution of a
141+
A **CommandLineTool** is a process characterized by the execution of a
142142
standalone, non-interactive program which is invoked on some input,
143143
produces output, and then terminates.
144144

145145
A **workflow** is a process characterized by multiple subprocess steps,
146-
where step outputs are connected to the inputs of other downstream steps to
147-
form a directed graph, and independent steps may run concurrently.
146+
where step outputs are connected to the inputs of downstream steps to
147+
form a directed acylic graph, and independent steps may run concurrently.
148148

149149
A **runtime environment** is the actual hardware and software environment when
150150
executing a command line tool. It includes, but is not limited to, the
@@ -168,14 +168,14 @@ not covered by this specification. Some areas that are currently out of
168168
scope for CWL specification but may be handled by a specific workflow
169169
platform include:
170170

171-
* Data security and permissions.
171+
* Data security and permissions
172172
* Scheduling tool invocations on remote cluster or cloud compute nodes.
173173
* Using virtual machines or operating system containers to manage the runtime
174174
(except as described in [DockerRequirement](CommandLineTool.html#DockerRequirement)).
175175
* Using remote or distributed file systems to manage input and output files.
176176
* Transforming file paths.
177-
* Determining if a process has previously been executed, skipping it and
178-
reusing previous results.
177+
* Determining if a process has previously been executed, and if so skipping it
178+
and reusing previous results.
179179
* Pausing, resuming or checkpointing processes or workflows.
180180

181181
Conforming CWL processes must not assume anything about the runtime
@@ -190,7 +190,7 @@ command line line tools) is as follows.
190190
1. Load, process and validate a CWL document, yielding a process object.
191191
2. Load input object.
192192
3. Validate the input object against the `inputs` schema for the process.
193-
4. Validate that process requirements are met.
193+
4. Validate process requirements are met.
194194
5. Perform any further setup required by the specific process type.
195195
6. Execute the process.
196196
7. Capture results of process execution into the output object.
@@ -288,25 +288,25 @@ characters around a parameter reference, the effective value of the field
288288
becomes the value of the referenced parameter, preserving the return type.
289289

290290
If the value of a field has non-whitespace leading or trailing characters
291-
around an parameter reference, it is subject to string interpolation. The
292-
effective value of the field is a string containing the leading characters;
293-
followed by the string value of the parameter reference; followed by the
291+
around a parameter reference, it is subject to string interpolation. The
292+
effective value of the field is a string containing the leading characters,
293+
followed by the string value of the parameter reference, followed by the
294294
trailing characters. The string value of the parameter reference is its
295295
textual JSON representation with the following rules:
296296

297297
* Leading and trailing quotes are stripped from strings
298298
* Objects entries are sorted by key
299299

300-
Multiple parameter references may appear in a single field. This case is
300+
Multiple parameter references may appear in a single field. This case
301301
must be treated as a string interpolation. After interpolating the first
302302
parameter reference, interpolation must be recursively applied to the
303303
trailing characters to yield the final string value.
304304

305305
## Expressions
306306

307307
An expression is a fragment of [Javascript/ECMAScript
308-
5.1](http://www.ecma-international.org/ecma-262/5.1/) code which is
309-
evaluated by the workflow platform to affect the inputs, outputs, or
308+
5.1](http://www.ecma-international.org/ecma-262/5.1/) code evaluated by the
309+
workflow platform to affect the inputs, outputs, or
310310
behavior of a process. In the generic execution sequence, expressions may
311311
be evaluated during step 5 (process setup), step 6 (execute process),
312312
and/or step 7 (capture output). Expressions are distinct from regular
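
Review note: the parameter reference rules in the hunk above (a lone reference preserves the value's type, while surrounding characters trigger string interpolation applied recursively to the trailing text) might look like this in a simplified sketch. It assumes references of the form `$(a.b.c)` with plain dotted names only. The regular expression, the helper names, and the `context` dictionary are all hypothetical.

```python
import json
import re
from typing import Any, Dict

# Simplified assumption: a reference is `$(` + dotted identifier path + `)`.
PARAM_REF = re.compile(r"\$\(([A-Za-z_][\w.]*)\)")


def resolve(path: str, context: Dict[str, Any]) -> Any:
    """Look up a dotted path such as `inputs.name` in the context."""
    value: Any = context
    for key in path.split("."):
        value = value[key]
    return value


def to_text(value: Any) -> str:
    """Textual JSON form: strings lose their quotes, object keys are sorted."""
    if isinstance(value, str):
        return value
    return json.dumps(value, sort_keys=True)


def interpolate(text: str, context: Dict[str, Any]) -> str:
    """Interpolate the first reference, then recurse on the trailing characters."""
    match = PARAM_REF.search(text)
    if match is None:
        return text
    value = resolve(match.group(1), context)
    return text[: match.start()] + to_text(value) + interpolate(text[match.end():], context)


def evaluate_field(field: str, context: Dict[str, Any]) -> Any:
    """A lone reference keeps its type; anything else becomes a string."""
    match = PARAM_REF.fullmatch(field.strip())
    if match:
        return resolve(match.group(1), context)
    return interpolate(field, context)
```

For example, with `n` bound to the integer 3, the field `$(inputs.n)` yields the integer itself, while `-n$(inputs.n)` yields the string `-n3`.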
@@ -370,7 +370,7 @@ platform's CWL implementation.
370370

371371
A CWL input object document may similarly begin with `#!/usr/bin/env
372372
cwl-runner` and be marked as executable. In this case, the input object
373-
must include the field `cwl:tool` supplying a IRI to the default CWL
373+
must include the field `cwl:tool` supplying an IRI to the default CWL
374374
document that should be executed using the fields of the input object as
375375
input parameters.
376376

v1.0/contrib.md

Lines changed: 5 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,15 +1,19 @@
11
Authors:
22

33
* Peter Amstutz <[email protected]>, Arvados Project, Curoverse
4+
* Michael R. Crusoe <[email protected]>, Common Workflow Language
5+
project
46
* Nebojša Tijanić <[email protected]>, Seven Bridges Genomics
57

68
Contributers:
79

810
* Brad Chapman <[email protected]>, Harvard Chan School of Public Health
911
* John Chilton <[email protected]>, Galaxy Project, Pennsylvania State University
10-
* Michael R. Crusoe <crusoe@ucdavis.edu>, University of California, Davis
12+
* Michael Heuer <heuermh@berkeley.edu,>,UC Berkeley AMPLab
1113
* Andrey Kartashov <[email protected]>, Cincinnati Children's Hospital
1214
* Dan Leehr <[email protected]>, Duke University
1315
* Hervé Ménager <[email protected]>, Institut Pasteur
16+
* Maya Nedeljkovich <[email protected]>, Seven Bridges Genomics
17+
* Matt Scales <[email protected]>, Institute of Cancer Research, London
1418
* Stian Soiland-Reyes [[email protected]](mailto:[email protected]), University of Manchester
1519
* Luka Stojanovic <[email protected]>, Seven Bridges Genomics

v1.0/intro.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,4 @@
1-
# Status of This Document
1+
# Status of this document
22

33
This document is the product of the [Common Workflow Language working
44
group](https://groups.google.com/forum/#!forum/common-workflow-language). The

v1.0/invocation.md

Lines changed: 4 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -50,9 +50,9 @@ elements. Strings are sorted lexicographically based on UTF-8 encoding.
5050

5151
All files listed in the input object must be made available in the runtime
5252
environment. The implementation may use a shared or distributed file
53-
system or transfer files via explicit download. Implementations may choose
54-
not to provide access to files not explicitly specified in the input object
55-
or process requirements.
53+
system or transfer files via explicit download to the host. Implementations
54+
may choose not to provide access to files not explicitly specified in the input
55+
object or process requirements.
5656

5757
Output files produced by tool execution must be written to the **designated
5858
output directory**. The initial current working directory when executing
@@ -66,8 +66,7 @@ the workflow platform immediately after the tool terminates.
6666
For compatibility, files may be written to the **system temporary directory**
6767
which must be located at `/tmp`. Because the system temporary directory may be
6868
shared with other processes on the system, files placed in the system temporary
69-
directory are not guaranteed to be deleted automatically. Correct tools must
70-
clean up temporary files written to the system temporary directory. A tool
69+
directory are not guaranteed to be deleted automatically. A tool
7170
must not use the system temporary directory as a backchannel communication with
7271
other tools. It is valid for the system temporary directory to be the same as
7372
the designated temporary directory.
@@ -79,7 +78,6 @@ specified or at user option.
7978

8079
* `HOME` must be set to the designated output directory.
8180
* `TMPDIR` must be set to the designated temporary directory.
82-
when the tool invocation and output collection is complete.
8381
* `PATH` may be inherited from the parent process, except when run in a
8482
container that provides its own `PATH`.
8583
* Variables defined by [EnvVarRequirement](#EnvVarRequirement)
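
Review note: the environment variable rules listed in this hunk could be assembled along these lines. This is a sketch under stated assumptions: the function and parameter names are hypothetical, and applying the `EnvVarRequirement` values last (so they can override the others) is a choice made here, not something the text specifies.

```python
import os
from typing import Dict, Mapping, Optional


def build_tool_environment(
    outdir: str,
    tmpdir: str,
    env_var_requirement: Optional[Mapping[str, str]] = None,
    parent_env: Optional[Mapping[str, str]] = None,
    in_container: bool = False,
) -> Dict[str, str]:
    """Build the environment for a tool invocation."""
    if parent_env is None:
        parent_env = os.environ
    env = {
        "HOME": outdir,    # designated output directory
        "TMPDIR": tmpdir,  # designated temporary directory
    }
    # PATH may be inherited, except when a container provides its own PATH.
    if not in_container and "PATH" in parent_env:
        env["PATH"] = parent_env["PATH"]
    # Assumption: variables from EnvVarRequirement are applied last.
    env.update(env_var_requirement or {})
    return env
```

The resulting dictionary could be passed as the `env` argument of `subprocess.run` when launching the tool.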
