TODO Syntax 1.0 Specification

Riedquat Working Draft: 2011-07-16

This version:
↗http://www.riedquat.de/TR/trunk/TODO_Syntax
Latest version:
↗http://www.riedquat.de/TR/TODO_Syntax
Editors:
Christian Hujer, <✉cher@riedquat.de>
Authors:
Christian Hujer, <✉cher@riedquat.de>
Ralf Holly, TODO

Copyright © 2009 - 2011 Christian Hujer and Ralf Holly, All Rights Reserved.

Abstract

This specification describes a standardized format for TODO comments in source code or other textual information. TODO comments are a widespread means of documenting open issues and potential for improvement in source code and other technical documents. The goal of this specification is to improve the communication, filtering and tracing of such issues.

Status of this Document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current Riedquat publications and the latest revision of this technical report can be found in the ↗Riedquat technical reports index at http://www.riedquat.de/TR/.

This document is a working draft. It is published to the hacker, internet and software development communities as a request for comments. Please send your comments to <✉cher@riedquat.de>.

1 Introduction

This document specifies a syntax for TODO comments. TODO comments are a widespread means of documenting open issues and potential for improvement in source code and other technical documents. The goal of this specification is to improve the communication, filtering and tracing of such issues. The idea for this document was born from reading [TODOORNOT], a blog entry by Ralf Holly.

Please note that this document does not make any statement about whether or not TODO comments should be used in general. This is a discussion far beyond the scope of this document. However, it proposes a standardized format for TODO comments in case such TODO comments are used.

1.1 Rationale

Issues in source code have multiple dimensions, all of which need to be addressed somehow:

Either an issue exists or it does not exist. But we can agree, especially when talking about source code, that not being aware of an issue does not necessarily mean that there is no issue. That means we can describe a simplified physical issue lifecycle with the following states:

1.2 Relationship with Issue Tracking

This specification does not suggest that placing comments into source code is always the most appropriate tool for issue tracking. Such comments have limitations. For example, a FIXME comment cannot be placed in source code for an issue whose location in the source code is not yet known.

1.3 Relationship with Code Maturity

This section is aimed especially at customers, evaluators and reviewers of source code.

There is no inevitable relationship between the presence or number of TODO comments and any kind of quality or code maturity. The presence of TODO comments can mean that the source code involves technical debt. However, the absence of TODO comments does not mean that the source code is free of technical debt. Removing TODO comments automatically is simple, but it only removes the knowledge of where technical debt may reside; it does not pay back the technical debt itself. Therefore, it is not recommended to make the absence of TODO comments a quality or acceptance requirement.

Of course, code without issues is better than code with issues. For code with issues, known issues are better than unknown issues: To address an issue, it must be known first. For known issues, it is better to know the location of the issue in source code than not to know it.

TODO comments can be of quite different nature, which the following example illustrates.

// FIXME:2009-05-04:Christian Hujer:Possible buffer overflow if length is too small.
// FIXME:2009-05-04:Christian Hujer:Fails for more than 2^16 users.

The first comment indicates a potential security problem.

The second comment indicates a potential future issue. The programmer wished to implement the code in context to handle more than 2^16 users. For some reason he decided not to do so for now, probably because it was not a requirement. The presence of that comment allows any programmer to easily spot potential problems when handling more than 2^16 users.

2 Syntax

2.1 General form of a TODO comment

[1] TodoComment ::= Keyword ':' IsoDate ':' Author ':' Text

TODO:2011-07-15:cher:3:Discuss in working group if and how whitespace should be allowed after ':'.

The following example illustrates what such a TODO comment could look like independent of a programming language:

TODO:2009-04-27:Christian Hujer:Add an example.

The following example illustrates what such a TODO comment could look like in C, C++, C#, ECMAScript, Java or similar languages.

// TODO:2009-04-27:Christian Hujer:Add an example.
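Because neither IsoDate nor Author may contain a ':', production [1] can be read as "split at the first three colons". The following Python sketch illustrates that reading. The names KEYWORDS and split_todo_comment are hypothetical and not part of this specification, and the sketch assumes that the comment marker of the host language (such as "// ") has already been stripped. It does not validate the date.

```python
# Keywords from production [2]; matching is exact and case-sensitive.
KEYWORDS = ("TODO", "FIXME", "XXX", "Review")


def split_todo_comment(comment):
    """Split a TodoComment into its four fields, or return None.

    Splitting at most three times keeps any further ':' inside Text,
    which production [5] allows.
    """
    parts = comment.split(":", 3)
    if len(parts) != 4 or parts[0] not in KEYWORDS:
        return None
    keyword, date, author, text = parts
    return {"keyword": keyword, "date": date, "author": author, "text": text}
```

Applied to "TODO:2011-07-15:cher:3:Discuss in working group.", this sketch treats "cher" as the Author and "3:Discuss in working group." as the Text, since only the first three colons act as separators.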

2.2 Keywords for indicating TODO comment categories

[2] Keyword ::= ('TODO' | 'FIXME' | 'XXX' | 'Review')

The keyword for indicating a TODO comment is the most important part. The keyword 'TODO' is the best-known keyword for TODO comments. [SUNJAVACODE] as well as [JARGON] describe two other keywords besides 'TODO': 'FIXME' and 'XXX'. This document proposes another keyword, 'Review', which identifies issues found in reviews that do not fit into one of the other categories.

List of keywords

TODO
The keyword TODO indicates something that needs work.
FIXME
The keyword FIXME indicates something that is bogus and broken.
XXX
The keyword XXX indicates something that is bogus but works.
Review
The keyword Review documents the comments of a code review.
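One use of the keyword is filtering by category. The following Python sketch counts comments per keyword. The names KEYWORDS and count_by_keyword are hypothetical, the descriptions are paraphrased from the list above, and the input is assumed to be a list of TodoComment strings without host-language comment markers.

```python
from collections import Counter

# Keyword meanings, paraphrased from the list of keywords above.
KEYWORDS = {
    "TODO": "something that needs work",
    "FIXME": "something that is bogus and broken",
    "XXX": "something that is bogus but works",
    "Review": "a comment from a code review",
}


def count_by_keyword(comments):
    """Count comments per keyword; other strings are ignored."""
    counts = Counter()
    for comment in comments:
        keyword, separator, _rest = comment.partition(":")
        # Exact, case-sensitive comparison, as in production [2].
        if separator and keyword in KEYWORDS:
            counts[keyword] += 1
    return counts
```

Note that the comparison is case-sensitive: 'Review' is written in mixed case in production [2], and a lowercase 'todo' is not counted.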

2.3 Date

[3] IsoDate ::= Digit Digit Digit Digit Digit* '-' Digit Digit '-' Digit Digit

The date represents the timestamp at which the TodoComment was inserted into its enclosing document.

Rationale

The format of the date was chosen to be compatible with [XML_Schema_Date], which itself is designed to be compatible with [ISO_8601]. This means that the production IsoDate is explicitly designed to be year 10k safe and allows more than 4 digits for the year, just like XML Schema.
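The following Python sketch shows one way to read an IsoDate with a year of four or more digits. The name parse_iso_date is hypothetical. Two assumptions apply: Python's \d matches every Unicode decimal digit, which is a broader set than the Digit production, and the sketch checks only the shape of the date, not whether month and day form a valid calendar date. The result is a plain tuple because Python's datetime.date does not support years above 9999.

```python
import re

# Shape of production [3]: at least four year digits, two for month, two for day.
# Assumption: \d (any Unicode decimal digit) stands in for the Digit production.
ISO_DATE = re.compile(r"(\d{4,})-(\d{2})-(\d{2})")


def parse_iso_date(text):
    """Return (year, month, day) as integers, or None if the shape is wrong."""
    match = ISO_DATE.fullmatch(text)
    if match is None:
        return None
    # int() also accepts non-ASCII decimal digits such as Arabic-Indic digits.
    year, month, day = (int(group) for group in match.groups())
    return year, month, day
```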

2.4 Author

[4] Author ::= Char* - (Char* ':' Char*)

2.5 Text

[5] Text ::= Char*

TODO:2011-07-15:cher:3:The spec should describe how to wrap text, or explicitly instead of implicitly omit that topic.

2.6 Supplementary Productions

[6] Digit ::= [#x0030-#x0039] | [#x0660-#x0669] | [#x06F0-#x06F9] | [#x0966-#x096F] | [#x09E6-#x09EF] | [#x0A66-#x0A6F] | [#x0AE6-#x0AEF] | [#x0B66-#x0B6F] | [#x0BE7-#x0BEF] | [#x0C66-#x0C6F] | [#x0CE6-#x0CEF] | [#x0D66-#x0D6F] | [#x0E50-#x0E59] | [#x0ED0-#x0ED9] | [#x0F20-#x0F29]
[7] Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
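The productions Digit and Char are plain code point ranges, so they can be checked without regular expressions. The following Python sketch transcribes the ranges from productions [6] and [7] and builds a check for the Author production [4] on top of them. The names DIGIT_RANGES, is_digit, is_char and is_author are hypothetical, and each of the first two functions expects a single character.

```python
# Code point ranges transcribed from production [6].
DIGIT_RANGES = [
    (0x0030, 0x0039), (0x0660, 0x0669), (0x06F0, 0x06F9), (0x0966, 0x096F),
    (0x09E6, 0x09EF), (0x0A66, 0x0A6F), (0x0AE6, 0x0AEF), (0x0B66, 0x0B6F),
    (0x0BE7, 0x0BEF), (0x0C66, 0x0C6F), (0x0CE6, 0x0CEF), (0x0D66, 0x0D6F),
    (0x0E50, 0x0E59), (0x0ED0, 0x0ED9), (0x0F20, 0x0F29),
]


def is_digit(character):
    """True if the single character matches production [6] Digit."""
    code_point = ord(character)
    return any(low <= code_point <= high for low, high in DIGIT_RANGES)


def is_char(character):
    """True if the single character matches production [7] Char."""
    code_point = ord(character)
    return (
        code_point in (0x9, 0xA, 0xD)
        or 0x20 <= code_point <= 0xD7FF
        or 0xE000 <= code_point <= 0xFFFD
        or 0x10000 <= code_point <= 0x10FFFF
    )


def is_author(text):
    """True if text matches production [4]: Chars only, and no ':'."""
    return ":" not in text and all(is_char(character) for character in text)
```

Read literally, production [4] also accepts the empty string, so is_author("") is true in this sketch.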

3 Integration in Languages

4 Parsing

The syntax can be easily parsed with regular expressions in languages like ECMAScript, Java, Perl or XML technologies (e.g. XML Schema / XPath for XSLT). This chapter describes regular expressions for parsing the described TODO syntax.
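As an illustration of how such a regular expression is typically applied, the following Python sketch scans a text line by line and collects every TODO comment together with its line number. Python is not one of the languages covered by this chapter. The names TODO_RE and find_todo_comments are hypothetical, and the pattern is the same one given for Perl and Java below, under the assumption that Python's re module interprets it in the same way.

```python
import re

# Groups: keyword, date, author, text.
TODO_RE = re.compile(r"(TODO|FIXME|XXX|Review):(\d{4,}-\d{2}-\d{2}):([^:]+):(.*)")


def find_todo_comments(source):
    """Return a list of (line_number, keyword, date, author, text) tuples."""
    found = []
    for line_number, line in enumerate(source.splitlines(), start=1):
        # search() skips any leading comment marker such as "// ".
        match = TODO_RE.search(line)
        if match:
            found.append((line_number, *match.groups()))
    return found
```

Lines that contain a keyword but do not follow the syntax, for example "// TODO add an example", are not reported by this sketch.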

4.1 AWK

The following regular expression can be used to parse TODO comments in awk (gawk).

TODO:2011-07-16:cher:3:Add gawk regex.

4.2 ECMAScript

The following regular expression can be used to parse TODO comments in ECMAScript.

(TODO|FIXME|XXX|Review):(\d{4,}-\d{2}-\d{2}):([^:]+):(.*)

4.3 grep

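The following regular expression can be used to parse TODO comments in grep with basic regular expressions. Note that the \| and \+ operators are GNU extensions to basic regular expressions.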

\(TODO\|FIXME\|XXX\|Review\):\([0-9]\{4,\}-[0-9]\{2\}-[0-9]\{2\}\):\([^:]\+\):\(.*\)

The following regular expression can be used to parse TODO comments in grep -E.

(TODO|FIXME|XXX|Review):([0-9]{4,}-[0-9]{2}-[0-9]{2}):([^:]+):(.*)

The following regular expression can be used to parse TODO comments in grep -P.

(TODO|FIXME|XXX|Review):(\d{4,}-\d{2}-\d{2}):([^:]+):(.*)

4.4 Java

The following regular expression can be used to parse TODO comments in Java using java.util.regex.

(TODO|FIXME|XXX|Review):(\d{4,}-\d{2}-\d{2}):([^:]+):(.*)

The following String is the regular expression for Java, escaped as Java String.

"(TODO|FIXME|XXX|Review):(\\d{4,}-\\d{2}-\\d{2}):([^:]+):(.*)"

Please note that when this regular expression is used as a String in Java, the '\' characters need additional escaping. The following example shows how to use this regular expression in a String.

String re = "(TODO|FIXME|XXX|Review):(\\d{4,}-\\d{2}-\\d{2}):([^:]+):(.*)";

4.5 Perl

The following regular expression can be used to parse TODO comments in Perl.

(TODO|FIXME|XXX|Review):(\d{4,}-\d{2}-\d{2}):([^:]+):(.*)

4.6 sed

The following regular expression can be used to parse TODO comments in sed with extended regular expressions enabled (sed -E, or sed -r in older versions of GNU sed).

(TODO|FIXME|XXX|Review):([0-9]{4,}-[0-9]{2}-[0-9]{2}):([^:]+):(.*)

4.7 vim

The following regular expression can be used to parse TODO comments in vim.

\(TODO\|FIXME\|XXX\|Review\):\(\d\{4,}-\d\{2}-\d\{2}\):\([^:]\+\):\(.*\)

4.8 XML

The following regular expression can be used to parse TODO comments in XML Schema, XPath and XML Query. It is also suitable for constructing data types for XML Schema.

(TODO|FIXME|XXX|Review):(\p{Nd}{4,}-\p{Nd}{2}-\p{Nd}{2}):([^:]+):(.*)

5 Open Issues

This chapter lists open issues of this working draft.

6 Syntax summary

The syntax is given in the notation described in [XML_Notation].

[1] TodoComment ::= Keyword ':' IsoDate ':' Author ':' Text
[2] Keyword ::= ('TODO' | 'FIXME' | 'XXX' | 'Review')
[3] IsoDate ::= Digit Digit Digit Digit Digit* '-' Digit Digit '-' Digit Digit
[4] Author ::= Char* - (Char* ':' Char*)
[5] Text ::= Char*
[6] Digit ::= [#x0030-#x0039] | [#x0660-#x0669] | [#x06F0-#x06F9] | [#x0966-#x096F] | [#x09E6-#x09EF] | [#x0A66-#x0A6F] | [#x0AE6-#x0AEF] | [#x0B66-#x0B6F] | [#x0BE7-#x0BEF] | [#x0C66-#x0C6F] | [#x0CE6-#x0CEF] | [#x0D66-#x0D6F] | [#x0E50-#x0E59] | [#x0ED0-#x0ED9] | [#x0F20-#x0F29]
[7] Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]

7 References

7.1 Normative References

XML_Notation
↗Extensible Markup Language (XML) 1.0 (Fifth Edition), 6 Notation. Tim Bray, Jean Paoli, C. M. Sperberg-McQueen, Eve Maler, François Yergeau. (See http://www.w3.org/TR/2008/REC-xml-20081126/#sec-notation)
XML_Schema_Date
↗XML Schema Part 2: Datatypes, 3.2.9 date. Paul V. Biron, Ashok Malhotra. (See http://www.w3.org/TR/xmlschema-2/#date)
↗XML Schema Part 2: Datatypes, D.3.3 More Than 9999 Years. Paul V. Biron, Ashok Malhotra. (See http://www.w3.org/TR/xmlschema-2/#morethan9999years)

7.2 Other References

JARGON
↗The Jargon File. Eric S. Raymond et al, 2003. (See http://catb.org/~esr/jargon/)
SUNJAVACODE
Sun Microsystems. ↗Code Conventions for the Java Programming Language. Peter King et al, 1997. (See http://java.sun.com/docs/codeconv/)
TODOORNOT
Approxion, ↗TODO or not TODO. Ralf Holly, 2009. (See http://www.approxion.com/?p=39)
ISO_8601
ISO (International Organization for Standardization). Representations of dates and times, 1988-06-15.