Character Escaping and Placeholders
Defusing special characters in CSVfox commands
Why escape some Characters?
Hide special characters from expression resolving
A handful of characters have special meanings in CSVfox commands or in command-line and shell parameters.
If you need to use one of these special characters as a literal in text or in an expression, it must be "escaped". Escaping hides the character's special meaning from the application while the expression is resolved, so the character is not misinterpreted.
After resolving, the escaping is removed and only the intended character remains.
Escaping of Characters
A character is escaped by placing a backslash (\) directly in front of it.
Any of the special characters in the Interpreted Characters table below can be "hidden" from erroneous resolving in this way.
Example:
This expression:
But this expression:
For a literal backslash "\", the backslash itself must be duplicated:
Tables of Affected Characters
There are various reasons why a character cannot be typed directly into the command.
Interpreted Characters
These characters are frequently part of the CSVfox command syntax. Escaping them is done to allow using them in literal text without interfering with the command itself. Technically, they will be "hidden" from the command interpreter and "revealed" again later.
Sequence | Character | Why? |
---|---|---|
\[ | [ | Start of column or variable identifier |
\] | ] | End of column or variable identifier |
\( | ( | Start of a numeric expression |
\) | ) | End of a numeric expression |
\{ | { | Start of a text expression |
\} | } | End of a text expression |
\< | < | Operator, see also \L |
\> | > | Operator, see also \G |
\\ | \ | Backslash, used for "escaping" |
\/ | / | Operator, command or arguments list delimiter |
\= | = | Assignment operator, command/argument separator |
\* | * | Operator |
\# | # | Operator |
\@ | @ | Operator, or argument expressions separator |
\, | , | Part of a column list enumeration, or of a comma-separated list |
\- | - | Part of a column list (from - to) |
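When CSVfox commands are generated by a script, the escaping rule above can be applied mechanically. The following is a minimal Python sketch, not part of CSVfox: the helper name `escape_literal` is hypothetical, and it conservatively escapes every character listed in the table, even in contexts where escaping may not be strictly required.

```python
# Characters taken from the "Interpreted Characters" table above.
INTERPRETED = set("[](){}<>\\/=*#@,-")


def escape_literal(text: str) -> str:
    """Return text with a backslash placed before every interpreted character."""
    return "".join("\\" + ch if ch in INTERPRETED else ch for ch in text)


print(escape_literal("a[1]=b"))    # a\[1\]\=b
print(escape_literal("C:\\data"))  # C:\\data  (the single backslash is doubled)
```

Characters that are not in the table, such as letters, digits, and the colon, pass through unchanged.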
Control Character Sequences
The following sequences can be used in any literal text, on the command line, in conditions, and in expressions. They will be converted to the respective control characters when resolving.
These sequences will NOT be converted
- in file name parameters of the commands +in, -out, %merge, %log, and %job, because the sequence might be a regular part of the file path.
- in the Regular Expression pattern of the ±regex command and in the right operand of the Regex rule "@" of Conditions.
Sequence | Character | Name |
---|---|---|
\a | 0x07 | Bell Alert |
\b | 0x08 | Backspace |
\t | 0x09 | Tab |
\n | 0x0A | New Line |
\v | 0x0B | Vertical Tab |
\f | 0x0C | Form Feed |
\r | 0x0D | Carriage Return |
\e | 0x1B | Escape (27) |
\s | 0x20 | Space Character* |
\P | | | "Pipe" Character* |
\L | < | Redirect input* |
\G | > | Redirect output* |
* These characters are not control characters in the strict sense, but the sequences can be used on the command line to insert a space, a pipe, or a redirect symbol that would otherwise split and invalidate the argument.
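The mapping in the table above can be pictured with a short Python sketch. It only illustrates the table and is not CSVfox's actual implementation. The names `decode_sequences` and `shell_safe` are hypothetical. The sketch also shows why file name parameters are excluded from the conversion: a Windows-style path can contain `\t` or `\n` by accident.

```python
import re

# Sequence letter -> resulting character, as listed in the table above.
SEQUENCES = {
    "a": "\x07", "b": "\x08", "t": "\x09", "n": "\x0A",
    "v": "\x0B", "f": "\x0C", "r": "\x0D", "e": "\x1B",
    "s": " ", "P": "|", "L": "<", "G": ">",
}


def decode_sequences(text: str) -> str:
    """Replace the listed backslash sequences; leave all other pairs untouched."""
    def repl(match: re.Match) -> str:
        return SEQUENCES.get(match.group(1), match.group(0))
    # Consuming backslash pairs keeps an escaped backslash (\\) followed by "n" intact.
    return re.sub(r"\\(.)", repl, text, flags=re.DOTALL)


# Reverse direction for the four characters marked with * in the table.
SHELL_UNSAFE = {" ": "\\s", "|": "\\P", "<": "\\L", ">": "\\G"}


def shell_safe(text: str) -> str:
    """Rewrite space, pipe and redirect symbols as their sequences."""
    return "".join(SHELL_UNSAFE.get(ch, ch) for ch in text)


print(shell_safe("a | b"))                       # a\s\P\sb
print(decode_sequences(shell_safe("a < b > c")))  # a < b > c

# Why file names are exempt: this path would be damaged by the conversion.
path = r"C:\temp\new.csv"
damaged = decode_sequences(path)
print("\t" in damaged, "\n" in damaged)           # True True
```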
Back References
These character sequences occur solely in the ±regex command, as the last part of the "Original/Pattern/Replacement" argument.
They stand for the matching back references that will be taken from the Original according to the Pattern. For Regular Expressions, see also https://en.wikipedia.org/wiki/Regular_expression
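For readers unfamiliar with back references, the general idea can be tried out in Python's `re` module. This is only an illustration of the concept: the `\1`, `\2`, `\3` notation below is Python's own, and the input string and pattern are made-up sample values. The notation used by ±regex is the one described in this section and may differ.

```python
import re

original = "2024-03-15"                    # sample input (assumption)
pattern = r"(\d{4})-(\d{2})-(\d{2})"       # three capture groups
replacement = r"\3.\2.\1"                  # Python syntax: reuse groups in new order

result = re.sub(pattern, replacement, original)
print(result)  # 15.03.2024
```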
The matched groups will be referenced in the Replacement part by:
while
Extended Character Placeholder
Purpose
In order to allow entering arbitrary literal character values that are not available on the keyboard, a special extended character placeholder is supported.
It looks like this:
where each X is a hexadecimal digit from "0" to "9", from "A" to "F", or from "a" to "f".
Any number of digits from 1 to 6 is allowed, but the maximum allowed hexadecimal number is 10FFFF, i.e.
This placeholder uses the Unicode code point numbering.
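The digit rules above (1 to 6 hexadecimal digits, value no greater than 10FFFF) can be checked with a small Python sketch. It deliberately works on the bare hexadecimal digits only and does not reproduce the placeholder syntax itself. The function name `code_point_to_char` is hypothetical and not part of CSVfox.

```python
import re

MAX_CODE_POINT = 0x10FFFF  # highest Unicode code point


def code_point_to_char(hex_digits: str) -> str:
    """Validate 1 to 6 hex digits and return the corresponding character."""
    if not re.fullmatch(r"[0-9A-Fa-f]{1,6}", hex_digits):
        raise ValueError(f"expected 1 to 6 hexadecimal digits, got {hex_digits!r}")
    value = int(hex_digits, 16)
    if value > MAX_CODE_POINT:
        raise ValueError(f"{hex_digits} exceeds 10FFFF")
    return chr(value)


print(code_point_to_char("3A3"))   # Σ
print(code_point_to_char("3a3"))   # Σ  (upper and lower case digits are equivalent)
```

A value such as `110000` has a valid digit count but is rejected because it lies above 10FFFF.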
How to use it
Which characters can be used ultimately depends on the output file encoding. For example, if this encoding is ASCII, there is no point in inserting any characters beyond the ASCII range.
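Whether a character survives a given output encoding can be tested up front. The sketch below uses Python's built-in codecs purely as an illustration. The helper name `fits_encoding` is hypothetical, and the encoding names are Python's, which may not match the names CSVfox expects.

```python
def fits_encoding(ch: str, encoding: str) -> bool:
    """Return True if ch can be represented in the given encoding."""
    try:
        ch.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


print(fits_encoding("Σ", "ascii"))    # False
print(fits_encoding("Σ", "latin-1"))  # False
print(fits_encoding("Σ", "utf-8"))    # True
```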
Examples:
The value for a whitespace can be written as
The Greek "Sigma" Σ, also known as the summation operator, can be inserted using
The Chinese character for "house" is 屋, which can also be written as
These placeholders are replaced before any other resolving takes place, so they can be used as part of column names as well as in arbitrary text.
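The Unicode code points of the three example characters can be looked up with a few lines of Python. This is only a lookup aid: the `U+` notation printed here is the standard Unicode convention, not the CSVfox placeholder syntax.

```python
# Print the hexadecimal code point of each example character.
examples = [" ", "Σ", "屋"]
code_points = [f"U+{ord(ch):04X}" for ch in examples]
print(code_points)  # ['U+0020', 'U+03A3', 'U+5C4B']
```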