±regex

Modifies the content of a CSV table field using a regular expression

This allows modifying a field using a regular expression (regex).
See also https://wikipedia.org/wiki/Regular_expression.

The argument is a three-part combination: the source expression, the pattern, and the replacement. The parts are separated by a separator character (usually a "/").
Please note that, to avoid confusion, a literal slash inside any part of the combination must be escaped with a backslash, like so: \/ .
When giving multiple regex expressions, any literal comma inside an expression must also be escaped with a backslash, like so: \, . This is because multiple regex expressions for multiple affected fields are concatenated using commas.
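The effect of the escaping rule can be sketched in Python. This is only an illustration of the idea, not the CSVfox implementation: the function name split_regex_argument is hypothetical, and the sketch assumes that "/" is the separator and that a slash directly preceded by a backslash is never a separator.

```python
import re

def split_regex_argument(argument):
    """Split 'Expression/Pattern/Replacement' on unescaped slashes.

    Hypothetical helper for illustration only; it does not handle
    an escaped backslash directly in front of a separator.
    """
    # Split at every "/" that is NOT directly preceded by a backslash.
    parts = re.split(r"(?<!\\)/", argument)
    # Turn the escaped form "\/" back into a literal slash.
    return [part.replace("\\/", "/") for part in parts]

parts = split_regex_argument(r"[*]/^\s*([0-9][0-9,.]*)([-])\s*$/\2\1")
escaped = split_regex_argument(r"/a\/b/c")
```

In the first call the argument splits into the expression [*], the pattern, and the replacement \2\1. In the second call the expression part is empty and the escaped slash stays inside the pattern.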

The first part (the source expression) can be a literal string, some field content, or a more complex expression. It is resolved before the regex pattern matching takes place. If you just want to modify the current field itself, without external components, you can use [*] as the expression part or simply omit it (so that the argument starts with /).

The second part is an ordinary regex pattern that follows the usual regex conventions.
Therefore, this pattern is not resolved according to the CSVfox expression resolving rules, because it has to be handled solely by the regex engine.

The third part describes the replacement pattern that will finally be inserted as the new field content. The whole match can be back-referenced with $0 or with \0, and any captured sub-groups can be back-referenced as $1 to $9, or as \1 to \9.
Whether \ or $ should be used depends on the command line requirements, and on which of them is easier to enter.
Finally, the generated replacement is itself treated as a resolvable expression and is resolved.
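To illustrate that the $ and \ notations are meant to be interchangeable, here is a small Python sketch. It is not how CSVfox works internally: the helper names to_python_template and substitute are hypothetical, and the sketch simply rewrites both notations into Python's own \g<n> form. The final expression-resolving step is not modelled.

```python
import re

def to_python_template(replacement):
    """Rewrite $0..$9 and \\0..\\9 into Python's \\g<n> notation.

    Hypothetical helper for illustration only.
    """
    return re.sub(r"[$\\]([0-9])", r"\\g<\1>", replacement)

def substitute(pattern, replacement, text):
    # Apply the pattern with the normalized replacement template.
    return re.sub(pattern, to_python_template(replacement), text)

with_dollar = substitute(r"(\w+)@(\w+)", r"$2 at $1", "user@host")
with_backslash = substitute(r"(\w+)@(\w+)", r"\2 at \1", "user@host")
whole_match = substitute(r"(\w+)@(\w+)", r"<$0>", "user@host")
```

Both notations give the same result here ("host at user"), and $0 inserts the whole match ("<user@host>").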

Modifiers
If the regex modifier /i is used, the pattern matching will be case-insensitive.
If the regex modifier /m is used, the pattern matching will use multi-line mode. The ^ and $ characters will then match the start and end of each line, instead of the start and end of the whole text.
If the regex modifier /s is used, the pattern matching will use single-line mode. This changes the . from matching any character other than newline to matching any character including newline.
If the regex modifier /e is used, the pattern matching will return only explicitly named groups, instead of all groups.
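The first three modifiers correspond to common regex flags, so their effect can be tried out in Python. This is a sketch under the assumption that /i, /m and /s behave like Python's re.I, re.M and re.S. The /e modifier has no direct counterpart in Python's re module and is left out.

```python
import re

text = "Alpha\nbeta"

# /i : case-insensitive matching
case_sensitive = re.findall(r"alpha", text)
case_insensitive = re.findall(r"alpha", text, re.I)

# /m : ^ and $ match at the start and end of each line
whole_text_anchors = re.findall(r"^\w+$", text)
per_line_anchors = re.findall(r"^\w+$", text, re.M)

# /s : the dot also matches a newline
dot_without_newline = re.findall(r"Alpha.beta", text)
dot_with_newline = re.findall(r"Alpha.beta", text, re.S)
```

Without a flag, each of the three patterns finds nothing in this two-line text. With the matching flag, each one finds a match.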

Pattern

±regex[Field]=/Pattern/Replacement
±regex[Field]=Expression/Pattern/Replacement
±regex/i[Field]=Expression/Pattern/Replacement

Usage examples

some examples here ...

Correcting trailing minus signs

This moves a trailing minus sign of an amount field to the front, if the field contains only a decimal number, a minus sign, and optionally some surrounding whitespace:

-regex[Price]=[*]/^\s*([0-9][0-9,.]*)([-])\s*$/\2\1

It does the following:

Expression:
The first part [*] refers to the content of the current column (which is here the column "Price").

Pattern:
The regex pattern between the two slashes describes:
- the beginning of the string (^),
- perhaps some whitespace (\s*),
- a group of digits and separators (in parentheses, i.e. group 1),
- a minus sign (in parentheses, i.e. group 2),
- again perhaps some whitespace (\s*),
- and the string end ($).

Replacement:
When the current field content matches this description, it will be replaced
first by group 2 (\2, containing the minus sign)
and then by group 1 (\1, containing the numbers).
So now the field will consist of first the minus sign, and then the numbers.
The whitespace is gone because it was neither captured in a group nor inserted back by the replacement expression.
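The same pattern and replacement can be checked outside of CSVfox, for example in Python. This is only a sketch for testing the regex itself: the function name fix_trailing_minus and the sample values are made up for illustration.

```python
import re

# Same pattern as in the command above.
TRAILING_MINUS = re.compile(r"^\s*([0-9][0-9,.]*)([-])\s*$")

def fix_trailing_minus(value):
    """Move a trailing minus sign to the front; leave other values alone."""
    return TRAILING_MINUS.sub(r"\2\1", value)

moved = fix_trailing_minus(" 12.50- ")
with_thousands = fix_trailing_minus("1,234.00-")
positive = fix_trailing_minus("12.50")
already_leading = fix_trailing_minus("-12.50")
```

Values that do not match the pattern, such as a positive amount or an amount that already has a leading minus sign, stay unchanged.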

Summing up Digits of a Number

This sums up all digits of the year of birth in the field Birth and writes the result into a field SumOfBirthDigits.

+add[SumOfBirthDigits]
+regex[SumOfBirthDigits]=[Birth]/(\d)(\d)(\d)(\d)/$1+$2+$3+$4=($1+$2+$3+$4)

Expression:
The first part [Birth] specifies that the content of the field Birth is used as the source.

Pattern:
The pattern (\d)(\d)(\d)(\d) captures 4 digits in the field [Birth] and puts them into the backreference placeholders $1, $2, $3, and $4.

Replacement:
The content $1+$2+$3+$4=($1+$2+$3+$4) first displays all four backreferences concatenated with a plus sign (+). Then, after the equal sign (=), comes a formula which, when resolved, adds all the values and returns the sum.
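As a rough cross-check, the expected outcome can be reproduced in Python. This sketch assumes that the part before the equal sign is kept as text and only the part in parentheses is evaluated, as described above. The function name digit_sum_text and the sample years are hypothetical.

```python
import re

def digit_sum_text(birth):
    """Build 'd1+d2+d3+d4=sum' from the first four consecutive digits."""
    match = re.search(r"(\d)(\d)(\d)(\d)", birth)
    if match is None:
        # No four-digit run: leave the value as it is.
        return birth
    digits = match.groups()
    shown = "+".join(digits)                      # the displayed part
    total = sum(int(digit) for digit in digits)   # the resolved formula
    # Replace only the matched span, as a regex replacement would.
    return birth[:match.start()] + f"{shown}={total}" + birth[match.end():]

result_1984 = digit_sum_text("1984")
result_2001 = digit_sum_text("2001")
```

For the year 1984 this gives the text 1+9+8+4=22.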

[Example: remove everything before a colon]
[Example: remove a URL]
[Example: reformat a date]