
Win32Regex

Trac Migration edited this page May 18, 2019 · 1 revision

How to Test RegEx on Box Backup 0.10 Windows Client

Example RegEx on Windows

Before we start, here are some examples of RegEx on Windows:

BackupLocations
  {
      CDocSettings
      {
       Path = C:\Documents and Settings
		ExcludeFilesRegex = .+\\pagefile\.sys$
		ExcludeDirsRegex = .+\\pagefile\.sys$
		ExcludeFilesRegex = .+\\boot\.ini$
		ExcludeFilesRegex = .+\\NTDETECT\.COM$
		ExcludeFilesRegex = .+\\NTUSER\.DAT$
		ExcludeFilesRegex = .+\\NTUSER\.DAT\.LOG$
		ExcludeFilesRegex = .+\\NTUSER\.INI$
		ExcludeFilesRegex = .+\\UsrClass\.dat\.LOG$
		ExcludeFilesRegex = .+\\UsrClass\.dat$
		ExcludeFilesRegex = .+\\administrativeInfo\.dbf$
		ExcludeDirsRegex = .+\\System Volume Information$
		ExcludeFilesRegex = .+\\ntldr$
		ExcludeDirsRegex = .+\\temp$
		ExcludeDirsRegex = .+\\Temporary Internet Files$
		ExcludeDirsRegex = .+\\Local Settings\\.*\\Cache$
		ExcludeFilesRegex = .+\\thumbs\.db$
		ExcludeFilesRegex = .+\\~.*
		ExcludeFilesRegex = .+\\Perflib.*
		ExcludeDirsRegex = .+\\Google Desktop Search$
		ExcludeFilesRegex = .*RetroExpress\.exe.*\.inuse.*
		ExcludeDirsRegex = .+\\Application Data$
	 	ExcludeFilesRegex = .+\.avi$
	 	ExcludeFilesRegex = .+\.bk[~!0-9]$
	 	ExcludeFilesRegex = .+\.iso$
	 	ExcludeFilesRegex = .+\.mpe?[2345g]$
	 	ExcludeFilesRegex = .+\.pst$
		AlwaysIncludeFilesRegex = .*backup.*\.pst$
	 	ExcludeFilesRegex = .+\.qbw$
		AlwaysIncludeFilesRegex = .+\.qbb$
	 	ExcludeFilesRegex = .+\.tiff?$
     }
 }

Note that you need to escape directory backslashes and filename dots with a backslash (\\ and \.).
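To see why the dot needs escaping, here is a minimal sketch using grep -E in a Cygwin shell. The sample paths are made up for illustration, and it assumes grep's extended regex syntax behaves like the client's for patterns this simple.

```shell
# Hypothetical sample paths, written with Windows-style backslashes.
paths='C:\Documents and Settings\me\NTUSER.DAT
C:\Documents and Settings\me\NTUSER_DAT
C:\Documents and Settings\me\notes.txt'

# Escaped dot: only the real NTUSER.DAT matches.
strict=$(printf '%s\n' "$paths" | grep -E '.+\\NTUSER\.DAT$')

# Unescaped dot: "." matches any character, so NTUSER_DAT matches as well.
loose=$(printf '%s\n' "$paths" | grep -E '.+\\NTUSER.DAT$')

printf 'strict:\n%s\nloose:\n%s\n' "$strict" "$loose"
```

An unescaped dot still excludes the intended file, but it can also exclude files you did not mean to skip.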

Testing RegEx on Windows

This section describes a way to verify that files meant to be excluded from backup actually are excluded, and to check that the verification itself works.

Windows has few command-line utilities, so I used Cygwin (a little bulky, but it gives me a lot of control in a familiar environment). Searching for "grep for windows" and "cron for windows" turns up some products that might be interesting. All of this has to be done as a user in the Windows Administrators group. I did it on a Windows Server 2003 client backing up to an Ubuntu Dapper store server, both running Box Backup 0.10.

To confirm that ExcludeFilesRegex in bbackupd.conf is working, put one or more bogus files on the client's hard drive with filenames that should be excluded from backup. For example, in a Cygwin window:

echo "this is a Backup Service test file, please do not delete, <your name/email/phone here>" > /cygdrive/d/Data/Public/cookies.txt
echo "this is a Backup Service test file, please do not delete, <your name/email/phone here>" > /cygdrive/d/Data/Public/backup_test.avi

Then wait for the next backup file scan (hourly by default) to check, and hopefully skip, these files.

Open your bbackupd.conf file, copy the regex that you want to test, and paste it into a Notepad window so you can experiment more easily.
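If you would rather pull the patterns out from the Cygwin shell than copy them by hand, something like the following sketch may help. The function name list_exclude_regexes is made up for this example, and the sample file stands in for your real bbackupd.conf, whose location depends on your install.

```shell
# list_exclude_regexes FILE: print the pattern of every ExcludeFilesRegex line.
# (Hypothetical helper, not part of Box Backup.)
list_exclude_regexes() {
    sed -n 's/^[[:space:]]*ExcludeFilesRegex[[:space:]]*=[[:space:]]*//p' "$1"
}

# Demo on a throwaway sample; point it at your real bbackupd.conf instead.
sample=$(mktemp)
cat > "$sample" <<'EOF'
BackupLocations
{
    CDocSettings
    {
        Path = C:\Documents and Settings
        ExcludeFilesRegex = .+\.avi$
        ExcludeDirsRegex = .+\\temp$
        AlwaysIncludeFilesRegex = .+\.qbb$
    }
}
EOF
list_exclude_regexes "$sample"
rm -f "$sample"
```

Only lines that begin with ExcludeFilesRegex are printed, so the ExcludeDirsRegex and AlwaysIncludeFilesRegex lines are left out.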

Here is what I typically use in my bbackupd.conf BackupLocations, a work in progress:

ExcludeFilesRegex = .+\.([aA][vV][iI]|[iI][sS][oO]|[mM][pP][eE]?[3gG]|[tT][mM][pP]|[bB][aAcC][kK]|[dD][bB][kK]|[bB][kK][~!1-9]|[mMtT][bB][kK]|[oO][lL][dD]|[sS][aA][vV]|[sS][wW][pP]|[xX][lL][kK]|[cC][sS][mM]|[dD][sS][kK]|[oO][bB][jJ]|[pP][aA][rR]|[dD][bB][xX]|[dD][lL][lL])$
ExcludeFilesRegex = .*\\thumbs\.db$|.*\\history\.dat$|.*\\cookies\.txt$|.*\\~.*

But you have to tweak them before testing this way, because the test uses Cygwin's regex, not Windows' (or bbackupquery's, which may be a twinkle in some developer's eye).

So, next, again in a cygwin window, I find it convenient to navigate to the Box Backup directory:

cd /cygdrive/c/Program\ Files/Box\ Backup/

Then issue your version of this command (I've abbreviated my lengthy regex here for clarity, but using the entire thing does work):

./bbackupquery "list -odtsr" quit | egrep -e '.+\.([aA][vV][iI]|[bB][aAcC][kK]|qbw).$' -e '.*/cookies\.txt.$|.*/~.* '

That should all be on one line.

Notes:

  • You have to flip the directory delimiters (from \ to / ) and remove the extra \ that was used to escape the directory backslash (but NOT the other escapes, like \. ), because bbackupquery output flips the delimiters (a kind of bug, I guess).
  • For perhaps related reasons, we have to put a . before the line-terminator $ (it only took me about an hour to figure that out, and, btw, another hour to figure out to use egrep and not grep).
  • I salted(?) my test regex (NOT my bbackupd.conf regex) with a regex of a file that I know is in the store on the server (.qbw files for QuickBooks), just to make sure that the test is working.
  • You can combine your BackupLocation regex with multiple egrep -e flags (egrep -e 'regex1' -e 'regex2').
  • You can probably remove the leading .* from the regex in this case (.*/cookies\.txt is probably the same as /cookies\.txt here).
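The first two notes can be sketched as a small sed filter. The name to_egrep is made up for this example, and the filter only handles a $ at the very end of the pattern or directly before a |, which covers the patterns shown on this page but not every possible regex.

```shell
# to_egrep: read one bbackupd.conf-style regex on stdin and print the
# form used above with egrep on bbackupquery output.
# (Hypothetical helper, not part of Box Backup.)
to_egrep() {
    # 1. escaped directory backslash (\\) becomes /
    # 2. a $ at the end of the pattern becomes .$
    # 3. a $ directly before | becomes .$
    sed -e 's#\\\\#/#g' -e 's#\$$#.$#' -e 's/\$|/.$|/g'
}

printf '%s\n' '.*\\thumbs\.db$|.*\\cookies\.txt$|.*\\~.*' | to_egrep
```

Check the converted pattern by eye before trusting it, and keep the single quotes around it when you paste it into the egrep command.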

Here's my actual run of this test, lightly redacted:

user@bbclient /cygdrive/c/Program Files/Box Backup
$ ./bbackupquery "list -odtsr" quit | egrep -e '\.([aA][vV][iI]|[iI][sS][oO]|[mM][pP][eE]?[3gG]|[tT][mM][pP]|[bB][aAcC][kK]|[dD][bB][kK]|[bB][kK][~!1-9]|[mMtT][bB][kK]|[oO][lL][dD]|[sS][aA][vV]|[sS][wW][pP]|[xX][lL][kK]|[cC][sS][mM]|[dD][sS][kK]|[oO][bB][jJ]|[pP][aA][rR]|[dD][bB][xX]|[dD][lL][lL]|qbw).$' -e '/thumbs\.db.$|/history\.dat.$|/cookies\.txt.$|/~.* '
No random device -- additional seeding of random number generator not performed.

00000004 f--o-- 2003-10-15T20:44:06 00022 DDataFinanceAccounts/Archive Copy 10-15-2003 Rl.qbw
0000009e f-X--- 2003-10-14T22:02:57 00687 DDataFinanceAccounts/archive/Backup 101503/T4.qbw
00007166 f--o-- 2006-05-09T19:29:04 00454 DDataFinanceAccounts/Hs.qbw
000071c1 f--o-- 2006-05-10T20:20:42 00236 DDataFinanceAccounts/Hs.qbw
000071ec f--o-- 2006-05-11T18:00:11 00224 DDataFinanceAccounts/Hs.qbw
00007215 f--o-a 2006-05-12T20:47:41 28446 DDataFinanceAccounts/Hs.qbw
00007244 f----- 2003-10-15T15:44:06 00876 DDataFinanceAccounts/Archive Copy 10-15-2003 Rl.qbw
0000b397 f--o-a 2006-05-16T16:06:54 26153 DDataFinanceAccounts/Hs.qbw
0000b879 f--o-a 2006-05-17T15:59:52 00370 DDataFinanceAccounts/Hs.qbw
0000b898 f--o-a 2006-05-18T16:36:35 25896 DDataFinanceAccounts/Hs.qbw
0000b8c3 f----- 2006-05-19T17:06:22 39868 DDataFinanceAccounts/Hs.qbw
00004796 -d---- 1969-12-31T19:00:00 00000 DHome/<user>/My Documents/PVSW/Bin/btcompat.bck

user@bbclient /cygdrive/c/Program Files/Box Backup
$

Note that the .bck "file" looks like it should NOT show up, because .bck is covered by my BackupLocations ExcludeFilesRegex ([bB][aAcC][kK]). But notice that it has the "d" flag, so that "file" is in fact a directory. Since [bB][aAcC][kK] sits in the part of my ExcludeFilesRegex that is intended to trap filename extensions at the end of path/filename.ext names, that directory is NOT excluded by my ExcludeFilesRegex. Interestingly, though, the bbackupquery output for it looks to my test regex like a file that was supposed to be excluded. A trailing slash in the bbackupquery output (like so: DHome/<user>/My Documents/PVSW/Bin/btcompat.bck/ ) would have prevented this false positive. All is good.
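One way to drop such directory entries before the egrep step is to filter on the flags column. This is only a sketch: files_only is a made-up name, and it assumes the flags are always the second whitespace-separated column, as in the run above.

```shell
# files_only: keep only listing lines whose flags column (field 2)
# does not contain "d", i.e. drop directory entries.
# (Hypothetical helper, not part of Box Backup.)
files_only() {
    awk '$2 !~ /d/'
}

# Demo on two lines taken from the run above.
printf '%s\n' \
    '0000b8c3 f----- 2006-05-19T17:06:22 39868 DDataFinanceAccounts/Hs.qbw' \
    '00004796 -d---- 1969-12-31T19:00:00 00000 DHome/<user>/My Documents/PVSW/Bin/btcompat.bck' \
    | files_only
```

In the real test it would sit between the two commands: ./bbackupquery "list -odtsr" quit | files_only | egrep -e '...'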

(Note that there is a bug somewhere that causes that directory's datestamp to appear as 1969-12-31T19:00:00 when it should be 2005-03-17T11:45:15. Reply from Chris: this is not a bug as such. The timestamps on directories are not stored by Box Backup.)
