ZFS VHDX Growing on Hyper-V 2019

I fiddle about with a charming little PFSense firewall at home, which I recently managed to mangle, so I decided to rebuild it on my new Windows Server 2019 Hyper-V virtualization host. Excitement!

Having thought that at least some of my problems were to do with file corruption, I did some very, very basic research, and decided this new-fangled ZFS thing sounds like the way to go!

It was great until it wasn’t! Here’s a snapshot of my drive size over time:

02/02/2019 08:27 PM 5,674,893,312 PFSense2019ZFS.vhdx
03/02/2019 02:21 PM 7,923,040,256 PFSense2019ZFS.vhdx
03/02/2019 02:30 PM 7,956,594,688 PFSense2019ZFS.vhdx
03/02/2019 02:51 PM 8,090,812,416 PFSense2019ZFS.vhdx
03/02/2019 02:52 PM 8,090,812,416 PFSense2019ZFS.vhdx
03/02/2019 02:52 PM 8,090,812,416 PFSense2019ZFS.vhdx
04/02/2019 03:19 PM 11,244,929,024 PFSense2019ZFS.vhdx
04/02/2019 03:29 PM 11,244,929,024 PFSense2019ZFS.vhdx
04/02/2019 03:59 PM 11,312,037,888 PFSense2019ZFS.vhdx
18/02/2019 03:14 PM 57,248,055,296 PFSense2019ZFS.vhdx
19/02/2019 05:40 PM 60,536,389,632 PFSense2019ZFS.vhdx

So over about 2 weeks, I’d used ~60GB of space(!) Unfortunately the drive size available on this host is only 120GB, so I’m already unable to export a copy locally. Grumble.
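
For the curious, here is the arithmetic behind that as a rough Python sketch. It uses only the first and last rows of the listing above and assumes the dates are day/month/year; the names are mine, purely for illustration.

```python
from datetime import datetime

# First and last rows of the directory listing (dates read as day/month/year).
FIRST = ("02/02/2019 08:27 PM", 5_674_893_312)
LAST = ("19/02/2019 05:40 PM", 60_536_389_632)


def growth_per_day(first, last):
    """Return (bytes grown, elapsed days, average bytes per day) between two samples."""
    fmt = "%d/%m/%Y %I:%M %p"
    start = datetime.strptime(first[0], fmt)
    end = datetime.strptime(last[0], fmt)
    grown = last[1] - first[1]
    days = (end - start).total_seconds() / 86_400
    return grown, days, grown / days


if __name__ == "__main__":
    grown, days, rate = growth_per_day(FIRST, LAST)
    print(f"{grown / 2**30:.1f} GiB over {days:.1f} days = {rate / 2**30:.2f} GiB/day")
```

By my arithmetic that averages out at roughly 3 GiB of VHDX growth per day.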

So I’m unsure what the moral of the story is. Sorry about that, I was hoping for a stronger conclusion too!

Does ZFS not work nicely with VHDX expanding disks? Does PFSense 2.4.4 chew disk space that it doesn’t report within the OS itself? Can I squeeze another question in before returning to my drive-switch-a-roo export and rebuild?

Surface Go: Fonts look a little heavy? Don’t forget ClearType tuning!

The title kinda says it all.

If you, like me, are the owner of a brand-new Surface Go, but also own another Surface Pro, you might find that the font weight is a little heavy by default when switching between side-by-side devices.

To fix that, just hit Start, type ClearType (OK, “Clear” is often enough) and run the ClearType text tuner. Then squint and/or answer honestly 🙂

It’s not something everyone needs to do – my Mum doesn’t have competing devices on a day-to-day basis – but it’s worthwhile to do the “ClearType eye test” for whatever eye-distance at which you actually use your new Surface buddy!

(This post brought to you by someone that noticed BGR on my RGB display while reading this!) Hint: Ctrl+Win+= zooms in Magnifier, and Ctrl+Win+- zooms out again… No extra software needed…

How SLAM retrieves the computer’s Local Admin password

Simple: SLAM doesn’t retrieve the computer’s Local Admin password – LAPS does!

SLAM is a Premier Operations Program (POP) offering for Securing Lateral Account Movement. It workshops credential theft mitigation (CTM) and counters lateral traversal with logon restrictions and firewall rules (among other protections)… but one key feature is deployment of LAPS, the Local Administrator Password Solution.

So SLAM includes LAPS, and searching for how SLAM does something with passwords might not yield a result. (Hopefully “Until now…”). LAPS is quite well-documented, though, so answers are likely available.

POP-SLAM has been recently complemented by OA-SLAM (OA = Onboarding Accelerator), which is a more “let’s do it all in production”-style Microsoft Services offering.

How To (quickly) Tell If You’re 5 Years Out Of Date On Security Updates

There’s a fun indicator you can use to quickly evaluate whether you’ve been missing security updates for the last five years (ish) on older Operating Systems (i.e. Win2008-2008 R2), and it’s the build number. Not infallible, but then not often wrong.

Helpful Table Of Problem Versions

If you’d rather skip my rambling – and let’s face it, you should – here’s the list of build number indicators which might mean you have an update problem.

  • 10.0.14393.0 – Windows Server 2016 ships with a broken servicing stack which can’t talk to WSUS. (15 months out of date)
  • 6.3.9600.16xxx – (18xxx is current) means Windows Server 2012 R2 or Windows 8.1 without 2919355 applied (~3 years missing updates)
  • 6.1.7600 – Windows Server 2008 R2 or Windows 7 without Service Pack 1 (5 years missing updates)
  • 6.0.6001 – Windows Server 2008 Service Pack 1, i.e. without Service Pack 2 (5 years missing updates)

By comparison, good build numbers (as of Sep 2017) are:

  • 10.0.14393.1670 – Win10 or Win 2016 with Sep 2017 CU (anything later than .187 is probably OK)
  • 6.3.9600.18xxx – Win2012 R2 post-2919355
  • 6.1.7601 – Win2008 R2 SP1 / Win7 SP1
  • 6.0.6002 – Win2008 SP2

You can use the WSUS console (yes, even if you’re using SCCM, though you probably have cooler methods available) to quickly evaluate build numbers across your fleet.
In a pinch, you can use AD Users and Computers if you’re just evaluating the third number in the sequence (i.e. doesn’t work for 2919355 or for Windows 2016 boxes).
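
If you have a list of version strings exported from somewhere, the indicators above can be turned into a quick check. What follows is a minimal Python sketch under stated assumptions: the function name is made up, it looks only at the build and revision fields, and the revision cut-offs (below 17000 for 9600, up to .187 for 14393) are my reading of the notes in this post rather than anything official.

```python
def check_build(version):
    """Classify a Windows version string such as '6.3.9600.16384'.

    Only the build (third field) and, where present, the revision
    (fourth field) are examined.
    """
    parts = [int(p) for p in version.split(".")]
    build = parts[2]
    revision = parts[3] if len(parts) > 3 else None

    if build in (7600, 6001):
        return "suspect: needs a newer Service Pack"
    if build == 9600 and revision is not None and revision < 17000:
        # Assumption: 16xxx means pre-2919355, 17xxx and later means post.
        return "suspect: 2919355 probably missing"
    if build == 14393 and revision is not None and revision <= 187:
        # Assumption: the post only says 'later than .187 is probably OK'.
        return "suspect: early Windows Server 2016 servicing stack"
    return "no obvious problem"


if __name__ == "__main__":
    for v in ("10.0.14393.0", "6.3.9600.16384", "6.1.7601"):
        print(v, "->", check_build(v))
```

Like the build-number trick itself, this is a soft indicator: a clean result only means nothing obvious is wrong.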

Back In The Day, Build Numbers Were Even More Useful

Very helpfully, the Windows Vista era introduced incremental build numbers for Operating System versions when Service Packs were applied. Windows Vista – which you’ll recall came out about a year ahead of the server equivalent, Windows Server 2008 – shipped with the build number 6000.

Windows Server 2008 shipped with “Windows Vista” Service Pack 1 inbuilt, as it were, and so Vista SP1 and Windows Server 2008 SP1 (i.e. RTM) have the same build number, 6001.

Service Pack 2 followed, again incrementing the build number for both to 6002.

For the Windows 7 era, things were a bit more straightforward. Windows 7 and Windows Server 2008 R2 shipped at about the same time, as build 7600.

When Service Pack 1 was released for both, the build number incremented to 7601.

Quite a few of our Premier Security Assessments pull OS information using WMI from targets, and I sort by the self-reported build number to quickly identify groups of hosts which might not have a Service Pack. It’s very, very infrequently wrong. You could equally do the same by whether “Service Pack X” appears in the CSDVersion, but the build number is a nice, straightforward way of identifying this if you’re collecting it widely.

(AD Computer objects track what appears to be the same information, so querying AD might be a viable option if you’re reasonably certain that the computer objects there are still “live”).
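
The sort-and-group step described above can be sketched in a few lines of Python. This is an illustration only: the hostnames are invented, and it assumes you already have (hostname, version) pairs from whatever collection method you use.

```python
from collections import defaultdict


def group_by_build(inventory):
    """Group (hostname, version) pairs by build number, lowest build first."""
    groups = defaultdict(list)
    for host, version in inventory:
        build = int(version.split(".")[2])  # third field is the build number
        groups[build].append(host)
    return dict(sorted(groups.items()))


if __name__ == "__main__":
    # Hypothetical sample data.
    sample = [
        ("srv-a", "6.1.7601"),
        ("srv-b", "6.1.7600"),
        ("srv-c", "6.0.6001"),
        ("srv-d", "6.1.7600"),
    ]
    for build, hosts in group_by_build(sample).items():
        print(build, hosts)
```

Sorting by build puts the 6001 and 7600 groups at the top, which are the ones worth a closer look.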

What can you do with this information?

Well, you can say for sure that anything which self-reports as being build 7600 – i.e. not 7601 – probably hasn’t had any Windows security updates since about 2013.

The Support Lifecycle site notes that without SP1, Windows Server 2008 R2 (7600) exited support in April 2013. That’s the point after which security updates stop applying, because they require SP1 (7601), which isn’t installed.

Likewise, if you’ve a Windows Server 2008 (6001) box, it hit End Of Support at the same time (and Service Pack 2 (6002) is required for any updates beyond that point).

If you haven’t got the relevant Service Pack approved in WSUS (or SCCM), the computers won’t even see updates beyond this point as being applicable. So it might seem like you’ve a bunch of completely updated and compliant servers (though on closer inspection you’d find lots of updates aren’t applicable to them), but if they haven’t taken the Service Pack, they’re only as updated as they self-report. And they know the newer updates aren’t for them.

In this case, “newer” means “pretty much everything since mid 2013”

What should you do?

So here’s what to do: Pull a report of the OS versions reported by servers within your environment. Clients too, if you think it’s possible some don’t have Win7 SP1.

You could do something like:

  • Start, Run, WinVer on a suspect PC (if it doesn’t say Service Pack X, problem)
  • PS:   Get-ADComputer -Filter "(OperatingSystemVersion -like '*7600*') -or (OperatingSystemVersion -like '*6001*')" -Properties OperatingSystemVersion,OperatingSystemServicePack | Export-Csv NoServicePack.csv           #  (a blank NoServicePack.csv = good)
  • Or    wmic /node:servername os get version     – if WMI (RPC) is enabled to the target (in which case, extra bonus security points lost unless you’re using a PAW or management host – you should be firewalling!)
  • Or use WSUS: Turn on the Version column in the All Computers view in the WSUS console, then Group By (or just Sort by) Version and look at the build numbers reported. (Don’t forget to filter by Any)

If there are 7600s or 6001s found, check a few out, and just confirm that they’re not relevant-Service-Pack-less. (Best-case outcome: they’re being misreported.) If they really are missing the Service Pack, try to work out and address the root cause – e.g. the Service Pack update wasn’t approved, or the WSUS catalog doesn’t include the update, or the PC isn’t in the right SCCM update group, or… whatever it is.

As a note, if you’re in that bucket, you’re likely to have many updates to apply, which will likely take some time and disk space to chew through. (If it’s simpler to redeploy an OS with a current build than update an older one, consider that).

And

And if you’ve found some unpatched boxes as a result of reading this, a) phew, lucky we found them now, and b) really think about that root cause. Mistakes in any human-driven process are predictable: does your process allow for mistakes and have any built-in correction for them? Update management isn’t always easy, but many update policies are geared towards fragility and failure, due to excessive process being required for an update to make it to the target box. A process failure without a corrective phase might result in updates being missed for years.

In some cases, what we hear is that some set of updates are initially rejected (or “deferred”) due to issues or concerns, which is fair enough – but then the decision doesn’t get revisited for months or years afterwards – sometimes never, until the update state is compared with Windows Update. If you don’t look back and check your assumptions – really test what updates are deployed and what you’re still vulnerable to – then things can rapidly and near-invisibly deteriorate, until suddenly, one day you’re looking back at 5 years of unpatched systems.

Core question: If the participants in your existing update process/policy had “just” been pointed directly at Windows Update and set to update weekly, how many Critical and Important updates might have been applied in the interim? Would the outcomes have been better?

And And: an afterthought for 2012 R2

I haven’t got into 2919355 yet, but it’s the 2012 R2 (and Windows 8.1) equivalent of a Service Pack, and as of late 2014, it became the mandatory update on which all other 2012 R2 (and 8.1) updates depended.

If you haven’t installed it, as with the older OSes above, updates would have stopped in – let’s say 2015. So you may be a couple of years behind by now.

I don’t know if it’s as simple as a build check for that one (it might be visible through the detailed build reported by the WSUS console – I don’t have one to check right now), but it’s the other key update we find missing when evaluating update state using MBSA manually.

From a quick bit of KB spelunking, I figure there might be a way to tell from the WSUS reported client version (but it’d always be a “soft” confirmation) – check out the difference between the file information in the pre- and post-2919355 articles for the same update (while still in the grace period)

Pre (i.e. the version for computers without 2919355)

For all supported x64-based versions of Windows 8.1 and Windows Server 2012 R2

Post (i.e. the version information for computers with 2919355 installed already)

For all supported x64-based versions of Windows 8.1 and Windows Server 2012 R2

So I’ll hazard an ultra-hazardous guess, which is that if you have computers self-reporting in WSUS as being 6.3.9600.16xxx, they might have stalled at pre-2919355, so need 2919355 (or a descendant or prerequisite) approved, and then I assume the build number will be 17xxx or higher. MBSA can help you identify what Windows Update would think was missing, so you can search WSUS for approval states by KB ID.

Conditional Formatting Text in Excel from PowerShell

Hopefully a helpful note, as this had me confused for a while…

I wanted to add text-based conditional formatting to an Excel sheet I was creating from PowerShell – so I could colour one of the columns automatically depending on the values.

I used the technique any self-respecting dabbler would: I recorded a macro in Excel VBA and then tried to convert it over.

But no matter what I tried, I couldn’t get the !&$^$^@ FormatConditions.Add function to work.

After debugging the idiot-level mistakes (null variables; debugging without parameters 😐 ) out of the script, I was left with:

Type mismatch. (Exception from HRESULT: 0x80020005 (DISP_E_TYPEMISMATCH))
At C:\Users\tristank\Desktop\excelfmt.ps1:201 char:1
+ $newthing=$ActionColumn.FormatConditions.Add($xlTextString, $cond, $x …
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : OperationStopped: (:) [], COMException
    + FullyQualifiedErrorId : System.Runtime.InteropServices.COMException

Soo… not enough arguments, then?

The VBA reference which comes up first in Bing (and Google for that matter) wasn’t helpful – 4 arguments and I’d tried every combination of arguments I could conceive. The count seemed fine. But no FormatConditions object…

But debugging from the command line (thanks ISE), I tried punching in $ActionColumn.FormatConditions.Add( , and was stunned to see 8 arguments in the tooltip.

I added $null to all of them (natch) – and it worked! (Or at least I had a rule with the right conditions and formatting, but the condition was in the wrong spot).

So, armed with this new knowledge, I found the Excel Interop Object Reference for FormatConditions.Add … and there ya go. 8 arguments.

Object Add( XlFormatConditionType Type, Object Operator, Object Formula1, Object Formula2, Object String, Object TextOperator, Object DateOperator, Object ScopeType )

As I’d already been experimenting by the time I found the documentation, it seems like several bits might be interchangeable. I don’t know what I’m doing; I can’t warrant that this will work under any conditions except my own.

But to save someone else the time, here’s mine:

# Column to colour, plus the enum values the rule needs
$ActionColumn = $worksheet.Range("G:G")
$xlTextString = [Microsoft.Office.Interop.Excel.XlFormatConditionType]::xlTextString
$xlContains = [Microsoft.Office.Interop.Excel.XlContainsOperator]::xlContains
$cond = "ACTION"   # text the cell must contain
$newthing=$ActionColumn.FormatConditions.Add($xlTextString, "", $xlContains , $cond, $cond, 0, 0) # hacky dodgy hacky hack / works

# The new rule is the last one in the collection: bump its priority, then style it
$fcs=$ActionColumn.FormatConditions.Count
$ActionColumn.FormatConditions.item($fcs).SetFirstPriority()
$ActionColumn.FormatConditions[$fcs].Font.ThemeColor = [Microsoft.Office.Interop.Excel.XlThemeColor]::xlThemeColorDark1
$ActionColumn.FormatConditions[$fcs].Font.TintAndShade = 0
$ActionColumn.FormatConditions[$fcs].Interior.Color = 255   # red fill
$ActionColumn.FormatConditions[$fcs].Interior.TintAndShade = 0
$ActionColumn.FormatConditions[$fcs].StopIfTrue = $false

And yes, I was naughty and used two indexing operators (during my “long script debugging” phase, when I was troubleshooting an uninitialized object and didn’t know it).
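
To see which value lands on which parameter in that Add call, it can help to pair them up by position. Here is a small illustrative sketch in Python (not PowerShell, and it doesn't touch Excel at all): the parameter names are taken from the interop signature quoted earlier, and the argument list uses plain labels in place of the real enum values.

```python
# Parameter names as listed in the interop signature for FormatConditions.Add.
PARAMS = ["Type", "Operator", "Formula1", "Formula2",
          "String", "TextOperator", "DateOperator", "ScopeType"]

# The positional values from the call above, written as plain labels.
ARGS = ["xlTextString", "", "xlContains", "ACTION", "ACTION", 0, 0]


def bind(params, args):
    """Pair each positional argument with the parameter it lands on."""
    bound = dict(zip(params, args))
    unbound = params[len(args):]
    return bound, unbound


if __name__ == "__main__":
    bound, unbound = bind(PARAMS, ARGS)
    for name, value in bound.items():
        print(f"{name:13} = {value!r}")
    print("not supplied:", unbound)
```

Laid out like this, the text to match arrives in the String slot while the xlContains label sits in Formula1, which may be part of why several bits seemed interchangeable while experimenting.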

Sigh. And hope that helps!

Note: Surface Pro Volume-Up Reset

I once experienced a problem with my Surface Pro 3 where the keyboard (and screen rotation) stopped working while I was working in another State.

I can never find the instructions I used to fix it, so I thought I’d post them here.

Note: For Surface Pro devices only.

Hold Power and Volume Up for 15 seconds. The screen may do things (turn off, turn on, weird clown images appearing) during this time, but just keep pressing and holding.

After 15 seconds, release the buttons, and then press the Power button to boot the Surface again.

And check to see how things are…

The RDP Ghost is real.

Recently, when connecting to a reasonably-sleepy Windows system, I’ve seen the RDP Ghost.

An 8 bit rendition of a scary, scary ghost. It scared me.

If you’ve seen it, you’re not alone. It’s a thing.

I hope – one day – to capture a screenshot of this apparition.

The Rules appear to be that you can never be ready for a screenshot when RDP Ghost appears.

Update 22/04/17: I WAS READY! Kinda. And caught enough that I can understand the cause and the effect.

The good news is, you’re probably not hacked by a supernatural force. Probably. It’s just Windows trying to draw the sign-in guy before it’s ready!

Krebs’ Immutable Truths of Data Breaches

A rationale for more stringent risk assessment. Or indeed any risk assessment for internet connected assets, regardless of size or perceived value to others.

Krebs’s Immutable Truths About Data Breaches

“There are some fairly simple, immutable truths that each of us should keep in mind, truths that apply equally to political parties, organizations and corporations alike:

-If you connect it to the Internet, someone will try to hack it.

-If what you put on the Internet has value, someone will invest time and effort to steal it.

-Even if what is stolen does not have immediate value to the thief, he can easily find buyers for it.

-The price he secures for it will almost certainly be a tiny slice of its true worth to the victim.

-Organizations and individuals unwilling to spend a small fraction of what those assets are worth to secure them against cybercrooks can expect to eventually be relieved of said assets.”

Website Security Suggestion: Get rid of cruft! (script included)

Right: One of my pet hates is cruft on a production website.

Cruft is stuff – files – which has accumulated because nobody’s paying attention. Cruft includes sampleware. Developer experiments. Readmes. Sample configs. Backups of files which never get cleaned up. Just general accumulated stuff. It’s website navel lint. Hypertext hairballs.

Cruft. Has. No. Place. On. A. Production. Website!

Worst-case, it might actually expose security-sensitive information. (That’s the worst type of cruft!).

Want to find cruft? Well, easiest way to start is:

D:\WebContent> dir /s *.txt

That’s a good start. For every Readme.txt, add 10 points. For every web.config.txt, add 1000 points (why? That’s a potentially huge problem – .config is blocked by Request Filtering by default (with certain exceptions), but .config.txt: no problem! Download away.)

If you score more than 10 points, you need to rethink your strategy.
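
The points game above can be automated. This is a minimal Python sketch, assuming a local content folder you can read; the function name is invented, and treating anything ending in .config.txt as a 1000-pointer is my interpretation of the rule.

```python
import os


def cruft_score(root):
    """Walk root and score it: 10 per readme.txt, 1000 per *.config.txt."""
    score = 0
    hits = []
    for folder, _dirs, files in os.walk(root):
        for name in files:
            lower = name.lower()
            if lower.endswith(".config.txt"):
                points = 1000
            elif lower == "readme.txt":
                points = 10
            else:
                continue
            score += points
            hits.append((os.path.join(folder, name), points))
    return score, hits


if __name__ == "__main__":
    import sys
    total, found = cruft_score(sys.argv[1] if len(sys.argv) > 1 else ".")
    for path, points in found:
        print(f"{points:5}  {path}")
    print("Total:", total)
```

It is no substitute for actually looking at what is in the folder, but it gives you a number to wave at people.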

  • There is no reason for files like readme.txt to exist within your production website
    • Okay, there’s one reason and that’s when you’re providing one you know about, and have vetted, for download.
      • I mean, obviously if the site is there to provide readme.txts for apps people are downloading, great! But if it’s the readme for some developer library which has been included wholesale, bad pussycat.
  • There is no reason for files like web.config.bak to exist within your production website.
    • Luckily, .bak files aren’t servable with the default StaticFileHandler behaviour. But that doesn’t mean an app (or * scriptmap…) can’t be convinced to hand you one…
  • If you have web.config.bak.txt files, you’re asking for trouble.
    • Change your operational process. Don’t risk leaking usernames and passwords this way.

The Core Rationale

Web developers and site designers should be able to explain the presence of every single file on your website.

I don’t care if it’s IIS or Apache or nginx or SuperCoolNewTechnologyX… the developers should be responsible for every single file deployed to production.

And before the admins (Hi!) get smug and self-satisfied (you still can, you just need to check you’re not doing the next thing…), just check that when you deploy new versions of Site X, you’re not backing up the last version of Site X to a servable content area within the new version of Site X.

For example, your content is in F:\Websites\CoolNewSite\ with the website pointed to that location…

  • It’s safe to back up to F:\Backups\CoolNewSite\2016-11-13 because it’s outside the servable website
  • It’s not cool to back up to F:\Websites\CoolNewSite\2016-11-13 because that’s part of the website.
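
That inside-or-outside test is easy to script into a deployment check. A minimal Python sketch follows, assuming the site serves everything under a single root folder (it knows nothing about virtual directories or applications mapped elsewhere); the function name is made up.

```python
from pathlib import PureWindowsPath


def is_servable(backup_dir, web_root):
    """Return True if backup_dir is the web root itself or sits anywhere beneath it."""
    backup = PureWindowsPath(backup_dir)
    root = PureWindowsPath(web_root)
    return backup == root or root in backup.parents


if __name__ == "__main__":
    root = r"F:\Websites\CoolNewSite"
    for candidate in (r"F:\Backups\CoolNewSite\2016-11-13",
                      r"F:\Websites\CoolNewSite\2016-11-13"):
        verdict = "NOT cool" if is_servable(candidate, root) else "safe"
        print(candidate, "->", verdict)
```

PureWindowsPath compares paths case-insensitively and works on any platform, so the check can run on a build server rather than the web server itself.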

How Do I Know If I’m Crufty?

As I do, I started typing this rant a while ago, and then thought: You know what? I should script that!

I had a bunch of DIR commands I was using, and sure, could’ve just made a CMD, but who does that these days? (Says my friend. (Singular))

Then {stuff}… but it finally bubbled to the top of my to-do list… So I wrote a first draft Get-CruftyWebFiles script.

I’ve lots of enhancement ideas from here, but wanted to get something which basically worked. I think this basically works!

Sure, there’s potential duplication if sites and apps overlap (i.e. the same file might be listed repeatedly) (which is fine; I figure you weed that out in post production), and if your site is self-referential it might get caught in a loop (hit Ctrl+C if you think/know that’s you, and *stop doing that*)

So, feel free if you want to see how crufty your IIS 7.5+ (assumed? Tested on 8.5) sites are:

The Script: https://github.com/TristankMS/IIS-Junk

Usage (roughly):

Copy to target web server. Then from an Admin PS prompt:

  • .\Get-CruftyWebFiles.ps1   # scans all web content folders linked from Sites, and outputs to .\crufty.csv
  • .\Get-CruftyWebFiles.ps1 -WebSiteName "Default Web Site"     # limits to just the one website.
  • .\Get-CruftyWebFiles.ps1 -DomainName "YOURDOMAIN"    # checks for that text string used in txt / xml files as well

Pull the CSV into Excel, Format as Table, and get sorting and filtering. Severity works on a lower-is-more-critical basis. Look at anything with a zero first.

Todo: Cruft Scoring (severity’s already in there), more detections/words, general fit and finish. Also considering building a cruft module for a security scanner, or just for the script, to check what’s findable on a website given some knowledge of the structure.

* oh! No I’m not

Sunsetting TMG 2010 with some (free!) Best Practices

Long and boring post ahead. So: KITTENS! There. Fluffy now.

As one of the Premier Field Engineers performing ISA Server Health Checks and then Threat Management Gateway (TMG) configuration reviews (by default, from my long association with Proxy 2.0 and then ISA), I was reviewing a document I put together for a customer just before shredding it, and thought:

You know what? Everyone should do these things! These recommendations are common enough that I seem to make them every time I see a TMG box… so why not generalize and recommend them here? Put them out into the wild. Get them shouted down. Give them their time in the sun.

So on the off-chance you’re a survivor of the TMG Survival Guide and you’re looking for some last-minute as-seen-in-the-real-world TMG corrective advice – and by “last minute”, I mean:

  • You know the base product is in Extended Support until 2020, then it’s going away. (sniff!)
  • You understand that Malware Scanning and Network Inspection System are already frozen at their last update level.
  • You know URL Categorization (Filtering) got turned off already so any rules using it might fail-open (or fail-closed)…

And in terms of pre-migration work

  • You’ve also been through your rule set, and tested that everything’s Least Privilege-compliant,
    • i.e. No broad “everyone can access anything/TMG/anywhere with any protocol” rules or anything like that.
      • No really, if you can connect to TMG via SMB, that’s usually not a good sign… You’re at least using Windows Update for patches, though, right?
  • Maybe you’ve performed an ISAINFO (and/or TMGBPA) export of your rule set so that you can ease the process of recreating them on the next egress device you pick? 🙂

…Because these are all fantastic first steps on the long migration path between proxies. If you haven’t done them, do put them on the list.

So before you shut down TMG that final time, and repurpose the boxes for Quake servers (or whatever you kids use spare boxes for these days)…

What best practices are available to you in the meantime? Glad you asked!

Here’s the short list, the detail follows.

Proactively Protect The Box

  • Install the latest Windows Updates
  • Install the latest TMG Rollup Hotfix (SP2 UR5, potentially + .650 or later)
  • (Install any updates for any other software on the box)

Operating System Protection

  • Firewalling
  • De-Adminning
  • Attack Surface Reduction
  • AV exclusions

TMG Health and Perf

  • Check Tracing isn’t enabled
  • Disable/Relax Flood Prevention

And now the details…

Proactively Protect The Box

“It’s a firewall, it doesn’t need patching!” (just for clarity: that’s not true)

Install the latest Windows Updates

  • If you’re not installing Windows updates, um, I don’t know what to tell you?

You understand that unpatched vulnerabilities win over security settings, permissions and antivirus, right? Any on-box control is potentially circumvent-able by an unpatched (bad) vuln?

And you’re still thinking it’s optional? Well! That’s nice! I hope you’ve a mitigation strategy in place, and an incident response plan for when that one fails.

TMG defends itself pretty heavily against network attack (a: by default; b: to an extent, as it still leverages OS components for certain chunks of functionality), but lots of people end up creating rules which – paraphrased – allow the Internal network to hit any port on the TMG computer. Because reasons!

This is the same pathology which leads people to not patch their CAs, or not to use firewalling between hosts on their internal network – it’s the opposite of a defence in depth approach!

Anyway, back to updates:

  • When I check the update state of a box, I do so by running MBSACLI (the command-line version of MBSA) using the current WindowsUpdate CAB if the box doesn’t have Internet connectivity.

mbsacli /xmlout /nvc /nd /wi /catalog .\wsusscn2.cab /unicode > %computername%-MBSA.xml

    • I actively avoid using the default customer WSUS catalog, because it’s completely possible to be 100% compliant with the WSUS approval policy and have unapproved updates missing from five years ago, which were skipped for a good reason, but then that decision was never revisited.
  • It is uncommon in my experience to find that servers are up to date. For a security appliance at the edge of the network, used as an ingress or egress point by thousands of clients, this is suboptimal.

Windows Server 2008 R2 Service Pack 1 is needed for Security Updates

Keep in mind that some updates require the presence of a Service Pack or other major update.

  • So the first thing I’d check is WinVer.
  • If WinVer says you’re on Windows 2008 R2 version 7600 and doesn’t mention a Service Pack, you need to get to 7601 (Service Pack 1) pronto, and then start applying all the updates which have required SP1 – say, the last 4-5 years’ worth, which includes many Critical updates.
  • Windows 2008 should be at SP2. If it’s not at SP2, same thing applies as above.

This, again, is sadly not uncommon.

If You Found You Had Something Missing: Why Not Just Use Windows Update?

  • If you find they’re missing updates because { ¯\_(ツ)_/¯ }, my standard remediation suggestion is: just point them at public WindowsUpdate and specify your schedule. Let them pop out through a proxy, or go direct if they’re edge devices.

Yep. I’m serious. Better a security-sensitive device which is up to date by automatic patching at 3am on a Thursday than one which is out of date at all times by policy.

See also: Least Privilege Rule Set. If an attacker can’t hit the vulnerable port, you don’t have that problem.

Install the latest TMG Rollup Hotfix

Now, don’t misunderstand me: TMG isn’t the simplest thing in the universe to update (unlike its predecessor ISA Server, which was a positive dream by comparison). But if you’re reading this, you probably work in IT, so that’s not actually an excuse not to do it! 🙂

Yes, it’s a pain going from RTM to SP1 to SP1 + U1 to SP2 to SP2 Rollup 5, but… you should do it. You need to do it. If you’re one rollup behind, you’re actually 12-18 months of updates out of date. With hundreds of builds in between. Many issues have been fixed over the years, including hangs, crashes, and possibly a security update or two, if memory serves.

  • The latest rollup version I’m aware of is TMG Service Pack 2 with Update Rollup 5. If Help/About in the TMG MMC shows you a version earlier than 7.0.9193.644, well – that update was from 2014.
  • There’s one post-rollup hotfix I’ve seen (which is for SNI websites with HTTPS inspection enabled, but it provides a version bump to .650 for many core components too) which gets us to April 2015: https://support.microsoft.com/en-us/kb/3058679 .
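
If you are checking more than a couple of boxes, comparing the Help/About version against those two build numbers is easy to script. A minimal Python sketch, assuming you have already collected the version string by some other means; the function names are mine.

```python
def parse_version(text):
    """Turn '7.0.9193.644' into a tuple of ints so it compares numerically."""
    return tuple(int(p) for p in text.split("."))


SP2_UR5 = parse_version("7.0.9193.644")     # SP2 with Update Rollup 5
HOTFIX_650 = parse_version("7.0.9193.650")  # post-rollup hotfix level


def tmg_patch_state(reported):
    """Classify a reported TMG version against the two levels mentioned above."""
    version = parse_version(reported)
    if version < SP2_UR5:
        return "behind SP2 Update Rollup 5"
    if version < HOTFIX_650:
        return "at SP2 Update Rollup 5"
    return "at or beyond the .650 hotfix"


if __name__ == "__main__":
    for v in ("7.0.9193.601", "7.0.9193.644", "7.0.9193.650"):
        print(v, "->", tmg_patch_state(v))
```

Comparing integer tuples rather than raw strings matters here: as plain text, "7.0.9193.1000" would sort before "7.0.9193.644".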

Operating System Protection

Lifecycle and post-Lifecycle Firewalling

In April 2020, TMG exits Extended Support and is no more.

But by a quirk of the Support Lifecycle, Windows Server 2008 (and R2) actually exits Extended Support in January 2020, so a TMG box running down the clock will potentially be partially unprotected from an OS security updates perspective between January and April. (Unless a Custom Support Agreement is available, but it’s probably more costly than the alternative). So it’s not a terrible assumption that you’ve basically got until Dec 31, 2019 to get everything sorted out.

  • I don’t mind restating the obvious, so I will: You should have migrated away from TMG before the end of 2019. Please!
    • That’s still 3 years from now to plan and execute your migration
    • So if you haven’t already started, please add it to your “To Do: 2017” list now.
  • If you do still have some TMG kicking around at that point, consider hardening the TMG Firewall policies (including the System policies) to limit all nonessential connectivity to the TMG hosts by any other computer.
    • In fact, think about doing that anyway, particularly if you actually had work items pending from the “Install Windows Updates” item above. Because that’s an attack surface exposure compounded with known vulnerabilities. That’s a poor combination for a security device.

If you’re planning to run beyond the end of support, don’t!

But if you do find yourself there: also think about defence in depth approaches. The sort you’d want to take with a Windows 2000 machine on your network if some business unit decided it needed to be added this year: isolate, put external firewalls in front of and behind it, so you seriously limit the ingress and egress paths available to it in case of compromise. Yes, TMG’s a firewall, but trusting {the actions of an on-box firewall which isn’t receiving security updates any more (in 2020)} on {an operating system which also isn’t receiving security updates any more} seems like it’s a bad bet compared to an external security device which is presumably still getting updates. Yah?

De-Adminning

  • Just check the membership of any groups who have Admin permission to the box.
  • Then eliminate any local admins except one (if you don’t fully de-admin boxes), and remove any Domain groups you can.

Then, unless you’re sure (I mean certain, i.e. you’ve checked, not “I assume it’s quite unlikely”) that a) there’s only one local Admin account, and b) the password for that local Admin account is already unique and not known to anyone unauthorized, reset the remaining Admin password to a unique value (unless you’re already a LAPS shop, or use other password management tools… but please, check whether TMG’s part of the LAPS group, don’t just assume it is… that’s how SUS patching doesn’t work too!)

Basic Attack Surface Reduction

Most TMG boxes seem to have management agents for something or another installed on them. Actually, as a related observation, it’s not uncommon for me to find servers with multiple management agents for multiple generations of monitoring systems on them. Often disused ones. These are pure attack surface additions, and often running with privileged access levels. Very often with known vulnerabilities.

In short: Either kill ‘em, or at least make sure they can’t be contacted over the network (using Firewall policy).

  • If you have looked at them in the last 6 months, you can be excused from this item.
  • If not, check to see what the file dates of the EXEs are. If they’re over 3 years old, they’re probably a liability and almost certainly aren’t being updated, and simply represent an increased attack surface, so consider removing them.
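
The file-date check can be scripted if you have a lot of agent folders to wade through. This is a rough Python sketch under a few assumptions: it uses the last-modified timestamp as "the file date", treats three years as 3 × 365 days, and the function name is invented.

```python
import os
import time

THREE_YEARS = 3 * 365 * 24 * 60 * 60  # seconds; an approximation that ignores leap days


def stale_executables(folder, max_age_seconds=THREE_YEARS, now=None):
    """Return .exe paths under folder whose last-modified time is older than the cutoff."""
    now = time.time() if now is None else now
    stale = []
    for current, _dirs, files in os.walk(folder):
        for name in files:
            if not name.lower().endswith(".exe"):
                continue
            path = os.path.join(current, name)
            if now - os.path.getmtime(path) > max_age_seconds:
                stale.append(path)
    return sorted(stale)


if __name__ == "__main__":
    import sys
    for path in stale_executables(sys.argv[1] if len(sys.argv) > 1 else "."):
        print(path)
```

An old file date is only a hint, of course; it tells you where to look first, not what to uninstall.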

Antivirus

Observe the exclusions needed for Antivirus when running on a TMG host. If you don’t exclude the right stuff, it can get a bit jammed up.

TMG Health

Tracing?

This one’s much less common than the above few.

  • Run RESMON for a short while, look at the Disk IO area, and sort by Bytes Total/sec. Note any files which have lots of IO over a 2 minute period.
    • (The idea is to try to minimize IO where possible)
  • If activity to ISALOG.BIN is chewing through a megabyte or more per second, TMG may be tracing something
    • or still tracing something – this has been seen when TMGBPA is used to run a diagnostic trace but for whatever reason it doesn’t terminate cleanly.
    • It might also indicate a diagnostic logging session is in progress (just check the console under Troubleshooting –> Diagnostic logging and hit Disable if it isn’t already disabled).

Note that in my experience, some minimal isalog.bin activity (say under 64K/sec) is normal.

If TMG does turn out to be tracing, run the ISA Data Packager again, open the tracing options, and untick everything.

Flood Mitigation

Going to say something a bit controversial here: You might want to experiment with turning off or massively increasing the defaults for flood prevention, particularly for outbound scenarios.

The defaults for this feature haven’t changed since it was introduced in 2004, but wow, Internet surfing patterns sure have.

So I say:

  • Try a 10X increase in the numbers, particularly for HTTP and TCP connections, and see how you go.
    • If it stops the constant alerting about “infected clients”, and you’ve got burnout from chasing them down only to find it was Bruce in Marketing manually opening eighteen instances of Firefox to their brand new multi-pronged CDN-driven site, it might be a welcome change, and reduce grumbling (nothing like a paused connection to cause a user to get grumpy about “the )(@$& Proxy”)…

And that, believe it or not, covers the most common TMG practices I’d suggest. Minimal TMG, maximal patching and defence in depth.

So there you have it. The most common Stuff I’ve seen over the years with TMG. Now go work out how you’re going to migrate egress to something else… (I assume Azure AD App Proxy will take care of the HTTP stuff, and/or Load Balancer Of The Year for the non-http bits…)