Background
5 offices, 1 domain, and full-mesh VPN connectivity. 90 users total. A DC in each site.
Each office has a directory which will be replicated to the DR site. All 5 offices have read/write access to this directory.
Windows Server 2008 R2 everywhere, with a domain-based namespace.
The DR location has users in its AD site.
I'm setting up DFSR for offsite data replication (DR site, which is another office). No budget to get software that addresses distributed file locking.
Primary replica - main site
Read/write
DR replica
Read-only via share permissions. Too many annoyances with DFSR's own read-only mode.
DFSN referral disabled to prevent people in the same site from accessing this replica
Challenge
The primary site goes offline (e.g. a prolonged internet outage), and we want to make the data in the DR site available to the other 4 offices AND take a backup prior to failing back. Users in the offline site can continue to work with their local file server. Here is what I'm thinking of doing.
Fail over:
- In the active AD site, enable the referral to the DR data and disable the referral to the primary copy. Replicate AD and get users to clear their PKT cache
- Change share permissions so everyone can read/write to the DR data
- On all reachable DCs, block the AD replication ports to the DC that's currently isolated
- Set a custom DFSR port and use the firewall to block replication between the DR site and the isolated primary
Fail back (off hours):
- Take backup of both DR & primary site replica data
- Fix DFSN referrals and DR share permissions, then replicate AD and have clients update their cache
- Check for files modified since outage in primary and DR (script)
- Unblock AD replication to isolated DC and DFSR traffic so everything goes back to normal.
- Analyze logs, check conflict/deleted, etc. Address conflicts as necessary.
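
For the "files modified since outage" check in the fail back steps, something along these lines might work as a starting point. It is only a rough sketch in Python, not a tested production script: the share paths and the outage time in the usage section are made-up placeholders, and it relies purely on last-modified timestamps, so it will not notice deletions or files copied in with an older timestamp. It skips the DfsrPrivate folder and reports three lists: files changed only on the primary, only on the DR replica, and on both sides (the ones most likely to end up as conflicts once replication resumes).

```python
import os
from datetime import datetime


def modified_since(root, cutoff):
    """Return {relative_path: mtime} for files under root changed at or after cutoff."""
    cutoff_ts = cutoff.timestamp()
    changed = {}
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune DFSR's private folder (staging, conflict data) from the walk.
        dirnames[:] = [d for d in dirnames if d.lower() != "dfsrprivate"]
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                mtime = os.path.getmtime(full)
            except OSError:
                continue  # file vanished or is not accessible; skip it
            if mtime >= cutoff_ts:
                # normcase makes the comparison case-insensitive on Windows.
                rel = os.path.normcase(os.path.relpath(full, root))
                changed[rel] = mtime
    return changed


def compare_replicas(primary_root, dr_root, cutoff):
    """Return (only_primary, only_dr, both) lists of relative paths."""
    primary = modified_since(primary_root, cutoff)
    dr = modified_since(dr_root, cutoff)
    only_primary = sorted(set(primary) - set(dr))
    only_dr = sorted(set(dr) - set(primary))
    both = sorted(set(primary) & set(dr))
    return only_primary, only_dr, both


if __name__ == "__main__":
    # Hypothetical values: replace with the real replica paths and outage start.
    outage_start = datetime(2012, 1, 1, 8, 0)
    only_p, only_d, both = compare_replicas(
        r"\\PRIMARY-FS\Data", r"\\DR-FS\Data", outage_start
    )
    for label, paths in (("Primary only", only_p), ("DR only", only_d),
                         ("Changed on BOTH sides", both)):
        print(f"{label}: {len(paths)}")
        for p in paths:
            print("  " + p)
```

Running it before unblocking DFSR traffic would give a list of the "both sides" files to review or copy aside by hand, which is the part I would not want to leave to automatic conflict handling.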
I have done a lot of testing with 50+ different use cases and noticed that files do not always get moved to the ConflictAndDeleted, PreExisting, etc. folders. This is the reason for the manual fail over and fail back. Due to the nature of our business I cannot take this chance. I realize there is a lot of manual work, but I don't fully trust that DFSR is where it needs to be for us.
Does anyone see any issues with this or have any other ideas? I don't anticipate being offline for more than a day and our AD changes are very minimal. I still perform traditional backups to tape, and also use VSS. Please keep in mind
we are a small business.