Hi all,
I've tried what I could find online and nothing seems to help. You can read about my setup and what has been tried so far here:
Now, after all of the above was done, I decided to wait a week to let all changes replicate. The backlog went down steadily over a few days; on our graph it looks like an almost flat downward line. Then, at some point when there were about 500K changes left to replicate at each server, the graph line went into free fall and DFSR replicated all of that within 3 hours. Great, I thought, it must have been overloaded somehow, and once it dropped below some threshold it did what it was supposed to. Not quite: one of the folders simply never replicated its changes, and its backlog has kept growing to this day despite everything that has or hasn't been done.
I have also observed a sudden increase in backlog at midnight, almost every day, where the backlog goes from about 135K to 2 million or so, then drops back over a period of some hours to 135K (the folder that won't replicate no matter what). I have no idea why this is happening. There are no changes to any of our DFS folders at or around midnight, and certainly nothing that would cause 2 million changes to show up in the backlog.
Either a restart or midnight will cause the backlog to go up. What is happening internally with DFSR at that time that would cause this? Aren't changes committed once they are replicated, so that a reboot shouldn't cause this? Or is the backlog reading giving me false data?
On the folder that won't replicate: DC01 has 135K outgoing changes, other 3 servers have 135K/3 incoming changes. I've tried running
Wmic /namespace:\\root\microsoftdfs path dfsrreplicatedfolderinfo get replicationgroupname,replicatedfoldername,state
On DC01 that folder shows state 2 (initial replication). DC03 is the primary member for this folder. The other DCs (02, 03 and 04) all show state 4 for the same folder, so they appear to be fine. It's just DC01 that for some reason thinks initial replication hasn't finished yet and is patiently waiting for something. I've tried setting the primary member to DC03 again with
dfsradmin membership set /RgName:group /RfName:folder /MemName:DC03 /IsPrimary:True
Everything executes fine, and I ran all the AD updates to force / replicate / update / poll etc. Hours later, DC03's DFSR event log still doesn't show any event related to initiating or completing initial replication. Querying
dfsradmin membership list /RgName:group /Attr:MemName,RfName,IsPrimary
returns DC03 as the primary member, so that's set as it should be.
I'm at a loss as to what's going on here. Any help or ideas would be appreciated.
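In case it helps anyone reading the State column from the wmic query above, here is a rough Python sketch that decodes the numbers. The state names are my understanding of the DfsrReplicatedFolderInfo codes, so treat them as an assumption. `DFSR_STATES`, `parse_states` and the `SAMPLE` text are made up for illustration (with `group` and `folder` as placeholders, same as in the commands above); this is not real output pasted from my servers.

```python
import re

# State codes of the DfsrReplicatedFolderInfo WMI class, as I understand them
# (assumption; check the class documentation before relying on the names).
DFSR_STATES = {
    0: "Uninitialized",
    1: "Initialized",
    2: "Initial Sync",
    3: "Auto Recovery",
    4: "Normal",
    5: "In Error",
}


def parse_states(wmic_output):
    """Return (folder, group, code, name) tuples from wmic table-style output.

    Assumes three columns separated by two or more spaces, with the
    numeric State in the last column, and a header on the first line.
    """
    rows = []
    lines = [ln.strip() for ln in wmic_output.splitlines() if ln.strip()]
    for line in lines[1:]:  # skip the header line
        cols = re.split(r"\s{2,}", line)
        if len(cols) != 3 or not cols[2].isdigit():
            continue  # ignore anything that does not look like a data row
        code = int(cols[2])
        rows.append((cols[0], cols[1], code, DFSR_STATES.get(code, "Unknown")))
    return rows


# Hypothetical sample shaped like what DC01 reports (state 2 for the folder).
SAMPLE = """\
ReplicatedFolderName  ReplicationGroupName  State
folder                group                 2
"""

for folder, group, code, name in parse_states(SAMPLE):
    flag = "" if code == 4 else "  <-- not Normal"
    print(f"{group}/{folder}: state {code} ({name}){flag}")
```

The idea is just to flag any folder whose state is not 4 on a given member, which is how DC01 stands out here.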
Thanks