This blog entry doesn't discuss any code, because I ended up doing very little work on the software. I was busy running it, and the app tends to take over the UI, making programming difficult. For the most part, it functioned as expected, but a couple of things changed in how I actually used it.
I added a text box to restrict scanning for MXD files to a subdirectory. This new feature allowed me to set the source root and destination root directories, and then type the subdirectory to be processed. The text box could also be left blank.
I used this feature a lot.
The main feature I stopped using was the scheduler. The software was so slow that it never came close to saturating the network. The point of having the program pause at night was to let backups finish faster by not competing for the network... but since the job appeared to be CPU-bound, I had a big disincentive to be a good network citizen. The most efficient thing to do was let the app run all the time, even if it impacted the network -- every time the app was "nice" to the network, the time to complete its task would grow by the length of the pause, plus the additional CPU time it took to finish the task.
Another way to say it: "donating a dime cost a dollar."
Another way to look at it: I had to use the shared network resource first in order to use the private CPU resource, and the CPU resource was in shorter supply relative to the real time spent.
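To make the tradeoff concrete, here is some back-of-the-envelope arithmetic with made-up numbers (the post doesn't give actual figures):

```python
# Suppose the whole job needs 100 hours of CPU time, and the app only
# makes progress (and only touches the network) while it is running.
cpu_hours_needed = 100

# Running around the clock: wall-clock time equals CPU time.
continuous_hours = cpu_hours_needed

# Being "nice" by pausing 8 hours each night: each 24-hour day yields
# only 16 hours of progress, so the job stretches out considerably.
polite_hours = cpu_hours_needed / 16 * 24

print(continuous_hours, polite_hours)
```

With these illustrative numbers, an 8-hour nightly pause turns a roughly 4-day job into a 6-plus-day job: the dime donated to the network each night costs a dollar of calendar time.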
What happened, instead, was that I used the subdirectory box to partition the entire task into long and short jobs (a job being a batch of files). I also had a list of active and dormant projects.
During the day, I tried to run dormant projects and short jobs. During the night, I tried to run active projects and long jobs. (I did all this via remote access.)
The daytime strategy was to increase the app's downtime. I could start a job, leave it using the network for an hour or two, and then possibly forget to start another job. The lost productivity on my side was balanced by faster network access for the people in the office.
The nighttime strategy was to reduce downtime and move active projects (so people wouldn't see files vanishing as they worked on them). If the big batch of files had no errors, it would run all night, slowly, due to the backups also running, but it would eventually get done. Built into the backup system was a 3-hour buffer between 5am and 8am, so the risks of swamping the network could be mitigated.
As I noted in the previous post, bad data files caused problems. These exceptions tended to be isolated to subdirectories, so that some subdirectories had a lot of errors, and others had few or none. By deferring processing of troublesome directories, I could batch those up when I could get some face-time with the computer, to ease the tool through the rough spots.
During this effort to transfer files, there were around 120 bad files out of 1,400 MXD files total. They were concentrated in five directories, and within those, they tended to cluster. It's probably a bad linked file or something similar.
The problem with the bad files must be addressed, because they make the entire effort run longer. This 5-day effort stretched to 9 days -- that's the effect of a pretty small fraction of bad files. On the other hand, if your files are good (and you know this because you don't get errors opening them when you use them), there won't be many (or any) problems running batches.
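Putting the numbers above into arithmetic shows just how disproportionate the impact was:

```python
bad_files, total_files = 120, 1400
planned_days, actual_days = 5, 9

bad_fraction = bad_files / total_files             # roughly 8.6% of files
schedule_overrun = actual_days / planned_days - 1  # an 80% schedule overrun
```

Under 9% of the files were bad, yet they added 80% to the schedule, which is why chasing down the error clusters is worth the face-time.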