My main complaint with both ETAP and SKM PTW is that neither company seems to have done much to improve the basic load flow/short circuit analysis models from the early days when they were written in Fortran 77. They both still carry the same bus matrix size limitation as they did in the mid-1980s, when an engineering workstation had only 640 kB of RAM and a 20 MB hard drive was considered excessive. OK - they max out at 5,000 buses now, where 20 years ago they could only handle 3,000, but hey..
PTW's limitations are now based on the size of your wallet. If you want to go to 10,000 buses, just pay more money.
The issue is that the math has not changed from twenty years ago. Whether the problem space is load flow or short circuit, ultimately what you have is a matrix that has to be solved, and depending on the type of power system analysis it can require an iterative calculation. Although matrix problems can be parallelized, iterative calculations by nature are not, since each iteration depends on the previous one. A general dense solve is roughly O(N^3), and FFT-style fast methods only get you to O(N log N) for specially structured matrices, which a bus admittance matrix is not. What it is, though, is sparse and in most cases banded, so a sparse factorization brings the cost down to O(N) or very close to it in practice. The real problem is that because small changes ripple through the overall results, the tools default to recalculating the entire matrix after every change, which is computationally intensive. At best we may recognize that most of the nodes have nothing branching off them and eliminate them (standard Kron reduction), so that the general solver only ever sees the buses where branches actually meet; the eliminated nodes are then back-solved separately afterwards. That alone would significantly reduce the overall size of the calculation.
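Roughly, in SciPy terms, that elimination step could look something like the sketch below. This is purely illustrative; the function name, the keep/elim bookkeeping, and the use of scipy.sparse are my assumptions, not how ETAP or PTW actually implement it.

```python
from scipy.sparse import csc_matrix
from scipy.sparse.linalg import spsolve

def kron_reduce(Y, keep, elim):
    """Eliminate passive buses (no injections, nothing branching off them)
    from a sparse bus admittance matrix: Y_red = Ykk - Yke * inv(Yee) * Yek."""
    Y = csc_matrix(Y)
    Ykk = Y[keep, :][:, keep]
    Yke = Y[keep, :][:, elim]
    Yek = Y[elim, :][:, keep]
    Yee = Y[elim, :][:, elim]
    X = spsolve(Yee.tocsc(), Yek.tocsc())   # sparse right-hand side -> sparse result
    return (Ykk - Yke @ X).tocsc()
```

Once reduced, you factor the small matrix once (e.g. with scipy.sparse.linalg.splu) and reuse the factorization for every injection vector; the eliminated buses are back-substituted afterwards from the solved boundary buses.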
To get better we would first have to accept that the results are all approximations anyway, and given what we are doing, that should be a no-brainer. Then we can set a target of, say, <1% error and calculate a shadow matrix that represents the effect that parameter changes have on the overall result. As changes are made, the algorithm can automatically determine how far each change needs to propagate through the network. If the change contributes less than 1% error locally, it is not propagated; otherwise it propagates to the next part of the network, where the same approximation check is made again to decide whether the result 'bubbles up' any further. The overall result would be a much more responsive system that could potentially even run on lower-end hardware (e.g. cell phones).
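As a rough sketch of what that check could look like (the sensitivity matrix S, the tolerance, and the function name are all just assumptions for illustration):

```python
import numpy as np

TOL = 0.01   # the <1% approximation target

def buses_needing_update(S, changed_bus, delta, baseline):
    """Use a precomputed 'shadow' sensitivity matrix S, where S[i, j] estimates
    how much result i moves per unit change of parameter j, to decide which
    buses actually have to be recalculated after a change of size `delta`
    at `changed_bus`.  Buses whose estimated relative change stays under TOL
    absorb the change locally and are left alone."""
    estimated = np.abs(S[:, changed_bus] * delta)
    relative = estimated / np.maximum(np.abs(baseline), 1e-12)
    return np.flatnonzero(relative >= TOL)
```

Only if that set is non-empty does the exact solver run, and only over the affected portion of the network; the same test is applied again at the next level up to decide whether the result bubbles up any further.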
PTW lets you do something like this in practice with a 'windowed' calculation, where the recalculation is run on only a user-selected portion of the overall matrix. A windowed update on, say, a single MCC is nearly instantaneous, compared to 20-30 minutes of calculation time to update the short circuit estimate for a network of 3,000 nodes.
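Conceptually, a windowed update just re-solves the nodal equations for the selected buses while holding the rest of the network at its previously computed voltages. Something like this (dense and simplified, purely to show the idea, not PTW's actual internals):

```python
import numpy as np

def windowed_solve(Y, V_prev, I, window):
    """Re-solve only a user-selected window of buses.  Buses outside the
    window are frozen at their last solved voltages, so the window sees them
    only through the coupling term Ywo @ V_outside:
        Yww * V_window = I_window - Ywo * V_outside
    """
    outside = np.setdiff1d(np.arange(Y.shape[0]), window)
    Yww = Y[np.ix_(window, window)]
    Ywo = Y[np.ix_(window, outside)]
    rhs = I[window] - Ywo @ V_prev[outside]
    V_new = V_prev.copy()
    V_new[window] = np.linalg.solve(Yww, rhs)
    return V_new
```

That is why updating a single MCC is effectively instant: the window is a handful of buses, and everything else enters the calculation only as a boundary condition.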
An inherent problem with what you are describing, though ('breaking it into smaller networks'), is that you are manually doing what I'm describing. For instance, if I have a large operation with a very stiff main/utility bus, I can first convert the loads into lumped parameter models and simulate the 'main' by itself. Then I can treat each area branching off from the main/utility bus as a separate network.
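The lumping step itself is trivial; assuming constant-power loads it is little more than this (names here are hypothetical):

```python
def lump_feeder(load_S_values, feeder_losses=0j):
    """Collapse a feeder's individual loads into one equivalent complex power
    (P + jQ) at the point where it ties into the main/utility bus."""
    return sum(load_S_values, 0j) + feeder_losses

# The main/utility bus study then sees one injection per feeder, and each
# feeder can later be solved on its own with the main bus as a stiff source.
```

The point is that the tool could automate exactly this partition/aggregate/solve cycle instead of making the engineer do it by hand.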