Everything related to Maxwell network rendering systems.
By numerobis
#351088
I don't know when this problem first occurred (maybe after the 2.5 update), but almost every time I start a coop job with 4 or 5 nodes, one or two of them transfer the mxs and dependencies VERY slowly, and these nodes then start up to 50 min later.
It seems that it begins to copy the mxs (the mxs appears in the temp directory of the node). The node shows "New job order received"; "Connection for scene file reception." and then the mxs transfer seems to be paused. Then, after maybe 15-30 min (sometimes earlier), the mxs transfer completes and the texture transfer starts, but it is still much too slow: it takes 20-30 min to copy 200-300 MB of textures.
It's not the same node every time; it changes.

:?:

Manager: Win7 x64; nodes: WinXP x64; Kaspersky Internet Security running on all machines
All with onboard 1 Gbit LAN
Switch: Netgear GS716T, 1 Gbit, all energy saving settings disabled


Since it looks like I'm the only one facing this problem, are there any known network settings that should be made?!? :?
#351169
yes... no difference :(

But it seems that I'm the only one with this problem... so I think it must be network related.

Btw, all energy saving settings of the network cards are disabled as well.
#351658
I had a similar issue and it took me a while to get to the root of it. All my machines run Win 7 x64 in a serverless Windows homegroup setup. The problem with the nodes only went away after I switched from DHCP to static IP addresses. Perhaps that helps. Make sure the firewalls are off.

Cheers
#351676
I spoke too soon... sigh. The network was only stable for 24 hours... My nodes are erratic again... I can see them, all access paths are clear, yet one of 3 nodes always fails to receive the .mxs file and just sits there. Worst of all, it's not consistently the same node. It keeps changing...
#351679
So there are already two of us... welcome to the club! ;)

I have the whole network running with static IP addresses, and disabling KIS didn't change anything. Maybe there is still something running in the background... I don't know. I think I'll try disabling all Kaspersky-related services or maybe uninstall it.
But I don't think this will help, because it worked for 5-6 months with KIS enabled (with 1-2 nodes in the network). As I said before, I don't remember exactly when it stopped working... maybe after I added the third node.
#351854
I have 3 nodes too... It worked flawlessly with two before. Now it works 2 out of 10 times. Sigh. There is nothing more I can change. Really frustrating. Must be a win 7 over control issue. I can see and access all relevant folders. The nodes connect, no issues there, but one of them (random, not the same one every time) won't start. No license issues either, everything is installed correctly.
By numerobis
#352044
I don't think this can be the problem. The nodes have 12 or 16 GB and the biggest job took only ~9-10 GB in memory. They are rendering it, but with a huge delay that depends on the size of the submitted data (~1 hour later with 660 MB for scene and textures).
There must be a problem with the connection. It's as if it gets throttled to 10 Mbit for Maxwell, yet the link shows 1 Gbit and file transfer in Windows Explorer is OK. I have bought an Intel PCIe network card, which I'll try tomorrow.
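For a sense of scale, the figures quoted in this thread can be turned into an average transfer rate. This is only a minimal Python sketch: the function name is made up for illustration, it uses decimal units (1 MB = 8 Mbit), and it treats the reported delay as the transfer time, which is an approximation.

```python
def effective_mbit_per_s(megabytes, minutes):
    """Average rate in Mbit/s for `megabytes` MB moved in `minutes` minutes."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return megabytes * 8 / (minutes * 60)


# Figures reported in this thread (delay treated as transfer time, an assumption).
print(round(effective_mbit_per_s(660, 60), 2))  # 660 MB in about 1 hour -> 1.47
print(round(effective_mbit_per_s(200, 30), 2))  # slowest texture case   -> 0.89
print(round(effective_mbit_per_s(300, 20), 2))  # fastest texture case   -> 2.0
```

By this rough estimate the affected nodes receive data at about 1 to 2 Mbit/s, which is well below even the 10 Mbit guess.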
By Tim Ellis
#352047
Out of 26 nodes I have had one, then two, which slowed to a crawl. All run XP Pro x64, and every time it was the same two nodes. Updating these two to the latest Windows update solved it for a while, but then they went back to being slow. Maxwell 2.5.1 and 2.6.0 seem to have behaved the same.
Here are a few things you can try:
Defragging, disk checking, purging temp folders and files can help, as can an uninstall and reinstall of Maxwell.
Running a small co-operative render scene, with a single sphere rendered at low res to a high SL will help flush nodes of large files. If the slow nodes are not affected then it's possibly a RAM fault or could be a slow/loose network connection.
Try pinging each node and compare times, this can help identify a slow connection.
Try turning off send dependencies and pack and go your scene to a network folder accessible to all nodes. That might help too.
Check that the virtual memory settings on the slow nodes match those on the others, although this can only help if it's always the same nodes which are hanging.
Run the nodes in deep debug mode to see if there are any hidden issues not being reported in the main console windows for the Mxnetwork rendernodes. In any Mxnetwork object's Preferences, select Deep Debug from the Minimum Log Verbosity drop down menu and restart Mxnetwork.
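The ping comparison suggested above can be approximated with a small script. This rough Python sketch times a TCP connection instead of sending a real ICMP ping (so it is a stand-in, not the same measurement) and flags nodes that are unreachable or much slower than the median. The function names, the threshold factor and any host names or port you feed in are assumptions for illustration.

```python
import socket
import statistics
import time


def tcp_connect_ms(host, port, timeout=2.0):
    """Return the time in ms to open a TCP connection, or None on failure."""
    start = time.perf_counter()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            pass
    except OSError:
        # Covers refused connections, timeouts and name lookup failures.
        return None
    return (time.perf_counter() - start) * 1000.0


def flag_slow(latencies, factor=3.0):
    """Return names that are unreachable or slower than factor x the median."""
    known = [ms for ms in latencies.values() if ms is not None]
    if not known:
        return sorted(latencies)
    median = statistics.median(known)
    return sorted(
        name for name, ms in latencies.items()
        if ms is None or ms > factor * median
    )
```

To use it, build a dict such as `{name: tcp_connect_ms(name, 445) for name in nodes}` and pass it to `flag_slow`. Port 445 (Windows file sharing) is only a guess at a port that is open on every node. Connection time says nothing about sustained throughput, so a node that passes this check can still transfer slowly.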

Tim.
#352054
For Win 7 networks without a server, here is what works for me. Boot all machines and manually set every client's network setting to "home network". Make sure all firewalls are off. Make sure all network paths are reconnected, and test that they work. That gets all my nodes to connect correctly. Windows 7 resets/loses the network settings every time the machines are switched off, and that causes problems on restart. I have consistency now: nodes download and start correctly that way. In my case disconnects only happen when RAM overflows, which requires the frame sizes to be lowered. I render very large single frames, at times in excess of 5k x 10k, so RAM tends to be close to capacity at 80-90%. Leave some margin to avoid disconnects.
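The "test that they work" step can be scripted and run on each node before submitting a job. This is a rough Python sketch, assuming the paths you pass in are the network shares your scenes reference; the function name is hypothetical.

```python
import os


def missing_paths(paths):
    """Return the subset of `paths` that cannot be reached right now."""
    return [p for p in paths if not os.path.exists(p)]
```

For example, `missing_paths([r"\\manager\scenes", r"\\manager\textures"])` (placeholder UNC paths, not taken from this thread) returns the shares the node cannot currently see; an empty list means every path resolved.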

Cheers and happy render.
#367270
The problem is still there...
I changed the server OS from WinXP to Win7 last year and thought things were better afterwards, but now I think that nothing really changed. Some jobs start normally, and then 1-3, sometimes up to 6, nodes are missing.
If I restart such a node, it starts rendering immediately but doesn't get added to the job. It is still listed as "ready". Merging ignores these nodes.
And it is not possible to add them to the job manually later (which is VERY annoying, btw).
I think I'll change all nodes to Win7 too and see if this helps. My Win7 nodes normally connect without any delay (sometimes they don't, but maybe that is a separate problem) :roll:
#372575
I just wanted to let you know that the problem seems to be (almost) solved after upgrading all nodes to Win7 a few months ago. Now there is only one specific node which constantly starts with a delay... I don't know why, because the OS image is the same as on two other nodes. I think I will compare the BIOS settings again to see if something is different. But at least there is only one left now... :lol:
By polynurb
#372583
numerobis wrote:Now there is only one specific node which is constantly starting delayed...
Check the NIC!

I had a similar problem: one node would not accept RDP requests anymore. It turned out the NIC (or part of it) was broken.