
Network Render Crashing

Posted: Mon Jan 06, 2014 7:47 pm
by kami
Hello

I would really appreciate it if the network node (and manager) did not crash on every error message ...
For example, I start the render node at computer startup. If it does not find the license right at the beginning because the network takes a few seconds longer, it shows an error message. The only thing you can do is click 'OK', and then the node is killed.
If it did not show that message the instant it starts, but instead retried a few minutes later, everything would work perfectly.

The same goes for almost all the other error messages ... most of them will close the node when you click on the message. Why is that?
Sometimes it even closes a node which is already running fine on another job ...

Re: Network Render Crashing

Posted: Tue Jan 07, 2014 5:20 pm
by polynurb
not sure about the other errors but this might help with the network delay on startup:

http://www.r2.com.au/page/products/show ... p-delayer/

also it should be possible to use the Windows Task Scheduler with the "delay task" option to start the node a few minutes or so after login.

create task>new trigger>

Re: Network Render Crashing

Posted: Tue Jan 07, 2014 7:43 pm
by kami
thanks, I'll try that out!

Re: Network Render Crashing

Posted: Fri Jan 10, 2014 1:20 pm
by kami
I found a good workaround for the delay problem: I'm running a batch script that starts all network connections after a short delay and then launches the network render. Works fine!
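For anyone who wants the same workaround without a fixed delay, here is a minimal sketch in Python. It is not the batch script mentioned above. It polls until a server answers and only then launches the node. The host name, port and launcher path in the commented example are placeholders and not real Maxwell settings, and `host_reachable` / `start_when_ready` are names made up for this sketch.

```python
import socket
import subprocess
import time


def host_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


def start_when_ready(is_ready, command, attempts=30, delay_seconds=10.0,
                     sleep=time.sleep, launch=subprocess.Popen):
    """Poll is_ready() up to `attempts` times.

    Launches `command` as soon as is_ready() returns True and returns
    whatever `launch` returns. Returns None if it never becomes ready.
    """
    for attempt in range(attempts):
        if is_ready():
            return launch(command)
        if attempt < attempts - 1:
            sleep(delay_seconds)  # wait before the next check
    return None


# Example wiring (all values are placeholders, adjust to your own setup):
# start_when_ready(
#     lambda: host_reachable("license-server.example", 5555),
#     [r"C:\path\to\render_node_launcher.exe"],
# )
```

The readiness check is passed in as a function, so the same loop can wait for a mapped drive or a license file instead of a TCP port.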

And I was a bit too quick to complain about the network render, because so far it seems to be working much better than with V2. I have far fewer crashes. But I'd really like to see the remaining crashes fixed, especially when they are unnecessary.

So for example I had this error message on the manager machine:
There are error messages in the log starting at:[08/January/2014 18:00:40] 19_hycham has failed while sending/receiving dependencies. Trying to recover.. You can find more info using the search tools in the log window.
The manager is running fine, even with that message. But once I click 'OK', it closes the manager completely, and the currently running job is lost. I don't understand why just clicking on the error message crashes a manager that is otherwise running fine.

Re: Network Render Crashing

Posted: Wed Jan 22, 2014 1:33 pm
by kami
I've got a regular crash on one of my nodes and have no idea why this node is always crashing. The log shows:
[22/January/2014 11:51:04] This application has requested the Runtime to terminate it in an unusual way. Please contact the application's support team for more information. Signalhandler. Code: 22 
[22/January/2014 11:51:05] ERROR: Render process failed! Exit code: 3
Do you have any idea what is going wrong there? (Maxwell 3.0.0.1)

Re: Network Render Crashing

Posted: Wed Jan 22, 2014 2:02 pm
by dariolanza
Hello Kami,

Please try the latest Beta version (3.0.0.3) and make sure you launch your jobs with the "Send Dependencies" option disabled UNLESS your network is a mix of OSX and non-OSX computers (for pure-OSX, pure-Windows, pure-Linux or mixed Windows-Linux networks, the "Send Dependencies" option must be disabled).

Let me know about your tests with the 3.0.0.3 Beta.

Greetings

Dario Lanza

Re: Network Render Crashing

Posted: Wed Jan 22, 2014 5:28 pm
by kami
Hey Dario
Thank you. I'll try that. But as I have to install the latest beta on 20 computers, I can't do every beta update :) and wanted to wait for the next official release.
Best,
Christoph

Re: Network Render Crashing

Posted: Mon Feb 10, 2014 5:51 pm
by kami
Beta is installed and this problem no longer occurs. thx!
But I have another crash: for example, you launch a job and then immediately notice that you did something wrong when exporting, so you click stop before the rendering has started on all the nodes. This creates an error message on all the node computers which did not start rendering. When you click on that message, it kills the render node for no reason.
The other problem I have from time to time: when one computer fails to send the mxi to the manager, the manager will never merge all the other files ... Sometimes I even had a wrong mxi file from another job in the job folder. Then it failed to merge the files as well.

Re: Network Render Crashing

Posted: Tue Feb 11, 2014 6:34 pm
by kami
Another bug with the network render:

I have a job running (using 5 nodes) and I add another job with those 5 busy nodes plus three additional nodes. Sometimes it works and the new job starts with just the three nodes, but sometimes it queues the new job as 'pending' without starting on the free nodes. I never understood why it sometimes works and sometimes doesn't.
But I have a suspicion: it does not work if another render is already pending when you add one ... for example:
job 1 - rendering with 5 nodes
job 2 - pending with the same 5 nodes
job 3 - pending (with the same 5 nodes + the additional 3 nodes that are not starting).

Then I did a funny test: I added a node to job #2. Now it shows:
job 1 - rendering with 5 nodes
job 2 - rendering with 1 node
job 3 - still pending

Then I clicked on job 3 and added one of the three previously selected nodes. Now it starts rendering, and with all 3 previously selected nodes!
job 1 - rendering with 5 nodes
job 2 - rendering with 1 node
job 3 - rendering with 3 nodes

So my assumption is that when you have at least one pending job in the queue and you add another job, it won't start to render, even if you assign free nodes to it.
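To make the expected behaviour concrete, here is a toy model in Python of how I would expect the queue to behave: each job, in queue order, starts on whichever of its assigned nodes are still free, and only stays pending if none are. This is only a sketch of the expectation, not how the Maxwell manager is actually implemented. The `schedule` function and the node numbers are made up for illustration.

```python
def schedule(jobs):
    """Assign free nodes to jobs in queue order.

    jobs: list of (name, assigned_nodes) tuples, first in queue first.
    Returns {name: nodes the job renders on}; an empty list means pending.
    """
    busy = set()
    running = {}
    for name, nodes in jobs:
        free = [n for n in nodes if n not in busy]
        running[name] = free   # empty list -> job stays pending
        busy.update(free)      # these nodes are now taken
    return running


# The situation from the post: nodes 1-5 are shared, nodes 6-8 are extra.
queue = [
    ("job 1", [1, 2, 3, 4, 5]),
    ("job 2", [1, 2, 3, 4, 5]),
    ("job 3", [1, 2, 3, 4, 5, 6, 7, 8]),
]
result = schedule(queue)
for name, nodes in result.items():
    print(name, "rendering on" if nodes else "pending", nodes)
```

In this model job 3 starts on nodes 6, 7 and 8 even though job 2 is pending ahead of it, which is the behaviour I would expect from the manager.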

Re: Network Render Crashing

Posted: Tue Feb 11, 2014 6:55 pm
by Mihai
Just to clear things up a bit, how many nodes do you have in total?

Re: Network Render Crashing

Posted: Tue Feb 11, 2014 7:01 pm
by kami
In the example above, I spoke about 9 nodes in total.
Overall I have 21 nodes running, so 12 of them are not rendering.

Re: Network Render Crashing

Posted: Tue Feb 11, 2014 7:17 pm
by Mihai
kami wrote: So my assumption is that when you have at least one pending job in the queue and you add another job, it won't start to render, even if you assign free nodes to it.
It's a bit unclear to me from your tests above in which case you actually did this.

Could you please do this test:
Add a job1 using 5 nodes
Add a job2, also using these same 5 nodes (now this job is set to pending, which is normal)
Add a job3, use 4 of your free nodes out of those 9, and ONLY free nodes

Is job3 still set to pending?

Re: Network Render Crashing

Posted: Wed Feb 12, 2014 11:01 am
by kami
Yes. Job 3 is still pending.
I did it twice, once using only the 4 free nodes and once all 9 nodes - no difference.

You cannot get job 3 to render, even if you add additional nodes to it while it is pending.
But, if I add node #10 to the second job, this one will start with that node and is no longer pending. Then I add node #11 to job 3, which means job 3 starts with node #11, plus all the other previously assigned free nodes at once.

Re: Network Render Crashing

Posted: Wed Feb 12, 2014 7:54 pm
by Mihai
Aha ok, so adding a free node to a pending job makes the other pending nodes "unstuck"? Thanks a lot for the tests, we will take a closer look...

Re: Network Render Crashing

Posted: Wed Feb 12, 2014 10:10 pm
by kami
yes, but only if there is no other job pending ahead of it in the queue.
thanks