Cooperative rendering failure

#149373

By Maxer - Wed May 03, 2006 4:09 pm

Well, I was hoping to post some results from my last attempt at cooperative rendering, but I've had some problems. I was rendering with 80+ nodes and have discovered some things that make cooperative rendering with this many machines very difficult.

-The auto merge feature will only work if all the render nodes that started the job are online and working properly. Apparently cooperative rendering is unable to bypass nodes that have had problems or have gone offline.

-It's not possible to remove or add a render node to a job once it's been started. This makes job management very difficult.

-When you create render farm groups, those groups will only exist as long as the manager is running; once it is shut down the groups disappear.

-Some of my render nodes still show that they are rendering long after the target time has expired. This holds up the entire merge process until the last machine is finished. There should be a way to bypass machines that are still rendering, have errored out, or have gone down, so that you can still get a merged MXI file.

-It isn't possible to pause a job.

-Once one job is finished you have to clear out all MXI and MXS files from the folder and reboot the node in order to render a new scene. If you don't do this the node will only render the first scene it received; no new scenes can be rendered.

-Neither mxcl -d nor mxcl -manager generates any useful error reports, so it's almost impossible to know why a particular job failed on any node.

-There is no way to preview the cooperative rendering job since the image file being saved to the network directory seems to only come from a single render node and the merger process only occurs after the entire job has completed rendering.

-Node folders are left with hundreds of megabytes of useless data in them that must be manually cleaned up after every rendering.

Overall, my first large-scale cooperative rendering test has been a complete failure. There is no way that anyone can effectively use cooperative rendering with a render farm of any size. There must be some kind of job management built into the system so that problems can be dealt with manually, but there also needs to be some automatic management. I would like to know how the A-team and NL tested this system and how they set it up to get around these problems, because from where I stand there is no way this can be considered a usable process.

Official B-Team Member

http://devinjohnston.cgsociety.org/gallery/

#149473

By Maxer - Wed May 03, 2006 8:17 pm

These are the results after I manually merged everything together. This was about 40 nodes' worth of MXIs rendered at 3000x2250 over a 9-hour period. I believe the highest SL was 11. There is also an old MXI mixed in with the new ones, which is why there is a glowing teapot in the upper left.

Image produced by one node:

Image produced by 40 nodes:

#149483

By aitraaz - Wed May 03, 2006 8:38 pm

Maxer wrote: There is also an old MXI mixed in with the new ones, which is why there is a glowing teapot in the upper left.

Fantastic, the glowing teapot... Dunno, looks like coop needs some fixin'. I'd suggest heavy drinking/tranquilizers while you manually merge the files right now. Looks like it's got potential, though.

chester, i’ve got news for you, fort minor was amazing, and you suck.

#149495

By JesperW - Wed May 03, 2006 8:58 pm

I'm curious as to why the ceiling-hung lamps have a black blotch in them when the reflections show a bright one...

#149496

By Maxer - Wed May 03, 2006 9:02 pm

Good question. I have no idea what's causing that, since there is an emitter material applied and it seems to be producing light. Could this be another bug?

#149502

By b-kandor - Wed May 03, 2006 9:24 pm

In the meantime, a script that copies all the nodes' MXI files to a common location, renaming them at the same time, might help ease the pain? Instead of batch files I use a program called allsync to both copy and then delete MXIs from my nodes (I must be lazier than you, since I only have 3 machines and they are all within 10 feet of me...)
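A minimal sketch of such a collection script, in Python rather than a batch file. Everything about the layout is an assumption: that each node's output folder is reachable as a directory (for example over a network share), that the folder name identifies the node, and that the files end in a lowercase `.mxi` extension. The function name `collect_mxi` is made up for illustration and is not part of Maxwell.

```python
import shutil
from pathlib import Path


def collect_mxi(node_dirs, dest_dir, move=False):
    """Copy (or move) every .mxi file from each node folder into dest_dir.

    Each file is renamed to "<node folder name>_<original name>" so that
    files with the same name on different nodes do not overwrite each other.
    Returns the list of destination paths.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    collected = []
    for node_dir in map(Path, node_dirs):
        # Note: "*.mxi" is case-sensitive on most non-Windows systems.
        for src in sorted(node_dir.glob("*.mxi")):
            target = dest / f"{node_dir.name}_{src.name}"
            if move:
                # Moving also clears the file out of the node folder.
                shutil.move(str(src), str(target))
            else:
                shutil.copy2(src, target)
            collected.append(target)
    return collected
```

Calling it with `move=True` does the copy-then-delete step in one go, which also keeps an old MXI from a previous job from being picked up by mistake the next time around.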

Kandor

#149505

By Mihai - Wed May 03, 2006 9:32 pm

That is strange with the black emitters... BTW, how many polys does each emitter have? They look pretty high-res, which will add to the noise, especially since you have so many.

Maxwellzone.com - tutorials, training and other goodies related to Maxwell Render
Youtube Maxwell channel

#149510

By Maxer - Wed May 03, 2006 9:36 pm

Actually, I just checked and they still have an old beta emitter material applied to them, which is probably why they aren't working right. Each one has 288 polys. Does the number of polys affect the noise?

#149515

By Mihai - Wed May 03, 2006 9:39 pm

Yes.

Keep emitters as low-poly as possible. If you can, keep them to a single plane, but in this case I think a 9-poly sphere will be enough.

#149517

By Maxer - Wed May 03, 2006 9:40 pm

Thanks Mihai

#149519

By jdp - Wed May 03, 2006 9:43 pm

Turn the lights off, it's noon after all...

BTW, the render is really, really nice, even if the teapot ghost is scary (it reminds us all of the pain we went through with Max).

I strongly support your effort to get this damn coop working... keep it up!

Do you want me to sit in a corner and rust or just fall apart where I'm standing? | Marvin da paranoid android

#149534

By Maxer - Wed May 03, 2006 10:04 pm

Oh, good news: I've determined that you don't have to erase the cooperative.mxi files from the node directories. As long as you restart mxcl -server you will be fine and can render out a new scene.

#149616

By b-kandor - Thu May 04, 2006 1:19 am

That's good to know - thanks.

Kandor

#149663

By daros - Thu May 04, 2006 2:51 am

One question, Maxer... does your final image have more or less noise than this one?

RC5 rendering on 30 AMD 4200 dual cores using a simple RGB sum.
No postprocessing. This is the original size.
150 minutes rendering + 10 seconds to merge the RGB data.

This one is more complex. The emitters have metal projectors for caustic beams.
RC5 rendering on 30 AMD 4200 dual cores using a simple RGB sum.
No postprocessing. This is the original size.
300 minutes rendering + 10 seconds to merge the RGB data.

This technology doesn't work with V1 because its sampling patterns are too regular.

#149665

By daros - Thu May 04, 2006 2:57 am

I know, with an MXI sum the result would be more correct. In your example it is beautiful to see how the energy is summed correctly.
The RGB sum has the advantage that it is totally scalable.
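
The thread does not say how the RGB merge is implemented, so the following is only a sketch of the general idea, assuming the merge is an equally weighted per-pixel average of the images from each node. It simulates 30 nodes rendering the same flat grey pixel value with independent Gaussian noise (the noise model and all the numbers are illustrative assumptions) and compares the error of one node's image with the error of the merged one.

```python
import random


def average_frames(frames):
    """Average equally weighted frames pixel by pixel.

    Each frame is a list of floats (one channel value per pixel); all
    frames must have the same length.
    """
    if not frames:
        raise ValueError("need at least one frame")
    n = len(frames)
    return [sum(pixel_values) / n for pixel_values in zip(*frames)]


def rms_error(frame, truth):
    """Root-mean-square difference between a frame and the true value."""
    return (sum((p - truth) ** 2 for p in frame) / len(frame)) ** 0.5


def noisy_frame(truth, sigma, pixels, rng):
    """Simulate one node's render: the true value plus Gaussian noise."""
    return [truth + rng.gauss(0.0, sigma) for _ in range(pixels)]


rng = random.Random(0)  # fixed seed so the run is repeatable
truth, sigma, pixels = 0.5, 0.1, 10_000
frames = [noisy_frame(truth, sigma, pixels, rng) for _ in range(30)]

single = rms_error(frames[0], truth)
merged = rms_error(average_frames(frames), truth)
print(f"one node: {single:.4f}, 30 nodes merged: {merged:.4f}")
```

Averaging only reduces noise when each node's noise is independent of the others, which fits the remark above about V1's sampling patterns being too regular. The sketch also ignores clamping and tone mapping; averaging display RGB values is not the same as summing the underlying light energy, which is presumably why an MXI sum gives the more correct result.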
