All posts relating to Maxwell Render 1.x
By Bubbaloo
#286530
The main issue for me is that you are not getting 100% from your hardware due to the way the coop mode is designed (merging different mxis together to improve SL), so that each node only improves quality by a percentage, rather than rendering regions, for example.
I believe you are incorrect on this point. 2 nodes with the same specs rendering an mxi for 1 hour is the same as rendering for 2 hours on only one of those nodes. 5 nodes at 1 hour = 1 node at 5 hours, etc. While adding nodes improves render speed linearly, the cost of each additional S.L. grows exponentially, creating the false perception that a merged mxi's S.L. is lower than expected.
By frosty_ramen
#286532
So Brian, you're saying that merged renders show a low S.L., but the S.L. is not necessarily correct?
By Bubbaloo
#286534
Nope, I'm saying that S.L. 10 + S.L. 10 = +/- S.L. 12, where the S.L. is correct but could give you the idea that it should be higher.
By Mihai
#286539
Brian is correct! :D

If one node takes 2 hours to go from SL 10 to SL 12, two nodes each rendering for 1 hour will also reach about SL 12 in the final merged mxi. So the speed gain is almost linear with each node you add.
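
A toy model makes that arithmetic concrete. This is my own sketch, assuming S.L. rises by a fixed step for every doubling of total samples, calibrated to the "SL 10 + SL 10 = +/- SL 12" figure above; it is not Next Limit's actual formula.

import math

SL_PER_DOUBLING = 2.0  # assumed: merging two SL-10 mxis gives ~SL 12

def merged_sl(per_node_sl: float, num_nodes: int) -> float:
    """Approximate S.L. of num_nodes equal-time mxis merged together."""
    return per_node_sl + SL_PER_DOUBLING * math.log2(num_nodes)

print(merged_sl(10, 2))  # ~12.0: two SL-10 mxis merge to ~SL 12
print(merged_sl(10, 4))  # ~14.0: four nodes buy only +4 S.L. over one

Under the same toy model, four coop nodes aiming for a merged SL 10 would each only need to reach about SL 6 (10 - 2.0 * log2(4)).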
By deflix
#286549
Can you clarify...

Are you saying that to achieve an image at a given SL it is just as efficient to render in Coop as it would be to render separate regions on separate nodes and put them together?

As an example, a 2000x2000 image at SL 10 with 4 nodes:

In coop you have each node rendering the full 2000x2000 and having to hit roughly SL 7.0 (I don't know the exact relationship).

Is that the same as each node rendering a 1000x1000 region at SL 10?

>> has this been confirmed/calculated?

By the way, if it is correct, thanks a lot for clearing that up, Bubba!!
By casey
#286558
deflix wrote:Are you saying that to achieve an image at a given SL it is just as efficient to render in Coop as it would be to render separate regions on separate nodes and put them together?
<technical rambling>

This is actually central to the "unbiased" part of "unbiased renderer".

Unbiased renderers produce pixel samples which are averaged together to produce images with less and less noise. As long as two separate renderings were done with different random seeds, averaging them together is no different than doing a single render for twice the time.
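
As a hedged illustration of that claim (a stand-in simulation with a made-up uniform "sampler", not Maxwell code), averaging two independently seeded renders lands at the same noise level as a single render with twice the samples:

import numpy as np

def render(n_samples: int, seed: int) -> np.ndarray:
    """Stand-in for an unbiased renderer: each of 1000 'pixels' is the
    mean of n_samples unbiased (uniform) samples."""
    rng = np.random.default_rng(seed)
    return rng.uniform(0.0, 1.0, size=(n_samples, 1000)).mean(axis=0)

one_long = render(2000, seed=1)                               # one node, 2x time
merged   = (render(1000, seed=2) + render(1000, seed=3)) / 2  # two seeds, merged

print(one_long.std())  # noise of the long render
print(merged.std())    # ~identical noise in the merged result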

That is the theory.

Now, granted, in practice, it may actually be the case that you see a small speed boost if you render, say, four tiles of 1000x1000 instead of four 2000x2000 images. The reason for this is not theoretical, but practical, and that is because if the tiles become small enough, they may lead to better cache coherency and reduce the degree to which Maxwell is waiting on memory accesses.

As an extreme example, suppose you rendered an image so large that it was actually larger than the core memory and had to use your system's virtual memory to store it. For example, suppose a machine had only 256 megs of memory in it, and you were rendering a 256 megabyte image. Obviously it cannot fit that, plus the OS, plus the scene, plus the textures, etc., so it is going to be swapping out to disk all the time.

This would drastically reduce the rendering efficiency, because now there's actually hard drive reads and writes involved in updating the image.

As a result, rendering that image as four 64 megabyte images would dramatically improve the rendering speed, because now there would be no VM swapping.

Make sense?

The same rules apply to CPU caches and main memory, just to a much less dramatic extent. So whether or not it would be an important speed win to do small tiles is something that we'd have to do extensive testing to determine. My intuition is that the tiles would have to be small, like 512k or one megabyte or something, before you would see a worthwhile speed improvement.

Since Maxwell is presumably keeping things around in floating point color, we're talking chunks of 128x128 or 256x256 at most.
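
A back-of-envelope check of those tile sizes, assuming 4 float channels (32-bit RGBA) per pixel; the buffer layout is my assumption, not a confirmed detail:

# Tile memory footprint under the assumed 16-bytes-per-pixel layout.
BYTES_PER_PIXEL = 4 * 4  # 4 channels x 4-byte float (assumed)

for side in (128, 256):
    tile_bytes = side * side * BYTES_PER_PIXEL
    print(f"{side}x{side} tile: {tile_bytes / 1024:.0f} KiB")
# 128x128 -> 256 KiB, 256x256 -> 1024 KiB: right in the 512k-1MB range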

In general, regardless of all other concerns, the entire improvement possible from cache optimization of this form would depend on the amount of time Maxwell actually spends waiting for memory accesses to read, modify, and write back pixel color computations. It is entirely possible that this amounts to a negligible amount of time (2%, 3%, 5%, etc.), in which case no matter how much of a win the tiles were, you're not going to see a speedup that you actually care about.

Since I don't work for Next Limit, I don't know what that percentage is, but someone there probably does :)

</technical rambling>

- Casey
By Bubbaloo
#286560
In addition to that, what if you break up the render into chunks and render over the network to computers with different CPU speeds? When the final image is merged, wouldn't you be able to see patterns in the noise from the differing sampling levels reached?
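
A quick toy simulation of that concern (a stand-in noise model, not Maxwell output) shows why strips from nodes of different speeds would end up with visibly different noise floors:

import numpy as np

rng = np.random.default_rng(0)
# 1000 'pixels' per strip; the fast node gets 4x the samples of the slow one.
fast_strip = rng.uniform(0, 1, size=(1600, 1000)).mean(axis=0)
slow_strip = rng.uniform(0, 1, size=(400, 1000)).mean(axis=0)

print(fast_strip.std(), slow_strip.std())  # slow strip is ~2x noisier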
By medmonds
#286565
My main request for network rendering has to do with the workflow. I'd like to be able to be working in my 3D app of choice, initiate a render from the Maxwell plugin, and have the Maxwell manager automatically see the job, connect any available servers (which really should be called slaves or clients :) ), send it out to be rendered, and then pop up a status window displaying the latest combined image at requested intervals (SL 2, 4, 6, 8, 10, etc.), with a button to cancel the job if desired.

Has this already been requested?
By deflix
#286676
Thanks for this - very interesting and informative.
Definitely agree about the workflow, not least having a 'kill' button for cancelling coop renders quickly... but I was really in the dark about the efficiency of coop mode until this post, so thanks again!
By deflix
#287211
Have to raise this once again - sorry, it's just not feeling right still!

We have to produce high-res images here at the studio and I'm desperately trying to get Maxwell to work as an integral part of the company - faith and all that! - but I'm so, so frustrated with net rendering. I appreciate that coop DOES use all the power, but I have discovered another reason for implementing strip rendering instead: memory/resolution.

I.e. if rendering a large image (4000x3000, for example), we quickly run out of memory when using Multilight - surely this would be overcome if each node dealt with a strip instead?
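
For scale, a rough estimate of why Multilight eats RAM at high res. It assumes 4 float channels per pixel and one image buffer per Multilight emitter, which is my guess at the storage, not a documented figure:

BYTES_PER_PIXEL = 4 * 4  # assumed 4 x 32-bit float channels
width, height = 4000, 3000
per_buffer_mib = width * height * BYTES_PER_PIXEL / 2**20

for emitters in (1, 5, 10, 20):
    print(f"{emitters:2d} emitters: ~{emitters * per_buffer_mib:5.0f} MiB")
# 20 emitters would already need ~3.7 GiB just for image buffers.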

There just seems something not quite right about having so many vast mxi files to deal with (2 per node), especially if a node becomes 'lost' and they need to be manually merged.

Case in point: today I have so far spent 3 hours copying 4.6 GB mxi files from each node to the server in order to merge them together, as a node was 'lost' last night. The process is mind-bendingly tedious!

Net rendering really needs to be improved. Desperately. I think Maxwell depends on it.

f
By polynurb
#287222
I think the possibility to net render from the plugin would be very helpful indeed!
Or, which I think is not possible at the moment, to assign netrender jobs over the command line via batch files.

A partial workaround:
Something like this has been posted before... my version looks like this:

using the xcopy command

http://support.microsoft.com/kb/240268

R, S and T are network shares of the nodes' temp folders.
E is a scratch drive in my workstation, which is here called NODE4.
/y is the force-overwrite switch.


..........................................................................................
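REM Pull each node's cooperative mxi onto the local scratch drive for fast merging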
xcopy R:\cooperative.mxi E:\temp\coop\NODE1.mxi /y
xcopy S:\cooperative.mxi E:\temp\coop\NODE2.mxi /y
xcopy T:\cooperative.mxi E:\temp\coop\NODE3.mxi /y
xcopy "C:\Documents and Settings\YOUR_USER\Local Settings\Temp\cooperative.mxi" E:\temp\coop\NODE4.mxi /y

..........................................................................................

Save this text as a file, e.g. "coop.bat"; when you double-click on it, it executes the xcopy commands.

This gets you all the coop mxi files on your local machine, which is much better for merging than the coop preview button in mxcl, because all files are local on a fast HDD instead of the network, so merging is quicker. You also get to use the Multilight feature etc. while rendering.

It is a very efficient and fast way, also for big Multilight images.
My guess would be that you don't gain that much RAM/speed by just decreasing the resolution (stripes), as mxcl still has to load the entire scene & textures (actually more often with stripes), and all calculations remain at the same complexity.

..RAM is a problem though.. it tends to vaporize with Multilight and high res.. :roll: I am just thinking about upgrading from 4 to 8 GB on the nodes..
By tanguy
#287223
Hmm, Maxwell and network rendering is definitely not so trivial, but there are some general rules to keep in mind...


First: Generic Network Rendering Rules

The Maxwell repository (where you write data) has to be on a server-class OS, because that is the only way to get past the simultaneous-connection limit (5 for XP Home and 10 for XP Pro). Same thing for the bitmap and data servers.
The Manager also has to be run on a server OS.
The render servers can run on XP Pro or XP64 without any problems (for the Linux side, I am not aware enough, so...).

Second: Specific to Maxwell Render

Typically, professionals need to run big scenes or big output sizes, so it's often under 64-bit technology to bypass the 3 GB RAM limitation.
So the render servers have to run under XP64, and this forces the Manager to also be 64-bit... Same thing for the file servers; it's safer than a mix between 32 and 64.
The nodes should be of roughly the same power ratio: e.g. all 8-core, or 8-core and 4-core machines of the same per-core power (a dual 4-core XEON 54XX and a single 4-core i7, for example). Running nodes of drastically different computation power is a waste of time and efficiency: when the stronger nodes finish, they have to wait for the slowest, and that can take quite some time.
Ex: a dual 4-core XEON 54XX reaches SL 15 in 5 hours, then has to wait another 5 or 6 hours for the single 4-core XEON 54XX to finish.

Our bitmap servers connect to the render nodes through fiber optics and to our workstations through the Gigabit LAN, and the nodes check the bitmaps 200x faster than our workstations...



That's what we have found and what we think here, after many stages of testing.
By Bubbaloo
#287224
I thought that when you set a target S.L. for coop renders, it rendered until the combined result reached approximately the preset S.L., therefore bypassing the issue of waiting on slower nodes.
By polynurb
#287225
Bubbaloo wrote:I thought that when you set a target S.L. for coop renders, it rendered until the combined result reached approximately the preset S.L., therefore bypassing the issue of waiting on slower nodes.
That is what I thought I had experienced as well... however, most of the time I stop the jobs manually or set a target render time; maybe it behaves strangely with many/different nodes... I just run 4 machines identical in speed.