Everything related to Maxwell network rendering systems.
By zparrish
#393310
Since there are no official docs on this yet, can we cover the basics here in the forums? To start, could anyone who is successfully using the Tech Preview and can see all of its UIs post the items listed below? As a baseline, can we tailor the info to my render farm, which has only 10 hosts in total? For reference, here's the basic layout of my hosts by IP:

My current setup:
Manager:
192.168.2.39

Nodes:
192.168.2.32
192.168.2.39
192.168.2.40
192.168.2.41
192.168.2.42
192.168.2.43
192.168.2.44
192.168.2.45
192.168.2.46
192.168.2.47
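
As a quick sanity check on the layout above, here is a minimal Python sketch that confirms every host sits in one subnet and that no address is listed twice. The addresses are copied from the list; the 192.168.2.0/24 network and the `check_layout` helper are assumptions made for illustration, not part of mxnetwork.

```python
import ipaddress

# Addresses copied from the host list above. The /24 network is an assumption
# based on the 192.168.2.x addressing.
SUBNET = ipaddress.ip_network("192.168.2.0/24")
MANAGER = "192.168.2.39"
NODES = [
    "192.168.2.32", "192.168.2.39", "192.168.2.40", "192.168.2.41",
    "192.168.2.42", "192.168.2.43", "192.168.2.44", "192.168.2.45",
    "192.168.2.46", "192.168.2.47",
]


def check_layout(manager, nodes, subnet):
    """Return a list of problems; an empty list means the layout looks sane."""
    problems = []
    for host in [manager, *nodes]:
        if ipaddress.ip_address(host) not in subnet:
            problems.append(f"{host} is outside {subnet}")
    if len(set(nodes)) != len(nodes):
        problems.append("duplicate node address")
    return problems


print(check_layout(MANAGER, NODES, SUBNET))
```

The manager appears in the node list on purpose, since that host is meant to run both roles.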

Items we should address on this topic:
1.) A working copy of the mxnetwork_config.ini file - I have a good idea of what most of the settings in there do; I just want to make sure that I'm configuring them correctly. The defaults won't work for integrating my render hosts into the same subnet with a LAN shared directory. Here's the default for reference. In my case, I need to have all the hosts working on a 192.168.2.x/24 subnet, so using the loopback address of 127.0.0.1 won't work:

Default
Code:
[Node]
period = 5000
jobman_address = 127.0.0.1
jobman_listen_port = 45445
shared = C:\Users\ZPARRISH\AppData\Local\Temp\mxnetry855k6o
extractlights = False

[NewJobman]
publish_port = 45446
listen_port = 45445
listen_address = 0.0.0.0
publish_period = 10000

[NewParams]
shared = C:\Users\ZPARRISH\AppData\Local\Temp\mxnetry855k6o
listen_address = 0.0.0.0
listen_port = 45451

[client]
jobman_listen_port = 45445
params_address = 127.0.0.1
debug = False
params_listen_port = 45451
server_threads = 4
client_port = 9999
type = web
jobman_address = 127.0.0.1
jobman_publish_port = 45446
preview_size = 100
destroy_period = 900000
shared = C:\Users\ZPARRISH\AppData\Local\Temp\mxnetry855k6o
Proposed (for Manager / Node host @ 192.168.2.39)
Code:
[Node]
period = 5000
jobman_address = 192.168.2.39
jobman_listen_port = 45445
shared = Y:\Shared
extractlights = False

[NewJobman]
publish_port = 45446
listen_port = 45445
listen_address = 192.168.2.39
publish_period = 10000

[NewParams]
shared = Y:\Shared
listen_address = 192.168.2.39
listen_port = 45451

[client]
jobman_listen_port = 45445
params_address = 192.168.2.39
debug = False
params_listen_port = 45451
server_threads = 4
client_port = 9999
type = web
jobman_address = 192.168.2.39
jobman_publish_port = 45446
preview_size = 100
destroy_period = 900000
shared = Y:\Shared
Proposed (for Node host @ 192.168.2.47 and all other Node hosts)
Code:
[Node]
period = 5000
jobman_address = 192.168.2.39
jobman_listen_port = 45445
shared = Y:\Shared
extractlights = False

[NewJobman]
publish_port = 45446
listen_port = 45445
listen_address = 192.168.2.39
publish_period = 10000

[NewParams]
shared = Y:\Shared
listen_address = 192.168.2.39
listen_port = 45451

[client]
jobman_listen_port = 45445
params_address = 192.168.2.39
debug = False
params_listen_port = 45451
server_threads = 4
client_port = 9999
type = web
jobman_address = 192.168.2.39
jobman_publish_port = 45446
preview_size = 100
destroy_period = 900000
shared = Y:\Shared
2.) An example URL for accessing the Monitor, based on a working configuration.

3.) Some screenshots of the UIs working properly. - This will be a good target for people trying to get this off the ground.

4.) (Optional) Execution strings used to launch the components for a working setup.
- If there are specific command line flags needed to launch the components correctly so they target the correct subnet and related hosts, please include those for others to follow along.

Thanks!
#393317
Hello, let's try to make this work.

Before starting

The config file has sections for both the manager and the node; which sections are actually used depends on which of the two you start.

The shared parameter refers to the local mount point of the shared storage.

Every listen_address can be set to 0.0.0.0 to listen on all interfaces.

You can run every separate module (jobman, params, client and node) on a different computer, however the most common setup is to have a manager running everything and the additional render nodes.

If no config file is found, mxnetwork creates one and fills it with default data. This default data will almost never work out of the box, because the shared directory will point to a temporary location. When using the program without the GUI you can override the configuration file settings on the command line.
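
To make the point about defaults concrete, here is a small Python sketch that scans a config for values that look like untouched defaults. It is only a heuristic under stated assumptions: the `suspect_settings` helper is hypothetical, and treating any shared path containing "temp" or "tmp" as a leftover default is a guess, not an mxnetwork rule.

```python
import configparser


def suspect_settings(config_text):
    """Flag values that look like untouched defaults (heuristic, not an official check)."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    warnings = []
    for section in parser.sections():
        for key, value in parser.items(section):
            lowered = value.lower()
            # Assumption: a shared path under a temp directory is the generated default.
            if key == "shared" and ("temp" in lowered or "tmp" in lowered):
                warnings.append(f"[{section}] shared points at a temporary location: {value}")
            # Loopback only works when the component runs on the manager itself.
            if key in ("jobman_address", "params_address") and value == "127.0.0.1":
                warnings.append(f"[{section}] {key} is loopback; fine only on the manager itself")
    return warnings


DEFAULT_NODE = r"""
[Node]
period = 5000
jobman_address = 127.0.0.1
jobman_listen_port = 45445
shared = C:\Users\ZPARRISH\AppData\Local\Temp\mxnetry855k6o
extractlights = False
"""

for line in suspect_settings(DEFAULT_NODE):
    print(line)
```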

Launching with GUI

I'll be using two computers: one manager+node and one additional node. One computer runs Linux and the other Windows 10.

On the manager+node (Linux) the config is this:
Code:
[client]
shared = /home/guillermo/shared
preview_size = 100
jobman_publish_port = 45446
jobman_listen_port = 45445
params_address = 127.0.0.1
client_port = 9999
debug = False
server_threads = 4
type = web
destroy_period = 900000
params_listen_port = 45451
jobman_address = 127.0.0.1

[NewJobman]
publish_port = 45446
publish_period = 4000
listen_address = 0.0.0.0
listen_port = 45445

[NewParams]
shared = /home/guillermo/shared
listen_address = 0.0.0.0
listen_port = 45451

[Node]
jobman_listen_port = 45445
shared = /home/guillermo/shared
extractlights = False
period = 3000
jobman_address = 127.0.0.1
The shared directory is local to this machine, so I can use /home/guillermo/shared for all the components. The same goes for the IP: since everything runs on the same machine, you can leave it at 127.0.0.1.

Executing the manager exposes the webpage at http://127.0.0.1:9999 (client_port)
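
For anyone checking a setup from another machine, here is a rough Python sketch that builds the Monitor URL and tests whether the manager's ports accept TCP connections. The port numbers come from the config above; the helper names are hypothetical, and a successful connection only shows that something is listening, not that mxnetwork is healthy. Whether the web client answers requests from other hosts is not confirmed here.

```python
import socket

# Port numbers taken from the manager config above.
MANAGER_PORTS = {
    "jobman listen": 45445,
    "jobman publish": 45446,
    "params": 45451,
    "web client": 9999,
}


def monitor_url(manager_host, client_port=9999):
    """Build the Monitor URL from the manager address and the [client] client_port value."""
    return f"http://{manager_host}:{client_port}"


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def report(manager_host):
    """Return one status line per manager port for the given host."""
    lines = [monitor_url(manager_host)]
    for name, port in MANAGER_PORTS.items():
        state = "open" if port_is_open(manager_host, port) else "unreachable"
        lines.append(f"{name} ({port}): {state}")
    return lines
```

Calling `report("127.0.0.1")` on the manager itself, or `report` with the manager's LAN address from a node, gives a quick view of which ports respond.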


On the node the configuration is:
Code:
[Node]
shared = Y:
jobman_address = 192.168.0.164
jobman_listen_port = 45445
extractlights = False
period = 5000
We don't need the other sections on a node, and we can copy this file to any other node that has the shared storage mounted as drive Y:.
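
If you have many nodes, the node-only file above can be generated rather than edited by hand. This is a minimal sketch using Python's standard configparser; the `node_config` helper is hypothetical, and the key names and values are copied from the node config in this post.

```python
import configparser
import io


def node_config(jobman_address, shared, jobman_listen_port=45445, period=5000):
    """Return the text of a node-only config containing just the [Node] section."""
    parser = configparser.ConfigParser(interpolation=None)
    parser["Node"] = {
        "shared": shared,
        "jobman_address": jobman_address,
        "jobman_listen_port": str(jobman_listen_port),
        "extractlights": "False",
        "period": str(period),
    }
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


print(node_config("192.168.0.164", "Y:"))
```

Write the returned text to mxnetwork_config.ini on each node; where exactly mxnetwork looks for that file is not covered in this thread, so check your install.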

Screenshots: [three images in the original post]


Launching with command line

The command line works in two modes:

Bundle: The command line can launch several components at once, taking the configuration from the same config file we used earlier. There are two bundles: manager, which launches the jobman, params and web client, and local, which launches a manager+node.

Components: We can launch individual components, overriding the configuration file with command line parameters. You can show the help for any command by adding -h at the end of the line.

Summing up, to launch the manager we execute:
mxnetwork.exe manager

Then on every node we can execute:
mxnetwork.exe rendernode -jobman_address 192.168.0.164 -shared Y:
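
If you script the launch, the two commands above can be built as argument lists, which avoids quoting problems with paths. A small Python sketch follows; the helper names are hypothetical, and only the `manager` bundle, the `rendernode` component, and the `-jobman_address` and `-shared` options shown above are used.

```python
def manager_command(executable="mxnetwork.exe"):
    """Argument list that launches the manager bundle."""
    return [executable, "manager"]


def rendernode_command(jobman_address, shared, executable="mxnetwork.exe"):
    """Argument list that launches a render node pointed at the given manager."""
    return [executable, "rendernode", "-jobman_address", jobman_address, "-shared", shared]


cmd = rendernode_command("192.168.0.164", "Y:")
print(" ".join(cmd))
# To actually start the node, pass the list to subprocess.Popen(cmd).
```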

Hope this helps with your setup.

Greetings
By zparrish
#393373
Thanks Guillermo! With your simplified config file suggestion, I was able to get this working. It's still not quite ideal in my setup, because Kivy is unable to create the UIs on my render nodes when I launch them over an RDP connection with no hardware acceleration. Also, I did have to migrate the manager functionality from a host on my render farm to my workstation. That was needed for Kivy to generate the necessary UIs to load the Monitor's web server correctly.

I did find a strange loophole in the RDP session vs. direct console session behavior. Most of my render farm hosts are configured to auto-login to their local Windows account. I have a batch file in the "Startup" folder for that local user that sets up my network shares and launches the designated Maxwell Network Tool (in this case it's "C:\Program Files\Next Limit\mxnetwork-windows-0.9.3\dist\windows\node.exe"). Since the user is auto-logged in, their session is on the console and not an RDP session. This changes the graphics capabilities of the host to allow full access to any type of hardware acceleration and graphics API it has available. When I RDP into one of those auto-logged-in hosts, I can see the Kivy-based UI just fine. If I close the Node UI and then relaunch the application, I just see the command shell, whose output includes a message about how it was "Unable to find any valuable Window provider at all!" followed by "sdl2 - RuntimeError: b'No matching GL pixel format available'".

This has to be related to the inability of an RDP session to fully utilize hardware acceleration. I am running one of the newest nVidia drivers on those hosts (375.70 specifically), and I know I've seen changelog lines from nVidia about extending advanced functionality to the GPU over RDP. Also, I believe I'm using the latest RDP client available for Windows 10, and I'm pretty sure that there's no update available for my Windows 7 hosts' RDP protocol support.
Code:
[INFO              ] [Logger      ] Record log in C:\Users\mktgdev\.kivy\logs\kivy_16-11-23_1.txt
[INFO              ] [Kivy        ] v1.9.1
[INFO              ] [Python      ] v3.4.4 (v3.4.4:737efcadf5a6, Dec 20 2015, 20:20:57) [MSC v.1600 64 bit (AMD64)]
[INFO              ] [Factory     ] 179 symbols loaded
[DEBUG             ] [Cache       ] register <kv.lang> with limit=None, timeout=None
[DEBUG             ] [Cache       ] register <kv.image> with limit=None, timeout=60
[DEBUG             ] [Cache       ] register <kv.atlas> with limit=None, timeout=None
[INFO              ] [Image       ] Providers: img_tex, img_dds, img_gif, img_sdl2, img_pil (img_ffpyplayer ignored)
[DEBUG             ] [Cache       ] register <kv.texture> with limit=1000, timeout=60
[DEBUG             ] [Cache       ] register <kv.shader> with limit=1000, timeout=3600
coudnt change process name
Node
----------------------------------
<class 'core.rendering.node.Node'>
{'period': '5000', 'shared': 'Y:', 'extractlights': 'False', 'jobman_address': '192.168.2.32', 'jobman_listen_port': '45445'}
1st hostname:mktgCGI-RN12
[INFO              ] [OSC         ] using <thread> for socket
[INFO              ] [Window      ] Provider: sdl2
[CRITICAL          ] [Window      ] Unable to find any valuable Window provider at all!
sdl2 - RuntimeError: b'No matching GL pixel format available'
  File "site-packages\kivy\core\__init__.py", line 67, in core_select_lib
  File "site-packages\kivy\core\window\window_sdl2.py", line 138, in __init__
  File "site-packages\kivy\core\window\__init__.py", line 722, in __init__
  File "site-packages\kivy\core\window\window_sdl2.py", line 237, in create_window
  File "kivy\core\window\_window_sdl2.pyx", line 133, in kivy.core.window._window_sdl2._WindowSDL2Storage.setup_window (kivy\core/window\_window_sdl2.c:2314)
  File "kivy\core\window\_window_sdl2.pyx", line 55, in kivy.core.window._window_sdl2._WindowSDL2Storage.die (kivy\core/window\_window_sdl2.c:1483)

[INFO              ] [Text        ] Provider: sdl2
[CRITICAL          ] [App         ] Unable to get a Window, abort.
 Exception ignored in: 'kivy.properties.dpi2px'
 Traceback (most recent call last):
   File "site-packages\kivy\utils.py", line 513, in __get__
   File "site-packages\kivy\metrics.py", line 175, in dpi
   File "site-packages\kivy\base.py", line 126, in ensure_window
 SystemExit: 1
[DEBUG             ] [Cache       ] register <textinput.label> with limit=None, timeout=60.0
[DEBUG             ] [Cache       ] register <textinput.width> with limit=None, timeout=60.0
 Traceback (most recent call last):
   File "mxcloud\launchnode.py", line 12, in <module>
   File "mxcloud\gui\nodeApp.py", line 47, in __init__
 AttributeError: 'NoneType' object has no attribute 'clearcolor'
Failed to execute script launchnode
coudnt change process name
2016-11-23 08:43:09,575 - Node - DEBUG - Process started
2016-11-23 08:43:09,584 - Node - INFO - hostname: mktgCGI-RN12
2016-11-23 08:43:09,585 - Node - DEBUG - Starting event loop
I'll do some testing and report back my findings. I think it might be ignoring certain scene settings when submitting the network render too. I don't have a solid example of that yet, which is why I still need to verify this. This UI issue is still on the table though until some nice workaround can be found (if possible).
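
For anyone wanting to detect the RDP-versus-console case before launching the Node UI, here is a minimal Python sketch. It assumes the Windows SESSIONNAME environment variable, which is normally "Console" for a console session and something like "RDP-Tcp#0" for an RDP session. The variable can be unset in other contexts (for example services), and the `session_kind` helper is hypothetical.

```python
import os


def session_kind(env=None):
    """Classify the current Windows session from the SESSIONNAME environment variable."""
    env = os.environ if env is None else env
    name = env.get("SESSIONNAME", "")
    if name.upper().startswith("RDP-"):
        return "rdp"
    if name.lower() == "console":
        return "console"
    # Unset or unrecognised value: make no claim either way.
    return "unknown"


if session_kind() == "rdp":
    print("RDP session detected: the Kivy UI may fail to create a GL window.")
```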

At any rate, thanks for your help Guillermo! I finally successfully submitted and completed a cooperative render in Maxwell v4!
By dk2079
#393374
Hi Zack,

your feedback reports are always an interesting read!

I saw that you are struggling with getting HW acceleration on the GPU over RDP.
I was having a similar problem when trying to RDP into my workstation from home, and there is a little hack to still get it working, similar to what you described when the application is auto-launched via the .bat file.

Basically, you use a .bat file to disconnect, start the OpenGL applications on the console, and then log in again.
I haven't used it in a while, as I switched to a Teradici interface card for remote access, but I did a quick Google search and I think the procedure is explained well here:

https://social.technet.microsoft.com/Fo ... inserverTS

maybe it is helpful for you until there is a better fix.

cheers,
Daniel
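
The linked thread is truncated above, so its exact steps are not reproduced here. One commonly used form of this trick relies on the Windows `tscon` utility to hand the current RDP session over to the physical console, after which OpenGL applications start with the console's graphics capabilities. The Python sketch below only builds that command; that the linked procedure uses `tscon` is an assumption, the helper name is hypothetical, and running it normally requires administrator rights and disconnects your RDP client.

```python
def reattach_to_console_command(session_name):
    """Build the tscon call that moves the named session onto the physical console."""
    if not session_name:
        raise ValueError("session name is required, e.g. the value of SESSIONNAME")
    # tscon <session> /dest:console reconnects the session to the console.
    return ["tscon", session_name, "/dest:console"]


print(" ".join(reattach_to_console_command("RDP-Tcp#0")))
```

In a batch file the same call would be followed by the line that launches the node application, so that it starts once the session is on the console.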
By zparrish
#393515
Hey DK,
I'm glad you don't mind my often long-winded posts :) I like to be thorough and specific because I'm absolutely horrible at reading between the lines and making "correct" assumptions on limited information. It drives me crazy (more so than usual, that is :D).

I had stumbled across the ".bat" file approach but I haven't tried it yet. It's a pretty slick approach, but I think the older I get, the less inclined I get to build my workflow around hacks and workarounds. I've been quite blessed over the past few years in gaining the trust of my employer to scale up our rendering capacity as a company (much thanks to Maxwell, and I do mean that very sincerely!). Now I'm transitioning into a more oversight role, so I need to find standardized, compatible solutions so I can proportionately spend more time planning and delegating with much less time fixing, hacking, and patching.

One of my render hosts actually fails the "GPU" check by the TP Render manager, even though it has the GT-710 installed. The reason is that it's on a Supermicro X9DRT-F board (https://www.supermicro.com/products/mot ... 9DRT-F.cfm) and I need to be able to remotely manage it via IPMI / iKVM when there's an issue. To do that, I have to have the BIOS set to prioritize the onboard BMC VGA adapter first, so even in a console session the current display adapter doesn't meet the graphics API requirements unless I actually hook up a real monitor to the GT-710 (which I can't make permanent in our IDF where these are installed). It's not too big of a deal as long as I can get the Node application to just run and respond to a remote manager, since I can only really do CPU rendering on my farm. I can do without the actual UI if needed.

Hopefully the networking components available for version 4 (both standard and TP) get the bugs worked out soon so I'm not perpetually screwed. Not gonna lie, the last several weeks have been majorly stressful knowing that I can't run Maxwell in any production-level capacity. The saving grace to it all is that rendering is only a subset of my job duties. I have run into separate issues with 0.9.3 maintaining a pool of active hosts in the middle of a cooperative render, but I'll start a new post for that so the info doesn't get buried in this post.

Thanks again DK!