Open
Description
Expected Behavior
The load time should be optimally fast before #2666. The problematic line is this:
Lines 474 to 476 in 9f4daca
Actual Behavior
Load time is very slow, as shown in log it took a whole minute (I have added debug code). Not tested on which kind of big O growth is here, maybe exponentially.
with torch.inference_mode():
dynamic_prompt = DynamicPrompt(prompt)
is_changed_cache = IsChangedCache(dynamic_prompt, self.caches.outputs)
logging.info("Setting prompt for cache")
prompt_keys = prompt.keys()
for cache in self.caches.all:
cache.set_prompt(dynamic_prompt, prompt_keys, is_changed_cache)
cache.clean_unused()
logging.info("Setted prompt for cache")
Steps to Reproduce
Have a decently large workflow with over >50 nodes. Can test with https://perilli.com/ai/comfyui/
Debug Logs
2024-09-11 03:30:28,124- root:522- INFO- got prompt
2024-09-11 03:30:28,133- root:540- INFO- Validating prompt
2024-09-11 03:30:28,134- root:542- INFO- Validated prompt
2024-09-11 03:30:28,171- root:127- INFO- Executing prompt e81c5045-d2e6-49dd-abc2-2a22406aca33
2024-09-11 03:30:28,172- root:468- INFO- Setting prompt for cache
2024-09-11 03:31:42,298- root:473- INFO- Setted prompt for cache
2024-09-11 03:32:48,690- root:106- INFO- model weight dtype torch.float16, manual cast: None
2024-09-11 03:32:48,926- root:115- INFO- model_type EPS
2024-09-11 03:32:49,855- root:271- INFO- Using xformers attention in VAE
Other
14136356 function calls (9707273 primitive calls) in 65.405 seconds
Ordered by: cumulative time
List reduced from 221 to 50 due to restriction <50>
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 65.405 65.405 {built-in method builtins.exec}
1 0.000 0.000 65.404 65.404 <string>:1(<module>)
2 0.000 0.000 65.404 32.702 caching.py:266(set_prompt)
3 0.000 0.000 65.403 21.801 caching.py:143(set_prompt)
2 0.000 0.000 65.402 32.701 caching.py:66(__init__)
2 0.008 0.004 65.402 32.701 caching.py:75(add_keys)
574 0.021 0.000 65.394 0.114 caching.py:85(get_node_signature)
18416 0.206 0.000 59.277 0.003 caching.py:93(get_immediate_node_signature)
7610 0.037 0.000 58.709 0.008 folder_paths.py:238(get_filename_list)
7610 0.247 0.000 58.608 0.008 folder_paths.py:216(cached_filename_list_)
116987 58.195 0.000 58.195 0.000 {built-in method nt.stat}
94282 0.116 0.000 50.780 0.001 <frozen genericpath>:53(getmtime)
4170 0.012 0.000 33.303 0.008 nodes.py:521(INPUT_TYPES)
3304 0.016 0.000 25.226 0.008 nodes.py:604(INPUT_TYPES)
22705 0.045 0.000 7.584 0.000 <frozen genericpath>:39(isdir)
1793074/574 1.903 0.000 5.950 0.010 caching.py:36(to_hashable)
435026/574 0.304 0.000 5.937 0.010 caching.py:44(<listcomp>)
137880/5716 0.287 0.000 4.651 0.001 caching.py:42(<listcomp>)
5125254/3109390 0.874 0.000 3.414 0.000 {built-in method builtins.isinstance}
1007932 0.453 0.000 2.746 0.000 typing.py:1304(__instancecheck__)
1007932 0.612 0.000 2.293 0.000 typing.py:1579(__subclasscheck__)
1007932 0.960 0.000 1.475 0.000 {built-in method builtins.issubclass}
1008346/1007932 0.267 0.000 0.516 0.000 <frozen abc>:121(__subclasscheck__)
1008346/1007932 0.227 0.000 0.248 0.000 {built-in method _abc._abc_subclasscheck}
574 0.001 0.000 0.143 0.000 caching.py:115(get_ordered_ancestry)
18416/574 0.062 0.000 0.143 0.000 caching.py:121(get_ordered_ancestry_internal)
18416 0.009 0.000 0.138 0.000 execution.py:37(get)
150866 0.092 0.000 0.132 0.000 graph_utils.py:1(is_link)
330/80 0.001 0.000 0.129 0.002 utils.py:346(new_func)
60 0.000 0.000 0.119 0.002 IPAdapterPlus.py:647(INPUT_TYPES)
40 0.000 0.000 0.113 0.003 execution.py:135(_map_node_over_list)
40 0.000 0.000 0.112 0.003 execution.py:149(process_inputs)
174788 0.090 0.000 0.090 0.000 {built-in method builtins.sorted}
21 0.000 0.000 0.081 0.004 nodes.py:2962(IS_CHANGED)
28032/1 0.044 0.000 0.080 0.080 copy.py:128(deepcopy)
2868/1 0.009 0.000 0.080 0.080 copy.py:227(_deepcopy_dict)
8 0.000 0.000 0.063 0.008 load.py:102(INPUT_TYPES)
5438/1182 0.008 0.000 0.061 0.000 copy.py:201(_deepcopy_list)
3 0.000 0.000 0.058 0.019 folder_paths.py:203(get_filename_list_)
7 0.001 0.000 0.056 0.008 folder_paths.py:145(recursive_search)
1 0.000 0.000 0.032 0.032 nodes.py:1374(IS_CHANGED)
53/24 0.001 0.000 0.029 0.001 <frozen os>:345(_walk)
18 0.023 0.001 0.023 0.001 {built-in method nt.scandir}
112054 0.022 0.000 0.022 0.000 <frozen _collections_abc>:315(__subclasshook__)
18804 0.021 0.000 0.021 0.000 {built-in method builtins.hasattr}
1 0.019 0.019 0.019 0.019 {method 'read' of '_io.BufferedReader' objects}
136222 0.018 0.000 0.018 0.000 {method 'append' of 'list' objects}
287 0.003 0.000 0.018 0.000 <frozen ntpath>:741(relpath)
8 0.000 0.000 0.017 0.002 nodes_upscale_model.py:18(INPUT_TYPES)
37693 0.016 0.000 0.016 0.000 graph.py:30(has_node)
Recorded by using cProfile
and wrap around the problematic lines.