ltx-2.3-spatial-upscaler distorts the results of the first generation step
I tried this on workflows such as FL2V and IC-LoRA (ControlNet). After the first generation step, the last frame of the output matches the supplied last frame, and the movements match the original video. After the second step, the final frame only roughly matches the prompt rather than the supplied final frame, and the coordinated dance movements turn into flailing arms and legs.
I strip the spatial and latent upscaling out completely. The upscaling process induces artifacts and body horror, and it almost always changes the subject's face. Once I stripped the upscaling out, none of this happened again. Downscaling the image and then upscaling the video later takes the same amount of time as just generating the video at full resolution, and generating at full resolution has none of the side effects. Also, to fix the slow-motion effect that LTX produces, set your FPS to 25 instead of 24. I don't know why so many workflows set the FPS to 24 when the model was trained at 25 and 50. It supports 24, but you're going to get either slow or fast motion.
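To put a rough number on the FPS point, here is a minimal sketch of the playback arithmetic only. It assumes each generated frame represents 1/25 s of motion (the 25 fps training rate mentioned above) and ignores whatever the model does internally with its frame-rate input, so treat it as an illustration rather than an explanation of the slow-motion effect. The function names and the 121-frame clip length are made up for the example.

```python
from fractions import Fraction


def playback_speed(playback_fps: int, native_fps: int = 25) -> Fraction:
    """Apparent speed relative to real time when frames that each cover
    1/native_fps seconds of motion are played back at playback_fps."""
    return Fraction(playback_fps, native_fps)


def clip_duration(num_frames: int, fps: int) -> Fraction:
    """Length of the clip in seconds at the given playback rate."""
    return Fraction(num_frames, fps)


if __name__ == "__main__":
    frames = 121  # hypothetical clip length, chosen only for illustration
    for fps in (24, 25):
        speed = float(playback_speed(fps))
        seconds = float(clip_duration(frames, fps))
        print(f"{fps} fps: speed x{speed:.2f}, duration {seconds:.2f} s")
```

Under that assumption, playing at 24 fps stretches the same frames over a slightly longer clip, so the motion plays at 24/25 of its intended speed; at 25 fps the ratio is exactly 1.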